
Advanced Signal Processing
Introducing the Empirical Mode Decomposition


In this thread I will discuss the following topics:

1. Introduction
2. Empirical Mode Decomposition
3. Use in Audio Processing
4. Devising a better (and open source) audio compression format
5. High-level acoustic language

1. Introduction
=========
Unfortunately, my knowledge of audio processing is very limited. Still, from what I know, I believe that present formats (like MP3) are very limited in scope. One reason is the fact that most concepts are based on data frames that are analysed using a Fourier transform.

However, the Fourier transform is limited to stationary and linear data, severely hampering its use in more advanced audio processing (where data is definitely non-linear and non-stationary).

While music is a combination of various waves, these waves may change non-linearly over short time periods. Nevertheless, music is still highly ordered, so it should be possible to decompose most music signals into a finite number of Intrinsic Mode Functions. These functions do NOT apply to only one data frame, but may extend to nearby frames, therefore greatly reducing the size of the final compressed stream. My idea is to use something similar to video compression and to predict the wave(s) for the next 50 ms up to 1-5 s in advance, so that only the difference from this prediction needs to be encoded.

2. Empirical Mode Decomposition (EMD)
=====================
Quote
EMD is an adaptive decomposition with which any complicated signal can be decomposed into its Intrinsic Mode Functions (IMF). EMD is an analysis method that in many aspects gives a better understanding of the physics behind the signals. Because of its ability to describe short time changes in frequencies that can not be resolved by Fourier spectral analysis it can be used for nonlinear and non-stationary time series analysis. The original purpose for the EMD was to find a decomposition which made it possible to use the instantaneous frequency for time-frequency analysis of non-stationary signals.


Basically, the EMD allows such analysis.
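To make the quoted description concrete, here is a minimal sketch of the sifting procedure (my own toy illustration in Python; it omits the boundary handling and the standard-deviation stopping criterion used in the literature):

Code:
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift(x, t, max_iter=50, tol=0.05):
    """Extract one candidate IMF by repeatedly subtracting the envelope mean."""
    h = x.copy()
    for _ in range(max_iter):
        maxima = argrelextrema(h, np.greater)[0]
        minima = argrelextrema(h, np.less)[0]
        if len(maxima) < 4 or len(minima) < 4:
            break                                     # too few extrema for envelopes
        upper = CubicSpline(t[maxima], h[maxima])(t)  # upper envelope
        lower = CubicSpline(t[minima], h[minima])(t)  # lower envelope
        mean = 0.5 * (upper + lower)
        if np.sum(mean ** 2) / np.sum(h ** 2) < tol:
            break                                     # envelope mean ~ 0: IMF found
        h = h - mean
    return h

def emd(x, t, max_imfs=8):
    """Decompose x(t) into Intrinsic Mode Functions plus a residual trend."""
    imfs, residual = [], x.copy()
    for _ in range(max_imfs):
        imf = sift(residual, t)
        imfs.append(imf)
        residual = residual - imf
        if len(argrelextrema(residual, np.greater)[0]) < 2:
            break                                     # residual is (near-)monotonic
    return imfs, residual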

The original article describing EMD can be found here. It gets pretty technical (though the introduction does a good job of explaining some limitations of the Fourier transform), and it runs to some 100 pages.

A briefer description can be found here, and an accompanying PowerPoint presentation exemplifying the algorithm is here (see emd.ppt). Further articles can be downloaded from here; see e.g. this one.

A description of the use of EMD in image compression can be found here. I also maintain a wiki page that has some information on EMD (see this page).

3. Use in Audio Processing
=================
I believe that EMD is the ideal tool for compressing audio streams, especially music. Voice and various instruments generate specific waves that are maintained (or changed non-randomly) over various time scales. Therefore, we could detect these waves (over time scales of up to a few seconds) and subtract them from the actual signal, getting a less complex signal that can be better compressed using a lossy algorithm.

Also, these waves are available across more than one data frame, therefore maximizing the compression.
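A toy illustration of the detect-and-subtract idea (a single known 220 Hz wave plus noise; a real system would of course first have to find the waves):

Code:
import numpy as np

rng = np.random.default_rng(1)
fs = 8000
t = np.arange(0, 2.0, 1 / fs)                  # 2 s, i.e. hundreds of 5 ms frames
x = 0.8 * np.sin(2 * np.pi * 220 * t) + 0.05 * rng.standard_normal(len(t))

# Estimate the 220 Hz wave's complex amplitude ONCE for the whole 2 s span
c = 2 * np.mean(x * np.exp(-2j * np.pi * 220 * t))
detected = np.real(c * np.exp(2j * np.pi * 220 * t))

residual = x - detected                        # what remains to code lossily
print(np.var(residual) / np.var(x))            # ~0.008: most energy is removed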

4. Devising a better (and open source) audio compression format
==========================================
I believe the open source community should use this information to devise a better compression format than the existing ones (which focus entirely on single frames). Even if it will be years before a functioning program becomes available (or computing power is adequate), it would be totally open source and NOT covered by any patents or copyrights.

This is definitely something for the future.

5. High-level acoustic language
====================
I will even go a step further. There are specific high-level languages in various fields. A good example is the high-level graphics language used in Asymptote on sourceforge.net. There are high-level (and lower-level) commands to draw various objects (polygons, circles, and so on).

I really miss such a language for creating a signal. As exemplified for EMD, we would generate basic waves (not just sinusoidal) that extend various amounts of time into the future. They may also change non-linearly.

On top of this, we add various frequencies and/or harmonics to every frame.

It would be wise to define a high-level acoustic language that permits this sort of operation. The signal could then be recreated from a description written in such a language.


I hope that people with knowledge of signal processing will look into these new possibilities and will try to implement them. Unfortunately, my time and knowledge are very limited, so my help here will be limited as well. Nevertheless, it is my strong belief that these features are worth implementing and should be pursued in the future.

Kind regards,

Leonard Mada
[aka discoleo]

Advanced Signal Processing

Reply #1
Quote
From what I know, I believe that present formats (like MP3) are very limited in scope. One reason is the fact that most concepts are based on data frames that are analysed using a Fourier transform.

Actually, in no commonly used audio coder is the data analyzed in the Fourier domain. The data is analyzed by a filterbank, not a Fourier transform.

Quote
However, the Fourier transform is limited to stationary and linear data, severely hampering its use in more advanced audio processing (where data is definitely non-linear and non-stationary).

Nothing whatsoever prevents one from using a Fourier transform on non-stationary data. There are neither mathematical nor practical limits to doing so.

Fourier analysis does not apply to the MODELLING of nonlinear systems, but it works just fine in analyzing the input and output.

Ergo, this premise is also defective in a fundamental sense.

There is a kernel of a real issue here, however, in that the HUMAN EAR does something other than Fourier analysis: it does perform a time-frequency tiling, but not a uniform one.

However, you have not based your arguments on the ear, but rather on incorrect mathematical statements.
Quote
Because of its ability to describe short time changes in frequencies that can not be resolved by Fourier spectral analysis it can be used for nonlinear and non-stationary time series analysis. The original purpose for the EMD was to find a decomposition which made it possible to use the instantaneous frequency for time-frequency analysis of non-stationary signals.


Excuse me? A short-term Fourier Transform can exactly find short-term frequency content, changing or not.

Now, yes, there are other ways to do time-frequency analysis.
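For instance (a minimal sketch using scipy's stft; the chirp is a stand-in for "non-stationary" data):

Code:
import numpy as np
from scipy.signal import chirp, stft

fs = 8000
t = np.arange(0, 1.0, 1 / fs)
x = chirp(t, f0=200, t1=1.0, f1=2000)          # frequency sweeps 200 -> 2000 Hz

# Short-term Fourier transform: frequency content per 32 ms slice
f, seg_t, Zxx = stft(x, fs=fs, nperseg=256)
peak = f[np.abs(Zxx).argmax(axis=0)]           # dominant frequency in each slice
print(peak[1], peak[len(peak) // 2], peak[-2]) # rises roughly 200 -> 1100 -> 2000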

Quote
3. Use in Audio Processing
=================
I believe that EMD is the ideal tool for compressing audio streams, especially music. Voice and various instruments generate specific waves that are maintained (or changed non-randomly) over various time scales. Therefore, we could detect these waves (over time scales of up to a few seconds) and subtract them from the actual signal, getting a less complex signal that can be better compressed using a lossy algorithm.

I suggest that you study a long-known method of music modelling called "sinusoidal coding" which despite the name, does more than use pure sinusoids for modelling.

This is a well-known kind of analysis, and one that is very well suited to manipulation, time-scale changing, and some other kinds of processing, but it has proven (for a variety of very good reasons) inefficient for low-rate coding.

What you describe appears to be no more or less than what is already done.

Don't forget that audio is perceived through a time/frequency analysis that takes place mechanically on the human cochlea. Audio IS a frequency-domain phenomenon because of the peripheral perceptual processing.

Video has no such time-domain frequency processing, and no fundamental frequency processing in the spatial domain either; rather, the eye is a spatial receptor.

What applies to video may or may not apply at all to audio. The mathematics remains the same, but the requirements for processing, compression, analysis, synthesis, etc, are very, very different.

I refer you to http://www.aes.org/sections/pnw/ppt/audiovsvideo.ppt for more discussion on that issue.
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #2
Quote
Quote
...that most concepts are based on data frames that are analysed using a Fourier transform...

The data is analyzed by a filterbank, not a Fourier transform.


What I wanted to say is that all the data is inside the frame. Most lossy codecs do NOT extrapolate this data to nearby frames, while lossless codecs do so very inefficiently. But I may be wrong.

Basically, EMD is used for adaptively representing non-stationary signals as sums of zero-mean AM-FM components. Therefore, I believe that EMD allows one to split a segment of the signal into different Intrinsic Mode Functions. These waves would persist beyond a single frame. Music and voice could be easily decomposed using this technique. The residuals would then be encoded using a lossy format and added to those waves to recreate the original signal.
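Using the toy emd() sketch from my first post, the idea in miniature would look like this (my own illustration, not a codec):

Code:
import numpy as np

# Two overlapping oscillations, mimicking components that persist across frames
t = np.linspace(0.0, 1.0, 4000)
x = np.cos(2 * np.pi * 5 * t) + 0.4 * np.cos(2 * np.pi * 40 * t)

imfs, residual = emd(x, t)                     # emd() as sketched earlier
# The first IMF should capture the fast 40 Hz mode, the next the slow 5 Hz one;
# each is a zero-mean AM-FM component spanning the whole segment, not one frame.
print([int(np.abs(np.fft.rfft(imf)).argmax()) for imf in imfs[:2]])  # ~[40, 5]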

The fact is that the FFT (and various filters) are often used in signal processing, but one cannot obtain the Intrinsic Mode Functions of a signal using these classical methods. I again may be wrong (but here I believe I am not).

Maybe you could read this article first, to get a better idea what EMD is: http://perso.ens-lyon.fr/patrick.flandrin/NSIP03.pdf


Quote
... long-known method of music modelling called "sinusoidal coding"


However, I believe this is very limited, too. I again might be wrong, but EMD allows me to detect, e.g., 1-100... overlapping waves (with zero-mean AM-FM) that are present in a variable time frame (defined here as anything between a few ms and the whole track, although it would be computational overkill to do it on a whole music track). I believe sinusoidal coding is something completely different.

\\ EDIT 1
An interesting article about music and EMD is this one:
http://www.elec.qmul.ac.uk/people/josh/doc...eiss-DAFx05.pdf

To quote from that article:
Quote
EMD can be used both for short-term measurements like fundamental frequency, chord and onset, and long-term structures like melody, rhythm and tempo contours.


\\ EDIT 2
Another useful link:
http://doc.gold.ac.uk/~map01ra/dmrn/events...006ensemble.pdf

\\ EDIT 3
OK, this article might have some information regarding point 5 in my first post:
http://doc.gold.ac.uk/~map01ra/dmrn/events...on2006sound.pdf

Advanced Signal Processing

Reply #3
Quote
What I wanted to say is that all the data is inside the frame. Most lossy codecs do NOT extrapolate this data to nearby frames, while lossless codecs do so very inefficiently. But I may be wrong.
NEVER! All filter banks must have some kind of overlap. 1/2 overlap is the least. Layer 2 (perhaps one of the least good compression algorithms) has the most overlap.

It is true that the available gain scales with the length of the basis vectors. There is data from a number of sources (I recall jj showing a graph of both rate gain and stationarity, with the optimum somewhere between 800 and 1400 samples for a 2:1 block switch, or double that for more complicated block switching) showing that the block lengths used in the better codecs (AAC, WMA Pro, ...) are just about optimal in terms of stationarity vs. rate gain.
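To illustrate the overlap point, here is a minimal MDCT sketch (the lapped transform family used in codecs such as AAC and Vorbis), assuming the standard sine window and no quantization; every interior sample is covered by two overlapping blocks:

Code:
import numpy as np

def mdct(frame, N):
    """2N windowed samples -> N coefficients (critically sampled, 50% overlap)."""
    n = np.arange(2 * N)
    w = np.sin(np.pi / (2 * N) * (n + 0.5))         # sine window (Princen-Bradley)
    k = np.arange(N)[:, None]
    basis = np.cos(np.pi / N * (n[None, :] + 0.5 + N / 2) * (k + 0.5))
    return basis @ (w * frame)

def imdct(X, N):
    """N coefficients -> 2N windowed samples; aliases cancel on overlap-add."""
    n = np.arange(2 * N)
    w = np.sin(np.pi / (2 * N) * (n + 0.5))
    k = np.arange(N)[None, :]
    basis = np.cos(np.pi / N * (n[:, None] + 0.5 + N / 2) * (k + 0.5))
    return (2.0 / N) * w * (basis @ X)

# Summing the inverse transforms of 50%-overlapped blocks reconstructs
# the interior of the signal exactly (time-domain alias cancellation).
N = 64
x = np.random.randn(6 * N)
y = np.zeros_like(x)
for s in range(0, len(x) - 2 * N + 1, N):
    y[s:s + 2 * N] += imdct(mdct(x[s:s + 2 * N], N), N)
print(np.allclose(x[N:-N], y[N:-N]))                # True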
Quote


Basically, EMD is used for adaptively representing non-stationary signals as sums of zero-mean AM-FM components. Therefore, I believe that EMD allows one to split a segment of the signal into different Intrinsic Mode Functions. These waves would persist beyond a single frame. Music and voice could be easily decomposed using this technique. The residuals would then be encoded using a lossy format and added to those waves to recreate the original signal.


Heh, you do realize that it's not "easily decomposed", don't you? You need to show, for CODING purposes, that your representation is unique and a tight frame (i.e. that you don't get data growth). You may find that this is not as simple as you expect.

As for persisting beyond a single frame: if you were right, longer filterbanks in an audio coder would work splendiferously, and still be log(n) processing.

Quote


The fact is that the FFT (and various filters) are often used in signal processing, but one cannot obtain the Intrinsic Mode Functions of a signal using these classical methods. I again may be wrong (but here I believe I am not).


FFTs are not lossy; they contain all of the information in that part of the original signal.

But, once again, FFTs aren't used in coding.
Quote

... long-known method of music modelling called "sinusoidal coding"


However, I believe this is very limited, too. I again might be wrong, but EMD allows me to detect, e.g., 1-100... overlapping waves (with zero-mean AM-FM) that are present in a variable time frame (defined here as anything between a few ms and the whole track, although it would be computational overkill to do it on a whole music track). I believe sinusoidal coding is something completely different.

No, it's not. You just described a variety of sinusoidal coding.
Quote


To quote from that article:
Quote
EMD can be used both for short-term measurements like fundamental frequency, chord and onset, and long-term structures like melody, rhythm and tempo contours.



Just like sinusoidal coding.

But nowhere does it say it's an efficient representation.

I notice you didn't respond to the points about the differences between audio and video, either. I encourage you to do so before you invest too much time in this.

-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #4
Maybe I was NOT able to explain what I mean, so I will try an analogy.

Let's say we have a band and various musical instruments. We define the voices of the band members and the basic sounds produced by the instruments. These are our basic waves.

On top of this we add another layer of complexity: we now define how these waves change in time (linearly or non-linearly), i.e. the musical notes and the lyrics of the songs.

And lastly, some band members may sing a little wrong from time to time, or some note does not sound perfect, or there is some other glitch. These would be the residuals that we code for every frame.

Common compression algorithms code one time frame (some 5 ms) and then move to the next. It is always one such frame that is coded; you do NOT have 100 frames coded (or a frame of 1 s).

Instead, I would do something very different (hence my analogy with video, BUT it was maybe a bad analogy, because it is NOT the same).

How is the audio signal produced?
Various waves overlap producing the final pattern. These basic waves extend usually over more than one 5 ms time frame.

We decompose the original signal into a reasonable number of waves.
Let's code it this way (a structural sketch follows below):
- define wave 1: frame 1 till frame 100
  -- it changes linearly: some parameters
- wave 2: frame 1 till 80
  -- non-linear change: some parameters
- wave 3: ...
- ...

Then, for every 5 ms frame, we calculate the residuals (the difference from the superposition of the above waves). We then store only those residuals that would actually be perceived by the human ear.

However, unlike classic models, instead of coding every frequency in every frame inside some time segment, we code the frequency ONLY ONCE inside a time period of 100 ms up to a few seconds and define a linear (or non-linear) change for the corresponding wave:
- for a linear change we need 2 bytes to store the parameters (+ one bit for the linear change-type flag)
- for non-linear changes we may need 2-4 bytes (+ some bits for the change type)
+ how many frames ahead this wave will persist

We end up with maybe 10 bytes to store the information for 100 to 1000 frames (for one basic wave). This is good compression: instead of storing the amplitudes for 2 kHz over the next 1000 frames, we store only 10 bytes, which should perfectly describe how this wave behaves in time.
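A structural sketch of the layout above (all type names and field sizes are my own invention, purely to make the proposal concrete):

Code:
from dataclasses import dataclass
from typing import List

@dataclass
class WaveTrack:
    """One 'basic wave' that persists across many frames."""
    first_frame: int        # e.g. 1
    last_frame: int         # e.g. 100: how many frames ahead the wave persists
    freq_hz: float          # the frequency, coded ONCE for the whole span
    amplitude: float
    change_type: int        # 0 = linear, 1+ = various non-linear models
    change_params: bytes    # the 2-4 bytes describing the trajectory

@dataclass
class FrameResidual:
    """Per 5 ms frame: only the perceptually relevant difference."""
    data: bytes

@dataclass
class Segment:
    """100 ms up to a few seconds of audio."""
    waves: List[WaveTrack]          # a handful of long-lived wave definitions
    residuals: List[FrameResidual]  # diffs from the waves' superposition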

I hope I have explained it more effectively this time.

Advanced Signal Processing

Reply #5
Quote
Common compression algorithms code one time frame (some 5 ms) and then move to the next. It is always one such frame that is coded; you do NOT have 100 frames coded (or a frame of 1 s).

Well, it's 20 to 40 milliseconds coded in one "block", with that much overlap with the adjacent blocks in one of several ways, for most of the good lossy coding algorithms.

Many lossless algorithms use a high-order LPC predictor, which, barring bit-depth problems, can have an enormous impulse response of seconds at a time.
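To illustrate, a minimal least-squares LPC sketch (a toy tonal signal and a low order for brevity; real lossless coders use far higher orders):

Code:
import numpy as np

def lpc(x, p):
    """Autocorrelation-method LPC: predict x[n] from its previous p samples."""
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + p]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:p + 1])

rng = np.random.default_rng(0)
n = np.arange(20000)
# A strongly tonal (resonant) signal plus a little noise
x = np.sin(0.07 * n) + 0.5 * np.sin(0.19 * n) + 0.01 * rng.standard_normal(len(n))

p = 8
a = lpc(x, p)
pred = sum(a[i] * x[p - 1 - i:len(x) - 1 - i] for i in range(p))
resid = x[p:] - pred
print(np.var(resid) / np.var(x))   # far below 1: the tonal part is predicted away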

Your assertions on how coders work now are JUST WRONG.  Get it now?

Quote
However, unlike classic models, instead of coding every frequency in every frame inside some time segment, we code the frequency ONLY ONCE inside a time period of 100 ms up to a few seconds and define a linear (or non-linear) change for the corresponding wave.


And we do NOT code every frequency in every frame, either. In AAC, at 64 kb/s/channel, I'd say we code perhaps 1/10th of the "frequencies" available (and said frequencies are actually bandpass outputs from a filterbank, not frequencies, but that's a more sophisticated quibble). AAC (and some other codecs) also has a predictor strategy that predicts across blocks.

Your idea of how present coders work is just completely off.

Your idea of basic waves is also quite off, because the thing that limits the block length in current codecs is THE CHANGES IN YOUR "basic wave". Not amplitude or something like that, either, but fundamental spectral-content changes, which you would have to describe.

And, you know what? When you describe it, you're right back where we are now, AT BEST.

You are proposing sinusoidal decomposition and coding. Really. Yes.
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #6
Quote
Your assertions on how coders work now are JUST WRONG. Get it now?


That might be correct; my specialty is quite different. Yet I would like to do some math now.

Quote
In AAC, at 64kb/s/channel, I'd have to say that we code perhaps 1/10th of the "frequencies" available.


OK, I say 1/10th is still too much (or too little, depending on the perspective).

Now, imagine we code 1/10th of the frequencies in the first block (frame). But we could magically predict how (most of) these frequencies would change over the next (on average) 50-100 blocks.

Let's take a 20 ms block => 50 blocks/s => first block = 64/50 = 1.28 kbit.
We would need some data to store these change functions; let's say we end up with 10 bytes per frequency => 12.8 kbit (which would otherwise behave as 640 kbit/s). The next 50 blocks would still need some 10-20 kbit/s to store the non-zero residuals.
=> makes 12.8 + 50*(20/50) = 32.8 kbit

With a classic codec: 1.28 * 51 = 65.3 kbit for the same segment. That is, we would have some 30 kbit left over (for either better compression or higher quality).
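The same back-of-envelope numbers in code (every input here is my assumption from above, not a measurement):

Code:
# Hypothetical budget for 1 s of audio: 50 blocks of 20 ms at 64 kbit/s
first_block  = 64 / 50           # 1.28 kbit coded conventionally
change_funcs = 12.8              # assumed cost of per-frequency change functions
residuals    = 50 * (20 / 50)    # ~20 kbit/s of non-zero residuals, over 1 s
proposed     = change_funcs + residuals        # = 32.8 kbit
classic      = first_block * 51                # ~65.3 kbit for the same span
print(proposed, classic, classic - proposed)   # ~32 kbit potentially saved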

I acknowledge that, for this algorithm to work, we have to assume that a wave behaves predictably on a scale of 1-2 s (50-100 blocks of 20 ms). I believe that in most music, most waves will behave this way. Of course, there will be sound files where this algorithm fails, but for most music it shouldn't.

EMD
===
Of all existing methods, EMD is the most advanced signal-processing tool, and it could actually allow such predictions.

My one hope is that some people in the community start to learn how EMD works. From what I've heard and seen, it is really powerful. You can google "Empirical Mode Decomposition" to find additional articles on EMD. And I am sure EMD will play a significant role in the future.

PATENTS
======
Unfortunately, I was told that there are some patents on EMD. I googled today and found this link: http://www.freepatentsonline.com/6862558.html. I really do NOT know what the implications are, although I have found free MATLAB code implementing EMD and many articles on EMD, so research is being done in this area. Interestingly, the patent covers speech recognition.

Quote
Because our emphasis will be on speech analysis, we should first examine the principles of Human Speech Recognition (HSR) and Automatic (Machine) Speech Recognition (ASR). As summarized in the classical paper by Allen (1994), typical ASR systems start with a front end that transforms the speech signal into a feature vector. This processing is mostly done through spectral analysis over a fixed period of time, within which the speech signal is assumed to be stationary. The analysis is strictly on frequency. HSR, on the other hand, processes information across frequency localized in time; thus, the process is assumed to be non-stationary. These localized speech features are known as the formants. To extract features localized in time but across all frequencies requires time-frequency analysis.


I have provided some links to EMD articles; many more can be found by googling, so I hope interested people really read them. Some of my ideas may look childish, but maybe others can see the potential of EMD and will have brilliant new ideas for new algorithms.

Advanced Signal Processing

Reply #7
Quote
Now, imagine we code 1/10th of the frequencies in the first block (frame). But we could magically predict how (most of) these frequencies would change over the next (on average) 50-100 blocks.


Except that you can't.

Signals are not that stationary.

It's been measured.
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #8
Quote
Except that you can't.

Signals are not that stationary.


I believe that this is exactly the flaw in your argument. One CAN predict non-stationary signals; see the many documents on the web, e.g. http://cat.inist.fr/?aModele=afficheN&cpsidt=3573456, http://www.icgst.com/ACSE05/papers/P1110523106.pdf and others.

Indeed, EMD was developed especially with non-stationarity and non-linearity in mind. Prior to EMD, there were NO (real) methods to deal with those conditions.

From a different perspective:
- let's say we have a band of 5 members playing 5 instruments plus some vocals. Even IF every instrument behaves as if it had dozens of independent wave changes, we will still have probably fewer than 1000 change functions, more likely on the order of 100-200. Maybe a big orchestra will generate 1000-2000 different functions (BUT again, most instruments behave quite similarly, so many of the functions would actually be identical).

In other words, many waves would change over time according to the same function. And I believe it would be possible to obtain a good (non-linear) prediction over 1 to 5 s.

Advanced Signal Processing

Reply #9
Quote
I believe that this is exactly the flaw in your argument. One CAN predict non-stationary signals.



Nonsense, at best this is quibbling over what "stationary" means.

Stationary does not mean "sine wave", it means literally ANYTHING that continues to have a given set of characteristics, which can be time-varying.

If you can "predict" something, it's more than merely stationary.

If I take a simple Markov process with r1 = 0.9, I can predict that, but only to the autocorrelation level.

What you are saying (involving exact prediction) goes well beyond that.
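To make that concrete, a minimal sketch of the Markov example (the 0.19 below is the irreducible fraction of prediction error):

Code:
import numpy as np

# First-order Markov (AR(1)) process with lag-1 correlation r1 = 0.9
rng = np.random.default_rng(0)
x = np.zeros(200_000)
for i in range(1, len(x)):
    x[i] = 0.9 * x[i - 1] + rng.standard_normal()

# The best linear predictor is 0.9 * previous sample. The error variance is
# (1 - 0.9**2) = 0.19 of the signal variance: prediction works only "to the
# autocorrelation level", never exactly.
err = x[1:] - 0.9 * x[:-1]
print(np.var(err) / np.var(x))   # ~0.19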

Again, people have used a variety of basis vectors (complete, incomplete, or overcomplete sets of them), and no, it doesn't work.

All you're doing is hyping a particular set of basis vectors. You appear to be completely ignoring the random components that exist in music, as well as the time duration of quasi-stationarity in music signals.

So, you're quibbling about what "stationary" means, and when you say you can predict "nonstationary" signals, you're simply saying that they were stationary in some sense after all.

You have made the claim here; I think it's time for you to go prove it. Show us your evidence that you can "predict" a series of brushed-cymbal sounds EXACTLY. Go fer it. When you're done with that, try Suzanne Vega's "Tom's Diner".

Quote
Indeed, EMD was developed especially with non-stationarity and non-linearity in mind. Prior to EMD, there were NO (real) methods to deal with those conditions.


Really? NO real methods? None at all? Really?

I wonder, then, what happened to all those papers I've read. Check out the way the guns on the Iowa-class battleships were aimed, even.

Please, go do some research before you do something like that again!
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #10
Quote
Nonsense, at best this is quibbling over what "stationary" means.


Work on the prediction of non-stationary systems [in general] was already being done in the late '50s and early '60s. Please read more carefully.

Quote
Really? NO real methods? None at all? Really?


All those methods had severe limitations; therefore, true, NOT real methods. [Even EMD, having NO mathematical representation, could be viewed as incomplete.]


This post was intended to introduce some new methods/ideas to the public. While I may have little idea about current audio compression algorithms, I never intended to personally devise such a compression algorithm.

Interested people should just read the references I provided and judge for themselves whether these new methods can be helpful. Therefore I see NO need to argue any further.

/* EDIT

A LAST LINK, this one from NASA about EMD: http://tco.gsfc.nasa.gov/HHT/index.html

EDIT */


Kind regards,

discoleo

Advanced Signal Processing

Reply #11
Quote
Work on the prediction of non-stationary systems [in general] was already being done in the late '50s and early '60s. Please read more carefully.

Please do not advise me on what to read. Work on predicting all manner of systems has proceeded for many years. Work on non-stationary systems, in general or in specific, did not stop after the early 1960s, as the algorithm in your cell phone shows most clearly.
Quote
All those methods had severe limitations; therefore, true, NOT real methods. [Even EMD, having NO mathematical representation, could be viewed as incomplete.]

I see; these unspecified methods from the 1960s had "severe limitations".

Now, for the "EMD" that you speak of,  please explain:

Is this an incomplete, complete, or overcomplete representation?

In what fashion can you demonstrate the completeness?

In what fashion does one proceed with the decomposition?

What length of history must one use, say, on a real-time signal from a live concert, in order to finish the encoding?
Quote
This post was intended to introduce some new methods/ideas to the public. While I may have little idea about current audio compression algorithms, I never intended to personally devise such a compression algorithm.

I see nothing particularly new here; I've read and reviewed papers using a variety of decomposition methods for many years.

What part of this particular analysis method do you believe is novel?

Why do you believe that it can predict part of a signal for seconds?

Perhaps for a bit of Bruckner, I suppose, but even then you'll be hard pressed to model the chaotic portion of the "stationary" part of the brass section, and of course that chaotic part is audible.

Any commonly used woodwind, for instance, has an unpredictable part that can to some extent be modeled by "FM and AM modulation" in the manner of the methods that Max Mathews, Hal Alles and others were using in the 1970s. Their method was far from complete or natural, and was based more on an ad-hoc modelling approach, but it was also intended for synthesis on 1970s digital hardware.
Quote
Interested people should just read the references I provided and judge for themselves whether these new methods can be helpful. Therefore I see NO need to argue any further.

You are advocating this method, but you have shown nothing in it that is new, and your claims for what it does seem rather, well, incomplete. In particular, you seem to have failed to even respond to the issues surrounding the decomposition of any signal with a chaotic or random part, or even to recognize that these issues were presented to you.
Quote
/* EDIT

A LAST LINK, this one from NASA about EMD: http://tco.gsfc.nasa.gov/HHT/index.html

EDIT */


Absolutely nothing on that cover sheet suggests that this work is particularly applicable to music coding, or that it can provide substantial coding gain.

If you wish to assert that it can, the burden of proof is on you to show us, here or via publication (the AES or the Signal Processing Society of the IEEE, for instance, would be quite willing to publish novel, substantiated work), that your assertions that predictions can carry on for seconds, etc., can in fact work TO REDUCE DATA RATE.

Speaking as one of the many people here who have analyzed audio signals: there are precious few audio signals that stay on the same note/chord/what-have-you for that long, let alone that keep the instruments in the same state for that long.

It would be enlightening for you to try this on a heavily harmonic pipe-organ note to start with. See how long you can predict the waveform (you did say noiseless, yes?) of a single sustained note.

Give it a try, why don't you?
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #12
It seems to me that what you are suggesting would be suitable for synthesis (wavetables, SAOL,...) more than for natural audio coding.

Advanced Signal Processing

Reply #13
Quote
It seems to me that what you are suggesting would be suitable for synthesis (wavetables, SAOL, ...) more than for natural audio coding.


Quite so. All of my questions about the compactness of representation have been ignored.
-----
J. D. (jj) Johnston

Advanced Signal Processing

Reply #14
What is EMD?

Maybe this technique is NOT well understood, so I will give it another try.

Quote
"... think of representing signals in terms of (both) amplitude and frequency modulated (AMFM) components such that: x(t) = SUM(j = 1 to k) Aj(t) * cos(PHIj(t))", where both the amplitude AND phi vary simultaneusly (depending on time).

"The rationale for such a modelling is to compactly encode possible nonstationarities in a time variation of the amplitudes and frequencies of Fourier-like modes. More generally, signals may also be generated by nonlinear systems for which oscillations are not necessarily associated with circular functions, thus suggesting decompositions of the form ..."



Why do I think that music can be compressed much more efficiently?

Let's suppose we have a classical concert. The whole piece can be stored in a few kB of musical score, and yet every competent orchestra will reproduce it almost identically. So obviously there is a great deal of redundant information in the music, i.e. the entropy is low.

This does NOT apply only to classical music. Actually, when I hear a guitar or bass, I hear mostly repetitive, very redundant sequences; the same goes for most other instruments. All pop, electronic, techno, metal, ... is composed of highly repetitive, redundant material.


I have NO intention of developing this algorithm myself, BUT I hope that really smart people will develop the right tools and techniques for better compression/encoding. And I believe that EMD will have a role to play, too.

 

Advanced Signal Processing

Reply #15
Quote
What is EMD?

Maybe this technique is NOT well understood, so I will give it another try.

Quote

"... think of representing signals in terms of (both) amplitude and frequency modulated (AMFM) components such that: x(t) = SUM(j = 1 to k) Aj(t) * cos(PHIj(t))", where both the amplitude AND phi vary simultaneusly (depending on time).

"The rationale for such a modelling is to compactly encode possible nonstationarities in a time variation of the amplitudes and frequencies of Fourier-like modes. More generally, signals may also be generated by nonlinear systems for which oscillations are not necessarily associated with circular functions, thus suggesting decompositions of the form ..."



Why do I think that music can be compressed much more efficiently?


This.

Is.

Not.

New.

I don't know how to say it any more clearly. The quote also begs the question of what "stationary" means, and completely avoids the issue of divergent chaotic systems in instruments.

Please address the questions about compactness of representation. That is key for coding.


Quote
Let's suppose we have a classical concert. The whole piece can be stored in a few kB of musical score, and yet every competent orchestra will reproduce it almost identically.


Since I have personally heard both Klaus Tennstedt and Zubin Mehta conduct the very same orchestra in the very same piece of music, and since the two performances were nothing even remotely like identical in any fashion, I can very comfortably say that your assertion above is purely, completely wrong.

Let us take two violins, playing the same notes, with the same timing:

1) Do you assert the waveforms sound the same?
2) Do you assert that the two performances sound the same?
3) Do you assert that you can code them both with the same bits?
-----
J. D. (jj) Johnston