
Question about lossless audio compression algorithms

For lossless formats, do their algorithms compare the audio from the left and right channels? Most of the time (at least for pop music), the left and right channels are identical (correct me if I'm wrong). If they're the same, you could just encode one channel and apply it to the other.

Is this method of encoding audio possible?

Most music, from pop to classical, is repetitive, from pop choruses to literally repeating melodies in classical music. Is it possible to make audio compression two-pass? Let the first pass analyze the track and pick out the repetitive parts, and let the second pass do the encoding, encoding each repetitive part only once and reusing it wherever it is needed.

Reply #1
I think that neither L & R nor repetitive parts are ever perfectly identical, but AFAIR the algorithms can take advantage of the L/R similarity.
Ceterum censeo, there should be an "%is_stop_after_current%".

Reply #2
You first describe a process vaguely akin to joint stereo. I don't know where to begin with the fundamental misunderstandings in your last paragraph.

Reply #3
For lossless formats, do their algorithms compare the audio from the left and right channels? Most of the time (at least for pop music), the left and right channels are identical (correct me if I'm wrong). If they're the same, you could just encode one channel and apply it to the other.

Is this method of encoding audio possible?


Yes, lossless encoders routinely use such an approach. It's usually called M/S (mid/side, aka "sum and difference" channels) coding, which is sometimes also called "joint stereo". It's just that the left and right channels are almost never truly identical. But they are often similar, and then a lossless encoder will use the M/S representation instead of L/R.
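As a rough illustration of how an encoder might pick a representation per block (my own sketch, not any particular encoder's code), compare a crude bit-cost proxy for the two representations and keep the cheaper one:

```python
def best_stereo_mode(left, right):
    """Pick L/R or M/S for one block of samples, using total sample
    magnitude as a crude stand-in for a real bit-cost estimate."""
    side = [l - r for l, r in zip(left, right)]
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    cost_lr = sum(abs(s) for s in left) + sum(abs(s) for s in right)
    cost_ms = sum(abs(s) for s in mid) + sum(abs(s) for s in side)
    return "mid/side" if cost_ms < cost_lr else "left/right"
```

With identical channels the side signal is all zeros and M/S wins easily; with out-of-phase channels the side signal is large and the function falls back to L/R. Real encoders estimate cost after prediction, not on raw samples, so this is only a toy.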

Most music, from pop to classical, is repetitive, from pop choruses to literally repeating melodies in classical music. Is it possible to make audio compression two-pass? Let the first pass analyze the track and pick out the repetitive parts, and let the second pass do the encoding, encoding each repetitive part only once and reusing it wherever it is needed.

Existing lossless encoders are not able to do that.
It is an extremely difficult task, because what sounds to us like a repetitive musical passage is not strictly, mathematically repetitive to the encoder. There is always some ambient noise, and a human player can never play the same passage exactly the same way twice. To find such repetitive passages within the audio, some kind of "pattern recognition" algorithm is needed, and that is very difficult. It is somewhat similar to the problem of converting ordinary audio to MIDI: there has been some success in that direction, but it is still far from perfect, and it cannot be used on arbitrary audio sources.
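The "not mathematically repetitive" point is easy to demonstrate (a toy illustration of my own, with ±1-sample noise standing in for room noise and converter jitter):

```python
import random

random.seed(0)

# one "passage" of 16-bit-range samples, and a second take of the same
# passage with just +/-1 of added noise
passage = [random.randint(-32768, 32767) for _ in range(1000)]
second_take = [s + random.choice([-1, 1]) for s in passage]

# the two takes would sound identical, yet no sample matches exactly,
# so an "encode the repeat once" scheme finds nothing to reuse at the
# byte level
exact_matches = sum(a == b for a, b in zip(passage, second_take))
```

Even noise at the level of a single least-significant bit defeats exact matching, which is why a real solution would need perceptual pattern recognition rather than byte comparison.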

Reply #4
Well, I'll give it a try...

First of all, just because two passages SOUND the same doesn't mean they ARE the same. Many factors make such a thing practically impossible. One is resampling: say the original master is 192000 Hz and will be downsampled to the CD spec of 44100 Hz. Even if a chorus were simply copied and pasted (which is highly unlikely), if the copy is shifted by just a few samples at 192000 Hz, the result after downsampling to 44100 Hz will differ from the first downsample.

However, your idea could be applied to lossy encoding, but it would probably require very long encode times and offer very little advantage in overall file size.
Can't wait for a HD-AAC encoder :P

Reply #5
However, your idea could be applied to lossy encoding, but it would probably require very long encode times and offer very little advantage in overall file size.


Why very little?

Since it is lossy compression, you could probably just "copy and paste", since people are not going to hear the difference anyway.

BTW, I do think choruses are copied and pasted. In some songs, Flo-Rida's "Low" for example, it would be very difficult to sing the chorus perfectly four times throughout the song.

Reply #6
Because we're only talking about the chorus, and more often than not the repeats differ slightly anyway. Why go to that trouble to save at most 500 kB per song? And it would require a new lossy codec that is not compatible with anything. I'd say it's a lose-lose situation.
Can't wait for a HD-AAC encoder :P

Reply #7
Quote
Most of the time (at least for pop music), the left and right channels are identical (correct me if I'm wrong).
You are WRONG!  

Here's something you can try...  With an audio editor you can subtract the left and right channels.  When you do this, you will find that there is virtually never any silence (unless you have a "mono" file*).  You can do this by inverting one channel and mixing the left and right channels together.  Or, some (most?) audio editors have a "vocal elimination" filter that works by subtracting left from right to remove the "center channel" audio.  Audacity (free) has a Vocal Remover filter.

With an audio editor, you can create a L-R file and a L+R file, then add/subtract to re-create the original stereo file, but I don't know if this helps with lossless compression.  And, you'd have to take care to avoid rounding errors...  It's going to take more than 16 bits to losslessly store the sum & difference of a 16-bit file!  (Even the difference can go over 16 bits if the channels happen to be out-of-phase at some point in time.)
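The "more than 16 bits" point can be checked with simple arithmetic on the 16-bit sample range (my own illustration of the claim above):

```python
# 16-bit PCM samples lie in the signed range -32768..32767
lo, hi = -32768, 32767

sum_range = (lo + lo, hi + hi)     # (-65536, 65534): in-phase full scale
diff_range = (lo - hi, hi - lo)    # (-65535, 65535): fully out of phase

# both ranges spill well outside -32768..32767, so storing the raw sum
# or difference losslessly needs a 17-bit signed value, not 16
overflow = sum_range[0] < lo and diff_range[1] > hi
```

So a naive sum/difference file genuinely cannot live in 16-bit containers without wrapping or extra headroom.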


Quote
Is it possible to make audio compression be two pass. Let the first pass to analyze and pick out the parts that are repetitive, and the second pass do the encoding, and only encode the repetitive parts once, and apply it to where ever it needs it.
If the whole chorus is copied and pasted, you could take advantage of that.  But repetitive parts are generally not "digitally identical".  Rap and pop producers often use "loops", but they usually loop the instruments separately, and when mixed there will be digital differences.  I'd guess you'd have to search through several hundred songs before you found one with even a few seconds of digitally repeated data.

If you record (through an analog-to-digital converter) an identical sound twice, the digital files will be different.  This is because you are sampling the analog signal, and you sample different points on the analog wave each time you digitize...  The overall wave shape will be the same, the sound will be the same, but all of the bytes in the file will be different!




*CDs are always 2-channel.  If you can find a mono CD, both channels will be identical, and if you subtract them you will get silence!

Reply #8
Exploiting long-term correlations (longer than a few transform blocks for transform codecs, i.e. a few thousand samples) isn't really computationally feasible.  The encoding process would take too long to be worthwhile, and the decoder complexity and memory requirements would be prohibitive for anything but modern PC processors.  Scanning ahead seconds or minutes in a track is just crazy.

Reply #9
With an audio editor, you can create a L-R file and a L+R file, then add/subtract to re-create the original stereo file, but I don't know if this helps with lossless compression.  And, you'd have to take care to avoid rounding errors...  It's going to take more than 16 bits to losslessly store the sum & difference of a 16-bit file!  (Even the difference can go over 16 bits if the channels happen to be out-of-phase at some point in time.)

Technically, both the sum and difference would require 17 bits to be fully represented. However, (and this is what I did in very early versions of WavPack), you can simply truncate both to 16-bit and the result will still be bit-perfect (you have to wrap though, not clip!) The disadvantage of this is that there will sometimes be discontinuities which screw up the predictors.

The other trick I noticed early on is that the sum and difference of two numbers will always have the same LSB, so you can also right-shift one of the results and still not lose any information. If you right-shift the sum, it becomes basically the average (and that one can’t clip). Therefore, what you end up with is an average that is (obviously) about the same magnitude as the individual channels, and a difference which can be pretty much anywhere.
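The shared-LSB trick described above can be sketched in a few lines (my own Python illustration, not WavPack's actual code):

```python
def encode_ms(l, r):
    """Mid/side transform: the side keeps the full difference, and the
    mid's shift drops one bit -- exactly the bit that side & 1 preserves,
    since (l + r) and (l - r) always share the same LSB."""
    side = l - r
    mid = (l + r) >> 1          # arithmetic shift = floor division by 2
    return mid, side

def decode_ms(mid, side):
    """Exact inverse: restore the dropped LSB from the side channel."""
    s = (mid << 1) | (side & 1)  # reconstruct l + r exactly
    l = (s + side) >> 1
    r = (s - side) >> 1
    return l, r
```

The mid channel stays the same magnitude as the inputs (it is the average, so it cannot clip), while the side channel is small whenever the channels are correlated, which is where the compression gain comes from.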

In some popular music with a strong vocal mixed exactly to center, the difference signal can be significantly lower than the average and you get a nice lossless compression advantage. In other situations, like music recorded in live spaces with a pair of widely spaced microphones, there will be little short-term correlation between channels and joint stereo will degrade the compression. Overall, it usually helps more than hurts, and it is the default in WavPack to use it.

Reply #10
I just did an experiment...  I took a ~2.5 minute 44.1kHz stereo song file and made 3 FLACs: a regular stereo file, a mono file, and a stereo file with identical data in both channels.  I don't know how it works (probably M/S), but the FLAC encoder is "smart enough" to figure out that there is redundant data in the "2-channel mono" file.

Mono FLAC ~6.2MB
True stereo FLAC ~13.4MB
Stereo FLAC with identical channels ~6.2MB

Reply #11
I just did an experiment...  I took a ~2.5 minute 44.1kHz stereo song file and made 3 FLACs: a regular stereo file, a mono file, and a stereo file with identical data in both channels.  I don't know how it works (probably M/S), but the FLAC encoder is "smart enough" to figure out that there is redundant data in the "2-channel mono" file.


That's just joint stereo at work. It means that the mid signal is the only one with data and the side channel is all 0's.

All 0's is easily compressible.
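Just how compressible an all-zeros side channel is can be shown with any general-purpose compressor (zlib here is purely a stand-in for a lossless codec's residual coder, not what FLAC actually uses):

```python
import zlib

# one second of 16-bit mono silence: the side channel of a
# "2-channel mono" file
one_second = bytes(2 * 44100)
packed = zlib.compress(one_second, 9)

# the silent channel collapses to a tiny fraction of its raw size
ratio = len(packed) / len(one_second)
```

This is consistent with the experiment above: the identical-channels FLAC comes out essentially the same size as the true mono file, because the side channel costs almost nothing to store.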

Reply #12
Note, though, that many CDs with true mono content use different dither or noise shaping in each channel, destroying the 100% correlation between the two channels. In such situations it might behoove you to sum to mono yourself and save the result as 24-bit to keep the extra bit of resolution around (while still getting a lower bitrate than for stereo).

Reply #13
Most music, from pop to classical, is repetitive, from pop choruses to literally repeating melodies in classical music. Is it possible to make audio compression two-pass? Let the first pass analyze the track and pick out the repetitive parts, and let the second pass do the encoding, encoding each repetitive part only once and reusing it wherever it is needed.

There were a couple of papers on this technique, and I learned just today that it is actually patent-pending. The proposed method, however, doesn't look too promising in terms of practical implementation. The authors talk about breaking the track into fixed-size segments... good luck with recordings of live performers. Even for electronic music, there will be sampling differences, as discussed above.

Edit: It appears from the patent application's preliminary report (page 4) that the compression notion is already claimed by Fuji Xerox Co., Ltd.