
Most 'true' way to de-emphasize CD image

Reply #50
2. FLAC has a very imperfect method of expressing the filtering efficiently. In fact, it makes no special attempt to, since in the general case there is no utility in doing so, and even in this specific case, the prediction (to give a perfect result, recovering 8 zeroed LSBs) would have to be far more accurate than anything else it ever achieves.

It seems that you agree with my previously proposed explanation:
Quote from: knutinh
If the 8 LSBs can be "created" as a function of the last N input samples, and this function is held constant over a file, then a predictor looking for autocorrelation can in principle find that function (or the convolution of the original source spectrum and the filter). Once you have that function, you can make predictions and transmit only the prediction residue.

There are surely practical constraints that can prevent this (buffer sizes, processing power, truly random dither, etc.), but no one has elaborated on those.

So perhaps the linear prediction (somehow the words Yule-Walker flash in the back of my head?) is imperfect. Perhaps it predicts only the 8 MSBs. Perhaps it has too short a buffer. Perhaps it does only a partial search through candidate coefficients.
Quote
If you take the 24 bits with 8 zero LSBs, and add a DC offset of 000000000000000010101011 to it, thus breaking the "perfect" trick FLAC can use on those 8 zero LSBs, you'll probably find the compression differential between that and the filtered version narrows dramatically (unless FLAC is smart enough to remove it - I haven't tried). This is despite the DC offset carrying exactly 8 extra bits of information for the entire file over and above the 24 bits with 8 zero LSBs.

FLAC isn't a form of artificial intelligence trying to find the absolute smallest lossless representation of a given set of data, including reverse engineering any algorithms that may have "bloated" that data; it's just trying to do a reasonable job in a reasonable time.

I was thinking about that. If other trivially inserted datasets in the 8 LSBs also cause a large increase in file size, then it seems to suggest that FLAC simply isn't very clever with regard to the 8 LSBs of a 24-bit stream. There may be very good reasons why it isn't.
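The DC-offset trick quoted above is easy to check numerically. A minimal sketch (hypothetical sample values, Python purely for illustration): adding a constant to 24-bit samples whose 8 LSBs are zero gives every sample the same non-zero low byte, i.e. only 8 bits of extra information for the whole file.

```python
# Hypothetical illustration of the DC-offset trick: pad 16-bit samples
# to 24 bits (8 zero LSBs), then add a constant. Every sample ends up
# with the same non-zero low byte, so the file carries only 8 extra
# bits of information in total.
OFFSET = 0b10101011

padded = [s << 8 for s in (-100, 0, 742, 31000)]   # 16-bit values padded to 24 bits
shifted = [s + OFFSET for s in padded]

low_bytes = {s & 0xFF for s in shifted}
assert low_bytes == {OFFSET}   # one constant pattern, not 8 random bits per sample
```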

Throughout this discussion I have had a crude mental model of lossless codecs where they try to predict future samples from a historical buffer, transmitting only the (slowly varying) model + the prediction residue (further compressed using entropy coding). If this is indeed a valuable reference for this kind of high-level discussion, it really boils down to the statistics of the input signal and the capabilities of the predictor, doesn't it?
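That mental model can be sketched in a few lines. This uses FLAC's order-2 "fixed" predictor as the illustration (the synthetic signal is made up; a real encoder would also entropy-code the residual):

```python
# Sketch of the mental model: a lossless codec predicts each sample
# from previous ones and stores only the residual, which entropy
# coding can then represent compactly if the predictor fits well.
# FLAC's order-2 fixed predictor is pred = 2*x[n-1] - x[n-2].

def fixed_order2_residual(samples):
    """Residual of the order-2 fixed predictor."""
    res = list(samples[:2])          # warm-up samples stored verbatim
    for n in range(2, len(samples)):
        pred = 2 * samples[n - 1] - samples[n - 2]
        res.append(samples[n] - pred)
    return res

def restore(res):
    """Exact inverse: rebuild the samples from the residual stream."""
    out = list(res[:2])
    for n in range(2, len(res)):
        out.append(res[n] + 2 * out[n - 1] - out[n - 2])
    return out

sig = [round(1000 * (n * n) / 100) for n in range(50)]   # smooth ramp
res = fixed_order2_residual(sig)
assert restore(res) == sig                               # lossless round trip
assert max(map(abs, res[2:])) < max(map(abs, sig))       # residuals shrink
```

On a smooth signal the residuals are tiny compared to the samples; on noise-like content (such as dithered LSBs) they are not, which is the crux of this thread.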

-k


Reply #51
This does not make sense. A filter does not create information (look up Shannon).


That's true under the condition that we have machines operating on real numbers, which we don't. Instead we have samples of finite width which, when processed, can lead to new samples of theoretically infinite width, which we represent by rounding.


I fail to see how this should lead to dramatically worse performance. If rounding appears as uniformly random, it is a noise component which cannot be compressed, fair enough -- but are you altering the 24-16=8 bits? No. (Are you ever rounding off more than 1 LSB? That's not a rhetorical question; I don't know the answer.) And the finite-word-length constraint merely means that there are more functions fitting, modulo the rounding.

So, in principle, it should be possible to compress to something fairly near the original size (17/16 of it if round-off and dither are in the LSB only, right?).



Now my original inquiry was due to the observation that files became > 50% larger -- up to 70%, actually. That corresponds to (more than!) a full 24-bit recording (27 bits in the 70% case).
Now assume you have
(A) 24 bits of music (resp. 27), FLACed
vs
(B) the file in (A), cropped down to 16 bits (discarding 1/3 of the information), with a little dither added, a filter applied (adding no information, although represented in a 24-bit file) and rounded off. Then you FLAC it.

Is it at all reasonable that file (B) should be as large as file (A)?


Reply #52
Yes, but the design is such that it works "best" with the vast majority of content out there - which means it might not do very well on a given special case. The predictor looks at the sample values as a whole (i.e. all 24 bits) - it's never looking at the 8 LSBs specifically (or the 16 MSBs!) and seeing if they can be predicted from anything.

Plus I've read (don't know if it's true) that the design of FLAC isn't especially focussed on 24/96 audio. Certainly it's a practical implementation of lossless coding - it's not meant to reach for perfection in terms of compression ratio.

FWIW, IIRC, the WavPack author said that audio which had been created from another source (e.g. by simple normalisation, or I suppose by filtering - I think the specific case he mentioned was normalisation without dither, which meant certain sample values would be completely unused), where that other source would have required fewer bits to store, wasn't unheard of - but he didn't think it was worth building something to reverse-engineer the transformation. Sometimes it would noticeably reduce the bitrate, but the encoding effort would slow things down too much. (With apologies to David if I've mis-remembered this! I searched but couldn't find the quote.)

Cheers,
David.


Reply #53
(Are you ever rounding off more than 1 LSB? That's not a retorical question, I don't know the answer.)


Rounding, in the strict sense of the word*, only affects the 1 LSB. But when you process 16-bit values at 32- or 64-bit precision, you get true 32- and 64-bit results. Storing them in 24 bits will give 8 unique bits over 16 bits, not some trivially correlated pattern.
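To see why those 8 extra bits come out non-trivial, here is a minimal sketch (a made-up one-pole filter and made-up sample values, not the actual de-emphasis curve): filtering 16-bit samples at double precision and storing the rounded result in a 24-bit range generally fills the low byte with values that follow no simple pattern.

```python
# Illustration (hypothetical one-pole IIR, not the real CD de-emphasis
# filter): processing 16-bit samples at float precision and storing
# the result in 24 bits fills the 8 extra LSBs with non-trivial values.

def filter_to_24bit(samples16, a=0.6):
    out = []
    prev = 0.0
    for s in samples16:
        prev = s + a * prev            # one-pole IIR at double precision
        out.append(int(round(prev * 256)))   # scale into the 24-bit range
    return out

src = [100, -250, 3000, -1234, 567, 89]
filtered = filter_to_24bit(src)
low_bytes = [v & 0xFF for v in filtered]
assert any(b != 0 for b in low_bytes)  # the 8 LSBs are generally non-zero
```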

The thing is, you guys asked for very special-case handling and why the predictor isn't able to detect the pattern automagically. It is just too special to scan for specifically. We also can't use a universal predictor covering all possible cases: a complete implementation would probably itself be unpredictable (halting problem), or at least have impractical exponential costs. As 2Bdecided put it very nicely: "FLAC isn't a form of artificial intelligence"

That FLAC has considerably worse performance at 24 bit than at 16 bit is true, but that is not necessarily related to its effectiveness in the "reconstruct the n LSBs as a function of the l-n MSBs in the last x samples" special case.

*Instead of saying to 'round' a 32 bit value to 24 bit, as I may have done, the term 'word length reduction' would have been more appropriate.


Reply #54
I guess that whatever per-channel correlation there is in a music file is generally mainly to be found in the 8 MSBs (no matter what the bit depth happens to be)?

-k


Reply #55
Why exactly 8?


Reply #56
Rounding, in the strict sense of the word*, only affects the 1 LSB.
go on, think that one through again!

(or round 8999.9 to the nearest integer and see how many of the digits change).


You could just say "truncate". Apart from this special case, it's often just an academic difference with audio, since rounding = add 0.5 then truncate. A 0.5 LSB DC offset is usually irrelevant - except here I suppose.


Back to the original topic, if you want a lossless but efficient method, I'm sure it's possible to calculate how many bits you need to store optimal 16-bit + de-emph without creating a detectable difference. I bet 20 is enough, in which case, dump the last four bits. I'm not even going to say dither, though I suppose you could. Either way, it'll bring the FLAC bitrate down.
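"Dump the last four bits" is just a bit mask. A minimal sketch, assuming plain truncation with no dither (dither at the 20-bit level would be a separate, optional step):

```python
# Word-length reduction from 24 to 20 bits by zeroing the 4 LSBs.
def truncate_24_to_20(sample):
    return sample & ~0xF          # clear the bottom 4 bits

assert truncate_24_to_20(0x123457) == 0x123450
assert truncate_24_to_20(0x123450) == 0x123450   # already aligned: unchanged
```

The resulting runs of zero LSBs are exactly the kind of structure FLAC exploits directly via its "wasted bits" handling, which is why this brings the bitrate down.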

You could say that 20 isn't as good as 24. True. 24 isn't as good as 32. 32 isn't as good as 64. But you have to stop somewhere. 16 is already enough IMO, but if you're using lossless you might want some extra headroom, no matter how irrational or at least unimportant.

Cheers,
David.


Reply #57
Why exactly 8?

Could just as well say 1 or 15. The further into the noise/dither floor you are trying to predict, the harder it is, right?

-k


Reply #58
go on, think that one through again!


You're right. Rounding can affect all digits of a value's representation.

However, rounding never affects the actual, represented value by more than what the LSB represents.



Reply #59
Why exactly 8?

Could just as well say 1 or 15. The further into the noise/dither floor you are trying to predict, the harder it is, right?


That's certainly a point (with a reservation below), if it is true. It is an easy test (which takes some time): gather a test corpus of 16-bit recordings (or, if you can find one, a representative test corpus of 24-bit recordings). Truncate down by 1 bit, 2 bits, ..., down to some practical bound (8 is a round figure? Maybe even 4). Compress. Plot compressed file size as a function of bits. Does it look linear?

(One could do the same with dithering, but that should simply shift the effective number of bits, right? 16 bits dithered down to 15 would be roughly equivalent -- in information content -- to somewhere between 15 and 16 of the original 16? So one would expect one bit of difference between 16-to-15-with-dithering and 16-to-14-with-dithering?)
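The proposed experiment can be sketched in miniature, with zlib standing in for a real lossless audio codec and a synthetic tone-plus-noise signal standing in for the corpus (purely illustrative, not the corpus test itself):

```python
# Miniature version of the truncation experiment: truncate a 16-bit
# signal by n LSBs and watch the compressed size fall as n grows.
# zlib is only a stand-in for a real lossless audio codec.
import math
import random
import struct
import zlib

random.seed(0)
samples = [int(10000 * math.sin(0.05 * i)) + random.randint(-512, 511)
           for i in range(20000)]          # tone plus ~10 bits of noise

sizes = {}
for n in range(0, 9):                      # truncate 0..8 LSBs
    mask = ~((1 << n) - 1)
    data = b"".join(struct.pack("<h", s & mask) for s in samples)
    sizes[n] = len(zlib.compress(data, 9))

assert sizes[8] < sizes[4] < sizes[0]      # fewer bits -> smaller file
```

Whether the curve is linear in n is exactly the question; near the noise floor each truncated bit removes close to one bit per sample of incompressible entropy.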


But then the reservation: If the file is originally 16 bits, and becomes 24 bits just by padding with zeroes and applying a filter, then the last 8 bits are not in the noise floor, are they?


Reply #60
FWIW, I checked this with multiple lossless encoders. No surprises in terms of relative performance -- relative file sizes about what you would expect in a 16-bit test. So all compress about equally (relative to their assumed quality) badly -- they all returned file sizes greater than the original 16-bit WAV.

I.e.: with the de-emphasis algorithm and a single flag indicating its application, even WAV -- without any compression whatsoever -- would outperform today's state-of-the-art lossless algorithms (on this fairly hard-rock-ish test corpus, that is). That's the state of today's art.


Procedure:
The 16-bit files were converted to 24-bit de-emphasized files by SoX. Each album was converted to one single WAV file by foobar2000. Test corpus with 24-bit file sizes below; total 11 286 552 576 bytes on disk (11 286 517 144 bytes of audio according to foobar2000 -- strange, since it is not divisible by 3 ...).
Then each WAV file was compressed with a range of encoders (Monkey's Audio and TAK via their GUI applications, ofr.exe via CLI, while foobar2000 handled FLAC (1.2.1) and WavPack -- the latter unbearably slow, taking hours). In order of file size:

7 761 846 272 bytes - flac -8
7 672 954 880 bytes - WavPack high, x5
7 629 369 344 bytes - Monkey's extra high
7 582 380 032 bytes - TAK -p4
7 533 199 360 bytes - ofr extranew

For comparison, the original 16-bit size approximated by removing 1/3:
11 286 552 576 bytes * 2/3 = 7 524 368 384 bytes (file size)
11 286 517 144 bytes * 2/3 = 7 524 344 763 bytes (audio size)


Test corpus with 24-bit wav filesizes:


  779 289 380 Backstreet Girls - Boogie Till You Puke
  607 080 644 Black Sabbath - Black Sabbath [Castle orig
  679 535 180 Black Sabbath - Black Sabbath, Vol. 4 [Castle orig
  635 569 244 Black Sabbath - Master of Reality [Castle orig
  704 629 844 Carnivore - Retaliation
1 099 159 028 Ebba Grön - Ebba Grön, 1978-1982
  762 788 924 In Slaughter Natives - Enter Now the World
  814 826 924 Leonard Bernstein - West Side Story
  938 924 324 Lifelover - Konkurs
  992 003 084 MZ.412 - Burning the Temple of God
  933 198 380 MZ.412 - In Nomine Dei Nostri Satanas Luciferi Excelsi
  803 773 700 Ordo Equilibrio - Reaping the Fallen...The First Harvest
  864 977 444 Raison d'Etre - Prospectus I
  670 761 044 Roger Waters - The Pros and Cons of Hitch Hiking


Reply #61
For reference:
7 761 846 272 bytes: flac -8 (24 bits)      
4 043 931 648 bytes: same, but set to 16-bit output (no dithering).

The first 16 bits take up 52.1% of the file size; the last 8 bits (probably with dithering?) take 47.9%.


Reply #62
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much.

Close?


Reply #63
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much.

Close?


No, it dictates application of a certain EQ curve which attenuates the treble.


Your description fits ReplayGain, which exists for file formats like FLAC and WavPack.


Reply #64
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much.

Close?


No, it dictates application of a certain EQ curve which attenuates the treble.


Ok, still not understanding it. Any recommended reading?



Reply #66
I haven't found much info on emphasis. From what I understand, it's some kind of tag on the audio file that tells the CD amplifier to amplify this song by this much.

Close?


No, it dictates application of a certain EQ curve which attenuates the treble.


Ok, still not understanding it. Any recommended reading?


Pre-emphasis and de-emphasis have been around a long time and are used in FM radio, analog TV audio, analog tape and LPs. The theory is that since the high-frequency components are typically lower in amplitude than the lows and mids, we can use a little of the unused real estate by boosting the highs on the transmit/record side and lowering them at receive/playback. Any noise introduced after pre-emphasis will be attenuated along with the excessive highs. This restores the response and reduces the noise.

When CDs were introduced, some thought pre-emphasis would be a good idea to reduce quantizing noise, and while a few discs were made with pre-emphasis, most were not, and for many years, none. On the CD, the pre-emphasis is an analog high boost ahead of the A-D converter. This also sets a flag bit on the CD to activate the filter during playback if needed. During playback, an analog filter after the DAC restores the response and reduces the quantizing errors. If you extract the digital audio from the disc during a rip session, you now have boosted highs but no analog filter to correct them. In theory the analog filter after the DAC is best, but in practice digitally processing the stream is certainly acceptable, and possibly more accurate, since the digital filter is not subject to 5% or even 1% component tolerances. They tell me SoX works well, and I'm happy with the CoolEdit/Audition filter settings I used - all of 2 times.
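For the digital route, the curve in question is the 50/15 µs shelf. A rough sketch of how first-order coefficients could be derived from those time constants via the bilinear transform (illustrative only; production de-emphasis filters, e.g. in SoX, are matched to the analog curve more carefully):

```python
# First-order digital de-emphasis shelf from the CD time constants
# (t1 = 50 us, t2 = 15 us) via the bilinear transform. Sketch only.

def deemph_coeffs(fs=44100.0, t1=50e-6, t2=15e-6):
    # Analog prototype H(s) = (1 + s*t2) / (1 + s*t1):
    # unity gain at DC, gain t2/t1 (about -10.5 dB) at high frequencies.
    k = 2.0 * fs
    b0 = (1 + k * t2) / (1 + k * t1)
    b1 = (1 - k * t2) / (1 + k * t1)
    a1 = (1 - k * t1) / (1 + k * t1)
    return b0, b1, a1

def deemphasize(samples, fs=44100.0):
    b0, b1, a1 = deemph_coeffs(fs)
    out, x1, y1 = [], 0.0, 0.0
    for x in samples:
        y = b0 * x + b1 * x1 - a1 * y1   # direct-form I, first order
        out.append(y)
        x1, y1 = x, y
    return out

b0, b1, a1 = deemph_coeffs()
assert abs((b0 + b1) / (1 + a1) - 1.0) < 1e-12          # DC gain = 1
assert abs((b0 - b1) / (1 - a1) - 15.0 / 50.0) < 1e-12  # Nyquist gain = t2/t1
```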




Reply #68
[misread]


Reply #69
I looked through the main topics under technical but didn't find anything.

It can be found under Signal Processing, which is under Technical, though I found it simply by typing "preemphasis" in the search field (or "pre-emphasis"; it doesn't matter either way).


Reply #70
Are they still making CDs with pre-emphasis [...]?

Yes, unfortunately. My most recent CD purchase: http://www.discogs.com/Lifelover-Konkurs/release/1513652 from 2008.


Argh. Cthulhu has risen from R'lyeh again, as reported here: http://forum.dbpoweramp.com/showthread.php...ll=1#post121273 .

The current 'newest pre-emphasis CD' to my knowledge is http://www.discogs.com/Marc-Almond-Michael...release/3066984 , released June 2011.


Reply #71
Please excuse my posting in this old topic. I did not find any additional information on this, so I will try to ask here.
@Porcus, it would be interesting to know the outcome of your investigation into how to properly de-emphasize a CD image using SoX.
Would you recommend up-converting to 24 bits before de-emphasizing with SoX? Was anyone able to find out whether SoX does its internal math at a higher bit depth, so that a 16-bit to 24-bit up-conversion would not be required?



Reply #72
Don't remember. I ended up deleting my de-emph'ed files and retrieving my CD rips when a way arrived to de-emphasize on-the-fly upon playback with foobar2000 (the foo_deemph component).

(And I keep them as WavPack; I wanted to distinguish them by file format, should I accidentally lose the PRE_EMPHASIS tag - there I got an excuse for calling myself a WavPack user. They are not so cleanly distinguished now that musicians have started to put 32-bit floating-point PCM from their workstations directly out on the web (like one song on this EP and one song off Soundcloud here), and I'd rather just keep those as I downloaded them - that means WavPack or OptimFrog, which in practice means WavPack. But I digress.)

 


Reply #73
@Porcus, thanks for clearing things up.
I will keep de-emphasizing with SoX, with no 24-bit up-conversion beforehand, and burn the results to CD, as my CD player (a Cambridge Azure 840C) does not perform the de-emphasis procedure.