HydrogenAudio

Hosted Forums => foobar2000 => General - (fb2k) => Topic started by: Nintendo Maniac 64 on 2014-04-08 09:38:47

Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 09:38:47
Basically the idea is to apply ReplayGain to the waveform with 24bit precision when exporting to a 24bit file rather than Foobar calculating at 32bit floating point and then downsampling to 24bit, thereby avoiding the issue of dither or aliasing in the resulting audio file (particularly if the source was 16bit).
Title: 24bit ReplayGain on waveform file conversion?
Post by: Case on 2014-04-08 14:36:31
Dithering is optional. The error that altering the loudness causes is quantization error. If the math was for some reason done with artificially limited precision, you would only get bigger distortion.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 19:33:18
Dithering is optional.

...which is why I said "dither or aliasing".

The error that altering the loudness causes is quantization error. If the math was for some reason done with artificially limited precision, you would only get bigger distortion.

But when the source is 16bit, you're already getting quite a massive increase in precision going from 16bit to 24bit.  Are you saying that you would still end up with quantization errors even when going from 16bit to 24bit?
Title: 24bit ReplayGain on waveform file conversion?
Post by: lvqcl on 2014-04-08 20:16:19
32bit -> 24-bit conversion is not "downsampling", it's "bit depth reduction". So no aliasing occurs.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 20:19:51
32bit -> 24-bit conversion is not "downsampling", it's "bit depth reduction"

That is what I meant, I just did not know the proper term.

And no aliasing occurs?  I thought this to originally be the case but I could have sworn that I saw aliasing when I did a spectrum analysis; I guess I'll have to double/triple/quadruple-check that...
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 20:32:10
...which is why I said "dither or aliasing".
What has aliasing got to do with it? People seem to use this as a catch-all buzzword for every vaguely defined artefact, whether or not it has any relevance to the situation. As well as it being a misnomer for imaging in many cases, it also gets used wrongly to describe quantisation noise. This seems to be another example of one of these or even both.

Quote
But when the source is 16bit, you're already getting quite a massive increase in precision going from 16bit to 24bit.  Are you saying that you would still end up with quantization errors even when going from 16bit to 24bit?
We’re talking about binary numbers here, powers of 2. Increasing bit-depth by even 1 bit can be done in a mathematically lossless way, simply by multiplying the linear, fixed-point sampling value (accounting for the sign bit if necessary) by 2 ^ added bits of resolution, a.k.a. padding with zeroes/shifting left. Quantisation noise simply does not apply. If you observe that, or imaging, whatever you used to perform the upsampling and/or to render the spectrograph are hopeless at their jobs.

Dithering at the new higher bit-depth won’t achieve anything except adding more, albeit quieter, noise; any quantisation distortion from the original 16-bit stage, probably inaudible, is already ‘burned in’. Dither is for downsampling – to forestall introducing inharmonic quantisation noise by replacing it with less grating random noise – not for upsampling.
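
A minimal sketch of the zero-padding argument above, assuming signed integer PCM samples. The function names are hypothetical and this is not taken from any particular converter; it only shows that widening 16 bits to 24 bits by shifting is exactly reversible.

```python
def pad_16_to_24(sample_16: int) -> int:
    """Widen a signed 16-bit sample to 24 bits by shifting left 8 bits."""
    return sample_16 << 8  # same as multiplying by 2 ** 8


def truncate_24_to_16(sample_24: int) -> int:
    """Drop the 8 low bits again (Python's >> keeps the sign)."""
    return sample_24 >> 8


# Check every possible 16-bit value for a clean round trip.
all_16_bit = range(-32768, 32768)
lossless = all(truncate_24_to_16(pad_16_to_24(s)) == s for s in all_16_bit)
print(lossless)
```

Because the 8 added low bits are always zero, nothing needs rounding on the way up, so there is no quantisation step for dither to mask.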
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 20:36:11
Dither is for downsampling – to forestall introducing inharmonic quantisation noise by replacing it with less grating random noise – not for upsampling.

The issue was that applying ReplayGain to the waveform of a 16bit file and exporting to 24bit actually results in the following occurring:
16bit -> 32float -> 24bit.

Now as was stated, there shouldn't be any artifacts going from 32float to 24bit, but nevertheless there is bit depth reduction going on here.
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 20:52:26
Quote
The issue was that applying ReplayGain to the waveform of a 16bit file and exporting to 24bit actually results in the following occurring:
16bit -> 32float -> 24bit.
Good!

DSP at a higher depth followed by downsampling is statistically less generative of errors than DSP unnecessarily locked to a lower depth throughout.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 20:54:34
DSP at a higher depth followed by downsampling is statistically less generative of errors than DSP unnecessarily locked to a lower depth throughout.

Even when the source is 16bit and the result is 24bit?  Wouldn't that be way more than enough headroom for gain adjustments?
Title: 24bit ReplayGain on waveform file conversion?
Post by: lvqcl on 2014-04-08 21:01:28
I don't understand why you think that 16->24->(gain adjustment)->24 is better than 16->32f->(gain adjustment)->32f->24.

Do you think that there will be no truncation/dithering without intermediate 32bit float?
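
To make the question concrete, here is a rough sketch of both routes. Assumptions: the gain value is made up, the function names are hypothetical, and Python's 64-bit floats stand in for the 32-bit float pipeline, so this is not a model of foobar2000's actual converter. It only illustrates that the scaled value almost never lands on a 24-bit step, so a rounding (or dithering) step is needed whichever route is taken.

```python
GAIN_DB = -6.5                      # hypothetical ReplayGain value
GAIN = 10 ** (GAIN_DB / 20)         # dB -> linear scale factor


def via_float(sample_16: int) -> int:
    """16-bit int -> float -> gain -> rounded to a 24-bit integer (no dither)."""
    x = sample_16 / 32768.0
    return round(x * GAIN * 8388608.0)


def via_int24(sample_16: int) -> int:
    """16-bit int -> zero-padded 24-bit int -> gain -> rounded back to an integer."""
    return round((sample_16 << 8) * GAIN)


samples = range(-32768, 32768)
# How many scaled samples land exactly on a 24-bit step without any rounding?
exact = sum(1 for s in samples if ((s << 8) * GAIN).is_integer())
# Largest disagreement between the two routes, measured in 24-bit steps.
worst = max(abs(via_float(s) - via_int24(s)) for s in samples)
print(exact, worst)
```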
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 21:13:50
EDIT: Hang on, editing my post.
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 21:21:41
Quote
Even when the source is 16bit and the result is 24bit? Wouldn't that be way more than enough headroom for gain adjustments?
So now we’re talking about yet another different buzzword? Headroom is at best tangentially relevant to ReplayGain and is definitely not relevant to the ideas being propounded here. Meanwhile, I’m still waiting for a valid, reasoned explanation of why two basic tenets of binary mathematics – that higher-depth signals can losslessly encode lower-depth ones and allow processing with less incurred noise/distortion – are false.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 21:35:35
So now we’re talking about yet another different buzzword?

I apologize, I should have put a disclaimer in my first post stating that I'm not intricately familiar with the technical terminology of audio.  I'm familiar with the concepts themselves, just not their names - this was displayed above with the case of "bit depth reduction".
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 21:49:50
With the best intentions, I can only recommend abstaining from using terms if you’re unsure of what they mean. Use a clumsy literal explanation if you have to! It’s better than inaccurately using a defined term, which can only lead to confusion for everyone involved.

What meaning were you actually thinking about? And, again: Does that concept explain how processing at 32 bits and then downsampling to 24 could possibly be worse than processing at 24 the entire way, with the reduced precision the latter involves inherently?

We want to help here, but it’s hard when we’re not sure what the topic of discussion actually is.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 22:13:04
Ok, I just wanted to confirm that I'm definitely getting aliasing with 16->32f->(gain adjustment)->32f->24 if I don't use dither.

I'd gladly try to explain what I'm thinking regarding 24bit gain processing but I think I just burned out my brain or something because I can't wrap my head around the idea at all.  Alternatively maybe I was just too tired and looney when I typed up this thread last night and now that I'm more awake I logically cannot see the logic (or non-logic) that was going through my tired brain.

Maybe I'll have an epiphany when I'm in the shower later, they say that's where you do your best work.
Title: 24bit ReplayGain on waveform file conversion?
Post by: mudlord on 2014-04-08 22:15:03
Wait a goddamn minute, is this the same Nintendo Maniac from various emulation forums?
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 22:19:00
Well at least with the current way Foobar works, when using a typical 44.1KHz 16bit song that most likely already has been dithered to its current bit depth, should I apply dither or let it truncate when doing the 16b->32f->(gain)->32f->24b?

-------------------------------------------------------------

Wait a goddamn minute, is this the same Nintendo Maniac from various emulation forums?

It's not like you're not the Mudlord on those very same forums.  I've been aware of you being on Hydrogen Audio for quite a while now, but it's not like I'm going to seek you out just to say "hey I recognize you", that could seem creepy.

FYI, I use this username pretty much everywhere.  I actually don't spend a lot of time on emulation forums - it's just that is where you are also active.

EDIT: Since I've been recognized I might as well add my avatar, just as long as there's no criticism along the lines of "go back fapping to your imaginary waifu you weaboo" (for reference I've gotten several such remarks from a member on AVS Forum)
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 22:39:21
Ok, I just wanted to confirm that I'm definitely getting aliasing with 16->32f->(gain adjustment)->32f->24 if I don't use dither.
As converted by which program and/or spectrographically visualised by which program? There is no logical reason for this to be happening, assuming, again, that you are saying “aliasing” when you really mean imaging.

Since this is far too little-known online: Images are the spuriously produced reflections around integer multiples of the sampling frequency, produced by DACs or other ‘stair-stepping’ processes. Aliasing, in reality, is when an ADC or other digital system is fed a frequency higher than half its sampling rate, which, when sampled, necessarily becomes folded down below the Nyquist frequency (0.5 * sampling frequency), typically becoming an ugly inharmonic tone (an alias).
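
The two definitions above can be sketched numerically. This assumes ideal sampling of a pure tone, and the function names are hypothetical: the first function gives the frequency an out-of-band tone folds down to (aliasing), the second lists the mirror frequencies that appear around multiples of the sampling rate before any reconstruction filtering (imaging).

```python
def alias_frequency(f_in: float, fs: float) -> float:
    """Frequency at which a tone of f_in Hz shows up after sampling at fs Hz."""
    f = f_in % fs                    # sampling cannot tell f from f + k * fs
    return fs - f if f > fs / 2 else f


def image_frequencies(f_in: float, fs: float, count: int = 2) -> list:
    """Images of an in-band tone around the first `count` multiples of fs."""
    images = []
    for k in range(1, count + 1):
        images.extend([k * fs - f_in, k * fs + f_in])
    return images


print(alias_frequency(30000, 48000))       # 18000: folded below Nyquist (24000)
print(image_frequencies(1000, 48000, 1))   # [47000, 49000]
```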

Quote
I'd gladly try to explain what I'm thinking regarding 24bit gain processing but I think I just burned out my brain or something because I can't wrap my head around the idea at all.  Alternatively maybe I was just too tired and looney when I typed up this thread last night and now that I'm more awake I logically cannot see the logic (or non-logic) that was going through my tired brain.

Maybe I'll have an epiphany when I'm in the shower later, they say that's where you do your best work.
Well, I guess we’re interested to read your theory if it does emerge later.

Quote
Well at least with the current way Foobar works
What way is that? Is it somehow different from the norm or the ideal?

Quote
when using a typical 44.1KHz 16bit song that most likely already has been dithered to its current bit depth, should I apply dither or let it truncate when doing the 16b->32f->(gain)->32f->24b?
“Truncate” where? From 32 to 24 bits? Your final bit-depth is going to be 24 bits in either case, so at best, there will be no difference, if the processing at both depths ends up rounding to the same final points on a 24-bit scale. If. And what’s dither going to do? Overlay more, and quieter, noise on top of that already inaudible 24-bit quantisation distortion? Theoretically, you’d be limiting the ability of the DSP to do its job more precisely, and then you’d be shovelling a perceptually insignificant volume of noise on top for good measure. Again, I don’t understand the missing rationale here.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 22:53:29
Oh wow, I'm a derp.  The aliasing I've been seeing is due to resampling, not from the gain adjustment!

The thing was I was resampling 2x to make it much more clear on a waveform spectrum what was aliasing because, say your source audio was 48KHz, the audio data would have only normally gone up to 24KHz on the spectrum even if you resample to 96KHz but aliasing would continue all the way up to 48KHz.

Therefore it doesn't matter what it was that I was thinking of last night because the results are already lossless.


Quote
What way is that? Is it somehow different from the norm or the ideal?

I was just saying "Currently" because who knows how in the future Foobar will do things, for all we know maybe it'll use 64bit precision for gain and volume calculations.
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-08 23:06:01
The thing was I was resampling 2x to make it much more clear on a waveform spectrum what was aliasing because, say your source audio was 48KHz, the audio data would have only normally gone up to 24KHz on the spectrum even if you resample to 96KHz but aliasing would continue all the way up to 48KHz.
I really don’t understand what you’re saying in this quote. If you mean that upsampling produced extra reflections above the original Nyquist frequency, then you’re using either a bad resampler or an extremely oversensitive spectrograph. There are no images in the original file when properly reconstructed, so these cannot be ‘unearthed’ by resampling – unless in a spurious, unwanted way. Nothing is being made “much more clear” here.

And again, if your description means what I think it does, the term you would be looking for is “imaging”. I was going to edit this into my previous post, but it seems to be getting ever more relevant: http://lavryengineering.com/pdfs/lavry-sampling-oversampling-imaging-aliasing.pdf This is Dan Lavry’s excellent paper describing these two phenomena, the fundamental difference/opposition between them, and related subjects. I highly recommend some background reading like this.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-08 23:21:32
you’re using either a bad resampler

I'm using Foobar's PPHS resampler with Ultra Mode enabled

or an extremely oversensitive spectrograph.

Just using the default in Audacity, Algorithm set to "Spectrum", Function set to "Hanning window", size set to "512", and Axis set to "Linear frequency".

Basically this is the process I do in foobar:

16bit 48KHz mono sine wave -> 96khz 32float -> apply replaygain tags

From there I then convert into 2 files without dithering, one at 24bit and one at 32float, which then applies "WaveGain".

I then open them both in Audacity, do an "Invert" to one of the waveforms, "Select all" and then "Mix and Render".  From the resulting waveform I then "Normalize" and then do "Plot Spectrum", which will show audio data going all the way up to 48KHz in a relatively flat saw-tooth-like formation.
Title: 24bit ReplayGain on waveform file conversion?
Post by: kode54 on 2014-04-09 00:32:13
The SoX resampler component, also available on this forum, is considerably better at handling aliasing, and is also much faster than the PPHS resampler running in Ultra Mode. It also comes in several configurable flavors, such as one which will upsample to the next highest rate in a given rate list, and downsample to the highest one in the list, while not resampling anything that matches the list. At least, I think one of the versions can do that.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-09 04:14:42
The SoX resampler component, also available on this forum

I actually already have it.

is considerably better at handling aliasing

Oh, I did not know that it was of notably higher quality than PPHS with Ultra mode enabled, I'll have to test that out.  This then begs the question of why is SoX not included as Foobar's default resampler?

and is also much faster than the PPHS resampler running in Ultra Mode.

I thought this was SoX's main benefit, making it better suited to real-time resampling.

It also comes in several configurable flavors, such as one which will upsample to the next highest rate in a given rate list, and downsample to the highest one in the list, while not resampling anything that matches the list. At least, I think one of the versions can do that.

I use this very functionality to only upsample 32KHz and 22050Hz for real-time playback since my sound card isn't capable of handling those sample rates natively over ASIO (it's a Xonar so WASAPI is always resampled).  Interestingly I actually have to put two copies of SoX mod2 in my DSP list since 22050Hz needs to be upsampled by a multiple of 2 while 32KHz needs to be upsampled by a multiple of 3.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Kohlrabi on 2014-04-09 09:13:56
What meaning were you actually thinking about? And, again: Does that concept explain how processing at 32 bits and then downsampling to 24 could possibly be worse than processing at 24 the entire way, with the reduced precision the latter involves inherently?
To chime in, 24bit integer does not have a lower precision than 32bit float, but rather a much lower (dynamic) range. The precision is determined by the mantissa for floating point (http://en.wikipedia.org/wiki/Floating_point) numbers, and IEEE 754 32bit float has 24 bits of precision (http://en.wikipedia.org/wiki/Single-precision_floating-point_format#IEEE_754_single-precision_binary_floating-point_format:_binary32), the same as 24bit integer.
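
A quick way to check the 24-bit significand claim, sketched with Python's standard `struct` module. The helper names are hypothetical, and a strided sample of values is tested rather than all 16.7 million to keep it fast.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def survives_float32(sample_24: int) -> bool:
    """True if a signed 24-bit sample round-trips through binary32 exactly."""
    as_float = to_float32(sample_24 / 8388608.0)   # scale to [-1.0, 1.0)
    return round(as_float * 8388608.0) == sample_24


# Strided sweep over the signed 24-bit range, plus the positive extreme.
ok = all(survives_float32(s) for s in range(-8388608, 8388608, 257))
ok = ok and survives_float32(8388607)
print(ok)

# One bit more than the significand can hold no longer fits:
print(to_float32(16777217.0))   # 16777216.0, i.e. 2**24 + 1 rounds to 2**24
```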
Title: 24bit ReplayGain on waveform file conversion?
Post by: 2Bdecided on 2014-04-09 12:18:32
I then open them both in Audacity, do an "Invert" to one of the waveforms, "Select all" and then "Mix and Render".  From the resulting waveform I then "Normalize" and then do "Plot Spectrum", which will show audio data going all the way up to 48KHz in a relatively flat saw-tooth-like formation.
If you peak normalise it before looking at it, aren't you losing sight of how big the error is (or isn't)?

If you're trying to compare the errors between different processes, peak normalising them independently will wreck this comparison.
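
A small sketch of why the level should be read before normalising. The signal here is invented (a 1 kHz sine and a copy rounded to 24-bit steps, standing in for the two exported files), and the function names are hypothetical.

```python
import math


def null_test(a, b):
    """Sample-wise difference of two equal-length signals (invert one, then mix)."""
    return [x - y for x, y in zip(a, b)]


def peak_dbfs(signal):
    """Peak level in dB relative to full scale (1.0); -inf for digital silence."""
    peak = max(abs(x) for x in signal)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


fs = 96000
sine = [0.5 * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs // 10)]
rounded = [round(x * 8388608) / 8388608 for x in sine]   # snap to 24-bit steps

residual = null_test(sine, rounded)
print(peak_dbfs(residual))              # the actual size of the error

residual_peak = max(abs(x) for x in residual)
normalised = [x / residual_peak for x in residual]
print(peak_dbfs(normalised))            # 0 dBFS, however small the error was
```

After normalising, a residual at the 24-bit rounding floor and a gross error both read as full scale, so the spectrum plot shows the shape of the difference but not its size.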


Starting at 44.1kHz or 48kHz, and 16-bits, with a target of a higher sample rate and/or a higher bitdepth + ReplayGain, I don't believe that any of the things you're discussing in this thread cause any audible difference even under the most extreme circumstances.

Refusing to work in 32-bit float because you have a 24-bit output is as misguided as refusing to work in 24-bits because you have a 16-bit output or refusing to work in 16-bits because you have an 8-bit output. More bits during processing are not a problem: they're a potential benefit, and at least not worse (assuming everything else is equal).

Cheers,
David.
Title: 24bit ReplayGain on waveform file conversion?
Post by: EpicForever on 2014-04-09 18:05:01
OMFG... are you serious? You have produced 25 posts about what is better - applying RG to 32bps float or to 24bps fixed? OMG... Then maybe we should start divagation about HDMI cable jitters? Please...
Let me ask you all a question - is this "difference" even measurable by some fancy calculations? I mean - is it possible to show any mathematical proof that in fact there is any greater noise applied in one of these routines compared to another?
But I have better question - especially for "Nintendo Maniac" - if this difference really exists - is it greater than regular noise present in typical audio channel (caused by D/A and amplification stages)? I don't think so...
And if someone is so purist then why not install ASIO, disable RG and any DSP and stream audio via HDMI to his hi-end stereo system?
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-09 20:49:10
If you're trying to compare the errors between different processes, peak normalising them independently will wreck this comparison.

I'm normalizing after the two waveforms have been mixed into a single waveform.

Starting at 44.1kHz or 48kHz, and 16-bits, with a target of a higher sample rate and/or a higher bitdepth + ReplayGain, I don't believe that any of the things you're discussing in this thread cause any audible difference even under the most extreme circumstances.

Fair point.  Admittedly I wasn't that interested in audible differences in the first place but was more concerned with preserving lossless-ness since, like I posted, I originally thought that 32float -> 24bit gain adjustments were supposed to be lossless and yet I wasn't seeing that, therefore my original concern.

Refusing to work in 32-bit float because you have a 24-bit output is as misguided as refusing to work in 24-bits because you have a 16-bit output or refusing to work in 16-bits because you have an 8-bit output. More bits during processing are not a problem: they're a potential benefit, and at least not worse (assuming everything else is equal).

I think you (and EpicForever) missed a post of mine.  Here, let me quote it for you:

Oh wow, I'm a derp.  The aliasing I've been seeing is due to resampling, not from the gain adjustment!


In case I wasn't clear enough, let me spell it out for you - the lossy difference I was seeing was due to resampling and not bit-depth reduction.


I would have edited my opening post to say that this was a false alarm and that I was seeing differences caused by resampling and not bit depth reduction, but it won't let me edit that post anymore.
Title: 24bit ReplayGain on waveform file conversion?
Post by: kode54 on 2014-04-10 01:27:36
I already caught that last part, at least.

There may yet be some rounding errors converting 24 bit to 32 bit float, gain scaling, then converting back. However, they are likely to be minuscule rounding errors, considering the 24 bit integer precision.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 01:31:05
There may yet be some rounding errors converting 24 bit to 32 bit float, gain scaling, then converting back. However, they are likely to be minuscule rounding errors, considering the 24 bit integer precision.

From my testing 32float -> gain change -> 32float results in the exact same waveform as 32float -> gain change -> 32float -> 24bit -> 32float.

EDIT: Except now I just tested again to make absolute sure and now they're not the same?  Hang on...


EDIT 2: Ok, figured it out.  The following two processes result in the exact same waveform:



While the following two processes result in different waveforms:



The key is what the starting bit depth is.
Title: 24bit ReplayGain on waveform file conversion?
Post by: kode54 on 2014-04-10 01:49:41
Yeah, well, the initial waveform being 32 bit float does add the possibility of there being higher precision, and also unclipped values which may be clipped when converting down to integer.
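
Both effects can be seen in a short sketch, assuming the usual convention that float samples nominally span -1.0 to 1.0. The function names are hypothetical and the conversion is a plain round-and-clip, not any specific encoder's routine.

```python
def float_to_int24(x: float) -> int:
    """Convert a float sample to a signed 24-bit integer, clipping at full scale."""
    scaled = round(x * 8388608.0)
    return max(-8388608, min(8388607, scaled))


def int24_to_float(s: int) -> float:
    """Convert a signed 24-bit integer sample back to a float."""
    return s / 8388608.0


for x in (0.5, 0.1, 1.25):
    back = int24_to_float(float_to_int24(x))
    print(x, float_to_int24(x), back == x)
```

A value such as 0.5 sits on the 24-bit grid and survives; 0.1 does not sit on the grid and comes back slightly changed; 1.25 is above full scale and is clipped to the largest 24-bit value.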
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 01:55:29
Clipping... that reminds me, Foobar's replay gain scanner (only the metadata part, not the "WaveGain" exporter) will adjust the waveform gain to the default target of -18 LUFS no matter what, even if it involves increasing the gain and clipping the peaks of the waveform.

Is this intentional?  For a real-world example, this occurs with the very first track of the Mario Galaxy OST, the song titled "Overture".
Title: 24bit ReplayGain on waveform file conversion?
Post by: Kohlrabi on 2014-04-10 07:01:31
Clipping... that reminds me, Foobar's replay gain scanner (only the metadata part, not the "WaveGain" exporter) will adjust the waveform gain to the default target of -18 LUFS no matter what, even if it involves increasing the gain and clipping the peaks of the waveform.
foobar2000's RG scanner does not adjust the gain at all, it just calculates the necessary gain adjustment. It's up to the playback software to avoid clipping when playing files with positive RG values. foobar2000 can prevent clipping according to peak values when using RG adjustment during playback.
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 08:14:36
foobar2000 can prevent clipping according to peak values when using RG adjustment during playback.

But can it do that without compressing or limiting the dynamics of said clipped waveform peak?
Title: 24bit ReplayGain on waveform file conversion?
Post by: Case on 2014-04-10 16:26:00
Compressing would limit the dynamics. Default clipping prevention adjusts the output amplitude for the entire album or entire track depending on your settings so that peaks will reach digital fullscale but not go above it. This is identical to normal volume adjustment and does nothing to dynamic range.
Optionally you can enable either 'Hard -6dB limiter' or 'Advanced Limiter' DSP with foobar2000. The former compresses the peaks causing some dynamic range loss. The latter will dynamically start lowering amplitude when peaks are getting near fullscale. It affects the dynamic range too but without compression artifacts.
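
The default clipping prevention described above amounts to capping one scale factor per track or album. A minimal sketch, assuming the peak is stored as a linear value where 1.0 is full scale; the function name and parameters are hypothetical and this is not foobar2000's code.

```python
def playback_scale(gain_db: float, peak: float, prevent_clipping: bool = True) -> float:
    """Linear scale factor for playback from a ReplayGain value and stored peak."""
    scale = 10 ** (gain_db / 20)             # dB -> linear
    if prevent_clipping and peak > 0 and scale * peak > 1.0:
        scale = 1.0 / peak                   # back off so the peak just reaches full scale
    return scale


# Hypothetical track: +3 dB suggested gain, peak at 0.9 of full scale.
print(playback_scale(3.0, 0.9))                           # capped near 1.11
print(playback_scale(3.0, 0.9, prevent_clipping=False))   # about 1.41, would clip
```

Every sample is multiplied by the same factor, which is why this behaves like a volume change and leaves the dynamic range alone.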
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 21:16:09
Default clipping prevention adjusts the output amplitude for the entire album or entire track depending on your settings so that peaks will reach digital fullscale but not go above it. This is identical to normal volume adjustment and does nothing to dynamic range.

This is what I want, how can it be done for playback without needing to export/convert?
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-10 21:21:33
Surprisingly, this is achieved by going into the Preferences for Playback and selecting the relevant album- or track-based option suffixed by “and prevent clipping according to peak”.

…Did this entire thread of unnecessary worrying come around solely because you hadn’t looked around the options sufficiently?
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 21:27:28
Surprisingly, this is achieved by going into the Preferences for Playback and selecting the relevant album- or track-based option suffixed by “and prevent clipping according to peak”.

Oh, I had that enabled already.  Well now I feel like a derp again.


…Did this entire thread of unnecessary worrying come around solely because you hadn’t looked around the options sufficiently?

No, the original topic was about lossless-ness being preserved when doing gain changes at 32float and then exporting to 24bit.  Turns out it's lossless if the source material is 24bit or less, but is not lossless if the source is 32float or more, even when the gain adjustment is done at 32float for both.
Title: 24bit ReplayGain on waveform file conversion?
Post by: db1989 on 2014-04-10 21:35:36
Quote
Turns out it's lossless if the source material is 24bit or less
of course (and thanks to Kohlrabi for clarifying re the bit allocation of 32-bit float)

Quote
but is not lossless if the source is 32float or more, even when the gain adjustment is done at 32float for both
Was this surprising? Taking your example of an n-bit signal being adjusted to a different gain and re-saved at the same depth: This process could be lossless as in totally reversible if the original and adjusted levels and the DR had specific, mathematically favourable values. However, it should be evident as a truism that performing DSP is not lossless: one necessarily performs DSP to change a signal, not to keep it the same.
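
A rough sketch of the "mathematically favourable values" case, using integer samples and made-up gains; the function name is hypothetical. Doubling and then halving can be undone exactly as long as nothing overflows, whereas a gain of 0.7 maps several input values onto the same output step, so the original cannot be recovered.

```python
def apply_gain_int(samples, gain):
    """Scale integer samples and round back to integers at the same depth."""
    return [round(s * gain) for s in samples]


original = list(range(-1000, 1001))

# Favourable case: x2 then x0.5 restores every sample.
reversible_up = apply_gain_int(apply_gain_int(original, 2.0), 0.5) == original

# Typical case: x0.7 then x(1/0.7) cannot restore every sample.
reversible_down = apply_gain_int(apply_gain_int(original, 0.7), 1 / 0.7) == original

print(reversible_up, reversible_down)
```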
Title: 24bit ReplayGain on waveform file conversion?
Post by: Nintendo Maniac 64 on 2014-04-10 21:42:49
Was this surprising?

Well it explained why when I did resampling it caused the difference between 32float and 24bit to occur, even though I started with 16bit before resampling.