
Converting 24 bit FLAC to AAC (QAAC)

I purchased 24-bit FLAC files from HDTracks a few years ago, and I am thinking about converting some of them to AAC. I have heard a lot of good things about QAAC and True VBR. I am using foobar2000 for converting music.
So, my question is: should I just convert them to AAC True VBR (QAAC) without resampling, or should I use the SoX resampler for best results?

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #1
What sampling rate are the files?  If they are 44.1/48k I would not resample them.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #2
Most of the files have a sampling rate of 44.1/48k, and my most recent purchase, Coldplay's X&Y, is 24/192k. Is it okay if I don't resample the 24/192 files?

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #3
192k is an annoying sampling rate, so I would probably reduce to 48k before encoding.
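
To put numbers on that choice, here is a small Python sketch of the rate arithmetic only. The 20 kHz audibility limit is the commonly quoted figure (an assumption, not a measurement), and the function names are made up for illustration.

```python
from fractions import Fraction

SOURCE_RATE = 192_000    # Hz, the 24/192 album
AUDIBLE_LIMIT = 20_000   # Hz, commonly quoted upper limit of hearing (assumption)

def resample_ratio(target_rate, source_rate=SOURCE_RATE):
    """Exact output/input rate ratio as a reduced fraction."""
    return Fraction(target_rate, source_rate)

def nyquist(rate):
    """Highest frequency a given sampling rate can represent."""
    return rate // 2

for target in (48_000, 44_100):
    ratio = resample_ratio(target)
    covered = nyquist(target) >= AUDIBLE_LIMIT
    print(f"{SOURCE_RATE} -> {target}: ratio {ratio}, "
          f"Nyquist {nyquist(target)} Hz, covers audible band: {covered}")
```

Going to 48k is an exact divide-by-4, while 44.1k needs the less convenient ratio 147/640. Either target still keeps everything up to the assumed audible limit.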

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #4
IIRC "AAC TrueVBR (QAAC)" doesn't support samplerates above 48kHz anyway.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #5
From a mathematical standpoint, you can always up-sample without any change to the actual signal integrity.

Whether this is beneficial (or the opposite) to the compression algorithm is a different matter. I personally don't know whether QAAC, which uses the Apple AAC encoder, does better or worse when up-sampling.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #6
From a mathematical standpoint, you can always up-sample without any change to the actual signal integrity.

Whether this is beneficial (or the opposite) to the compression algorithm is a different matter. I personally don't know whether QAAC, which uses the Apple AAC encoder, does better or worse when up-sampling.

As far as I understand, the Opus folks decided to give their codec a fixed sampling rate (48 kHz) so people won't encode audio at rates the codec isn't optimized for. I'd consider the point they're making here, too. I'm sure that, just like Opus, most AAC encoders will be tuned for and most efficient at a 44.1-48 kHz sampling rate. Besides, going any higher than that would contradict the paradigm of lossy compression, that is, discarding information and allowing inaccuracies where they're inaudible. Everything above 20 kHz is inaudible anyway, so there's no point in feeding higher frequencies to the encoder.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #7
As far as I understand, the Opus folks decided to give their codec a fixed sampling rate (48 kHz) so people won't encode audio at rates the codec isn't optimized for. I'd consider the point they're making here, too. I'm sure that, just like Opus, most AAC encoders will be tuned for and most efficient at a 44.1-48 kHz sampling rate. Besides, going any higher than that would contradict the paradigm of lossy compression, that is, discarding information and allowing inaccuracies where they're inaudible. Everything above 20 kHz is inaudible anyway, so there's no point in feeding higher frequencies to the encoder.
Unfortunately, it's not that easy. The bandpassing needs to be imposed before sampling, otherwise you'll get artifacts which reflect off the band limiter into the lower domains. The lower the cut-off, the more audible the artifacts. On the other hand, no low-pass is perfect: even the steepest filters will still be sloped, hence it's often easier to keep a rather shallow filter and focus on low reflections and other types of lossy factors when compressing.
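
The artifact in question is usually called aliasing. A minimal Python sketch, assuming a pure 30 kHz test tone and naive decimation (keeping every 4th sample with no filter at all, which is not what SoX or any real resampler does):

```python
import math

SOURCE_RATE = 192_000
TARGET_RATE = 48_000
FACTOR = SOURCE_RATE // TARGET_RATE   # 4

def tone(freq_hz, rate_hz, count):
    """Unit-amplitude sine wave as a list of samples."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

# An ultrasonic 30 kHz tone, sampled at 192 kHz.
ultrasonic = tone(30_000, SOURCE_RATE, FACTOR * 480)

# Naive decimation: keep every 4th sample, no low-pass beforehand.
naive = ultrasonic[::FACTOR]

# What the 48 kHz stream now contains: a tone folded down to 48 - 30 = 18 kHz.
alias = tone(TARGET_RATE - 30_000, TARGET_RATE, len(naive))

# The decimated signal equals the 18 kHz tone with inverted polarity.
worst = max(abs(a + b) for a, b in zip(naive, alias))
print(f"largest difference from an inverted 18 kHz tone: {worst:.2e}")
```

So without band-limiting first, inaudible content above 24 kHz lands inside the audible band at full level.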

A lower samplerate will often result in better compression, but this isn't necessarily the case. In fact, in algorithms that employ any form of DCT, the actual samplerate isn't that much of an issue; the actual data is. The more uniform the data, the easier it is to compress.
Furthermore, things like MP3 employ spectral masking in order to throw inaudible polynomials out of the coefficients used to describe the waveform at any given time. This is one of the areas where MP3 throws most of the data out, as part of the lossiness of its compression. Spectral masking essentially means that if you have a loud tone at, say, 1 kHz, then tones in the vicinity will be inaudible. This effect is exacerbated when you have two loud tones, say 1 kHz and 1.1 kHz. Tones between those will be pretty much entirely masked, while we'll experience the two tones as phasing. This is actually due to the way our hearing works, which I think is pretty interesting. MP3 was an attempt at modelling how human hearing works, and what sort of audible information our hearing system throws out anyway.

A higher sampling rate can be beneficial to compression, actually. Things like bzip2 profit from a higher degree of signal uniformity; things like FarbFeld compress similarly well, even though the data contains mostly similar and very repetitive patterns.

Having said that, the encoder should be smart enough to re-sample frame by frame and, if needed, re-sample each block to calculate its polynomial coefficients based on its DCT. Technically, the actual sample rate doesn't matter from a signal frame perspective. If the signal is a flat line at any given point, the frames containing that flat line need just one sample.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #8
As far as I understand, the Opus folks decided to give their codec a fixed sampling rate (48 kHz) so people won't encode audio at rates the codec isn't optimized for. I'd consider the point they're making here, too. I'm sure that, just like Opus, most AAC encoders will be tuned for and most efficient at a 44.1-48 kHz sampling rate. Besides, going any higher than that would contradict the paradigm of lossy compression, that is, discarding information and allowing inaccuracies where they're inaudible. Everything above 20 kHz is inaudible anyway, so there's no point in feeding higher frequencies to the encoder.

Unfortunately, it's not that easy. The bandpassing needs to be imposed before sampling, otherwise you'll get artifacts which reflect off the band limiter into the lower domains. The lower the cut-off, the more audible the artifacts. On the other hand, no low-pass is perfect: even the steepest filters will still be sloped, hence it's often easier to keep a rather shallow filter and focus on low reflections and other types of lossy factors when compressing.

This is misleading.  You can make a low-pass filter arbitrarily close to perfect, so in practice the only tradeoff is in processing time.  Since processing time is cheap, and resampling is fast, it isn't much of a tradeoff.
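
As a rough illustration of that tradeoff, here is a Python sketch using a Blackman-windowed sinc filter. That design, the 22 kHz cutoff, and the tap counts are arbitrary choices for the example, not what SoX or any particular resampler uses. The tap count stands in for processing cost, since each output sample costs one multiply per tap.

```python
import math

def windowed_sinc(num_taps, cutoff_hz, rate_hz):
    """Blackman-windowed sinc low-pass, normalized to unity gain at DC.

    num_taps is assumed to be odd so the filter has a center tap.
    """
    fc = cutoff_hz / rate_hz
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        phase = 2 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def gain(taps, freq_hz, rate_hz):
    """Magnitude of the filter's frequency response at one frequency."""
    w = 2 * math.pi * freq_hz / rate_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)

RATE = 192_000
CUTOFF = 22_000
results = {}
for num_taps in (63, 255, 1023):
    taps = windowed_sinc(num_taps, CUTOFF, RATE)
    results[num_taps] = (gain(taps, 1_000, RATE), gain(taps, 24_000, RATE))
    passband, above_cutoff = results[num_taps]
    print(f"{num_taps:5d} taps: gain at 1 kHz = {passband:.4f}, "
          f"gain at 24 kHz = {above_cutoff:.6f}")
```

The short filter still lets a good part of a 24 kHz tone through, while the long one suppresses it heavily. The transition band never reaches zero width, but it keeps shrinking as taps are added.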

A higher sampling rate can be beneficial to compression, actually. Things like bzip2 profit from a higher degree of signal uniformity; things like FarbFeld compress similarly well, even though the data contains mostly similar and very repetitive patterns.

If increasing the sampling rate improves lossless compression, something is wrong with the codec.  For time domain lossless formats, downsampling generally improves compression significantly.  For transform domain codecs the relationship is less obvious since they don't encode the signal in the time domain and can effectively discard any added information in the higher sampling rate anyway.

Having said that, the encoder should be smart enough to re-sample frame by frame and, if needed, re-sample each block to calculate its polynomial coefficients based on its DCT. Technically, the actual sample rate doesn't matter from a signal frame perspective. If the signal is a flat line at any given point, the frames containing that flat line need just one sample.

Transform codecs don't work like this.  They aren't resampling each frame, there are no "polynomial coefficients", and compression is not applied in the time domain.  Instead, they transform into the frequency domain and then requantize based on what information is relevant.
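
A toy Python sketch of the "transform, then requantize" idea. This is only an illustration under simplifying assumptions: real AAC uses an MDCT with overlapping windows and a psychoacoustic model that picks step sizes per band, whereas this uses one plain DCT block and a single fixed step size.

```python
import math

N = 64        # block length (toy value)
STEP = 1.0    # quantizer step size (toy value, same for every coefficient)

def dct2(block):
    """Unnormalized DCT-II of one block."""
    n = len(block)
    return [sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                for i, x in enumerate(block))
            for k in range(n)]

def idct2(coeffs):
    """Inverse of dct2 above."""
    n = len(coeffs)
    return [coeffs[0] / n
            + 2 / n * sum(c * math.cos(math.pi / n * (i + 0.5) * k)
                          for k, c in enumerate(coeffs) if k > 0)
            for i in range(n)]

# One strong component (frequency bin 8) plus a weak one (bin 20).
block = [math.cos(math.pi / N * (i + 0.5) * 8)
         + 0.01 * math.cos(math.pi / N * (i + 0.5) * 20)
         for i in range(N)]

# Encode: transform, then round each coefficient to a multiple of STEP.
quantized = [round(c / STEP) for c in dct2(block)]
kept = [k for k, q in enumerate(quantized) if q != 0]

# Decode: scale back and invert the transform.
decoded = idct2([q * STEP for q in quantized])
max_err = max(abs(a - b) for a, b in zip(block, decoded))

print(f"non-zero coefficients: {kept} out of {N}")
print(f"largest reconstruction error: {max_err:.4f}")
```

All 64 samples are represented by a single integer coefficient. The weak component is discarded by the quantizer, not by resampling or by fitting anything in the time domain.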

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #9
As far as I understand, the Opus folks decided to give their codec a fixed sampling rate (48 kHz) so people won't encode audio at rates the codec isn't optimized for. I'd consider the point they're making here, too. I'm sure that, just like Opus, most AAC encoders will be tuned for and most efficient at a 44.1-48 kHz sampling rate. Besides, going any higher than that would contradict the paradigm of lossy compression, that is, discarding information and allowing inaccuracies where they're inaudible. Everything above 20 kHz is inaudible anyway, so there's no point in feeding higher frequencies to the encoder.
Unfortunately, it's not that easy. The bandpassing needs to be imposed before sampling, otherwise you'll get artifacts which reflect off the band limiter into the lower domains. The lower the cut-off, the more audible the artifacts. On the other hand, no low-pass is perfect: even the steepest filters will still be sloped, hence it's often easier to keep a rather shallow filter and focus on low reflections and other types of lossy factors when compressing.

This is misleading.  You can make a low-pass filter arbitrarily close to perfect, so in practice the only tradeoff is in processing time.  Since processing time is cheap, and resampling is fast, it isn't much of a tradeoff.
Technically, this is incorrect. You can make them close to ideal, but never ideal itself; FIR filters simply don't work that way. You can make the transition width smaller, but never 0.

A higher sampling rate can be beneficial to compression, actually. Things like bzip2 profit from a higher degree of signal uniformity; things like FarbFeld compress similarly well, even though the data contains mostly similar and very repetitive patterns.

If increasing the sampling rate improves lossless compression, something is wrong with the codec.  For time domain lossless formats, downsampling generally improves compression significantly.  For transform domain codecs the relationship is less obvious since they don't encode the signal in the time domain and can effectively discard any added information in the higher sampling rate anyway.
Improving might be the wrong word here, but the point is that if the signal consists of a larger number of recurring values, the compression ratio should increase when there is no added information. This is often the case when transmitting datagrams with arbitrary length.

Having said that, the encoder should be smart enough to re-sample frame by frame and, if needed, re-sample each block to calculate its polynomial coefficients based on its DCT. Technically, the actual sample rate doesn't matter from a signal frame perspective. If the signal is a flat line at any given point, the frames containing that flat line need just one sample.

Transform codecs don't work like this.  They aren't resampling each frame, there are no "polynomial coefficients", and compression is not applied in the time domain.  Instead, they transform into the frequency domain and then requantize based on what information is relevant.
Well, when you transform into the frequency domain, you have to do it in intervals. Doing it for the entire length of the file or whatever would be nonsensical. The frequency characteristics of a set window are then used to determine the polynomial coefficients to reconstruct this section of the signal. Saving the coefficients is more efficient in most cases. The highest degree of the polynomial coefficient translates to the bandpass. However, not all coefficients must be used: for a signal where the lower coefficients aren't needed, the factors of those are simply set to zero. This is famously used in the image compression of H.264/AVC.
It's just that in image compression, it is a two-dimensional signal.

If the audio encoders don't necessarily work that way, that is technically more of an implementation aspect. Mathematically, though, it kinda is how the signal processing works.

A somewhat easy approach to the DSP stuff of AVC can be found here: https://www.vcodex.com/h264avc-4x4-transform-and-quantization/ It's more on the layman's side of explanation, but it's reasonably comprehensive.

Having said that, I'm kinda curious to see how these encoders deal with very simple signals, I might wanna check that out.

Re: Converting 24 bit FLAC to AAC (QAAC)

Reply #10
As far as I understand, the Opus folks decided to give their codec a fixed sampling rate (48 kHz) so people won't encode audio at rates the codec isn't optimized for. I'd consider the point they're making here, too. I'm sure that, just like Opus, most AAC encoders will be tuned for and most efficient at a 44.1-48 kHz sampling rate. Besides, going any higher than that would contradict the paradigm of lossy compression, that is, discarding information and allowing inaccuracies where they're inaudible. Everything above 20 kHz is inaudible anyway, so there's no point in feeding higher frequencies to the encoder.
Unfortunately, it's not that easy. The bandpassing needs to be imposed before sampling, otherwise you'll get artifacts which reflect off the band limiter into the lower domains. The lower the cut-off, the more audible the artifacts. On the other hand, no low-pass is perfect: even the steepest filters will still be sloped, hence it's often easier to keep a rather shallow filter and focus on low reflections and other types of lossy factors when compressing.

This is misleading.  You can make a low-pass filter arbitrarily close to perfect, so in practice the only tradeoff is in processing time.  Since processing time is cheap, and resampling is fast, it isn't much of a tradeoff.
Technically, this is incorrect. You can make them close to ideal, but never ideal itself,

That is actually exactly what I said.  I'll restate my point.  You can make a lowpass arbitrarily close to perfect at the expense of processing time, so what you were saying is misleading.

A higher sampling rate can be beneficial to compression, actually. Things like bzip2 profit from a higher degree of signal uniformity; things like FarbFeld compress similarly well, even though the data contains mostly similar and very repetitive patterns.

If increasing the sampling rate improves lossless compression, something is wrong with the codec.  For time domain lossless formats, downsampling generally improves compression significantly.  For transform domain codecs the relationship is less obvious since they don't encode the signal in the time domain and can effectively discard any added information in the higher sampling rate anyway.
Improving might be the wrong word here, but the point is that if the signal consists of a larger number of recurring values, the compression ratio should increase when there is no added information. This is often the case when transmitting datagrams with arbitrary length.

If your compressor is working correctly you will not improve compression by upsampling.  The upsampled file may have a better compression ratio, but it will still have a larger absolute file size.  If upsampling were beneficial, formats could simply upsample during compression to improve the result, which obviously doesn't work.
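
A quick way to see the difference between ratio and absolute size, sketched in Python. The assumptions are loud ones: zlib stands in for a general-purpose compressor, the "audio" is seeded random 8-bit noise so the outcome is predictable, and the upsampler is a crude sample-and-hold (each sample repeated 4 times) that adds no information. Real audio and real resamplers will give different numbers.

```python
import random
import zlib

rng = random.Random(0)                      # fixed seed for repeatability
original = bytes(rng.randrange(256) for _ in range(4000))

# Crude 4x "upsampling": repeat every sample four times.
upsampled = bytes(b for b in original for _ in range(4))

packed_orig = zlib.compress(original, 9)
packed_up = zlib.compress(upsampled, 9)

ratio_orig = len(original) / len(packed_orig)
ratio_up = len(upsampled) / len(packed_up)

print(f"original : {len(original):6d} -> {len(packed_orig):6d} bytes "
      f"(ratio {ratio_orig:.2f})")
print(f"upsampled: {len(upsampled):6d} -> {len(packed_up):6d} bytes "
      f"(ratio {ratio_up:.2f})")
```

The upsampled stream gets the better ratio because the repeats are cheap to encode, but the compressed file is still bigger than the compressed original, since the compressor has to spend bits describing the repeats.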

Having said that, the encoder should be smart enough to re-sample frame by frame and, if needed, re-sample each block to calculate its polynomial coefficients based on its DCT. Technically, the actual sample rate doesn't matter from a signal frame perspective. If the signal is a flat line at any given point, the frames containing that flat line need just one sample.

Transform codecs don't work like this.  They aren't resampling each frame, there are no "polynomial coefficients", and compression is not applied in the time domain.  Instead, they transform into the frequency domain and then requantize based on what information is relevant.

Well, when you transform into the frequency domain, you have to do it in intervals. Doing it for the entire length of the file or whatever would be nonsensical. The frequency characteristics of a set window are then used to determine the polynomial coefficients to reconstruct this section of the signal. Saving the coefficients is more efficient in most cases. The highest degree of the polynomial coefficient translates to the bandpass.

I think by "polynomial" you really mean "frequency transform", by "re-sample" you also mean "frequency transform", and by "bandpass" you mean "lowpass".  Is that right?  Above becomes 'The frequencies in a window determine the frequency transform coefficients in this section of the signal'?  Not sure how to translate "saving the coefficients is more efficient". 

However, not all coefficients must be used: for a signal where the lower coefficients aren't needed, the factors of those are simply set to zero. This is famously used in the image compression of H.264/AVC.

This is a reasonable approximation of how image codecs like JPEG work, but not a great way to think about audio or video codecs.  In JPEG you do just take a DCT, throw away the higher (not lower) coefficients, and then encode the rest.  For audio that doesn't work: the higher coefficients are higher frequencies, and just lowpassing a signal won't give you a good codec.  Video is even more complex, since while you do the DCT like JPEG, that is only the very first of many steps used to compress each frame, and most of the compression comes after the transform step.

If the audio encoders don't necessarily work that way, that is technically more of an implementation aspect. Mathematically, though, it kinda is how the signal processing works.

They don't work like that, and it is not just an implementation detail.  Try lowpass filtering a file to remove 90% of it and then compare to MP3 compression reducing a file by 90%.  The lowpass will sound very bad because your hearing and your vision work on very different principles.

A somewhat easy approach to the DSP stuff of AVC can be found here: https://www.vcodex.com/h264avc-4x4-transform-and-quantization/ It's more on the layman's side of explanation, but it's reasonably comprehensive.

That is not remotely comprehensive.  That is just an explanation of how the intra-frame encoding step of H.264 works.  That is a few percent of the compression. The other 90+% comes from the interframe encoding steps. 

 