
Topic: Transcoding from FLAC to AAC ??? VBR and CBR ???

  • yagan
Transcoding from FLAC to AAC ??? VBR and CBR ???
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, preserving as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, here is where I stand:

Decision 1 = I cannot hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

Next question = How else can I find out what the difference is?

Decision 2 = Comparing spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 best preserves the spectrogram of the original lossless track.
NOTE: because of commonly repeated examples and comments, I did not even consider trying the ABR or CBR modes of the encoders.
 
So I was happy with this result for a while, and then came the question: am I using the full potential of this transcoding, and can it become more efficient?

So after more self-education, reading experts’ opinions, and so on, I came to this: there is a common understanding that humans can’t hear above 20 kHz, that it gets worse with age, and that exposure to loud environments makes a further difference. BUT there are no, or not enough, clinical or other tests in a controlled environment with participants from different groups. The situation is the same, maybe even worse, with testing for the advantages of 24-bit audio over 16-bit: most tests are done in uncontrolled environments, and the participants are guessing between the two. There are only people claiming to hear the difference.

Decision 3 = 95-99% of people over 20-25 years old will not hear above 17.4 kHz. Since I have been exposed to ordinary (or even less than ordinary) levels of loud sound, I judge my ears to be only slightly damaged for my 30 years.

leading to

Decision 4 = I do not consider sounds above 15.5 kHz important; I hear them as WOFF'S and SHUSH'S, indistinguishable in the overall mix of the track. Very quiet sounds, from about -110 dB to -90 dB, are not important either, because I would have to play the track so loud that it would probably be very unpleasant and would permanently damage my hearing.
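
To put the levels in Decision 4 on a scale, here is a minimal sketch. It assumes the figures are dB relative to digital full scale (dBFS), which the post does not state, and it uses the standard 20·log10 amplitude relation. The function names are made up for illustration.

```python
import math

def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a level in dB to a linear amplitude ratio (1.0 = reference level)."""
    return 10 ** (level_db / 20)

def smallest_step_db(bits: int) -> float:
    """Level of one quantization step of `bits`-bit PCM relative to full scale, in dB."""
    return -20 * math.log10(2 ** bits)

# Assumption: the -90 dB and -110 dB figures are relative to full scale.
for level in (-90, -110):
    print(f"{level} dB -> amplitude ratio {db_to_amplitude_ratio(level):.2e}")

for bits in (16, 24):
    print(f"{bits}-bit: one step is about {smallest_step_db(bits):.1f} dB relative to full scale")
```

Under that assumption, one 16-bit step sits at roughly -96 dB, so the -110 dB to -90 dB range straddles the resolution limit of undithered 16-bit audio.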

So while I was experimenting with removing the sound above 15.5 kHz, I used a CBR mode for one encoding, and then I came to my paradox.

The CBR version preserves more detail and is closer to the original than every VBR version, and, as expected, the files are smaller.
"What... that is a mistake???"
I used the FDK and Apple AAC encoders in different CBR modes to be sure. Again and again the CBR versions were closer to the original in the range that matters. I used Apple’s CBR and CVBR modes with limits of 448 kbps and 512 kbps. I did not consider going lower, because of the so-called transparency threshold of 64 kbps per channel.

Another thing is that the 448 kbps CBR and CVBR versions are the same as the 512 kbps CBR and CVBR versions, but the 448 kbps ones are again smaller in size ???
I will experiment more… but for now, as I see things, my new favourite will be either FDK CBR @ 448 kbps or Apple CVBR @ 448 kbps.

I’m attaching screenshots of the spectrograms so you can see what I see. They show the differences between the original lossless FLAC track and the lossy AAC. As I understand them, these are the sounds that the FLAC file has and the AAC doesn’t. To keep it simple I used only a single channel.

So my questions are:
1. Why is that happening?
2. Is it not true that VBR modes should be superior to CBR?
3. Or is there something else I’m missing?

Feel free to give opinions, but please let them be meaningful!
Thanks

  • DVDdoug
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #1
I rarely use AAC so I won't recommend any settings...

Quote
I’m looking for the optimal transcoding to AAC, preserving as much important data as possible. The track is complex, with a mix of sounds and speech.
The lower the bitrate, the more information is "thrown away".   But, the highest bitrate may not be "optimal".   The whole point of compression is to get a smaller file, and it's not optimum if it doesn't sound better.

Quote
Decision 1 = I cannot hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.
It depends on the particular program material and your ability to hear compression artifacts. The only way to know for sure is to do a blind ABX test, or a few ABX tests to get a feel for what works for you. If you're not worried about disk space, use a high bitrate or simply stick with lossless.

Quote
Decision 2 = Comparing spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 best preserves the spectrogram of the original lossless track.
Lossy compression is NOT intended to make the best  looking spectrogram!   The goal is to get the best SOUND!   (It's probably easier to make a good looking spectrogram.)

Quote
So after more self-education, reading experts’ opinions, and so on, I came to this: there is a common understanding that humans can’t hear above 20 kHz.
The fact that someone can hear up to 20 kHz in a hearing test does NOT mean they can hear 20 kHz harmonics and overtones in the context of music. The highest frequencies in music are low-level and are often masked (drowned out) by the other sounds. That's one of the main "tricks" of MP3 and AAC. They don't simply low-pass filter; they analyze the sound and throw away sounds that are (hopefully) masked.


  • Makaki
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #2
Last I checked Apple's AAC is one of the best encoders out there. Even if this has changed, you can expect it to be in the Top 3 list.

This is irrelevant to most people, but I found that Apple's encoder produces files which are friendlier to some hardware players. See: https://hydrogenaud.io/index.php/topic,102535.0.html

As for which setting is transparent: first, forget spectrograms. Codec efficiency and transparency are judged from a hearing perspective, not a visual one. Scientifically, there is going to be missing audio data, but if you can't tell what is missing, it shouldn't matter.

Most encoders will optimize most settings, and you only need to specify a target bitrate. Trying to force specific settings, like which frequencies to cut out, usually yields worse results. Trust the encoder, at least in your first trial.

As for Apple's encoder, there is a lot of discussion about whether CVBR or TVBR is best. In summary, I think the difference, if any, should be minor. Looking at the forums, CVBR seems to win. It's also what Apple uses for their iTunes presets. See:
https://github.com/nu774/qaac/wiki/Encoder-configuration#relation-to-itunes-encoder-setting

Most will tell you to try various bitrates to see what is transparent to YOU, and then maybe go up a bit for good measure. It doesn't take long to do a quick ABX test in foobar2000, especially if it's a personal test and you only need to convince yourself.

EDIT:

You'd want to start that ABX at a bitrate where you can definitely HEAR the difference. And then step it up from there until you can't hear the difference. For some people with very good ears, that bitrate can sometimes be quite high. Sometimes it's best to draw the line where those differences stop being annoying and become acceptable.
  • Last Edit: 08 October, 2017, 03:46:37 PM by Makaki

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #3
Quote
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, preserving as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, here is where I stand:

Decision 1 = I cannot hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

It should be possible, with the right program material and listening methodology, to at least occasionally hear a difference. Attached is a diagram highlighting common audiophile errors in listening test procedures. They are a two-edged sword: these errors can lead to both false rejections and false confirmations.

Careful preparation of the samples to be auditioned using one of the popular audio file editors or DAW programs in conjunction with a software ABX Comparator such as the FOOBAR2000 ABX Comparator add-in (freeware) can make this as easy as possible. You may be aware of all this, I just don't know.

Quote
Decision 2 = Comparing spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 best preserves the spectrogram of the original lossless track.

It has been a long time since a simple technical test such as a spectrogram has been thought to be adequate to reliably detect losses and errors due to lossy file compression using modern programs. Maybe several decades. At this time all credible work seems to be based on listening tests.

Quote
So after more self-education, reading experts’ opinions, and so on, I came to this: there is a common understanding that humans can’t hear above 20 kHz, that it gets worse with age, and that exposure to loud environments makes a further difference. BUT there are no, or not enough, clinical or other tests in a controlled environment with participants from different groups.

I've seen Fletcher-Munson curves with several lines drawn for listeners in different age groups, but Google is not my friend today.



Quote
Decision 3 = 95-99% of people over 20-25 years old will not hear above 17.4 kHz. Since I have been exposed to ordinary (or even less than ordinary) levels of loud sound, I judge my ears to be only slightly damaged for my 30 years.

The situation is the same, maybe even worse, with testing for the advantages of 24-bit audio over 16-bit: most tests are done in uncontrolled environments, and the participants are guessing between the two. There are only people claiming to hear the difference.

That said, the problem of hearing differences due to HF bandpass is not really touched on by single-tone hearing-test type evaluations. The more severe problem is due to masking of the final critical band by the one just below it. IOW, loud sounds around, say, 12 kHz, which are often present in music, keep you from hearing the presence or absence of only slightly softer sounds around, say, 15 kHz.

 


  • krabapple
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #4
Quote
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, preserving as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, here is where I stand:

Decision 1 = I cannot hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

OK

Quote
Next question = How else can I find out what the difference is?

Decision 2 = Comparing spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 best preserves the spectrogram of the original lossless track.

This is not the way to understand/predict what the differences will *sound like*, when dealing with lossy perceptual codecs. You may in fact be unable to tell the difference between original and lossy at other VBR settings that you rejected because they 'looked' worse.

Quote
So I was happy with this result for a while, and then came the question: am I using the full potential of this transcoding, and can it become more efficient?

Try different (higher compression) settings and compare them to the original files, using an ABX tool (e.g. the one that comes with the foobar player). Whichever one compresses the most without sacrificing 'transparency' is the one that is 'most efficient' for *you*.

A blind comparison (e.g. ABX)  tool  is essential.  You can't just do it 'by ear' (which isn't really 'by ear' if you know which file is compressed and which isn't). 
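
To judge whether an ABX score is better than guessing, one common approach is a one-sided binomial test. The sketch below assumes each trial is an independent 50/50 guess when no difference is audible; the function name and the 16-trial example are illustrative choices, not something taken from a particular ABX tool.

```python
from math import comb

def abx_p_value(trials: int, correct: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    Assumes every trial is an independent coin flip (p = 0.5).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Example: a 16-trial session with different scores.
for correct in (8, 12, 14):
    print(f"{correct}/16 correct: p = {abx_p_value(16, correct):.4f}")
```

With these assumptions, 12 correct out of 16 gives a probability of about 0.038 of happening by chance, while 8 out of 16 is fully consistent with guessing. The usual convention is to fix the number of trials before starting, so that stopping early on a lucky streak does not inflate the result.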




  • Last Edit: 13 October, 2017, 02:04:10 PM by krabapple

  • yagan
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #5
Thank you all for your thoughts.

Sorry that I’m not answering each of you separately, but it’s easier for me this way and less time-consuming :) I will extend the explanation in some of the cases:

1. By “optimal” or “efficient” I meant: preserving the relevant data that is meaningful and usable, in this case data that one can hear and distinguish. Let’s say this data must be meaningful for the majority. Just for reference: for a track of average complexity with 8 channels, the losslessly compressed file is 3.83 GB and the transcoded lossy file (for example Apple CVBR or CBR at 448 kbps, without cutting the high frequencies) is 416 MB. So there is around 3.4 GB of data that the majority will not be aware is there, and whose absence will not affect their overall perception of the track. A thinking person will ask himself: what am I getting from this 3.4 GB of data, nearly 10 times more data?

2. Yes! I’m aware of the ABX test and have used the FOOBAR2000 ABX Comparator. I’ve done at least… 100 tests, and not only on myself: I tested around 20 of my friends and even my 10-year-old nephew and his friend (presuming they have the least damaged hearing). The tests were made first with a high-res receiver and 9 sensitive speakers, second with high-res headphones, and third with other equipment. I think the quality of the equipment is high enough from an average point of view. I used tracks with different complex mixes of sounds. The results differed very little: everyone had to guess the original among the different transcodes, without acceptable certainty. The same goes for comparing the different transcodes with each other. So the LISTENING TESTS FAIL. So then… what?

3. On the use of the spectrogram: I’m not using it as the primary method, just as a second one!!! When no one could hear the difference, I thought: “How else can I determine what that ‘invisible’ data is that none of us can hear?” So, based on logic and the five senses that humans possess, the only way to see the differences was through analysis by a computer program! How else? I don’t know if the spectrogram is correct, but it can distinguish differences that we can’t. If someone knows another way or program that can analyze this from a different angle, please tell me. For me, spectrograms present the sound spectrum just as a prism splits white light so you can see the different colors in it. And closer to the original is surely better!

4. So when the spectrogram made the difference clear… I wanted to hear it… Of course this is a complex mix of different sounds presented in 5 or more channels, and it would be very hard, even impossible, to hear the differences. So I split the channels and then used noise cancellation. I think the effect of sound masking is thereby minimized, if not eliminated. And what my screenshots show, “the thrown-away sounds”, SOUNDed very “overpriced” for the 3.4 GB it represents! It is hard to describe them (the files are too large to attach), but they are WOFF'S and SHUSH'S, not separate sounds, just something like the high acoustic tones in some voices, and distinguishable only when played at very high volume. When your ears are bombarded with a mix of many sounds, this is negligible. And played at that volume, the normal track would be so loud that it would no doubt damage your ears. Just for reference, sounds above 90 dB can permanently damage your ears.
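
The size comparison in point 1 can be checked with a few lines of arithmetic. This is only a sketch: it assumes 1 GB = 1024 MB and 1 MB = 1024 × 1024 bytes, ignores container overhead, and treats 448 kbps as a true constant rate. The function names are made up for illustration.

```python
def size_ratio(lossless_gb: float, lossy_mb: float, mb_per_gb: int = 1024) -> float:
    """How many times larger the lossless file is than the lossy one."""
    return lossless_gb * mb_per_gb / lossy_mb

def duration_minutes(size_mb: float, bitrate_kbps: float) -> float:
    """Playing time implied by a file size at a constant bitrate.

    Assumes 1 MB = 1024 * 1024 bytes, 1 kbps = 1000 bit/s, no container overhead.
    """
    bits = size_mb * 1024 * 1024 * 8
    return bits / (bitrate_kbps * 1000) / 60

lossless_gb, lossy_mb = 3.83, 416  # figures quoted in point 1
print(f"lossless is {size_ratio(lossless_gb, lossy_mb):.1f}x the lossy size")
print(f"data dropped: {lossless_gb - lossy_mb / 1024:.2f} GB")
print(f"implied duration at 448 kbps: {duration_minutes(lossy_mb, 448):.0f} min")
```

Under these unit assumptions the ratio comes out a little above 9, which matches the "nearly 10 times" in the post, and 416 MB at 448 kbps corresponds to roughly 130 minutes of audio.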

The more important question is why Apple's AAC encoder makes no difference between 448 kbps and 512 kbps. The only explanations I can think of are:
1. the encoder does not allow more than some 56 kbps per channel (the track has 8 channels) (if true, why offer an option of 512 kbps?)
2. or after transcoding there is no data left to fill the room above that.
The second seems more logical to me. It’s as if I tell the encoder to transcode this track at CBR 512 kbps (a limit of 64 kbps per channel), and the encoder says: OK, but I will give you 56 kbps no matter how high the limit is, because there is no more data to give you.
And for comparison, when I tell the encoder to use VBR (presumably giving it more room for quality), it uses MORE room as expected (for the example file, 600 MB compared to 400 MB in CBR), but it also stores LESS actual data (compared to the original file, the 400 MB CBR has fewer differences than this bigger VBR). The same happens with the FDK encoder. I did not have time to try others in detail, but these two are considered the best...
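
The per-channel figures in this post follow from dividing the total bitrate by the channel count. The sketch below shows only that naive even split. It is an assumption that bits are shared equally: real multichannel encoders do not necessarily allocate bits evenly (an LFE channel, for example, usually needs far fewer), so this does not explain why the two files came out alike.

```python
def per_channel_kbps(total_kbps: float, channels: int) -> float:
    """Naive even split of a total bitrate across channels (an assumption, see above)."""
    if channels <= 0:
        raise ValueError("channels must be positive")
    return total_kbps / channels

CHANNELS = 8  # channel count stated in the post
for total in (448, 512):
    print(f"{total} kbps over {CHANNELS} channels = "
          f"{per_channel_kbps(total, CHANNELS):.0f} kbps per channel")
```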

Sorry, I forgot to mention the listening tests earlier, but they were not the important part. To be really sure of what I’m hearing, you would have to be in the same environment with the same equipment, so that no other factor messes up the testing. There is a difference when using different equipment: the overall sound is louder, or clearer, compared to the other equipment. But we are not comparing different equipment in different environments.

I just asked whether someone has an idea or can explain why this is happening.

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #6
Theoretically (without tests), CBR should sound more stable than VBR. I, for example, do not have high-quality equipment to conduct tests, so I often look at spectrograms. Many argue that the spectrogram is not an indicator of sound quality, but I think it makes sense. If there are excessive gaps, torn tatters and blurriness in the spectrogram, this undoubtedly affects the sound quality. It is very noticeable on tracks from the 1980s and 90s with thin, sparse highs.


Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #7
Most encoders show "excessive gaps in the spectrogram, torn tatters and blurriness" at 160 kbps (~17.4 kHz). This can be a starting point for comparing coding quality by spectrograms: 160 kbps is the boundary where the first lapses of encoders become noticeable, and this is clearly visible on the spectrogram.
An encoder that does not blur the spectrum (at the top, ~17.4 kHz) at 160 kbps will be one of the best. It may be a stupid theory, but it is my theory and my opinion.
BUT(!!!) comparing quality by spectrum is advisable only on lossless sources with weakly expressed tops of the spectrum, which some encoders lose at around 160 kbps.
IMHO!
  • Last Edit: 04 November, 2017, 12:11:30 PM by Glaublichmann

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #8
I do not have golden ears or super high-quality equipment, but I want the tracks in my AAC collection to be compressed by the best encoder. Therefore, I proceed as follows:
I choose an excellent uncompressed, never lossy-encoded track that has a sparse spectrum at the top (like poplar fluff). Then I compress it with all the AAC encoders known to me at a bitrate of 160 kbps CBR (with the developers' default settings). In most cases at 160 kbps there is a frequency cutoff in the region of 17.4 ... 18 kHz. After that, I compare the spectrograms. The encoder that produces fewer "torn scraps" at around 17.4 kHz is the one I prioritize (for myself). From then on, I compress my music at a bitrate of 256 kbps for my home collection.
Why do I compare at 160 kbps?
I visually analyzed the spectrograms of more than 10,000 tracks encoded by different AAC encoders and came to the conclusion that it is at 160 kbps that various unpleasant features are most noticeable, such as a stretched spectrum, torn shreds and the like, which should affect the sound quality.
At 256-320 kbps the spectrograms of almost all AAC encoders are very close to each other, so there is no point in comparing here.
At 224 kbps there are already differences. But the spectrograms of FAAC 1.28 at 224 kbps were already indistinguishable from the original, and this was confusing. At 192 kbps some AAC encoders tried to preserve the entire frequency spectrum, and therefore there were large gaps in the region of 20 kHz.
At 128 kbps the spectrograms of all AAC encoders are very similar, with almost no difference. I do not consider anything below 128 kbps.
That leaves the "golden mean" of 160 kbps, where distortion is most visible.
I am a longtime adherent of the AAC format. It is subconsciously the closest to me; I cannot explain why. It is comparable to how people choose, for example, a Mercedes or a BMW and cannot explain why they chose one over the other.
It is unfortunate that the developers abandoned the Main and LTP profiles. It was cool and looked rich. Even if this does not affect quality. But ... How smart, premium and luxurious AAC was perceived to be with these profiles. As if ahead of the whole planet. Unfortunately, in the modern world everything tends toward the simplistic and minimalistic. But even LC-only, I respect the AAC codec.
Hi developers!

  • lvqcl
  • Developer
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #9
Quote
It is unfortunate that the developers abandoned the Main and LTP profiles. It was cool and looked rich. Even if this does not affect quality. But ... How smart, premium and luxurious AAC was perceived to be with these profiles.
Oh my...

  • bennetng
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #10
Please save us greynol or kode54!

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #11
Yes, I have a huge collection of AAC-LTP. I've spent a lot of time encoding and now I'm happy. I listen to the music on Nokia phones, which decode LTP perfectly. This is my choice on principle. AAC-LTP is my choice.
If Nokia did not reject the LTP profile, then it really was necessary for sound quality. They would not have done this without a reason.
  • Last Edit: 04 November, 2017, 04:42:25 PM by Glaublichmann

  • eahm
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #12
"AAC fills me up" - Brazzers.

  • saratoga
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #13
Quote
Yes, I have a huge collection of AAC-LTP. I've spent a lot of time coding and now I'm happy.

You probably should not have done that. AAC-LTP isn't even a dead format, it was never even alive to begin with. You're probably going to end up having to redo all those files.

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #14
The same files are duplicated in LC. I collect LTP as an exotic format and listen to the files on old Nokia phones. There are even several albums in AAC-SSR.
It may be strange but I like diversity.
  • Last Edit: 04 November, 2017, 06:30:19 PM by Glaublichmann

  • eahm
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #15
@Glaublichmann, nothing you said makes any sense, to the point that I thought you were a troll from another forum.

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #16
I'm not a troll. I expressed my personal opinion:
1) An old encoder does not mean a bad encoder.
2) A track with a torn spectrum will not sound good.
  • Last Edit: 04 November, 2017, 08:21:51 PM by Glaublichmann

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #17
Quote
I'm not a troll. I expressed my personal opinion:
1) An old encoder does not mean a bad encoder.
2) A track with a torn spectrum will not sound good.
I bet a fiver that you were so keen to start vomiting so much subjective bullshit in here in so short a time, that you didn't even bother reading TOS 8 beforehand.

Though you are certainly going to read it now that I've brought it up, then pretend that you'd done it before, and resume regurgitating even more nonsensical "spectral analysis this, spectral analysis that" - till, hopefully, you cross the line so big-time, that a mod decides to send all this crap (my rant included) deservedly down the drain of Recycle Bin oblivion - just like it has been, all these years, whenever a troll from some audiophool cult elsewhere on the web tries to breach into HA's solid anti-bull shit ranks, thank god.
  • Last Edit: 05 November, 2017, 03:34:28 AM by includemeout
Listen to the music, not the media.

  • shadowking
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #18
He's not a troll. Typically everyone's jumping on the censor train. Spectral analysis can still be useful, like when you want to read and compare graphs, and you may see potential issues. Say I don't want to see a lowpass and 'tearing' at 12.5 kHz, for obvious reasons (at 17.5 kHz I am not sure I can hear it in music). I also agree that an old encoder doesn't automatically imply inferiority.
  • Last Edit: 05 November, 2017, 04:28:17 AM by shadowking
wavpack 4.8 -b4x4s0.75c

  • shadowking
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #19
At higher bitrates the encoder relies less on the psymodel, so the spectral charts look closer to the original. At lower rates the psymodel is more aggressive while still attempting to preserve fidelity (hopefully). Spectral analysis doesn't always correspond to subjective quality: you can have a better-looking chart with an audible flaw versus a worse-looking chart whose flaw goes unnoticed. The charts show something about quality, but not everything.
wavpack 4.8 -b4x4s0.75c

  • bennetng
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #20
It's pretty easy to make bad sounding audio files with "rich" and "full" looking spectrograms.

Original file:
http://www.lindberg.no/hires/test/2L-053_04_stereo-44kHz-16b.flac

...courtesy of 2L:
http://www.2l.no/hires/index.html

The logic is pretty simple. If we can tell the quality of audio files by such screenshots, we don't need any audio codecs. Someone should just make a jpg or png decoder and translate them into audio. Don't ask me how to make a 254kbps file sound horrible, it is just a demo to show why listening is much more important than looking.
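
One reason a magnitude picture cannot fully determine the sound is that it discards phase. The sketch below is not how the demo file in this reply was made (the reply does not say); it only illustrates the point within a single analysis frame, using a plain DFT written in pure Python. `scramble_phase` and the 32-sample click are hypothetical names and values chosen for the example.

```python
import cmath
import math
import random

def dft(x):
    """Naive discrete Fourier transform of a sequence of samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def scramble_phase(x, seed=0):
    """Return a real signal with the same DFT magnitudes as x but random phases."""
    rng = random.Random(seed)
    n = len(x)
    original = dft(x)
    scrambled = [0j] * n
    scrambled[0] = original[0]              # DC bin stays as it is
    for k in range(1, n // 2 + 1):
        if 2 * k == n:
            scrambled[k] = original[k]      # Nyquist bin must stay real
        else:
            phase = rng.uniform(-math.pi, math.pi)
            scrambled[k] = abs(original[k]) * cmath.exp(1j * phase)
            scrambled[n - k] = scrambled[k].conjugate()  # keep the output real
    return [v.real for v in idft(scrambled)]

click = [1.0] + [0.0] * 31                  # a single sharp click
smeared = scramble_phase(click, seed=1)

mags_click = [abs(v) for v in dft(click)]
mags_smeared = [abs(v) for v in dft(smeared)]
print("largest magnitude difference:",
      max(abs(a - b) for a, b in zip(mags_click, mags_smeared)))
print("peak of click:", max(abs(v) for v in click))
print("peak after scrambling:", max(abs(v) for v in smeared))
```

Both signals have the same magnitude in every frequency bin of this frame, yet one is a sharp click and the other is spread over the whole frame. A real spectrogram uses many overlapping frames, which constrains the signal more than a single frame does, so this is a simplified illustration rather than a proof about full spectrograms.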

  • shadowking
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #21
One might still opt for the less efficient approach (higher bitrate and fuller spectral charts), and people should be free to do so. This way there's less need for listening tests, as the chances of serious issues are reduced with an inflated bitrate like 224 kbps or more.
  • Last Edit: 05 November, 2017, 06:09:42 AM by shadowking
wavpack 4.8 -b4x4s0.75c

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #22
I did not say that the spectrogram is a panacea. I had in mind some problematic areas with breaks, which are of concern. Here are screenshots of the spectrograms from 2 different AAC encoders at 160 kbps. One of them has problematic areas, which I have marked.

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #23
In the areas where the encoder, in the most critical regions (approaching the transparency boundary), is "gaining momentum", coming off from 16 kHz, there are sometimes dips in the spectrum. This is what I meant.
  • Last Edit: 05 November, 2017, 06:23:48 AM by Glaublichmann

  • bennetng
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #24
Did you try the files in Reply #20? The differences in the spectrograms make all of your theories nonsense.