Topic: Transcoding from FLAC to AAC ??? VBR and CBR ??? (Read 820 times)

  • yagan
Transcoding from FLAC to AAC ??? VBR and CBR ???
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, one that preserves as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, I will explain it as follows:

Decision 1 = It is impossible for me to hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

Next question = How else can I understand what the difference is?

Decision 2 = Using spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 is best at preserving the spectrogram of the original lossless track.
NOTE: because of common examples and comments, I did not even consider trying the ABR or CBR modes of the encoders.

So I was happy with this result for a while, and then came the question: am I using the full potential of this transcoding, and can it become more efficient?

So after more self-education, reading expert opinions, and so on, I came to this: there is a common understanding that humans can't hear above 20 kHz. It gets worse with age, and exposure to loud environments leads to differences between people. BUT there are few or no clinical or other tests under controlled conditions with participants from different groups. The situation is the same, maybe even worse, with testing for the advantages of 24-bit audio over 16-bit: most tests are done in uncontrolled environments, where the participants are guessing between the two. There are only people claiming to hear the difference.

Decision 3 = 95-99% of people older than 20-25 years will not hear above 17.4 kHz. Since I have been exposed to ordinary (or even less than ordinary) loud sounds, I assume my ears are only slightly damaged at 30 years old.

leading to

Decision 4 = I do not consider sounds above 15.5 kHz important; I hear them as WOFFs and SHUSHes, indistinguishable in the overall mix of the track. Very quiet sounds, from about -110 dB to -90 dB, are also not important, because I would have to play the track so loud that it would probably be very unpleasant and would permanently damage my hearing.
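
A rough sketch of the arithmetic behind the quiet-sounds point, assuming the usual textbook estimate that each bit of linear PCM adds about 6 dB between full scale and one quantization step. The function name is made up for illustration, and the estimate ignores dither and noise shaping.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio between full scale and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)


# Compare the two bit depths discussed in the post.
for bits in (16, 24):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.1f} dB")
```

By this estimate a 16-bit source has roughly 96 dB between full scale and its quantization step, so content between -110 dB and -90 dB relative to full scale sits at or below that step unless the source is 24-bit.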

So when I was experimenting with removing the sound above 15.5 kHz, I used CBR mode for one encode, and then I came to my paradox.

The CBR version preserves more detail and is more similar to the original than any VBR version, and, as expected, its file is smaller.
"What... that is a mistake???"
I used the FDK and Apple AAC encoders in different CBR modes to be sure. Again and again the CBR versions were closer to the original in the range that matters. I used Apple’s CBR and CVBR modes with limits of 448 kbps and 512 kbps. I did not consider going lower because of the so-called transparency threshold of 64 kbps per channel.

Another thing is that the 448 kbps CBR and CVBR versions look the same as the 512 kbps CBR and CVBR versions, but the 448 kbps ones are again smaller in size ???
I will experiment more, but for now, as I see things, my new favourite will be either FDK CBR @ 448 kbps or Apple CVBR @ 448 kbps.
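
The size difference itself is expected: for a constant (or constrained) bitrate, file size is roughly bitrate times duration. Here is a minimal sketch of that arithmetic, assuming a hypothetical 4-minute track and ignoring container overhead; the function name is illustrative only.

```python
def estimated_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate audio payload size in megabytes (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


duration_s = 4 * 60  # hypothetical 4-minute track
for kbps in (448, 512):
    size = estimated_size_mb(kbps, duration_s)
    print(f"{kbps} kbps -> about {size:.1f} MB")
```

So a 448 kbps file should always come out about 12.5% smaller than a 512 kbps one, whatever the spectrograms look like.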

So I’m attaching screenshots of the spectrograms so you can see what I see. They show the differences between the original lossless FLAC track and the lossy AAC. As I understand them, these are the sounds that the FLAC file has and the AAC doesn’t. To keep it simple I used only a single channel.
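
For anyone who wants to reproduce this kind of comparison numerically, below is a minimal sketch of a difference ("null") measurement. It uses synthetic stand-in signals rather than real decoded files, and it assumes both signals are already time-aligned and of equal length, which is not automatic for decoded AAC because encoders add some delay. All names are illustrative.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for silence."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def residual(original, decoded):
    """Sample-by-sample difference of two time-aligned signals."""
    if len(original) != len(decoded):
        raise ValueError("signals must have the same length")
    return [a - b for a, b in zip(original, decoded)]


# Synthetic stand-ins: a 1 kHz tone plus a weak 16 kHz tone.
# The "decoded" version simply lacks the 16 kHz component.
rate, n = 48000, 4800
tone_1k = [0.5 * math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]
tone_16k = [0.005 * math.sin(2 * math.pi * 16000 * t / rate) for t in range(n)]
original = [a + b for a, b in zip(tone_1k, tone_16k)]
decoded = tone_1k

diff = residual(original, decoded)
print(f"original: {rms_dbfs(original):.1f} dBFS")
print(f"residual: {rms_dbfs(diff):.1f} dBFS")
```

A residual like this only shows how much was changed, not whether the change is audible.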

So my questions are:
1. Why is that happening?
2. Isn’t it true that VBR modes should be superior to CBR?
3. Or is there something else I’m missing?

Feel free to give opinions, but please make them meaningful!
Thanks

  • DVDdoug
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #1
I rarely use AAC so I won't recommend any settings...

Quote
I’m looking for the optimal transcoding to AAC, one that preserves as much important data as possible. The track is complex, with a mix of sounds and speech.
The lower the bitrate, the more information is "thrown away". But the highest bitrate may not be "optimal". The whole point of compression is to get a smaller file, and a bigger file is not optimal if it doesn't sound better.

Quote
Decision 1 = It is impossible for me to hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.
It depends on the particular program material and your ability to hear compression artifacts. The only way to know for sure is to do a blind ABX test. Either do a few ABX tests to get a feel for what works for you or, if you're not worried about disk space, use a high bitrate or simply stick with lossless.
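
To judge whether an ABX score reflects real hearing or luck, the usual check is a one-sided binomial test against guessing. This is a minimal sketch; the function name is illustrative, and the choice of 16 trials is an assumption for the example, not a rule.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of scoring at least `correct` out of `trials` by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# A few example scores out of 16 trials.
for correct in (8, 12, 14):
    print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.3f}")
```

A small p-value means the score would be unlikely from guessing alone; a score near half the trials is what guessing produces.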

Quote
Decision 2 = Using spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 is best at preserving the spectrogram of the original lossless track.
Lossy compression is NOT intended to make the best-looking spectrogram! The goal is to get the best SOUND! (It's probably easier to make a good-looking spectrogram.)

Quote
So after more self-education, reading expert opinions, and so on, I came to this: there is a common understanding that humans can't hear above 20 kHz.
The fact that someone can hear up to 20 kHz in a hearing test does NOT mean they can hear 20 kHz harmonics and overtones in the context of music. The highest frequencies in music are low-level and are often masked (drowned out) by the other sounds. That's one of the main "tricks" of MP3 and AAC. They don't simply low-pass filter; they analyze the sound and throw away sounds that are (hopefully) masked.


  • Makaki
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #2
Last I checked Apple's AAC is one of the best encoders out there. Even if this has changed, you can expect it to be in the Top 3 list.

This is irrelevant to most people, but I found that Apple's encoder produces files that are friendlier to some hardware players. See: https://hydrogenaud.io/index.php/topic,102535.0.html

As for what setting is transparent: first, forget spectrograms. Codec efficiency and transparency are judged from a hearing perspective, not a visual one. Scientifically, there is going to be missing audio data, but if you can't hear what is missing, it shouldn't matter.

Most encoders will optimize most settings, and you only need to specify a target bitrate. Trying to force specific settings, like which frequencies to cut out, usually yields worse results. Trust the encoder, at least in your first trial.

As for Apple's encoder, there's a lot of discussion about whether CVBR or TVBR is best. I think, in summary, the difference, if any, should be minor. Looking at the forums, CVBR seems to win. It's also what Apple uses for their iTunes presets. See:
https://github.com/nu774/qaac/wiki/Encoder-configuration#relation-to-itunes-encoder-setting

Most will tell you to try various bitrates, to see what is transparent to YOU. And then maybe go up a bit for good measure. It doesn't take long to do a quick ABX test on foobar2000. Especially if it's a personal test and you only need to convince yourself.

EDIT:

You'd want to start that ABX at a bitrate where you can definitely HEAR the difference. And then step it up from there until you can't hear the difference. For some people with very good ears, that bitrate can sometimes be quite high. Sometimes it's best to draw the line where those differences stop being annoying and become acceptable.
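
The step-up procedure can be written down as a simple search. This is only a sketch: `can_hear_difference` is a hypothetical stand-in for actually running an ABX session at that bitrate, and the bitrate ladder is made up for illustration.

```python
from typing import Callable, Optional, Sequence


def lowest_transparent_bitrate(
    bitrates_kbps: Sequence[int],
    can_hear_difference: Callable[[int], bool],
) -> Optional[int]:
    """Walk up the ladder; return the first bitrate that cannot be told apart."""
    for kbps in sorted(bitrates_kbps):
        if not can_hear_difference(kbps):
            return kbps
    return None  # every tested bitrate was distinguishable


def fake_listener(kbps: int) -> bool:
    """Hypothetical listener who passes ABX at 128 kbps and below."""
    return kbps <= 128


ladder = [64, 96, 128, 160, 192, 256]
print(lowest_transparent_bitrate(ladder, fake_listener))
```

In practice each call would be a full ABX run on the same sample, and you might then go one step higher for good measure.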
  • Last Edit: 08 October, 2017, 03:46:37 PM by Makaki

Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #3
Quote
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, one that preserves as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, I will explain it as follows:

Decision 1 = It is impossible for me to hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

It should be possible, with the right program material and listening methodology, to at least occasionally hear a difference. Attached is a diagram highlighting common audiophile errors in listening test procedures. They are a two-edged sword: these errors can lead to both false rejections and false confirmations.

Careful preparation of the samples to be auditioned using one of the popular audio file editors or DAW programs in conjunction with a software ABX Comparator such as the FOOBAR2000 ABX Comparator add-in (freeware) can make this as easy as possible. You may be aware of all this, I just don't know.

Quote
Decision 2 = Using spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 is best at preserving the spectrogram of the original lossless track.

It has been a long time since a simple technical test such as a spectrogram has been thought to be adequate to reliably detect losses and errors due to lossy file compression using modern programs. Maybe several decades. At this time all credible work seems to be based on listening tests.

Quote
So after more self-education, reading expert opinions, and so on, I came to this: there is a common understanding that humans can't hear above 20 kHz. It gets worse with age, and exposure to loud environments leads to differences between people. BUT there are few or no clinical or other tests under controlled conditions with participants from different groups.

I've seen Fletcher-Munson curves with several lines drawn for listeners in different age groups, but Google is not my friend today.



Quote
Decision 3 = 95-99% of people older than 20-25 years will not hear above 17.4 kHz. Since I have been exposed to ordinary (or even less than ordinary) loud sounds, I assume my ears are only slightly damaged at 30 years old.

The situation is the same, maybe even worse, with testing for the advantages of 24-bit audio over 16-bit: most tests are done in uncontrolled environments, where the participants are guessing between the two. There are only people claiming to hear the difference.

That said, the problem with hearing differences due to HF bandpass is not really touched on by single-tone hearing-test-type evaluations. The more severe problem is due to masking of the final critical band by the one just below it. IOW, loud sounds around, say, 12 kHz that are often present in music keep you from hearing the presence or absence of only slightly softer sounds around, say, 15 kHz.

 


  • krabapple
Re: Transcoding from FLAC to AAC ??? VBR and CBR ???
Reply #4
Quote
Hello, I’m looking for answers to some questions about transcoding from FLAC to AAC. I’m a curious enthusiast without experience, and I like logical explanations, so please keep it as simple as possible. I’m looking for the optimal transcoding to AAC, one that preserves as much important data as possible. The track is complex, with a mix of sounds and speech.

After a lot of searching, experimenting, and testing, I will explain it as follows:

Decision 1 = It is impossible for me to hear the difference between a given lossless original and its transcoded lossy version when it is played properly on my PC, output to my receiver, and heard through my sensitive speakers.

OK

Quote
Next question = How else can I understand what the difference is?

Decision 2 = Using spectrograms of different transcodes of the same lossless track, even of small parts of it, I determined that the FDK AAC encoder at VBR=5 is best at preserving the spectrogram of the original lossless track.

This is not the way to understand/predict what the differences will *sound like*, when dealing with lossy perceptual codecs. You may in fact be unable to tell the difference between original and lossy at other VBR settings that you rejected because they 'looked' worse.

Quote
So I was happy with this result for a while, and then came the question: am I using the full potential of this transcoding, and can it become more efficient?

Try different (higher compression) settings and compare them to the original files using an ABX tool (e.g. the one that comes with the foobar player). Whichever one compresses the most without sacrificing 'transparency' is the one that is 'most efficient' for *you*.

A blind comparison tool (e.g. ABX) is essential. You can't just do it 'by ear' (which isn't really 'by ear' if you know which file is compressed and which isn't).




  • Last Edit: 13 October, 2017, 02:04:10 PM by krabapple