
Encoding AAC HE / HEV2 with VBR results

First post.

Sorry if this has already been discussed here; I searched and found nothing about it.

I'm currently encoding my disc collection, already ripped to FLAC (which was the hardest part), to AAC for listening on my Android phone.

Considering these facts:
- I want to stay with the AAC codec for compatibility
- The bandwidth/size limit I prefer to respect is 64 Kbps
- I want the highest quality within this limit
- The source is 44.1 kHz, stereo, 16-bit

I have started to use FFmpeg on Linux with the libfdk_aac encoder, testing the aac_he and aac_he_v2 profiles.
(I made my choice based on the content of this forum, so thanks to everyone for it.)

I'm comparing HE and HEv2 at different qualities and with different parameters, with the help of this wiki page:
http://wiki.hydrogenaud.io/index.php?title=Fraunhofer_FDK_AAC#Bitrate_Modes

And now the strange thing:
The documentation says that VBR is not accepted for HE and HEv2 above -global_quality 3 (48 Kbps), which is below what I want.

But in my tests, VBR 4 and 5 with the libfdk_aac encoder give a better-quality file.

I play the files back with Winamp (yes, I know, the old thing; it works with Wine) and they appear to be VBR files.
MediaInfo reports that these files really are HEv2, so I'm confused...

How can I know if I'm really encoding with VBR, or with a pseudo-CBR like the Opus codec?

I give here the bash code for the encoding and the parameters :

Code: [Select]
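# Encode every file given on the command line to HE-AAC v2 in an .m4a container.
for arg
do
    # Strip the file extension to build the output name.
    base="${arg%.[^.]*}"
    # With -flags +qscale, -global_quality 5 selects libfdk_aac VBR mode 5.
    ffmpeg -i "$arg" -c:a libfdk_aac -profile:a aac_he_v2 \
        -movflags +faststart -flags +qscale -cutoff 18000 \
        -vn -global_quality 5 "$base".m4a
done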

I hope that someone can give me an idea of what's happening here. ;)

Thanks in advance.







Re: Encoding AAC HE / HEV2 with VBR results

Reply #1
[...] FFmpeg in Linux with libfdk_aac [...] The documentation says that VBR is not accepted for HE and HEv2 above -global_quality 3 (48 Kbps), which is below what I want.

But in my tests, VBR 4 and 5 with the libfdk_aac encoder give a better-quality file.
[...]
What stops you from using the "original" fdkaac?

Re: Encoding AAC HE / HEV2 with VBR results

Reply #2
The ffmpeg documentation says "Currently only the ‘aac_low’ profile supports VBR encoding." https://ffmpeg.org/ffmpeg-codecs.html#Options-8
I don't know if this is still true. If it does allow HE and HEv2, workable commands might look like
Code: [Select]
ffmpeg -i "infile" -vn -c:a libfdk_aac -profile:a aac_he -vbr 2 "outfile"
and
Code: [Select]
ffmpeg -i "infile" -vn -c:a libfdk_aac -profile:a aac_he_v2 -vbr 5 "outfile"

With HEv2 you might have to specify -ac 2 if the input audio is mono; IIRC fdk complains if you don't.
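
The stereo requirement above can be checked before encoding. This is only a minimal sketch, assuming ffprobe is installed alongside ffmpeg; the helper names input_channels and pick_profile are made up for illustration. Instead of upmixing mono to stereo, it falls back to plain HE-AAC when the input is not 2-channel, which is a different choice from the one suggested above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of FFmpeg.

# Print the channel count of the first audio stream of a file.
input_channels() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=channels \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# HE-AAC v2 adds Parametric Stereo, so it only makes sense for 2-channel
# input; anything else falls back to plain HE-AAC in this sketch.
pick_profile() {
    if [ "$1" -eq 2 ]; then
        echo aac_he_v2
    else
        echo aac_he
    fi
}
```

It could be used as: ffmpeg -i in.flac -vn -c:a libfdk_aac -profile:a "$(pick_profile "$(input_channels in.flac)")" -vbr 3 out.m4a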

Re: Encoding AAC HE / HEV2 with VBR results

Reply #3
FFmpeg does allow VBR with AAC-HE and AAC-HEv2 using libfdk_aac, or at least my build does. The VBR table is also wrong in more than one way: it doesn't indicate the final bitrate but the bitrate per channel (which can sometimes mislead users), and the values are only really accurate for LC.
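
To make the per-channel point concrete, here is a tiny sketch of the arithmetic. The function name total_kbps is made up, and as noted above the table values only really hold for LC, so treat the result as a rough estimate for HE and HEv2.

```shell
# total_kbps PER_CHANNEL_KBPS CHANNELS
# Multiply a per-channel figure from the table by the number of channels.
total_kbps() {
    echo $(( $1 * $2 ))
}
```

For example, total_kbps 32 2 prints 64: a table entry of 32 kbps per channel means roughly 64 kbps for a stereo file.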

Re: Encoding AAC HE / HEV2 with VBR results

Reply #4
I play the files back with Winamp (yes, I know, the old thing; it works with Wine) and they appear to be VBR files.
MediaInfo reports that these files really are HEv2, so I'm confused...

Winamp uses so-called FHG AAC, not FDK AAC. It's a different codec, and it's better than fdkaac (at least, FHG AAC was better than fdkaac several years ago).

Re: Encoding AAC HE / HEV2 with VBR results

Reply #5
Winamp uses so-called FHG AAC, not FDK AAC. It's a different codec, and it's better than fdkaac (at least, FHG AAC was better than fdkaac several years ago).
And probably still is. YouTube probably uses FDK-AAC for encoding low bitrate HE-AACv2 audio. It activates when I have poor network connection. This encoder is so crappy that I hear the difference every time. It sounds so... metallic.
I wanted to check HE-AACv2 out so I installed Winamp and encoded some songs with Winamp's FhG-AAC encoder. It sounds less metallic than FDK-AAC and it has less "hoarseness". FhG-AAC encode sounds just more... musical? FDK-AAC encode sounds more artificial.
I don't know yet how to extract FDK HE-AACv2 streams from YouTube. But if I get to know that, I'll extract some music and compare it with my own FhG-AAC encodes I'll make from higher quality YouTube streams.
sox -e float -b 32 -V4 -D gain -3 rate -v 48000 norm -1
opusenc --bitrate 128


Re: Encoding AAC HE / HEV2 with VBR results

Reply #7
I play the files back with Winamp (yes, I know, the old thing; it works with Wine) and they appear to be VBR files.
MediaInfo reports that these files really are HEv2, so I'm confused...
Winamp uses so-called FHG AAC, not FDK AAC. It's a different codec, and it's better than fdkaac (at least, FHG AAC was better than fdkaac several years ago).
I believe what he means is that he uses Winamp for playback. In that case the encoder is irrelevant, since decoding AAC files is deterministic.

You can use HE-AAC v2 (LC + SBR + PS) at a target bitrate of ~64kb/s, at a sampling rate of 48kHz and two channels. In fact this is a pretty common setup. Since your source files have a sampling rate of 44.1kHz, that's totally fine as well.

The VBR mode (AACENC_BITRATEMODE) is either 2 or 3, since this is VBR, it's always a range, not a particular target bitrate.
As always, I'd suggest sticking to the defaults.

I'd try if a simple
Code: [Select]
ffmpeg -i <input> -c:a libfdk_aac -vbr 2 <output>
does it for you. Or use '3' with the -vbr option.

Now, if you don't want to use FFmpeg, that's fine. I suggest using fdkaac (https://github.com/nu774/fdkaac); it should be in the repos of your distro. It's a bit easier to use than FFmpeg, and can take FLAC as input through a pipe.

Don't use a manual lowpass filter. Let the FDK_AAC library do the SBR and apply the filter. You seem to be using a bunch of overlapping quality options (-global_quality), and you shouldn't do that. Use only what the library supports.

With fdkaac, here's a little script converting all FLAC files into .m4a containing AAC:
Code: [Select]
for fn in *.flac; do
    flac -s -d -c "$fn" | fdkaac -I -p 29 -m 2 -o "${fn%.*}.m4a" -
done
Note that the AOT must be 29 for HE-AAC v2 or 156 for MPEG-2 HE-AAC v2. However, you can also simply not use those settings and let the defaults figure out the best settings for you. Note that I used -I (--ignorelength) to make fdkaac ignore the length field of the WAV header, since we're piping the audio from flac directly into fdkaac. I used -m 2 to set a target VBR rate of 32kb/s per stereo channel. You can use '3' there, if that's more to your liking.

Anyway, you should use as few options as possible. You're always more likely to change things for the worse when drifting away from the defaults. And do a listening test inside your car. It might just be that audio with -m 2 is absolutely enough for your car audio system.

Now. If you simply can't stomach the sound that FDK_AAC makes, even though it's kinda the best one for anything that isn't Apple or Windows, you can use qaac (https://sites.google.com/site/qaacpage/). This is a Windows program, so you need a Windows VM, for instance, with Apple QuickTime installed. qaac is a command line tool giving you access to the Apple AAC encoder, currently regarded as the best one in existence.

Please refer to listening tests like these: https://listening-test.coresv.net/results.htm
Or these: https://hydrogenaud.io/index.php/topic,102699.0.html
(or make your own)

Keep in mind FDK_AAC is an encoder released by Fraunhofer. It's the code they kinda dump into Debian and Android, so it's available on Android phones, otherwise they'd descend into obscurity pretty fast...



Winamp uses so-called FHG AAC, not FDK AAC. It's a different codec, and it's better than fdkaac (at least, FHG AAC was better than fdkaac several years ago).
And probably still is. YouTube probably uses FDK-AAC for encoding low bitrate HE-AACv2 audio. It activates when I have poor network connection. This encoder is so crappy that I hear the difference every time. It sounds so... metallic.
Well, that assumption is a bit odd. Youtube uses MPEG-DASH to select from a plethora of pre-encoded formats. The main reason for the overall low quality is that Youtube must re-encode from already lossy sources, which exacerbates errors. Which encoders they use in the back end is not that easy to tell, but they definitely use Opus, for instance. It can be safely assumed that Youtube uses decent encoders, and in fact for very low bitrates it uses Opus. It is very unlikely that what you're hearing are FDK_AAC artifacts. Those are probably artifacts from multiple re-encodings, and just the low bitrate in general.
Quote
I wanted to check HE-AACv2 out so I installed Winamp and encoded some songs with Winamp's FhG-AAC encoder. It sounds less metallic than FDK-AAC and it has less "hoarseness". FhG-AAC encode sounds just more... musical? FDK-AAC encode sounds more artificial.
I don't know yet how to extract FDK HE-AACv2 streams from YouTube. But if I get to know that, I'll extract some music and compare it with my own FhG-AAC encodes I'll make from higher quality YouTube streams.
You can easily download and examine all pre-encoded streams with youtube-dl (https://youtube-dl.org):
Code: [Select]
 % youtube-dl -F "https://www.youtube.com/watch?v=Kh-UFfwmuIA"
[youtube] Kh-UFfwmuIA: Downloading webpage
[youtube] Kh-UFfwmuIA: Downloading video info webpage
[info] Available formats for Kh-UFfwmuIA:
format code  extension  resolution note
249          webm       audio only DASH audio   55k , opus @ 50k, 6.45MiB
250          webm       audio only DASH audio   77k , opus @ 70k, 8.13MiB
171          webm       audio only DASH audio   84k , vorbis@128k, 9.61MiB
140          m4a        audio only DASH audio  131k , m4a_dash container, mp4a.40.2@128k, 17.97MiB
251          webm       audio only DASH audio  132k , opus @160k, 14.25MiB
160          mp4        256x144    144p  111k , avc1.4d400c, 30fps, video only, 6.61MiB
278          webm       256x144    144p  142k , webm container, vp9, 30fps, video only, 12.55MiB
242          webm       426x240    240p  224k , vp9, 30fps, video only, 15.27MiB
133          mp4        426x240    240p  238k , avc1.4d4015, 30fps, video only, 11.60MiB
243          webm       640x360    360p  412k , vp9, 30fps, video only, 27.56MiB
134          mp4        640x360    360p  464k , avc1.4d401e, 30fps, video only, 25.41MiB
244          webm       854x480    480p  649k , vp9, 30fps, video only, 41.82MiB
135          mp4        854x480    480p  828k , avc1.4d401f, 30fps, video only, 39.52MiB
247          webm       1280x720   720p 1107k , vp9, 30fps, video only, 74.39MiB
136          mp4        1280x720   720p 1370k , avc1.4d401f, 30fps, video only, 68.31MiB
43           webm       640x360    medium , vp8.0, vorbis@128k, 102.16MiB
18           mp4        640x360    medium  463k , avc1.42001E, mp4a.40.2@ 96k (44100Hz), 64.30MiB
22           mp4        1280x720   hd720  621k , avc1.64001F, mp4a.40.2@192k (44100Hz) (best)
Note how most of those streams are either video, or audio. Which one gets intermixed into a WEBM or MP4 stream, gets negotiated while the stream is playing.
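
The mp4a.40.N strings in these listings carry the MPEG-4 audio object type after the last dot: 2 is AAC-LC, 5 is HE-AAC (SBR) and 29 is HE-AAC v2 (SBR + PS). A small sketch for decoding them follows; the function name aot_name is made up for illustration and only these three values are handled.

```shell
# aot_name CODEC_STRING
# Translate an "mp4a.40.N" codec string into a readable AAC profile name.
aot_name() {
    case "$1" in
        mp4a.40.2)  echo "AAC-LC" ;;
        mp4a.40.5)  echo "HE-AAC" ;;
        mp4a.40.29) echo "HE-AAC v2" ;;
        *)          echo "unknown" ;;
    esac
}
```

Every AAC stream in the listing above is mp4a.40.2, so aot_name reports AAC-LC for all of them; none of the listed formats is HE-AAC v2.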

Here's another one from a very recent VEVO video, they should be concerned with audio quality, as they're a music promotion channel, etc:
Code: [Select]
 % youtube-dl -F "https://www.youtube.com/watch?v=3ZdNW4ZSOTs"
[youtube] 3ZdNW4ZSOTs: Downloading webpage
[youtube] 3ZdNW4ZSOTs: Downloading video info webpage
[youtube] 3ZdNW4ZSOTs: Downloading js player vflhRp6T6
[info] Available formats for 3ZdNW4ZSOTs:
format code  extension  resolution note
249          webm       audio only DASH audio   59k , opus @ 50k, 1.05MiB
250          webm       audio only DASH audio   77k , opus @ 70k, 1.36MiB
140          m4a        audio only DASH audio  130k , m4a_dash container, mp4a.40.2@128k, 2.49MiB
171          webm       audio only DASH audio  131k , vorbis@128k, 2.34MiB
251          webm       audio only DASH audio  151k , opus @160k, 2.62MiB
394          mp4        256x144    144p   93k , av01.0.05M.08, 30fps, video only, 1.48MiB
160          mp4        256x144    144p  110k , avc1.4d400c, 30fps, video only, 1.47MiB
278          webm       256x144    144p  123k , webm container, vp9, 30fps, video only, 1.81MiB
395          mp4        426x240    240p  211k , av01.0.05M.08, 30fps, video only, 2.88MiB
242          webm       426x240    240p  216k , vp9, 30fps, video only, 2.95MiB
133          mp4        426x240    240p  297k , avc1.4d4015, 30fps, video only, 3.12MiB
396          mp4        640x360    360p  377k , av01.0.05M.08, 30fps, video only, 5.01MiB
243          webm       640x360    360p  380k , vp9, 30fps, video only, 5.07MiB
134          mp4        640x360    360p  550k , avc1.4d401e, 30fps, video only, 5.68MiB
244          webm       854x480    480p  596k , vp9, 30fps, video only, 7.52MiB
397          mp4        854x480    480p  630k , av01.0.05M.08, 30fps, video only, 8.44MiB
135          mp4        854x480    480p  798k , avc1.4d401f, 30fps, video only, 8.05MiB
247          webm       1280x720   720p 1022k , vp9, 30fps, video only, 12.45MiB
136          mp4        1280x720   720p 1239k , avc1.4d401f, 30fps, video only, 11.62MiB
248          webm       1920x1080  1080p 2621k , vp9, 30fps, video only, 32.75MiB
137          mp4        1920x1080  1080p 3439k , avc1.640028, 30fps, video only, 36.79MiB
18           mp4        640x360    medium  539k , avc1.42001E, mp4a.40.2@ 96k (44100Hz), 10.36MiB (best)

In fact, youtube-dl by default downloads best audio and best video, and muxes them together after download.
You can download a specific variant like this:
Code: [Select]
youtube-dl -f 18 "https://www.youtube.com/watch?v=3ZdNW4ZSOTs"
The resulting file (Vevo - Hot This Week - March 15th,  2019-3ZdNW4ZSOTs.mp4) can then be inspected with FFmpeg:
Code: [Select]
 % ffmpeg -i Vevo\ -\ Hot\ This\ Week\ -\ March\ 15th,\ \ 2019-3ZdNW4ZSOTs.mp4 
ffmpeg version N-45774-g223f3dff8-static https://johnvansickle.com/ffmpeg/  Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
  configuration: --enable-gpl --enable-version3 --enable-static --disable-debug --disable-ffplay --disable-indev=sndio --disable-outdev=sndio --cc=gcc-6 --enable-libxml2 --enable-fontconfig --enable-frei0r --enable-gnutls --enable-gray --enable-libfribidi --enable-libass --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librubberband --enable-libsoxr --enable-libspeex --enable-libvorbis --enable-libopus --enable-libtheora --enable-libvidstab --enable-libvo-amrwbenc --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg
  libavutil      56. 15.100 / 56. 15.100
  libavcodec     58. 19.100 / 58. 19.100
  libavformat    58. 13.100 / 58. 13.100
  libavdevice    58.  4.100 / 58.  4.100
  libavfilter     7. 18.100 /  7. 18.100
  libswscale      5.  2.100 /  5.  2.100
  libswresample   3.  2.100 /  3.  2.100
  libpostproc    55.  2.100 / 55.  2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Vevo - Hot This Week - March 15th,  2019-3ZdNW4ZSOTs.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2019-03-15T16:36:00.000000Z
  Duration: 00:02:41.12, start: 0.000000, bitrate: 539 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p(tv, bt709), 640x360 [SAR 1:1 DAR 16:9], 440 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      creation_time   : 2019-03-15T16:36:00.000000Z
      handler_name    : ISO Media file produced by Google Inc. Created on: 03/15/2019.
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 96 kb/s (default)
    Metadata:
      creation_time   : 2019-03-15T16:36:00.000000Z
      handler_name    : ISO Media file produced by Google Inc. Created on: 03/15/2019.
At least one output file must be specified
So you can easily verify and test those files, etc. And yes - I should update my FFmpeg version.

Oh and one last thing, because I know this confuses people: FDK_AAC is from Fraunhofer, the same people that made FhG-AAC. The main difference is that FDK_AAC is open source (though non-free), and only uses integer math.

And one other suggestion: if you're using youtube through a browser, you can right-click into the video and select "Stats for nerds", it'll display the codec and bitrate currently in use.

If you're gonna try to test Youtube's encoder, you'd have to upload lossless sources, and even then you can't really be sure how the internal encoding pipeline works.

 

Re: Encoding AAC HE / HEV2 with VBR results

Reply #8
First of all, thank you to everybody.
I never imagined I would get so much information here :)

Let's start from the beginning.
Quote
What stops you from using the "original" fdkaac?

I started to use ffmpeg simply because it keeps the ID tags (metadata) of my songs.
There is probably a way to keep them with fdkaac as well; I will check that later.
I assumed the quality was the same with both encoders, so I didn't think about it too much.
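
One possible way to carry tags over with fdkaac is sketched below. It is untested against a real collection and assumes the flac package (which provides metaflac) is installed. The helper names flac_tag, strip_tag_name and encode_with_tags are made up, only three tags are copied, and a missing tag is passed as an empty string.

```shell
# Drop the leading "NAME=" that metaflac prints in front of a tag value.
strip_tag_name() {
    sed 's/^[^=]*=//'
}

# flac_tag FILE NAME: print the first value of one Vorbis comment.
flac_tag() {
    metaflac --show-tag="$2" "$1" | head -n 1 | strip_tag_name
}

# encode_with_tags IN.flac OUT.m4a
# Decode with flac, encode to HE-AAC v2 (VBR mode 2) and copy three tags.
encode_with_tags() {
    flac -s -d -c "$1" |
        fdkaac --ignorelength -p 29 -m 2 \
            --title  "$(flac_tag "$1" TITLE)" \
            --artist "$(flac_tag "$1" ARTIST)" \
            --album  "$(flac_tag "$1" ALBUM)" \
            -o "$2" -
}
```

It would be called as encode_with_tags song.flac song.m4a. Cover art and less common tags are not handled here.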

For the suggestion to use : -vbr 5
And not : -global_quality 5

It seems that the results are exactly the same using -flags +qscale parameter.
I compared both files with cmp, it shows no differences at all.
Same with -vbr 2 and -global_quality 2, both parameters seem to be paired.
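
For anyone who wants to repeat that comparison, here is a rough sketch. The function names same_bytes and compare_modes and the output file names are made up; the ffmpeg options are the ones already used in this thread.

```shell
# same_bytes A B: succeed only if the two files are byte-for-byte identical.
same_bytes() {
    cmp -s "$1" "$2"
}

# compare_modes INPUT N
# Encode once with -vbr N and once with -flags +qscale -global_quality N,
# then report whether the two outputs are identical.
compare_modes() {
    ffmpeg -v error -y -i "$1" -vn -c:a libfdk_aac -profile:a aac_he_v2 \
        -vbr "$2" "vbr$2.m4a" &&
    ffmpeg -v error -y -i "$1" -vn -c:a libfdk_aac -profile:a aac_he_v2 \
        -flags +qscale -global_quality "$2" "gq$2.m4a" &&
    if same_bytes "vbr$2.m4a" "gq$2.m4a"; then
        echo identical
    else
        echo different
    fi
}
```

compare_modes song.flac 5 should print "identical" if both options really select the same encoder mode, as observed above.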

About Winamp, yes, I only use it to test the results in my computer, before copying it to the Android device.

Quote
The VBR mode (AACENC_BITRATEMODE) is either 2 or 3, since this is VBR, it's always a range, not a particular target bitrate.
 As always, I'd suggest sticking to the defaults.

I have tried the default settings for HEv2 :

ffmpeg -i <input> -c:a libfdk_aac -profile:a aac_he_v2 -vbr 2 <output>
ffmpeg -i <input> -c:a libfdk_aac -profile:a aac_he_v2 -vbr 3 <output>

No differences, the encoder is doing the same.
The result seems to be in CBR, not in VBR.


Code: [Select]
Test VBR1$ mediainfo test-default-vbr3.m4a  | grep Bit
Bit rate mode                            : Constant
Bit rate                                 : 48.0 Kbps

Test VBR1$ mediainfo test-default-vbr2.m4a  | grep Bit
Bit rate mode                            : Constant
Bit rate                                 : 44.6 Kbps

Test VBR1$ mediainfo test-default-vbr4.m4a  | grep Bit
Bit rate mode                            : Constant
Bit rate                                 : 52.6 Kbps

How can I check whether it's really VBR and not a pseudo-CBR imposed by using HEv2?
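
One way to look into this is to inspect the size of every encoded packet rather than trusting the container's "Bit rate mode" field. The sketch below assumes ffprobe is available; the function names packet_sizes and size_stats are made up for illustration.

```shell
# packet_sizes FILE: print the size in bytes of each audio packet, one per line.
packet_sizes() {
    ffprobe -v error -select_streams a:0 \
        -show_entries packet=size -of csv=p=0 "$1"
}

# size_stats: read one packet size per line on stdin, print "min max mean".
size_stats() {
    awk 'NF { s = $1 + 0
              if (n == 0 || s < min) min = s
              if (n == 0 || s > max) max = s
              sum += s; n++ }
         END { if (n) printf "%d %d %.1f\n", min, max, sum / n }'
}
```

Usage: packet_sizes test-default-vbr3.m4a | size_stats. A wide spread between min and max, and an average that changes from song to song, points to VBR. This is a hint rather than proof, because CBR AAC also varies its frame sizes somewhat through the bit reservoir; comparing the averages of a very quiet track and a very dense one at the same setting is a more telling check.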

Quote
You can use HE-AAC v2 (LC + SBR + PS) at a target bitrate of ~64kb/s, at a sampling rate of 48kHz and two channels. In fact this is a pretty common setup. Since your source files have a sampling rate of 44.1kHz, that's totally fine as well.

Perfect.
Encoding a 44.1kHz signal to a 48kHz file makes me feel bad, on principle.
I prefer to keep 44.1 kHz if possible, but if not, I will use 48 kHz. Thank you.

Quote
The main difference is that FDK_AAC is open source (though non-free), and only uses integer math.


Noted, thank you.
I knew that both were from Fraunhofer (like MP3 and MP3pro as well), but I had always thought they were in fact the same codec in different versions.
So for me FDK_AAC was the newer (better) one compared to FhG-AAC; good to know the truth. :)
Actually, that is why I have always used Winamp to listen to my tests.


Quote
And yes - I should update my FFmpeg version.



Bfffff. I say nothing :)
Code: [Select]
$ ffmpeg -version
ffmpeg version git-2017-01-22-f1214ad Copyright (c) 2000-2017 the FFmpeg developers
built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
configuration: --extra-libs=-ldl --prefix=/opt/ffmpeg --mandir=/usr/share/man --enable-avresample --disable-debug --enable-nonfree --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-decoder=amrnb --disable-decoder=amrwb --enable-libpulse --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-libvorbis --enable-libmp3lame --enable-libopus --enable-libvpx --enable-libspeex --enable-libass --enable-avisynth --enable-libsoxr --enable-libxvid --enable-libvidstab --enable-libwavpack --enable-nvenc
libavutil      55. 44.100 / 55. 44.100
libavcodec     57. 75.100 / 57. 75.100
libavformat    57. 63.100 / 57. 63.100
libavdevice    57.  2.100 / 57.  2.100
libavfilter     6. 69.100 /  6. 69.100
libavresample   3.  2.  0 /  3.  2.  0
libswscale      4.  3.101 /  4.  3.101
libswresample   2.  4.100 /  2.  4.100
libpostproc    54.  2.100 / 54.  2.100

Doing it now... :)

To be continued...

Re: Encoding AAC HE / HEV2 with VBR results

Reply #9
If you intend to play on Android, it should support Opus just fine.
Are there any other constraints imposing the use of AAC?
Quote
Encoding a 44.1kHz signal to a 48kHz file makes me feel bad, on principle.
There's nothing bad about resampling. With a good-quality resampler the error signal is about 140 dB below the signal itself, which is many orders of magnitude lower than the error of any lossy encoding, so it will be impossible to hear the difference (even measuring it is pretty tough).
Just accept it as part of the lossy encoding scheme; it doesn't really lose any audible quality.
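
To put that figure in perspective, a level in dB converts to an amplitude ratio of 10^(dB/20). A one-line sketch of the conversion follows; the function name db_to_ratio is made up.

```shell
# db_to_ratio DB: convert an amplitude level in dB to a linear ratio.
db_to_ratio() {
    awk -v db="$1" 'BEGIN { printf "%.1e\n", 10 ^ (db / 20) }'
}
```

db_to_ratio -140 gives an amplitude ratio of about one ten-millionth of the signal.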

Re: Encoding AAC HE / HEV2 with VBR results

Reply #10
Quote
Encoding a 44.1kHz signal to a 48kHz file makes me feel bad, on principle.
There's nothing bad about resampling. With a good-quality resampler the error signal is about 140 dB below the signal itself, which is many orders of magnitude lower than the error of any lossy encoding, so it will be impossible to hear the difference (even measuring it is pretty tough).
Just accept it as part of the lossy encoding scheme; it doesn't really lose any audible quality.
At first I was a little bit concerned, too. Fortunately one of the Opus devs explained some things and... just have a read, it was quite a nice thread :)

Re: Encoding AAC HE / HEV2 with VBR results

Reply #11
Winamp uses so-called FHG AAC, not FDK AAC. It's a different codec, and it's better than fdkaac (at least, FHG AAC was better than fdkaac several years ago).
And probably still is. YouTube probably uses FDK-AAC for encoding low bitrate HE-AACv2 audio. It activates when I have poor network connection. This encoder is so crappy that I hear the difference every time. It sounds so... metallic.
Well, that assumption is a bit odd. Youtube uses MPEG-DASH to select from a plethora of pre-encoded formats. The main reason for the overall low quality is that Youtube must re-encode from already lossy sources, which exacerbates errors.
I know what MPEG-DASH is ;) And I don't think that YouTube transcodes from lossy formats for more than one generation. I think that YouTube keeps the original audio. They don't serve it to the public because it would kill their servers, but they may use the original audio for making lossy broadcast copies. I don't think YouTube is *that* stupid; they know they should make lossy copies from the best available source, so I think they keep the original audio.

What encoders they use in the back is not that easy to tell, but they definitely use Opus, for instance. It can be safely assumed, that Youtube uses decent encoders, and in fact for very low bitrates, it uses Opus.
Yes, they use Opus, but for higher bitrates. The lowest bitrates available for Opus is 50 kbps. Then there are 80 and 160 kbps. For lowest bitrates they use HE-AACv2, at least for mobiles. And you say they may use decent encoders? Well, you may be wrong :( Do you all remember that sh!tstorm when SoundCloud switched to 64kbps Opus? It came out they use opusenc... 1.0! Not 1.2.1, the best version available back then, but 1.0! So unfortunately I don't trust YouTube's encoders that much.

It is very unlikely, that what you're hearing are FDK_AAC artifacts. Those are probably artifacts from multiple re-encodings, and just the low bitrate in general.
Multiple re-encodings? YouTube may use crappy encoders, but like I said before, they're probably not *that* stupid to transcode more than once. Well, I downloaded a pre-compiled FDK-AAC encoder and encoded some songs at 32 kbps. It didn't sound 'YouTubey'; it sounded more like the FhG encoding I've also done with Winamp. YouTube either uses FDK-AAC incorrectly or uses some other encoder than FDK (which is unlikely). Either case means more fiddling, so stay tuned. I swear, FDK wasn't as bad as I thought. It must be something with YouTube.

I don't know yet how to extract FDK HE-AACv2 streams from YouTube. But if I get to know that, I'll extract some music and compare it with my own FhG-AAC encodes I'll make from higher quality YouTube streams.
You can easily download and examine all pre-encoded streams with youtube-dl (https://youtube-dl.org):
Note how most of those streams are either video, or audio. Which one gets intermixed into a WEBM or MP4 stream, gets negotiated while the stream is playing.
I know about youtube-dl, too :) ...

And one other suggestion: if you're using youtube through a browser, you can right-click into the video and select "Stats for nerds", it'll display the codec and bitrate currently in use.
...and I'm active user of "Stats for nerds", too ;)
I'm using official YouTube app on Android. Usually, on good network connection, it loads audio ID 140 (AAC 128 kbps). However, when I have poor connection, it loads audio ID 139. I bet it's HE-AACv2 and I think it's 32 kbps or so. Just like I said before, I use youtube-dl, however when I tried to list all available streams, it showed familiar ID 140, but there was no ID 139 :( So I'm trying to figure out how to get that stream. And I'm wondering why some streams (e.g. ID 139) aren't reported by youtube-dl, but exist and are served?

Re: Encoding AAC HE / HEV2 with VBR results

Reply #12
The problem with youtube is likely that users just like to upload really fucked up content.
Some videos have such a terrible sound quality, it probably sounds worse than mp3@64k average.
Whenever I uploaded something to Youtube, the end result was transparent.

Re: Encoding AAC HE / HEV2 with VBR results

Reply #13
Usually, on good network connection, it loads audio ID 140 (AAC 128 kbps). However, when I have poor connection, it loads audio ID 139. I bet it's HE-AACv2 and I think it's 32 kbps or so.
It seems that youtube-dl thinks that ID139 is 48 kbps:
https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/youtube.py#L442
Quote
        '139': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 48, 'container': 'm4a_dash'},
        '140': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 128, 'container': 'm4a_dash'},
        '141': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 256, 'container': 'm4a_dash'},

Re: Encoding AAC HE / HEV2 with VBR results

Reply #14
Quote
> Have a look at bitrate distribution:
> https://hydrogenaud.io/index.php/topic,98284.msg900540.html#msg900540
I wanted to update my ffmpeg version before doing this.

Now that it's done, I must say that it is exactly what I wanna know.
 :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)  :)
Thank you, thank you, thank you VERY MUCH for this.

The file seems to be HEv2 with VBR 5 (around 64 Kbps).



So it means that HEv2 seems to work fine with VBR 5 from a 44.1 kHz CD-quality file, and it is NOT limited to VBR 2 and below.

Quote
If you intend to play on Android, it should support Opus just fine.
Are there any other constraints imposing the use of AAC?

That's THE question.  ;)

In fact, first of all I wanted to know whether HEv2 was working correctly with VBR 5 (around 64 Kbps).
Afterwards, I want to compare it with Opus at 64 Kbps.

I have actually made some tests with Opus in MKA files on Android, and they work.

But since this is the AAC forum, I want to limit my question to the AAC HEv2 VBR 5 format.  :-X

I can pass to the next step now.  8)



Re: Encoding AAC HE / HEV2 with VBR results

Reply #15
Usually, on good network connection, it loads audio ID 140 (AAC 128 kbps). However, when I have poor connection, it loads audio ID 139. I bet it's HE-AACv2 and I think it's 32 kbps or so.
It seems that youtube-dl thinks that ID139 is 48 kbps:
https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/youtube.py#L442
Quote
        '139': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 48, 'container': 'm4a_dash'},
        '140': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 128, 'container': 'm4a_dash'},
        '141': {'ext': 'm4a', 'format_note': 'DASH audio', 'acodec': 'aac', 'abr': 256, 'container': 'm4a_dash'},
Great to know. Getting access to that stream with youtube-dl is currently still impossible, though.

Re: Encoding AAC HE / HEV2 with VBR results

Reply #16
Well, that assumption is a bit odd. Youtube uses MPEG-DASH to select from a plethora of pre-encoded formats. The main reason for the overall low quality is that Youtube must re-encode from already lossy sources, which exacerbates errors.
I know what MPEG-DASH is ;) And I don't think that YouTube transcodes from lossy formats for more than one generation. I think that YouTube keeps the original audio. They don't serve it to the public because it would kill their servers, but they may use the original audio for making lossy broadcast copies. I don't think YouTube is *that* stupid; they know they should make lossy copies from the best available source, so I think they keep the original audio.
Nope. The creator manual literally states that (if you check out the creator panel). The reason for that is the way youtube levels and re-distributes across its servers based on demand (specifically, demand-density). Mirroring follows a rather peculiar algorithm, but to save mirroring-time, so to say, this is done gradually, and while a video is being distributed throughout the world, especially for mobile carriers with large throughput, you have 2nd and 3rd generations.

This was one of the main complaints a couple years ago, why "new" videos of small creators seem to have a much worse quality than new videos of larger creators.

MPEG-DASH is a memoization system and protocol, not an adaptive codec as such. You still need a specific timecoded chunk, it's not like part of that chunk carries enough decodable data for the entire timecode interval (as it is with certain interlacing algorithms).

That aside, Youtube is beginning to serve Movies with Dolby Atmos. As I assume you know, You can rent movies through Youtube. Youtube supports streaming some content in 4K and lossless audio (some sports events are streamed like that, and certain events like presidential addresses, etc.). It has nothing to do with youtube being stupid or their servers being killed. It is just a matter of managing resources, and cost optimization. Furthermore most regular content on youtube is 1080p at 30fps right now, a large number is 1080p at 60 fps, both using H.264. The video stream commands a far higher bandwidth than any FLAC stream, yet there's no problem really. Why don't they use FLAC for streaming? Tbh, I believe it's simply because there's little demand for it.

Quote
What encoders they use in the back is not that easy to tell, but they definitely use Opus, for instance. It can be safely assumed, that Youtube uses decent encoders, and in fact for very low bitrates, it uses Opus.
Yes, they use Opus, but for higher bitrates. The lowest bitrate available for Opus is 50 kbps. Then there are 80 and 160 kbps. For the lowest bitrates they use HE-AACv2, at least for mobiles. And you say they may use decent encoders? Well, you may be wrong :( Do you all remember that sh!tstorm when SoundCloud switched to 64 kbps Opus? It came out they use opusenc... 1.0! Not 1.2.1, the best version available back then, but 1.0! So unfortunately I don't trust YouTube's encoders that much.
Intermediate bitrates and format codes are available, but youtube-dl stopped being able to demand them through a transcoding process. I explain further below. You can request even lower quality codes of even weirder formats, mostly for legacy devices. These aren't listed with youtube-dl's '-F' option either, as they're not contained in the JSON format header. As far as I know, fc 139 is the lowest option for MPEG-DASH right now.

Comparing Soundcloud to Youtube (Google) is a bit steep, given that Youtube was one of the early, large adopters of Opus and Vorbis, as well as VPX-VP9 and the upcoming AV1.

It should be noted, that Youtube adds and removes codecs all the time, though. I have a hard time finding a video that still has an encoded flv stream.

Quote
It is very unlikely, that what you're hearing are FDK_AAC artifacts. Those are probably artifacts from multiple re-encodings, and just the low bitrate in general.
Multiple re-encodings? YouTube may use crappy encoders, but just like I said before, they're probably not *that* stupid to transcode more than one time. Well, I downloaded a pre-compiled FDK-AAC encoder and encoded some songs at 32 kbps. It didn't sound 'YouTubey', it sounded more like the FhG encoding I've also done with Winamp. YouTube either uses FDK-AAC incorrectly or uses some other encoder than FDK (which is unlikely). Either case means more fiddling, so stay tuned. I swear, FDK wasn't as bad as I thought. It must be something with YouTube.
As I stated above, it's even in the manual. At the same time, FhG is a "good" encoder but also not the best one, according to MUSHRA and ABX tests. That trophy is still with Apple-AAC.

Quote
I don't know yet how to extract FDK HE-AACv2 streams from YouTube. But if I get to know that, I'll extract some music and compare it with my own FhG-AAC encodes I'll make from higher quality YouTube streams.
You can easily download and examine all pre-encoded streams with youtube-dl (https://youtube-dl.org):
Note how most of those streams are either video, or audio. Which one gets intermixed into a WEBM or MP4 stream, gets negotiated while the stream is playing.
I know about youtube-dl, too :) ...

And one other suggestion: if you're using youtube through a browser, you can right-click into the video and select "Stats for nerds", it'll display the codec and bitrate currently in use.
...and I'm an active user of "Stats for nerds", too ;)
I'm using the official YouTube app on Android. Usually, on a good network connection, it loads audio ID 140 (AAC 128 kbps). However, when I have a poor connection, it loads audio ID 139. I bet it's HE-AACv2 and I think it's 32 kbps or so. Just like I said before, I use youtube-dl, however when I tried to list all available streams, it showed the familiar ID 140, but there was no ID 139 :( So I'm trying to figure out how to get that stream. And I'm wondering why some streams (e.g. ID 139) aren't reported by youtube-dl, but exist and are served?
The format codes are pretty well known: https://gist.github.com/sidneys/7095afe4da4ae58694d128b1034e01e2 (this is just a handy list I use quite often, the list goes more in-depth in the youtube-dl sources).

139 is just 48k HE-AAC v2. This is done on-demand as long as there is no format code available from the get go. This is btw. one example of a possible live-transcode (it trades fluent streaming for lag).

As I assume you know, MPEG-DASH is essentially a memoization algorithm applied to bitstreams. Control is completely on the client side. Keep in mind that the reason why fc 139 doesn't show up for you is that the servers in your region don't serve it that often, and it'd be uneconomical for youtube to advertise that format. It's not just a question of how often this stream is requested in general, the location of the client is also important (this is also part of the Youtube primer). It's why sometimes your videos appear of lesser quality in some areas on the globe, while in others they're of good quality. So in other parts of the world, you can actually see fc 139 being listed in the JSON header (that's what -F loads). You can verify that with a VPN putting yourself in a remote region, or providers like Inmarsat, where bandwidth is very limited or expensive.

Furthermore, the on-demand request is something that youtube-dl used to be able to do, but a somewhat recent version of youtube-dl stopped being able to do so. The reason is probably that youtube-dl is in a constant cat-and-mouse game circumventing websites checking for a proper browser, adblockers, etc. (this isn't specific to Youtube, this goes for pretty much all websites youtube-dl supports).

Requesting on-demand transcode puts extra strain on the servers, and I assume youtube tries to avoid people "pulling" these from their services, as lots of people use youtube as essentially a post-prod rendering engine: https://shkspr.mobi/blog/2017/07/using-youtube-to-transcode-videos-to-dash-on-the-command-line/
Companies used a fast internet connection to upload a file to youtube, and immediately downloaded the transcoded files. This was quicker than rendering or encoding with their computers in the office. This happened in regions with fast emerging IT markets, such as Ghana ("African Silicon Valley").

Btw. Similarly, you can also request even lower quality formats, like fc 17 (3gp 144p a/v). This is for instance quite common in areas like parts of India and Pakistan. So, it is de-facto possible to go to the

Inspecting the streams is also possible with mps-youtube (https://github.com/mps-youtube/mps-youtube). It doesn't use youtube-dl, instead it uses its own backend. It doesn't support as many websites, but it offers more control. I have yet to see whether it mimics the youtube playback apps well enough to request a transcoded stream.

One thing harking back to youtube-dl: https://github.com/ytdl-org/youtube-dl/blob/master/README.md#format-selection
Now the documentation states, that "-f worstaudio" might not be available (which might be puzzling at first). That's precisely the reason I've discussed above.
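
To make that concrete, here's a rough sketch of the youtube-dl calls I'm talking about. The URL is only a placeholder, and the `run` wrapper and `DRY_RUN` variable are something I made up for this post so that the script just prints the commands by default. `-F`, `-f <format code>` and `-f worstaudio` are the actual youtube-dl options; whether fc 139 is really served depends on what gets advertised to you, as discussed above.

```shell
#!/bin/sh
# Placeholder URL (assumption): replace VIDEO_ID with a real video ID.
url="https://www.youtube.com/watch?v=VIDEO_ID"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List the format codes advertised for this video.
run youtube-dl -F "$url"

# Request one specific audio format code; fails if it is not advertised.
run youtube-dl -f 139 "$url"

# Ask for the lowest-quality audio-only format (may not be available).
run youtube-dl -f worstaudio "$url"
```

Run it with `DRY_RUN=0` to actually query the servers. If the `-f 139` call errors out while `-F` shows 140, that matches what you're describing.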

MPEG-DASH is merely a protocol for the client to request chunks of a media file of equal length. It doesn't implement more sophisticated bandwidth optimization algorithms like Adam7 interlacing, where loading part of the frame is enough to have a coarse representation of the entirety of the frame (Adam7 was developed for raster images, but it's applicable to any signal).

I'm assuming UNIPLAYER is just as capable of requesting the lower bitstreams which aren't advertised in the header, too. It should be possible to simply limit your own bandwidth to trick the player into a low-bandwidth mode and then simply dump the stream.



I've been writing this post for over a day now. I'm currently running a couple tests by artificially reducing my bandwidth, and it turns out to be actually harder than I anticipated.

Youtube does ALL IT CAN to keep the audio fc at 250 or 251 in my case. I cannot bring it down to 249, the website terminates the connection beforehand. It might be a fine-tuning issue. Youtube regulates the video quality before even attempting to regulate down the audio fc.
But it seems UNIPLAYER refuses to bog down into decoding the HE-AAC v2 stream, it tries to keep to Opus no matter what.
I had to cut the post short for now, otherwise this will turn into a ten-page essay...

 