Topic: exhale - Open Source USAC encoder

Re: exhale - Open Source USAC encoder

Reply #626
I compiled gst-plugins-bad 1.18.3 on my Slackware system, picking up fdkaac, and this has certainly opened up a swag of media players under Linux as Kode54 has mentioned. Please pardon my obvious ignorance of this, I have never really been a gstreamer sort of guy...

Tested so far successfully on a native Linux install have been:

  • RhythmBox: Tested on a Hirsute Hippo Ubuntu with Ubuntu package gstreamer1.0-fdkaac installed
  • Clementine: Tested on a Hirsute Hippo Ubuntu with Ubuntu package gstreamer1.0-fdkaac installed
  • Quod Libet: Tested on a Hirsute Hippo Ubuntu with Ubuntu package gstreamer1.0-fdkaac installed
  • Elisa (KDE): Tested on Slackware -current with gst-plugins-bad 1.18.3 compiled against libfdkaac 2.0.1
  • Juk (KDE): Tested on Slackware -current with gst-plugins-bad 1.18.3 compiled against libfdkaac 2.0.1

I have not tested any further but it would be interesting to see how many more media players on this list are successful with exhale output playback...


Re: exhale - Open Source USAC encoder

Reply #627
Thanks very much, Andrew, for the update and the testing! I just cleaned up exhale's source code a bit. Please forget about the additional command-line parameters I introduced with the last release: it should no longer be necessary to specify any LUFS/peak sample values manually, since exhale calculates these values automatically and now also writes them to all IPFs.

Consider the current revision in exhale's main Git branch an early exhale 1.1.4 beta release. The output of that revision should be identical to that of the revision which NetRanger compiled above. Please report any issues you encounter with the 1.1.4 beta.

Chris
If I don't reply to your reply, it means I agree with you.

Re: exhale - Open Source USAC encoder

Reply #628
This version looks much better; the previous one did not downsample to 32kHz at lower bitrates, now it works fine.

Re: exhale - Open Source USAC encoder

Reply #629
New versions compiled for ARM SoC: exhale@5bfcbaca for Linux armv7l | exhale@5bfcbaca for macOS arm64 .

I have to take back what I said about the new version working well: it still has the problems of the previous one. I made a 15 s track from YouTube. In this case we have a processed recording, and the microphone is not a cheap model whose self-noise is present in the track; a Ribera R47 was used.

The result is better than with the previous speech sample; here we have a male voice singing in Italian (the language affects the artifacts, so I skipped the English part). In my opinion it is excellent. So let me show you the bug I found, providing the input files needed to reproduce it easily.

Perfect symphony (Ed Sheeran and Andrea Bocelli) - stereo - sampling rate 48kHz | 44,1kHz | 32kHz | 24kHz

Exhale mode 0
  • input file sampled at 24kHz, I read on the screen "Encoding 24-kHz" and I get a 24kHz sampled file;
  • input file sampled at 32kHz, I read on the screen "Encoding 32-kHz" and I get a 32kHz sampled file;
  • input file sampled at 44,1kHz, I read on the screen "ERROR during encoding! Input sample rate must be <=32 kHz for preset mode 0!" and I get nothing;
  • input file sampled at 48kHz, I read on the screen "Encoding 32-kHz" and I get a 32kHz sampled file.

Exhale mode 1
  • input file sampled at 24kHz, I read on the screen "Encoding 24-kHz" and I get a 24kHz sampled file;
  • input file sampled at 32kHz, I read on the screen "Encoding 32-kHz" and I get a 32kHz sampled file;
  • input file sampled at 44,1kHz, I read on the screen "Encoding 44-kHz" and I get a 44,1kHz sampled file;
  • input file sampled at 48kHz, I read on the screen "Encoding 32-kHz" and I get a 32kHz sampled file.
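The eight observations above can be condensed into a small lookup table. This is only a restatement of what this post reports for exhale@5bfcbaca, not code taken from exhale; the names OBSERVED, observed_output_rate and needs_external_resampling are made up for illustration.

```python
# (preset mode, input rate in Hz) -> output rate in Hz as reported in this post,
# or None where the encoder stopped with an error. Illustration only.
OBSERVED = {
    ("0", 24000): 24000,
    ("0", 32000): 32000,
    ("0", 44100): None,   # "Input sample rate must be <=32 kHz for preset mode 0!"
    ("0", 48000): 32000,
    ("1", 24000): 24000,
    ("1", 32000): 32000,
    ("1", 44100): 44100,
    ("1", 48000): 32000,
}

def observed_output_rate(mode, input_hz):
    """Return the reported output rate; raise for combinations that were not tested."""
    key = (mode, input_hz)
    if key not in OBSERVED:
        raise KeyError(f"combination {key} was not tested in this post")
    return OBSERVED[key]

def needs_external_resampling(mode, input_hz):
    """True where the report says the encoder refused the input."""
    return observed_output_rate(mode, input_hz) is None

for (mode, rate), out in sorted(OBSERVED.items()):
    print(f"mode {mode}, {rate} Hz in -> {out if out else 'error'}")
```

Only the 44.1 kHz input with mode 0 is flagged, which matches the single error message quoted in the list.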

It could have been written simply but I preferred to schematically show the steps I took.

Below are the results obtained with exhale@5bfcbaca mode 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f | g .

Re: exhale - Open Source USAC encoder

Reply #630
The previous recording of a male voice, spoken in front of a Neumann U87 Ai, which shows artifacts on the first syllable.

The results obtained with exhale@5bfcbaca mode 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f | g .



Re: exhale - Open Source USAC encoder

Reply #632
the previous one did not downsample to 32kHz at lower bitrates, now it works fine.
...
input file sampled at 44,1kHz, I read on the screen "ERROR during encoding! Input sample rate must be <=32 kHz for preset mode 0!" and I get nothing;
input file sampled at 48kHz, I read on the screen "Encoding 32-kHz" and I get a 32kHz sampled file.
That error is a feature, not a bug. I found a 44.1-to-32-kHz downsampler to be too much work (48-to-32, i.e. 3:2 downsampling is much easier and faster) and didn't implement it. After all, you can use foobar2000's resampler DSP to do that on-the-fly.
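One way to see why 48-to-32 kHz is the easy case is to reduce both conversions to a ratio of integers: 48 to 32 kHz is 2/3, while 44.1 to 32 kHz is 320/441. The sketch below computes these ratios and includes a generic textbook rational resampler (zero-stuff, low-pass, decimate) in pure Python. It is not exhale's downsampler; the function names, the 97-tap Hamming-windowed sinc filter and the cutoff choice are all assumptions made for illustration.

```python
import math
from fractions import Fraction

def resampling_ratio(src_hz, dst_hz):
    """Return (up, down) so that dst_hz = src_hz * up / down in lowest terms."""
    ratio = Fraction(dst_hz, src_hz)
    return ratio.numerator, ratio.denominator

def design_lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def resample(samples, up, down, num_taps=97):
    """Rational resampling: zero-stuff by `up`, low-pass, keep every `down`-th sample."""
    taps = design_lowpass(num_taps, 0.5 / max(up, down))
    delay = (num_taps - 1) // 2          # centre of the symmetric filter
    out = []
    for m in range(len(samples) * up // down):
        centre = m * down + delay        # position in the zero-stuffed signal
        acc = 0.0
        for k in range(num_taps):
            i = centre - k
            # Only every `up`-th position of the zero-stuffed signal is non-zero.
            if i % up == 0 and 0 <= i // up < len(samples):
                acc += taps[k] * samples[i // up]
        out.append(up * acc)             # compensate for the inserted zeros
    return out

print(resampling_ratio(48000, 32000))    # (2, 3)
print(resampling_ratio(44100, 32000))    # (320, 441)
```

With up=2 and down=3 the intermediate rate is only 96 kHz and a short filter is enough. With up=320 and down=441 the same scheme would need a far longer, much narrower filter (or a different resampling method altogether), which is consistent with the remark that it is more work.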

Chris
If I don't reply to your reply, it means I agree with you.

Re: exhale - Open Source USAC encoder

Reply #633
I'd like to thank Chris for his amazing work in audio and his xHE implementation by sharing a music sample from a performance that I recorded live at the venue from the audience with my phone, it's the Mozart Quintet K.452. I have it cut, DC-centered, stereo normalized and adjusted for EBU R-128 loudness at -23 LUFS lossless.
It is amazing how a 25-min concert fits in a ~6 MB file with the lowest SBR encoding at 34 kbps. It sounds quite good on my desktop gear, and I can share it in WhatsApp with direct preview from the chat screen on the phone, thanks to the codec included since Android 9 and iOS 13 (2019).
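As a quick sanity check of that file size, the payload of a 25-minute recording at an average of 34 kbps can be worked out directly. This is plain arithmetic that ignores container overhead; the function name is made up for illustration.

```python
def encoded_size_bytes(duration_s, bitrate_bps):
    """Approximate payload size of a stream with the given average bit rate."""
    return duration_s * bitrate_bps / 8

size = encoded_size_bytes(25 * 60, 34_000)
print(f"{size:,.0f} bytes")
print(f"about {size / 1e6:.1f} MB, or {size / 2**20:.1f} MiB")
```

That comes to 6,375,000 bytes, in line with the roughly 6 MB mentioned above.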

 

Re: exhale - Open Source USAC encoder

Reply #634
I'd like to thank Chris for his amazing work in audio and his xHE implementation

Me too and my constant search for flaws is nothing more than a desire to convert all my recordings with Exhale.

Chris did a great job of optimizing the male voices. This time I will put it concisely: additional calibration in SBR mode would be needed when the input is male speech sampled at 32 kHz.

Re: exhale - Open Source USAC encoder

Reply #635
All the searching, testing, and reporting is much appreciated. Thanks a lot! :)

Yes, when encoding speech (or even music) input sampled at 32 kHz with one of the SBR presets, upsampling the input to 44.1 or 48 kHz before encoding may lead to better quality. Basically, the dual-rate SBR technology was designed to work best at 44.1 or 48 kHz input sampling rate.
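To make the sampling-rate argument concrete: in a dual-rate SBR configuration the core coder runs at half the output sampling rate, and SBR reconstructs the upper part of the spectrum. The sketch below only computes the rates and Nyquist limits that follow from that 2:1 relation. The function name is hypothetical, and the crossover frequency exhale actually chooses for a given preset is not shown here.

```python
def dual_rate_sbr_layout(output_hz):
    """Rates implied by 2:1 (dual-rate) SBR; crossover selection is not modelled."""
    core_hz = output_hz // 2
    return {
        "core_rate_hz": core_hz,              # rate at which the core coder runs
        "core_nyquist_hz": core_hz // 2,      # most the core coder could carry
        "output_nyquist_hz": output_hz // 2,  # upper limit after SBR
    }

for rate in (32000, 44100, 48000):
    print(rate, dual_rate_sbr_layout(rate))
```

At 32 kHz input the core coder runs at 16 kHz and can carry at most 8 kHz of bandwidth itself, whereas at 48 kHz it runs at 24 kHz, which is one way to picture why the tool was tuned for the higher rates.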

...my constant search for flaws is nothing more than a desire to convert all my recordings with Exhale.
The upcoming release 1.1.4 should be a good version to do that. I'll be doing the same with my FLAC collection.

Chris
If I don't reply to your reply, it means I agree with you.


Re: exhale - Open Source USAC encoder

Reply #637
Indeed. scharfis_brain, is Dolby Surround encoded stereo audio still really that commonplace? The last audio CD in my collection on which I saw that (German band Schiller) is 20 years old.

Dolby Surround encoding is almost never mentioned on 2.0 content. It is simply "built-in".
However, nearly all content I listen to on my surround set has strong action in the rear channels.
This is only possible when there is a large amount of 180° phase-shifted audio.

Re: exhale - Open Source USAC encoder

Reply #638
Good to know, thanks for the clarification. Meanwhile, I finished a first exhale 1.1.4 release candidate with slightly modified framing (this commit). Please report any unexpected playback issues not occurring with a previous release of exhale. Thanks.

Chris
If I don't reply to your reply, it means I agree with you.

Re: exhale - Open Source USAC encoder

Reply #639
Test versions: exhale@ab91fd8f-linux-armv7l.bz2 - exhale@ab91fd8f-macOS-arm64.zip

In my opinion there are still some problems: I still perceive more defects in the male speech than I can find in the male singing, but now they are clearly annoying only at the lower bitrates.

I don't like to see a voice-only setting in a generalist encoder, but maybe that's the only way to improve at 24 kbps (like the --speech option in Opus). In this way you can eliminate anything above 16 kHz (other encoders already do this) and use the saved bits to improve the low frequencies typical of male vocal cords.

Below is the link to an uncompressed 1-minute file which I used as a test, this time a song recorded with an AKG P200 microphone.
AKG P200 - Back home.wav

Data format
2 ch,  44100 Hz, 'lpcm' (0x00000009) 32-bit little-endian float - no channel layout
estimated duration: 60.000000 sec
audio bytes: 21168000
audio packets: 2646000
bit rate: 2822400 bits per second
packet size upper bound: 8
maximum packet size: 8
audio data file offset: 4096
optimized
source bit depth: F32

Loudness info - additional loudness parameters
aa noise floor master: "-78.71 -78.71"
aa headroom master: "0.618395 0.614481"

Main loudness parameters
aa ebu max momentary loudness: -14.2571
aa ebu top of loudness range: -15.85
aa itu sample peak: -4.47742
aa itu true peak: -4.4707
aa ebu max short-term loudness: -15.3241
aa ebu loudness range: 14.3
aa itu loudness: -18.9907

Dialogue anchor parameters
aa itu loudness: -1

Sound check info
sc ave perceived power coeff: "310 318"
sc max perceived power coeff: "3214 3069"
sc peak amplitude msec: "36989 46510"
sc max perceived power msec: "36989 36989"
sc peak amplitude: "19569 19265"

bit depth pcm master: 32

sound check volume normalization gain: 2.99 dB.

The results obtained with exhale@5bfcbaca mode 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f | g .

Re: exhale - Open Source USAC encoder

Reply #640
Sorry, celona (and all others with the same interest), if I disappoint you by saying this, but my previous comment about the limitations of exhale's USAC implementation simply means: If you want good-quality speech encodings at 24 kbps mono or lower, then exhale is not for you. I guess you need to look out for commercial xHE-AAC encoders supporting ACELP/TCX coding.

Chris
If I don't reply to your reply, it means I agree with you.



Re: exhale - Open Source USAC encoder

Reply #643
Sorry, celona (and all others with the same interest), if I disappoint you by saying this, but my previous comment about the limitations of exhale's USAC implementation simply means: If you want good-quality speech encodings at 24 kbps mono or lower, then exhale is not for you. I guess you need to look out for commercial xHE-AAC encoders supporting ACELP/TCX coding.

Chris

The fact that I wrote in 2020 that it was the right time to implement ACELP should not be taken to mean that I am finding flaws in exhale today in order to get it. I wrote it because a small developer has to plan years in advance; he can't wait for patents to expire before starting. For those who aspire to use xHE-AAC it is a bit early, but it is time to start planning a future encoder change.

I've listened to a lot of compressed files and concluded that I need a minimum bitrate of 37 kbps for monophonic content sampled at 44.1 kHz. My tests also suggest that ACELP/TCX is not the only thing missing: monophonic content sounds worse than stereophonic content, as if the new PS were not used, even though it starts, as is well known, from the same downmix in both cases. Adding background music to the speech forces the encoder to use a higher bitrate, and the defects tend to disappear. When I wrote that the bitrate had increased I was happy, because only by allowing 5 kbps more on top of 32 kbps (exhale mode 1) can you get the best compromise.

For those of us who are not Netflix or a broadcaster, raising the bitrate has so far been more economical than buying a commercial encoder.

Re: exhale - Open Source USAC encoder

Reply #644
Important update for foo_pd_aac: Basically, the stupid way I was handling the first-packet analysis in the packet decoder was causing the new Immediate Playout Frame gapless encoding preroll method to glitch out.

Basically, the libfdkaac doesn't provide a way to completely reset a decoder without deleting and recreating it. Resetting it just flushes it. Flushing causes IPF frames to think there was a stream error in need of recovery, and it will crossfade the start of the actual frame with the last frame that got decoded before the flush, which was the same exact frame. So the start of the frame gets crossfaded with part of the end of itself. Oopsies.
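The failure mode described here can be pictured with a toy model. Everything in it is hypothetical: ToyDecoder is not the libfdk-aac API, the 4-sample linear crossfade is invented, and a "frame" is just a short list of numbers. It only illustrates the difference between flushing a decoder that still remembers the last frame and creating a fresh one.

```python
class ToyDecoder:
    """Hypothetical stand-in for a decoder handle; NOT the libfdk-aac API."""

    def __init__(self):
        self.last_frame = None    # output remembered from the previous decode
        self.recovering = False   # set when a flush is treated like a stream error

    def flush(self):
        # The remembered frame survives the flush; only a recovery flag is set.
        self.recovering = self.last_frame is not None

    def decode(self, frame, fade_len=4):
        out = list(frame)
        if self.recovering:
            # "Error recovery": crossfade from the tail of the remembered frame.
            tail = self.last_frame[-fade_len:]
            for i in range(fade_len):
                w = (i + 1) / (fade_len + 1)
                out[i] = (1 - w) * tail[i] + w * frame[i]
            self.recovering = False
        self.last_frame = list(frame)
        return out

frame = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

# Analyse the first packet, flush, then decode the same packet for playback.
dec = ToyDecoder()
dec.decode(frame)
dec.flush()
glitched = dec.decode(frame)

# Analyse with one decoder, then delete it and play back with a new one.
ToyDecoder().decode(frame)
clean = ToyDecoder().decode(frame)

print(glitched[:4])
print(clean[:4])
```

In the flushed case the start of the frame is blended with the end of itself, while the recreated decoder returns the frame untouched.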

Re: exhale - Open Source USAC encoder

Reply #645
Awesome, thanks very much for this fix, works great! As always, your prompt reaction here is much appreciated.

For those not following all of what happens on this thread: kode54 is talking about https://www.foobar2000.org/components/view/foo_pd_aac

Thanks for the info, celona. I'm happy to hear that just increasing the bit-rate (CVBR preset) works for you. I just committed an exhale 1.1.4 RC2. Please update to version 1.15 of the FDK-AAC packet decoder for foobar2000 before testing.

Chris
If I don't reply to your reply, it means I agree with you.


Re: exhale - Open Source USAC encoder

Reply #647
Intel compiles of exhale-V1.1.4-RC2-ad888151

www.rarewares.org/files/aac/exhale-V1.1.4-RC2-ad888151_x64.zip

www.rarewares.org/files/aac/exhale-V1.1.4-RC2-ad888151_x86.zip

Edit: The links are now correct. They were delivering the RC1 version. Thanks to capma for pointing this out.

Re: exhale - Open Source USAC encoder

Reply #648
Hello,

I haven't noticed this before and can't remember if it's by design: why can an exhale-encoded track have an audible difference in loudness compared to other formats?

I have compiled the latest git update, updated foo_pd_aac, encoded some of my regular test tracks and... Remember, I'm not good at all in ABX tests but something was bothering me on one track. Using Replaygain Scan in foobar, it shows that exhale encoding has a track gain of -9.72dB, while mp3/aac/opus/ogg average at -11dB (Track peak is 0.89 vs avg. 1.32). And with foobar ABX tools, switching between tracks makes exhale's one less... Punchy?

I've browsed the thread (too) quickly; can it be related to fdk-aac decoding? To exhale's file loudness computation? Or both?

Sorry if you have already explained this.

    AiZ

Re: exhale - Open Source USAC encoder

Reply #649
The USAC decoder has a dynamics compressor built-in, and will compress the audio to within 0.01 dB of maximum loudness on playback. Chris asked me to leave this enabled, or at least retune the default to that 0.01 dB threshold rather than the 0.1 dB it was originally, rather than outright disable it. Note, this is mostly a peak limiter, and it's sort of ideal to enable it for USAC anyway, since FDK doesn't support floating point decoding, or decoding to a fixed-point format that would preserve peaks exceeding ±1.0.

USAC files will always have a compressor in the decoder, unless a particular implementation disables it.
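The numbers in this exchange are easier to compare once they are on the same scale. The sketch below converts the track peaks quoted in Reply #648 (0.89 and 1.32) to dBFS, converts the 0.01 dB limiter threshold to a linear value, and shows what hard clipping to 16-bit PCM does to a peak above full scale. The helper names are made up, and the 16-bit conversion is a generic illustration, not FDK's actual output path.

```python
import math

def linear_to_db(x):
    """Amplitude ratio to decibels."""
    return 20 * math.log10(x)

def db_to_linear(db):
    """Decibels to amplitude ratio."""
    return 10 ** (db / 20)

def to_int16_clipped(sample):
    """Convert a float sample to 16-bit PCM, hard-clipping at full scale."""
    scaled = round(sample * 32767)
    return max(-32768, min(32767, scaled))

for peak in (0.89, 1.32):
    print(f"peak {peak}: {linear_to_db(peak):+.2f} dBFS -> int16 {to_int16_clipped(peak)}")

print(f"-0.01 dB as a linear threshold: {db_to_linear(-0.01):.5f}")
```

A peak of 1.32 sits about 2.4 dB above full scale and cannot be stored in a fixed-point format without either clipping or limiting, while 0.89 is about 1 dB below full scale and passes through unchanged.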