State of the art lossy codecs and surround formats

I thought I'd share these for interest. It's amazing how much is happening in the audio world right now.

In terms of new-ish lossy codecs and surround formats, I know of the following - anyone know of any others?

My explanations, where included, are from my current understanding - reality may be more complex


Opus

I don't need to explain this one.
http://opus-codec.org/
http://en.wikipedia.org/wiki/Opus_(audio_format)
http://www.hydrogenaud.io/forums/index.php?showtopic=106911
etc


xHE-AAC

This includes some information about Extended HE-AAC - see slide 15 onwards:
http://www.irt.de/webarchiv/showdoc.php?z=...DA1MjE2I3BkZg==

There are some xHE-AAC and Opus samples here:
http://www.indexcom.com/streaming/codec/
They're not mine, and I don't know what software versions were used to encode them.
The Beatles sample is particularly challenging at low bitrates with its (uncomfortably) wide stereo.

There are some very low bitrate examples in this demo video:
http://www.drm.org/?page_id=2396
(click "DRM-xHE-AAC demo")


Dolby AC-4

Dolby's latest codec, aiming to reduce the required bitrate compared to E-AC-3.

A few details, and a link to the spec, here:
http://www.investincotedazur.com/en/info/n...codec-standard/


MPEG-H Part 3: Audio

A new 3D audio standard, supporting a mixture of channel-based (like conventional 5.1), object-based (mono source + position information), and scene based (Higher Order Ambisonics) audio sources, efficient encoding of them, and a renderer to put it all back together and feed it to whatever speaker array (or headphones) you have.
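
To make the "mono source + position information" idea concrete, here is a toy sketch of rendering one audio object to a plain stereo pair with constant-power panning. It is not the MPEG-H renderer; the `AudioObject` structure, the `render_to_stereo` name and the +/-45 degree speaker layout are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioObject:
    """A mono source plus position metadata (hypothetical structure)."""
    samples: List[float]
    azimuth_deg: float  # -45 = at the left speaker, +45 = at the right speaker


def render_to_stereo(obj: AudioObject) -> Tuple[List[float], List[float]]:
    """Constant-power pan of one mono object onto a left/right speaker pair."""
    # Clamp the position to the span between the two (assumed) speakers.
    clamped = max(-45.0, min(45.0, obj.azimuth_deg))
    # Map azimuth in [-45, +45] degrees to a pan angle in [0, pi/2].
    theta = (clamped + 45.0) / 90.0 * (math.pi / 2)
    gain_left, gain_right = math.cos(theta), math.sin(theta)
    left = [gain_left * s for s in obj.samples]
    right = [gain_right * s for s in obj.samples]
    return left, right


# An object placed dead centre gets equal gain in both speakers.
centre = AudioObject(samples=[1.0, 0.5, -0.5], azimuth_deg=0.0)
left, right = render_to_stereo(centre)
```

The point is only that the position travels as metadata and the speaker feeds are computed at playback time, so the same object could be rendered to a different layout by swapping the renderer.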

They are targeting 256kbps - 1.2Mbps in the first phase (effectively complete - standard will be published February 2015). It delivers close to "true transparency" at 1.2Mbps for 22.2 channels:
http://multimediacommunication.blogspot.co...th-meeting.html

They are targeting 48kbps - 128kbps in the second phase. It delivers "Good" quality (MUSHRA scale) at 128kbps for 22.2 channels:
http://mpeg.chiariglione.org/meetings/109
(see press release).


Dolby Atmos Home

From the horse's mouth:
http://blog.dolby.com/2014/06/dolby-atmos-...tions-answered/
http://blog.dolby.com/2014/06/dolby-atmos-...ving-room-near/


ECMA-407

This seems to be a way of parametrically encoding multiple (e.g. 22.2) channels, using an extremely low bitrate addition to a standard lossy codec (e.g. AAC). There is a 22.2 demonstration running at 256kbps (using AAC).

The spec is here:
http://www.ecma-international.org/publicat...ds/Ecma-407.htm

The company behind it is here:
http://www.swissaudec.com/


Cheers,
David.

Reply #1
There's Xiph.org Ghost of course, but it's more a bunch of ideas and not really a codec (yet) if I understood correctly.
Music: sounds arranged such that they construct feelings.

Reply #2
Nice list, David. Please allow me to add "the" second 3D format, which is more music-based, whereas Dolby Atmos is more cinema-based.

Auro-3D

"Auro-3D is the next generation three-dimensional audio standard. It provides a realistic sound experience unlike anything before. By fully immersing the listener in a cocoon of life-like sound, Auro-3D® creates the sensation of actually 'being there'. Thanks to a unique 'Height' channel configuration, acoustic reflections are generated and heard naturally due to the fact that sounds originate from around as well as above the listener."

Reply #3
Isn't DTS-UHD similar to Auro-3D?


Reply #5
I think most of the new surround formats are based on MPEG Surround or a fork of it.
http://mpeg.chiariglione.org/standards/mpeg-d

The bad news is that although the standard was adopted in 2007, seven years ago, there is still no publicly available encoder.
The same goes for USAC/xHE-AAC (2011).

Reply #6
[quote]In terms of new-ish lossy codecs and surround formats, I know of the following - anyone know of any others?[/quote]

There's also the 3GPP Enhanced Voice Services (EVS) project. Standardization is currently in its last stage, see here.
This is basically a low-delay (< 30 ms) speech and music codec for up to super-wide bandwidth (SWB) even at very low bit-rates (e.g. 14-16 kHz bandwidth at 13 kbit/s mono).

[quote author=IgorC link=msg=0 date=]I think most of the new surround formats are based on MPEG Surround or a fork of it. ... The bad news is that although the standard was adopted in 2007, there is no publicly available encoder.[/quote]
I think in case of MPEG-H 3D-Audio it's a bit different from MPEG Surround.
The Sonnox Pro-Codec plug-in (and maybe also the Codec Toolbox plug-in) can encode to AAC with MPEG Surround.
But both aren't available for free, if that's what you mean by "publicly available".

Chris
If I don't reply to your reply, it means I agree with you.

Reply #7
Just read on the German heise Online that Apple and U2 are working on a new format. Does anyone happen to know more about it?
I wonder what kind of expertise U2 has w.r.t. audio technology... or maybe it will be as ill-advised as the Pono bullshit by Neil Young.

EDIT: It's been slashdotted.
It's only audiophile if it's inconvenient.

Reply #8
[quote]There's also the 3GPP Enhanced Voice Services (EVS) project. Standardization is currently in its last stage, see here.
This is basically a low-delay (< 30 ms) speech and music codec for up to super-wide bandwidth (SWB) even at very low bit-rates (e.g. 14-16 kHz bandwidth at 13 kbit/s mono).[/quote]

EVS is a very interesting format for telcos. 5.9 kbps for wideband voice is outstanding.
http://docbox.etsi.org/workshop/2012/20121...m_varga_evs.pdf
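
To put those bitrates in perspective, here is a small back-of-the-envelope sketch of how many bits a constant-bitrate codec has to spend per frame. The 20 ms frame length is an assumption on my part for illustration (it is not stated in this thread), and `bits_per_frame` is just a made-up helper name.

```python
def bits_per_frame(bitrate_bps: int, frame_ms: int = 20) -> int:
    """Bits available for each frame at a constant bitrate.

    frame_ms defaults to 20 ms, which is an assumed frame length.
    """
    return bitrate_bps * frame_ms // 1000


# The two rates mentioned above: 5.9 kbps wideband voice and 13 kbit/s SWB.
for rate_bps in (5900, 13000):
    print(rate_bps, "bps ->", bits_per_frame(rate_bps), "bits per frame")
```

Under that assumption, everything the decoder needs for each frame has to fit in a budget of a few hundred bits at most.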

Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.
And it says that EVS would have "near" (not superior to) AAC quality for music content.


[quote]The Sonnox Pro-Codec plug-in (and maybe also the Codec Toolbox plug-in) can encode to AAC with MPEG Surround.
But both aren't available for free, if that's what you mean by "publicly available".[/quote]

I didn't know about these applications. Thank you. Though there is no hardware or software support for it. I don't mean to be harsh here, but if the same fate awaits USAC, it will be just as unpopular and unused. It would be a lost opportunity.

At least in my opinion, the open source AAC encoder (FDK) was a step in the right direction.



Reply #9
[quote]EVS is a very interesting format for telcos. 5.9 kbps for wideband voice is outstanding.[/quote]


That just reminded me of codec2.
Apparently, the Free Telephony Project is still active.

http://www.rowetel.com/blog/?page_id=452
http://freedv.org/
http://sourceforge.net/p/freetel/code/HEAD/tree/

Reply #10
[quote]There are some xHE-AAC and Opus samples here:
http://www.indexcom.com/streaming/codec/
...[/quote]

That xHE-AAC encoder isn't a particularly high-quality one.

Here are some USAC and HE-AAC files from the Unified Speech and Audio Coding Verification Test Report.

P.S. Nice video about USAC http://mpeg.chiariglione.org/

Reply #11
[quote]Just read on the German heise Online that Apple and U2 are working on a new format. Does anyone happen to know more about it?[/quote]

My first thought was HD-AAC. But that's just my guess.

---

xHE-AAC/USAC sounds interesting. Is there a paper that details the difference between SBR and eSBR? From the spectrum, it seems to have more detailed control over the SBR replication tools, and USAC's more efficient coding allows a higher crossover frequency between LC-AAC and SBR than HE-AAC at the same bitrate.

For me, at 64kbps and above, Opus is better than HE-AAC. For speech I would use Codec2. I was amazed how understandable the output is at those extremely low bitrates. (I wonder why none of the digital radio broadcast standards adopt an updateable codec architecture with on-air downloaded decoders? Nowadays even low-end ARM processors are strong enough to decode stereo audio thrown at them in any codec format.)
WavPack -b4x4hc
Opus --cvbr --bitrate 256 --framesize 5

Reply #12
[quote]xHE-AAC/USAC sounds interesting. Is there a paper that details the difference between SBR and eSBR?[/quote]

I found this; on the right side you can click on "View" to download it.

[quote author=IgorC link=msg=0 date=]Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.[/quote]
You're right about the non-mandatory features. But when I worked on it, up to 96 kbit/s mono was supported. That should be transparent. And supporting only CBR might be less of an issue than you might think, see below.

[quote author=IgorC link=msg=0 date=]And it says that EVS would have "near" (not superior to) AAC quality for music content.[/quote]
Last year we did a MUSHRA test at 48 kbit/s mono with an early version of the EVS codec, in which AAC was outperformed even though the EVS codec uses hard-CBR (CELT ran at unconstrained-VBR, the other codecs at CVBR).
Not sure about the lower bit-rates, but I don't think it will be worse than AAC (it might be worse than xHE-AAC, though, a consequence of the low-delay constraint).

[attachment=8021:icassp14.png] figure taken from this ICASSP paper.

Chris
If I don't reply to your reply, it means I agree with you.

Reply #13
[quote]Isn't DTS-UHD similar to Auro-3D?[/quote]

I thought that "height" channels were already an option of 7.1, instead of more back speakers. I could be confusing it though. I know that many 7.1 receivers have an option regarding height channels.

Also, please don't rip on me if you choose to answer this question quickly; yes, I have tried to look it up. When a multichannel lossy codec has a bit rate of, e.g., 320kbps, is that divided among five or seven speakers, like 64kbps per channel? I assume this is true, as 128kbps stereo is 64kbps per channel, right? To save your time, I'm OK with "correct," "No," or "it's more complicated than that, go look it up some more."
end the LOUDNESS war... please?

Reply #14
@plasticpitchfork No. ("Channel coupling" is the thing to google.)
PANIC: CPU 1: Cache Error (unrecoverable - dcache data) Eframe = 0x90000000208cf3b8
NOTICE - cpu 0 didn't dump TLB, may be hung
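
As a rough illustration of why the bitrate is not simply divided evenly, here is a sketch of mid/side coding, one simple form of joint stereo. This is an assumption chosen for illustration: the coupling schemes in real multichannel codecs are more elaborate, and the signal values below are made up.

```python
def mid_side(left, right):
    """Convert a left/right pair into mid (average) and side (half-difference)."""
    mid = [(a + b) / 2 for a, b in zip(left, right)]
    side = [(a - b) / 2 for a, b in zip(left, right)]
    return mid, side


def energy(signal):
    """Sum of squared samples."""
    return sum(x * x for x in signal)


# Hypothetical, highly correlated stereo pair: right is a slightly quieter left.
left = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
right = [0.9 * x for x in left]

mid, side = mid_side(left, right)

# The equal split assumed in the question: 128 kbps stereo -> 64 kbps each.
naive_per_channel_kbps = 128 / 2

# Fraction of the total energy that ends up in the side signal.
side_share = energy(side) / (energy(mid) + energy(side))
```

With channels this similar, almost all of the energy lands in the mid signal, so an encoder can spend most of its bits there and very few on the side signal. Left and right are still recoverable as mid + side and mid - side.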


Reply #16
[quote][quote author=IgorC link=msg=0 date=]Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.[/quote]
You're right about the non-mandatory features. But when I worked on it, up to 96 kbit/s mono was supported. That should be transparent. And supporting only CBR might be less of an issue than you might think, see below.[/quote]
The source code for the fixed point implementation of the EVS is also available:
http://www.3gpp.org/DynaReport/26442.htm
The floating point source code should be available next year.

EVS also supports full band (48kHz). Stereo is not available yet, but hopefully will be provided in the future. A bit to signal stereo is reserved in the bitstream.
Mono up to 128 kbps is supported and already at 48 kbps listening results are great (as Chris showed with the graph). So I expect that it is transparent at 96 kbps mono.
But the main thing is that the EVS provides much better quality than the competition at low bitrates (<= 24.4 kbps). Some papers including listening test results should appear next year that will show this.

Reply #17
The MPEG-H listening tests have been published...
http://mpeg.chiariglione.org/standards/mpe...formance-report

Before you get too excited, the test only includes MPEG-H, three unidentified codecs, and a 3.5kHz LPF low anchor. I'm sure the folks at MPEG know more than me, but in my naive opinion the low anchor is too low.

Cheers,
David.

Reply #18
[quote]The MPEG-H listening tests have been published...
http://mpeg.chiariglione.org/standards/mpe...formance-report

Before you get too excited, the test only includes MPEG-H, three unidentified codecs, and a 3.5kHz LPF low anchor. I'm sure the folks at MPEG know more than me, but in my naive opinion the low anchor is too low.[/quote]

The anchors are dictated by the MUSHRA methodology: http://en.wikipedia.org/wiki/MUSHRA.
They serve as regulators for the score range.
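
For anyone wondering what the low anchor actually is: it is just the reference signal low-pass filtered at 3.5 kHz. A minimal pure-Python sketch follows, assuming a 48 kHz sample rate and a 101-tap Hamming-windowed sinc filter. Both are my choices for illustration, not something the MUSHRA methodology prescribes, and the function names are made up.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate_hz, num_taps=101):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unity DC gain."""
    fc = cutoff_hz / sample_rate_hz  # cutoff as a fraction of the sample rate
    centre = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - centre
        # Ideal low-pass impulse response, with the k == 0 limit handled.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Direct-form FIR convolution; output has the same length as the input."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * signal[i - j]
        out.append(acc)
    return out


def make_low_anchor(reference, sample_rate_hz=48000):
    """Band-limit the reference to 3.5 kHz to produce a low-anchor signal."""
    return fir_filter(reference, lowpass_taps(3500, sample_rate_hz))
```

Feeding the reference samples (as a list of floats) to `make_low_anchor` gives a deliberately dull-sounding version that listeners should rate near the bottom of the scale, which is what pins down the low end of the score range.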

Reply #19
Thank you. I'd forgotten the 3.5kHz one was mandatory.

Cheers,
David.

Reply #20
[quote][quote author=IgorC link=msg=0 date=]I think most of the new surround formats are based on MPEG Surround or a fork of it. ... The bad news is that although the standard was adopted in 2007, there is no publicly available encoder.[/quote]
I think in case of MPEG-H 3D-Audio it's a bit different from MPEG Surround.[/quote]

The extra additions for the "USAC-plus" core codec in MPEG-H 3D-Audio (MHA? 3DA?) seem like they'd be really useful for normal, stereo mid-bitrate situations. At least one of the tools, the combined parametric + residual stereo mixer, simply reads like someone forgot to include it in the original USAC spec. 

I wonder if there's been any effort to characterize how 3DA performs as a basic stereo music/voice codec.

Reply #21


Ah, it seems like stereo subjective testing will be part of the submission process for ATSC 3.0.

It'll be between MPEG-H, Dolby, and DTS. I hope the results will be made public.

Reply #22
I wish they also allowed public contribution through the internet. Just hope their testing methodology isn't flawed.

Reply #23
Since surround formats are being discussed, it's worth mentioning that the AC-3 patents will expire sometime in 2017.

The new MPEG-H 3D-Audio will still offer a higher number of channels than AC-3 5.1, so compression efficiency is only one variable here. Some parts of the industry may eventually prefer patent-free AC-3 in the future. AC-3 is one of the most popular audio formats on DVD and Blu-ray.

This could lead to the same situation as with the JPEG image format. All of its patents have expired, and a number of newer image compression formats have appeared (JPEG2000, JPEG XR, WebP (intra VP8) and lately the HEVC-based BPG), but none of them has come close to JPEG's status as the leading, most popular image format.

Reply #24
[quote]This could lead to the same situation as with the JPEG image format. All of its patents have expired, and a number of newer image compression formats have appeared (JPEG2000, JPEG XR, WebP (intra VP8) and lately the HEVC-based BPG), but none of them has come close to JPEG's status as the leading, most popular image format.[/quote]

The same story will probably happen with mp3. We are just two years away from the expiry of its last patents in April 2017. With zillions of mp3-supporting devices around the world, that will seriously reduce the importance of free lossy audio codecs like ogg or opus, at least at high bitrates (128k+).

 