HydrogenAudio

Lossy Audio Compression => Other Lossy Codecs => Topic started by: 2Bdecided on 2014-09-18 10:53:32

Title: State of the art lossy codecs and surround formats
Post by: 2Bdecided on 2014-09-18 10:53:32
I thought I'd share these for interest. It's amazing how much is happening in the audio world right now.

In terms of new-ish lossy codecs and surround formats, I know of the following - anyone know of any others?

My explanations, where included, are from my current understanding - reality may be more complex.


Opus

I don't need to explain this one.
http://opus-codec.org/ (http://opus-codec.org/)
http://en.wikipedia.org/wiki/Opus_(audio_format) (http://en.wikipedia.org/wiki/Opus_(audio_format))
http://www.hydrogenaud.io/forums/index.php?showtopic=106911 (http://www.hydrogenaud.io/forums/index.php?showtopic=106911)
etc


xHE-AAC

This includes some information about Extended HE-AAC - see slide 15 onwards:
http://www.irt.de/webarchiv/showdoc.php?z=...DA1MjE2I3BkZg== (http://www.irt.de/webarchiv/showdoc.php?z=NjI4MSMxMDA1MjE2I3BkZg==)

There are some xHE-AAC and Opus samples here:
http://www.indexcom.com/streaming/codec/ (http://www.indexcom.com/streaming/codec/)
They're not mine, and I don't know what software versions were used to encode them.
The Beatles sample is particularly challenging at low bitrates with its (uncomfortably) wide stereo.

There are some very low bitrate examples in this demo video:
http://www.drm.org/?page_id=2396 (http://www.drm.org/?page_id=2396)
(click "DRM-xHE-AAC demo")


Dolby AC-4

Dolby's latest codec, aiming to reduce the required bitrate compared to E-AC-3.

A few details, and a link to the spec, here:
http://www.investincotedazur.com/en/info/n...codec-standard/ (http://www.investincotedazur.com/en/info/news/etsi-releases-ac-4-the-new-generation-audio-codec-standard/)


MPEG-H Part 3: Audio

A new 3D audio standard, supporting a mixture of channel-based (like conventional 5.1), object-based (mono source + position information), and scene based (Higher Order Ambisonics) audio sources, efficient encoding of them, and a renderer to put it all back together and feed it to whatever speaker array (or headphones) you have.

They are targeting 256kbps - 1.2Mbps in the first phase (effectively complete - the standard will be published in February 2015). It delivers close to "true transparency" at 1.2Mbps for 22.2 channels:
http://multimediacommunication.blogspot.co...th-meeting.html (http://multimediacommunication.blogspot.co.uk/2013/08/mpeg-news-report-from-105th-meeting.html)

They are targeting 48kbps - 128kbps in the second phase. It delivers "Good" quality (MUSHRA scale) at 128kbps for 22.2 channels:
http://mpeg.chiariglione.org/meetings/109 (http://mpeg.chiariglione.org/meetings/109)
(see press release).


Dolby Atmos Home

From the horse's mouth:
http://blog.dolby.com/2014/06/dolby-atmos-...tions-answered/ (http://blog.dolby.com/2014/06/dolby-atmos-home-theaters-questions-answered/)
http://blog.dolby.com/2014/06/dolby-atmos-...ving-room-near/ (http://blog.dolby.com/2014/06/dolby-atmos-coming-soon-living-room-near/)


ECMA-407

This seems to be a way of parametrically encoding multiple (e.g. 22.2) channels, using an extremely low bitrate addition to a standard lossy codec (e.g. AAC). There is a 22.2 demonstration running at 256kbps (using AAC).

The spec is here:
http://www.ecma-international.org/publicat...ds/Ecma-407.htm (http://www.ecma-international.org/publications/standards/Ecma-407.htm)

The company behind it is here:
http://www.swissaudec.com/ (http://www.swissaudec.com/)


Cheers,
David.
Title: State of the art lossy codecs and surround formats
Post by: ktf on 2014-09-18 11:01:23
There's Xiph.org Ghost (http://xiph.org/~xiphmont/demo/ghost/demo.html) of course, but it's more a bunch of ideas and not really a codec (yet) if I understood correctly.
Title: State of the art lossy codecs and surround formats
Post by: Kees de Visser on 2014-09-18 11:38:26
Nice list David. Please allow me to add "the" second 3D format that is more music based, where Dolby Atmos is more cinema based.

Auro-3D

"Auro-3D (http://www.auro-3d.com/) is the next generation three-dimensional audio standard. It provides a realistic sound experience unlike anything before. By fully immersing the listener in a cocoon of life-like sound, Auro-3D® creates the sensation of actually 'being there'. Thanks to a unique 'Height' channel configuration, acoustic reflections are generated and heard naturally due to the fact that sounds originate from around as well as above the listener."
Title: State of the art lossy codecs and surround formats
Post by: probedb on 2014-09-18 12:29:44
Isn't DTS-UHD similar to Auro-3D?
Title: State of the art lossy codecs and surround formats
Post by: Sebastian Mares on 2014-09-18 14:21:43
I just read on the German heise online that Apple and U2 are working on a new format. Does anyone happen to know more about it?
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2014-09-18 14:27:22
I think most of the new surround formats are based on MPEG Surround or a fork of it.
http://mpeg.chiariglione.org/standards/mpeg-d (http://mpeg.chiariglione.org/standards/mpeg-d)

The bad news is that although the standard was adopted in 2007, seven years ago, there is still no publicly available encoder.
The same goes for USAC/xHE-AAC (2011).
Title: State of the art lossy codecs and surround formats
Post by: C.R.Helmrich on 2014-09-19 13:18:08
Quote from: 2Bdecided
In terms of new-ish lossy codecs and surround formats, I know of the following - anyone know of any others?

There's also the 3GPP Enhanced Voice Services (EVS) project. Standardization is currently in its last stage, see here (http://www.3gpp.org/DynaReport/FeatureOrStudyItemFile-470030.htm).
This is basically a low-delay (< 30 ms) speech and music codec for up to super-wide bandwidth (SWB) even at very low bit-rates (e.g. 14-16 kHz bandwidth at 13 kbit/s mono).

Quote from: IgorC
I think most of the new surround formats are based on MPEG Surround or a fork of it. ... Bad news is that while the standard was adopted in 2007 there is no publicly available encoder.

I think in case of MPEG-H 3D-Audio it's a bit different from MPEG Surround.
The Sonnox Pro-Codec (http://www.sonnoxplugins.com/pub/plugins/products/pro-codec.htm) plug-in (and maybe also the Codec Toolbox (http://www.sonnoxplugins.com/pub/plugins/products/codec/codectoolbox.html) plug-in) can encode to AAC with MPEG Surround.
But both aren't available for free, if that's what you mean by "publicly available".

Chris
Title: State of the art lossy codecs and surround formats
Post by: Kohlrabi on 2014-09-19 15:25:20
Quote from: Sebastian Mares
I just read on the German heise online that Apple and U2 are working on a new format. Does anyone happen to know more about it?
I wonder what kind of expertise U2 has w.r.t. audio technology... or maybe it will be as ill-advised as the Pono bullshit by Neil Young.

EDIT: It's been slashdotted. (http://apple.slashdot.org/story/14/09/19/1324241/u2-and-apple-collaborate-on-non-piratable-interactive-format-for-music)
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2014-09-19 16:39:01
Quote from: C.R.Helmrich
There's also the 3GPP Enhanced Voice Services (EVS) project. Standardization is currently in its last stage, see here (http://www.3gpp.org/DynaReport/FeatureOrStudyItemFile-470030.htm).
This is basically a low-delay (< 30 ms) speech and music codec for up to super-wide bandwidth (SWB) even at very low bit-rates (e.g. 14-16 kHz bandwidth at 13 kbit/s mono).

EVS is a very interesting format for telcos. 5.9 kbps for wideband voice is outstanding.
http://docbox.etsi.org/workshop/2012/20121...m_varga_evs.pdf (http://docbox.etsi.org/workshop/2012/201211_stqworkshop/s6_hdandcatiq/qualcomm_varga_evs.pdf)

Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.
And it says that EVS would have "near" (not superior to) AAC quality for music content.


Quote from: C.R.Helmrich
The Sonnox Pro-Codec (http://www.sonnoxplugins.com/pub/plugins/products/pro-codec.htm) plug-in (and maybe also the Codec Toolbox (http://www.sonnoxplugins.com/pub/plugins/products/codec/codectoolbox.html) plug-in) can encode to AAC with MPEG Surround.
But both aren't available for free, if that's what you mean by "publicly available".

I didn't know about these applications. Thank you. There is no hardware or software support for it, though. I don't mean to be harsh, but if the same fate awaits USAC, it will be just as unpopular and unused. That would be a missed opportunity.

At least in my opinion, the open-source AAC encoder (FDK) was a step in the right direction.


Title: State of the art lossy codecs and surround formats
Post by: 2012 on 2014-09-19 20:24:47
Quote from: IgorC
EVS is a very interesting format for telcos. 5.9 kbps for wideband voice is outstanding.


That just reminded me of codec2.
Apparently, the Free Telephony Project is still active.

http://www.rowetel.com/blog/?page_id=452 (http://www.rowetel.com/blog/?page_id=452)
http://freedv.org/ (http://freedv.org/)
http://sourceforge.net/p/freetel/code/HEAD/tree/ (http://sourceforge.net/p/freetel/code/HEAD/tree/)
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2014-09-20 18:47:02
Quote from: 2Bdecided
There are some xHE-AAC and Opus samples here:
http://www.indexcom.com/streaming/codec/ (http://www.indexcom.com/streaming/codec/)
...

That xHE-AAC encoder isn't a particularly high-quality one.

Here (https://drive.google.com/folderview?id=0ByvUr-pp6BuUdmlYdGk3YlgyRFU&usp=sharing) are some USAC and HE-AAC files from the Unified Speech and Audio Coding Verification Test Report (http://mpeg.chiariglione.org/sites/default/files/files/standards/parts/docs/w12232-v2-w12232.zip).

P.S. Nice video about USAC http://mpeg.chiariglione.org/ (http://mpeg.chiariglione.org/)
Title: State of the art lossy codecs and surround formats
Post by: darkbyte on 2014-09-20 22:58:47
Quote from: Sebastian Mares
I just read on the German heise online that Apple and U2 are working on a new format. Does anyone happen to know more about it?

My first thought was HD-AAC. But that's just my guess.

---

xHE-AAC/USAC sounds interesting. Is there any paper that details the difference between SBR and eSBR? From the spectrum, it seems eSBR has more detailed control over the SBR replication tools, and USAC's more efficient coding allows a higher crossover frequency between LC-AAC and SBR than HE-AAC at the same bitrate.

For me, at 64kbps and above, Opus is better than HE-AAC. For speech I would use Codec2. I was amazed how understandable the output is at those extremely low bitrates. (I wonder why none of the digital radio broadcast standards adopt an updateable codec architecture with on-air downloaded decoders? Nowadays even low-end ARM processors are strong enough to decode stereo audio thrown at them in any codec format.)
Title: State of the art lossy codecs and surround formats
Post by: C.R.Helmrich on 2014-09-21 14:00:47
Quote from: darkbyte
xHE-AAC/USAC sounds interesting. Is there any paper that details the difference between SBR and eSBR?

I found this (http://www.researchgate.net/publication/233389117_Audio_Engineering_Society_Convention_Paper_A_Novel_Scheme_for_Low_Bitrate_Unified_Speech_and_Audio_Coding_-MPEG_RM0), at the right side you can click on "View" to download it.

Quote from: IgorC
Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.

You're right about the non-mandatory features. But when I worked on it, up to 96 kbit/s mono was supported. That should be transparent. And supporting only CBR might be less of an issue than you might think, see below.

Quote from: IgorC
And it says that EVS would have "near" (not superior to) AAC quality for music content.

Last year we did a MUSHRA test at 48 kbit/s mono with an early version of the EVS codec, in which AAC was outperformed even though the EVS codec uses hard CBR (CELT ran at unconstrained VBR, the other codecs at CVBR).
Not sure about the lower bit-rates, but I don't think it will be worse than AAC (it might be worse than xHE-AAC, though, a consequence of the low-delay constraint).

[attachment=8021:icassp14.png] figure taken from this ICASSP paper (http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6854948).

Chris
Title: State of the art lossy codecs and surround formats
Post by: plasticpitchfork on 2014-09-21 14:56:43
Quote from: probedb
Isn't DTS-UHD similar to Auro-3D?

I thought that "height" channels were already an option in 7.1, instead of more back speakers. I could be confusing things, though. I know that many 7.1 receivers have an option for height channels.

Also, please don't rip on me if you choose to answer this question quickly; yes, I have tried to look it up. When a multichannel lossy codec has a bitrate of, e.g., 320kbps, is that divided among five or seven speakers, like 64kbps per channel? I assume this is true, as 128kbps stereo is 64kbps per channel, right? To save you time, I'm OK with "Correct," "No," or "It's more complicated than that, go look it up some more."
Title: State of the art lossy codecs and surround formats
Post by: smok3 on 2014-09-21 17:06:59
@plasticpitchfork No. ("Channel coupling" is the thing to google.)
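
To make the "No" a bit more concrete, here is a minimal sketch contrasting the naive equal split with a shared bit budget. It does not model channel coupling itself; it only assumes that the channels draw on one common pool, and the per-channel `demands` scores are made-up numbers for illustration, not real codec data.

```python
def equal_split(total_kbps, channels):
    """Naive model: every channel gets the same share of the total bitrate."""
    return [total_kbps / channels] * channels


def shared_pool_split(total_kbps, demands):
    """Toy shared-pool model: split the total in proportion to each channel's demand.

    `demands` are hypothetical per-channel complexity scores (an assumption).
    """
    total_demand = sum(demands)
    return [total_kbps * d / total_demand for d in demands]


# 320 kbps over five channels: busy front channels, nearly silent surrounds.
naive = equal_split(320, 5)
pooled = shared_pool_split(320, [4, 4, 6, 1, 1])
```

In this toy case the total is still 320 kbps, but no single channel is pinned to 64 kbps, which is why a fixed per-channel figure can be misleading.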
Title: State of the art lossy codecs and surround formats
Post by: plasticpitchfork on 2014-09-24 09:24:09
Quote from: smok3
@plasticpitchfork No. ("Channel coupling" is the thing to google.)

Thanks!
Title: State of the art lossy codecs and surround formats
Post by: muaddib on 2014-10-07 11:33:36
Quote from: IgorC
Though again, this format doesn't reach the transparent region. Features like stereo, full band 44.1/48 kHz, and VBR aren't mandatory.

Quote from: C.R.Helmrich
You're right about the non-mandatory features. But when I worked on it, up to 96 kbit/s mono was supported. That should be transparent. And supporting only CBR might be less of an issue than you might think, see below.

The source code for the fixed point implementation of the EVS is also available:
http://www.3gpp.org/DynaReport/26442.htm (http://www.3gpp.org/DynaReport/26442.htm)
The floating point source code should be available next year.

EVS also supports full band (48kHz). Stereo is not available yet, but hopefully will be provided in the future. A bit to signal stereo is reserved in the bitstream.
Mono up to 128 kbps is supported and already at 48 kbps listening results are great (as Chris showed with the graph). So I expect that it is transparent at 96 kbps mono.
But the main thing is that the EVS provides much better quality than the competition at low bitrates (<= 24.4 kbps). Some papers including listening test results should appear next year that will show this.
Title: State of the art lossy codecs and surround formats
Post by: 2Bdecided on 2014-11-17 11:57:48
The MPEG-H listening tests have been published...
http://mpeg.chiariglione.org/standards/mpe...formance-report (http://mpeg.chiariglione.org/standards/mpeg-h/3d-audio/mpeg-h-3d-audio-performance-report)

Before you get too excited, the test only includes MPEG-H, three unidentified codecs, and a 3.5kHz LPF low anchor. I'm sure the folks at MPEG know more than me, but in my naive opinion the low anchor is too low.

Cheers,
David.
Title: State of the art lossy codecs and surround formats
Post by: muaddib on 2014-11-18 06:46:10
Quote from: 2Bdecided
The MPEG-H listening tests have been published...
http://mpeg.chiariglione.org/standards/mpe...formance-report (http://mpeg.chiariglione.org/standards/mpeg-h/3d-audio/mpeg-h-3d-audio-performance-report)

Before you get too excited, the test only includes MPEG-H, three unidentified codecs, and a 3.5kHz LPF low anchor. I'm sure the folks at MPEG know more than me, but in my naive opinion the low anchor is too low.

The anchors are dictated by the MUSHRA methodology: http://en.wikipedia.org/wiki/MUSHRA (http://en.wikipedia.org/wiki/MUSHRA).
They serve as regulators for the score range.
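
For anyone wondering what the 3.5kHz low anchor amounts to in practice, it is just a low-pass filtered copy of the reference. Below is a minimal sketch using only the Python standard library. The filter design (a 101-tap Hamming-windowed sinc at 48 kHz) is an assumption chosen for illustration, not the filter prescribed by the MUSHRA recommendation or used in the MPEG test.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Hamming-windowed sinc low-pass taps, normalised to unity DC gain."""
    fc = cutoff_hz / sample_rate  # normalised cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]


def apply_fir(signal, taps):
    """Direct convolution, keeping only the fully overlapped part of the output."""
    n = len(taps)
    return [
        sum(taps[k] * signal[i + n - 1 - k] for k in range(n))
        for i in range(len(signal) - n + 1)
    ]


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, sample_rate, length):
    """Unit-amplitude sine test tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(length)]


SAMPLE_RATE = 48000
taps = lowpass_fir(3500, SAMPLE_RATE)
low = rms(apply_fir(tone(1000, SAMPLE_RATE, 2000), taps))    # inside the passband
high = rms(apply_fir(tone(10000, SAMPLE_RATE, 2000), taps))  # far above the cutoff
```

The 1 kHz tone should come through at roughly its original level while the 10 kHz tone is almost entirely removed, which is what makes the anchor sound so obviously degraded.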
Title: State of the art lossy codecs and surround formats
Post by: 2Bdecided on 2014-11-18 09:46:51
Thank you. I'd forgotten the 3.5kHz one was mandatory.

Cheers,
David.
Title: State of the art lossy codecs and surround formats
Post by: mavere on 2015-03-05 08:04:21
Quote from: IgorC
I think most of the new surround formats are based on MPEG Surround or a fork of it. ... Bad news is that while the standard was adopted in 2007 there is no publicly available encoder.

Quote from: C.R.Helmrich
I think in case of MPEG-H 3D-Audio it's a bit different from MPEG Surround.


The extra additions for the "USAC-plus" core codec in MPEG-H 3D-Audio (MHA? 3DA?) seem like they'd be really useful for normal, stereo mid-bitrate situations. At least one of the tools, the combined parametric + residual stereo mixer, simply reads like someone forgot to include it in the original USAC spec. 

I wonder if there's been any effort to characterize how 3DA performs as a basic stereo music/voice codec.
Title: State of the art lossy codecs and surround formats
Post by: mavere on 2015-03-06 04:05:07
(http://i.imgur.com/GS8QWsv.png)

Ah, it seems like stereo subjective testing will be part of the submission process for ATSC 3.0.

It'll be between MPEG-H, Dolby, and DTS. I hope the results will be made public.
Title: State of the art lossy codecs and surround formats
Post by: LithosZA on 2015-03-06 11:53:18
I wish they also allowed public contribution through the internet. Just hope their testing methodology isn't flawed.
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2015-04-12 18:36:32
Since surround formats are being discussed, it's worth mentioning that the AC-3 patents will expire sometime in 2017.

The new MPEG-H 3D-Audio will still offer a higher number of channels than AC-3 5.1, so compression efficiency is only one variable here. Some parts of the industry may eventually prefer a patent-free AC-3. AC-3 is the most popular audio format on DVD and Blu-ray.

This could lead to the same situation as with the JPEG image format. All its patents have expired, and there have been a number of newer image compression formats (JPEG2000, JPEG XR, WebP (intra VP8) and lately the HEVC-based BPG), but none of them has come anywhere close to JPEG's status as the leading/most popular image format.
Title: State of the art lossy codecs and surround formats
Post by: SokilOff on 2015-04-13 01:41:22
Quote from: IgorC
This could lead to the same situation as with the JPEG image format. All its patents have expired, and there have been a number of newer image compression formats (JPEG2000, JPEG XR, WebP (intra VP8) and lately the HEVC-based BPG), but none of them has come anywhere close to JPEG's status as the leading/most popular image format.

The same story will probably happen with mp3. We are just 2 years away from the expiry of its last patents in April 2017. With zillions of mp3-supporting devices around the world, that will seriously reduce the importance of free lossy audio codecs like ogg or opus, at least at high bitrates (128k+).
Title: State of the art lossy codecs and surround formats
Post by: saratoga on 2015-04-13 15:51:02
Quote from: IgorC
This could lead to the same situation as with the JPEG image format. All its patents have expired, and there have been a number of newer image compression formats (JPEG2000, JPEG XR, WebP (intra VP8) and lately the HEVC-based BPG), but none of them has come anywhere close to JPEG's status as the leading/most popular image format.

Quote from: SokilOff
The same story will probably happen with mp3. We are just 2 years away from the expiry of its last patents in April 2017. With zillions of mp3-supporting devices around the world, that will seriously reduce the importance of free lossy audio codecs like ogg or opus, at least at high bitrates (128k+).


IIUC, the last mp3 decoder patent in the USA expires in 2 weeks.
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2015-04-15 14:15:30
Talking about mp3 patents https://docs.google.com/spreadsheets/d/1R8K...1Jt0/edit#gid=0 (https://docs.google.com/spreadsheets/d/1R8KdbQCX-PPRRU3-d8FkcLb-lNsi705NpKJ7YQi1Jt0/edit#gid=0)

Original post http://www.hydrogenaud.io/forums/index.php...st&p=789994 (http://www.hydrogenaud.io/forums/index.php?s=&showtopic=94049&view=findpost&p=789994)
Title: State of the art lossy codecs and surround formats
Post by: SokilOff on 2015-04-16 13:50:35
Quote from: IgorC
Talking about mp3 patents https://docs.google.com/spreadsheets/d/1R8K...1Jt0/edit#gid=0 (https://docs.google.com/spreadsheets/d/1R8KdbQCX-PPRRU3-d8FkcLb-lNsi705NpKJ7YQi1Jt0/edit#gid=0)


Seems like this list is more detailed: MPEG-1 Audio Layer 3 patents (http://scratchpad.wikia.com/wiki/MPEG_patent_lists).
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2015-04-16 14:23:31
Yes, it's more complete.
Though both sources indicate that the last MP3 patent will expire on 16/04/2017, exactly 2 years from today.
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2015-04-17 01:27:50
Quote from: mavere
The extra additions for the "USAC-plus" core codec in MPEG-H 3D-Audio (MHA? 3DA?) seem like they'd be really useful for normal, stereo mid-bitrate situations. At least one of the tools, the combined parametric + residual stereo mixer, simply reads like someone forgot to include it in the original USAC spec. 

Wait a minute. A combination of parametric + residual isn't part of USAC? According to this presentation, it is: http://www.filedropper.com/aes132mpegxhe-a...aunhoffer132aes (http://www.filedropper.com/aes132mpegxhe-aacfraunhoffer132aes).

Yes, it could be useful for stereo content at 48 kbps and maybe somewhat up to 64 kbps.
Title: State of the art lossy codecs and surround formats
Post by: ethan11james on 2015-04-17 11:15:04
nice lists thanks for information
Title: State of the art lossy codecs and surround formats
Post by: mavere on 2015-04-18 20:18:47
Quote from: mavere
The extra additions for the "USAC-plus" core codec in MPEG-H 3D-Audio (MHA? 3DA?) seem like they'd be really useful for normal, stereo mid-bitrate situations. At least one of the tools, the combined parametric + residual stereo mixer, simply reads like someone forgot to include it in the original USAC spec. 

Quote from: IgorC
Wait a minute. A combination of parametric + residual isn't part of USAC? According to this presentation, it is: http://www.filedropper.com/aes132mpegxhe-a...aunhoffer132aes (http://www.filedropper.com/aes132mpegxhe-aacfraunhoffer132aes).



From what I understand, USAC switches between parametric and residual per subband in an either/or fashion. MPEG-H adds (always-on) implicit weighting between parametric and residual information based on the difference between their energies for that band.

Quote from: IgorC
Yes, it could be useful for stereo content at 48 kbps and maybe somewhat up to 64 kbps.


In terms of usefulness at higher bitrates, I was thinking of the addition of the transform splitting and intelligent gap filling tools. The former allows block lengths of 512, as opposed to only 128 or 1024, which seems applicable to most bitrate ranges. It's a bit harder to characterize what they call IGF (intelligent gap filling), but it reads like a generalization of SBR-esque coding to the MDCT domain with more precise/localized frequency replacement. The IGF tool is self-limiting, unlike SBR, in that it doesn't do anything at high bitrates when there are no spectral gaps to replace. That last detail is what I thought HE-AAC needed from the very beginning.
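
The self-limiting behaviour can be pictured with a toy gap filler. This is only a sketch of the general idea as described above, not the IGF algorithm from the spec: the function name, the copy-from-a-lower-bin rule, and the single `target_energy` value are all assumptions made for illustration.

```python
def fill_gaps(coeffs, start_bin, source_offset, target_energy):
    """Toy gap filling on one block of quantised MDCT-like coefficients.

    Bins at or above `start_bin` that were quantised to zero are replaced by
    the bin `source_offset` positions lower, scaled so the filled bins together
    carry `target_energy`. Non-zero bins are never touched.
    """
    if source_offset > start_bin:
        raise ValueError("source_offset must not exceed start_bin")
    gaps = [i for i in range(start_bin, len(coeffs)) if coeffs[i] == 0]
    source_energy = sum(coeffs[i - source_offset] ** 2 for i in gaps)
    if not gaps or source_energy == 0:
        return list(coeffs)  # nothing to fill: the tool does nothing
    scale = (target_energy / source_energy) ** 0.5
    out = list(coeffs)
    for i in gaps:
        out[i] = coeffs[i - source_offset] * scale
    return out


# Low bitrate: the upper half was quantised to zero, so it gets filled.
low_rate = fill_gaps([4, -2, 2, 4, 0, 0, 0, 0],
                     start_bin=4, source_offset=4, target_energy=10)
# High bitrate: no zeros above the start bin, so the block is returned unchanged.
high_rate = fill_gaps([4, -2, 2, 4, 1, -1, 2, 1],
                      start_bin=4, source_offset=4, target_energy=10)
```

In the second call every upper bin survived quantisation, so there is no gap to replace and the output equals the input.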

One last this-would-have-been-nice-to-have-since-forever thing is the built-in ability to hide priming samples in pre-roll frames that the decoder will simply discard once decoded.
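
A rough way to picture the pre-roll idea, assuming a toy overlap-add decoder: the `(head, tail)` frame layout and the `decode` function below are invented for illustration and have nothing to do with the actual MPEG-H bitstream syntax.

```python
def decode(frames, preroll=0):
    """Toy overlap-add decoder.

    Each coded frame is a (head, tail) pair of equal-length lists. An output
    block is the previous frame's tail plus the current frame's head. The
    first `preroll` output blocks are decoded (to warm up the state) and then
    dropped.
    """
    blocks = []
    prev_tail = None
    for head, tail in frames:
        if prev_tail is None:
            prev_tail = [0] * len(head)  # cold start: no history to overlap with
        blocks.append([a + b for a, b in zip(prev_tail, head)])
        prev_tail = tail
    return [sample for block in blocks[preroll:] for sample in block]


stream = [([9, 9], [1, 1]), ([2, 2], [3, 3]), ([4, 4], [5, 5])]
without_preroll = decode(stream)           # first block is cold-start junk
with_preroll = decode(stream, preroll=1)   # junk block silently discarded
```

With `preroll=1` the player never sees the cold-start block, so there is no need for out-of-band "skip N samples" metadata in this toy model.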
Title: State of the art lossy codecs and surround formats
Post by: IgorC on 2015-05-25 20:37:34
If I understand correctly, SBR/eSBR will be replaced by IGF in 3DA. Description of IGF (https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2015010948&recNum=1&maxRec=&office=&prevFilter=&sortOption=&queryString=&tab=PCTDescription)

IGF should be considerably better than SBR/eSBR.

If this is so, then it's great! When I heard USAC with its eSBR for the first time, my first thought was "this is still SBR-ish". USAC's eSBR still uses the QMF transform, just like HE-AAC's SBR, which still causes both frequency and temporal/transient distortions. It's still a bit better, though: USAC (with eSBR) at 64 kbps is on par with HE-AAC (with SBR) at 70-75 kbps.
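
To put that last comparison in relative terms, a quick bit of arithmetic (the helper name is made up; the inputs are just the figures quoted above):

```python
def bitrate_saving(new_kbps, old_kbps):
    """Fraction of bitrate saved by the newer codec at roughly equal quality."""
    return 1 - new_kbps / old_kbps


low = bitrate_saving(64, 70)
high = bitrate_saving(64, 75)
print(f"{low:.1%} to {high:.1%}")  # prints "8.6% to 14.7%"
```

So "a bit better" here works out to somewhere around 9 to 15 percent fewer bits.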


So it seems that 3DA can bring substantial improvements (IGF, weighting between parametric tools and residual coding for stereo, etc.) over an interesting range of bitrates for multichannel as well as stereo material.

I hope for only one thing: that at least one high-quality encoder will be available soon. Yes, I have repeated it many times and I will repeat it again: there is still no MPEG Surround or USAC encoder available to mere mortals. Or is it just that developers have stopped caring?
Title: Re: State of the art lossy codecs and surround formats
Post by: IgorC on 2016-05-28 18:35:54
An interesting paper


http://www.eurasip.org/Proceedings/Eusipco/Eusipco2015/papers/1570102685.pdf

"LOW-COMPLEXITY SEMI-PARAMETRIC JOINT-STEREO AUDIO TRANSFORM CODING ":
Quote
...
The SBS system could, of course, also be integrated into legacy AAC, CELT, or any other transform codec using spectral bands
...