HydrogenAudio

Lossy Audio Compression => Other Lossy Codecs => Topic started by: C.R.Helmrich on 2021-01-09 12:06:08

Title: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: C.R.Helmrich on 2021-01-09 12:06:08
Just found this blog post: https://techcommunity.microsoft.com/t5/microsoft-teams-blog/get-the-most-from-your-meetings-and-calls-with-microsoft-teams/ba-p/1911016

Attached are the two demonstration files linked to in that post (Silk.wav (https://cdn.techcommunity.microsoft.com/assets/MicrosoftTeams/Silk.wav) and Satin.wav (https://cdn.techcommunity.microsoft.com/assets/MicrosoftTeams/Satin.wav)), with the Silk version upsampled to 32 kHz and properly delay matched with the Satin version, for more reliable comparison.
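
For anyone who wants to repeat the upsample-and-align step on their own files, here is a minimal sketch assuming NumPy. It is not necessarily how the attached files were prepared: the factor of 4 assumes an 8 kHz source, the function names are made up for illustration, and synthetic noise stands in for the two recordings.

```python
import numpy as np

def upsample(x, factor, taps_per_phase=16):
    """Integer-factor upsampling: zero-stuff, then windowed-sinc low-pass."""
    n = np.arange(-taps_per_phase * factor, taps_per_phase * factor + 1)
    # Cutoff at the original Nyquist frequency; the gain of `factor`
    # that compensates for the inserted zeros is built into the kernel.
    h = np.sinc(n / factor) * np.hamming(len(n))
    stuffed = np.zeros(len(x) * factor)
    stuffed[::factor] = x
    # Odd-length symmetric kernel with mode="same": no extra delay is added.
    return np.convolve(stuffed, h, mode="same")

def estimate_delay(ref, sig):
    """Lag in samples by which `sig` trails `ref` (cross-correlation peak)."""
    corr = np.correlate(sig, ref, mode="full")
    return int(np.argmax(corr)) - (len(ref) - 1)

def align(ref, sig):
    """Trim both signals so that they start at the same instant."""
    d = estimate_delay(ref, sig)
    if d >= 0:
        sig = sig[d:]
    else:
        ref = ref[-d:]
    n = min(len(ref), len(sig))
    return ref[:n], sig[:n]

# Synthetic stand-ins (assumption): noise instead of the real decodes.
rng = np.random.default_rng(0)
narrow = rng.standard_normal(2000)      # pretend 8 kHz Silk decode
silk_32k = upsample(narrow, 4)          # now at 32 kHz
# Pretend Satin decode: same audio, arriving 37 samples late.
satin_32k = np.concatenate([np.zeros(37), silk_32k])[:len(silk_32k)]

delay = estimate_delay(silk_32k, satin_32k)
silk_aligned, satin_aligned = align(silk_32k, satin_32k)
print("estimated delay in samples:", delay)
```

`np.correlate` computes the correlation directly, so for recordings longer than a few seconds an FFT-based correlation would be the more practical choice.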

Judging from the file name and sampling rate, Microsoft Teams previously used Silk (the speech coding core of Opus) in a Narrowband configuration (audio only up to 4 kHz), at least at low bit-rates. The new codec seems to achieve Super-wideband coding (the audio range up to 8 kHz is waveform coded, with some simple SBR-like audio bandwidth extension from Wideband 8 kHz to Super-wideband 16 kHz).
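
A rough way to check such bandwidth guesses is to look for the highest frequency that still carries energy. The sketch below assumes NumPy; the -60 dB floor, the function names, and the class boundaries (4/8/16 kHz, following the figures above) are illustrative choices, and test tones stand in for real audio. Note that it cannot tell waveform-coded content from bandwidth-extended content; it only reports where the spectrum ends.

```python
import numpy as np

def effective_bandwidth_hz(x, sample_rate, floor_db=-60.0):
    """Highest frequency whose level is within `floor_db` of the spectral peak."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    level_db = 20 * np.log10(spectrum / spectrum.max() + 1e-12)
    above = np.nonzero(level_db > floor_db)[0]
    return float(freqs[above[-1]])

def bandwidth_class(bandwidth_hz):
    """Map an audio bandwidth to the usual speech-codec labels."""
    if bandwidth_hz <= 4000:
        return "narrowband"
    if bandwidth_hz <= 8000:
        return "wideband"
    if bandwidth_hz <= 16000:
        return "super-wideband"
    return "fullband"

# Test tones at 32 kHz sampling (assumption: stand-ins for decoded speech).
fs = 32000
t = np.arange(fs) / fs
nb_like = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 3500 * t)
wb_like = nb_like + np.sin(2 * np.pi * 7000 * t)

for label, sig in [("nb_like", nb_like), ("wb_like", wb_like)]:
    bw = effective_bandwidth_hz(sig, fs)
    print(label, round(bw), "Hz ->", bandwidth_class(bw))
```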

No clue which bit-rate this demo was made at, but judging, again, from the Silk.wav, it was likely quite low. Does anybody know more about this new codec?

The packet loss concealment demo in the above blog post is also quite convincing.

Chris
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: o-l-a-v on 2021-01-09 16:21:50
Found mentions of it back in March 2020 already.


Can't seem to find more details about it though.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: Porcus on 2021-01-09 19:56:50
My first reaction was "oh, just when the world desperately needed another WMA!", but maybe there is more to it?
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-01-09 23:14:04
Considering the high degree of similarity between WMA and LC/HE-AAC, and likewise between WMV/VC-1 and MPEG-4 Part 2 ASP/H.263(+), chances are high that the new MSFT codec (Satin) is EVS or based on it.
EVS provides WB at 7.2 kbps and now MSFT speaks about WB at 7 kbps. Too much of a coincidence (?)

P.S: as a user, I can say that MSFT Teams has a very good audio VoIP quality  and overall experience :)
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: binaryhermit on 2021-01-12 16:19:28
The name "satin" makes me think it's somehow related to the older "silk" codec that was part of Skype (and IIRC is used for speech in Opus or something like that).
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: C.R.Helmrich on 2021-01-12 23:44:38
It's definitely related to Silk. Microsoft acquired Skype in 2011 (https://news.microsoft.com/about/) and I assume that some of the developers of Silk (for Skype back then) improved upon that codec, and kept the naming scheme. After all, it's much easier and cheaper to gradually improve a codec than to develop a completely new one.

Chris
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-01-13 00:08:22
Oops, then my assumption about Satin being based on EVS was wrong.

Today I looked at the frequency response of Satin_32kHz.wav and it doesn't look like EVS's bandwidth extension. It looks more like Opus/CELT's band folding, or something else.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: saratoga on 2021-01-13 03:00:04
Quote
Considering a high grade of similarity between WMA with LC/HE-AAC

WMA is a stripped-down MDCT codec, so it is superficially similar to almost all modern codecs. I don't think AAC specifically was a huge inspiration; AC3 probably was, given the time frame and how similar the core codecs are. The option to let the encoder use tons of different transform lengths all in the same file seems like an (over?)reaction to how MP3 picked poor transform sizes and then was stuck with them. So maybe AC3 and MP3 as inspirations.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-01-15 00:39:08
Yes, you probably mean the older versions of WMA.
However, the latest WMA 10 Pro has SBR-like BWE, and its efficiency is on par with HE/LC-AAC.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: saratoga on 2021-01-17 17:57:42
^^^^ You meant WMA Pro. It's a completely different codec than WMA.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-01-19 21:04:56
ok, got it.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: Spyrοs on 2021-02-18 11:51:02
Satin: Microsoft’s latest AI-powered audio codec for real-time communications (https://techcommunity.microsoft.com/t5/microsoft-teams-blog/satin-microsoft-s-latest-ai-powered-audio-codec-for-real-time/ba-p/2141382)

New blog post by Microsoft with more details and audio samples. Impressive audio quality at 6 kbps.

Unless I missed something it doesn't mention whether this codec will be royalty-free or proprietary.

Quote
Satin is already being used for all Teams and Skype two-party calls and will roll out for Teams meetings soon. It currently operates in wideband voice mode within a bitrate range of 6 – 36 kbps and will be extended to support full-band stereo music at a maximum sampling rate of 48 kHz in the near future. We are very excited for you to try this new codec and let us know what you think.

I wonder about the higher bitrate performance and how it will compare to Opus.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-02-19 20:38:22
@Spyros, thank you for the link.

This is actually good news: MSFT Satin won't be just an AI-based speech codec, it will also support fullband music.  :)

It's clear now that next generation audio codecs will be AI-based.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: AiZ on 2021-02-27 08:53:19
Hello,

Just saw this on Phoronix (https://www.phoronix.com/scan.php?page=news_item&px=Google-Lyra) :
Google AI Blog: Lyra: A New Very Low-Bitrate Codec for Speech Compression (https://ai.googleblog.com/2021/02/lyra-new-very-low-bitrate-codec-for.html)

3kbps...  ???

    AiZ
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: Spyrοs on 2021-02-27 13:24:24
Quote
Hello,

Just saw this on Phoronix (https://www.phoronix.com/scan.php?page=news_item&px=Google-Lyra) :
Google AI Blog: Lyra: A New Very Low-Bitrate Codec for Speech Compression (https://ai.googleblog.com/2021/02/lyra-new-very-low-bitrate-codec-for.html)

Very impressive. Let's hope one of these (Satin or Lyra) becomes an open standard. I wonder how they compare with Codec2.

Paper linked at the blog post: Generative Speech Coding with Predictive Variance Regularization (https://arxiv.org/abs/2102.09660)

With these latest developments, it certainly feels like we are at the endgame for lossy audio codecs and that after 2022-2025 the improvements will be extremely tiny, if any.
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: IgorC on 2021-03-25 04:13:25
As if LPCNet @ 1.6 kbps (https://jmvalin.ca/demo/lpcnet_codec/) wasn't low enough, now they have outperformed it, and at an even lower rate of 0.9 kbps (yes, less than 1 kbps!): https://arxiv.org/pdf/2102.06610.pdf  ???
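
To put these rates in perspective, here is a quick back-of-the-envelope sketch of how many bits each one leaves per packet. The 20 ms frame length is purely an assumption for illustration; the codecs mentioned in this thread may well use other frame or packet durations.

```python
def bits_per_frame(bitrate_bps, frame_ms=20):
    """Payload bits available per frame at a constant bitrate."""
    return bitrate_bps * frame_ms // 1000

# Bitrates quoted in this thread; the 20 ms frame length is an assumption.
rates_bps = {
    "Satin, low end (6 kbps)": 6000,
    "Lyra (3 kbps)": 3000,
    "LPCNet (1.6 kbps)": 1600,
    "0.9 kbps paper": 900,
}

for name, bps in rates_bps.items():
    print(f"{name}: {bits_per_frame(bps)} bits per 20 ms frame")
```

At 0.9 kbps that comes to 18 bits per 20 ms, a little over two bytes.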
Title: Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams
Post by: binaryhermit on 2021-03-25 11:42:08
Any samples at such a low bitrate?

I'm extremely skeptical.