Topic: Satin - A New(?) Speech Codec developed by Microsoft for Teams (Read 3460 times)

Satin - A New(?) Speech Codec developed by Microsoft for Teams

Just found this blog post: https://techcommunity.microsoft.com/t5/microsoft-teams-blog/get-the-most-from-your-meetings-and-calls-with-microsoft-teams/ba-p/1911016

Attached are the two demonstration files linked to in that post (Silk.wav and Satin.wav), with the Silk version upsampled to 32 kHz and properly delay matched with the Satin version, for more reliable comparison.
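For anyone who wants to reproduce that kind of preparation, here is a minimal sketch of the two steps (upsampling, then delay matching by cross-correlation). It is not the procedure actually used for the attached files: the function names, the integer upsampling factor, and the linear interpolation (a crude stand-in for a proper low-pass resampler) are all assumptions made for illustration.

```python
import math


def upsample_linear(samples, factor):
    """Upsample by an integer factor with linear interpolation.

    Assumption: a real comparison would use a proper low-pass resampler;
    linear interpolation only keeps this sketch dependency-free.
    """
    out = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out


def estimate_delay(reference, delayed, max_lag):
    """Return the lag in 0..max_lag (in samples) that maximizes the
    cross-correlation between `reference` and `delayed`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(delayed) - lag)
        score = sum(reference[i] * delayed[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(delayed, lag):
    """Drop the first `lag` samples so both signals start together."""
    return delayed[lag:]


# Synthetic demo: a decaying tone and a copy delayed by 5 samples.
reference = [math.sin(0.3 * i) * math.exp(-i / 50.0) for i in range(200)]
delayed = [0.0] * 5 + reference
lag = estimate_delay(reference, delayed, max_lag=20)
aligned = align(delayed, lag)
print("estimated delay in samples:", lag)
```

On real codec output the two files are not sample-identical, so the correlation peak is less sharp, but the same search applies once both files share one sampling rate.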

Judging from the file name and sampling rate, Microsoft Teams previously used Silk (the speech coding core of Opus) in a Narrowband configuration (audio only up to 4 kHz), at least at low bit-rates. The new codec seems to achieve Super-wideband coding (the audio range up to 8 kHz is waveform coded, with some simple SBR-like audio bandwidth extension from Wideband 8 kHz to Super-wideband 16 kHz).
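The bandwidth figures above follow from the Nyquist limit: a file sampled at a given rate can carry audio only up to half that rate. A small sketch of the mapping, assuming the conventional speech-coding labels for the common sampling rates (the table and function names are illustrative, not taken from any codec specification):

```python
# Conventional labels for common speech-coding sampling rates (assumed table).
BANDS = [
    ("Narrowband", 8000),
    ("Wideband", 16000),
    ("Super-wideband", 32000),
    ("Fullband", 48000),
]


def max_audio_bandwidth_hz(sample_rate_hz):
    """Nyquist limit: the highest frequency representable at this rate."""
    return sample_rate_hz / 2


def band_name(sample_rate_hz):
    """Return the label of the smallest band whose rate covers the input."""
    for name, rate in BANDS:
        if sample_rate_hz <= rate:
            return name
    return "Fullband"


for name, rate in BANDS:
    print(f"{name}: {rate} Hz sampling, audio up to {max_audio_bandwidth_hz(rate):.0f} Hz")
```

So a 32 kHz file such as the Satin demo can carry up to 16 kHz of audio, which matches the Super-wideband reading above.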

No clue which bit-rate this demo was made at, but judging, again, from the Silk.wav, it was likely quite low. Does anybody know more about this new codec?

The packet loss concealment demo in the above blog post is also quite convincing.

Chris
If I don't reply to your reply, it means I agree with you.


Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #2
My first reaction was "oh, just when the world desperately needed another WMA!", but maybe there is more to it?

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #3
Considering the high degree of similarity between WMA and LC/HE-AAC, and between WMV/VC-1 and MPEG-4 Part 2 ASP/H.263(+), there is a good chance that the new MSFT codec (Satin) is EVS or based on it.
EVS provides WB at 7.2 kbps, and now MSFT talks about WB at 7 kbps. Too much of a coincidence (?)

P.S.: As a user, I can say that MSFT Teams has very good VoIP audio quality and overall experience :)

 

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #4
The name "satin" makes me think it's somehow related to the older "silk" codec that was part of Skype (and IIRC is used for speech in Opus or something like that)

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #5
It's definitely related to Silk. Microsoft acquired Skype in 2011 and I assume that some of the developers of Silk (for Skype back then) improved upon that codec, and kept the naming scheme. After all, it's much easier and cheaper to gradually improve a codec than to develop a completely new one.

Chris
If I don't reply to your reply, it means I agree with you.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #6
Oops, then my assumption about Satin being based on EVS was wrong.

Today I looked at the FR of Satin_32kHz.wav, and it doesn't resemble EVS's bandwidth extension. It looks more like Opus/CELT's band folding or something else.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #7
Considering the high degree of similarity between WMA and LC/HE-AAC

WMA is a stripped-down MDCT codec, so it is superficially similar to almost all modern codecs. I don't think AAC specifically was a huge inspiration; AC3 probably was, given the time frame and how similar the core codecs are. The option to let the encoder use tons of different transform lengths all in the same file seems like an (over?)reaction to how MP3 picked poor transform sizes and then was stuck with them. So maybe AC3 and MP3 were the inspirations.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #8
Yes, you probably mean the older versions of WMA.
However, the latest WMA 10 Pro has SBR-like BWE, and its efficiency is on par with HE/LC-AAC.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #9
^^^^ You meant WMA Pro. It's a completely different codec than WMA.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #10
ok, got it.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #11
Satin: Microsoft’s latest AI-powered audio codec for real-time communications

New blog post by Microsoft with more details and audio samples. Impressive audio quality at 6 kbps.

Unless I missed something, it doesn't mention whether this codec will be royalty-free or proprietary.

Quote
Satin is already being used for all Teams and Skype two-party calls and will roll out for Teams meetings soon. It currently operates in wideband voice mode within a bitrate range of 6 – 36 kbps and will be extended to support full-band stereo music at a maximum sampling rate of 48 kHz in the near future. We are very excited for you to try this new codec and let us know what you think.

I wonder about the higher bitrate performance and how it will compare to Opus.

Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #12
@Spyros, thank you for the link.

It's actually good news that MSFT Satin won't be just an AI-based speech codec but will also support fullband music. :)

It's clear now that the next generation of audio codecs will be AI-based.


Re: Satin - A New(?) Speech Codec developed by Microsoft for Teams

Reply #14
Hello,

Just saw this on Phoronix:
Google AI Blog: Lyra: A New Very Low-Bitrate Codec for Speech Compression

Very impressive. Let's hope one of these (Satin or Lyra) becomes an open standard. I wonder how they compare with Codec2.

Paper linked at the blog post: Generative Speech Coding with Predictive Variance Regularization

With these latest developments, it certainly feels like we are at the endgame for lossy audio codecs, and that after 2022-2025 the improvements will be extremely small, if any.

 