Board News
TAK 2.3.1
Final release of TAK 2.3.1 ((T)om's lossless (A)udio (K)ompressor)
It consists of:
- TAK Applications 2.3.1
- TAK Winamp plugin 2.3.1
- TAK Decoding library 2.3.1
- TAK SDK 1.1.1
Download
Lyra: AI powered speech codec from Google
Lyra is a high-quality, very low-bitrate speech codec that makes voice communication available even on the slowest networks. To do this, we’ve applied traditional codec techniques while leveraging advances in machine learning (ML) with models trained on thousands of hours of data to create a novel method for compressing and transmitting voice signals.
Lyra Overview
The basic architecture of the Lyra codec is quite simple. Features, or distinctive speech attributes, are extracted from speech every 40ms and are then compressed for transmission. The features themselves are log mel spectrograms, a list of numbers representing the speech energy in different frequency bands, which have traditionally been used for their perceptual relevance because they are modeled after human auditory response. On the other end, a generative model uses those features to recreate the speech signal. In this sense, Lyra is very similar to other traditional parametric codecs, such as MELP.
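As a rough illustration of the feature-extraction step described above, here is a minimal sketch in plain Python. It is not Lyra's actual code: the 16 kHz sample rate, the band count of 8, and all function names are assumptions made for the example, and it sums power in rectangular mel-spaced bands rather than using the triangular filters a real mel filterbank would normally apply. The naive DFT keeps the example dependency-free; a real implementation would use an FFT.

```python
import math

SAMPLE_RATE = 16000  # assumption: the post does not state a sample rate
FRAME_MS = 40        # from the post: features are extracted every 40 ms
NUM_BANDS = 8        # hypothetical band count, kept small for readability


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive O(N^2) DFT power spectrum for bins 0..N/2."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        re = im = 0.0
        for i, x in enumerate(frame):
            angle = 2.0 * math.pi * k * i / n
            re += x * math.cos(angle)
            im -= x * math.sin(angle)
        powers.append((re * re + im * im) / n)
    return powers


def log_mel_features(frame, sample_rate=SAMPLE_RATE, num_bands=NUM_BANDS):
    """Log energy in mel-spaced bands for one frame (rectangular bands)."""
    n = len(frame)
    top_mel = hz_to_mel(sample_rate / 2.0)
    # Band edges are equally spaced on the mel scale, then mapped back to Hz.
    edges = [mel_to_hz(top_mel * b / num_bands) for b in range(num_bands + 1)]
    energies = [0.0] * num_bands
    band = 0
    for k, p in enumerate(power_spectrum(frame)):
        freq = k * sample_rate / n
        # Bin frequencies only increase, so the band index only moves forward.
        while band < num_bands - 1 and freq >= edges[band + 1]:
            band += 1
        energies[band] += p
    # Small offset avoids log(0) for silent bands.
    return [math.log(e + 1e-10) for e in energies]


def extract_features(samples, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Split samples into non-overlapping frames and featurize each one."""
    frame_len = sample_rate * frame_ms // 1000
    return [
        log_mel_features(samples[start:start + frame_len], sample_rate)
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


# Usage: 80 ms of a 440 Hz tone gives two 40 ms frames of NUM_BANDS values.
tone = [math.sin(2.0 * math.pi * 440.0 * i / SAMPLE_RATE) for i in range(1280)]
features = extract_features(tone)
print(len(features), len(features[0]))
```

Each 40 ms frame is reduced to a handful of numbers, which is what makes the very low bitrate possible; the decoder then has to regenerate a plausible waveform from them.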
Traditional parametric codecs, which simply extract critical parameters from speech that can then be used to recreate the signal at the receiving end, achieve low bitrates but often sound robotic and unnatural. These shortcomings have led to the development of a new generation of high-quality audio generative models that have revolutionized the field by being able to not only differentiate between signals, but also generate completely new ones. DeepMind’s WaveNet was the first of these generative models and paved the way for many to come. Additionally, WaveNetEQ, the generative model-based packet-loss-concealment system currently used in Duo, has demonstrated how this technology can be used in real-world scenarios.
A New Approach to Compression with Lyra
Using these models as a baseline, we’ve developed a new model capable of reconstructing speech using minimal amounts of data. Lyra harnesses the power of these new natural-sounding generative models to maintain the low bitrate of parametric codecs while achieving high quality, on par with state-of-the-art waveform codecs used in most streaming and communication platforms today. The drawback of waveform codecs is that they achieve this high quality by compressing and sending over the signal sample-by-sample, which requires a higher bitrate and, in most cases, isn’t necessary to achieve natural sounding speech.
One concern with generative models is their computational complexity. Lyra avoids this issue by using a cheaper recurrent generative model, a WaveRNN variation, that works at a lower rate, but generates in parallel multiple signals in different frequency ranges that it later combines into a single output signal at the desired sample rate. This trick enables Lyra to not only run on cloud servers, but also on-device on mid-range phones in real time (with a processing latency of 90ms, which is in line with other traditional speech codecs). This generative model is then trained on thousands of hours of speech data and optimized, similarly to WaveNet, to accurately recreate the input audio.
Comparison with Existing Codecs
Since the inception of Lyra, our mission has been to provide the best quality audio using a fraction of the bitrate of alternatives. Currently, the royalty-free open-source codec Opus is the most widely used codec for WebRTC-based VoIP applications and, with audio at 32kbps, typically obtains transparent speech quality, i.e., indistinguishable from the original. However, while Opus can be used in more bandwidth-constrained environments down to 6kbps, its audio quality starts to degrade at those bitrates. Other codecs are capable of operating at bitrates comparable to Lyra's (Speex, MELP, AMR), but each suffers from increased artifacts and results in a robotic-sounding voice.
Lyra is currently designed to operate at 3kbps, and listening tests show that Lyra outperforms any other codec at that bitrate and compares favorably to Opus at 8kbps, thus achieving more than a 60% reduction in bandwidth. Lyra can be used wherever the bandwidth conditions are insufficient for higher bitrates and existing low-bitrate codecs do not provide adequate quality.
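The "more than 60%" figure follows directly from the two bitrates quoted above, and the same arithmetic shows how small the per-frame bit budget is. The sketch below is only a back-of-the-envelope check: the function names are made up for illustration, 1 kbit is taken as 1000 bits, and the per-frame figure assumes the whole 3kbps is spent on 40ms feature frames with no header or transport overhead, which the post does not state.

```python
def bandwidth_reduction(codec_kbps, reference_kbps):
    """Fraction of bandwidth saved relative to the reference bitrate."""
    return 1.0 - codec_kbps / reference_kbps


def bits_per_frame(codec_kbps, frame_ms):
    """Bit budget for one frame: (kbit/s) * (ms) = bits, with 1 kbit = 1000 bits.

    Assumes every bit goes to frame payload (no headers or overhead).
    """
    return codec_kbps * frame_ms


# Lyra at 3kbps compared against Opus at 8kbps, as quoted in the post.
print(f"{bandwidth_reduction(3, 8):.1%}")  # 62.5%
# Upper bound on the bits available to describe each 40ms of speech.
print(bits_per_frame(3, 40))               # 120
```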
Source
It looks like this is no longer an audio "codec" in the usual sense: it's basically an AI that recognizes speech and synthesizes it, which is simply amazing. Perhaps future video codecs will work similarly. NVIDIA has already created a working AI-powered video codec for video conferences which requires a much lower bitrate than standard codecs.
Exact Audio Copy v1.6
Exact Audio Copy v1.6 released
Homepage
http://www.exactaudiocopy.de/
Download
http://www.exactaudiocopy.de/en/index.php/resources/download/
Changelog:
- Standard setup now using gnudb.org instead of freedb.org (for built-in engine)
- Fixed several small problems with the secondary encoder
- Fixed problems with the Musicbrainz plugin
- Several smaller bugs removed
No new features.
The issue "Replace spaces by underscores" (Filename and Additional Filename tab), https://hydrogenaud.io/index.php?msg=980398, was one of the bugs addressed.
News Submissions
Submit news for validation. Validated news will appear in the "Validated News"-section and on the front page.
Site Related Discussion
Hydrogenaudio.org site discussion. Feedback, suggestions, problems etc. related to the site and forums.
Scientific Discussion
Discussion of psychoacoustic phenomena and models, coding architectures and algorithms, and other general DSP related subjects.
Uploads
This is the forum for regular members to upload files for use by others. Hydrogenaudio.org takes no responsibility for the content that may be present here, but states that any misuse of this forum, as deemed by the staff, may result in revocation of the offending user's account. Acceptable content includes freely and legally distributable data of the following types: audio programs, audio samples (clips under 30 seconds), miscellaneous audio-related data, or other utilities which are immediately relevant to the Hydrogenaudio.org community.
-
Sub-boards:
- AAC - General
- AAC - Tech
-
Sub-boards:
- MP3 - General
- MP3 - Tech
-
Sub-boards:
- Ogg Vorbis - General
- Ogg Vorbis - Tech
Other Lossy Codecs
Discussion of other lossy audio codecs like AC3, ADPCM, ATRAC, Dolby Pro Logic/II, DTS, MP1, MP2, Real Audio, VQF, WavPack lossy, WMA etc.
Speech Codecs
Discussion of speech codecs like Speex, GSM-FR, GSM-EFR, iLBC, G.723.1, G.728, G.729, AMR-NB, AMR-WB, VSELP, ACELP.xxx etc.
MPC
Discussion of MPC (Musepack) audio compression. The official forum is at http://forum.musepack.net/
Lossless / Other Codecs
General discussion of lossless audio compression and other lossless codecs like ALAC, Monkey's Audio, WMA Lossless, OptimFROG, LA, LPAC, Shorten, TAK etc.
General A/V
Discussion of general A/V topics such as DivX/XviD, AVC (H.264), DVD ripping, VirtualDub, container formats, streaming, and so on.
Moderator: smok3
CD Hardware/Software
Discussion of CD-ROM/-R/-RW/DVD-hardware, copying, ripping and burning of CD media, EAC, CDex, Plextools etc.
Moderator: Pio2001
Audio Hardware
Discussion of Audio Hardware, Soundcards, Hi-Fi equipment, stand-alone CD players, portable MP3 players, headphones etc.
Moderator: Pio2001
foobar2000
Official foobar2000 forum. Discussion about Peter Pawlowski's advanced and compact audio player for Microsoft Windows called foobar2000.
Native support for MP3, Ogg Vorbis, MPC, FLAC, Ogg FLAC, WAV and MOD.
Recycle Bin
The trashcan of HydrogenAudio. These posts represent the kind of messages we wouldn't like to see any more. These include trolling, offensive posts, zealotry, spam and other useless and redundant messages.