Board News
TAK 2.3.1
Final release of TAK 2.3.1 ((T)om's lossless (A)udio (K)ompressor)
It consists of:
- TAK Applications 2.3.1
- TAK Winamp plugin 2.3.1
- TAK Decoding library 2.3.1
- TAK SDK 1.1.1
Download
Lyra: AI powered speech codec from Google
Lyra is a high-quality, very low-bitrate speech codec that makes voice communication available even on the slowest networks. To do this, we’ve applied traditional codec techniques while leveraging advances in machine learning (ML) with models trained on thousands of hours of data to create a novel method for compressing and transmitting voice signals.
Lyra Overview
The basic architecture of the Lyra codec is quite simple. Features, or distinctive speech attributes, are extracted from speech every 40ms and are then compressed for transmission. The features themselves are log mel spectrograms, a list of numbers representing the speech energy in different frequency bands, which have traditionally been used for their perceptual relevance because they are modeled after human auditory response. On the other end, a generative model uses those features to recreate the speech signal. In this sense, Lyra is very similar to other traditional parametric codecs, such as MELP.
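As a rough illustration of the feature-extraction step described above, here is a minimal NumPy sketch that cuts a signal into 40 ms frames and computes one log mel vector per frame. It is not Lyra's actual code: the 16 kHz sample rate, the 40 mel bands, the Hann window, and all function names are assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def log_mel_features(signal, sample_rate=16000, frame_ms=40, n_mels=40):
    # One feature vector per non-overlapping 40 ms frame (sample rate and
    # band count are assumed values, not taken from the Lyra announcement).
    frame_len = sample_rate * frame_ms // 1000
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    power = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, frame_len, sample_rate).T
    return np.log(mel_energy + 1e-10)  # small offset avoids log(0)

# One second of a 1 kHz test tone yields 25 frames of 40 band energies.
t = np.arange(16000) / 16000.0
features = log_mel_features(np.sin(2.0 * np.pi * 1000.0 * t))
print(features.shape)
```

In a real parametric codec these per-frame vectors would then be quantized before transmission; that step is left out here.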
However, traditional parametric codecs, which simply extract from speech the critical parameters needed to recreate the signal at the receiving end, achieve low bitrates but often sound robotic and unnatural. These shortcomings have led to the development of a new generation of high-quality audio generative models that have revolutionized the field by being able to not only differentiate between signals, but also generate completely new ones. DeepMind’s WaveNet was the first of these generative models and paved the way for many to come. Additionally, WaveNetEQ, the generative model-based packet-loss-concealment system currently used in Duo, has demonstrated how this technology can be used in real-world scenarios.
A New Approach to Compression with Lyra
Using these models as a baseline, we’ve developed a new model capable of reconstructing speech using minimal amounts of data. Lyra harnesses the power of these new natural-sounding generative models to maintain the low bitrate of parametric codecs while achieving high quality, on par with state-of-the-art waveform codecs used in most streaming and communication platforms today. The drawback of waveform codecs is that they achieve this high quality by compressing and sending over the signal sample-by-sample, which requires a higher bitrate and, in most cases, isn’t necessary to achieve natural sounding speech.
One concern with generative models is their computational complexity. Lyra avoids this issue by using a cheaper recurrent generative model, a WaveRNN variation, that works at a lower rate, but generates in parallel multiple signals in different frequency ranges that it later combines into a single output signal at the desired sample rate. This trick enables Lyra to not only run on cloud servers, but also on-device on mid-range phones in real time (with a processing latency of 90ms, which is in line with other traditional speech codecs). This generative model is then trained on thousands of hours of speech data and optimized, similarly to WaveNet, to accurately recreate the input audio.
Comparison with Existing Codecs
Since the inception of Lyra, our mission has been to provide the best quality audio using a fraction of the bitrate of alternatives. Currently, the royalty-free open-source codec Opus is the most widely used codec for WebRTC-based VoIP applications and, with audio at 32kbps, typically obtains transparent speech quality, i.e., indistinguishable from the original. However, while Opus can be used in more bandwidth-constrained environments down to 6kbps, it starts to demonstrate degraded audio quality there. Other codecs are capable of operating at bitrates comparable to Lyra (Speex, MELP, AMR), but each suffers from increased artifacts and results in a robotic-sounding voice.
Lyra is currently designed to operate at 3kbps. Listening tests show that Lyra outperforms any other codec at that bitrate and compares favorably to Opus at 8kbps, thus achieving more than a 60% reduction in bandwidth. Lyra can be used wherever the bandwidth conditions are insufficient for higher bitrates and existing low-bitrate codecs do not provide adequate quality.
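The "more than 60%" figure follows directly from the two bitrates quoted above. The small sketch below redoes that arithmetic and also works out how many bits a 3kbps stream has available per 40ms feature frame. The function names are made up for this example, and the per-frame number is plain arithmetic that ignores any packet or header overhead, not a figure taken from the announcement.

```python
def bandwidth_reduction(reference_kbps, codec_kbps):
    # Fraction of bandwidth saved relative to the reference codec.
    return 1.0 - codec_kbps / reference_kbps

def bits_per_frame(bitrate_kbps, frame_ms):
    # kbit/s multiplied by ms gives bits (the factors of 1000 cancel).
    return bitrate_kbps * frame_ms

reduction = bandwidth_reduction(reference_kbps=8, codec_kbps=3)
frame_bits = bits_per_frame(bitrate_kbps=3, frame_ms=40)

print(f"Reduction vs. Opus at 8kbps: {reduction:.1%}")  # 62.5%
print(f"Bits available per 40ms frame: {frame_bits}")   # 120
```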
Source
It looks like this is no longer an audio "codec" - it's basically an AI that recognizes speech and resynthesizes it, which is simply amazing. Perhaps future video codecs will work similarly. NVIDIA has already created a working AI-powered video codec for video conferences that requires a much lower bitrate than standard codecs.
Exact Audio Copy v1.6
Exact Audio Copy v1.6 released
Homepage
http://www.exactaudiocopy.de/
Download
http://www.exactaudiocopy.de/en/index.php/resources/download/
Changelog:
- Standard setup now using gnudb.org instead of freedb.org (for built-in engine)
- Fixed several small problems with the secondary encoder
- Fixed problems with the Musicbrainz plugin
- Several smaller bugs removed
No new features.
The "Replace spaces by underscores" issue (Filename and Additional Filename tab), https://hydrogenaud.io/index.php?msg=980398, was one of the bugs addressed.
News Submissions
Submit news for validation. Validated news will appear in the "Validated News"-section and on the front page.
Site Related Discussion
Hydrogenaudio.org site discussion. Feedback, suggestions, problems etc. related to the site and forums.
Scientific Discussion
Discussion of psychoacoustic phenomena and models, coding architectures and algorithms, and other general DSP related subjects.
Uploads
This is the forum for regular members to upload files for use by others. Hydrogenaudio.org takes no responsibility for the content that may be present here, but states that any misuse of this forum, as deemed by the staff, may result in revocation of the offending user's account. Acceptable content includes freely and legally distributable data of the following types: audio programs, audio samples (clips under 30 seconds), misc. audio-related data, or other utilities which are immediately relevant to the Hydrogenaudio.org community.
Sub-boards:
- AAC - General
- AAC - Tech
Sub-boards:
- MP3 - General
- MP3 - Tech
Sub-boards:
- Ogg Vorbis - General
- Ogg Vorbis - Tech
Other Lossy Codecs
Discussion of other lossy audio codecs like AC3, ADPCM, Atrac, Dolby Pro logic/II, DTS, MP1, MP2, Real Audio, VQF, Wavpack lossy, WMA etc.
Speech Codecs
Discussion of speech codecs like Speex, GSM-FR, GSM-EFR, iLBC, G.723.1, G.728, G.729, AMR-NB, AMR-WB, VSELP, ACELP.xxx etc.
MPC
Discussion of MPC (Musepack) audio compression. The official forum is at http://forum.musepack.net/
Lossless / Other Codecs
General discussion of lossless audio compression and other lossless codecs like ALAC, Monkey's Audio, WMA Lossless, OptimFrog, LA, LPAC, Shorten, TAK etc.
General A/V
Discussion of general A/V topics such as DivX/XviD, AVC (H.264), DVD ripping, VirtualDub, container formats, streaming, and so on.
Moderator: smok3
CD Hardware/Software
Discussion of CD-ROM/-R/-RW/DVD-hardware, copying, ripping and burning of CD media, EAC, CDex, Plextools etc.
Moderator: Pio2001
Audio Hardware
Discussion of Audio Hardware, Soundcards, Hi-Fi equipment, stand-alone CD players, portable MP3 players, headphones etc.
Moderator: Pio2001
foobar2000
Official foobar2000 forum. Discussion about Peter Pawlowski's advanced and compact audio player for Microsoft Windows called foobar2000.
Native MP3, Ogg Vorbis, MPC, FLAC, Ogg FLAC, WAV, MOD support.
Recycle Bin
The trashcan of HydrogenAudio. These posts represent the kind of messages we wouldn't like to see any more. These include trolling, offensive posts, zealotry, spam and other useless and redundant messages.