Difficult samples for 1.2 beta at ~48 kbps

Difficult samples for 1.2 beta at ~48 kbps

2017-05-31 13:40:40

Found a dozen short samples that sound unpleasant with the new beta (NetRanger build from May 27) in the 32-50 kbps range. There are some easily audible clicks and noises.

P.S. When I merge all the samples into one file, all the artifacts seem to be gone (even on the Casio MT-600 Blues Harmonica C4 part), except for the first second of the file.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #1 – 2017-06-02 07:20:40

Yeah, it is really a very noticeable problem, especially in some of the samples, like Alesis-Fusion-Viola-C5.1.2b.48k.opus

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #2 – 2017-06-02 19:04:08

Just confirming this weird behaviour.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #3 – 2017-06-02 19:43:57

Quote from: VEG on 2017-06-02 07:20:40

Yeah, it is really a very noticeable problem, especially in some of the samples, like Alesis-Fusion-Viola-C5.1.2b.48k.opus

If you're referring to the first 0.5 seconds of that file, then it's related to the speech/music detector, which took a small amount of time to realize that the sample was music and not speech (so the first 0.44 seconds are encoded as speech). The good news is that it only happens at low bitrates and that for longer files it would only affect the beginning. That being said, Opus has a way to use up to two seconds of "look-ahead" to avoid this kind of issue. It's just a matter of getting the encoder to use it. I recently wrote a library (libopusenc) that actually makes it easy to encode Opus files and take advantage of that feature.
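To illustrate the look-ahead idea from the post above, here is a toy Python sketch. It is not libopus or libopusenc code: the linear confidence ramp, the 0.5 decision threshold, the 20 ms frame size, and the function names are all assumptions made for illustration. Only the 0.44-second delay and the two-second look-ahead limit come from the post.

```python
FRAME_MS = 20             # assumed frame size for this toy model
DETECT_DELAY_MS = 440     # from the post: the first 0.44 s were encoded as speech
MAX_LOOKAHEAD_MS = 2000   # from the post: up to two seconds of look-ahead


def music_confidence(audio_seen_ms):
    """Hypothetical detector output after hearing `audio_seen_ms` of music.

    Ramps linearly from 0 to 1 and crosses 0.5 at DETECT_DELAY_MS.
    """
    return min(1.0, audio_seen_ms / (2 * DETECT_DELAY_MS))


def frame_modes(n_frames, lookahead_ms=0):
    """Return the mode chosen for each frame of an all-music file.

    With look-ahead, the detector has already analysed `lookahead_ms`
    of audio beyond the frame that is currently being encoded.
    """
    lookahead_ms = min(lookahead_ms, MAX_LOOKAHEAD_MS)
    modes = []
    for i in range(n_frames):
        audio_seen_ms = i * FRAME_MS + lookahead_ms
        is_music = music_confidence(audio_seen_ms) >= 0.5
        modes.append("music" if is_music else "speech")
    return modes


if __name__ == "__main__":
    print(frame_modes(50).count("speech"))                     # no look-ahead
    print(frame_modes(50, lookahead_ms=2000).count("speech"))  # full look-ahead
```

In this toy model, a 50-frame file gets 22 frames (0.44 s) marked as speech without look-ahead and none with the full two seconds, which is the effect the post describes for the start of a file.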

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #4 – 2017-06-03 11:29:34

I thought that changing between these modes is transparent...

Quote from: jmvalin on 2017-06-02 19:43:57

That being said, Opus has a way to use up to two seconds of "look-ahead" to avoid this kind of issue.

Is it possible to enable this mode in opusenc.exe? Or maybe it would be better to add a switch (--music) that tells the encoder that the input is definitely music and not speech. IMHO, it would be even better to assume music by default, with a --speech setting to change this. As far as I remember, old versions of opusenc had such switches.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #5 – 2017-06-03 16:46:49

Quote from: VEG on 2017-06-03 11:29:34

I thought that changing between these modes is transparent...

Usually it is, but sometimes it's not perfect, especially at low bitrates. I'm currently working on a patch that reduces artefacts on transitions by boosting the bitrate a bit.

Quote

Quote from: jmvalin on 2017-06-02 19:43:57
That being said, Opus has a way to use up to two seconds of "look-ahead" to avoid this kind of issue.
Is it possible to enable this mode in opusenc.exe?

It's on the TODO list. As I said, I have it implemented in libopusenc, but we need to switch opusenc over to libopusenc.

Quote

Or maybe it would be better to add a switch (--music) that tells the encoder that the input is definitely music and not speech.

Also on the TODO list for opusenc.

Quote

IMHO, it would be even better to assume music by default, with a --speech setting to change this.

Somehow everyone thinks what they're doing should be the default ;-) There are also a lot of people using Opus for podcasts and that sort of thing. So the default will remain "auto-detect". And it's generally doing a decent job. It's only the very beginning of the files that had an issue -- and only at low bitrates.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #6 – 2017-06-03 18:39:43

I could easily see Opus being used in the near future in video games and apps for their audio files, especially the ones where the developers opted for Vorbis (like Pokémon Go and Mobius Final Fantasy). Resolving this issue might prove integral. I would never have noticed but for this topic.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #7 – 2017-06-03 20:14:41

Quote from: Klimis on 2017-06-03 18:39:43

I could easily see Opus being used in the near future in video games and apps for their audio files, especially the ones where the developers opted for Vorbis (like Pokémon Go and Mobius Final Fantasy). Resolving this issue might prove integral. I would never have noticed but for this topic.

"Video games" developers will likely have no problem tweaking the encoder the way they need...

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #8 – 2017-06-03 20:42:32

I heavily doubt they would even bother. I haven't come across a single game where they did. For example, every game that uses Vorbis seems to use general settings and implementations. Even the MP3s in Square Enix games use generic LAME encoders and settings; they even leave the ID3 tags on them for everybody to see.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #9 – 2017-06-04 07:00:32

Quote from: Klimis on 2017-06-03 18:39:43

I could easily see Opus being used in the near future in video games and apps for their audio files, especially the ones where the developers opted for Vorbis (like Pokémon Go and Mobius Final Fantasy). Resolving this issue might prove integral. I would never have noticed but for this topic.

First, there *are* games already using Opus. In fact, there have been since before it was called Opus (it was CELT back then). Second, this issue can only happen at very low rates like 32-48 kb/s, whereas games tend to code at a much higher bitrate.

Re: Difficult samples for 1.2 beta at ~48 kbps

Reply #10 – 2017-06-04 08:20:44

So it turns out there was *also* a bug involved here. While everything I said earlier is true and there can be an issue when the encoder isn't sure between speech and music at the beginning of a file, it turns out that a bug was causing the encoder to be sure the signal was speech. With the fix applied (see git master), it still takes about 0.5 seconds for the encoder to be sure it's music, but in the meantime (with 50% probability), it actually chooses to use CELT, which is the right decision.
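The difference between the buggy and fixed behaviour described above can be sketched as a toy decision rule in Python. This is not the libopus code: the function name, the `sure_threshold` value, and the `buggy` flag are hypothetical, and reading "50% probability" as the detector's music estimate sitting at 0.5 is only one interpretation of the post.

```python
def choose_mode(music_prob, sure_threshold=0.9, buggy=False):
    """Pick a coding mode from a music probability in [0, 1] (toy model).

    sure_threshold is an assumed level above which the detector counts
    as "sure" the signal is music; it is not a real libopus parameter.
    """
    if music_prob >= sure_threshold:
        return "CELT"      # detector is sure the signal is music
    if buggy:
        # Models the bug: an unsure detector is treated as sure of speech.
        return "speech"
    # Fixed behaviour in this sketch: when unsure, follow the estimate,
    # so an undecided 0.5 already selects CELT.
    return "CELT" if music_prob >= 0.5 else "speech"


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.95):
        print(p, choose_mode(p, buggy=True), choose_mode(p))
```

Under these assumptions, an undecided estimate of 0.5 yields the speech mode with the bug and CELT without it, while a confident estimate yields CELT either way.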
