Topic: Personal Listening Test of 2 Opus encoders

Re: Personal Listening Test of 2 Opus encoders

Reply #25
According to the official site, 1.1.2 fixes bugs. All quality changes go to 1.2 (the exp_lbr_tune branch and others).

Opus 1.2 looks promising.  8)

Some time ago I tried the exp_lbr_tune build (see post #16), and Opus was already at least on par with (if not better than) top-quality HE-AAC encoders at 48 kbps, and much better than Vorbis.


Re: Personal Listening Test of 2 Opus encoders

Reply #26
Took some time, but the exp_lbr_tune branch has now landed in git master, and I have done some more work to "generalize" the stereo collapse work I uploaded here earlier. Here are three samples on which I'd be interested in feedback:
https://jmvalin.ca/misc_stuff/weighting0.wav
https://jmvalin.ca/misc_stuff/weighting1.wav
https://jmvalin.ca/misc_stuff/weighting2.wav
These are again 32 kb/s VBR (final rate 38 kb/s). Please let me know what you think of them.


Re: Personal Listening Test of 2 Opus encoders

Reply #28
Great. I will try to listen to them too over the next few days.

Re: Personal Listening Test of 2 Opus encoders

Reply #29
Code: [Select]
w0 w1 w2
3.3 3.2 3.0
3.1 2.9 3.3
3.3 3.3 3.2
3.0 3.2 3.1
3.6 3.8 3.8
3.3 3.1 3.0

Re: Personal Listening Test of 2 Opus encoders

Reply #30
I wouldn't take the following results too seriously:

Code: [Select]
         W0     W1     W2
         3,475  3,5    3,35
         2,15   2,125  2,125
         2,35   2,275  2,325
         2,45   2,4    2,4
         2,8    2,9    2,925
         2,775  2,7    2,65

Average  2,667  2,650  2,629
Those are the average scores per sample, because my scores change from pass to pass (weird). The differences are very small, close to nonexistent.
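
For anyone who wants to recompute the column averages from a table like the one above, here is a minimal Python sketch. It assumes the scores are whitespace-separated and use a decimal comma, as posted; the function name `column_means` is made up for illustration.

```python
def column_means(table_text):
    """Return the mean of each column of a whitespace-separated score table.

    Cells may use a decimal comma (e.g. "3,475"), as in the post above.
    """
    rows = [
        [float(cell.replace(",", ".")) for cell in line.split()]
        for line in table_text.strip().splitlines()
        if line.strip()
    ]
    n_cols = len(rows[0])
    return [sum(row[i] for row in rows) / len(rows) for i in range(n_cols)]


# Per-sample scores copied from the table above (W0, W1, W2).
scores = """
3,475 3,5 3,35
2,15 2,125 2,125
2,35 2,275 2,325
2,45 2,4 2,4
2,8 2,9 2,925
2,775 2,7 2,65
"""

means = column_means(scores)
for name, mean in zip(["W0", "W1", "W2"], means):
    print(f"{name}: {mean:.3f}")
```

The three means come out within rounding of the posted averages, and the spread between them is a few hundredths of a point.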

Re: Personal Listening Test of 2 Opus encoders

Reply #31
Kamedo2, IgorC, thanks very much for the testing. It seems like my idea didn't work. What I was trying to do was slightly "shift" the stereo masking so that the loudest channel in a band gets a little higher SMR than the weakest one. Doesn't seem like it's had much effect.

Re: Personal Listening Test of 2 Opus encoders

Reply #32
It's probably worth mentioning that the amount of noise is high at 32 kbps. It might be that any improvement from this new approach was completely masked by distortion from, I don't know, frequency leakage, high quantization noise, the effect of some tools like bandwidth extension, or a combination of all of them. Just my guess.

During this test I didn't even get to the stereo part, because the general distortion in both channels dominated the picture. I also didn't notice any meaningful difference (not sure if any at all) in the stereo image.

Re: Personal Listening Test of 2 Opus encoders

Reply #33
We should probably change the set of sound samples in the next test.
We can apply the idea of "stochastic gradient descent" to this manual optimization process.

Re: Personal Listening Test of 2 Opus encoders

Reply #34
OK, I've got a new experiment I'd like feedback on. This time I think I've managed to produce Windows builds, so it might be easier to try all kinds of samples. Here are the two binaries:
https://jmvalin.ca/misc_stuff/opus-tools-master-2af92cd.zip
https://jmvalin.ca/misc_stuff/opus-tools-exp-df24252.zip
This is mostly meant for very low bitrates again (32-48 kbps), though I'd be curious what difference it makes at higher bitrates. Let me know if you have any issues with the binaries.


Re: Personal Listening Test of 2 Opus encoders

Reply #36
Thank you, IgorC. I visualized it.


Re: Personal Listening Test of 2 Opus encoders

Reply #37
OK, I've got a new experiment I'd like feedback on.
Added to my TODO list. I'm going to test after finishing my mp3 192kbps test (Now 44% done).

Re: Personal Listening Test of 2 Opus encoders

Reply #38
It looks from IgorC's results that what I did wasn't really effective at 48 kbit/s. Digging further, I think I understand why I was seeing an improvement at 32 kbit/s but there's none at even 48 kbit/s. My change was effectively just trying to not use folding below 8 kHz, which I'm now doing explicitly. I just released 1.2-alpha, which includes that change. An example or clip where it provides an improvement (at least for me) is EnolaGay at 32 kb/s. I wish the improvement went all the way to 48 kbit/s, but unfortunately it doesn't.

Re: Personal Listening Test of 2 Opus encoders

Reply #39
~32 kbps, VBR, stereo, 48 kHz


Files: https://drive.google.com/file/d/0ByvUr-pp6BuUR3VMekNSa1NQU2c/view?usp=sharing

exp-df24252 produces considerably better results at ~32 kbps. The only issue is the speech-music detector, which interprets the Fatboy sample as speech (well, it actually is a kind of speech), so quality isn't homogeneous there.

Re: Personal Listening Test of 2 Opus encoders

Reply #40
Unbelievable  :o
Opus 1.2 alpha outperforms HE-AACv2 at ~32 kbps VBR for the first time.

Scores of blind listening test at 32 kbps VBR:
1.  Opus 1.2 alpha (exp-df24252) - 3.22
2.  CELT (music) Mode - 3.28
3. HEAACv2 (FhG Winamp, VBR 1) - 2.92



Files https://drive.google.com/file/d/0ByvUr-pp6BuUMkRlS3RGWXI1MmM/view?usp=sharing


Re: Personal Listening Test of 2 Opus encoders

Reply #41
Everybody always forgets that Citizen Kane scene was representative of undeserved applause.

Back on topic, congratulations on this, and I hope the voice/music detection can be trained to pick up the attributes of that fatboy sample without degrading anything else noticeably.

Re: Personal Listening Test of 2 Opus encoders

Reply #42
So basically, if you remove Fatboy, auto-Opus wins (but HEAACv2 is still a clear loser).
Code: [Select]
                        32 kbps
                      Opus exp-df24252  CELT mode  HEAACv2
01 Castanets                       3,2        2,9      2,6

03 eig                             3,4        3,8      2,3
04 Bachpsichord                      3        2,9      3,9
05 Enola                           2,8        2,7      3,4
06 Trumpet                         3,4        3,5      3,6
07 Applaud                         2,7        2,7      2,4
08 Velvet                          2,9        2,9      2,7
09 Linchpin                          3          3        2
10 Spill the blood                   3          3        3
11 Female speech                   4,5        4,2      4,3
12 French ad                         3        2,9      2,2
13 German speech                   4,2        3,7      3,8

Average                           3,26       3,18     3,02

Min.                               2,7        2,7        2
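
The summary rows of the table above can be rechecked with a short Python sketch. The scores are copied from the table with decimal commas written as points; the helper names `summarize` and `head_to_head` are made up for illustration and are not part of any test tool.

```python
CODECS = ("Opus exp-df24252", "CELT mode", "HEAACv2")

# Scores copied from the table above (Fatboy excluded, as in the table).
SCORES = {
    "Castanets":       (3.2, 2.9, 2.6),
    "eig":             (3.4, 3.8, 2.3),
    "Bachpsichord":    (3.0, 2.9, 3.9),
    "Enola":           (2.8, 2.7, 3.4),
    "Trumpet":         (3.4, 3.5, 3.6),
    "Applaud":         (2.7, 2.7, 2.4),
    "Velvet":          (2.9, 2.9, 2.7),
    "Linchpin":        (3.0, 3.0, 2.0),
    "Spill the blood": (3.0, 3.0, 3.0),
    "Female speech":   (4.5, 4.2, 4.3),
    "French ad":       (3.0, 2.9, 2.2),
    "German speech":   (4.2, 3.7, 3.8),
}


def summarize(scores):
    """Return {codec: (average, minimum)} over all samples."""
    summary = {}
    for i, codec in enumerate(CODECS):
        column = [row[i] for row in scores.values()]
        summary[codec] = (sum(column) / len(column), min(column))
    return summary


def head_to_head(scores, a, b):
    """Count samples where codec index a scores above, below, or equal to b."""
    wins = sum(row[a] > row[b] for row in scores.values())
    losses = sum(row[a] < row[b] for row in scores.values())
    return wins, losses, len(scores) - wins - losses


summary = summarize(SCORES)
for codec, (avg, low) in summary.items():
    print(f"{codec:18s} average {avg:.2f}  min {low:.1f}")
print("Opus vs HEAACv2 (wins, losses, ties):", head_to_head(SCORES, 0, 2))
```

The per-sample count is only a rough complement to the averages; with a single listener and 12 samples it says nothing about statistical significance.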

Re: Personal Listening Test of 2 Opus encoders

Reply #43

smok3, yes, the speech-music detector works quite well on all tested samples except this 'musical-speech' sample, Fatboy.
It would be great if folks tried some mixed samples to see how it performs.

Everybody always forgets that Citizen Kane scene was representative of undeserved applause.
I haven't seen that movie yet. Thank you for clarifying.

Re: Personal Listening Test of 2 Opus encoders

Reply #44
I need to see the movie for real. I was off base. It wasn't undeserved applause.

This topic explains it quite well.

Kane is building up his mistress-turned-second-wife, Susan, to be an opera singer. She gives a "bad" performance. He feels obligated to clap for her anyway, to build her up to be the success that he trained her to be, to support his own need to see her succeed. So, "This thing I am clapping for is terrible, but I am clapping for it anyway, because my sanity depends on it being the rousing success I hyped myself up for it to be."

Still not "generic clapping gif", but I was still off with the exact reason.

Re: Personal Listening Test of 2 Opus encoders

Reply #45
Unbelievable  :o
Opus 1.2 alpha outperforms HE-AACv2 at ~32 kbps VBR for the first time.

Scores of blind listening test at 32 kbps VBR:
1.  Opus 1.2 alpha (exp-df24252) - 3.22
2.  CELT (music) Mode - 3.28
3. HEAACv2 (FhG Winamp, VBR 1) - 2.92

Really cool! I again did not expect Opus to do so well against HE-AAC v2 at 32 kb/s. I'll see what I can do about the speech/music detection on fatboy. At the very least, we're looking at an "official" option for forcing the encoder to consider everything as speech or music.

One other thing I'm a bit curious about is the difference on the other samples (e.g. castanets, eig). If they're real and not due to the speech/music detector, then it's worth investigating since the two builds you have differ only by a tiny detail in the bit allocation.

Re: Personal Listening Test of 2 Opus encoders

Reply #46
Really cool! I again did not expect Opus to do so well against HE-AAC v2 at 32 kb/s.
The test was done on headphones, and parametric stereo is known not to perform well there. Results may differ with speakers/studio monitors (I will try later). Anyway, Opus 1.2 alpha is now at least in the same league as HE-AAC v2 at this bitrate.

 


I'll see what I can do about the speech/music detection on fatboy. At the very least, we're looking at an "official" option for forcing the encoder to consider everything as speech or music.
Great.
Here is another sample where the quality (especially stereo) is intermittent due to SILK/CELT mode changes.
https://drive.google.com/file/d/0ByvUr-pp6BuUbXF5MUMxYW5VM2s/view?usp=sharing
I've performed a more rigorous test to be sure about it. Anyway, the audible differences are pretty obvious there. Also, the CELT+SILK (hybrid) sample doesn't keep the same loudness across the whole sample (the level goes down and up), while CELT-only doesn't have this issue.
Original: -2.39 dB
CELT:     -2.31 dB
Hybrid:   -1.74 dB
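
To put the three level figures in perspective, here is a small Python sketch that computes each encode's offset from the original. The post does not say how the levels were measured, so treating them as amplitude-type levels (where dB = 20·log10 of the ratio) is an assumption, and `db_to_amplitude_ratio` is a name made up for illustration.

```python
def db_to_amplitude_ratio(delta_db):
    """Convert a level difference in dB to a linear amplitude ratio.

    Assumes an amplitude-type level, i.e. dB = 20 * log10(ratio).
    """
    return 10 ** (delta_db / 20)


# Levels as posted above.
levels_db = {"Original": -2.39, "CELT": -2.31, "Hybrid": -1.74}

offsets = {
    name: levels_db[name] - levels_db["Original"]
    for name in ("CELT", "Hybrid")
}

for name, delta in offsets.items():
    ratio = db_to_amplitude_ratio(delta)
    print(f"{name}: {delta:+.2f} dB vs. original (amplitude x{ratio:.3f})")
```

Under that assumption the CELT-only encode stays within about 0.1 dB of the original, while the hybrid encode sits roughly 0.65 dB higher.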

One other thing I'm a bit curious about is the difference on the other samples (e.g. castanets, eig). If they're real and not due to the speech/music detector, then it's worth investigating since the two builds you have differ only by a tiny detail in the bit allocation.
Don't pay attention to those. I performed an ABX test on the castanets and eig samples and saw that those differences were personal, momentary preferences. I could ABX the two lossy files but wasn't sure which one to prefer this time: better in some places, worse in others. It would be better if someone else performed a blind test on them. As far as I'm concerned, both versions (CELT and exp) did well there.
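
As a side note for anyone repeating the ABX comparison, the usual way to judge whether a result is beyond guessing is a one-sided binomial test. The Python sketch below is illustrative only: the trial counts are hypothetical (none were posted), and `abx_p_value` is a made-up helper name.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials` by pure guessing.

    Each ABX trial is treated as a fair coin flip (p = 0.5).
    """
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Hypothetical example: 12 correct answers out of 16 trials.
p = abx_p_value(12, 16)
print(f"p = {p:.3f}")
```

A low p-value only shows that the two files can be told apart; as noted above, it says nothing about which one sounds better.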

 

Re: Personal Listening Test of 2 Opus encoders

Reply #47
I just uploaded another experimental build that changes the tonality analysis to (hopefully) be more sensitive, especially when there's lots of closely-spaced harmonics. This change affects pretty much all rates and I'd be curious to have feedback on any improvements or regression it causes before I merge it in for 1.2. You can compare it to 1.2-alpha. Note that this new build also forces the encoder to consider everything as music.

Re: Personal Listening Test of 2 Opus encoders

Reply #48
I just uploaded another experimental build that changes the tonality analysis to (hopefully) be more sensitive, especially when there's lots of closely-spaced harmonics. This change affects pretty much all rates and I'd be curious to have feedback on any improvements or regression it causes before I merge it in for 1.2. You can compare it to 1.2-alpha. Note that this new build also forces the encoder to consider everything as music.
Sorry for not reacting recently. I would rather finish my remaining MP3 listening test sessions first. 24 sessions remain and I'm 56% done.

Re: Personal Listening Test of 2 Opus encoders

Reply #49
Quote
                                  1.2a (celt mode)   New tonality
32 kbps
01 Castanets                      2,8                3
04 Bachpsichord                   2                  2
05 Enola                          2,4                2,4
06 Trumpet                        3,6                3,6
10 Spill the blood                3,2                3,3
Take your fingers from my hair    2                  2

The last sample is http://listening-test.coresv.net/tracks/take_your_finger_from_my_hair.wav from the test at http://listening-test.coresv.net/results.htm

Average:                          2,67               2,72


The latest experimental build (new tonality) has a higher average bitrate at 32 kbps. It's 0.4-0.9 kbps more on my bitrate calibration set of 44 songs and slightly more on the tested samples. That's not much of a bitrate increase, but it could be the reason for the slightly higher average score.
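
For a sense of scale, the bitrate bump and the score difference can be expressed as percentages. This Python sketch assumes the baseline is exactly the nominal 32 kbps (the real average bitrate of the 1.2a encodes was not posted) and uses the averages from the quote above; `percent_change` is a made-up helper name.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


NOMINAL_KBPS = 32.0  # assumption: baseline sits at the nominal target

for extra_kbps in (0.4, 0.9):
    change = percent_change(NOMINAL_KBPS, NOMINAL_KBPS + extra_kbps)
    print(f"+{extra_kbps} kbps -> {change:.2f}% more bits")

# Average scores from the quote above: 2,67 (1.2a) vs 2,72 (new tonality).
score_change = percent_change(2.67, 2.72)
print(f"average score: {score_change:.2f}% higher")
```

Both changes are in the low single-digit percent range, so the numbers alone cannot separate a bitrate effect from a genuine quality gain.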

I noticed some very tiny improvements, only on the guitar intro of "Castanets" and on the "Spill the blood" sample.

If you ask me, I think it's a slight bitrate bump rather than a quality improvement.

P.S. I didn't notice any audible regressions.