
New Listening Test

Reply #50
Quote
I would also like us to try the complete MUSHRA methodology with an appropriate ranking scale.


But is there a good tool for MUSHRA testing? (And by good, I don't mean your buggy comparator.)

Besides, it should be a tool that outputs test logs in a format recognizable by Chunky, otherwise it would be a major pain for Sebastian to work on the results.


Reply #51
A 5.1 test would be cool, especially now. Doing a test at 96 kbps would be great. CT now has an excellent encoder; 5.1 sounds great at all bitrates and is good for 1-CD rips, compared to Nero 6/7 and the latest beta, which is terrible.

Encoders in the test

Aud-X 5.1 (MP3 5.1)
Nero 7 (latest beta)
CT (from Winamp 5.21)
Ogg Vorbis


Reply #52
Well...read the previous messages and you'll see that a blind test for multichannel is not planned for the moment.


In addition, Vorbis multichannel is not well tuned.


Reply #53
Quote
I would also like us to try the complete MUSHRA methodology with an appropriate ranking scale.


But is there a good tool for MUSHRA testing? (And by good, I don't mean your buggy comparator.)

Besides, it should be a tool that outputs test logs in a format recognizable by Chunky, otherwise it would be a major pain for Sebastian to work on the results.


Hell no, not the comparator (where did you dig THAT up from?)

Maybe the developers of ABC/HR and Chunky would be interested in extending their tools to support the ITU-R BS.1534 (MUSHRA) arrangement.

As far as I can see, three changes need to be made:

- Scoring from 0 to 100, instead of 1 to 5

- Rearranging the panel so there are just the test samples and a hidden reference, not a hidden reference paired with each sample (useless IMO at 48 kbps, as the differences are clearly big enough that there is no need to blindly identify the original against each sample - it just demands too much effort from the testers for no particular reason)

- Modifying Chunky to accept the new values (is that needed?)

Maybe there is something else, too? I would personally like to see this change for low bitrates, as it would make the tests more similar to the ones done within the EBU etc., and compliant with the latest ITU-R recommendations.
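
To make the first and third points a bit more concrete, here is a minimal Python sketch. The five quality bands on the 0 to 100 scale follow ITU-R BS.1534; everything else (the function names, which band a boundary value falls into, and the linear mapping back to a 1 to 5 range for a tool that has not been updated) is an assumption made for illustration, not something ABC/HR or Chunky is known to do.

```python
# Hypothetical helpers: names, boundary handling and the linear mapping
# are assumptions for illustration only.

# Lower bound of each quality band on the 0-100 MUSHRA scale.
MUSHRA_BANDS = [
    (80, "Excellent"),
    (60, "Good"),
    (40, "Fair"),
    (20, "Poor"),
    (0, "Bad"),
]


def mushra_band(score: float) -> str:
    """Return the quality label for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("MUSHRA score must be between 0 and 100")
    for lower_bound, label in MUSHRA_BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: 0 is the lowest band")


def mushra_to_five_point(score: float) -> float:
    """Linearly squeeze a 0-100 score into the 1.0-5.0 range."""
    if not 0 <= score <= 100:
        raise ValueError("MUSHRA score must be between 0 and 100")
    return 1.0 + score / 25.0


print(mushra_band(50), mushra_to_five_point(50))
```

Whether such a rescaling is statistically meaningful is a separate question, since the 1 to 5 scale and the 0 to 100 scale are not defined the same way; it only shows that the range itself is easy to convert.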


Reply #54
Quote
Well... read the previous messages and you'll see that a blind test for multichannel is not planned for the moment.
Yes, I know, it is to wait until Nero releases its new encoder. But why wasn't the 48 kbps test done with the old Nero encoder instead of the new one optimized for low bitrates, given that the test used an old CT encoder?


Reply #55
The new Nero encoder will be released long before any 96 kbps multichannel test could be arranged, so I don't think this is an issue or the reason why there is no 5.1 test.

Sebastian clearly pointed out the reasons why a 5.1 multichannel test (regardless of the bitrate/contenders) will not be held at this point - please check them out.

Quote
Yes, I know, it is to wait until Nero releases its new encoder. But why wasn't the 48 kbps test done with the old Nero encoder instead of the new one optimized for low bitrates, given that the test used an old CT encoder?


AFAIK there were no significant improvements at 48 kbps between the last and current CT encoders in subsequent Winamp builds (and the build revisions were quite minor anyway). I did tests with at least 20 samples and did not find any difference - they are probably focusing on something else. On the other hand, the Nero encoder underwent a major update (a complete code rewrite).


Reply #56
Quote
I think there were no significant improvements at 48 kbps between the last and current CT encoders in subsequent Winamp builds
I'm not talking about CT improvements; I'm talking about the latest Nero beta, which was in the test. Those binaries have big optimizations for low bitrates, especially for 48 kbps. I did some tests with those binaries and the latest official Nero 7 binaries, and the quality is much better, so next time a test is done, use the latest encoders from both CT and Nero.

Quote
Sebastian clearly pointed out the reasons why a 5.1 multichannel test (regardless of the bitrate/contenders) will not be held at this point - please check them out.
Yes, that's true, but the test must be done in CBR, not ABR/VBR, because Aud-X and CT don't have ABR/VBR modes.


Reply #57
I'm voting for a 48 kb/s test.  The codecs that ought to be included:
Vorbis (aoTuV pb5), HE-AAC (Nero), LAME latest, and some variant of WMA (but I'm not very educated about WMA, so I won't suggest a version).

Those should be the important ones.  48kb/s should be a good streaming bitrate, and besides, I'm considering it for portable versions of my lossless material, as to me, HE-AAC does sound good at that bitrate.



Reply #59
Quote
I'm not talking about CT improvements; I'm talking about the latest Nero beta, which was in the test. Those binaries have big optimizations for low bitrates, especially for 48 kbps. I did some tests with those binaries and the latest official Nero 7 binaries, and the quality is much better, so next time a test is done, use the latest encoders from both CT and Nero.


I think that was a general conclusion - Gabriel set some requirements for the tests, and I think they should usually be followed (test the latest encoders, if the developers recommend them and they become available within the next 3 months) unless there is some extraordinary demand to test something that could not be released in time.

So, I think this is quite safe.

Quote
Yes, that's true, but the test must be done in CBR, not ABR/VBR, because Aud-X and CT don't have ABR/VBR modes.


HA-wise, I do not think that makes much sense. The goal of the tests is to find the best possible solution for encoding at a certain bit rate. The fact that some of the competitors do not have higher-quality modes is simply not relevant to the test itself - doing otherwise would mean intentionally crippling the quality of the other encoders in order to make a "fair" test.

Fair to whom?

- To participants of the test with fewer options? Maybe
- To users wanting to see what they could get at a certain bit rate (within limits)? I don't think so

Besides, many other encoders (e.g. Vorbis, WMA) probably do not have CBR modes similar to MP3/AAC - so I don't see any particular reason to cripple the AAC/MP3 encoders.

The only valid reason to test AAC/MP3 CBR would be fixed-rate streaming (e.g. ISDN, digital satellite / microwave, etc.) over synchronous channels - that is the only place where the deterministic buffer of AAC/MP3 CBR really makes sense. However, I don't think the HA community would be very interested in this kind of test - but I could easily be mistaken.


Reply #60
Quote
I think that was a general conclusion - Gabriel set some requirements for the tests, and I think they should usually be followed (test the latest encoders, if the developers recommend them and they become available within the next 3 months) unless there is some extraordinary demand to test something that could not be released in time.

So, I think this is quite safe.
You're probably right, but when the test started, Winamp 5.2 had come out with big improvements; I think the test used the encoder from 5.111, which is a very old CT encoder.

Quote
The goal of the tests is to find the best possible solution for encoding at a certain bit rate.
That is true, but you can't say Nero HE-AAC is the best encoder, because it isn't; you also can't say that for CT, but CT works correctly. I see that you did some tests on the differences between ABR/CBR modes, but in practice that does not hold: I looked at all the samples from the test and most have an average bitrate of 64 kbps or more. Also, do you count Nero's big oversize as a good thing?


Reply #61
Quote
I think that was a general conclusion - Gabriel set some requirements for the tests, and I think they should usually be followed (test the latest encoders, if the developers recommend them and they become available within the next 3 months) unless there is some extraordinary demand to test something that could not be released in time.

So, I think this is quite safe.
You're probably right, but when the test started, Winamp 5.2 had come out with big improvements; I think the test used the encoder from 5.111, which is a very old CT encoder.


First of all, the encoder used in the latest test was from Winamp 5.2 Beta 393

http://www.mp3-tech.org/content/?48kbps%20...20public%20test  (see the encoder IDs)

Do you have any proof of audible differences between Winamp 5.2 Beta 393 and 5.2 retail when it comes to 48 kbps AAC encoding? I don't - I actually couldn't find any at 48 kbps.

Quote
Quote
The goal of the tests is to find the best possible solution for encoding at a certain bit rate.
That is true, but you can't say Nero HE-AAC is the best encoder, because it isn't; you also can't say that for CT, but CT works correctly. I see that you did some tests on the differences between ABR/CBR modes, but in practice that does not hold: I looked at all the samples from the test and most have an average bitrate of 64 kbps or more. Also, do you count Nero's big oversize as a good thing?


Nero AAC achieved the highest average score among all codecs in the test - that is a fact drawn from the statistical analysis of the test results.

No, it wasn't statistically better than CT's encoder, but it did score better on average, and it was statistically better than CT on one of the tested samples, sample1. I will not draw any conclusions from that; I leave it to the people interpreting the test results to judge which is better.

Second, I would appreciate it a lot if you could point out which samples had an average bitrate of 64 kbps or more. And which Nero "big oversize" are you talking about?

http://www.mp3-tech.org/tests/aac_48/results.html

The average bit rate of Nero AAC was 47.4 kbps - well within the usual 10% limit for tests conducted here. It was within 10% of the target for 17 of 18 samples, and within 15% on the one remaining sample (where it was 54 kbps).
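
For anyone who wants to redo the tolerance arithmetic, here is a small Python sketch. The 48 kbps target and the 47.4 / 54 kbps figures are the ones quoted above; the function names are made up for this example.

```python
# Hypothetical helper names; the numbers are the ones quoted in this post.
TARGET_KBPS = 48.0


def deviation_percent(bitrate_kbps: float, target_kbps: float = TARGET_KBPS) -> float:
    """Signed deviation from the target bit rate, in percent."""
    return (bitrate_kbps - target_kbps) / target_kbps * 100.0


def within_tolerance(bitrate_kbps: float, tolerance_percent: float,
                     target_kbps: float = TARGET_KBPS) -> bool:
    """True if the bit rate is no further than tolerance_percent from the target."""
    return abs(deviation_percent(bitrate_kbps, target_kbps)) <= tolerance_percent


print(round(deviation_percent(47.4), 2))  # -1.25 (overall average)
print(round(deviation_percent(54.0), 2))  # 12.5 (the one outlier sample)
print(within_tolerance(54.0, 10.0), within_tolerance(54.0, 15.0))  # False True
```

So the overall average sits about 1.25% under the target, and the single 54 kbps sample is 12.5% over it: outside the 10% band but inside 15%.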


Reply #62
I would follow up with a third listening test at 48 kbps. We did pre-tests for Nero AAC, then for AAC; the logical next step should be a multiformat test (AAC vs. others).
What I'd like to see is the reproduction of the same testing conditions as the previous test:
• same anchors
• same samples

Why?
- for curiosity first: it would be nice to see whether the same results are reproducible when the testing conditions are the same.
- because the previous conditions were apparently very well balanced. The test ended with both anchors at the extreme positions of the scale and all contenders located in the middle. HE-AAC is, in one sense, like a third anchor: a mid-anchor. Not bad, not good: something exactly in the middle. I'd like to see the position of the different contenders on a scale filled by these three references/anchors.
I'm not sure that I'm clear...

Nevertheless, new ABC/HR software using the MUSHRA arrangement is something I'd really like to experiment with first.


Reply #63
Quote
Nevertheless, new ABC/HR software using the MUSHRA arrangement is something I'd really like to experiment with first.


I agree - it would be good to test it first, but it seems that full MUSHRA won't be available for this test, either.

As far as sample selection goes - I would personally like to avoid using the same samples, as this might lead to assumptions (probably unjustified) that some of the contenders used them to "tune" their encoders. I don't believe this will happen - but we all know how many flames the results usually cause.


Reply #64
Quote
Do you have any proof of audible differences between Winamp 5.2 Beta 393 and 5.2 retail when it comes to 48 kbps AAC encoding? I don't - I actually couldn't find any at 48 kbps.
Yes, there is a small improvement resulting in better quality, because I found on the Winamp forums that build 393 has a bug in enc_aacplus which is fixed in a later build (440, I think). But the Pro version of Winamp 5.2 has a totally new encoder which supports LC-AAC and HE-AAC up to 320 kbps.


Quote
No, it wasn't statistically better than CT's encoder
That's what I wanted to hear. I know the facts, but people are now talking about the CT encoder as if it were the worst encoder.


Quote
Second, I would appreciate it a lot if you could point out which samples had an average bitrate of 64 kbps or more. And which Nero "big oversize" are you talking about?
When I find these samples, I'll post them for you. I don't have much time for this right now. BTW, about the oversize: I ripped a movie whose sound was AC3 stereo and tried to convert it using the latest beta at 48 kbps. The file should have been about 35 MB, but I got about 50 MB. In the end I encoded the sound with CT and got the right size with amazing quality. 48 kbps is good for 1-CD rips. Bye


Reply #65
Quote
Yes, there is a small improvement resulting in better quality, because I found on the Winamp forums that build 393 has a bug in enc_aacplus which is fixed in a later build (440, I think). But the Pro version of Winamp 5.2 has a totally new encoder which supports LC-AAC and HE-AAC up to 320 kbps.


I think it is still quite unrelated to the 48 kbps SBR performance of the CT encoder. Unless someone from CT comes and claims that Gabriel made a huge mistake by testing Winamp build 393.

It would also help if people from Winamp could tell us which version of the CT encoder is used in Winamp build 393 and which one in retail, if it is not a secret - e.g. the latest CT encoder version known to me is 7.2.1 - but I am not in a position to publish the changelog.

Quote
That's what I wanted to hear. I know the facts, but people are now talking about the CT encoder as if it were the worst encoder.


Now I am confused: who is talking about what? If you are referring to the listening test results - they are quite clear: all encoders were tied, so there was no winner in the statistical sense. Nero was almost statistically better overall than the 3GPP encoder (missed by a very small margin), Nero also had the highest average score - and it was significantly better than the #2 encoder (CT v1) on one sample.

That's it - calling any encoder "much worse" is a mistake IMO, and people making this mistake are just illiterate when it comes to reading test results.

It could be said that 3GPP HE-AAC v1 is probably worse than Nero HE-AAC v1, but the difference still falls just short of being statistically significant.
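
To illustrate the difference between "higher average score" and "statistically better", here is a rough Python sketch using a paired bootstrap confidence interval on per-sample score differences. This is only one generic way of checking for a tie and is not claimed to be the analysis used for the actual test; the function names are invented and the scores in the usage lines are made-up numbers, not test data.

```python
import random


def bootstrap_mean_diff_ci(scores_a, scores_b, n_resamples=10000,
                           alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of paired differences (a - b)."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(diffs, k=len(diffs))
        means.append(sum(resample) / len(resample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def significantly_different(scores_a, scores_b):
    """True only if the confidence interval for the mean difference excludes 0."""
    lower, upper = bootstrap_mean_diff_ci(scores_a, scores_b)
    return lower > 0 or upper < 0


# Made-up per-sample scores for two hypothetical encoders (18 samples each).
encoder_x = [3.5, 3.0] * 9
encoder_y = [3.0, 3.5] * 9   # wins and losses cancel out
print(significantly_different(encoder_x, encoder_y))
```

With these invented numbers the interval for the mean difference straddles zero, so neither encoder can be called better, which is the situation described above: a higher average alone does not make a winner.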

Quote
When I find these samples, I'll post them for you. I don't have much time for this right now. BTW, about the oversize: I ripped a movie whose sound was AC3 stereo and tried to convert it using the latest beta at 48 kbps. The file should have been about 35 MB, but I got about 50 MB. In the end I encoded the sound with CT and got the right size with amazing quality. 48 kbps is good for 1-CD rips. Bye


What we are talking about are the samples from the listening test, where the bit rate was measured to verify compliance. And all encoders complied.

Now, if you have a sample which generates "64 kbps", then it is clearly an encoder bug - it would be great if you could send that sample to me, but it does sound pretty weird.
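
As a quick sanity check on the 35 MB vs. 50 MB report, the effective bit rate can be estimated from the size ratio alone. This is only a sketch: it assumes the expected 35 MB was worked out for the same audio at the 48 kbps target and that container overhead is negligible, and the function name is made up.

```python
def effective_bitrate_kbps(target_kbps: float, expected_size: float,
                           actual_size: float) -> float:
    """Scale the target bit rate by the ratio of actual to expected file size.

    Both sizes must be in the same unit; the duration cancels out.
    """
    if expected_size <= 0:
        raise ValueError("expected_size must be positive")
    return target_kbps * actual_size / expected_size


# Figures from the quoted post: about 35 MB expected, about 50 MB obtained.
print(round(effective_bitrate_kbps(48.0, 35.0, 50.0), 1))  # 68.6
```

Under those assumptions a 50 MB file would correspond to roughly 68.6 kbps, which is why such a sample would be worth examining.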


Reply #66
Quote

Yes, there is a small improvement resulting in better quality, because I found on the Winamp forums that build 393 has a bug in enc_aacplus which is fixed in a later build (440, I think). But the Pro version of Winamp 5.2 has a totally new encoder which supports LC-AAC and HE-AAC up to 320 kbps.


I think it is still quite unrelated to the 48 kbps SBR performance of the CT encoder. Unless someone from CT comes and claims that Gabriel made a huge mistake by testing Winamp build 393.

It would also help if people from Winamp could tell us which version of the CT encoder is used in Winamp build 393 and which one in retail, if it is not a secret - e.g. the latest CT encoder version known to me is 7.2.1 - but I am not in a position to publish the changelog.


Build 393 has the same CT encoder as 5.2 and 5.21. It is version 7.2.0a of the aacPlus encoder. All changes made to enc_aacplus between build 393 and 5.21 were in the "front-end" (GUI, file handling, MP4 handling, etc.) only. Winamp Pro unlocks options for the oversampled SBR mode (which, incidentally, fared well on http://www.soundexpert.info, if their methodology is to be trusted).


Reply #67
Thanks Benski!

Hopefully this clears up the confusion and unnecessary panic concerning the Winamp builds and the CT encoders being used.


Reply #68
Quote
It could be said that 3GPP HE-AAC v1 is probably worse than Nero HE-AAC v1, but the difference still falls just short of being statistically significant.
Yes, but now nobody wants to use 3GPP HE-AAC v1 or CT, because the test proves that they are not good encoders. Especially people using AAC for the first time. In practice everything is different.

Quote
the latest CT encoder version known to me is 7.2.1 - but I am not in a position to publish the changelog.
I don't know which version it is because the DLL has no version info, but enc_aacplus.dll for build 393 is about 400 KB and the latest one is about 550-600 KB; I am not sure.

Quote
What we are talking about are the samples from the listening test, where the bit rate was measured to verify compliance. And all encoders complied.
Yes, I am talking about those samples, and I said that when I have time I will download the samples again and find them.


Reply #69
Quote
Yes, but now nobody wants to use 3GPP HE-AAC v1 or CT, because the test proves that they are not good encoders. Especially people using AAC for the first time. In practice everything is different.


Eek... in that case let's just stop doing listening tests, because uneducated people would make wild claims ;-) Actually, I wouldn't use 3GPP HE-AAC v1 anyway, as its performance is definitely worse, as can be seen from the graphs.

But it was not statistically worse (that is, it almost was compared to Nero, but still not quite).

Quote
I don't know which version it is because the DLL has no version info, but enc_aacplus.dll for build 393 is about 400 KB and the latest one is about 550-600 KB; I am not sure.


Please check out Benski's post - hopefully it will stop the completely unfounded speculation about encoders.


Reply #70
Quote
Actually, I wouldn't use 3GPP HE-AAC v1 anyway, as its performance is definitely worse, as can be seen from the graphs
That's just what I'm talking about; of course neither I nor anybody else wants to use 3GPP now.

Quote
Eek... in that case let's just stop doing listening tests, because uneducated people would make wild claims ;-)
Of course we have to do some tests, but not in that way.


Quote
Please check out Benski's post - hopefully it will stop the completely unfounded speculation about encoders.
Yes, I checked, and I don't know why the size of the encoder grew. If the version is the same, then it is my mistake. Also, I have the real Pro version, and the encoder is not the same in Pro and free.


Reply #71
Quote
Of course we have to do some tests, but not in that way.


And which way would you propose?


Reply #72
Quote
And which way would you propose?

We'll see. I don't have much time right now to explain the whole procedure, but simply CBR/ABR/VBR vs. CBR/ABR/VBR.


Reply #73
This approach would make sense if and only if all encoders used the same bit-allocation methods, but that is not the case.

Therefore, if you compare Vorbis and WMA to, say, AAC, the only measure you have is the total average bit rate, since the bit-rate allocations and strategies of these algorithms differ too much.

Regarding AAC vs. AAC - I dunno, the only implementation capable of doing CBR, VBR and ABR is Nero, so you either have to cripple Nero's encoder or you don't have much to test anyway.


Reply #74
Jesus, I wasn't at home one day and look at this monster!

Quote
I would follow up with a third listening test at 48 kbps. We did pre-tests for Nero AAC, then for AAC; the logical next step should be a multiformat test (AAC vs. others).
What I'd like to see is the reproduction of the same testing conditions as the previous test:
• same anchors
• same samples

Why?
- for curiosity first: it would be nice to see whether the same results are reproducible when the testing conditions are the same.
- because the previous conditions were apparently very well balanced. The test ended with both anchors at the extreme positions of the scale and all contenders located in the middle. HE-AAC is, in one sense, like a third anchor: a mid-anchor. Not bad, not good: something exactly in the middle. I'd like to see the position of the different contenders on a scale filled by these three references/anchors.
I'm not sure that I'm clear...

Nevertheless, new ABC/HR software using the MUSHRA arrangement is something I'd really like to experiment with first.


I see one problem - it is possible that some encoders are now tuned to work well with these samples. Choosing (some) new samples can rule out the possibility of testing samples that were used for tuning.