HydrogenAudio

Hydrogenaudio Forum => Listening Tests => Topic started by: rjamorim on 2004-08-12 03:22:39

Title: Proposal on listening tests
Post by: rjamorim on 2004-08-12 03:22:39
Hello.

It has been a month since the (official) end of the last listening test, and I'd like to propose some tests that could be conducted by the more courageous people out there. It's about time new tests started getting planned.

There have been several proposals. I believe the one deserving the most attention is a speech listening test, comparing speech samples across several different speech codecs (GSM, WMA Voice, Speex, G.729, MPEG-4 CELP, PureVoice/CDMA...) in both wideband and narrowband modes. I'm confident jmvalin would be able to help the organizer choose adequate encoders and settings.

Another test, proposed by Danchr, would be testing open source encoders at some bitrate. LAME, Vorbis, FAAC, maybe Musepack... Such a test would surely be of interest to users on platforms other than Windows, especially Linux users.

Sthayashi proposed a test comparing several AAC encoders against Vorbis, to see how Vorbis performs against encoders other than iTunes. I guess the answer is now clear that Vorbis will perform better than all AAC encoders at 128kbps, since it won even over the best of them. But the proposal has been made.

Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would show whether Musepack is still the codec that offers transparency at the lowest bitrates, or whether the recent developments in all other codecs have made Musepack obsolete in this respect.

Of course, the most courageous (or nuts) out there might be dreaming of a 160 or 192kbps test. I personally believe this is madness, but hey, don't let me stop you :B

I personally don't think that conducting another multiformat test at 128kbps, or an AAC test at 128kbps, would be justifiable right now. There has hardly been any development in the mainstream encoders since my last tests were conducted, so it would just be a waste of resources. Maybe next year?

This is what I can offer to help starters:
- Hosting the sample packages
- Hosting the torrent tracker and helping to seed from fast servers
- Helping you with answers and hints about conducting tests, to the best of my knowledge

All this for the low, low fee of zero bucks

All other responsibilities would belong to the test organizer: choosing the sample set, deciding on codecs, versions and settings, managing any pre-tests, gathering and processing the results, and the most dreaded question - VBR or CBR?

Hope I can spark some interest with this invitation. We definitely need someone to pick up from where I left off.

Thanks for your attention.

Best regards;

Roberto.
Title: Proposal on listening tests
Post by: rjamorim on 2004-08-12 03:24:10
Crap. The subtitle should have been "What should be conducted next?". It's a question, not an order. Please ignore that.
Title: Proposal on listening tests
Post by: slippyC on 2004-08-12 03:42:04
The test I would like to see done is various codecs at 80 to 96kbps; to me, no codec in stereo can perform well at the 64kbps level.  Maybe after HE-AAC with PS comes out, it will be able to use some of its extra bits to change my mind, but that's yet to be seen....errr heard.

Do something along the lines of mp3PRO, HE-AAC, Vorbis, WMA Pro, etc. at both 80 and 96kbps.  The reason for using both is to see whether those 2 bitrates are distinguishable from each other with the same codec.


Just so ya understand, encode all the samples at both 80kbps and 96kbps with each codec.

Anyway, that's what I'd like to see the conclusion to.


***Edited Part***
Of course these won't be transparent; that isn't what I wondered.  Just which sounds the best to the public at large, and whether there is some breaking point at the lower bitrates.
Title: Proposal on listening tests
Post by: kwanbis on 2004-08-12 04:24:32
Quote
Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would show whether Musepack is still the codec that offers transparency at the lowest bitrates, or whether the recent developments in all other codecs have made Musepack obsolete in this respect.


this is it .... =)

also ... a transcoding test should be good ... from MPC, Vorbis, to LAME ...
Title: Proposal on listening tests
Post by: Omion on 2004-08-12 04:39:49
Quote
Last but not least, ScorLibran proposed a transparency threshold test. It would somehow detect at which average bitrate each codec first reaches transparency. That would show whether Musepack is still the codec that offers transparency at the lowest bitrates, or whether the recent developments in all other codecs have made Musepack obsolete in this respect.


I'll second this. It'll be interesting to know where transparency occurs, although then it wouldn't make sense to use the standard 'killer samples.' Well, I suppose if you want a general ratio it's ok (ie. AAC reaches transparency at 80% the bitrate of MP3), but for an absolute value (ie. MPC is transparent at ~160) normal music should be used.
Title: Proposal on listening tests
Post by: shadowking on 2004-08-12 05:39:14
Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependent on the sample used.

If mpc q5 averages 170k, then we would use q5-5.5 for vorbis, lame V2, V3

Vorbis has quality issues below Q6, lame 3.96.1 aps or apm will match mpc bitrates. Nero AAC 'normal' profile should be used against mpc.

My bet is that the other codecs don't stand much chance at these bitrates. At >200k things even out more or less.
Title: Proposal on listening tests
Post by: jmvalin on 2004-08-12 05:56:01
I would definitely like to see a speech codec test and, even though I don't have enough time to organize it, I can provide help with choosing the codecs and test samples.

I think a speech codec test is probably more complicated (in some aspects at least) than one for music, because most speech codecs usually support only one bit-rate (even Speex doesn't have a continuous range like Vorbis or MP3). Actually, the only way I see of comparing the codecs is to plot the results on a quality vs. bit-rate graph.

These are the codecs which I think would be the most interesting to have:
narrowband: Speex (8, 11, 15 kbps), iLBC (15.2 kbps), AMR-NB (8, 10, 12 kbps), G.729A (8 kbps), GSM-FR (13 kbps), QCELP?
wideband: Speex (12.8, 20.6, 27.8 kbps), AMR-WB (2 or 3 bit-rates), G.722 (reference, 64 kbps), VMR?

The choice of samples is also important, I think. Do we want only clean (studio-like) samples, or also samples that would cover other applications like VoIP (samples with background noise) and broadcast (samples with light music in the background)? Even the filtering would be important, as some codecs (especially the narrowband ones) don't react well when there's a lot of low-frequency content.
Title: Proposal on listening tests
Post by: unfortunateson on 2004-08-12 06:41:00
I also vote for a transcoding test, I'd like to see how Musepack performs.
Title: Proposal on listening tests
Post by: rjamorim on 2004-08-12 06:54:39
Quote
I also vote for a transcoding test, I'd like to see how Musepack performs.


This test already happened. Sthayashi conducted it. He discussed and announced the test here. Nearly nobody participated :B
Title: Proposal on listening tests
Post by: ScorLibran on 2004-08-12 07:13:42
Quote
Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependent on the sample used.

I think most people would be able to do this test.  We're very likely not talking about bitrates above 160kbps, but rather closer to the 96-128 range.  People would ABX each bitrate until they reached one they couldn't distinguish with p<0.05.  That's their transparency threshold for that format and sample.  Wash, rinse and repeat for each other format and sample across all participants, then average all resulting bitrate thresholds and present them by format with a standard ANOVA error margin.  The whole thing would follow the ITU-R BS.1116-1 recommendation as much as possible.  As for making the test "participant-friendly", that's something I've made a high priority for when this test starts the planning phase.
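
The per-listener procedure described above could be sketched roughly as follows. This is only an illustration under assumptions that are not in the post: each ABX session is scored with a one-sided binomial test against pure guessing, the function names are hypothetical, and the session results are invented.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` answers right out of `trials` by guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def transparency_threshold(results, alpha=0.05):
    """Return the lowest bitrate the listener failed to ABX (p >= alpha).

    `results` is a list of (bitrate_kbps, correct, trials) tuples for one
    listener, one format and one sample. Returns None if every bitrate was ABXed.
    """
    for bitrate, correct, trials in sorted(results):
        if abx_p_value(correct, trials) >= alpha:
            return bitrate
    return None


# Invented sessions for a single listener (assumption, not real data).
sessions = [(64, 16, 16), (80, 14, 16), (96, 12, 16), (112, 9, 16), (128, 8, 16)]
print(transparency_threshold(sessions))
```

With 16 trials, 12 correct answers is the smallest score that gets below p = 0.05, so in this made-up example the listener still hears a difference at 96kbps and the threshold lands on the next bitrate. The sketch does not address averaging across listeners or the ANOVA step.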

Quote
If mpc q5 averages 170k, then we would use q5-5.5 for vorbis, lame V2, V3

Vorbis has quality issues below Q6, lame 3.96.1 aps or apm will match mpc bitrates. Nero AAC 'normal' profile should be used against mpc.

My bet is that the other codecs don't stand much chance at these bitrates. At >200k things even out more or less.

This is exactly why I think we need this kind of test: to resolve these issues and eliminate the need for speculation about sound quality and efficiency with formats tested separately and at different points in time.
Title: Proposal on listening tests
Post by: DreamTactix291 on 2004-08-12 07:27:34
Transparency test would be interesting.  I'm curious to see where most people end up considering modern codecs transparent and to see if the recent developments in codecs like Vorbis have really helped them a lot.
Title: Proposal on listening tests
Post by: Omion on 2004-08-12 07:29:37
Quote
Quote
Most people wouldn't be able to do this test. Guruboolez did test normal music (mpc vs vorbis vs lame) a few weeks ago, concluding that mpc was superior. I suppose where transparency occurs is dependent on the sample used.

I think most people would be able to do this test.  We're very likely not talking about bitrates above 160kbps, but rather closer to the 96-128 range.  People would ABX each bitrate until they reached one they couldn't distinguish with p<0.05.  That's their transparency threshold for that format and sample.  Wash, rinse and repeat for each other format and sample across all participants, then average all resulting bitrate thresholds and present them by format with a standard ANOVA error margin.  The whole thing would follow the ITU-R BS.1116-1 recommendation as much as possible.  As for making the test "participant-friendly", that's something I've made a high priority for when this test starts the planning phase.


Agreed. The point of this test would be to figure out at what bitrate people won't be able to do the test (so to speak). Everybody will be able to input their particular threshold, no matter how bad their ears are.

I can say for sure that my results will be around the 96 range. My hearing doesn't go above ~12kHz(*), and I have found previous listening tests quite difficult. But it would still be good to know if, for example, AAC was transparent at 80kbps and MP3 at 128.

I will definitely participate in this test, should it occur.

(*) I can hear a single sine wave at 14kHz, but I can't ABX a 12kHz lowpass on normal music
Title: Proposal on listening tests
Post by: DreamTactix291 on 2004-08-12 07:36:47
Ouch!

Did something happen when you were younger to damage your hearing?

Last time I checked I could hear a single sine wave up to around 18kHz.  I usually can ABX a 16kHz lowpass but not always.
Title: Proposal on listening tests
Post by: Omion on 2004-08-12 08:04:53
Not that I recall, but maybe it damaged my memory as well 

<Pointless story>
I remember making a hearing test for myself with a 17kHz sine wave. I played it, turned up the volume slowly, but couldn't hear a thing. Just then my friend opened the door, and he acted like he got hit in the head with some invisible brick. He said that's exactly what it felt like. For days he would come up to me and say "I can't believe you didn't HEAR that! Do you know how loud that was?!" Pretty loud, I guess.
</pointless story>

What's kind of odd is that I'll worry about audio quality to no end. I kept wondering if there was something that I wasn't hearing, but that was somehow detracting from my overall enjoyment. I kept prowling this forum, looking for any codec that might be better than what I was using at the time. In the end I was using MPC Xtreme, even though I probably couldn't tell the difference at half the bitrate.
Eventually I decided to save myself the emotional stress and re-ripped to FLAC. It probably uses up 8 times the disk space I need, but it saves my mind. In the end, that's what really matters.
Title: Proposal on listening tests
Post by: DreamTactix291 on 2004-08-12 09:14:20
I'm really bad about worrying about quality.  I had to spend 3 days talking myself out of going from -q6 to -q7 Vorbis for my portable even though I knew I wouldn't be able to really hear a difference.  That's why I really think the transparency test would be good.  Peace of mind.
Title: Proposal on listening tests
Post by: robUx4 on 2004-08-12 09:45:17
I'd really like to see the transparency test happen. But it's a tough one because it needs people that can really spot the smallest glitches in codecs.

On the other hand, I think most people would like to use the results.
Title: Proposal on listening tests
Post by: Iain on 2004-08-12 09:56:07
I would like to know if anyone else is interested in a listening test to determine the effect of post-processing (e.g. EQ, compression, etc.) on a codec's output after encoding.

Is it easier or harder to ABX?

Perhaps the same processing could be applied to the original uncompressed file and the file after compression.

Any takers?

-Iain
Title: Proposal on listening tests
Post by: Omion on 2004-08-12 22:10:55
I've been thinking about how to do the transparency test as objectively as possible. The problem is that there would be a LOT of files. For example, if one wanted to do a test of MP3, AAC, MPC, Vorbis on 10 different samples, with 4 bitrates, that's 160 separate files, and up to 160 ABX sessions.
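
The file count above is just the product of the three dimensions. A quick sketch to make the combinatorics concrete; the sample names and the four bitrates are placeholders chosen for illustration, not settings anyone in the thread agreed on.

```python
from itertools import product

codecs = ["MP3", "AAC", "MPC", "Vorbis"]
samples = [f"sample{n:02d}" for n in range(1, 11)]  # 10 placeholder sample names
bitrates_kbps = [64, 96, 128, 160]                  # assumed bitrate ladder

# One encoded file (and potentially one ABX session) per combination.
test_files = [f"{s}_{c}_{b}k" for c, s, b in product(codecs, samples, bitrates_kbps)]
print(len(test_files))  # 4 codecs * 10 samples * 4 bitrates = 160
```

Adding a fifth bitrate or a fifth codec pushes the total to 200, which is why the choice of bitrate steps matters so much for the workload.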

Well, if one feels like trusting people, it could be an informal thing. Download a FLAC, compress it yourself, and tell whoever's doing the test your lowest non-ABXable bitrate.

But then zealots could easily tip the scales ("OMG MPC @ 300kpbs and OGG @ 64!!!1"), so if one wants a truly scientific test, it would have to be encrypted, and compression settings determined beforehand. As far as I know, there is no program that will blindly ask you to ABX a bitrate, then if you pass go on to a higher bitrate, etc. Basically, I think it will be hard to implement.

ABChr could do it, but not very efficiently. One would download a 64kbps sample, and ABX it. Then go on to 96, and ABX it, then 128... But what if they could ABX a codec at 128 but could NOT at 96? I don't really know.
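
The open question above (a listener who can ABX 128 but not 96) needs some convention. One possible convention, which is my own assumption and not something proposed in the thread, is to be conservative and report the lowest tested bitrate that lies above every bitrate the listener managed to ABX. The function name is hypothetical.

```python
def conservative_threshold(outcomes):
    """Pick a threshold from possibly non-monotonic ABX outcomes.

    `outcomes` maps bitrate_kbps -> True if the listener ABXed that bitrate
    (for example p < 0.05), False otherwise. Returns the lowest tested bitrate
    above every ABXed one, or None if the highest tested bitrate was ABXed.
    """
    heard = [bitrate for bitrate, abxed in outcomes.items() if abxed]
    if not heard:
        return min(outcomes)  # nothing was ABXed: lowest tested bitrate
    above = [bitrate for bitrate in outcomes if bitrate > max(heard)]
    return min(above) if above else None


# The awkward case from the post: ABXed at 128 but not at 96.
print(conservative_threshold({64: True, 96: False, 128: True, 160: False}))
```

Under this rule the failed session at 96 is treated as a miss by the listener rather than as evidence of transparency, so the reported threshold is 160.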

PS. There's a very good reason this post reads like a raw brain dump. 

PPS. Was HA not working for a few hours a little while ago, or what?
Title: Proposal on listening tests
Post by: Eli on 2004-08-12 22:26:56
My vote is for a high bitrate listening test on problem samples. We all know that most codecs perform well and are nearly transparent at 192 with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg)
Title: Proposal on listening tests
Post by: rjamorim on 2004-08-12 23:12:54
Quote
My vote is for a high bitrate listening test on problem samples. We all know that most codecs perform well and are nearly transparent at 192 with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg)


The problem with such a test, as I have already written here a few times, is that using problem samples leads to non-representative results.

That is, you can't guarantee codec X is the best at 192kbps just because it encodes problem samples better than the competition. At most, you can say it's the best when encoding problem samples.
Title: Proposal on listening tests
Post by: Mac on 2004-08-12 23:43:22
Add another vote for the transparency test. It would be an interesting challenge for the organizer, to say the least.
Title: Proposal on listening tests
Post by: de Mon on 2004-08-12 23:50:24
What about a transparency test on EASY SAMPLES (just music - some proportional mix of metal, pop, classical etc. samples, but not the hard ones)?
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 00:01:56
Quote
What about a transparency test on EASY SAMPLES (just music - some proportional mix of metal, pop, classical etc. samples, but not the hard ones)?

Do you mean easy for the encoder, or easy to ABX? They're quite different.

I'd advise against using only easy-to-ABX samples, as there should be a representative difficulty level in order to get an absolute conclusion. That is to say, if only the trouble samples (easy to ABX) were tested, the transparency bitrates would be artificially inflated.
Title: Proposal on listening tests
Post by: Eli on 2004-08-13 00:54:26
Quote
Quote
My vote is for a high bitrate listening test on problem samples. We all know that most codecs perform well and are nearly transparent at 192 with the exception of problem samples. It would be nice to see which codecs handle these samples the best when compared at "advertised" transparent settings (probably VBR near 192 avg)


The problem with such a test, as I have already written here a few times, is that using problem samples leads to non-representative results.

That is, you can't guarantee codec X is the best at 192kbps just because it encodes problem samples better than the competition. At most, you can say it's the best when encoding problem samples.


A number of tests have already been done to show which codecs are best at 128. I don't think many people would be able to ABX many samples at 192, so it would be pointless. However, if only problem samples are used (with the assumption being that all of the codecs would be essentially transparent for most listeners - even those with tuned ears and good equipment), then the best codec would be the one that handles most of the problem samples well.
Title: Proposal on listening tests
Post by: rjamorim on 2004-08-13 01:02:43
Quote
that the best codec would be the one that handles most of the problem samples well.


Yes, the best codec - for problem samples!

There's no guarantee that it will show the same behaviour on "normal" samples. And what's the point of a test that only shows results applicable to a small share of musical styles?
Title: Proposal on listening tests
Post by: Phantom_Photon on 2004-08-13 01:07:25
Hi, I've been lurking for a long time, don't let the user data fool you.

I second (or, umm, twelfth) the vote for some kind of transparency test for the peace-of-mind reasons stated above.  (A friend just ripped 250 CDs to 160kbps Quicktime AAC then left the country, leaving CDs behind; I'm not sure what to tell him when he asks if they're good enough ;-)

What about throwing something else into the mix: transparency of the various online stores?  AFAIK, there wouldn't be copyright issues from taking a short clip of something bought at iTunes or the like and comparing it to FLACs of the original CD.

I know you can't choose the bitrate with them, but once samples were identified that, say, seemed not to be transparent to a good number of people at 128kbps, we could compare them to the store-downloaded version (encoded with whatever they use) to see if it's just as opaque.

Just a thought; I'm sure others here could dream up better implementations.
Title: Proposal on listening tests
Post by: shadowking on 2004-08-13 03:10:54
Sample A might be transparent at 110k yet another may need 190k..

The important thing is consistency, and I think such a test will only reveal 50% of the story. We need to show that a codec is consistently good in a certain bitrate range, e.g. mpc is solid in the 140-190k range, meaning it can encode the majority of samples without problems, without needing wild bitrates to deal with pre-echoes, sharp transients etc.
Title: Proposal on listening tests
Post by: Eli on 2004-08-13 04:49:24
Quote
Quote
that the best codec would be the one that handles most of the problem samples well.


Yes, the best codec - for problem samples!

There's no guarantee that it will show the same behaviour on "normal" samples. And what's the point of a test that only shows results applicable to a small share of musical styles?


Well, I would support a transparency test, to determine at which bitrate the codecs become transparent (throwing out the lower outliers, biasing the results toward people with better ears - which is not me), and then testing each of the codecs at its transparent setting with problem samples. I would be surprised if the results of a transparency test showed requirements for bitrates higher than -aps for lame, q5 for mpc, etc. (the recommended settings already). In fact I think these settings are probably overkill, allowing for more "head room", which is fine (hey, I encode in FLAC). So the test I was proposing was HA recommended settings on problem samples.
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 05:15:07
Quote
(throwing out the lower outliers, biasing the results toward people with better ears - which is not me)

You're so mean! You'd throw me out? 
Anyway, I think that all the people would be important to keep, then one could say that "95% of the people think MPC is transparent at 160kbps" and that would be that.
If you're averaging the results, it might be good to throw out the very top (guruboolez) and very bottom (me), but I think a percentile rating would be more informative.
Title: Proposal on listening tests
Post by: Eli on 2004-08-13 05:37:05
Quote
Quote
(throwing out the lower outliers, biasing the results toward people with better ears - which is not me)

You're so mean! You'd throw me out? 
Anyway, I think that all the people would be important to keep, then one could say that "95% of the people think MPC is transparent at 160kbps" and that would be that.
If you're averaging the results, it might be good to throw out the very top (guruboolez) and very bottom (me), but I think a percentile rating would be more informative.


I'd be throwing myself out as well. I just fear that keeping everyone would lower the quality to a threshold where those with better hearing would hear more artifacts. It's just the way it's reported, but all info would be available to you. I think it would be more useful for my personal sanity to know the results of the top 5% of the testers for transparency, since after averaging me in the results would probably be much lower, and I, like many others here, am just obsessive-compulsive about quality, even though I would probably never know the difference.
Title: Proposal on listening tests
Post by: shadowking on 2004-08-13 05:43:19
I have some spoken word comedy albums and would be interested in the speech test.
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 05:57:50
Quote
I'd be throwing myself out as well. I just fear that keeping everyone would lower the quality to a threshold where those with better hearing would hear more artifacts. It's just the way it's reported, but all info would be available to you. I think it would be more useful for my personal sanity to know the results of the top 5% of the testers for transparency, since after averaging me in the results would probably be much lower, and I, like many others here, am just obsessive-compulsive about quality, even though I would probably never know the difference.

I have the feeling that we're saying just about the same things. The way I understand it:

you:
Average the top 5% of the listeners' bitrates, so that around half of those 5% will find the resulting bitrate transparent.

me:
Find the bitrate that 95% of the people find transparent, so that only the top 5% can ABX them.

Either way, the bitrates should be about the same. Your way would result in a slightly higher bitrate, but as I said before, I think percentiles would be better than averaging a subset.
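
To make the comparison above concrete, here is a small sketch of both summaries computed on the same set of per-listener thresholds. Everything in it is assumed for illustration: the 40 thresholds are invented, the function names are hypothetical, and the percentile uses the simple nearest-rank definition.

```python
def percentile_threshold(thresholds, percent=95):
    """Nearest-rank percentile: lowest bitrate that is transparent
    for at least `percent` percent of the listeners."""
    ordered = sorted(thresholds)
    rank = (percent * len(ordered) + 99) // 100  # integer ceiling
    return ordered[rank - 1]


def top_group_mean(thresholds, percent=5):
    """Average threshold of the most demanding `percent` percent of listeners."""
    ordered = sorted(thresholds, reverse=True)
    count = max(1, (percent * len(ordered) + 99) // 100)
    top = ordered[:count]
    return sum(top) / len(top)


# Invented transparency thresholds (kbps) for 40 listeners.
thresholds = [96] * 10 + [112] * 12 + [128] * 10 + [144] * 5 + [160] * 2 + [192]

print(percentile_threshold(thresholds))  # 160
print(top_group_mean(thresholds))        # 176.0
```

On this made-up data the percentile gives 160kbps and the top-5% average gives 176kbps, which matches the expectation that averaging the top group lands a little higher than the 95th percentile.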
Title: Proposal on listening tests
Post by: Derge on 2004-08-13 05:58:34
Some of the excitement about conducting a transparency test might arise from the bitter, unassailable fact that there has never been a decent one. Ever. Tell me if I'm way off base here. Transparency has my vote.
Title: Proposal on listening tests
Post by: markanini on 2004-08-13 11:03:53
I think speech listening tests would be interesting, since there haven't been any. I use FLAC for archiving my CDs, so a transparency threshold test would not be interesting for me, and such a test would be very difficult to conduct.
Title: Proposal on listening tests
Post by: oluv on 2004-08-13 13:16:34
sorry i am new here, so please don't flame me if my proposition is ridiculous to you. 

for me it is not the fixed bitrate that is important, but the resulting file-size. if we compare differently encoded samples at the same bitrate, there might still be some samples seriously bigger than others. and the funny thing is that the smaller files might sound "better" or nearer to transparency than others. and this would interest me personally.

my aim would be the lower bitrate end, like 80-128 kbps. because at these rates the user can save a lot of data compared to 192kbps or above.
in the last days i was playing around with ogg vorbis a lot, with the conclusion that 96kbps is totally enough for some material and for other tracks even lower. it is very recording-dependent, but most of my test-material i encoded sounds perfect at 128kbps with ogg and i won't go up much more, because at the higher rates the differences in sound are not so obvious anymore.

so, let's try to produce the smallest possible files, that still sound good not to say "transparent".

is this nutty?
Title: Proposal on listening tests
Post by: Madman2003 on 2004-08-13 16:55:31
Most listening tests are, in my eyes, low bitrate tests; I would like to see a test of how codecs work in the 175-225 kbps range. My codec of choice would have to reach transparency almost always in that range (the size of files at 200+ kbps is not an issue for me). I would like to see how ogg vorbis (1.1 RC1, megamix 2 and 1.1 RC1 with advanced encode options set to different settings) goes against musepack.

Madman2003.
Title: Proposal on listening tests
Post by: phong on 2004-08-13 19:00:51
I might be interested in conducting the "about" 160k test (which is something I discussed a long time ago in another thread).  Contenders would include:
mpc --standard
lame --preset medium (I would push for 3.96.1)
vorbis (settings TBD)
aac (Nero?  Itunes?)
wma OR wma pro (not both)

Though the number of participants would be limited by the difficulty of the test, I think this would be offset somewhat by the greater interest in the test, i.e. people elsewhere tend to scoff at the 128k tests ("OMG, 128k is so horrible, it's worse than AM radio, I'd rather cut my ears off, nobody uses 128k").  I'd make the test longer, perhaps make it into a series of tests, to give people time to participate.  ABX results would be required.

Subgroups could be analyzed to get more information, e.g. what were the results on the most difficult samples, what were the results for the most sensitive listeners, etc.  The subgroups would be defined beforehand to avoid introducing bias.

There would need to be a vorbis test beforehand to determine which combination of -q, impulse_noisetune and impulse_trigger_profile produces the best results at similar bitrates to the other codecs in the test.  The same might be true of other codecs (as lame --preset 128 was determined not to be optimal prior to the last 128k test).

Also, a 160k test might be a good "feasibility study" for future "transparency" tests.
Title: Proposal on listening tests
Post by: dev0 on 2004-08-13 19:14:38
I said it before and I'll say it again: A 160kbps (actually 170kbps judging from your suggested settings; and everything above) test is not feasible and worth nothing.
The transparency threshold test is more interesting, but will probably be difficult to conduct, since it requires lots and lots of ABXing.
<bitter>I wonder how many of the people requesting such a test have actually succeeded at ABXing any modern codec at those bitrates on more than a handful of samples. Until they have: STFU.</bitter>

Personally I'd vote for the speech-codec test, since it's an area that has been neglected at HA.org so far but deserves more coverage.
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 19:21:40
Quote
sorry i am new here, so please don't flame me if my proposition is ridiculous to you.  

for me it is not the fixed bitrate that is important, but the resulting file-size. if we compare differently encoded samples at the same bitrate, there might still be some samples seriously bigger than others. and the funny thing is that the smaller files might sound "better" or nearer to transparency than others. and this would interest me personally.

my aim would be the lower bitrate end, like 80-128 kbps. because at these rates the user can save a lot of data compared to 192kbps or above.
in the last days i was playing around with ogg vorbis a lot, with the conclusion that 96kbps is totally enough for some material and for other tracks even lower. it is very recording-dependent, but most of my test-material i encoded sounds perfect at 128kbps with ogg and i won't go up much more, because at the higher rates the differences in sound are not so obvious anymore.

so, let's try to produce the smallest possible files, that still sound good not to say "transparent".

is this nutty?

Yup. It's nutty 

File size depends on exactly two things: bitrate and song length. The files that are "seriously bigger" are the ones that are "seriously longer" than the other songs. If you wanted every song to be exactly 5MB, then you'd be using 5MB for a 30 second song, and 5MB for a 10 minute song. The 30 second song would have MUCH higher quality than the 10 minute, since there are more bits used to describe every second of the 30-second song. It's better to focus on bitrate than file size.
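
The relationship above is simple enough to check with a few lines. This is a rough sketch under stated assumptions: "MB" means 10^6 bytes, "kbps" means 1000 bits per second, the bitrate is the average over the whole file, and container and tag overhead are ignored. The function names are made up for the example.

```python
def file_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Approximate size of the audio data in MB (10^6 bytes)."""
    return bitrate_kbps * 1000 / 8 * seconds / 1_000_000


def bitrate_for_size(size_mb: float, seconds: float) -> float:
    """Average bitrate in kbps needed to fill `size_mb` in `seconds`."""
    return size_mb * 1_000_000 * 8 / seconds / 1000


# A 4-minute song at 128kbps:
print(round(file_size_mb(128, 240), 2))       # 3.84

# Forcing every song to exactly 5MB, as in the example above:
print(round(bitrate_for_size(5, 30)))         # 1333 kbps for a 30-second song
print(round(bitrate_for_size(5, 10 * 60)))    # 67 kbps for a 10-minute song
```

The fixed-size approach gives the short song roughly twenty times the bitrate of the long one, which is why comparing at equal bitrate makes more sense than comparing at equal file size.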
Title: Proposal on listening tests
Post by: Derge on 2004-08-13 20:26:10
Quote
<bitter>I wonder how many of the people requesting such a test have actually succeeded at ABXing any modern codec at those bitrates on more than a handful of samples. Until they have: STFU.</bitter>


The point of a high-bitrate test, Dev0, is not:

1. To prove conventional wisdom wrong

2. To prove YOU wrong

That handful of samples would provide empirical data (something we do not have yet) on where codecs fail at high bitrates. What we have now is: "Conduct your own tests and see if 160kb/s is good enough for you.", which is a discouraging and unappealing cop-out to most people. We could do some good with a test like this, so long as we resign ourselves to failure before we begin. The greater failure to me is in conducting tests which do not really need to be conducted.
Title: Proposal on listening tests
Post by: oluv on 2004-08-13 20:38:16
Quote
File size depends on exactly two things: bitrate and song length. The files that are "seriously bigger" are the ones that are "seriously longer" than the other songs.


of course i know that the size is length-dependent. that was not the point. i was only confused by the fact that with ogg some files are bigger than others, although the same bitrate is assumed.
but now i read that in fact with ogg the bitrate varies from one build to another at the same quality-setting. that means that -q4 can be 120kbps while with another build it's even 160 at -q4. as i do not see the "actual" bitrate in my player i thought it could be something else that had influence on the file-size as well.

sorry for my ignorance. but that is why i am here, to learn something
Title: Proposal on listening tests
Post by: sehested on 2004-08-13 20:43:48
I would like to know the transparency level of each codec, but I see a number of problems related to such a test.

Which of the following should be considered the transparency level?
- The level at which the top 5% can't ABX problem samples from original
- The level at which the top 5% can't ABX 95% of the samples from original
- The level at which the top 5% can't ABX average samples from original
- The level at which I can't ABX problem samples from original on my stereo
- The level at which I can't ABX problem samples from original on my iPod
- The level at which I can't ABX average samples from original

I would probably be more interested in the last three and could find the transparency level by doing my personal listening tests, once the problem samples have been identified.

A transparency test with public interest should be based on the results from a population of testers that should each report the transparency level supported by ABX tests. That would require not only a number of codecs to be used but also a number of predefined settings for each codec.
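
As a reminder of what "supported by ABX tests" means in practice, here is a small sketch of how a single ABX run is usually scored: the chance of getting at least that many trials right by pure guessing. The 16-trial run and the function name are illustrative assumptions, not something fixed by this proposal.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers out of `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Hypothetical runs of 16 trials each.
for correct in (9, 12, 14):
    print(f"{correct}/16 correct: p = {abx_p_value(correct, 16):.4f}")
```

A tester would only count a setting as "not transparent to me" when this probability falls below whatever cutoff the test organizer picks; 0.05 is a common choice, but that is an assumption here as well.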

In the end it would require a series of listening tests, say a separate transparency test for each codec.

It would be nice to know both the 5% problem sample transparency level and the 5% average sample transparency level. Then you could say: "95% of people can not distinguish even problem samples from the original using these settings".
Whether I am amongst these 95% I could soon find out and if not then select these settings comfortably knowing that I would probably never hear any artifacts no matter what I encoded.

Then again, it seems like a LOT of work for this comfort.

I therefore second Phong's suggestion of a 160 kbps listening test.

BTW I expect some of the codecs to be transparent to most users at 160 kbps.
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 21:06:38
Quote
of course i know that the size is length-dependent. that was not the point. i was only confused by the fact that with ogg some files are bigger than others, although the same bitrate is assumed.
but now i read that in fact with ogg the bitrate varies from one build to another at the same quality-setting. that means that -q4 can be 120kbps while with another build it's even 160 at -q4. as i do not see the "actual" bitrate in my player i thought it could be something else that had influence on the file-size as well.

sorry for my ignorance. but that is why i am here, to learn something

Ah. I see what you're saying. I think I should point out that Vorbis(*) quality settings are not mapped to specific bitrates. Files encoded with -q4 should have the same quality, but to achieve that they might need different bitrates. Even with a single build, one -q4 file could come out at 120 kbps and another at 160. It all depends on how hard they are to encode.
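
A toy illustration of that point, using made-up per-second bitrates for two tracks encoded at the same quality setting. The numbers are invented to show the averaging and are not measurements from any encoder.

```python
from statistics import mean

# Hypothetical per-second bitrates (kbps) at one and the same quality setting.
easy_track = [104, 110, 116, 118, 122, 126, 130, 134]   # sparse, easy material
hard_track = [140, 152, 158, 164, 166, 170, 172, 158]   # dense, hard material


def average_kbps(per_second_kbps):
    """Average bitrate of a variable-bitrate stream sampled once per second."""
    return mean(per_second_kbps)


print("easy track:", average_kbps(easy_track), "kbps")
print("hard track:", average_kbps(hard_track), "kbps")
```

The encoder spends bits where the material needs them, so the average that ends up determining the file size differs from track to track even though the setting never changed.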

No need to apologize about ignorance as long as you're trying to overcome it 

(*) Just to be picky, Ogg is the container and Vorbis is the actual audio codec. A lot of people confuse the two.
Title: Proposal on listening tests
Post by: Omion on 2004-08-13 21:20:35
Quote
I would like to know the transparency level of each codec, but I see a number of problems related to such a test.

Which of the following should be considered the transparency level?
- The level at which the top 5% can't ABX problem samples from original
- The level at which the top 5% can't ABX 95% of the samples from original
- The level at which the top 5% can't ABX average samples from original
- The level at which I can't ABX problem samples from original on my stereo
- The level at which I can't ABX problem samples from original on my iPod
- The level at which I can't ABX average samples from original

I would probably be more interested in the last three and could find the transparency level by doing my personal listening tests, once the problem samples have been identified.

A transparency test with public interest should be based on the results from a population of testers that should each report the transparency level supported by ABX tests. That would require not only a number of codecs to be used but also a number of predefined settings for each codec.

In the end it would require a series of listening tests, say a separate transparency test for each codec.

I've been thinking about this a lot, and the test would indeed be hard to pull off. I've especially been thinking about where the threshold should be. The candidates I came up with were:
- 95% of the people can't ABX any samples
- nobody can ABX more than 5% of the samples (this would require at least 20 samples, though)
- 95% of the people can't ABX 95% of the samples (sort of a combination between the first two)
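
To make the three candidates concrete, here is a rough sketch that checks each of them against a table of ABX outcomes. The data layout (one row per listener, one boolean per sample) and the panel itself are assumptions for illustration only.

```python
def transparency_criteria(results):
    """results[i][j] is True if listener i successfully ABXed sample j.

    Returns whether each of the three candidate thresholds is met.
    """
    n_listeners = len(results)
    n_samples = len(results[0])
    hits = [sum(row) for row in results]  # samples ABXed per listener

    # Integer arithmetic avoids float rounding right at the 5% / 95% boundaries.
    within_5_percent = [20 * h <= n_samples for h in hits]

    return {
        "95% of listeners ABX nothing":
            20 * sum(h == 0 for h in hits) >= 19 * n_listeners,
        "nobody ABXes more than 5% of the samples":
            all(within_5_percent),
        "95% of listeners ABX at most 5% of the samples":
            20 * sum(within_5_percent) >= 19 * n_listeners,
    }


# Made-up panel: 20 listeners, 20 samples. Nineteen hear nothing; one ABXes two samples.
panel = [[False] * 20 for _ in range(19)] + [[True, True] + [False] * 18]
for name, met in transparency_criteria(panel).items():
    print(f"{name}: {met}")
```

With 20 samples, 5% is exactly one sample, which is why the second criterion needs at least that many. In this made-up panel a single sharp-eared listener is enough to fail the second criterion while the other two still pass, so the choice of definition changes the outcome.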

The last three on your list really only have to do with you, so there's no need for everybody to participate. You only need yourself to figure out where they're transparent to you. 

It would be quite a bit different from other listening tests, but I think that's a lot of the appeal for it: there hasn't been anything like it.

PS. I think you got something a bit wrong with your "which of the following" list. For example:
Quote
- The level at which the top 5% can't ABX problem samples from original
If the top 5% can't ABX them, then the bottom 95% CERTAINLY can't. You'll end up with the level where nobody can ABX them. I think you mean that only the top 5% CAN ABX them.

edit: streamlined quote
Title: Proposal on listening tests
Post by: sehested on 2004-08-13 23:06:21
Quote
If the top 5% can't ABX them, then the bottom 95% CERTAINLY can't. You'll end up with the level where nobody can ABX them. I think you mean that only the top 5% CAN ABX them.

Sure, my mistake.
Title: Proposal on listening tests
Post by: Polar on 2004-08-14 22:05:30
I'm interested in a multiformat test at 96 kbps, for reasons I've already mentioned here on July 9 (http://www.hydrogenaudio.org/forums/index.php?showtopic=23047). If I understood her/him correctly, slippyC (http://www.hydrogenaudio.org/forums/index.php?showtopic=25835&view=findpost&p=233785) has also shown interest in this rate range.

This between-64-and-128 test should be a lot easier to organize than a 160 kbps or transparency threshold test, however interesting the latter seem, even to me. So while we're debating the hows and whys of these kinds of transparent bitrate tests - which could take a while, or so it looks - I figured we could just as well get a 96 kbps test going, and fill up that huge gap between 64 and 128. By the time everyone is on the same page about that long-awaited transparency test, a 96k test could well be over and dealt with.

Come to think of it, 96 kbps is often the bitrate online radio stations consider broadband transmission quality, at least for the stations I listen to most regularly. I don't know about your impression, but in this part of the Internet I get to choose between narrowband, which is usually 32 kbps, and broadband, which is 96k most of the time (128k is a little less common in this case).

Anyone?
Title: Proposal on listening tests
Post by: markanini on 2004-08-14 22:37:30
I want to point out that a speech listening test would be good to have, because IP telephony is getting more common and might soon be part of everyday life, and nobody would run a better listening test than Hydrogenaudio. So there's a good reason to do a speech listening test, even though most HA members are more interested in the performance of mid- to high-bitrate lossy codecs.
Title: Proposal on listening tests
Post by: marcan on 2004-08-14 23:25:06
I'm coming up with something completely different. I’m looking for quality and distribution on the net.
I distributed mp3 encoded directly from 24 bits and I was impressed by the result.

My proposition is for pcm 44.1/16 dithered <> 44.1/24 mp3 lame --api. The purpose is to find out which one is closer to the original pcm 44.1/24. These tests should be done with 16 and obviously 24 bits playback.
If the mp3 is better, it could bring better quality with the widest compatibility.
Title: Proposal on listening tests
Post by: Polar on 2004-08-14 23:35:05
Quote
My proposition is for pcm 44.1/16 dithered <> 44.1/24 mp3 lame --api. The purpose is to find out which one is closer to the original pcm 44.1/24. These tests should be done with 16 and obviously 24 bits playback.
I don't consider myself very well versed in lossy codecs, but my guess is that 44.1/24 LAME --alt-preset insane (so 320 kbps) will be nearly impossible to ABX from an original 44.1/16 PCM.
Besides, very few people, even among HA members, have 24 bit soundcards.
Title: Proposal on listening tests
Post by: markanini on 2004-08-15 00:16:48
Quote
I'm coming up with something completely different. I’m looking for quality and distribution on the net.
I distributed mp3 encoded directly from 24 bits and I was impressed by the result.

My proposition is for pcm 44.1/16 dithered <> 44.1/24 mp3 lame --api. The purpose is to find out which one is closer to the original pcm 44.1/24. These tests should be done with 16 and obviously 24 bits playback.
If the mp3 is better, it could bring better quality with the widest compatibility.

Do some ABX testing for yourself, since most people encode from CD, which is 16 bits.
Quote
Besides, very few people, even among HA members, have 24 bit soundcards.

I don't think so, 24 bit sound cards are very common these days.
Title: Proposal on listening tests
Post by: Triza on 2004-08-15 02:25:27
Quote
so, let's try to produce the smallest possible files that still sound good, not to say "transparent".

is this nutty?


Yes. Someone already discussed how to define transparency. I think that is doable. Also, transparency tests can use simple ABX-ing. "Sound good" is not something you can easily define. It would require a more complex, so-called ABC/HR test. Even then, different people would give different ratings. Besides transparency being easier to define and test, most people here have historically been after transparency, because this is a (sane) audiophile forum.

Triza

PS: I also would like to see a transparency test. I am just about to start to conduct one for myself with Ogg Vorbis, which is my chosen lossy format.
Title: Proposal on listening tests
Post by: Polar on 2004-08-18 08:45:20
Quote
Quote
Besides, very few people, even among HA members, have 24 bit soundcards.
I don't think so, 24 bit sound cards are very common these days.
I really wouldn't exaggerate this. 24 bit soundcards may indeed be becoming increasingly common on new machines, but I wouldn't go as far as to call them "very common" in general. As long as 24 bit isn't mainstream the way 16 bit still is, I see no point in conducting a 24 bit MP3 listening test, which then wouldn't serve the general interest the way a public test should.