Topic: Proposal on listening tests

Proposal on listening tests

Reply #25
Hi, I've been lurking for a long time, don't let the user data fool you.

I second (or, umm, twelfth) the vote for some kind of transparency test for the peace-of-mind reasons stated above. (A friend just ripped 250 CDs to 160kbps QuickTime AAC, then left the country, leaving the CDs behind; I'm not sure what to tell him when he asks if they're good enough ;-)

What about throwing something else into the mix: transparency of the various online stores?  AFAIK, there wouldn't be copyright issues from taking a short clip of something bought at iTunes or the like and comparing it to FLACs of the original CD.

I know you can't choose the bitrate with them, but once samples were identified that, say, seemed not to be transparent to a good number of people at 128kbps, we could compare them to the store-downloaded version (encoded with whatever they use) to see if it's just as opaque.

Just a thought; I'm sure others here could dream up better implementations.

Reply #26
Sample A might be transparent at 110k, yet another may need 190k.

The important thing is consistency, and I think such a test will only reveal 50% of the story. We need to show that a codec is consistently good across a certain bitrate range, e.g. mpc is solid in the 140-190k range, meaning it can encode the majority of samples without problems and without needing wild bitrates to deal with pre-echo, sharp transients, etc.

Reply #27
Quote
Quote
that the best codec would be the one that handles most of the problem samples well.


Yes, the best codec - for problem samples!

There's no guarantee that it will show the same behaviour on "normal" samples. And what's the point of a test that only shows results applicable to a small share of musical styles?


Well, I would support a transparency test to determine at which bitrate the codecs become transparent (throwing out the lower outliers, biasing the results towards people with better ears, which is not me), and then testing each of the codecs at its transparent setting with problem samples. I would be surprised if the results of a transparency test showed a need for bitrates higher than -aps for lame, q5 for mpc, etc. (the recommended settings already). In fact I think these settings are probably overkill, allowing for more "headroom", which is fine (hey, I encode in FLAC). So the test I was proposing was the HA recommended settings on problem samples.

 

Reply #28
Quote
(throwing out the lower outliers, biasing the results towards people with better ears, which is not me)

You're so mean! You'd throw me out? 
Anyway, I think it would be important to keep all the people; then one could say that "95% of the people think MPC is transparent at 160kbps" and that would be that.
If you're averaging the results, it might be good to throw out the very top (guruboolez) and very bottom (me), but I think a percentile rating would be more informative.
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #29
Quote
Quote
(throwing out the lower outliers, biasing the results towards people with better ears, which is not me)

You're so mean! You'd throw me out? 
Anyway, I think it would be important to keep all the people; then one could say that "95% of the people think MPC is transparent at 160kbps" and that would be that.
If you're averaging the results, it might be good to throw out the very top (guruboolez) and very bottom (me), but I think a percentile rating would be more informative.


I'd be throwing myself out as well. I just fear that keeping everyone would lower the quality to a threshold where those with better hearing would hear more artifacts. It's just the way it's reported; all the info would still be available to you. I think it would be more useful for my personal sanity to know the results of the top 5% of the testers for transparency, since with me averaged in the results would probably be much lower, and I, like many others here, am just obsessive-compulsive about quality, even though I would probably never know the difference.

Reply #30
I have some spoken word comedy albums and would be interested in the speech test.

Reply #31
Quote
I'd be throwing myself out as well. I just fear that keeping everyone would lower the quality to a threshold where those with better hearing would hear more artifacts. It's just the way it's reported; all the info would still be available to you. I think it would be more useful for my personal sanity to know the results of the top 5% of the testers for transparency, since with me averaged in the results would probably be much lower, and I, like many others here, am just obsessive-compulsive about quality, even though I would probably never know the difference.

I have the feeling that we're saying just about the same things. The way I understand it:

you:
Average the top 5% of the listeners' bitrates, so that around half of those 5% will find the resulting bitrate transparent.

me:
Find the bitrate that 95% of the people find transparent, so that only the top 5% can ABX them.

Either way, the bitrates should be about the same. Your way would result in a slightly higher bitrate, but as I said before, I think percentiles would be better than averaging a subset.
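
To make the two approaches concrete, here is a minimal Python sketch. It assumes each listener has a personal transparency threshold (the lowest bitrate they can no longer ABX). The listener data, the function names and the nearest-rank percentile rule are all illustrative assumptions, not results from any real test.

```python
from statistics import mean


def ceil_percent(percent, count):
    """Smallest integer >= percent% of count, using integer math only."""
    return (percent * count + 99) // 100


def percentile_threshold(thresholds, percent):
    """Nearest-rank percentile: the lowest bitrate that is at or above the
    personal threshold of at least `percent`% of the listeners."""
    ordered = sorted(thresholds)
    rank = max(1, ceil_percent(percent, len(ordered)))
    return ordered[rank - 1]


def top_group_mean(thresholds, percent):
    """Average threshold of the most sensitive `percent`% of the listeners."""
    ordered = sorted(thresholds, reverse=True)
    size = max(1, ceil_percent(percent, len(ordered)))
    return mean(ordered[:size])


# Hypothetical thresholds (kbps) for 20 listeners, invented for illustration.
thresholds_kbps = [96] * 4 + [112] * 5 + [128] * 6 + [144] * 3 + [160, 192]

percentile_result = percentile_threshold(thresholds_kbps, 95)  # "me" approach
average_result = top_group_mean(thresholds_kbps, 5)            # "you" approach
print(percentile_result, average_result)  # prints: 160 192
```

With this made-up data the percentile approach lands on 160 kbps, while averaging the top 5% lands on 192 kbps, which matches the expectation that averaging the most sensitive listeners gives the higher figure. How far apart the two end up depends entirely on how the thresholds are distributed.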
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #32
Some of the excitement about conducting a transparency test might arise from the bitter, unassailable fact that there has never been a decent one. Ever. Tell me if I'm way off base here. Transparency has my vote.

Reply #33
I think speech listening tests would be interesting, since there haven't been any. I use FLAC for archiving my CDs, so a transparency threshold test would not be interesting to me, and such a test would be very difficult to conduct.

Reply #34
sorry i am new here, so please don't flame me if my proposition is ridiculous to you. 

for me it is not the fixed bitrate that matters, but the resulting file size. if we compare differently encoded samples at the same bitrate, some files might still be seriously bigger than others. and the funny thing is that the smaller files might sound "better", or nearer to transparency, than the others. this would interest me personally.

my aim would be the lower bitrate end, like 80-128 kbps, because at these rates the user can save a lot of data compared to 192 kbps or above.
in the last few days i have been playing around with ogg vorbis a lot, with the conclusion that 96kbps is totally enough for some material, and for other tracks even less. it is very recording-dependent, but most of the test material i encoded sounds perfect at 128kbps with ogg, and i won't go up much more, because at the higher rates the differences in sound are not so obvious anymore.

so, let's try to produce the smallest possible files that still sound good, not to say "transparent".

is this nutty?

Reply #35
Most listening tests are, in my eyes, low-bitrate tests. I would like to see a test of how codecs perform in the 175-225 kbps range. My codec of choice would have to reach transparency almost always in that range (the size of 200+ kbps files is not an issue for me). I would like to see how ogg vorbis (1.1 RC1, megamix 2 and 1.1 RC1 with advanced encode options set to different settings) fares against musepack.

Madman2003.

Reply #36
I might be interested in conducting the "about" 160k test (which is something I discussed a long time ago in another thread).  Contenders would include:
mpc --standard
lame --preset medium (I would push for 3.96.1)
vorbis (settings TBD)
aac (Nero?  iTunes?)
wma OR wma pro (not both)

Though the number of participants would be limited by the difficulty of the test, I think this would be offset somewhat by the greater interest in the test, i.e. people elsewhere tend to scoff at the 128k tests ("OMG, 128k is so horrible, it's worse than AM radio, I'd rather cut my ears off, nobody uses 128k").  I'd make the test longer, perhaps make it into a series of tests, to give people time to participate.  ABX results would be required.

Subgroups could be analyzed to get more information, e.g. what were the results on the most difficult samples, what were the results for the most sensitive listeners, etc. The subgroups would be defined beforehand to avoid introducing bias.

There would need to be a vorbis test beforehand to determine which combination of -q, impulse_noisetune and impulse_trigger_profile produces the best results at bitrates similar to the other codecs in the test. The same might be true of other codecs (as lame --preset 128 was determined not to be optimal prior to the last 128k test).

Also, a 160k test might be a good "feasibility study" for future "transparency" tests.
I am *expanding!*  It is so much *squishy* to *smell* you!  *Campers* are the best!  I have *anticipation* and then what?  Better parties in *the middle* for sure.
http://www.phong.org/

Reply #37
I said it before and I'll say it again: a 160kbps test (actually 170kbps, judging from your suggested settings), or anything above that, is not feasible and worth nothing.
The transparency threshold test is more interesting, but will probably be difficult to conduct, since it requires lots and lots of ABXing.
<bitter>I wonder how many of the people requesting such a test have actually succeeded at ABXing any modern codec at those bitrates on more than a handful of samples. Until they have: STFU.</bitter>

Personally I'd vote for the speech-codec test, since it's an area that has been neglected at HA.org so far but deserves more coverage.
"To understand me, you'll have to swallow a world." Or maybe your words.

Reply #38
Quote
sorry i am new here, so please don't flame me if my proposition is ridiculous to you.  

for me it is not the fixed bitrate that matters, but the resulting file size. if we compare differently encoded samples at the same bitrate, some files might still be seriously bigger than others. and the funny thing is that the smaller files might sound "better", or nearer to transparency, than the others. this would interest me personally.

my aim would be the lower bitrate end, like 80-128 kbps, because at these rates the user can save a lot of data compared to 192 kbps or above.
in the last few days i have been playing around with ogg vorbis a lot, with the conclusion that 96kbps is totally enough for some material, and for other tracks even less. it is very recording-dependent, but most of the test material i encoded sounds perfect at 128kbps with ogg, and i won't go up much more, because at the higher rates the differences in sound are not so obvious anymore.

so, let's try to produce the smallest possible files that still sound good, not to say "transparent".

is this nutty?

Yup. It's nutty 

File size depends on exactly two things: bitrate and song length. The files that are "seriously bigger" are the ones that are "seriously longer" than the other songs. If you wanted every song to be exactly 5MB, then you'd be using 5MB for a 30 second song, and 5MB for a 10 minute song. The 30 second song would have MUCH higher quality than the 10 minute, since there are more bits used to describe every second of the 30-second song. It's better to focus on bitrate than file size.
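
As a rough illustration of that arithmetic, here is a small Python sketch. It assumes 1 MB means 1,000,000 bytes, treats the bitrate as the average over the whole file (as with VBR encodes), and ignores container overhead and tags; the function names are made up for this example.

```python
def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate implied by a file size and a playing time."""
    return size_bytes * 8 / duration_seconds / 1000


def file_size_bytes(bitrate_kbps, duration_seconds):
    """Approximate file size for a given average bitrate and playing time."""
    return bitrate_kbps * 1000 * duration_seconds / 8


FIVE_MB = 5_000_000  # assumption: 1 MB = 1,000,000 bytes

short_song = average_bitrate_kbps(FIVE_MB, 30)   # 30-second song
long_song = average_bitrate_kbps(FIVE_MB, 600)   # 10-minute song
print(round(short_song), round(long_song))  # prints: 1333 67
```

Forcing both songs into 5 MB gives the 30-second song roughly 1333 kbps and the 10-minute song roughly 67 kbps, which is why a fixed file size says little about quality.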
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #39
Quote
<bitter>I wonder how many of the people requesting such a test have actually succeeded at ABXing any modern codec at those bitrates on more than a handful of samples. Until they have: STFU.</bitter>


The point of a high-bitrate test, Dev0, is not:

1. To prove conventional wisdom wrong

2. To prove YOU wrong

That handful of samples would provide empirical data (something we do not have yet) on where codecs fail at high bitrates. What we have now is: "Conduct your own tests and see if 160kb/s is good enough for you.", which is a discouraging and unappealing cop-out to most people. We could do some good with a test like this, so long as we resign ourselves to failure before we begin. The greater failure to me is in conducting tests which do not really need to be conducted.

Reply #40
Quote
File size depends on exactly two things: bitrate and song length. The files that are "seriously bigger" are the ones that are "seriously longer" than the other songs.


of course i know that the size is length-dependent. that was not the point. i was only confused by the fact that with ogg some files are bigger than others, although the same bitrate is assumed.
but now i have read that with ogg the bitrate actually varies from one build to another at the same quality setting. that means -q4 can be 120kbps with one build while it's as much as 160 with another. as i do not see the "actual" bitrate in my player, i thought something else might be influencing the file size as well.

sorry for my ignorance. but that's why i am here, to learn something

Reply #41
I would like to know the transparency level of each codec, but I see a number of problems related to such a test.

Which of the following should be considered the transparency level?
- The level at which the top 5% can't ABX problem samples from original
- The level at which the top 5% can't ABX 95% of the samples from original
- The level at which the top 5% can't ABX average samples from original
- The level at which I can't ABX problem samples from original on my stereo
- The level at which I can't ABX problem samples from original on my iPod
- The level at which I can't ABX average samples from original

I would probably be more interested in the last three and could find the transparency level by doing my personal listening tests, once the problem samples have been identified.

A transparency test with public interest should be based on the results from a population of testers that should each report the transparency level supported by ABX tests. That would require not only a number of codecs to be used but also a number of predefined settings for each codec.
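
For what "supported by ABX tests" could mean in numbers, here is a hedged Python sketch of the usual one-sided binomial calculation: the chance of scoring at least that many correct answers by pure guessing. The 0.05 cutoff and the function names are assumptions chosen for illustration, not a rule taken from this thread.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def supports_difference(correct, trials, alpha=0.05):
    """True if the score is unlikely enough to be luck (alpha is an assumption)."""
    return abx_p_value(correct, trials) < alpha


print(round(abx_p_value(12, 16), 4))   # prints: 0.0384
print(supports_difference(12, 16))     # prints: True
print(supports_difference(9, 16))      # prints: False
```

Under these assumptions, 12 out of 16 would count as a successful ABX and 9 out of 16 would not. A reported transparency level would then be the lowest setting at which a tester no longer clears the cutoff.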

In the end it would require a series of listening tests, say a separate transparency test for each codec.

It would be nice to know both the 5% problem sample transparency level and the 5% average sample transparency level. Then you could say: "95% of people cannot distinguish even problem samples from the original using these settings".
Whether I am amongst these 95% I could soon find out and if not then select these settings comfortably knowing that I would probably never hear any artifacts no matter what I encoded.

Then again, it seems like a LOT of work for this comfort.

I therefore second Phong's suggestion of a 160 kbps listening test.

BTW I expect some of the codecs to be transparent to most users at 160 kbps.

Reply #42
Quote
of course i know that the size is length-dependent. that was not the point. i was only confused by the fact that with ogg some files are bigger than others, although the same bitrate is assumed.
but now i have read that with ogg the bitrate actually varies from one build to another at the same quality setting. that means -q4 can be 120kbps with one build while it's as much as 160 with another. as i do not see the "actual" bitrate in my player, i thought something else might be influencing the file size as well.

sorry for my ignorance. but that's why i am here, to learn something

Ah. I see what you're saying. I think I should point out that Vorbis(*) quality settings are not mapped to specific bitrates. Files encoded with -q4 should have the same quality, but in order to do that they might need a different bitrate. Even with one build a -q4 file could be 120, and another 160. It all depends on how hard they are to encode.

No need to apologize about ignorance as long as you're trying to overcome it 

(*) Just to be picky, Ogg is the container and Vorbis is the actual audio codec. A lot of people confuse the two.
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #43
Quote
I would like to know the transparency level of each codec, but I see a number of problems related to such a test.

Which of the following should be considered the transparency level?
- The level at which the top 5% can't ABX problem samples from original
- The level at which the top 5% can't ABX 95% of the samples from original
- The level at which the top 5% can't ABX average samples from original
- The level at which I can't ABX problem samples from original on my stereo
- The level at which I can't ABX problem samples from original on my iPod
- The level at which I can't ABX average samples from original

I would probably be more interested in the last three and could find the transparency level by doing my personal listening tests, once the problem samples have been identified.

A transparency test with public interest should be based on the results from a population of testers that should each report the transparency level supported by ABX tests. That would require not only a number of codecs to be used but also a number of predefined settings for each codec.

In the end it would require a series of listening tests, say a separate transparency test for each codec.

I've been thinking about this a lot, and the test would indeed be hard to pull off. I've especially been thinking about where the threshold should be; my candidates were:
- 95% of the people can't ABX any samples
- nobody can ABX more than 5% of the samples (this would require at least 20 samples, though)
- 95% of the people can't ABX 95% of the samples (sort of a combination between the first two)
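
The three candidate thresholds can be checked against a table of results, as in this Python sketch. The data layout (one row of booleans per listener, one column per sample), the helper names and the example data are all hypothetical.

```python
def share_at_least(count, total, percent):
    """True if count/total is at least percent% (integer math only)."""
    return count * 100 >= percent * total


def evaluate(results):
    """results[listener][sample] is True if that listener ABXed that sample."""
    listeners = len(results)
    samples = len(results[0])
    hits = [sum(row) for row in results]
    clean = sum(1 for h in hits if h == 0)                    # ABXed nothing
    within = sum(1 for h in hits if h * 100 <= 5 * samples)   # ABXed <= 5%
    return {
        "95% of people ABX nothing": share_at_least(clean, listeners, 95),
        "nobody ABXes more than 5%": within == listeners,
        "95% of people ABX at most 5%": share_at_least(within, listeners, 95),
    }


def make_row(hit_count, samples=20):
    """Hypothetical listener who ABXed `hit_count` of `samples` samples."""
    return [True] * hit_count + [False] * (samples - hit_count)


# 20 invented listeners: one ABXes 2 samples, one ABXes 1, the rest none.
results = [make_row(2), make_row(1)] + [make_row(0) for _ in range(18)]
verdict = evaluate(results)
for name, passed in verdict.items():
    print(name, passed)
```

With this invented data only the third (combined) criterion is met: 18 of 20 listeners ABX nothing (90%), one listener exceeds 5% of the samples, but 19 of 20 stay at or below 5%. It also shows why the second criterion needs at least 20 samples, since with fewer a single ABXed sample is already more than 5%.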

The last three on your list really only have to do with you, so there's no need for everybody to participate. You only need yourself to figure out where they're transparent to you. 

It would be quite a bit different from other listening tests, but I think that's a lot of the appeal for it: there hasn't been anything like it.

PS. I think you got something a bit wrong with your "which of the following" list. For example:
Quote
- The level at which the top 5% can't ABX problem samples from original
If the top 5% can't ABX them, then the bottom 95% CERTAINLY can't. You'll end up with the level where nobody can ABX them. I think you mean that only the top 5% CAN ABX them.

edit: streamlined quote
"We demand rigidly defined areas of doubt and uncertainty!" - Vroomfondel, H2G2

Reply #44
Quote
If the top 5% can't ABX them, then the bottom 95% CERTAINLY can't. You'll end up with the level where nobody can ABX them. I think you mean that only the top 5% CAN ABX them.

Sure, my mistake.

Reply #45
I'm interested in a multiformat test at 96 kbps, for reasons I've already mentioned here on July 9. If I understood her/him correctly, slippyC has also shown interest in this rate range.

This between-64-and-128 test should be a lot easier to organize than a 160 kbps or transparency threshold test, however interesting the latter seem, even to me. So while we're debating the hows and whys of these kinds of transparent bitrate tests (which could take a while, or so it looks) I figured we could just as well get a 96 kbps test going, and fill up that huge gap between 64 and 128. By the time all noses point in the same direction for that long-awaited transparency test, a 96k test could well be over and dealt with.

Come to think of it, 96 kbps is often the bitrate online radio stations consider broadband transmission quality, at least for the stations I listen to most regularly. I don't know about your impression, but in this part of the Internet I get to choose between narrowband, which is usually 32 kbps, and broadband, which is 96k most of the time (128k is a little less common in this case).

Anyone?

Reply #46
I want to point out that a speech listening test would be good to have because IP telephony is getting more common and might soon be part of everyday life, and nobody else would make a better listening test than Hydrogenaudio. So there's a good reason to do a speech listening test, even though most HA members are mostly interested in the performance of mid- to high-bitrate lossy codecs.

Reply #47
I'm coming up with something completely different. I’m looking at quality and distribution on the net.
I have distributed MP3s encoded directly from 24-bit sources and I was impressed by the result.

My proposal is to compare PCM 44.1/16 dithered <> 44.1/24 mp3 lame –api. The purpose is to find out which one is closer to the original PCM 44.1/24. These tests should be done with 16-bit and, obviously, 24-bit playback.
If the mp3 is better, it could bring better quality with the widest compatibility.

Reply #48
Quote
My proposal is to compare PCM 44.1/16 dithered <> 44.1/24 mp3 lame –api. The purpose is to find out which one is closer to the original PCM 44.1/24. These tests should be done with 16-bit and, obviously, 24-bit playback.
I don't consider myself very well versed in lossy codecs, but my guess is that 44.1/24 LAME --alt-preset insane (so 320 kbps) will be nearly impossible to ABX from an original 44.1/16 PCM.
Besides, very few people, even among HA members, have 24-bit soundcards.

Reply #49
Quote
I'm coming up with something completely different. I’m looking at quality and distribution on the net.
I have distributed MP3s encoded directly from 24-bit sources and I was impressed by the result.

My proposal is to compare PCM 44.1/16 dithered <> 44.1/24 mp3 lame –api. The purpose is to find out which one is closer to the original PCM 44.1/24. These tests should be done with 16-bit and, obviously, 24-bit playback.
If the mp3 is better, it could bring better quality with the widest compatibility.

Do some ABX testing yourself, since most people encode from CDs, which are 16-bit.
Quote
Besides, very few people, even among HA members, have 24-bit soundcards.

I don't think so; 24-bit sound cards are very common these days.