I think I've discovered a new listening test method

This is my first post, and I have no professional background in audio, but I love music and try hard to treat it with respect. So I offer you something I've been working on.

I've always had a problem with traditional A/B or ABX listening tests. They're great when the differences are pretty big, but as they get smaller it becomes harder and harder to remember precisely what one track sounded like when you're listening to another. I for one just can't think about audio with the kind of detail that I can with visual information. Now I will confess that by the time this becomes a problem, the quality difference is probably not very significant. But perhaps I'll get better equipment later and suddenly the difference will be more obvious. Or perhaps I'll play the file to someone with better ears and they'll find it unpleasant. Whatever the case, I wanted a way to detect even small differences in audio quality. My concern was not so much with being blind or even with determining which audio file sounded better, just to determine if there was a discernible difference in the first place. In my mind there's no point trying to say which file sounds better if they've just been proven to sound exactly the same. But this method could easily be performed under double blind conditions.

Anyway, I think I've come up with a solution. So far it's worked nicely in my tests, but it seems far too easy. I can't help but suspect I'm overlooking something. So I'm asking for your help to validate the technique. Tell me if you can see a flaw in the process.

My inspiration came from the classic method of inverting one stereo channel and combining it with the other to eliminate center channel vocals and allow analysis of the subtler instrumental stuff. I've heard it called OOPS or Out Of Phase Stereo. I always liked the elegance of it. But then I realized I could do something similar to analyze the impact of various levels of audio compression.

To start with, I ripped a single track from a CD as a wav file, then created several versions of it compressed with the latest LAME library at different bitrates: VBR 0, CBR 320, CBR 128 and finally CBR 56, to provide a really glaring example if need be.
Then I loaded the wav into Audacity along with one of the compressed versions and carefully measured and cut off the padding created by the Mp3 compression, so the two waveforms were aligned not simply to the second but to the very sample. I then inverted the compressed version and mixed it with the wav. This produced a file which contained only the portions of the audio which had been removed or altered by LAME during compression. It was really cool to listen to, a hissy, scratchy sort of racket. But it's not the difference file itself I was interested in.
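
As a rough sketch of the invert-and-mix step (Python with NumPy, not what Audacity does internally): a synthetic tone stands in for the ripped track, a fake "codec" that only adds noise stands in for LAME, and the 1105-sample padding is an illustrative value rather than a measurement. The function name `difference_signal` is made up for this example.

```python
import numpy as np

def difference_signal(original, decoded, delay):
    """Return original minus decoded, after dropping `delay` leading
    samples of padding from `decoded` (sample-exact alignment)."""
    aligned = decoded[delay:delay + len(original)]
    if len(aligned) != len(original):
        raise ValueError("decoded track is too short after trimming")
    # Inverting one track and mixing it in is the same as subtracting it.
    # int32 gives headroom so the subtraction cannot overflow int16.
    return original.astype(np.int32) - aligned.astype(np.int32)

# Synthetic stand-in for a real rip: 1 second of a 440 Hz tone at 44.1 kHz.
rate = 44100
t = np.arange(rate) / rate
original = np.round(10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# Fake "codec" (assumption): small random error plus leading padding.
rng = np.random.default_rng(0)
noise = rng.integers(-20, 21, size=original.shape)
padding = 1105  # illustrative only; measure it for your own files
decoded = np.concatenate([np.zeros(padding, dtype=np.int16),
                          (original + noise).astype(np.int16)])

diff = difference_signal(original, decoded, delay=padding)
```

With the wrong `delay`, the subtraction no longer cancels the music and `diff` becomes about as loud as the tracks themselves, which is why the alignment has to be exact to the sample.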

I then loaded the corresponding Mp3 back into Audacity below this "difference" file. With the two in perfect waveform alignment, playing them together produced something which was mathematically identical to the source CD. But at any time during the playback I could mute the "difference" file and switch instantly to hearing just the Mp3. If there was a discernible quality difference between the Mp3 and the source CD, it would show up at the point of muting as a change in volume, background hiss, clarity, etc. But if the aspects of the file which had been removed or altered were imperceptible to my ears (and thus had no impact on audio quality), there would be no change when I muted/unmuted the "difference" track.
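
The mute toggle can be sketched the same way. This is only a model of the arithmetic, assuming integer samples and that nothing clips or gets dithered when the difference file is exported; random numbers stand in for the audio, and `play_with_mute` is a hypothetical helper.

```python
import numpy as np

def play_with_mute(decoded, diff, muted):
    """Mix the decoded track with the difference track, silencing the
    difference track wherever `muted` is True."""
    gate = np.where(muted, 0, 1)
    return decoded + gate * diff

rng = np.random.default_rng(1)
original = rng.integers(-30000, 30000, size=1000)
decoded = original + rng.integers(-50, 51, size=1000)  # stand-in for a lossy decode
diff = original - decoded

muted = np.zeros(1000, dtype=bool)
muted[400:700] = True  # difference track muted for samples 400..699

out = play_with_mute(decoded, diff, muted)
```

Where the difference track is audible the output equals the original sample for sample, and where it is muted the output is just the decoded track.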

So far my testing has revealed absolutely no detectable change on the half dozen songs I've tried at CBR 320 or VBR -0. To prove that the method's working though I also made 128 bitrate and 56 bitrate Mp3s. Those produce a dramatic drop in quality every time the "difference" track is muted, but sound just like the CD when it's unmuted. So it seems the method works correctly, and LAME works really damn good these days.

I've also found that it's much easier to detect an increase in quality from unmuting the "difference" track than a decrease in quality from muting it.

So what do you guys think? Has anyone else tried this before? Am I on to something here, or have I misunderstood something? I do hope I'm correct in this because the method is so simple and elegant. It only takes a minute to set up.

Reply #1
It's a sighted test, so you cannot guarantee that the results will be free from expectation bias.

Using foobar2000's ABX utility provides exactly the same functionality but eliminates the possibility for expectation bias.

Reply #2
Is it actually able to switch between a compressed and non-compressed version of a song in mid playback? I've never seen something which could do that without introducing a delay or other change of its own.


Reply #4
For a moment I was quite excited, but every time I switch tracks in Foobar's ABX during playback there's a very distinct popping sound. 1 in 3 times there's even a jitter. I have a hunch that Audacity is waiting for the waveform to cross the center before it applies the mute, since it always pulls it off without a pop.

Reply #5
Perhaps I can clarify my intent with this a little. I know things get rocky when you start to work outside of blind tests.

My major complaint with traditional testing is not the blindness but rather the reliance on memory. I know I can't trust my memory to accurately remember precisely how clear a given horn blast in a song was 3 and a half minutes previous when I fire up a track to compare with it. So what I'm trying to do is change the test from one relying on memory and a vague feeling of "which sounds better" to a test of the listener's ability to notice a change in the audio track as it plays. A change to pure perception without requiring recollection.

I further wanted to do it in a way that would be very simple and transparent, relying on basic tools that the person orchestrating the test would understand, instead of a magic box process that did god only knows what to the audio in the process.

I'm sure there are other methods to switch seamlessly between audio sources, but I'm not aware of any which work this well.

The part I find particularly interesting is that even with my sightedness, even with my unfair advantage, my knowledge of when the switches occurred and in which direction, even when fully expecting to hear a change, I found I was unable to. CBR 320 and VBR -0 produced no change when compared to the source CD that I could detect. If I had detected a difference I would have followed it up with a traditional ABX test. But since I didn't, I suspect that it means there isn't one. And that's something I wouldn't have felt comfortable saying if I had gone straight to a traditional ABX test, due to its reliance on memory.

Reply #6
I'll try to take a few minutes later to convert your "word problem" into a regular math problem (or expression/equation).

But if I'm following what you are doing, I think you are simply doing a normal A/B test (sighted), comparing the original file to the MP3. Except you've done some roundabout operations to re-create the original, rather than just using the "original-original"... Is that what you're doing?

BTW (this doesn't directly relate to what you are doing) - The sound of the difference file does not represent the difference in the sound! The fact that you had to time-align the MP3 is a big clue as to why this doesn't work. Try recording yourself and someone else reading the same sentence and subtract... See if what you hear sounds like the difference in the two files. Or record yourself twice and subtract. Or, try subtracting two completely different songs (or different versions of the same song) and see if that sounds like the difference between the two files... It turns out that the difference sounds exactly like the sum (with two uncorrelated files).
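
A quick numerical check of that last point, as a sketch with white noise standing in for two unrelated recordings (an assumption; real audio is not white noise, but the argument only needs the two signals to be uncorrelated):

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(x ** 2))

rng = np.random.default_rng(42)
a = rng.standard_normal(200_000)  # "recording" A
b = rng.standard_normal(200_000)  # unrelated "recording" B

rms_sum = rms(a + b)    # mixing the two
rms_diff = rms(a - b)   # inverting one and mixing

# For contrast: a signal that is almost the same as A (tiny added error).
c = a + 0.01 * b
rms_small = rms(a - c)
```

The power of `a + b` and `a - b` differs only by twice the covariance of the two signals, which is close to zero when they are uncorrelated, so `rms_sum` and `rms_diff` come out nearly equal. The subtraction only gets quiet when the two signals are nearly identical and aligned, as with `c`.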

Reply #7
Yes, it is a roundabout way to do an a/b, but the most important detail is that it allows an instant and undistorted ability to change between the two sources. So far I've never encountered that anywhere else.

I'm afraid I'm a little unclear about your second part, but it's late for me and I'm on the verge of going to bed. I had to time correct them because the Mp3 compression pads the beginning and end of the file. The whole gapless playback problem. Once the beginnings of the files were aligned, they remained aligned for the duration of the files. It's not like I had to alter the speed of one to keep it lined up with the other.

Reply #8
If you cannot remember what something sounds like, how on earth are you going to know if you're listening to an altered version of it?

This is one of the classic excuses given by people who are afraid of ABX tests.  It simply does not hold water.

Reply #9
Interesting. I've no professional background in audio, and never used anything like Audacity either.

But I can understand well enough what you've done there. I must admit I'm surprised to learn there's a problem finding hardware & software to switch instantly & silently between audio streams. Maybe the pros have stuff they don't noise about.

DVDdoug seems preoccupied with the fact that what you've come up with is still fundamentally an A/B comparison. Er... what should we be looking for instead? The only thing we're interested in here is any audible difference between A and B, to which end anything that simplifies comparison must surely be a useful addition to the toolkit.

I disagree with greynol's idea that your perceptions are somehow 'invalidated' simply because you have conscious control of the source switch. Sure, they wouldn't constitute a strict scientific proof unless someone else toggled the input unseen by you. Even then you might be lying, and claim not to be aware of any switching at all. But obviously you yourself would know what if any differences you heard, and any sane person should believe what their own senses tell them, not someone else's convoluted & agenda-ridden arguments to the contrary.

As to the notion that you should be able to remember the audio quality of a sample for perhaps several minutes before you finally get to hear the alternative you want to compare it with...  Words (almost) fail me.

Human brains are very good at noticing subtle differences when there are real-time transitions - we've evolved that way because it's seriously pro-survival to know about changes going on around us. But when it comes to comparing the current environment with the memory of a similar previous one, our brains are heavily biased to find similarities, not differences. Evolution again, because matching & categorising the present against stored historical knowledge makes us smarter, more adaptive, and more likely to survive until parenthood.

In short, Jax184, I think you raise a very interesting issue. I'll be fascinated to know what the real professionals have to say.

Reply #10
I disagree with greynol's idea that your perceptions are somehow 'invalidated' simply because you have conscious control of the source switch.

I said nothing of the sort.

any sane person should believe what their own senses tell them

Unfortunately for your argument there are plenty of experiments that demonstrate just the opposite when it comes to audio.  You can begin by googling the McGurk effect.

As to the notion that you should be able to remember the audio quality of a sample for perhaps several minutes before you finally get to hear the alternative you want to compare it with...

Can you say straw man?

Reply #11
I disagree with greynol's idea that your perceptions are somehow 'invalidated' simply because you have conscious control of the source switch.


I suspect you have never tried to compare audio sources in a blind test. One's (non-audio) perception has an extraordinarily powerful effect on one's hearing.


Reply #13
I am painfully aware of how fallible our senses are. I actually prefaced the page I wrote about this technique on my website with a discussion of optical illusions and such. But our memories are even more fallible. Ask any police officer, they'll tell you all about the 5 different unique stories that 5 different people who witnessed the same car crash will tell. So perception+memory has the potential to introduce far more mistakes than pure perception. I'm just trying to get it down to its simplest form. The sightedness of the method I outlined is entirely optional. You can blind it just as well. But I find it very interesting that even with my sight I still wasn't able to detect a difference in high quality Mp3s. I think there's something to be said for that.


Quote
Can you say straw man?


I'm sorry, but it sounded to me like that's what you were saying as well. Care to clarify?


That Advanced ABX looks interesting, but so far it's been a pain. It requires the two streams to be in the same format and have the same length. In other words, I need to do most of the work of the technique I outlined above, then export the files, download a separate program and use it. Why not just keep going with Audacity at that point? To be a full replacement it would need to align the waveforms internally and only switch between them during a center cross.
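
For what "align the waveforms internally" could look like, here is a minimal brute-force sketch in Python with NumPy. It is not how any existing ABX tool works as far as I know; `find_delay`, the 4096-sample window and the 1105-sample padding are all made up for the example, and random noise stands in for the audio (real music would need a window that contains some transients rather than silence).

```python
import numpy as np

def find_delay(original, decoded, max_delay, window=4096):
    """Return the lag, in samples, at which `decoded` best lines up with
    the first `window` samples of `original` (brute-force correlation)."""
    ref = original[:window].astype(np.float64)
    best_lag, best_score = 0, -np.inf
    for lag in range(max_delay + 1):
        seg = decoded[lag:lag + window].astype(np.float64)
        if len(seg) < len(ref):
            break  # ran off the end of the decoded track
        score = float(np.dot(ref, seg))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

rng = np.random.default_rng(7)
original = rng.standard_normal(20000)
padding = 1105  # illustrative value, not a measured encoder delay
decoded = np.concatenate([np.zeros(padding),
                          original + 0.01 * rng.standard_normal(20000)])

delay = find_delay(original, decoded, max_delay=3000)
```

Once the lag is known, trimming that many samples from the start of the decoded track gives the same sample-exact alignment as cutting the padding by hand.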

BTW, I tried that McGurk effect video and found that it didn't work on me. Perhaps it was because of the subtle timing errors in that particular example, but I heard it as ba no matter what he was mouthing.



Reply #14
I disagree with greynol's idea that your perceptions are somehow 'invalidated' simply because you have conscious control of the source switch.

I said nothing of the sort.

Well, I don't want to get in a big argument about this - my main concern was to endorse the OP approach IF it's really true there's no existing easily-implemented way of seamlessly switching between audio sources for subjective comparison.

But you DID say any differences Jax184 perceived would be subject to expectation bias, and I assumed the implication was this would make his conclusions less meaningful. Which I don't agree with. Apart from anything else, he could easily get a friend to do 'blind' source switching once he's established the basic principles.

As to the epistemological issues raised by perception of differences between current input and memory, I stand by my contention that memory is far less reliable, and should be factored out of any comparison process so far as possible. Obviously in some (often contrived or pathological) cases people tend to misinterpret what their own senses are telling them. But that's no reason to big up the role of memory, which I firmly believe is even less reliable.


I'm interested in this thread because I'd like to know if OP's method is technically sound (no pun intended!), and because it just seems odd to me if there isn't already a standard way of doing the seamless source switching. With disc space so cheap today, I don't really care how big my audio files are any more. But if I could prove to myself that I consistently hear the difference between 128kbps mp3 and lossless formats, I might want to identify my personal "equivalence threshold" bitrate with a view to replacing inferior quality items already in my music library.

Reply #15
It doesn't really matter whether you're switching between original and encoded using buttons marked A and B, or a button marked mute.

What matters is that you demonstrate you can tell a difference when you don't know what you're listening to and that includes there being no way you could be unconsciously led to know which one you're listening to. A double-blind test.

This is entirely absent from your sighted A/B test.

It's not that easy to prove you hear a difference in a simple A/B test even if you do get a friend to help. If you think it through, it's far more work (and far less reliable) than a straightforward ABX test.
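
For reference, scoring a straightforward ABX run comes down to a one-sided binomial test: how likely is a score at least this good from pure guessing? A minimal sketch in plain Python (`abx_p_value` is a name made up for this example, and the 16-trial run is just an illustration):

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` by guessing, with a 50% chance per trial."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

p_12_of_16 = abx_p_value(12, 16)
p_8_of_16 = abx_p_value(8, 16)
```

12 right out of 16 works out to 2517/65536, a little under 4%, while 8 out of 16 is what guessing would be expected to produce.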

That said, a simple sighted A/B, sometimes with no switch glitch, is one of the first things I usually try. Or sometimes I loop the two different versions, switch the monitor off, turn the volume down (so I lose track of which is which), then turn it back up again later. If I think I can tell which one is which, and/or hear a difference, I'll do an ABX.

Cheers,
David.

Reply #16
Ahh, but that's what I'm trying to build! As I've said a few times now, this is the only method I am aware of which does not cause some form of distortion at the moment of transition. Nothing to inform the listener that the file being played has been changed, and nothing to mask the change in quality that might come with it. If you can point me in the direction of another piece of software which will take two versions of a track, put them into perfect alignment and switch between them with no pauses, changes in volume, pops, crackles, jitters or other artifacts introduced by the method, then I would welcome it as a far more straightforward method. But until then, this is still the only method I know of which can switch between two tracks in real time in such a clean manner that it can be used for blind testing.

Reply #17
Here, how about I include a practical example to make things a bit clearer. Here's 15 seconds of a Sarah McLachlan song. In the first example it switches twice between a VBR -0 stream and the CD audio. See if you can spot it.
http://www.jax184.com/projects/Mp3s/Buildi...R%20Toggle).wav

In this second example it switches between a CBR 56k stream and the CD audio.
http://www.jax184.com/projects/Mp3s/Buildi...k%20Toggle).wav


Notice how smooth the transition is? To my ears with a pair of Sennheiser HD-280s, there's no change at all in the first one, despite switching twice between different quality streams.

(PS, I hope I'm not violating any forum rules by linking to these)

Reply #18
I agree it would be nice to have an ABX tool with inaudible switching like this.

However, I don't use ABX like that (I always play from start for each click), so haven't tried advanced ABX to see if it works.

Cheers,
David.

Reply #19
I'm sorry, but it sounded to me like that's what you were saying as well. Care to clarify?

Allow me to add emphasis to the quote to which I responded:
As to the notion that you should be able to remember the audio quality of a sample for perhaps several minutes before you finally get to hear the alternative you want to compare it with...

With foobar2000's ABX, which is the most suitable test for checking encoders (as opposed to hardware-based ABX), the user can choose any clip of any duration and switch between the two versions of it.  Let's compare this with the method you currently have where the tester must be a different person from the testee and the tester chooses what and where the change is made without the testee being given the opportunity to hear any specific region in either version.  Until your method is made to be double-blind, perhaps exactly as what has been done with foobar2000's ABX comparator, it will not be as effective.

Ask any police officer, they'll tell you all about the 5 different unique stories that 5 different people who witnessed the same car crash will tell.

Tell them ahead of time that they will be witnessing an accident and will be given a video that perfectly captures it, any part of which they can replay any number of times they like and you'll see that your analogy quickly falls apart.

So far your only complaint with foobar2000 is a split-second glitch when transitioning.  Considering how many people report reliable results with this tool, many (most?) of whom probably never made use of the transitioning functionality, I think your complaint is quite minor.  Let's take a step back and actually consider the fact that any audible artifact can be pinpointed to an exact piece of audio.  It should then make sense to select just that point for testing by using the comparator's start and stop functionality.  If the audible artifact is something more general like the presence of a low pass filter, the same method can still be applied.  The piece of the sample selected for playback need not be longer than just a few seconds.

FWIW, I have dabbled with your method of subtracting an error signal in the past.

Reply #20
For a moment I was quite excited, but every time I switch tracks in Foobar's ABX during playback there's a very distinct popping sound. 1 in 3 times there's even a jitter. I have a hunch that Audacity is waiting for the waveform to cross the center before it applies the mute, since it always pulls it off without a pop.


It would be good to check this, as simply unmuting the difference track should be identical to a seamless switch in foobar2000's ABX component. So your method must be doing something different if it never produces an audible distortion, which by the way is perfectly expected, because the instant switch can introduce frequencies not present in either original track.
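
To illustrate where the extra frequencies come from, here is a small sketch in Python with NumPy using synthetic tones (all signals and the switch point are invented for the example; it says nothing about what Audacity or foobar2000 actually do). An instant switch leaves a step in the waveform whose size is the difference between the two tracks at that sample.

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate
a = np.sin(2 * np.pi * 100 * t)                    # stand-in for the original
b_far = 0.5 * np.sin(2 * np.pi * 100 * t + 1.0)    # a very different version
b_near = a + 1e-4 * np.cos(2 * np.pi * 3000 * t)   # a nearly identical version

def hard_switch(x, y, n):
    """Play x up to sample n, then y from sample n onwards, with no fade."""
    return np.concatenate([x[:n], y[n:]])

def max_step(x):
    """Largest jump between two neighbouring samples."""
    return np.max(np.abs(np.diff(x)))

n = 12345  # arbitrary switch point
step_clean = max_step(a)
step_far = max_step(hard_switch(a, b_far, n))
step_near = max_step(hard_switch(a, b_near, n))
```

`step_far` is many times larger than anything in the unswitched tone, which is the kind of discontinuity that is heard as a pop. `step_near` is barely above `step_clean`, so in this toy case the same instant switch leaves almost no step when the two versions are close to each other.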

Quote
I've also found that it's much easier to detect an increase in quality from unmuting the "difference" track than a decrease in quality from muting it.


You're doing this sighted, so your observation is completely and utterly unreliable. Please see TOS 8. Your brain is much more powerful than you think it is.

Reply #21
Alright listen, this guy http://www.youtube.com/watch?v=WmIAJeaKQys already did what you're suggesting. I also saw someone else that inverted the lossless signal to compare to a lossy one using Audacity, so umm nothing new man

Reply #22
I don't see an error signal being mixed in and out to switch back and forth in that video, so no, that is not what Jax184 is doing.

Reply #23
For a moment I was quite excited, but every time I switch tracks in Foobar's ABX during playback there's a very distinct popping sound. 1 in 3 times there's even a jitter. I have a hunch that Audacity is waiting for the waveform to cross the center before it applies the mute, since it always pulls it off without a pop.


It would be good to check this, as simply unmuting the difference track should be identical to a seamless switch in foobar2000's ABX component. So your method must be doing something different if it never produces an audible distortion, which by the way is perfectly expected, because the instant switch can introduce frequencies not present in either original track.

I don't know why, but in the dozen or so songs I've tested so far I've never heard an artificial pop or a click or anything of the sort when unmuting the difference track, but I do clearly hear a change in quality if switching between two files with sufficient differences. As I said, I'm assuming Audacity is waiting for a center cross before applying/removing the mute to avoid false frequencies. I'll see if I can dig up any information on it.


I've also found that it's much easier to detect an increase in quality from unmuting the "difference" track than a decrease in quality from muting it.


You're doing this sighted, so your observation is completely and utterly unreliable. Please see TOS 8.


So I'm not permitted to say I can't hear something?

Your brain is much more powerful than you think it is.


I am autistic, I was abused as a child, I went through a severe depression in my teens, and had to reassemble myself into a person after it was all said and done. That's none of your business and not relevant to this topic, but I want you to know that I'm very very much aware of how fallible our minds are. I know our perception is highly distorted. I know our thought processes are strange things. You don't need to tell me. I live with the proof every day. Please don't tell me what I think.

Back on topic. The fact that a test is sighted does not automatically discredit its results. Especially if the results are negative. I am not saying I have used a sighted test to find a difference which accepted wisdom says shouldn't exist. Quite the opposite! I thought there would be a small difference if a really detailed test could be set up, but I didn't find one. To borrow an audio term, I will fully agree that the signal to noise ratio of sighted audio tests is very low. But in this case the outcome was no signal at all. Even when I expected to find something I still found I could not. I think that's a legitimate result.

But all this ignores something of an important point. As I've said something like 3 times now, this test does not need to be sighted. Obviously I knew the answers to the first few runs as I was assembling the process. But after that there's no need for sightedness. I have already asked people to listen to files like the ones I linked to above, which switch sources at multiple points, and to report if they could detect a change in audio quality at any point. So far no one has been able to spot the transitions in the high quality samples, while the low quality ones have produced correct answers. To me, that suggests I'm on to something. But since I know how hard it is to be sure of these things, and because I know I'm not a professional audio engineer, I came here to ask for help. So can we please stop arguing over what I heard during the process of putting the method together and move on to what others hear with it under more controlled conditions?
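
One way the switch points in such a file could be chosen without the person preparing it picking them by hand is sketched below in Python with NumPy. The function name, the seed, the 15-second length and the 2-second minimum gap are all assumptions for illustration; the returned `points` act as the answer key and `muted` says where the difference track would be silenced.

```python
import numpy as np

def random_mute_schedule(n_samples, rate, n_switches, min_gap_s, seed):
    """Pick random switch points at least `min_gap_s` seconds apart and
    away from both ends; return them with the per-sample mute mask."""
    rng = np.random.default_rng(seed)
    min_gap = int(min_gap_s * rate)
    while True:  # redraw until the spacing rule is met
        points = np.sort(rng.integers(min_gap, n_samples - min_gap,
                                      size=n_switches))
        if np.all(np.diff(points) >= min_gap):
            break
    start_muted = bool(rng.integers(0, 2))  # random starting state
    muted = np.full(n_samples, start_muted)
    for p in points:
        muted[p:] = ~muted[p:]  # toggle from each switch point onwards
    return points, muted

rate = 44100
n_samples = 15 * rate
points, muted = random_mute_schedule(n_samples, rate, n_switches=3,
                                     min_gap_s=2.0, seed=2024)
```

Listeners would then report the times at which they think a change happened, and those reports get compared against `points` only after they have answered.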

 

Reply #24
I wonder if the smoothness of Audacity is really that it does a soft transition (ramp up/down) when switching the difference track in and out.  If it turns out the mute transition takes a significant fraction of a second, that's messing with your memory more than a slight click.
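
To make the idea of a soft transition concrete, here is a sketch of a ramped mute in Python with NumPy. Whether Audacity does anything like this is exactly the open question, so treat it purely as an illustration; `ramped_gain` and the 441-sample (10 ms at 44.1 kHz) ramp are made up for the example.

```python
import numpy as np

def ramped_gain(muted, ramp_len):
    """Turn an on/off mute mask into a gain curve that moves linearly
    between 1 (unmuted) and 0 (muted) over `ramp_len` samples."""
    target = 1.0 - muted.astype(float)
    # Pad with edge values so the smoothing does not dip at the ends.
    padded = np.pad(target, ramp_len, mode="edge")
    kernel = np.ones(ramp_len) / ramp_len  # moving average = linear ramp
    smooth = np.convolve(padded, kernel, mode="same")
    return smooth[ramp_len:-ramp_len]

muted = np.zeros(8000, dtype=bool)
muted[2000:6000] = True

gain = ramped_gain(muted, ramp_len=441)
# The difference track would then be scaled by `gain` before mixing.
```

The gain never changes by more than 1/441 per sample, so there is no step to click, but for those 441 samples the listener hears a blend of the two versions. A ramp this short is a long way from "a significant fraction of a second"; a much longer one would blur the moment of the switch.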

Can you make Audacity just switch between A and B tracks instead of muting one of them? It seems like while adding the difference file back in works, it is prone to error in execution which could make some encoders/settings appear less transparent.