Topic: Randomizing file names

Randomizing file names

As you may know, comparing various lossy files (by ABXing each of them against the lossless file or something else) can be subject to some bias, if you know which of the lossy files you're testing.
You might have a preference for one lossy format or encoder and so you might (subconsciously) do the tests differently.

There's a way to eliminate this potential bias, by randomly renaming the lossy files.
Of course, at the end, you also need to know which file is which.
For this purpose, I found a simple script (Windows) that does just that: http://www.howtogeek.com/57661/stupid-geek...in-a-directory/
Put some files into the folder with the script, run the script and you will get: renamed files + a txt file that tells you which file is which, so you can check when you're finished.
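
In case the link dies, here's a minimal Python sketch of the same idea (my own rough equivalent, not the howtogeek script itself): shuffle the files, give them neutral names, and write a key.txt so you can decode the results when you're finished.

Code:
import os
import random

# Rename every audio file in the current folder to a neutral name.
# Assumes no file is already named sample_NN.<ext>; works best when
# everything has been decoded to WAV first, so the extensions don't
# give the format away.
AUDIO_EXTS = {".wav", ".flac", ".mp3", ".ogg", ".m4a"}

files = [f for f in os.listdir(".")
         if os.path.isfile(f) and os.path.splitext(f)[1].lower() in AUDIO_EXTS]
random.shuffle(files)

with open("key.txt", "w") as key:
    for i, name in enumerate(files, start=1):
        ext = os.path.splitext(name)[1]
        new_name = f"sample_{i:02d}{ext}"
        os.rename(name, new_name)
        key.write(f"{new_name} -> {name}\n")  # consult only after testing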

To then use those files, it helps if they're of the same size/duration/metadata (easiest with plain WAV, where equal duration means equal size), so that Explorer (or whatever file manager you're using) doesn't give you any hints.
Even with different file sizes, you can select the icon view in Explorer, so that unless you hover over a file for a second or two, you won't see the details. You can then Ctrl+A on the files, unselect the ones you don't need and add the selected ones to Foobar without getting any extra information. You'll see what I mean when you try it in practice.


I found this an effective way to randomize files for blind testing on your own.
But if there's an easier way, let me know.

There's at least one thing that would improve this, though: copying the randomized files to the clipboard. This would eliminate the need to select the files carefully, since they could simply be pasted into Foobar. Is there a way to add this to the script?
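
Untested, but one possible route on Windows is to shell out to PowerShell's Set-Clipboard, which (in the built-in Windows PowerShell 5.1, if I remember right; newer PowerShell releases dropped the parameter) accepts -Path to put actual files on the clipboard as a drop list:

Code:
import glob
import subprocess

# Put the renamed files on the clipboard as a file-drop list so they can
# be pasted straight into foobar2000. Relies on Windows PowerShell 5.1's
# Set-Clipboard -Path; assumes no single quotes in the file names.
files = glob.glob("sample_*")
arg = ",".join(f"'{f}'" for f in files)
subprocess.run(["powershell", "-NoProfile", "-Command",
                f"Set-Clipboard -Path {arg}"], check=True)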

Randomizing file names

Reply #1
I thought ABXing meant you had no idea which file was which anyway?


Randomizing file names

Reply #3
The goal here is to compare a variety of different kinds of lossy encoding against the original, one at a time. Each time, the original is one of the two in the comparison. ABX software prevents one from knowing which is the original, which is the encoding.

HOWEVER, ABX software does not prevent one from knowing which lossy encoding was selected for the test each time. This latter problem is the one the OP wishes to overcome and has attacked with random file renaming.


Randomizing file names

Reply #5
As AndyH-ha explained, this is for doing several separate ABX tests.

Just a quick practical example: you want to determine which is (more) transparent at 150 kbps, MP3 or Vorbis. So you do an ABX test with the MP3 against the lossless and then another ABX test with the Vorbis against the lossless.
If you know whether you're ABXing MP3 or Vorbis, you could be (subconsciously) biased and perform the tests differently. Randomly renaming the files first prevents you from knowing this.
(It would make sense to decode the MP3 and Vorbis files to WAV before randomizing, so the extensions and sizes don't give the formats away; a rough sketch of that step follows.)
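
Something like this, assuming ffmpeg is installed and on the PATH (the input names are hypothetical):

Code:
import subprocess

# Decode each lossy file to WAV so extension, size and tags no longer
# betray the source format. The WAV names still encode the source, so
# key.txt can identify them after the random rename.
for src, dst in [("encode_mp3.mp3", "encode_mp3.wav"),
                 ("encode_vorbis.ogg", "encode_vorbis.wav")]:
    subprocess.run(["ffmpeg", "-y", "-i", src, dst], check=True)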


Randomly renaming the files could also be used for simply playing back the files on their own (no ABX) to subjectively evaluate which file 'sounds better'.
ABC/HR mentioned above is more sophisticated for this kind of task, but it requires a dedicated tool or an external tester.

 

Randomizing file names

Reply #6
To me the idea of being more transparent is like the idea of being more pregnant. Either something is or it isn't.  If you're trying to rank something then ABX is the wrong tool.

Randomizing file names

Reply #7
Let me put it this way, then: some non-transparencies (or pregnancies) are easier to notice than others. In this sense, I think ABXing can be used for ranking.
And if in the process of separate ABXing of two lossy samples you notice that one of them is transparent and the other isn't, you have a clear winner anyway.

Randomizing file names

Reply #8
To me the idea of being more transparent is like the idea of being more pregnant. Either something is or it isn't.  If you're trying to rank something then ABX is the wrong tool.


More visibly pregnant, I guess? Problem is, the original A may be indistinguishable from B may be indistinguishable from C, yet C might not be transparent. “ABX and ACX” could be helpful then.

Randomizing file names

Reply #9
More visibly pregnant, I guess? Problem is, the original A may be indistinguishable from B may be indistinguishable from C, yet C might not be transparent. “ABX and ACX” could be helpful then.

In this case would you say that B is "more transparent" than C (which simply is not)?
Or, to put it another way: if an encoding of a track at quality level x is proven to be transparent, is an encoding at level x + Δx (same track, codec and encoder) more transparent?

Randomizing file names

Reply #10
Has fewer audible differences? Fewer “intransparencies”? (Sounds like a euphemism from politically correct American speak ...)

Point: If you can tell A from C, but not A from B nor B from C, that indicates an ordering of ABC or CBA.

Randomizing file names

Reply #11
Originally, what Porcus said sounded different from the situation where C is transparent to A but can be distinguished from B, which is also transparent to A (assuming both are lossy and A is the lossless source). Either way, it is solved by performing ABX on one lossy codec at a time, which still doesn't do any better at ranking what is only a binary state.

Why not just use the correct tool which automatically does all the heavy-lifting for you?  It isn't like one doesn't exist.

Randomizing file names

Reply #12
At the risk of repeating myself or stating the obvious, I think we're talking about two different tasks here:

1. which sample is harder or impossible to ABX (aka "sounds more like the lossless")
and
2. which sample sounds better (even without the lossless to compare it against)

The highest ranking sample in an ABC/HR test might not be the hardest to ABX. Yet, the latter might be what one wants.

This is why I think ABC/HR is mostly used for lower bitrates, where audible artifacts are noticeable or at least expected, while ABXing is used when testing for transparency.

Randomizing file names

Reply #13
Lol, do I have to risk repeating myself too?

You may rank any way you want, just as you can load any samples you like. The reason high bitrates are avoided has to do with getting useful information out of public tests, a concern that would apply just as well to ABX-style tests.

I'd also like to know if anyone here has ever personally been encumbered by knowing which codecs were being tested while performing an ABX test.

Randomizing file names

Reply #14
Frankly, I don't know what your point is.

Is "sample A was easy to ABX, while sample B was difficult/impossible to ABX" not useful information?


I'd also like to know if anyone here has ever personally been encumbered by knowing which codecs were being tested while performing an ABX test.

Oh, come on. We're on HA, sub-forum Listening Tests.
I would expect us all to be familiar with the placebo effect and with taking proper precautions to avoid any possible bias.

Randomizing file names

Reply #15
To the best of my recollection I know of no issues raised about being biased based on knowing what was being tested...until now.  ABC/HR and ABX already take care of that as you were already told.  You were also either directly or indirectly told that ABX should not be lossy vs. lossy by at least two of us.

My point is that your needs can be met with ABC/HR better than with ABX.  This is even taking each and every one of your "concerns" into account.  This includes one being less transparent than another.

I personally don't see any problems with your trying to reinvent the wheel other than you are trying to reinvent the wheel.

If it makes any difference, I've tested near-transparent (read: transparent to untrained ears) codecs professionally. The style of test used: MUSHRA, not ABX.  IME, the best way to bias/unbias/train/inform yourself is to work on one codec at a time, varying bitrates on various samples. Research the codecs to find their weaknesses in order to choose the best samples to home in most efficiently. Otherwise, let ignorance be bliss.

Randomizing file names

Reply #16
Alright, got it. I don't think it's reinventing the wheel; it's not meant to replace ABC/HR, MUSHRA or anything else. It's a specific method for specific tests and results. Whether you find it of any use or prefer another method is up to you.

For the record, though: I never suggested ABXing lossy vs. lossy, so I don't know why you'd mention that.
And the problem of expectation bias in ABX tests (when you know which encoder/bitrate you're ABXing) has been brought up before here on HA; it's not my original idea.

Randomizing file names

Reply #17
1) I didn't raise the issue of the problems of lossy vs. lossy.

2) I don't view concerns about sample bias at transparent or near-transparent bitrates with any real seriousness.  I'll believe people do better in an ABX test due to bias when I see it, and I have a hard time seeing the possibility of doing worse as a bad thing, other than it's reminiscent of a fairly common placebophile argument.  I also don't think it's unreasonable to dismiss as trivial concerns about bias against codecs known to give sub-par results at settings that are obviously not transparent, when they are tested alongside more modern codecs that are generally considered to do well in comparison (e.g. 96 kbit MP3 vs. 96 kbit AAC). Biases against specific modern codecs in favor of other modern codecs? Use MUSHRA.

3) "Ease" (or lack thereof) of ABX-ing could easily be due to fatigue, or correctly guessing. For reasons like these, P-values are not going to be a reasonable gauge of definitive quality, unless they demonstrate the ability to distinguish one lossy codec but not for another consistently.  I will concede that one codec can be easier to test compared to another, with both giving passing results, in which case the time it took could be used as an indicator for someone else reviewing the results. But why not rank one 5 or 10 points higher (on a 100-point scale) in a MUSHRA test???

If this doesn't move you then we will have to agree to disagree.  From here on out my comments are for the benefit of others who might not know that there are already perfectly good methods to accomplish your end goal as you stated it.  So long as I feel I can contribute useful information to the discussion, I will continue to do so.