A little VBR ~88 kbps ABX-test

Some background information

A couple of weeks ago we had a minor debate about the quality of modern lossy encoders at a Finnish AV forum. My plan was to give the forum users an opportunity to test their beliefs and hearing by providing a few test samples and instructions on how to use the foo_abx tool.

I prepared five samples of different genres from my collection. I thought the selected samples would be a bit above average in complexity: nothing like killer samples, but not too easy for the encoders. I encoded the samples at three quality levels with two encoders: Vorbis aoTuV b4.5 (-q 1.5, -q 4.25 and -q 6.25) and LAME 3.97b1 (-V8, -V5 and -V2; all --vbr-new). The idea was to explain something like this:
1. the lowest quality is useful e.g. with portables, but not actually "hifi" quality
2. the middle quality is very good and finding differences is not easy
3. the highest quality is transparent or almost transparent.
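
For anyone wanting to reproduce the encodes, the settings listed above could be scripted roughly as follows. This is only a sketch: it assumes `lame` and `oggenc` command-line binaries on the PATH (the post does not say which front ends were used), and the helper name and output file names are made up for illustration. It only builds the command lines and does not run anything.

```python
def encode_commands(wav, stem):
    """Build (but do not run) one command line per encoder and quality level."""
    commands = []
    # LAME 3.97b1: -V8, -V5 and -V2, all with --vbr-new
    for v in (8, 5, 2):
        commands.append(["lame", "--vbr-new", f"-V{v}", wav, f"{stem}_V{v}.mp3"])
    # Vorbis aoTuV b4.5: -q 1.5, -q 4.25 and -q 6.25
    for q in ("1.5", "4.25", "6.25"):
        commands.append(["oggenc", "-q", q, wav, "-o", f"{stem}_q{q}.ogg"])
    return commands


# "sample.wav" is a placeholder input name
for cmd in encode_commands("sample.wav", "sample"):
    print(" ".join(cmd))
```

Each list could be handed to `subprocess.run` once the binaries and paths are confirmed on the machine in question.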

My plan didn't work out. After ABX testing the lowest-quality samples I realized that my samples were going to be way too easy for the encoders at the higher quality levels. For example, one of the local audio gurus who accepts only lossless files tried to ABX one of the samples. He could ABX MP3 -V8 and didn't like it, but the Vorbis -q 1.5 sample made him almost angry because he couldn't ABX it. He used his high-end speakers instead of headphones, but this still says something about Vorbis quality. There was no point in continuing the test with the higher-quality samples.

I had no plans to publish any results here, but because of the ongoing debate about the 128 kbps test here is a nice example:


The test sample

hot_tequilla_brown.flac (genre: ~electronic/pop/funk (?), 21 s, 2.63 MB)

- This sample produces 91-96 kbps (my overall test target was ~88 kbps for this quality level).

The tested lossy files are available in this package: lossy_samples.zip (767 kB)
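
As a rough sanity check on the figures above, average bitrate and file size are tied together by the duration. The sketch below assumes 1 MB = 10^6 bytes, 1 kbit = 1000 bits and a duration of exactly 21 s (the post gives only rounded values), and it ignores container overhead. The function names are illustrative.

```python
def average_kbps(size_bytes, duration_s):
    """Average bitrate in kbit/s for a file of the given size and duration."""
    return size_bytes * 8 / duration_s / 1000


def size_kb(kbps, duration_s):
    """Approximate file size in kB at a given average bitrate."""
    return kbps * duration_s / 8


# FLAC original: 2.63 MB over 21 s
flac_kbps = average_kbps(2.63e6, 21)

# Lossy files at the approximate bitrates reported in this post
lossy_sizes = {
    name: size_kb(kbps, 21)
    for name, kbps in {"LAME -V8": 94, "Vorbis -q 1.5": 96, "WMA VBR25": 91}.items()
}

print(f"FLAC: about {flac_kbps:.0f} kbit/s")
for name, kb in lossy_sizes.items():
    print(f"{name}: about {kb:.0f} kB")
```

So each lossy file should be around a quarter of a megabyte, roughly a tenth of the lossless original.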


My ABX results


LAME 3.97 beta1 -V8 --vbr-new ~94 kbps
Code:
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/13 00:01:32

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\LAME 3.97 beta 1 -V8 --vbr-new\hot_tequilla_brown.mp3

00:01:35 : Test started.
00:02:04 : 01/01  50.0%
00:02:09 : 02/02  25.0%
00:02:17 : 03/03  12.5%
00:02:29 : 04/04  6.3%
00:02:35 : 05/05  3.1%
00:02:41 : 06/06  1.6%
00:02:51 : 07/07  0.8%
00:03:04 : 08/08  0.4%
00:03:08 : Test finished.

 ----------
Total: 8/8 (0.4%)
LAME was easy to ABX because of the obvious lowpass. I think -V8 is too low a setting for anything that contains high frequencies. (Though in general my high-frequency hearing is not excellent: I can't ABX a lowpass above 16 kHz.)

LAME at -V5 was much better with this sample. It sounded fine in casual listening, but I didn't ABX it.
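
A note on the percentage column in these reports: the values match a one-sided binomial test, i.e. the chance of scoring at least that many correct answers by pure guessing. The sketch below assumes this is what foo_abx computes (the numbers in the logs of this post agree with it); the function name is illustrative.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of at least `correct` right answers in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


# Final scores from the three reports in this post
for correct, trials in [(8, 8), (10, 11), (17, 20)]:
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.1%}")
```

By the same formula, the 20/20 result further down the thread is shown as 0.0% only because of rounding: the chance of guessing 20 out of 20 is 1 in 2^20, roughly 0.0001%.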


Vorbis aoTuV beta 4.5 -q 1.5 ~96 kbps
Code:
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/13 00:27:18

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\Vorbis aoTuV beta 4.5 -q1,5\hot_tequilla_brown.ogg

00:27:21 : Test started.
00:32:10 : 01/01  50.0%
00:32:50 : 02/02  25.0%
00:33:14 : 02/03  50.0%
00:33:27 : 03/04  31.3%
00:34:26 : 04/05  18.8%
00:35:00 : 05/06  10.9%
00:35:12 : 06/07  6.3%
00:35:25 : 07/08  3.5%
00:37:55 : 08/09  2.0%
00:38:09 : 09/10  1.1%
00:38:29 : 10/11  0.6%
00:38:48 : Test finished.

 ----------
Total: 10/11 (0.6%)
Very difficult to ABX. It took me 5 minutes to find a passage where I could possibly hear a difference. Vorbis was almost transparent with this sample. In casual listening I couldn't hear any problems.


Windows Media Audio

Today I tested WMA Standard with the same sample. I had never properly tried WMA at VBR quality 25.

WMA 9.1 Standard VBR25 ~91 kbps
Code:
foo_abx v1.2 report
foobar2000 v0.8.3
2005/11/28 17:38:04

File A: file://E:\test\Monkey's Audio 3.99 High\hot_tequilla_brown.ape
File B: file://E:\test\WMA9.1 STD VBR25\hot_tequilla_brown.wma

17:38:06 : Test started.
17:39:10 : 01/01  50.0%
17:39:42 : 02/02  25.0%
17:40:24 : 03/03  12.5%
17:40:42 : 04/04  6.3%
17:41:14 : 05/05  3.1%
17:42:03 : 06/06  1.6%
17:42:44 : 07/07  0.8%
17:43:14 : 07/08  3.5%
17:44:11 : 08/09  2.0%
17:47:26 : 08/10  5.5%
17:47:44 : 08/11  11.3%
17:49:04 : 09/12  7.3%
17:49:46 : 10/13  4.6%
17:50:37 : 11/14  2.9%
17:53:24 : 12/15  1.8%
17:54:03 : 13/16  1.1%
17:54:34 : 14/17  0.6%
17:54:54 : 15/18  0.4%
17:58:55 : 16/19  0.2%
17:59:53 : 17/20  0.1%
18:00:01 : Test finished.

 ----------
Total: 17/20 (0.1%)
This was not easy, but I could hear the difference in a certain passage, mostly because of the slight lowpass and perhaps a slightly narrower stereo width. But after the first seven tries my ears got tired and I had difficulty ABXing. I wanted to be sure and continued through 20 trials. It took over 20 minutes. In general I couldn't hear any obvious problems. (I didn't expect VBR25 to be this good. Am I going deaf?) Since I tested Vorbis two weeks ago, I cannot directly compare the two codecs.


The test gear used: Terratec DMX 6fire 24/96 soundcard, Harman/Kardon AVI 200 MKII amp, KOSS HV/1A headphones.

Edit: typo
Edit 2: changed the lossless sample to FLAC format
Edit 3: added the lossy samples

EDIT 4:
I removed the samples to make room. PM me if you'd like to try them.

Reply #1
Nice hint.
I second it, and so I repeat one of my suggestions for the multiformat test:

Lower the average target bitrate for music of various genres from 128k down to 10x or even 9x kbit/s. Testers will still have hard work ahead, considering the number of samples and formats.

The idea is fairly logical: encoders have made progress over the years, so the magic 128k area for "CD quality" is lower these days, especially for Joe Average.

Thinking further: when testing at these bitrates, which are suitable for portables, we should collect comparable results not only for our HA formats (Ogg Vorbis aoTuV, AAC in those 2 variants, MP3 LAME, and MPC, which was tested in the past but should also be tested against the modern encoders), but also for WMA and maybe WMA Pro. WMA Pro would have lower priority, because it doesn't have much hardware support yet and is Windows-only anyway. Or is there Mac or Linux support?
Polls can be taken as support for decisions, but I don't think polling makes sense for selecting contenders for listening tests. As we are under no time pressure, we should set up the tests so that there is logic in them. And we should think more in the long term, as we can already see something of the end of lossy format development (in terms of squeezing even more quality out of a given stereo bitrate). So tests should include something that lets results for modern encoders be compared with older tests (of previous encoders of the same format); otherwise it is even harder to say anything about the development of a format.

Reply #2
OK, finally! It's pretty hard to find an ABX comparator for Mac... But now that I have a Perl script running, I'm very confident that I can hear a difference with the Vorbis file too (8/8). It's pretty good though, and I'd say "perceptible but not annoying"...

Code:
a  Playing file A...
b  Playing file B...
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 1/1
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 2/2
x  ********Choosing/Playing an X...
b  Playing file B...
a  Playing file A...
b  Playing file B...
x  Playing the X again...
A  Vote for A logged... currently 3/3
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 4/4
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 5/5
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 6/6
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 7/7
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 8/8

Reply #3
I came here from the 128 kbit pre-test discussion, so it was of some interest to test it at -q4 also. This level is very close to the original, and I'd rate it very close to "imperceptible". ABX: 14/16

Code:
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 1/1
x  ********Choosing/Playing an X...
x  Playing the X again...
A  Vote for A logged... currently 2/2
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 3/3
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 4/4
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 5/5
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 6/6
x  ********Choosing/Playing an X...
x  Playing the X again...
b  Playing file B...
XUnkown key 'X'.
B  Vote for B logged... currently 7/7
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 8/8
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 9/9
x  ********Choosing/Playing an X...
B  Vote for B logged... currently 10/10
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 11/11
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 11/12
x  ********Choosing/Playing an X...
b  Playing file B...
x  Playing the X again...
A  Vote for A logged... currently 12/13
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 13/14
x  ********Choosing/Playing an X...
x  Playing the X again...
B  Vote for B logged... currently 14/15
x  ********Choosing/Playing an X...
A  Vote for A logged... currently 14/16
All done! ABX results: 14/16

Reply #4
Quote
But after the first seven tries my ears got tired and I had difficulty ABXing. I wanted to be sure and continued through 20 trials.
Just a note, ABX trials are only statistically valid if you either say "I will do x trials" before the ABX test and stick to it or hide the results until the end of the test.
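
To illustrate why this matters, here is a small sketch that computes exactly how often a listener who is purely guessing would end up with a "significant" result. It assumes a hypothetical listener who watches the running p-value and stops as soon as it drops to 5% or below, and compares that with a listener who commits to 20 trials in advance. Nobody in this thread is claimed to have done exactly that; the function names and the 5% threshold are assumptions for illustration.

```python
from math import comb


def p_value(correct, trials):
    """Chance of at least `correct` right answers in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


def false_positive_rate(max_trials, alpha=0.05, peek=True):
    """Exact chance that a pure guesser ends up with a p-value <= alpha.

    peek=True: the listener checks the running p-value after every trial and
    stops as soon as it is <= alpha. peek=False: only the final result counts.
    """
    alive = {0: 1.0}  # correct answers so far -> probability, test still running
    passed = 0.0
    for n in range(1, max_trials + 1):
        step = {}
        for correct, prob in alive.items():
            for gain in (0, 1):  # a guesser is right or wrong with chance 1/2
                step[correct + gain] = step.get(correct + gain, 0.0) + prob / 2
        alive = {}
        for correct, prob in step.items():
            if (peek or n == max_trials) and p_value(correct, n) <= alpha:
                passed += prob  # stops here and reports a "significant" result
            else:
                alive[correct] = prob
    return passed


fixed = false_positive_rate(20, peek=False)
peeking = false_positive_rate(20, peek=True)
print(f"fixed 20 trials: {fixed:.1%}")
print(f"stop when it looks good: {peeking:.1%}")
```

With 20 trials fixed in advance, a guesser needs 15/20 or better and gets there only about 2% of the time. With peeking, the chance of hitting the threshold at some point comes out above the nominal 5%, which is the reason for fixing the trial count or hiding the running score.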

Reply #5
I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

So I went ahead and tested a random sample of my own with Lancer 20051121, which includes the current aotuv tunings at -q 1.5. It sounded much better than expected (with speakers at least).

ABX with headphones was easy due to the overall "mushyness" and lack of highs: 8/8.
The sample in question was an excerpt from "Jake Walton - Seven Gurdies".

ABXing the hot_tequilla_brown sample was a little more difficult: 8/8. I focused on the lowpass mostly. But amazing quality nevertheless. I don't think I would have been able to do it with speakers. I also expected pre-echo to be much worse.

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.

Reply #6
Quote
Just a note, ABX trials are only statistically valid if you either say "I will do x trials" before the ABX test and stick to it or hide the results until the end of the test.


You are right. It would have been better to stop the test at 10 and start a new test after a rest break, to see whether the changed physical condition and the practice make a difference.

In my defense, I'd like to add that I wasn't in the best shape for testing. I wanted to include WMA in my report, but I was tired after a long workday and found it difficult to concentrate. Too bad for science that hearing is not an absolute thing; it changes from time to time. In this case it changed during the test. At first I was sure about the difference and the results agreed. Then I lost my concentration and started guessing. After a break and a cup of coffee (illegal doping) I could hear the difference again.

Perhaps the test was meaningful only to me. I think I proved (to myself) that I can hear the difference, and that the difference is small enough to be hard to hear when I am not in the best possible shape.

Edit: typo

Reply #7
Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.

Reply #8
Quote
I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

So I went ahead and tested a random sample of my own with Lancer 20051121, which includes the current aotuv tunings at -q 1.5. It sounded much better than expected (with speakers at least).

ABX with headphones was easy due to the overall "mushyness" and lack of highs: 8/8.
The sample in question was an excerpt from "Jake Walton - Seven Gurdies".

ABXing the hot_tequilla_brown sample was a little more difficult: 8/8. I focused on the lowpass mostly. But amazing quality nevertheless. I don't think I would have been able to do it with speakers. I also expected pre-echo to be much worse.

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.


Thanks.

Would you mind trying the WMA sample?

I guess I should not have mentioned the lowpass. When you start listening specifically for it, you kind of lose the complete picture. A slight lowpass is often masked by other elements and difficult to notice.

Reply #9
Quote
Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.

Now it's my turn to be picky. Have you ABXed Vorbis aoTuV b4.5 or 4.51 at -q 2?

Reply #10
Quote
I'm still amazed that there is such a huge difference between the average joe and people with some artifact training. My normal reaction would be: 1.5? That's horrible quality! No need to ABX something like that.

[...]

I guess this is a testament to aoyumi's fine work and a reminder why we need to back up our claims, which we otherwise would take for granted, with blind tests.


Exactly my thoughts too, before and after.

Reply #11
Quote
Thanks.

Would you mind trying the WMA sample?

I guess I should not have mentioned the lowpass. When you start listening specifically for it, you kind of lose the complete picture. A slight lowpass is often masked by other elements and difficult to notice.

I didn't read all your comments about the individual codecs. But you are right: if the original were already lowpassed, things would get a lot more difficult.

WMA sounds just like I would expect such a low-bitrate encoding to sound (contrary to the good performance of Vorbis). Strong lowpass, and what remained of the highs was metallic and squishy. Fastest ABX ever: 11 seconds to get 8/8.

MP3 was even worse: a lower lowpass than WMA, with some pre-echo. 8/8

On the ABC/HR scale I would rank Ogg around 4 (perceptible, but not annoying). WMA and MP3 would get something < 2 (annoying), with MP3 being a little lower.
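
For reference, the grades mentioned above follow the five-grade impairment scale from ITU-R BS.1116, which ABC/HR ratings use. A small lookup sketch follows; the rounding rule and the example scores are assumptions made for illustration, not values reported in this thread.

```python
# Five-grade impairment scale (ITU-R BS.1116), as used for ABC/HR ratings
IMPAIRMENT_SCALE = {
    5: "Imperceptible",
    4: "Perceptible, but not annoying",
    3: "Slightly annoying",
    2: "Annoying",
    1: "Very annoying",
}


def describe(score):
    """Return the label of the nearest whole grade (assumed rounding rule)."""
    nearest = min(5, max(1, int(score + 0.5)))
    return IMPAIRMENT_SCALE[nearest]


# Hypothetical scores in the spirit of the ratings above
for codec, score in [("Vorbis", 4.0), ("WMA", 1.9), ("MP3", 1.7)]:
    print(f"{codec}: {score} -> {describe(score)}")
```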

Reply #12
Gecko,

What equipment did you use?

I am afraid that my HW chain may produce distortion in headphone listening.

Reply #13
Quote
Quote
Vorbis does sound really good at lower bitrates, but the HF boost / coarse sound is still there - Q4 is fixed, but at Q2 it's definitely there most of the time.

Now it's my turn to be picky. Have you ABXed Vorbis aoTuV b4.5 or 4.51 at -q 2?


I did with aoTuV 4. Haven't tried 4.5.

Reply #14
Quote
Gecko,

What equipment did you use?

I am afraid that my HW chain may produce distortion in headphone listening.

Foobar -> Terratec Aureon Sky (with Prodigy drivers) -> Beyerdynamic DT 880

Reply #15
I guess I am in better shape now.

This time I tried to find out if Vorbis is different from WMA. Before starting the test I decided to do 20 tries.

Vorbis -q 1.5 vs WMA VBR25
Code:
foo_abx v1.2 report
foobar2000 v0.8.3
2005/12/01 16:22:24

File A: file://E:\test\hot_tequilla_brown.ogg
File B: file://E:\test\hot_tequilla_brown.wma

16:22:25 : Test started.
16:23:02 : 01/01  50.0%
16:23:18 : 02/02  25.0%
16:23:27 : 03/03  12.5%
16:23:35 : 04/04  6.3%
16:23:42 : 05/05  3.1%
16:23:50 : 06/06  1.6%
16:24:00 : 07/07  0.8%
16:24:14 : 08/08  0.4%
16:24:20 : 09/09  0.2%
16:24:26 : 10/10  0.1%
16:24:30 : 11/11  0.0%
16:24:35 : 12/12  0.0%
16:24:41 : 13/13  0.0%
16:24:48 : 14/14  0.0%
16:24:58 : 15/15  0.0%
16:25:03 : 16/16  0.0%
16:25:11 : 17/17  0.0%
16:25:16 : 18/18  0.0%
16:25:21 : 19/19  0.0%
16:25:28 : 20/20  0.0%
16:25:29 : Test finished.

 ----------
Total: 20/20 (0.0%)

No problems ABXing. Vorbis was clearly better. I can confirm Gecko's findings.

Also, I didn't hear any "headphone distortion" that would have masked artifacts. I think my gear is fine after all. The problem I had with ABXing WMA was caused by other factors.


Edit: codebox


Reply #17
Incredible!

I never thought I'd be unable to hear the difference at such a low bitrate;  I'm thinking about reencoding my whole collection to ogg now -- maybe at -q 4 or so.  Are there any problem samples I could check with, to see if I hear a difference at that type of bitrate?

It was easy to tell for mp3 and wma, though. 19/20 and 12/12

 

Reply #18
It's an amazing sample. Even iTunes v6.0.1.3 at 128 kbit/s VBR, 136 kbit/s (approx. 155 kbit/s real bitrate for this sample) wasn't transparent. Well, maybe it's transparent, but I could hear artefacts and noise on human voices in both channels during the ABX test.

Code:
ABC/HR Version 1.1 beta 2, 18 June 2004
Testname: itunes vbr 136   vs original .      Tequila

1R = C:\TEST48\15_tequila\itunes 128vbr 136.wav


1 of   1, p = 0.500
 2 of   2, p = 0.250
 3 of   3, p = 0.125
 4 of   4, p = 0.063
 4 of   5, p = 0.188
 5 of   6, p = 0.109
 6 of   7, p = 0.063
 7 of   8, p = 0.035
 8 of   9, p = 0.020
 9 of  10, p = 0.011
10 of  11, p = 0.006
11 of  12, p = 0.003
11 of  13, p = 0.011
12 of  14, p = 0.006
13 of  15, p = 0.004
14 of  16, p = 0.002
15 of  17, p = 0.001
16 of  18, p < 0.001
FINISHED

---------------------------------------
General Comments:

---------------------------------------
ABX Results:
Original vs C:\TEST48\15_tequila\itunes 128vbr 136.wav
   16 out of 18, pval < 0.001