Hello,
I've tested mp3 encoders today and I'd like to share my results.
I haven't done spectrum analysis, only ABX listening tests.
Programs used for encoding:
1. dbpoweramp 14.2 (with fraunhofer 4.0.3-latest)
2. iTunes 10.5.2.11 (uses fraunhofer mp3)
3. Easy CD-DA 15 (uses some version of lame 3.98)
4. foobar 1.1.8 (with lame3.99-20111018)
5. Nero 11 (uses fraunhofer mp3 encoder)
*note: even though several of these are fraunhofer encoders, they are not exactly the same. Also, I've tested only 2 tracks of the hard/glam rock genre (band name: Extreme). I chose that music because it has nice rhythm, drums and guitars, has nice effects on vocals and also contains samples of acoustic guitars and harmonics. I didn't choose those tracks because of their names or their lyrics. I also wanted to test WMP12's mp3 encoder, but I didn't find a way to encode a flac file to mp3 with it. I've also found that the flac decoder filter for WMP is of lower quality than in other players (e.g. winamp, foobar).
The settings used: CBR 320kbps Joint Stereo. In iTunes 10 I've checked "filter frequencies below 10Hz" & "smart encoding adjustments".
Filesize results:
            Extreme1    Extreme2
Flac        26.2MB      27.4MB
Lame3.99    8.68MB      9.34MB
CD-DA       8.69MB      9.35MB
Nero11      8.68MB      9.34MB
iTunes10    8.68MB      9.34MB
dbPower     8.67MB      9.33MB
*note: iTunes tends to add 1-2 seconds of silence at the beginning and/or end of the track
Winner: dbPoweramp with fraunhofer
Encoding speed results: The differences in speed were minimal. The fastest was iTunes, dbPoweramp was the same or almost the same as iTunes, while the other encoders were a few seconds slower.
Winners: iTunes & dbPoweramp (both use fraunhofer)
ABX Listening Results:
1. Easy CD-DA
Extreme1
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:10:28
File A: D:\Hard rock\Extreme\1989 - Extreme\- 01 Little Girls.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\cd-da\- 01 Little Girls.mp3
17:10:28 : Test started.
17:11:29 : 01/01 50.0%
17:12:12 : 02/02 25.0%
17:12:53 : 02/03 50.0%
17:13:26 : 02/04 68.8%
17:14:00 : 02/05 81.3%
17:14:05 : Test finished.
----------
Total: 2/5 (81.3%)
Extreme2
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:15:00
File A: D:\Hard rock\Extreme\1989 - Extreme\- 03 Kid Ego.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\cd-da\- 03 Kid Ego.mp3
17:15:00 : Test started.
17:16:37 : 01/01 50.0%
17:16:49 : 02/02 25.0%
17:17:00 : 02/03 50.0%
17:17:15 : 03/04 31.3%
17:17:34 : 03/05 50.0%
17:18:42 : 03/06 65.6%
17:18:46 : Test finished.
----------
Total: 3/6 (65.6%)
Description: This is my first ever ABX test, so I think of it as a warm-up. All I remember is that the difference was noticeable and probably similar to lame3.99.
2. lame3.99-20111018
Extreme1
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:26:48
File A: D:\Hard rock\Extreme\1989 - Extreme\- 01 Little Girls.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\lame3.99-20111018\- 01 Little Girls.mp3
17:26:48 : Test started.
17:27:16 : 01/01 50.0%
17:27:36 : 02/02 25.0%
17:27:54 : 03/03 12.5%
17:28:12 : 04/04 6.3%
17:28:31 : 04/05 18.8%
17:28:33 : Test finished.
----------
Total: 4/5 (18.8%)
Extreme2
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:29:28
File A: D:\Hard rock\Extreme\1989 - Extreme\- 03 Kid Ego.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\lame3.99-20111018\- 03 Kid Ego.mp3
17:29:28 : Test started.
17:29:51 : 01/01 50.0%
17:30:09 : 02/02 25.0%
17:30:23 : 03/03 12.5%
17:30:36 : 04/04 6.3%
17:30:49 : 05/05 3.1%
17:30:50 : Test finished.
----------
Total: 5/5 (3.1%)
Description: The difference was very noticeable, especially on the first acoustic part (it's in the first 15-20 secs of the song) of the Extreme1 track and also on the vocals that had effects on the Extreme2 track. Absolutely the worst choice for compressing music to mp3.
3. Nero 11 (fraunhofer)
Extreme1
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:19:26
File A: D:\Hard rock\Extreme\1989 - Extreme\- 01 Little Girls.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\nero\- 01 Little Girls.mp3
17:19:26 : Test started.
17:20:23 : 01/01 50.0%
17:20:57 : 01/02 75.0%
17:21:31 : 01/03 87.5%
17:22:57 : 02/04 68.8%
17:23:31 : 02/05 81.3%
17:23:33 : Test finished.
----------
Total: 2/5 (81.3%)
Extreme2
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:24:15
File A: D:\Hard rock\Extreme\1989 - Extreme\- 03 Kid Ego.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\nero\- 03 Kid Ego.mp3
17:24:15 : Test started.
17:24:37 : 01/01 50.0%
17:24:57 : 01/02 75.0%
17:25:18 : 02/03 50.0%
17:25:36 : 03/04 31.3%
17:25:56 : 04/05 18.8%
17:26:14 : 04/06 34.4%
17:26:17 : Test finished.
----------
Total: 4/6 (34.4%)
Description: Nero 11's fraunhofer encoder is a better choice than the lame versions used for testing. There was very little noticeable difference between the flacs and the mp3s encoded with Nero 11.
4. iTunes 10.5.2 (fraunhofer)
Extreme1
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:31:31
File A: D:\Hard rock\Extreme\1989 - Extreme\- 01 Little Girls.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\itunes\- 01 Little Girls.mp3
17:31:31 : Test started.
17:31:54 : 00/01 100.0%
17:32:13 : 01/02 75.0%
17:32:29 : 02/03 50.0%
17:32:47 : 02/04 68.8%
17:33:01 : 03/05 50.0%
17:33:03 : Test finished.
----------
Total: 3/5 (50.0%)
Extreme2
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:33:34
File A: D:\Hard rock\Extreme\1989 - Extreme\- 03 Kid Ego.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\itunes\- 03 Kid Ego.mp3
17:33:34 : Test started.
17:34:04 : 00/01 100.0%
17:34:16 : 01/02 75.0%
17:34:36 : 02/03 50.0%
17:34:49 : 03/04 31.3%
17:35:11 : 04/05 18.8%
17:35:14 : Test finished.
----------
Total: 4/5 (18.8%)
Description: The iTunes mp3 encoder is a very nice choice for encoding music. Although it's not the absolute best, it's certainly better than lame. In the tests I was also able to hear a minimal difference between the flacs and the mp3s encoded with iTunes.
5. dbPowerAmp (with fraunhofer)
Extreme1
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:35:45
File A: D:\Hard rock\Extreme\1989 - Extreme\- 01 Little Girls.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\dbpoweramp\- 01 Little Girls.mp3
17:35:45 : Test started.
17:36:12 : 00/01 100.0%
17:36:34 : 00/02 100.0%
17:36:56 : 00/03 100.0%
17:37:23 : 00/04 100.0%
17:37:44 : 00/05 100.0%
17:37:46 : Test finished.
----------
Total: 0/5 (100.0%)
Extreme2
foo_abx 1.3.4 report
foobar2000 v1.1.8
2012/01/11 17:38:19
File A: D:\Hard rock\Extreme\1989 - Extreme\- 03 Kid Ego.flac
File B: D:\Hard rock\Extreme\1989 - Extreme\dbpoweramp\- 03 Kid Ego.mp3
17:38:19 : Test started.
17:38:36 : 00/01 100.0%
17:38:50 : 01/02 75.0%
17:39:04 : 01/03 87.5%
17:39:18 : 02/04 68.8%
17:39:35 : 03/05 50.0%
17:39:36 : Test finished.
----------
Total: 3/5 (50.0%)
Description: dbPowerAmp's fraunhofer encoder is the absolute winner. I wasn't able to hear any difference between the flacs and the mp3s encoded with dbPowerAmp's fraunhofer encoder. It's definitely an "ears bit perfect" encoder.
*note about ABX testing: on the first track (Little Girls) I was able to hear the difference with 4 of the encoders on the acoustic guitar at the beginning of the song, somewhere in the first 15-20 seconds. On the 2nd track (Kid Ego) I was able to hear the difference with 4 of the encoders at the beginning of the song, when the vocals with effects appeared. During the main part of the song, with the electric guitars, all encoders sounded identical or almost identical.
For the test I've used: Core 2 Quad 2.5GHz, Audigy Platinum eX, audio system: Panasonic SA-AK27 2x100W with superwoofer (it's 8 yrs old but it has quite good quality for home usage).
I've used Foobar 1.1.8 with the ABX component.
Conclusion: dbPowerAmp with fraunhofer makes the smallest mp3s with the best quality. I couldn't hear any difference between its mp3s and the flacs. It's also the fastest, or next to iTunes. iTunes is also excellent, although not "ears bit perfect" like dbPowerAmp. Nero is fine too, while lame, even at the highest quality settings, provides lower quality than the other mp3 encoders.
Thanks for reading.
I hate to say it but your testing was flawed. 5-6 trials for a single song isn't nearly enough. You should conduct the test and make at least 10 determinations as to which song is which. 5 guesses is far too few, as the results can be skewed. Increasing the sample number (i.e. how many times you pick which song is which) is necessary. People gave you links in your original post, where you did absolutely no testing, but it looks like you didn't fully read them.
If I do 10-15 checks on every mp3 it'll take me a week of testing. Btw it makes me tired and I think the results will be more inaccurate. 5-6 checks is pretty fine.
That, and most of those percentages are so high as to actually undermine your claims, not prove them.
Had you read the information on ABX tests properly (at all?), you would know that a percentage of 5% or less (a.k.a. p-value < 0.05) is the minimum that is considered statistically significant, i.e. indicative of an audible difference rather than blind guessing, the latter of which might as well have been the case in most of your tests.
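For anyone who wants to check the percentages in the logs by hand, here is a minimal sketch. It assumes the figure foo_abx prints is the one-sided binomial probability of scoring at least that many correct by pure guessing, which is consistent with every log line posted above. The function name `abx_p_value` is made up for this example.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    # Count every outcome with `correct` or more hits, out of 2**trials outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Final scores copied from the logs earlier in the thread.
scores = {
    "Easy CD-DA, Extreme1": (2, 5),
    "lame3.99, Extreme1": (4, 5),
    "lame3.99, Extreme2": (5, 5),
    "dbPowerAmp, Extreme1": (0, 5),
}

for name, (correct, trials) in scores.items():
    p = abx_p_value(correct, trials)
    verdict = "below 0.05" if p < 0.05 else "could be guessing"
    print(f"{name}: {correct}/{trials} -> p = {p:.4f} ({verdict})")
```

Of the ten logs posted, only the 5/5 run gets under 0.05, and even that one only just.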
I think the results will be more inaccurate.
Proving that you cannot actually hear a difference, you mean? Good attempt at redefinition, though, I suppose.
5-6 checks is pretty fine.
No (http://www.hydrogenaudio.org/forums/index.php?showtopic=16295), it’s not (http://wiki.hydrogenaudio.org/index.php?title=ABX).
Well, if you hear a difference in the sound once and then you repeat that 3 times and hear the same difference, that's pretty fine. I did it 5-6 times.
5-6 checks is pretty fine.
17:35:45 : Test started.
17:36:12 : 00/01 100.0%
17:36:34 : 00/02 100.0%
17:36:56 : 00/03 100.0%
17:37:23 : 00/04 100.0%
17:37:44 : 00/05 100.0%
17:37:46 : Test finished.
You went 5 for 5 the wrong way . . . clearly if this happened you're not doing enough tests!
All of the results posted could just as easily have been obtained through a coin toss.
I am not aware of any hard and fast rule, but at this point in time I would say you should be showing us tests with values of less than 1% if you want to be taken seriously.
You went 5 for 5 the wrong way . . . clearly if this happened you're not doing enough tests!
That happened because there's no difference between dbPowerAmp's mp3 and the flac lol
If there is no difference, you should get 50/50, provided you do enough trials. You got 0/100 -> not enough trials.
Logically, if you can get 0/5 by chance, you can just as easily get 5/5 by chance. You need to do a lot more trials.
That happened because there's no difference between dbPowerAmp's mp3 and the flac
That happened because you got the wrong answer each time.
If there is no audible difference then you ought to expect something around 50 out of 100, but if you only do five trials it's just like flipping a coin and getting heads five times in a row which is very far from impossible.
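The coin-flip point can be tried out directly. Below is a rough simulation sketch, not anything foo_abx does itself: it plays many ABX sessions where every answer is a blind guess and counts how often a session ends all right or all wrong. The function names, the seed and the session count are arbitrary choices for the illustration.

```python
import random

def guessing_session(rng: random.Random, trials: int) -> int:
    """Number of correct answers when every answer is a coin flip."""
    return sum(rng.random() < 0.5 for _ in range(trials))

def share_of_extreme_scores(trials: int, sessions: int, seed: int = 1) -> float:
    """Fraction of pure-guess sessions that end with 0 or all answers correct."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    extreme = 0
    for _ in range(sessions):
        correct = guessing_session(rng, trials)
        if correct in (0, trials):
            extreme += 1
    return extreme / sessions

# Five trials: a clean sweep either way should show up in roughly 2/32 of sessions.
print(share_of_extreme_scores(trials=5, sessions=100_000))
# Sixteen trials: a clean sweep by guessing becomes extremely rare.
print(share_of_extreme_scores(trials=16, sessions=100_000))
```

The exact figure for five trials is 2 × 0.5^5 = 6.25%, so a 0/5 or 5/5 from guessing alone is nothing unusual.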
lol
You're really not in any position to LOL. Other than to use this as an example to show people what not to do, these results are useless.
Well, if you hear a difference in the sound once and then you repeat that 3 times and hear the same difference, that's pretty fine. I did it 5-6 times.
You still have not read about guessing vs. hearing, and how a lower number of trials increases the likelihood that mere guesswork might suggest (to those unaware of standard ABXing procedures) the existence of differences that are not really there.
greynol’s metaphor is a common and very illustrative one: I might flip a coin five times now and avoid seeing the face of the Queen four times, but that wouldn’t do much to support my (hypothetical) claim to have the power to overthrow the monarchy; I’d have to do a lot better to be taken seriously.
Until you have taken the time to understand the purpose, basis (replicability vs. randomness/‘luck’, etc.), and (actual) procedure of an ABX test, don’t make pronouncements on how it can and cannot be used.
If there is no difference, you should get 50/50, provided you do enough trials.
You mean to say 50/100.
Oops, just realized I used A/B and A/(A+B) in alternating paragraphs
If you want more checks, do the math: double or triple the result of every test I made and you have the 15 checks per file result. Obviously you'll get the same result.
Obviously? No.
The only thing that is obvious is that you haven't actually performed enough trials per test set. If you're content that you have performed enough trials then you need to be content with the fact that your test fails to show a significant result and as such your claims about sound quality are worthless (and in violation of the Terms of Service which you agreed to follow upon registering here).
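To put numbers on why multiplying a score is not the same result, here is a small sketch. It assumes the same one-sided binomial calculation that the foo_abx percentages appear to follow; `abx_p_value` and `scaled_p_values` are names invented for this example.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

def scaled_p_values(correct: int, trials: int, factors=(1, 2, 3)) -> list:
    """p-values for the same hit ratio stretched over more trials."""
    return [abx_p_value(correct * f, trials * f) for f in factors]

# 4/5 as posted for lame3.99 on Extreme1, then the hypothetical 8/10 and 12/15.
print(scaled_p_values(4, 5))
# 2/5 as posted for Easy CD-DA on Extreme1, then the hypothetical 4/10 and 6/15.
print(scaled_p_values(2, 5))
```

The same ratio gives a different p-value at every length, so a 4/5 is not a 12/15. More importantly, nobody knows whether the extra trials would have kept that ratio, which is exactly what actually running them would show.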
My favorite part was ranking the encoders to see which one produced the smallest 320 kbps CBR file.
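For context on why that ranking says little: at a constant bitrate the audio payload is just bitrate times duration. The sketch below assumes 1 kbps = 1000 bits per second and 1 MB = 1,000,000 bytes (the original post does not say which MB was meant), and both helper names are hypothetical.

```python
def cbr_audio_bytes(bitrate_kbps: int, seconds: float) -> float:
    """Size of the audio stream at a constant bitrate, ignoring tags."""
    return bitrate_kbps * 1000 / 8 * seconds

def seconds_for_bytes(bitrate_kbps: int, size_bytes: float) -> float:
    """How much audio a given number of bytes holds at that bitrate."""
    return size_bytes / (bitrate_kbps * 1000 / 8)

# 320 kbps works out to 40,000 bytes for every second of audio.
print(cbr_audio_bytes(320, 1))

# The spread between the smallest and largest mp3 in the table is 0.02 MB.
print(seconds_for_bytes(320, 0.02 * 1_000_000))
```

Under those assumptions the whole spread is about half a second of audio, which could plausibly come from added silence, padding or tags rather than from anything about encoder efficiency.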
zmejce, if you want to see which codec is better than another, you should perform an ABC/HR test. ABX only shows whether there is an audible difference between the lossless and encoded samples. An ABX log doesn't show us that codec A is better than codec B.
You can download the ABC/HR application from here: http://ff123.net/abchr/abchr.html and there is a short explanation at http://ff123.net/64test/practice.html
ABC/HR requires more time, so it's reasonable to test only 3-4 codecs at most, or to test at a lower bitrate.
I will be glad to see your ABC/HR+ABX logs here with at least 5/5.
Example:
ABC/HR for Java, Version 0.53a, 22 July 2011
Testname:
Tester:
1R = ..\ABC-HR_bin\Sample08\Sample08_6.wav
2R = ..\ABC-HR_bin\Sample08\Sample08_1.wav
3R = ..\ABC-HR_bin\Sample08\Sample08_5.wav
4L = ..\ABC-HR_bin\Sample08\Sample08_3.wav
5L = ..\ABC-HR_bin\Sample08\Sample08_2.wav
6L = ..\ABC-HR_bin\Sample08\Sample08_4.wav
Ratings on a scale from 1.0 to 5.0
---------------------------------------
General Comments: Interesting sample. Equivalent of ''Linchpin'' sample.
---------------------------------------
1R File: ..\ABC-HR_bin\Sample08\Sample08_6.wav
1R Rating: 1.0
1R Comment:
---------------------------------------
2R File: ..\ABC-HR_bin\Sample08\Sample08_1.wav
2R Rating: 2.5
2R Comment: Bass drum is distorted as hell. Impulsive hi-hat is expanded in time.
---------------------------------------
3R File: ..\ABC-HR_bin\Sample08\Sample08_5.wav
3R Rating: 3.8
3R Comment: Not bad but plates are distorted.
---------------------------------------
4L File: ..\ABC-HR_bin\Sample08\Sample08_3.wav
4L Rating: 2.7
4L Comment: Wavy post-echo on each snare drum hit.
---------------------------------------
5L File: ..\ABC-HR_bin\Sample08\Sample08_2.wav
5L Rating: 4.4
5L Comment: mainly minor issues with bass drum (shaky sound) but it's not annoyning at all.
---------------------------------------
6L File: ..\ABC-HR_bin\Sample08\Sample08_4.wav
6L Rating: 2.9
6L Comment: similar issues to N4.
---------------------------------------
ABX Results:
Original vs ..\ABC-HR_bin\Sample08\Sample08_4.wav
5 out of 5, pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_3.wav
5 out of 5, pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_2.wav
5 out of 5, pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_5.wav
5 out of 5, pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_1.wav
5 out of 5, pval = 0.031
---- Detailed ABX results ----
Original vs ..\ABC-HR_bin\Sample08\Sample08_4.wav
Playback Range: 02.297 to 05.016
5:10:19 AM p 1/1 pval = 0.5
5:10:22 AM p 2/2 pval = 0.25
5:10:24 AM p 3/3 pval = 0.125
5:10:26 AM p 4/4 pval = 0.062
5:10:29 AM p 5/5 pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_3.wav
Playback Range: 00.000 to 06.061
4:58:14 AM p 1/1 pval = 0.5
4:58:25 AM p 2/2 pval = 0.25
4:58:30 AM p 3/3 pval = 0.125
4:58:50 AM p 4/4 pval = 0.062
4:59:07 AM p 5/5 pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_2.wav
Playback Range: 00.000 to 06.061
5:04:05 AM p 1/1 pval = 0.5
5:04:13 AM p 2/2 pval = 0.25
5:04:19 AM p 3/3 pval = 0.125
5:04:26 AM p 4/4 pval = 0.062
5:04:33 AM p 5/5 pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_5.wav
Playback Range: 00.000 to 06.061
4:55:35 AM p 1/1 pval = 0.5
4:55:42 AM p 2/2 pval = 0.25
4:56:08 AM p 3/3 pval = 0.125
4:56:15 AM p 4/4 pval = 0.062
4:56:21 AM p 5/5 pval = 0.031
Original vs ..\ABC-HR_bin\Sample08\Sample08_1.wav
Playback Range: 00.000 to 06.061
4:50:29 AM p 1/1 pval = 0.5
4:50:37 AM p 2/2 pval = 0.25
4:50:40 AM p 3/3 pval = 0.125
4:50:42 AM p 4/4 pval = 0.062
4:50:46 AM p 5/5 pval = 0.031
Thread split: Minimum number of required ABX trials (http://www.hydrogenaudio.org/forums/index.php?showtopic=92859)
Well, I really can't see why zmejce is proud of his results, when it's obvious he is guessing on most of them, except lame3.99 extreme2 setting, but there should be more trials to see if he isn't just lucky guessing in this one.
zmejce, at least 10 trials for every test! At least 10! No less! And no one said you have to finish them in one day! My ABX testing, when I wanted to know what minimum bitrate is transparent for me, lasted a month. With many more styles of music, not just one.
Do it properly, or don't do it at all - you won't earn any fancy status here (or get to a secret download section), except for being an eejit for not listening to people who know what they are talking about.
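As a rough guide to what "enough trials" buys you, this sketch works out the lowest score that gets under a chosen threshold for a given session length. It assumes the usual one-sided binomial test; `min_correct_needed` is a made-up helper and the 0.05 and 0.01 thresholds are the ones mentioned earlier in the thread.

```python
from math import comb
from typing import Optional

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` right out of `trials` by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

def min_correct_needed(trials: int, alpha: float = 0.05) -> Optional[int]:
    """Smallest score with p < alpha, or None if even a perfect run fails."""
    # The p-value only shrinks as the score rises, so the first hit is the minimum.
    for correct in range(trials + 1):
        if abx_p_value(correct, trials) < alpha:
            return correct
    return None

for trials in (5, 10, 16):
    print(trials, min_correct_needed(trials, 0.05), min_correct_needed(trials, 0.01))
```

With 5 trials only a perfect run gets under 0.05 and nothing gets under 0.01, while with 10 trials there is room for one mistake at the 0.05 level.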
First, zmejce, congratulations on having the testing mindset! That's a great start, but now you have to understand how to set up a proper test.
Btw it makes me tired and I think the results will be more inaccurate. 5-6 checks is pretty fine.
I understand, because that happens. But it's still essential to do more than 5-6 trials if you want results that are useful.
If I do 10-15 checks on every mp3 it'll take me a week of testing.
Yes, it will. Unfortunately, you can't do this kind of thing quick & dirty; that is, testing multiple songs in multiple formats and multiple encoders. Make no mistake: it's a big project you took on.