
Topic: Public MP3 Listening Test @ 128 kbps - FINISHED (Read 145947 times)

  • guruboolez
  • [*][*][*][*][*]
  • Members (Donating)
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #100
But what you are saying here is that what we need is quantity to get the "real" proof, in other words, there were few participants ?

Participants, but also samples - or, as I said before, experience. And even then, you won't get any real proof or universal answer. The very best encoder in the world won't necessarily please every single user. When people here start using HELIX, reporting their good impressions and also their bad samples, and when developers start fixing those issues, then HELIX will surely become a true alternative, or maybe the obvious choice for MP3 encoding at x bitrate. Trust is something that needs a long time to grow. LAME is not the best MP3 encoder but the most tested and therefore the most trustable. LAME is not better, simply safe (up to a certain point).
Anyway, if HELIX really pleases some people here, I suggest they start using it. Their experience will surely be interesting for all other potential users.
  • Last Edit: 26 November, 2008, 01:57:04 AM by guruboolez

Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #101

You should have brought in your peers (yourself too) to inflate the sample size (no. of participants), so that the magical black bars decrease in length


That's what I thought happens too, but it seems not to have had an effect: If you look at the first sample which had 39 listeners, the bars are about as long as the second sample which had 26 listeners, and definitely longer than the third sample which also had 26 listeners.


It does have an effect. I never said it is the only thing that influences the error margins.
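To illustrate why more listeners do not automatically mean shorter bars, here is a minimal sketch. It assumes a plain normal-approximation confidence interval for a mean (not necessarily the method used to draw the bars in the test results), and the ratings are invented for illustration only:

```python
import math
import statistics

def ci_half_width(scores, z=1.96):
    """Approximate 95% half-width of the confidence interval for the mean."""
    n = len(scores)
    # Standard error shrinks with sqrt(n) but grows with the spread of the scores.
    return z * statistics.stdev(scores) / math.sqrt(n)

# Hypothetical ratings on a 1-5 scale (NOT data from the test).
sample_a = [1.0, 5.0, 2.0, 4.5, 1.5, 5.0] * 6 + [3.0, 3.0, 3.0]  # 39 listeners, wide spread
sample_b = [3.5, 4.0, 3.8, 4.2, 3.6, 4.1, 3.9] * 3 + [4.0] * 5    # 26 listeners, narrow spread

half_a = ci_half_width(sample_a)
half_b = ci_half_width(sample_b)
print(len(sample_a), round(half_a, 3))
print(len(sample_b), round(half_b, 3))
```

In this made-up case the 39-listener sample ends up with the longer bar, simply because its listeners disagree more. Listener count and score variance both feed into the error margin.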

  • halb27
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #102
...Unlike you, I don't see anybody defending LAME in this thread. ...  I don't think HELIX is currently as trustable as LAME. A possible collective experience may help to get a better vision of HELIX quality and flaws. This experience will make the pudding bigger and the proof clearer.

Well, as you can learn from recent posts there are some people feeling that there are posters here defending Lame in an inadequate way (though there is nothing to defend). Chance is high they wouldn't do something similar if Lame had come out clear on top. I am one of these who feel like that.
And you are one of those Lame defenders, and you do it in a way I really dislike. What you say isn't wrong, it's just killer statements which if taken seriously makes this test worthless.

It's true, and you can read it for instance in my posts in this thread, that such a test just contributes to the experience on encoders. It is one of the most objective contributions of a considerable amount of participants with higher demands on encoder quality who spent a considerable amount of time evaluating this. It's the average judgement of active HA members (and comparable people) on the samples tested. Not more. Not less.

You are trying to relativize Helix' result by casting doubt on how far we can trust Helix, and on the other hand you try to give special merits to Lame because you think we can trust Lame more. This simply isn't fair. And it's even a bad argument, cause Lame 3.98 isn't Lame 3.97 and when going back in time we had significant changes in Lame technology when looking at the Lame history. Moreover what is this trust in Lame good for if for instance with Lame 3.97 the 'sandpaper problem' came up? We just should stick to the real experience we have with encoders. The trust speech without hard facts is the non-audio variant of the warm-fuzzy feeling speech.

I like the way AlexB talked about his judgement on Helix behavior on 3 samples which he didn't like. He says what he felt, but in a way which respects the results of the test (which is the judgement of all the participants).

If we look at the test results IMO we can conclude the following for practical purposes:

a) the overall outcome of the encoders averaged over all the samples doesn't give any hint which encoder to use

b) the detailed outcome of the encoders on the individual samples gives some hints which encoder to use:

b1) iTunes and Lame 3.97 aren't attractive candidates for encoding (things can look different in case those samples where these encoders perform weakly are not very relevant for the individual choosing the encoder)

b2) Lame 3.98, Helix, and FhG are all good candidates to use. Which encoder is 'best' is personal and can partially be answered by figuring out which samples are individually most relevant and looking at these encoders' outcome on these samples. Best is backing things up by additional personal tests with favorite music. Not mentioning non-audio quality related topics which are relevant too for encoder choice, but in a very individual way.
  • Last Edit: 26 November, 2008, 04:19:13 AM by halb27
lame3995n -Q0.5

  • halb27
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #103
I have posted some ABX logs and samples of tracks that shows Helix's major flaws.

We know now that Helix has major flaws for you with metal, so chance is high that this is relevant to other metal lovers too. It is also backed up by the test where Helix shows its worst behavior with metal.
  • Last Edit: 26 November, 2008, 04:35:35 AM by halb27
lame3995n -Q0.5

  • Synthetic Soul
  • [*][*][*][*][*]
  • Global Moderator
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #104
I think guru has answered most of the nonsense far better than I ever could, but I can't let the statements below go without some comment.

I like the part where test result (quality and encode speed) should raise the popularity of Helix, but instead people try to prove that Helix is bad in their test, while the others blame Helix for not supporting gaplessness.
I second that. Nobody complained about the samples or a potential bias they might give to some encoders before the test.
A listening test's outcome is seriously influenced by the samples used (and the degree the participants are sensitive towards the issues with them).
It's incredible to me that you think that we may sing the praises of Helix but not mention any of the cons.  It is obvious, following the result of this test, that members are going to be drawn to Helix: we have had people suggesting that it become the new HA recommendation solely from the results of this test, and also members stating that it is proven better than LAME.  I think that it is important for members to consider the reality, pros and cons.

As for you halb27, are you complaining that we are not complaining enough or too much?  Should we start complaining about the samples or bias?  Is this even relevant to your point?


...Unlike you, I don't see anybody defending LAME in this thread. ... I don't think HELIX is currently as trustable as LAME. A possible collective experience may help to get a better vision of HELIX quality and flaws. This experience will make the pudding bigger and the proof clearer.
Well, as you can learn from recent posts there are some people feeling that there are posters here defending Lame in an inadequate way (though there is nothing to defend). Chance is high they wouldn't do something similar if Lame had come out clear on top. I am one of these who feel like that.
And you are one of those Lame defenders, and you do it in a way I really dislike. What you say isn't wrong, it's just killer statements which if taken seriously makes this test worthless.
I find this attack most strange.  How many LAME users are there compared to Helix users?  Which encoder do you think has had the most testing?  Or do you feel that these fourteen samples are enough to usurp the thousands of tracks that LAME users have thrown at LAME?

I just don't get it.

I don't think that you should see it as LAME fanboys shooting down Helix for no reason; I would rather see it as users who have shown a fresh interest in Helix attempting to make an informed decision.
I'm on a horse.

  • halb27
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #105
...As for you halb27, are you complaining (about sample bias) that we are not complaining enough or too much?  Should we start complaining about the samples or bias?

I didn't complain, and IMO nobody should (or they should have done so during sample selection, if there had been any concerns). What I'm trying to say is: we should take the test as it is. Some posters in this thread tend to sound like they are playing down the Helix results. This isn't good. Look, for instance, at Helix' behavior on metal. It is reflected in the test. And it's okay for people who have experience in this field to provide additional warnings on this. But am I over-sensitive when I feel the way it's done has a tendency to bring down Helix in a more general way? Maybe I am, but that's what I feel about it. And it looks like I'm not the only one.
BTW I personally don't use Helix (I'm personally considering converting from Lame to FhG), but I can't see the catastrophe when Helix gets some attraction. After all it seems to be a good encoder (okay, not so much for metal and hard rock). Maybe some warning should be given that we can't expect any Helix development (guru gave this hint already) and have to take Helix as is.
But I don't expect further FhG stereo mp3 development as well. I don't care when I'm happy with what I got. Brings even some relief not having to care about new versions. We can be happy with Lame being developed further, but we can be happy with Lame 3.98. mp3 development has pretty much reached its good end, as shown in this test.
  • Last Edit: 26 November, 2008, 05:29:07 AM by halb27
lame3995n -Q0.5

  • Gabriel
  • [*][*][*][*][*]
  • Developer
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #106
In case you are interested, here is a quick and dirty "quality distribution" across the samples:


Would it be possible for you to include this graph within the results page?

Btw, question for the audience: What are the relative speeds of FhG/Helix/Lame ?
edit: sorry, speed is already mentioned within the test results
  • Last Edit: 26 November, 2008, 05:57:45 AM by Gabriel

Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #107
Sure, can be done when I get home.

  • Synthetic Soul
  • [*][*][*][*][*]
  • Global Moderator
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #108
...but I can't see the catastrophe when Helix gets some attraction. After all it seems to be a good encoder (okay, not so much for metal and hard rock). Maybe some warning should be given that we can't expect any Helix development (guru gave this hint already) and have to take Helix as is.
I haven't seen anybody suggesting that it is catastrophic.  I think the attention has been very positive on the whole.  Very positive, given that many people had probably never heard of or tested the damn thing!

For my part I think that Helix's results are of great interest.  So much interest that I considered running some tests of my own; however, I really enjoy the fact that I can play LAME MP3s gaplessly with foobar; when my Creative Nano fails at this I find it really jarring.  The fact that Helix cannot currently do this natively (please, no-one bother pointing out Canar's tongue in cheek suggestion) is a major drawback in my eyes.  I'm not saying that it is not a minor fix.

I am not one of these members that is willing to encode every track with various encoders at various settings to see which makes a better job of it.  I decided upon LAME -V5 a couple of years back and I stick with it.  That's not to say that I can't change, but I don't have the time to be so picky when encoding new albums.

Now, that is not to say that Helix will never be a contender.  It is open source, and improvements can be made, if anyone cares to undertake it.

I'm very much in favour of some positive attention to Helix - as you rightly point out, the more the merrier - but I'm not in favour of glossing over its failings just because it's in vogue in November 2008.
  • Last Edit: 26 November, 2008, 06:27:13 AM by Synthetic Soul
I'm on a horse.

  • sizetwo
  • [*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #109
Derived from this test and the consequent forum postings, this is what I have learned about listening tests:

1: Whichever encoder(s) comes out on top of a test does not indicate that it is a superior encoder at that bitrate, regardless of whether, as in this case, it's a tie.
2: The sample size and the number of participants need to be increased, and participants need some knowledge of audio compression and what to listen for (artifacts).
3: The samples selected for a test will never be enough to make an encoder "safe"; in other words, we cannot know that the samples are representative of the various types of music one would want to compress.
4: The test results should be interpreted in a highly subjective manner, as everyone seems to interpret the results differently.
5: The final outcome is for most people to end up saying "test for yourself", thus negating the empirical evidence we can draw from such a test, and ultimately making it rather pointless, other than saying that people should stay away from the low anchor.

The question remains: how then can a test be devised so that it yields results that are in fact conclusive and create a form of intersubjective opinion regarding the preferred codec at a specific bitrate?

Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #110
The problem which a lot of people do not understand is that you cannot generalize the results by saying "Encoder x is the best" when a finite number of participants test a finite number of samples at a certain bitrate.
If you have average hearing and listen to all the types of music covered in this test, you could actually choose any of the contenders as far as quality is concerned. If you only listen to metal, you would put the encoders that performed best on metal on your list. What these public listening tests are actually good for is letting you narrow down the encoders you should consider when starting your own tests. Then you cut more and more encoders from your list depending on whether you need high speed, gapless playback, support for platforms like Linux or Mac, etc., and in the end you come up with the one encoder that is best suited to your individual needs. I hope you get my point.
  • Last Edit: 26 November, 2008, 07:16:13 AM by Sebastian Mares

  • halb27
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #111
...5: The final outcome is for most people to end up saying "test for yourself", thus negating the empirical evidence we can draw from such a test, and ultimately making it rather pointless, other than saying that people should stay away from the low anchor.

The question remains: how then can a test be devised so that it yields results that are in fact conclusive and create a form of intersubjective opinion regarding the preferred codec at a specific bitrate?

It's true that the overall outcome averaged over all samples doesn't say much, especially if the results are tied. But looking at the detailed results, every reader can get results matching whatever level of personal effort he is willing to put into interpreting them.
IMO it's like this (keep in mind it's only about interpreting the results of the test):

a) for the take-it-easy people not struggling with details:
Helix is best as it achieved good results with any sample. It's rather closely followed up by first Lame 3.98 and second FhG surround which show a weakness (of minor to modest degree) on only 1 sample.  iTunes and Lame 3.97 are quite a bit behind having both 3 weaknesses (one of them being of higher degree).
From the test organization a warning can be helpful that deciding things this way may lead to suboptimal decision as personal relevance of the samples is not taken into account.

b) for the more caring people giving some effort to result interpretation but avoiding own listening tests:
Concentrate on those samples which are meaningful to you (which are roughly your kind of music) and ignore those samples which have no or nearly no relevance to you. Look at the outcome of the various encoders for this sample selection and pick your favorite.
From the test organization this procedure can be supported by giving more detailed information about the samples (genre(s) in the first place), as not every reader will listen to the samples (which however should be highly recommended cause otherwise the reader doesn't know what he's reading about).

c) for the very caring people allowing for own listening tests procedure b) can be a start and eventually make things easier as it can exclude certain encoders from consideration.

In case several encoders become candidates for personal use this way: don't worry, enjoy the choice in its own right (and you always have the option of going the b) or c) way if you're coming from a level above).
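Procedure b) can be sketched in a few lines. This is only an illustration under loud assumptions: the encoder names, genres and scores below are placeholders, not results from the test, and per-sample differences may themselves fall within the error margins:

```python
from statistics import mean

# Hypothetical per-genre mean scores; names and numbers are invented.
scores = {
    "encoder_a": {"metal": 3.2, "classical": 4.7, "pop": 4.8},
    "encoder_b": {"metal": 4.3, "classical": 4.1, "pop": 4.2},
}

def rank_for_listener(scores, relevant_genres):
    """Rank encoders by mean score over only the genres the listener cares about."""
    ranking = {
        encoder: mean(per_genre[g] for g in relevant_genres)
        for encoder, per_genre in scores.items()
    }
    # Highest mean first.
    return sorted(ranking, key=ranking.get, reverse=True)

print(rank_for_listener(scores, ["metal", "classical", "pop"]))
print(rank_for_listener(scores, ["metal"]))
```

With these invented numbers the ranking flips once only metal is counted, which is the whole point of restricting the comparison to personally relevant samples.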
  • Last Edit: 26 November, 2008, 07:51:24 AM by halb27
lame3995n -Q0.5

  • uart
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #112
1: Whichever encoder(s) comes out on top of a test does not indicate that it is a superior encoder at that bitrate, regardless of whether, as in this case, it's a tie.


This is called "statistical significance" and is a very important part of making judgements about which is better in cases where there is an element of randomness or uncertainty (aka variance) in measurement. There is well-developed statistical theory that analyses the difference in the means (averages) in relation to the variance of the scores and the number of samples, and determines whether the observed difference is likely to be a result of chance or more likely due to a genuine difference in the nature of the things being measured. Loosely speaking, these two cases correspond to "no significant difference" and "significant difference" respectively.

What people really mean here when they say the scores are "tied" is that the cold hard statistical mathematics says that the differences are not statistically significant. Essentially this just means that the underlying randomness of the data set makes it unreliable to assume that there is a real difference.
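As a rough sketch of the idea, here is an exact sign-flip permutation test on paired score differences. It is a swapped-in illustration, not the analysis used for the test results, and the per-listener differences are invented:

```python
from itertools import product

def sign_flip_p_value(diffs):
    """Exact two-sided p-value for the mean of paired differences.

    Under the null hypothesis of no real difference, each paired difference
    is equally likely to be positive or negative, so every sign assignment
    is enumerated and we count how often the total is at least as extreme
    as the one observed.
    """
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total

# Hypothetical per-listener differences (encoder X score minus encoder Y score).
noisy = [0.5, -0.4, 0.3, -0.2, 0.1, 0.4, -0.3, 0.2]  # listeners disagree on direction
clear = [0.5, 0.4, 0.3, 0.2, 0.6, 0.4, 0.3, 0.2]     # everyone prefers X

print(sign_flip_p_value(noisy))
print(sign_flip_p_value(clear))
```

The first data set has a positive average, yet its p-value is far above the usual 0.05 threshold, so it would be reported as a tie. The second has a p-value below 0.05 and would count as a significant difference.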
  • Last Edit: 26 November, 2008, 10:47:25 AM by uart

  • Alex B
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #113
I think it would be good to quote Pio2001's valid comment here:

... Oh, and Greynol is right. Helix is not the winner. It is tied. The differences are within the confidence intervals, which means that they are just random. If you redid the test with the same samples and same listeners, the simple fact that ABC/HR presents them in a different order every time would probably lead Lame, or Fraunhofer, or iTunes to get a slightly, but not significantly, superior score.

We must consider this to be chance, unless we have more information to back up further claims.


To better understand the results I am going to start sample specific discussion threads - one for each sample.

The first two are here:

http://www.hydrogenaudio.org/forums/index....showtopic=67562
http://www.hydrogenaudio.org/forums/index....showtopic=67561
  • Last Edit: 26 November, 2008, 09:02:32 AM by Alex B

  • greynol
  • [*][*][*][*][*]
  • Global Moderator
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #114
b1) iTunes and Lame 3.97 aren't attractive candidates for encoding (things can look different in case those samples where these encoders perform weakly are not very relevant for the individual choosing the encoder)

This is nonsense.    Who's to say using 14 different samples would have given the exact same outcome?  If they had then maybe you'd be right but there aren't more samples.  If the difference between samples where Lame 3.98 scored consistently and significantly higher than Lame 3.97 was due to a known defect of Lame 3.97 that has been corrected in Lame 3.98 ("sandpaper problem"), then possibly.  Perhaps a class of samples exist that show weaknesses new to Lame 3.98.  This is not beyond the realm of possibility considering that we've seen regression in Lame's CBR method with at least one documented sample between 3.93 and 3.98, though one sample does not a class make.

Based on the test results the candidates were all tied.  There is not enough statistical evidence to suggest the sound quality of any are more attractive than any other, period, end of discussion.
  • Last Edit: 26 November, 2008, 11:56:13 AM by greynol
Your eyes cannot hear.

  • guruboolez
  • [*][*][*][*][*]
  • Members (Donating)
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #115
I'm quoting halb27:

Well, as you can learn from recent posts there are some people feeling that there are posters here defending Lame in an inadequate way (though there is nothing to defend). Chance is high they wouldn't do something similar if Lame had come out clear on top. I am one of these who feel like that.
Of course, and that's perfectly normal. When a general consensus is confirmed, there's no debate. But when the same consensus is broken by a new element (test, proof, theory), then the pertinence of the latter is subject to strong debate. Take an example. If a scientist found a new proof that the earth revolves around the sun, the scientific community wouldn't pay much attention to it. If another scientist presented a test proving that heliocentrism is wrong… guess what would happen. You see a bias where there's simply a very common attitude.

“What you say isn't wrong, it's just killer statements which if taken seriously makes this test worthless.”
So what I say is not wrong but you refuse to accept it because it makes the test worthless?! I said this result is "a lead" and "a brick" to a bigger building. No more and certainly not less. I don't call this "worthless".

“and on the other hand you try to give special merits to Lame because you think we can trust Lame more. This simply isn't fair. And it's even a bad argument, cause Lame 3.98 isn't Lame 3.97 and when going back in time we had significant changes in Lame technology when looking at the Lame history. ”
This argument looks dishonest to my eyes. LAME 3.98 is an improvement, not a radically different piece of code. A new release won't break the confidence people have in an encoder just because parts of the code changed. People trust LAME in general, Vorbis in general, MPC, FLAC, x264, Xvid in general... and not a single past version of it. LAME has been trustable for years; LAME 3.98 quality didn't start from scratch; unsurprisingly, many people trust and use the latest version of the encoder. HELIX/Real wasn't trustable for years, and I don't see how I'm giving special merit to LAME when I say that a single listening test won't make Helix as trustworthy as LAME, considering their different histories.

“Moreover what is this trust in Lame good for if for instance with Lame 3.97 the 'sandpaper problem' came up”
In case you forgot, the sandpaper issue occurred on very specific occasions, and the overall progress of LAME 3.97 over 3.96 was massive enough (especially with VBR in the mid-bitrate range) to prefer the more recent version. I posted several listening tests on the LAME 3.97 beta a few years ago (in which the artefact you described was discovered).

“b) the detailed outcome of the encoders on the individual samples gives some hints which encoder to use:

b1) iTunes and Lame 3.97 aren't attractive candidates for encoding ”


So long on HA.org and still unable to read a listening test?!
ALL ENCODERS ARE TIED. HELIX is as good as iTunes according to this test. If you refuse to accept that, then you're implicitly admitting some limitation of collective listening tests.
  • Last Edit: 26 November, 2008, 12:22:05 PM by guruboolez


  • Neasden
  • [*][*][*]
  • Banned
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #117
I've never seen such a hot debate. It's pretty cool. Perhaps the discussion of how things were in the past and we were there and saw LAME crawl to its majesty is useless now.
  • Facts
Helix hasn't been tuned in 3 full years.
LAME latest tuning is from the last 3 months.
Helix is encoding at 90x. I got 30x on my PC, probably because of hardware limitations. But it's OK.
LAME encoding here is no more than 12x. And this bothers me.
Helix performed a bit better than LAME in this test.
LAME is showing weaknesses at 128 kbps (this could be with this set of samples, we don't know)

This is just the tip of the iceberg that already started to bother the crowd.

Can you imagine if Helix had been developed and tuned? Would it have surpassed LAME by light-years?

I guess this discussion does not end here, I see a lot of analytical people trying to make a point. Everyone's got their point, and I think we should deepen this investigation, make another test, perhaps a different listening test with more people and a vast amount of samples to "end this discussion".
  • Last Edit: 26 November, 2008, 12:53:43 PM by Neasden

  • Big_Berny
  • [*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #118
So long on HA.org and still unable to read a listening test?!
ALL ENCODERS ARE TIED. HELIX is as good as iTunes according to this test. If you refuse to accept that, then you're implicitly admitting some limitation of collective listening tests.

Well, to be 100% correct, you can't say that HELIX is as good as iTunes. This is not 'proven' by the test (you'd have to test the beta error instead of the alpha error). But since the differences between the two aren't significant, you also can't say that HELIX is better, as the difference MAY BE (!) random.

So what we can say (as conservative scientists) is: We can't be sure that there's a difference in quality between the different encoders. Nothing more.
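The alpha/beta distinction can be made concrete with a small power calculation. This is a minimal sketch assuming a two-sided z-test on a mean score difference with known standard deviation; the effect size, spread and listener counts are invented, not taken from the test:

```python
from statistics import NormalDist

def approx_power(true_diff, sd, n, alpha=0.05):
    """Normal-approximation power of a two-sided test on a mean difference.

    Power is 1 - beta: the chance of detecting a difference of size
    true_diff when it really exists.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_diff * (n ** 0.5) / sd
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical: a real difference of 0.2 points, score spread of 1.0 point.
power_small = approx_power(0.2, 1.0, 26)
power_large = approx_power(0.2, 1.0, 200)
print(round(power_small, 3), round(1 - power_small, 3))  # power and beta with 26 listeners
print(round(power_large, 3), round(1 - power_large, 3))  # power and beta with 200 listeners
```

Under these assumptions a small real difference would usually go undetected with 26 listeners, so "not significant" cannot be read as "proven equal". It only says the test could not tell the encoders apart.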

Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #119
yay, this is exciting!!
seems like i'll have to run a few tests myself since i didn't check on FhG or Helix for a very long time. and it also seems like i underestimated their performance/progress in development...
10 FOR I=1 TO 3:PRINT"DAMN":NEXT

  • greynol
  • [*][*][*][*][*]
  • Global Moderator
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #120
Helix performed a bit better than LAME in this test.
No, it didn't.

LAME is showing weaknesses at 128 kbps (this could be with this set of samples, we don't know)
How so?

To point out the impotency of your analysis, based on Sebastian's colored graph, Lame 3.97 performed the best on the greatest number of samples (it appears to be tied with Fraunhofer on sample 10).  The point is that you have to look at the totality of the test and understand something about statistics.  Those vertical bars in the chart summarizing the results are there for a reason and it appears that you have no idea how to interpret them.

Would it (Helix) have surpassed LAME by light-years?
Quite possibly not.
  • Last Edit: 26 November, 2008, 01:21:52 PM by greynol
Your eyes cannot hear.

  • Soap
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #121
and I think we should deepen this investigation, make another test, perhaps a different listening test with more people and a vast amount of samples to "end this discussion".

We?  Let us give Sebastian Mares credit - this was largely a one-man show. 
We?  How about you?  Organize a new test if you like.  Don't beg the collective audience to do the work for you.
More People?  I'm sure if Sebastian had a magic wand there would have been more people involved - but even with a Slashdotting and repeated extensions there were only a limited number of participants.
Creature of habit.

  • Dingo_RG
  • [*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #122
Neasden said:

"Helix hasn't been tuned in 3 full years"

"Can you imagine if Helix had been developed and tuned? Would it have been surpassed LAME in light-years?"

-----------------------------------------------------------

Excellent point, exactly my thoughts...

With the results from the test anyone could conclude that in general, Helix is a good encoder, performing excellently there...

There are two main flaws in the Helix encoder, one regarding audio quality with metal music, and the other regarding gapless playback.

Well, Helix is open source... there is a good challenge for the software developers and beta testers from HA to fix these two issues and tune Helix to its maximum capacity.

  • guruboolez
  • [*][*][*][*][*]
  • Members (Donating)
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #123
So what we can say (as conservative scientists) is: We can't be sure that there's a difference in quality between the different encoders. Nothing more.

Exactly. Or at least: "We can't be sure that there's a difference in quality between the different encoders for this set of samples and for the participants etc..."

To put the debate on statistical difference and on the practical side of the graph, I created a fake one in which I add as competitor a lossless encoding. It's not quite perfect as the confidence error margin would change a bit but I don't think a true graph would really look different:



LAME 3.98 and Helix are statistically tied to any lossless format.
What's the point of this? Simply imagine any lossy competitor at a higher bitrate (it could be LAME -V2, MPC standard or any other idol): it would appear on this graph a bit below my virtual lossless contender. Then what should people conclude? That Helix at ~130 kbps is as good as LAME at ~200 kbps but is also much faster and produces much smaller files. What the hell did the LAME developers do during all these years? Why are people on HA.org so conservative, and why don't they immediately switch to this encoder, which even competes with lossless?

Am I clear enough? The first, immediate and indubitable conclusion of this test was first drawn by Sebastian Mares: it's the last time MP3 at 128 kbps will be tested (by him). The encoders are too close to transparency to reach other conclusions. From this test you can build the most foolish recommendations, including that LAME and Helix are a substitute for any lossless format. This is what the test would say. I'm not caricaturing things and it's not even an aberration: the evidence that a group of listeners would be OK with several MP3 implementations at around 130 kbps is there. I find it nice – much nicer than the useless debate below. It's not the conclusion I dreamt about, but I would thank Sebastian for bringing HA.org (which is sometimes a bit elitist) to a conclusion millions of people around the world reached by themselves. MP3 at 128 kbps is often good even with the fastest encoders.

Now, individual users are different from a group, and people won't replace their lossless collection with Helix or LAME at 130 kbps just because a test said it's safe to do so. We don't blindly obey listening tests.
  • Last Edit: 26 November, 2008, 01:53:23 PM by guruboolez

  • [JAZ]
  • [*][*][*][*][*]
Public MP3 Listening Test @ 128 kbps - FINISHED
Reply #124
Can you imagine if Helix had been developed and tuned? Would it have surpassed LAME by light-years?


* Helix (including its former and current incarnation) has been developed for around 10 years (ok, make that 7~8 if we accept that the last modification was in 2005).

* Helix has always been developed by companies, and full-time workers.

* The original goal of Xing (helix's parent) was speed. Back then, the claim was: "it is 8 times faster than current encoders". And it was true!

* During the later days of Xing, development focused on quality (moved from i/s stereo to m/s stereo, allowed full bandwidth encoding instead of usually filtering at around 16 kHz, improved in the VBR department..)

* When Helix was born, as part of a whole new attempt of Real Networks to embrace the open source community (Helix DNA, Helix server, Helix player... ), the Helix mp3 encoder was further tuned and developed with quality in mind, while preserving its speed (For Real it was good to have a fast encoder).


In contrast:

* LAME has been developed for 10 years.

* LAME's development has always been a work of volunteers, sometimes, a single person.

* The original (1.0) goal of LAME was to be an mp3 encoder for Amiga computers. That implied speed.
The actual (2.0) goal of LAME is quality.
As such, LAME was based on the official dist10 reference MP3 encoder, improving its methods to get better quality.
This was further reinforced when LAME developers tuned the encoder using Fraunhofer's output as reference.

* LAME has always received both speed and quality improvements, taking quality as most important. GOGO took speed as most important.

* During the last years of LAME development, the changes have focused on new models, tweaking behaviour shown in certain killer samples, and overall standards compliance. This means development didn't advance much, but it did advance, as the test shows.


In the end, I don't find Helix's behaviour strange. I might have found it strange if iTunes had shown Helix's behaviour.

About the test results, I will just repeat the consensus: they are tied. Case closed.
All of them have weaker and stronger areas.