HydrogenAudio

Topic started by: Dibrom on 2002-02-11 23:36:08

Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-11 23:36:08
Normally I wouldn't attempt to address an issue in this manner, but since it is getting a bit out of hand, and usually on boards I'm not participating in (or have little desire to participate in), I'll try and address it officially, once, in the place where it should be the most relevant.

The matter I'm discussing is related to the --alt-presets and their handling of the "stereo image".

There have been some completely unsubstantiated reports and rampant speculation going on in a few threads which I will list below:

1. http://66.96.216.160/cgi-bin/YaBB.pl?board=general&action=display&num=1013124809
2. http://www.digital-inn.de/showthread.php?threadid=8212
3. http://www.hydrogenaudio.org/forums/showthread.php?s=&threadid=759 (I simply hadn't gotten around to responding to this thread, though it's on this board).

At any rate, I'll try to make a few points as clearly as I can.

1. All of the --alt-preset VBR modes are tuned for "stereo image".

2. All of the VBR presets provide better sound quality via joint stereo than LAME on its own with joint stereo, and in some cases should even sound better than with --nssafejoint, while at the same time providing a lower bitrate.

3.  The --alt-presets do not, by design, make any sacrifice in regards to stereo image to keep bitrate down.  Anyone who tells you this has no idea what they are talking about.  I should know since I actually wrote the code and designed the presets.

4.  An extremely high proportion of stereo frames is not always needed to achieve good sound quality.  I challenge anyone who believes that --alt-preset standard commonly has poor stereo separation (as a few unsubstantiated claims imply) to provide me with direct evidence of this.

5.  Joint stereo is needed even at bitrates of 320kbps to achieve the best sound quality in some critical cases.  Forcing stereo on everything up to 320kbps and then forcing joint stereo does not fix the problem (as a user implies in one of those threads).  I've tried this before.

6.  There seems to be a misconception that all that the --alt-presets improve on is pre-echo.  This is badly mistaken.  Indeed they do improve on pre-echo and impulse handling to a fairly large degree, but they also improve upon:

- joint stereo handling (serioustrouble is a prime example)
- dropout prevention (2nd_vent_clip is a prime example)
- fluttering (gekkou is a prime example)
- knocking (velvet is a prime example)
- ringing (bloodline is a prime example)
- noise pumping (piano, rach_original, etc, are examples)
- rasping (present with noise shaping 2 on some clips like fatboy, or on clean vocals sometimes.  Mostly eliminated, even on the most critical samples, with --alt-presets)

And that's just the stuff I can think of off the top of my head.
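
For reference, the joint stereo mode discussed in the points above is, in MP3, mainly mid/side coding: the encoder stores the sum and the difference of the two channels instead of left and right. The minimal Python sketch below (the function names are illustrative and this is not LAME code) shows that the mid/side transform is reversible by itself, so any change to the stereo image would have to come from how the channels are quantized afterwards. It does not model how the presets pick a stereo mode for each frame, and the sample values are made up.

```python
import math

SQRT2 = math.sqrt(2.0)

def to_mid_side(left, right):
    """Convert left/right sample lists to mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side

def to_left_right(mid, side):
    """Invert to_mid_side and recover the left/right sample lists."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

# Made-up samples of a nearly mono signal, so the side channel stays small.
left = [0.50, 0.25, -0.30, 0.10]
right = [0.48, 0.27, -0.31, 0.10]

mid, side = to_mid_side(left, right)
restored_left, restored_right = to_left_right(mid, side)

print("largest mid sample: ", max(abs(m) for m in mid))
print("largest side sample:", max(abs(s) for s in side))
```

When the two channels are similar, almost all of the signal ends up in the mid channel, which is why this mode can save bits on such material.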

Now, that's not to say the --alt-presets are perfect.  I certainly know they aren't.  But they also don't have some massive flaw in regards to stereo image which is present to the degree some people imply.  In fact, the only reports I've seen which I put any credence in are the few isolated cases which Wombat has found (and provided samples for, I might add).  I will eventually attempt to address these few samples, but note that these are exceptional cases, not common cases, and as far as I can tell, they are completely unrelated to the other complaints being made.  This is especially so since Wombat doesn't describe the artifact as being a collapse of the stereo field (which isn't your typical joint stereo artifact in LAME anyway...).

At any rate, I'm always looking to improve things if I can, but claims must be substantiated, which includes providing ABX results (which are then verified by other parties) and providing test samples, preferably multiple ones if you are implying a problem with general behavior.
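
As a rough illustration of why ABX results count as evidence, the sketch below computes how likely a given ABX score would be if the listener were only guessing. It assumes each trial is an independent 50/50 guess (a one-sided binomial test); the function name `abx_p_value` is illustrative and does not come from any ABX tool.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Sum the binomial probabilities for `correct`, `correct` + 1, ..., `trials`.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# A listener who scores 14 of 16 is very unlikely to have been guessing,
# while 9 of 16 is entirely consistent with guessing.
print(abx_p_value(14, 16))
print(abx_p_value(9, 16))
```

A small value means guessing is an unlikely explanation for the score. Having other parties repeat the test on the same sample guards against a lucky run.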

Not to come across arrogant, but for the most part, I'm the only one who truly understands the workings behind the --alt-preset specific tunings.  Not even the other developers have followed my work (though that's by their choice, not mine).  The code is available for all to see, but so far I have not seen anyone attempt to reimplement my modifications or to discuss them with me on a technical level.  So unless you see someone who is closely related to the work I've done (ie, they have participated in testing, JohnV for example) stating something, or you see me stating something directly about the presets, then chances are whoever is discussing the presets doesn't have the full picture.  This is especially true when people begin discussing how the --alt-presets work internally or technically, and especially in relation to joint stereo.

If you see a discussion on another board about these issues, please point people to this thread.  If you have a question, please ask me here; you'll likely get a much more correct answer in addition to helping to keep questions about this issue centralized and concise (which will help when the FAQs are created).  Speculation is not only wasteful, but it also helps to propagate misinformation such as the old "joint stereo is bad" line of thinking.
Title: a response to a growing rumor...
Post by: rc55 on 2002-02-11 23:52:40
Dibrom,

I'd say it'd be a good idea to keep this topic either sticky and/or locked.... Unless Beatles has anything to say!!!

Also... perhaps you could make a facility for submitting difficult samples with ABX data or whatever.


Ruairi
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-12 00:03:17
Quote
Originally posted by rc55
I'd say it'd be a good idea to keep this topic either sticky and/or locked.... Unless Beatles has anything to say!!!


Perhaps I'll make it sticky.. we'll see.

Quote
Also... perhaps you could make a facility for submitting difficult samples with ABX data or whatever.


Probably a good idea.  Unfortunately, I've been so busy lately that I haven't really been able to move the site in the direction I've been hoping for.  I've been working on a remedy for that though, but it might still be a ways off.  I'd really like to make this site more functional and user friendly though so stuff like what you suggested is certainly in mind
Title: a response to a growing rumor...
Post by: ff123 on 2002-02-12 01:53:39
It's a many-headed hydra.  Just as soon as one can show that a particular individual setting is inferior to --alt-preset standard (such as -q0 -V0 -b160 --athlower 1 lowpass 20.5), another one pops up with an additional tweak ("This one gets close to or is as good as aps, which sounds 'brash' to me, maybe even better because aps uses joint-stereo, which must logically degrade the stereo image; and what's more, I didn't even try hard, just used the default settings with a couple of tweaks -- aps must be in need of improvement!").

In one thread, I took two easy-to-hear samples, showing just two different types of artifacts on which --alt-preset standard is superior to the command line in question.  But rumors die hard -- since I didn't listen to samples which might possibly have stereo separation problems (although I am willing to upload reasonably short samples to my page -- no multi-hundred megabyte files, please), the argument goes, aps may be inferior in that regard.

I think the best way to kill off inferior command lines is to test them one by one.  But perhaps there should be some record which documents each death.  And for those that refuse to die, just keep adding samples

ff123
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-12 05:03:34
Quote
Originally posted by ff123
It's a many-headed hydra.  Just as soon as one can show that a particular individual setting is inferior to --alt-preset standard (such as -q0 -V0 -b160 --athlower 1 lowpass 20.5), another one pops up with an additional tweak ("This one gets close to or is as good as aps, which sounds 'brash' to me, maybe even better because aps uses joint-stereo, which must logically degrade the stereo image; and what's more, I didn't even try hard, just used the default settings with a couple of tweaks -- aps must be in need of improvement!").


Indeed.  One of the biggest problems also is that there are no real controls in the comparisons that other people are making.  For example, they are testing largely on non-critical samples (being defined as those which L.A.M.E. and other mp3 encoders are known to have problems on), not enough of these samples, not enough people are verifying their results, and I'd dare say that often there are not enough "sensitive" listeners participating.  Really, all of that is pretty much an understatement..

Quote
In one thread, I took two easy-to-hear samples, showing just two different types of artifacts on which --alt-preset standard is superior to the command line in question.  But rumors die hard -- since I didn't listen to samples which might possibly have stereo separation problems (although I am willing to upload reasonably short samples to my page -- no multi-hundred megabyte files, please), the argument goes, aps may be inferior in that regard.


Perhaps another problem is that people give too much credit to unsubstantiated claims.  I'm very skeptical myself, as are some other people on this board, mainly because there have just been so many cases where the problem turned out to be non-existent, or when verification was called for, the person just disappeared.  I guess I just prefer to take a "guilty till proven innocent" approach (where "guilty" is a non-existent problem), and so I naturally require proof of claims.  A lot of people don't seem to take this approach though.

The bottom line that people need to realize is that there can be no fixing of a problem if it cannot be substantiated.  You can't improve upon something if you don't see its flaws.  This means that if someone is complaining about something, they must provide evidence of what they are describing or it's basically useless to everyone interested in real progress.

Quote
I think the best way to kill off inferior command lines is to test them one by one.  But perhaps there should be some record which documents each death.  And for those that refuse to die, just keep adding samples


This actually sounds like a good idea.  To go further, I've thought about compiling a package of test samples which L.A.M.E. and other MP3 encoders have trouble with, but where --alt-preset standard does very well.  For each new command line someone thinks they can come up with that is superior (which really isn't possible without code modifications... but oh well), they can test against these samples.  If there is a significant enough score in favor of their line (somehow taking into account quality, size, and the quality/size ratio or "efficiency") vs --alt-preset standard, then perhaps there is some merit to the other line and it warrants further investigation.  This would really provide an easier way to verify the results quickly and efficiently.  The list of which lines were inferior (along with release date) could then go in a FAQ of sorts.
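
One way the scoring idea above could be sketched is shown below. Everything in it is an assumption made for illustration: the `Result` record, the 1.0 to 5.0 rating scale, and the choice of mean rating divided by mean bitrate as the "efficiency" figure. The numbers are placeholders and are not real listening-test results.

```python
from dataclasses import dataclass

@dataclass
class Result:
    sample: str    # name of the test clip
    score: float   # mean blind-test rating, assumed 1.0 (bad) to 5.0 (transparent)
    kbps: float    # average bitrate of the encode for this clip

def summarize(results):
    """Return mean rating, mean bitrate and rating-per-kbps for one command line."""
    if not results:
        raise ValueError("need at least one result")
    mean_score = sum(r.score for r in results) / len(results)
    mean_kbps = sum(r.kbps for r in results) / len(results)
    return {
        "mean_score": mean_score,
        "mean_kbps": mean_kbps,
        "efficiency": mean_score / mean_kbps,
    }

# Placeholder numbers only, to show the shape of the comparison.
reference = [Result("clip_a", 4.5, 200.0), Result("clip_b", 4.0, 180.0)]
candidate = [Result("clip_a", 3.0, 210.0), Result("clip_b", 3.5, 190.0)]

ref_summary = summarize(reference)
cand_summary = summarize(candidate)
print("reference:", ref_summary)
print("candidate:", cand_summary)
```

A candidate line would only warrant further investigation if it came out ahead on the same clips, and the ratings themselves would still need to come from verified blind tests.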
Title: a response to a growing rumor...
Post by: mithrandir on 2002-02-12 05:54:06
The --alt-presets are some of the best improvements that have ever been incorporated into LAME.
Title: a response to a growing rumor...
Post by: PatchWorKs on 2002-02-12 09:48:11
Well, sincerely, I can't understand why LAME has so many parameters. I think it could move to an "OGGDrop-like" interface: just set the bitrate (or the quality) and the application automatically chooses the best settings.
Keep things simple... and leave as many options as you like for developers!
Title: a response to a growing rumor...
Post by: cadabra3 on 2002-02-12 10:13:05
As a Psychologist (forgive me ahead of time); I see this kind of thing all the time- someone puts X number of hours into something and almost immediately people start looking for flaws- real or not. They feel they must devalue someone else's work to validate themselves. Dibrom says he makes no claim his work is perfect- he never did- but it sure sounds good to these old ears when I use it and listen to the results.
I can't write code- I wish- and just try to soak in as much knowledge as I can from this site- but thanks to everyone who contributes (especially Dibrom, of course). Dibrom, joke 'em if they can't take a f--k!! To the critics- Jealousy and envy are wasted energy- if you find a problem- document it and post it. Sorry to ramble- just tired of seeing people who work so hard at something so creative get criticized for going 'outside' the circle and working on their own and getting ripped for it.
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-12 12:47:15
Quote
Originally posted by cadabra3
As a Psychologist (forgive me ahead of time); I see this kind of thing all the time- someone puts X number of hours into something and almost immediately people start looking for flaws- real or not. They feel they must devalue someone else's work to validate themselves.
There's nothing wrong if people look for real flaws from settings like alt-presets or different codecs - that's the way things are developing, so it's only a good thing.  All devs here agree that their work must be tested and challenged, so that best possible results can be achieved.

The problem is unproven claims and rumors and lack of knowledge.
And sometimes people just want to use something they "discovered" themselves and think it's good enough for them. That's fine as long as there aren't any unproven claims made publicly...
Title: a response to a growing rumor...
Post by: brosselle on 2002-02-12 14:17:15
Quote
Originally posted by ff123

I think the best way to kill off inferior command lines is to test them one by one. 

ff123


You know, I actually had to prove this to myself. To make a long story short, I bought a new DVD player that has MP3 support. I popped in one of my APS cd's to see how it sounded. It sounded horrible. I already knew the APS sounded good on my JVC car mp3 player.

Hmmmmmm.....

So, after trying a few things (CBR/VBR), I decided to try a different encoder (some fhg variant). Anyway, it sounded bad too. The problem is definitely the hardware decoder on the DVD player.

In the process of doing this, I thought I would test this other encoder by downloading one of the test clips that everybody talks about. I chose "fatboy".

Well, with my Sony V6 headphones, I could EASILY hear how badly this other encoder mangled the soundclip. In addition, I tried a boatload of different command lines with LAME (including r3mix). Nothing, and I mean nothing, even remotely came close on this clip except APS. I didn't need to bother with APX.

Granted, this is an extreme soundclip, but it is a valid piece of music. The ability to accurately reproduce it should be a requirement for any encoder. Ogg at medium quality did pretty well by the way.

This test proved to me that there was no combination of LAME command line switches that would accurately reproduce this soundclip, save for APS.

I'm convinced.
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-13 05:32:24
Well, just read Roel's (r3mix's) comment about alt-preset standard.. (http://66.96.216.160/cgi-bin/YaBB.pl?board=c&action=display&num=1013249202)
Quote
To sum it up: what is an improvement to one person, might be a quality lowering for another.

like: I'd never use the normal "--alt-preset standard" because that 18.6-19.2 kHz lowpass is too low for my taste, but for someone that hears like up to 17kHz this wouldn't be an issue while some echo problems I don't hear in --r3mix might be for them.
So, Roel seems to say that he can hear a difference because of lower lowpass than --r3mix, because he would never use --alt-preset standard because its lowpass is lower. Also Roel seems to have the wrong impression that --aps is just a pre-echo fix... oh well..

Interestingly his own site describes lowpass selection (http://users.belgacom.net/gc247244/quality.htm#lowpass)
Quote
no-one hears the difference between a 19.5kHz lowpassed signal and the same full-range clip in a double-blind test. It's been proven by science many times before (even with 18.5khz on a very significant number of youngsters) and I did the test myself on my site&forum. In a poll only two people claimed they heard a difference between a 18.5khz and a full range one and the difference was gone at 19kHz. The 19.5 is an extra safety margin.
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-13 07:22:24
Quote
Originally posted by JohnV
Well, just read Roel's (r3mix's) comment about alt-preset standard.. (http://66.96.216.160/cgi-bin/YaBB.pl?board=c&action=display&num=1013249202)


Interesting read... guess nothing's changed

The bit about this:

Quote
To sum it up: what is an improvement to one person, might be a quality lowering for another.


I don't buy that at all.  I've seen this argument before, and it's always from someone who doesn't want to accept something which has been proven to be superior (ie, people who make up their command lines in disregard to evidence continually backed up and verified by the community -- what this very thread is about).

I've never really seen this verified, that being an improvement to one person being a degradation to another with almost all other things being equal.. and 19khz vs 19.5khz is not significant especially when one considers the logarithmic nature of hearing and the fact that most people probably can't hear beyond 18 or 18.5khz in real music.  But then, when you consider the source, someone who is willing to use frequency analysis to judge quality, perhaps that is to be expected.

It doesn't really seem to hold water anyway when compared against community data.  Not only has Roel's own AQ test implied (if not directly shown) that the old dm-preset standard was better than --r3mix, I don't believe I've seen a claim since the last few revisions of --alt-preset standard to where someone found --r3mix better.  Even if I've missed one or two, the ratio of samples where --r3mix fails badly vs where --aps sounds fine is very high.
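
To put the 19khz vs 19.5khz gap on a logarithmic scale, the small sketch below expresses it in octaves and in cents (1200 cents per octave). Treating pitch perception as roughly logarithmic in frequency is a simplifying assumption here, and the helper names are illustrative.

```python
import math

def octaves_between(f_low_hz: float, f_high_hz: float) -> float:
    """Size of a frequency interval in octaves (each octave doubles the frequency)."""
    return math.log2(f_high_hz / f_low_hz)

def cents_between(f_low_hz: float, f_high_hz: float) -> float:
    """Same interval in cents; 100 cents make one equal-tempered semitone."""
    return 1200.0 * octaves_between(f_low_hz, f_high_hz)

gap_octaves = octaves_between(19000.0, 19500.0)
gap_cents = cents_between(19000.0, 19500.0)
print(f"{gap_octaves:.4f} octaves, {gap_cents:.1f} cents")

# The same ratio lower in the spectrum is the step from 1000 Hz to this frequency:
print(f"{1000.0 * 19500.0 / 19000.0:.1f} Hz")
```

The 500 Hz difference works out to roughly 0.04 octaves, which is under half a semitone, and it sits at the very top of the audible range.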

At this point, to ignore all of the improvements that have been made (which many people can hear, just look through the revX threads), it'd have to simply be denial I think.  Continuing to state, given that evidence, that --r3mix is CD Quality still, supports that as well.

Quote
So, Roel seems to say that he can hear a difference because of lower lowpass than --r3mix, because he would never use --alt-preset standard because its lowpass is lower.


And of course, we probably won't see any evidence to back up the claims that he can hear the difference between 19khz and 19.5khz.  What's worse, he doesn't believe in ABX.. so good luck with that

Quote
Also Roel seems to have the wrong impression that --aps is just a pre-echo fix... oh well..


Perhaps this isn't particularly surprising given the history of reaction towards the dm-presets on his board.  If one didn't consider those developments significant then, it probably wouldn't be a stretch for them to think the same now.

At any rate, it would be interesting to see --r3mix develop further (though I feel that "third party presets" with someone's name on them are counterproductive to the goal of simplification and user friendliness at this point; LAME needs consolidation, not further fragmentation), but I can almost guarantee that quality improvements cannot be had without increasing size at all, as Roel seems to think is possible.  I've worked on this issue very significantly, and it just isn't going to happen.  The only way it could be possible is if there were some pretty major changes to the psymodel, and I don't see that happening anytime soon.  Furthermore, I feel that LAME is as far as it can be taken just by combining different combinations of command lines.  That's why months ago I decided to delve into the code instead...

We'll wait and see what happens, but it seems like someone is expecting a miracle, and it just ain't there.  What's more, in the past Roel has not shown concern for fixing the many samples which have caused problems and has played down the matter, saying that the person must be abnormal and perhaps should not use MP3 or should normalize their file first or something else.  I'd be simply astonished to see the approach to this change now.  When you have one person who is apparently unable to hear faults, does not seem to show interest in scientific methodology (ABX), is willing to rely on flawed techniques for comparison (freq analysis), and separates himself from people with sensitive hearing who could help improve things, how can you possibly expect much progress?  Take MPC, PsyTEL, or Vorbis for example... if this was the approach used there, they wouldn't be anywhere near the level of quality they are currently at.

And personally I still don't see how 192kbps average is "too high" of a bitrate, especially since the mp3 groups have been trading in this format for years.... I think most users feel this way these days also considering the explosive growth of the use of the --alt-presets.

So at the end of the day, using the bitrate excuse and the .5khz difference in lowpass as reasons for ignoring --aps, just seems like a last stand... an unwillingness to embrace improvements simply for the sake of pride and being stubborn.
Title: a response to a growing rumor...
Post by: Delirium on 2002-02-13 07:59:09
Quote
I've never really seen this verified, that being an improvement to one person being a degradation to another with almost all other things being equal..
Well, the concept doesn't seem flawed to me - lowering the lowpass frees up extra bits for lower frequencies with a trade-off of throwing away high-frequency data.  If you were to lower the lowpass further, to say 16 kHz, you might well increase quality for people whose hearing is not sensitive to frequencies above 16 kHz, since you'd free up more bits for the <16kHz frequencies.  But you'd have a trade-off of decreased quality for people who could hear the now-missing high frequencies.

In this particular example I agree that the 19 kHz to 19.5 kHz difference is almost certainly too small and still at too high of a frequency to make an audible difference, but I don't think that shows that the concept of lowering a lowpass being a tradeoff is wrong - it just shows that in this case the tradeoff is a very good one, as the reduction in frequency is either completely inaudible or audible only in extreme cases (but it is still theoretically there - if someone were to encode a 19.25 kHz sine wave, there are people who would be able to hear it, while --aps would chop it off).

In any case I suppose that's a nitpick, I still agree with your entire analysis about Roel's claims being wrong.  Especially given the lack of any evidence, I'm not inclined to believe that raising the lowpass back up to 19.5 kHz would improve music quality for any listeners.
Title: a response to a growing rumor...
Post by: Gabriel on 2002-02-13 08:23:39
I think that an improvement for someone could be a degradation for someone else.

As an example, it could be possible if one of the 2 persons is a dog or a bat....or someone with seriously damaged hearing.

But in case of normal humans, it's very likely to not be the case.
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-13 08:40:55
Yes, nitpicking aside... it perhaps is possible (anything's possible, right? ), but being aware of the context it is being discussed in and the context the original statement was made in, it's almost certainly not the case.  Even then, as already stated, it's unlikely to be the case in the vast majority of situations, and even more so since I said specifically "with almost all other things being equal", which implies a lack of another area where quality would have been compromised in the particular situation.
Title: a response to a growing rumor...
Post by: cadabra3 on 2002-02-13 09:59:34
JohnV- no disrespect intended- if you read a little further down, I just said, that if you think you find a flaw or problem- document it before announcing it to the world. Of course we would get nowhere if we didn't question things; I believe you're old enough (as I am) to remember the phrase 'question authority'. The one I try to forget is ' don't trust anyone over 30' !! Anyway, I was just trying to say I appreciate the work (and help) on this site and blind criticism with no data to test is useless. Sorry if I offended anyone.  -a survivor of the 60's- cadabra3
Title: a response to a growing rumor...
Post by: Pio2001 on 2002-02-13 11:31:38
Roel didn't say he would never use APS, he was just giving an example of what someone not willing to use it could say.
Title: a response to a growing rumor...
Post by: tangent on 2002-02-13 13:08:16
Quote
Originally posted by Pio2001
Roel didn't say he would never use APS, he was just giving an example of what someone not willing to use it could say.
Really? Most of the time what they say is "My setting is better! Use '-v0 -q0 -ms -k --lowpass 22.05'. With this setting the sound is no longer muffled, stereo image is conserved, and it doesn't sound so bad with those artificial sine sweeps I create using CoolEdit/SoundForge"
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-13 15:02:09
Also one thing to consider when comparing this small lowpass difference and saying --aps might sound muffled to very young people when compared to --r3mix, is of course ringing and high freq accuracy in general.

If a very young person might be able to hear a difference here (very unlikely but I won't say impossible), it's even more likely one will also hear the increased high frequency ringing of --r3mix.
But of course Roel didn't consider this aspect.. only what it looks like "on a paper"..
So, Roel's attempt to downplay --aps with lowpass issue is pretty ridiculous considering that.
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-13 19:05:35
Quote
Originally posted by JohnV
Also one thing to consider when comparing this small lowpass difference and saying --aps might sound muffled to very young people when compared to --r3mix, is of course ringing and high freq accuracy in general.

If a very young person might be able to hear a difference here (very unlikely but I won't say impossible), it's even more likely one will also hear the increased high frequency ringing of --r3mix. 
But of course Roel didn't consider this aspect.. only what it looks like "on a paper"..
So, Roel's attempt to downplay --aps with lowpass issue is pretty ridiculous considering that.


Indeed.  Not only is the insinuation that --alt-preset standard will encode less hf because of the slightly lower lowpass extremely misleading, it is also flat out incorrect, as I illustrate with these graphs below:

Sample "love.wav":

--alt-preset standard
http://static.hydrogenaudio.org/extra/love-std.jpg

--r3mix
http://static.hydrogenaudio.org/extra/love-r3.jpg

In this sample you can see --alt-preset standard encoding up to the same level as --r3mix, except if you look closely you notice that --r3mix actually encodes less hf because of the "ringing" artifact!  This manifests itself as the short chunks of "missing audio" you see in the --r3mix sample.

Sample "piano.wav"

--alt-preset standard
http://static.hydrogenaudio.org/extra/piano-std.jpg

--r3mix
http://static.hydrogenaudio.org/extra/piano-r3.jpg

This sample flat out shows that --r3mix simply isn't encoding as much high frequency content as --alt-preset standard.  The cut for the --r3mix sample is nominally around 13khz in this sample and sometimes even fails to encode beyond 11khz, while in --aps it ranges from 13khz to 16khz, spending more time on the 16khz side.  Furthermore, in the --aps sample, there are multiple cases where it actually spikes up to around 18.5khz, where in --r3mix the most you see is 1 time it barely reaches 18khz.

Sample "them.wav"

--alt-preset standard
http://static.hydrogenaudio.org/extra/them-std.jpg

--r3mix
http://static.hydrogenaudio.org/extra/them-r3.jpg

Here is another clear cut case where --r3mix simply fails to encode more high frequency content than --aps despite the fact that --aps has a slightly lower lowpass.  What you see here is that --r3mix fails to encode a fairly large portion of the peaks present in the signal.  --aps on the other hand, does a much better job, and encodes the impulses nearly completely, all the way to 19khz.



If the examples here don't clearly show the falsity behind the insinuation that --r3mix encodes more hf content (which is what was implied with the fact that Roel pointed out the lower lowpass of --aps as being some sort of fault), then I don't know what does.  In many of these cases, --r3mix fails to encode hf content to the degree of --alt-preset standard, and this is far below the 19.5khz area!

This (mis)representation of the facts (in the statement made about the lowpass being too low), though, is what you can expect when you don't pay attention to the (irrelevant?) details

The bottom line is that despite the lower lowpass, --alt-preset standard encodes more high frequency content, with more accuracy, than --r3mix.
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-13 21:01:14
Hey Dib, a bit OT. Could you make those pics a bit smaller size, maybe link for full size pics.
I mean loading of this thread takes some serious time with lower connection speeds atm..
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-13 21:33:21
Quote
Originally posted by JohnV
Hey Dib, a bit OT. Could you make those pics a bit smaller size, maybe link for full size pics.
I mean loading of this thread takes some serious time with lower connection speeds atm..


It was a hassle to make the pictures due to my current setup (copy from one pc, load cool edit, snapshot, crop, copy to another pc to burn pictures, copy to another to upload them.. heh), so I'll just change it to links instead of resizing.  However, the thread should (and I think it does..) load completely, as far as text, before the pictures pop up so I don't think it should have a detrimental effect.. but oh well
Title: a response to a growing rumor...
Post by: Gecko on 2002-02-13 22:00:15
That is some hard evidence, Dibrom! It may not exactly be on topic but I would like to comment on the "youth hears more hf" statement.

Short version: young folk (rough estimate: up to 14 years) do not have trained hearing. Almost anything sounds good to them. They enjoy the music itself and not its good or bad reproduction. Many young adults kill their ears in discos. Neither group needs hf reproduction, so discussion focusing on these people is irrelevant.

Long version:
The question is, how relevant is hf reproduction for "youngsters"? From personal experience I wish to claim that it is not until a certain age that young people will start to actually listen to music. After taking that step it takes a while for them to develop a feeling for what sounds good or bad. After this it takes yet another while until they start to actually demand quality and understand what this means. Very many (young) people still take 128k fastenc quality as transparent and they experience no gain when going any higher. It takes time to get "educated" in terms of listening.

Probably most are listening on relatively low fidelity equipment ( --> flaws in hf reproduction) especially poor PC speakers when it comes to mp3 (ok, so they burn to CD too and listen through their rel. cheap stereo) and wouldn't be able to make out the difference anyway.

Another point is that a whole bunch (of older teenagers and up) visit discos/clubs almost every weekend where their ears get blasted by very high volumes on setups that are tuned by people who themselves have ruined their hearing and compensate for their hf hearing loss by boosting the treble way too much, thus resulting in even faster loss of the hf hearing of the visitors. In 10-20 years we will be facing a large mass of people with damaged hearing. I cringe whenever I enter such a freakin' loud place. (On a side note: more and more people start to wear ear plugs when they go to discos or concerts. How about turning down the volume?)

A small anecdote: on a trance music related forum some DJ said that after spinning for a while he could no longer hear the bass drum in his headphones. It turned out he had the main speakers directly behind him hammering in on his ears at close range. After some discussion about hearing loss and some "Dude! Are you mad?!" and talking about his sucky headphones he decided to go for louder headphones with better bass reproduction... (wonder how he kept the needle from skipping; probably added a lot of weight  )

Conclusion:
I guess what I'm trying to say (not so sure there myself) is that this whole discussion about satisfying younger people with hf repro is overhyped. Claiming that a 13 year old will much rather listen to an encode with more high frequencies is simply not true (see long version). Young folk around 20 ruin their ears anyway so they don't care either. You lose more hf hearing capability the older you get.

Then there is the group of people who demand high fidelity and have the hearing to enjoy it (not only focused on hf of course). Those are the ones we are trying to please. Not "ignorant" children or half deaf clubbers. Discussion should focus on the needs of people who require high quality! Arguing whether or not a five year old would perceive the hf in some specific encode is just irrelevant.

What I have said is not based on research just on some thinking I've been doing, so please speak up if you know better! (Phew! Longest post I've ever done on this forum  )
Title: a response to a growing rumor...
Post by: Pio2001 on 2002-02-13 22:16:26
Quote
Originally posted by Gecko
wonder how he kept the needle from skipping; probably added a lot of weight  


It's much easier: DJs often remove the counterweight completely, so that the whole cartridge weighs fully on the stylus :rofl:
Title: a response to a growing rumor...
Post by: Gecko on 2002-02-13 22:47:27
Quote
Originally posted by Pio2001
It's much easier: DJs often remove the counterweight completely, so that the whole cartridge weighs fully on the stylus :rofl:


Indeed  ...  This ensures that even after playing the record for the 50th time the needle digs deep into the material to get that extra accurate playback that would be lost otherwise (or so)

But either way: you are curing only the symptoms, but the cure worsens the disease!
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-13 23:03:37
Pio2001,

It's a shame you didn't post your response to my graphs on this board here, but since I'm not registered at r3mix.net I'll just respond to it here.

Originally written by Pio2001
Quote
The pictures shown by Dibrom are interesting, but their interpretation is not easy.


Really?  It seems as if they're not so easy to interpret in the context I discussed them in, only if you wish to interpret them that way (being difficult).  The evidence is clear enough I think... how can you attempt to spin that?

Quote
I'm not sure that APS shows more HF content than r3mix, because it rather shows a large band and large time of very low HF level, while r3mix shows short burst of high level HF (especially on the them sample).


Umm... so?  Low level HF or not, it's irrelevant.  The fact of the matter is that it is HF, and it is audible in some of these cases.  Saying that you "aren't sure" is a bit ridiculous.  In all but the "them" sample, it's a very clear cut case.  In "them" the high frequencies across the file are encoded more often in --aps than in --r3mix.  There's more stability and accuracy in terms of quality in the --aps file.

Quote
To know whether R3mix or APS shows more HF content, a global spectrum analysis of the sample should be made, instead of a sonogram.


--r3mix may at times show higher peaks in the one sample, but remember that this is by 0.5 kHz at 19 kHz, which is, again, not going to be significant in any case compared to the artifacting shown in the --r3mix sample.

Quote
But I don't think that the RMS level of HF content means anything. Remember that those are nearly inaudible frequencies, therefore only yellow plots should be taken into account. Grey parts are surely completely inaudible. From this point of view, r3mix shows more HF content than APS in the "them" sample, and the other samples show no (audible) HF content at all.


Sorry, but I beg to differ.  Having spent a large amount of time tuning LAME and at times using sonograms to determine the point of error, I can say with confidence that these frequencies you deem to be inaudible, are actually quite audible in some cases.

Have you ever heard of the artifact termed noise pumping?  This is caused by a failure to encode enough (high frequency) background noise, usually in a quiet file.  This is the type of artifact you will get if you do not encode this to a high enough frequency, and from personal experience and that of many people on this board, --r3mix does not.  This is at least partially due to the ATH in --r3mix being too high (which also leads to other artifacts such as ringing, which is actually very rapid and short bursts of the "noise pumping" or "dropout" type problem).

Other than that, I find it a bit ludicrous that you would say that the --r3mix sample shows more high frequency content and that the problems would be inaudible.  Fatboy, spahm, and many others show the exact same problems and as with them, they are certainly audible.  Yes, this is on impulse samples, but the same effect induces different artifacts in other cases.

The point then remains, that --alt-preset standard encodes more high frequency content, and with higher accuracy, than --r3mix, especially on difficult samples.

Quote
But remember, graphs mean almost nothing. EG the HF part could in fact be some impulse content. In that case, the lack of HF content doesn't "muffle" the sound at all, but creates pre-echo instead.


True, but I never said anything about "muffling" at all, I only ever discussed high frequency content by itself.  So then, just change the statement.  "--r3mix shows high frequency pre/post-echo in addition to many other artifacts due to its failure to properly encode high frequency content in many cases".
Title: a response to a growing rumor...
Post by: user on 2002-02-13 23:45:24
Hi,

in r3mix and here people are referring to some graphs.

Are they published ?

If, where ?
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-14 00:05:05
Uhm, user.  About 7 messages above your post.
Here: http://www.hydrogenaudio.org/forums/showth...d=8878#post8878 (http://www.hydrogenaudio.org/forums/showthread.php?s=&postid=8878#post8878)
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-14 01:38:19
I'd say you really can't tell absolutely which one of the presets is objectively better at high freqs.
Just by looking at different spectrals, it seems that APS is almost always more consistent somewhat below 16 kHz (fewer dropouts, less ringing), and some of the test samples seem to verify this (bloodline, serioustrouble etc.)
But sometimes APS seems to have a few more isolated frequency spikes above 16 kHz, which could mean ringing, if audible. Then again sometimes r3mix looks less defined above 16 kHz..

I'd say it depends pretty much on the case. But to me there's no very noticeable audible difference in very high frequency reproduction. There are very noticeable differences in other quality areas though, in favor of APS of course.

I would be interested if anybody is able to hear more of a ringing type of artifact (above 16 kHz) with APS using 1400.wav.
http://sivut.koti.soon.fi/julaak/1400.flac (http://sivut.koti.soon.fi/julaak/1400.flac)
Title: a response to a growing rumor...
Post by: Pio2001 on 2002-02-14 12:09:33
Quote
Originally posted by Dibrom
Pio2001,
It's a shame you didn't post your response to my graphs on this board here,

You're right, sorry

To make it short, I don't like to discuss sound quality analyzing graphs, Listening tests are way better in my opinion.

I must say that, except for test samples, I don't hear the difference between r3mix, APS and the original on music chosen at random among well recorded CDs.

The rest is a matter of words, I think.
I of course agree that APS is proven superior to r3mix, but I wouldn't draw any conclusion by myself based on graphs only, without having encoded and listened to the samples themselves.
I still remember your old blind test sample with all frequencies encoded up to 22000 Hz 

PS : seriously, DJs actually remove the counterweight when they mix on a moving vehicle, like on a train or in parades.
Title: a response to a growing rumor...
Post by: JohnV on 2002-02-14 17:00:13
Quote
Originally posted by Pio2001
To make it short, I don't like to discuss sound quality analyzing graphs, Listening tests are way better in my opinion.
Ooooh, really??!! How surprising. How can somebody actually prefer listening tests, unbelievable, never heard anything like that before, certainly not people here at least....
Title: a response to a growing rumor...
Post by: tangent on 2002-02-14 17:34:26
Quote
Originally posted by Pio2001

To make it short, I don't like to discuss sound quality analyzing graphs, Listening tests are way better in my opinion.

It is true that listening tests are better to discuss quality, but graphs are not completely useless. You have to realise that Dibrom was not using the graphs to discuss quality, but to discuss BEHAVIOUR of the encoder. Now graphs are the perfect tools for this purpose. The right tools for the job, remember? Many people looked down on EAQUAL as a decider of quality, but I think it's silly. Although you can't use EAQUAL to replace listening tests, EAQUAL is useful for many other purposes too (e.g. those Ogg bitrate vs quality graphs).

In the case above, you have to realise that Dibrom did not use the graphs to say that one setting is better than the other. He used the graph to support his findings that one setting encodes more HF content than the other. This is plain fact, and the conclusion he draws is correct. Did he say that one sounds better than another because of the conclusion? No. But he proved a valid point, and that's the important part.

Anyway, the conclusion is that just because --r3mix lowpasses higher doesn't mean it will encode more HF content than --alt-preset standard. It is not hard to figure out why: in most cases, HF content falls below the ATH curve (which is already very high in the HF region) and doesn't get encoded at all. In this case, the ATH curve combined with the noise measuring algorithm of --alt-preset standard decides that more HF content needs to be encoded than --r3mix's algorithm does.
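
To make the ATH point concrete, here is a rough Python sketch. It assumes the widely quoted textbook approximation of the absolute threshold of hearing (often attributed to Terhardt); LAME's own ATH curves and the tuning in --alt-preset standard or --r3mix are not reproduced here, and `ath_offset_db` is a purely hypothetical knob standing in for "one preset places the curve higher than the other".

```python
import math


def ath_db_spl(freq_hz: float) -> float:
    """Approximate absolute threshold of hearing in dB SPL.

    Textbook fit (assumption: this is NOT LAME's exact curve).
    """
    f = freq_hz / 1000.0  # the formula is expressed in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_above_threshold(freq_hz: float, level_db_spl: float,
                       ath_offset_db: float = 0.0) -> bool:
    """True if a component at this level sits above the (shifted) threshold.

    ath_offset_db is hypothetical: a positive value models a preset whose
    threshold is set higher, so more quiet HF content gets discarded.
    """
    return level_db_spl > ath_db_spl(freq_hz) + ath_offset_db


if __name__ == "__main__":
    # The threshold climbs steeply towards the top of the audible band.
    for f in (1000, 4000, 16000, 19000, 19500):
        print(f"{f:>6} Hz  threshold ~ {ath_db_spl(f):6.1f} dB SPL")
```

The point of the sketch is only that the threshold rises very quickly above 16 kHz, so a lowpass of 19 kHz versus 19.5 kHz says little about how much HF content actually survives; what is kept depends on where the content sits relative to the curve.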

The second thing is that the graphs seem to indicate that ringing is occurring in the --r3mix samples. Although we cannot be sure we can hear it without listening tests, we can recognize the symptoms of ringing from the graphs. You may want to visit http://ff123.net/ringing_graph.html (http://ff123.net/ringing_graph.html) to understand it better. In this case the graphs are useful in showing that the --r3mix encode shows the signs of ringing. But we cannot conclude that this is audible in the sample itself without a listening test.

Therefore: Don't underestimate the usefulness of graphs. They cannot replace listening tests, and are probably not as useful as listening tests. But they are a very useful tool when used properly. Unfortunately, they have been given a very bad reputation because graphs have often been abused in the past.
Title: a response to a growing rumor...
Post by: Gecko on 2002-02-14 18:52:54
Apart from discussing the point of the high cutoff why not just change it if you think you must? It may do more harm to the mid and lower range so I can't recommend it, but that's not the problem here.

--alt-preset standard --lowpass 19.5

There. Tested on "RMB - Love is an Ocean" which raised the respective bitrates from 212 to 216 (values from Winamp. Encspot says 213/217). This song has a very high amount of high frequencies which go up to 22kHz. A quick listen revealed no obvious difference or flaws (but that's only me listening). Spectral view also showed that more hf were encoded (no visible drop-outs; also, I didn't bother to look at r3mix).

So if you insist on having a higher lowpass just use it for all I care and still have the benefits aps offers over r3mix! No one forces you to stick to one particular lowpass. But be aware that it may have some negative side effects.
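
For anyone who wants to script this, a minimal Python sketch follows. It only assembles the command line quoted above and works out the relative bitrate change from the Winamp figures; the file names are hypothetical placeholders, and it assumes a `lame` binary that accepts `--alt-preset` and `--lowpass` as used in this thread.

```python
def build_lame_command(wav_in, mp3_out, lowpass_khz=None):
    """Return the argv list for LAME with --alt-preset standard.

    lowpass_khz is optional; when given, it is passed via --lowpass
    exactly as in the command line quoted in the post.
    """
    cmd = ["lame", "--alt-preset", "standard"]
    if lowpass_khz is not None:
        cmd += ["--lowpass", str(lowpass_khz)]
    cmd += [wav_in, mp3_out]
    return cmd


def relative_increase(before_kbps, after_kbps):
    """Bitrate change as a percentage of the original bitrate."""
    return 100.0 * (after_kbps - before_kbps) / before_kbps


if __name__ == "__main__":
    # "track.wav" / "track.mp3" are placeholder names, not from the thread.
    print(" ".join(build_lame_command("track.wav", "track.mp3", 19.5)))
    print(f"{relative_increase(212, 216):.1f}% more bitrate")
```

The list returned by `build_lame_command` can be handed to `subprocess.run` if LAME is installed; the sketch deliberately does not run the encoder itself.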
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-14 20:01:46
Quote
Originally posted by Pio2001
To make it short, I don't like to discuss sound quality analyzing graphs, Listening tests are way better in my opinion.


I wholeheartedly agree with this, and it's the very reason that I've never used graphs to judge quality in such a manner.  If you look at every method I've used along the way to improve the dm-presets, and then the --alt-presets, it's always been based on listening tests.  I of course understand that graphs cannot gauge relative quality.  But that wasn't my point here either.  tangent pretty much covered it I think... but basically it was just to state a fact, and that was the point about the lowpass 19.5 vs 19 being mostly irrelevant and misleading, and that it would be more appropriate to use --aps if one was concerned with high frequencies in any case.

Quote
I must say that, except for test samples, I don't hear the difference between r3mix, APS and the original on music chosen at random among well recorded CDs.


Of course you won't always hear a difference.. but it certainly is there.  It has been noted by myself and others on a very wide range of music.  I'd say that perhaps the areas where --r3mix fails very regularly though is in quieter music (ath is too high as you see) and electronic music (many other reasons).

Quote
I of course agree that APS is proven superior to r3mix, but I wouldn't draw any conclusion by myself based on graphs only, without having encoded and listened to the samples themselves.


Sure.  But I've already drawn all my conclusions from all of the vast amount of listening tests I've done and many other people have done in verification.  At this point I already know (and so do most people) what I stated, I didn't need to draw conclusions from it

Quote
I still remember your old blind test sample with all frequencies encoded up to 22000 Hz


Title: a response to a growing rumor...
Post by: fewtch on 2002-02-15 07:43:34
Quote
Originally posted by tangent
Really? Most of the time what they say is "My setting is better! Use '-v0 -q0 -ms -k --lowpass 22.05'. With this setting the sound is no longer muffled, stereo image is conserved, and it doesn't sound so bad with those artificial sine sweeps I create using CoolEdit/SoundForge"

I plead guilty to making such a statement before  ... but I am also a recent convert to the --alt-presets.  After doing some research (and listening tests) it was enough to convince me.

If it were possible to do, I would like to see a way to raise/lower the default lowpass, without affecting other frequencies too much (e.g. code level tweaks that would raise/lower the bitrate or tweak other parameters to compensate) -- maybe that would quiet some critics, and it could be useful in other respects.
Title: a response to a growing rumor...
Post by: cd-rw.org on 2002-02-15 09:28:14
Just by showing up again Roel has already done more for the audiophile MP3 community than some of the posters in this thread. He re-initiated the discussion about optimal LAME settings and that's always a good thing.

The imaginary "JS-collapsing" pops up every now and then. I think it's still mostly due to misunderstanding JS and the fact that mp3 scene people tend to promote stereo encoding. Usually the complaining users disappear when someone asks for an ABX.

To the APS lowpassing - it naturally is a trade-off. Lowpass is always a "lose some win some" situation.  But so far it has been a very successful trade-off, since APS has performed so well in people's tests, and I think it is the wrong end to start tweaking from - there's not much to fix there...
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-15 10:29:48
Quote
Originally posted by cd-rw.org
Just by showing up again Roel has already done more for the audiophile MP3 community than some of the posters in this thread. He re-initiated the discussion about optimal LAME settings and that's always a good thing.


There's only one problem.  The discussion he "re-initiated" appears now to be nothing more than to debate the points of the --alt-presets (or why they aren't right for --r3mix, etc, etc). 

Basically, we're now back to the same old situation as before with the --dm-presets.  Roel doesn't think the benefits are worth it because he can't hear them, and it doesn't matter how many other people do, it simply doesn't matter.  He'll continue to state --r3mix is CD-Quality (incorrectly) and continue to advocate, if nothing else, a technically inferior preset.

I think the nature of the discussion now is actually more harmful than useful because continuing down this course is pretty much useless judging by past trials (I'm sure you remember...).  Forward progress is what is necessary, and downplaying the merits of real improvements which cannot be denied serves no constructive purpose that I can see.

It would be nice to see some agreement from the other party on something which has been doing extremely well on all fronts and which has been proven multiple times to be superior, but I seriously doubt that is going to happen.  Instead, I fear that the discussion will regress back into 2 divided camps and nothing more.  It's a pity really, because in the end, it's the users that suffer.
Title: a response to a growing rumor...
Post by: johnicon on 2002-02-15 12:25:47
My $.02:
    I found the r3mix website in May of 2001.  For the first few months, I loved it.  I was reading every post I could, even posting myself, though not much posting (I was a newbie back then:) ).  But after around September, I lost interest because the forum was already bogged down in the same old (old by that time) issues about competing presets and command lines and what not.  I came here as soon as you guys opened the site and was (and still am) pleased with the "free thinking" in these forums.  Anyway, I think the point should be agreeing to disagree and pushing forward with improving things that we care about in the audio world as well as discussing new issues, instead of getting bogged down with no progress.
Title: a response to a growing rumor...
Post by: brosselle on 2002-02-15 13:46:27
You know, I kind of liken this to the old vinyl LP vs. CD debate. For a long time, a lot of audiophiles were so concerned about this new digital thing, and how the "essence" of the music would be lost.

Well, I'm glad to say that most of them came around finally. Just try to find a Led Zeppelin album at Best Buy. And, just like r3mix, some audiophiles never came around. But so what, who cares. Let them keep listening to their scratchy albums on $1000+ turntables.

The point is, given enough time, I think the --alt-presets will become the standard....the norm. The r3mix preset will still be there for those that want it, but I don't think anybody will, save for a select few. It will eventually die out on its own, due to it being an inferior technology. Evolution, ya know ;-)
Title: a response to a growing rumor...
Post by: fewtch on 2002-02-15 14:28:09
For some reason, I suspect Roel will "come around" too.  Dunno why, just intuition.  He can still have his "special" --R3mix switch with a tweaking or two he might want to add himself... when he finds a good one, he'll probably decide to use --alt-preset standard with a few tweaks  .

P.S... this "audiophile" uses LPAC for archival purposes (as in really backing up a music CD) and MP3 for general listening... that could potentially change to MPC if it catches on, but for now it's MP3 .
Title: a response to a growing rumor...
Post by: user on 2002-02-15 22:08:56
Hi,

during playing around with mp3gain, I found following:


- a  mp3 song as source encoded with --alt-preset extreme or dm xtreme, containing very much only stereo frames, but this is not important.


1.  maximize this song lossless ! with  mp3gain, so that it does not clip.
Re-Encode it eg. with alt-extreme

2. lower the gain of the source.
Reencode it with same preset.


Result after reencoding:

1. the maximized song will contain mostly joint stereo frames.
2. the same song with lossless lower gain/volume will contain less js frames, more stereo frames !

Looking with Encspot:
Both songs have after reencoding nearly same bitrate, nearly same bitrate distribution.
Of course only nearly, but nearly exact the same....
That would be well....

But: the distribution of js and ms is very different.

My interpretation:

In --alt-preset extreme (which I tested), all songs end up at a comparable average bitrate of around 256 kbit, independent of gain.
But this was achieved by a compromise in stereo/joint stereo frames.
Loud songs have much joint stereo,
quiet songs much more pure stereo frames.

Even if it is the very same song just changed with lossless gains...

Of course I understand why it is like it is:
the gain change is lossless for one mp3, but reencoding works via wave.
And the 2 resulting waves of same song in my test are very equal, but have different gain....

And encoding these waves is different, because of the ATH and so on. So if the quiet song gets encoded with alt extreme, which means a target bitrate of 256 kbit, and there is plenty of space for quiet songs, the encoder thinks: quiet parts are under the ATH....
so the encoder offers bitrate for stereo instead of js.

Otherwise
loud song:
relative ath is different, quiet parts of loud song are listenable...
so much bitrate is required....
so use of js instead of stereo,
I think, that's it.
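
The level argument in this post can be sketched numerically. This is only an illustration of the reasoning, not LAME's actual decision logic: the component levels and the fixed threshold are made-up numbers, and the one outside fact used is that mp3gain changes volume in steps of about 1.5 dB (a factor of 2^(1/4) in amplitude).

```python
import math

# One mp3gain step scales amplitude by 2**(1/4), i.e. about 1.505 dB.
MP3GAIN_STEP_DB = 20 * math.log10(2 ** 0.25)


def apply_gain_steps(levels_db, steps):
    """Shift every component level by a whole number of mp3gain steps."""
    return [lvl + steps * MP3GAIN_STEP_DB for lvl in levels_db]


def count_above(levels_db, threshold_db):
    """Count components that stay above a fixed (hypothetical) threshold."""
    return sum(1 for lvl in levels_db if lvl > threshold_db)


if __name__ == "__main__":
    levels = [-20.0, -35.0, -50.0, -62.0, -70.0]  # hypothetical levels in dB
    threshold = -60.0                             # hypothetical fixed threshold
    for steps in (0, -4, -8):
        shifted = apply_gain_steps(levels, steps)
        print(f"{steps:+d} steps: {count_above(shifted, threshold)} components above threshold")
```

With a threshold that does not move, turning the source down pushes more of the quiet components underneath it, which is the "quiet parts are under ath" idea described above.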
Title: a response to a growing rumor...
Post by: Dibrom on 2002-02-15 22:30:52
Quote
Originally posted by user
1.  maximize this song lossless ! with  mp3gain, so that it does not clip.
Re-Encode it eg. with alt-extreme

2. lower the gain of the source.
Reencode it with same preset.


Result after reencoding:

1. the maximized song will contain mostly joint stereo frames.
2. the same song with lossless lower gain/volume will contain less js frames, more stereo frames !

Looking with Encspot:
Both songs have after reencoding nearly same bitrate, nearly same bitrate distribution.
Of course only nearly, but nearly exact the same....
That would be well....

But: the distribution of js and ms is very different.

My interpretation:

In --alt-preset extreme (which I tested), all songs end up at a comparable average bitrate of around 256 kbit, independent of gain.
But this was achieved by a compromise in stereo/joint stereo frames.


Sigh...

Ok, I'll try to explain this again.

First, I'd like to ask you a question, user.  Are you hearing a deficiency in the stereo field?  Are you hearing an error that you can verify repeatedly?  If not, then why the incessant assertion that there must be something wrong because you don't think enough stereo frames are being used?

If you can actually hear a problem, I'm interested, if not, then you are rehashing this same issue needlessly once again.

Anyway...

There is one major flaw in your "interpretation".  That is the fact that you completely disregard (or simply do not acknowledge) psychoacoustic effects in your statement that joint stereo is compromised.  I've been seeing this a lot lately though.. people are making assumptions based on standard audio perception (and how they think things should work), without taking into account psychoacoustic effects.

It's a simple fact that the masking effect is more powerful at louder volumes.  If you'd like, you could easily test this with one of the many samples which have shown "dropouts" with LAME in the past.  What happens is that the louder you turn up the music, the less you can hear these dropouts.  This is because the masking effect is increased and the flaws are actually hidden. 

Now, in the case of dropouts, this is a bad thing.. and the --alt-presets solve that.  However, the principle is the same with joint stereo.  The --alt-presets have been tuned by ear, and verified by others, to coincide with what makes sense from a psychoacoustic point of view.  They are tuned to sound good while being as efficient as possible.  One of the ways the extreme preset is efficient is because it uses the athadjust to modify joint stereo thresholds in some cases.  This does not mean that it compromises the stereo image.  Again, this has been verified by others as well as myself via blind listening tests, not unsubstantiated assumptions that "there must be a problem because there are too few stereo frames".

So I'll state it once again.  Either you hear a problem, or you don't.  If you do, then let me know, otherwise stop worrying about it because for all intents and purposes the presets have been tuned to sound good, and that's what matters.
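
As background to why a joint stereo frame is not by itself a loss of stereo information, here is a small Python sketch of the mid/side transform that joint stereo in MP3 is built on. It is an illustration under stated assumptions, not LAME code: the sample values are made up, and the athadjust-based switching thresholds mentioned above are not modelled at all.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)


def to_mid_side(left, right):
    """Convert L/R samples to mid/side: M = (L+R)/sqrt(2), S = (L-R)/sqrt(2)."""
    mid = [(l + r) * _SCALE for l, r in zip(left, right)]
    side = [(l - r) * _SCALE for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Invert the transform: L = (M+S)/sqrt(2), R = (M-S)/sqrt(2)."""
    left = [(m + s) * _SCALE for m, s in zip(mid, side)]
    right = [(m - s) * _SCALE for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


if __name__ == "__main__":
    # Hypothetical, strongly correlated channels (typical of centred material).
    left = [0.50, 0.40, -0.30, 0.20]
    right = [0.48, 0.41, -0.28, 0.22]
    mid, side = to_mid_side(left, right)
    back_l, back_r = to_left_right(mid, side)
    print("side/mid energy ratio:", energy(side) / energy(mid))
    print("max round-trip error:",
          max(abs(a - b) for a, b in zip(left + right, back_l + back_r)))
```

The transform itself is reversible, so any change to the stereo image comes from how the mid and side channels are quantized, which is exactly the part that has to be judged by listening rather than by counting frame types.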
Title: a response to a growing rumor...
Post by: user on 2002-02-15 23:07:00
Thank you, now it is clear.

yeah, as I spoke in simple model, I forgot masking in louder songs.

You see, for me it was/is still amazing that you and perhaps other developers have managed to make the routines more flexible.

Some days before Christmas there was everytime the same distribution of js/ss frames, independent if song was loud or quiet.
But at that time the bitrate changed a lot corresponding to gain.

I assume that the overall resulting quality has improved through this behaviour, because now the quiet songs get more bitrate, and if they are played more loudly then artefacts will be minimized.