problem samples for WavPack lossy


Reply #25 – 2007-05-17 02:29:01

Well, these new realizations will point me in the direction of what kinds of tracks might potentially be the worst. I still think that from an overall practical standpoint, WavPack has been performing well. What I will do is slowly search for more real-life problem samples over the next couple months. If I find one that is as bad as I fear possible (requiring 500 to 550 kbps) then I may give up on WavPack. If I don't find one, then I will use WavPack occasionally but I will change my approach. Because theoretically WavPack can result in disaster on some songs where the high-freq amplitudes are too high, I can't consider it close to perfect anymore. It will be just another (very interesting, because it operates on different principles) lossy codec that must compete and perform similarly to MP3, etc, at comparable bitrate. I still think WavPack is better quality than MP3 at the same low bitrate (200 kbps to 256 kbps) on some songs (loud and noisy), for example.

So now I would prefer to encode at perhaps -b384x4s0 (compared to MP3 bitrate, this is still closer to 400 kbps because the WavPack ABR is not exact). If I need more than that for WavPack transparency then I will use MP3 instead for that song or album. I don't want to use settings like -b450 anymore because now I think WavPack doesn't deserve it.

The other side of this coin is that I should also check my MP3 encodings more carefully. If there is some song where there is a noticeable artifact at 320 kbps CBR (pre-echo or whatever) then I can use WavPack 400 kbps instead (or even 500 kbps, since in this rare case it's the best choice). But usually I'm quite satisfied with MP3 320 kbps CBR, at least in practice. So far I cannot ABX it against the original, but perhaps I just haven't learned how yet.


Reply #26 – 2007-05-17 10:40:50

Porcupine, I think you need to put some things into perspective. First of all, to me the quality advantages of hybrid encoders are better handling of postprocessing and transcoding, and full lossless restoration with the use of correction files. One can encode at reasonably high quality (300..400k) for PC use and transcoding, and burn the correction files to DVD. The other way is for Rockbox portable use. Also, it comes down to a preference of noise vs artifacts. I wouldn't normally class a wavpack hiss difference as a killer sample. It doesn't sound anything like a traditional mp3 / mpc killer.

I don't think there is a 100% solution for what you seek (at least not with current wavpack or the bitrates you wish to use). Since mid-high bitrate transform coders are normally transparent, there must be different reasons / a different philosophy for using hybrid encoders instead. I still think that Dualstream is ready as a replacement for transform encoding. Default parameters [VBR quality 3] result in transparent quality for nearly anything you wish to encode. Bitrate will be in line with 320k and encoding can be as fast as Vorbis Lancer.


Reply #27 – 2007-05-17 11:04:19

I think when looking for disasters you have to search within artificial (electronic) music.

The principle of a lossless based lossy encoder essentially involves the quality of the predictor. With the bitrates you consider the prediction error is coded with roughly 4 bits of accuracy, so in order that encodings are fine the predictor has to work pretty well. It usually is the case with natural music as with this the sample-to-sample-relation is pretty well predictable. It may happen that noise isn't masked well at a rather low bitrate but when going into the upper half of the 300...400 kbps range and especially when using higher quality settings the probability for noise being audible is so low that to me it's negligible (and it's a good attitude towards lossy codecs to allow for non-annoying issues in very rare cases which is especially easy to do as these errors normally just sound like noise).
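(A quick sanity check of the "roughly 4 bits" figure, as a sketch only. It assumes CD audio at 44.1 kHz with 2 channels and ignores container overhead; the function name is just for illustration.)

```python
# Average number of bits available per audio sample at a given lossy bitrate.
# Assumption: CD audio (44.1 kHz, 2 channels), no container overhead.

def bits_per_sample(bitrate_kbps, sample_rate_hz=44100, channels=2):
    """Return the average bits the encoder can spend on each sample."""
    samples_per_second = sample_rate_hz * channels
    return bitrate_kbps * 1000 / samples_per_second

for kbps in (300, 350, 384, 400, 480):
    print(f"{kbps} kbps -> {bits_per_sample(kbps):.2f} bits per sample")
```

Under these assumptions the 350...400 kbps range works out to about 4 to 4.5 bits per sample, against 16 bits for uncompressed CD audio.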

When it comes to electronic music, a possible solution may be to say goodbye to wavPack lossy for this genre.
As wavPack doesn't have real quality control (yet?) you might consider shadowking's proposal of OptimFrog Dualstream. To me too, quality 3 provides astonishingly robust quality. Moreover OptimFrog's predictor seems to be better suited to artificial music than wavPack's.

Of course there's nothing wrong with using a high quality mp3 setting. But more so than with wavPack, where you say wavPack doesn't deserve a 450 kbps or so usage, it's to me with mp3 CBR 320. mp3 is efficient at a lower bitrate and you can't expect to get an essential improvement when going from say 256 kbps to 320 kbps. Even 256 kbps is unnecessary overhead most of the time.

BTW the worst known problem samples for mp3 are in the electronic music genre too. Pre-echo is a general problem (though the different encoders behave differently) but most of the known problem samples are electronically produced. Worst known sample to me is eig. Maybe you'll never want to use mp3 again if you've heard an eig mp3 version.

The problem with the problem samples is that it's hard to decide on the practical implications. We all have a tendency for perfection (you have it very much, I probably have it also to a larger extent than is really sane).
But I've changed my attitude pretty much within the last year. eig for instance has no real influence on my choice of encoder and setting for the mere fact that I don't listen to such a music. I'm still interested in encoders' behavior on eig but that has nothing to do with my everyday practice. A pre-echo sample with more practical implication (to me) is castanets. But luckily I'm not very sensitive to pre-echo, and I can easily abx castanets only at low bitrates which I don't use. One day, when I was practicing abxing very intensively, I was able to abx castanets at 256 kbps, but I had a hard time, so castanets isn't a real problem to me when considering real life listening situations.

As for mp3 I personally would use quite some quality headroom, but the 250 kbps range is the maximum I would allow. mp3 doesn't deserve more. If things aren't fine at such a bitrate (which has a probability close to zero) it won't be at 320 kbps. BTW depending on bit reservoir usage strategy (restrictions on 320 kbps frames) CBR 256 can temporarily allow for a higher audio data bitrate than CBR 320.

As a safeguard against rare failure of the psy model where VBR makes things worse I personally would prefer ABR (with Lame) or CBR (otherwise). My current mp3 encodings were made with FhG CBR 192 (I would have preferred 224 or 256 kbps but I would have had to give up joint stereo). Anyway, quality is excellent to me (problem samples of course aren't, but they are acceptable).

Guess you just run into worse perfection trouble again when returning to mp3.
If you reconsider using transform codecs why don't you think of using AAC, Vorbis, or MPC? They are far better candidates to get at a close-to-perfection level than mp3 is.

But with your demand for perfection: why don't you do it this way:
Use wavPack 350...400 kbps with a high quality setting for natural music.
Play around for some time with it to heavily confirm your current experience that everything is fine (hope I don't misinterpret you).
Use lossless (wavPack, FLAC, TAK, Monkey, OptimFrog or whatever you like best) for any kind of electronic music.


Reply #28 – 2007-05-17 15:36:38

When mp3 stuffs up it just sounds plain wrong, and much, much worse than added hiss.

EIG is bad but this one is worse on 3.97:
It's ironic: listen to the vocals! BTW hybrids add a little hiss. Now compare the difference.

http://ff123.net/samples/SeriousTrouble.flac


Reply #29 – 2007-05-17 23:07:34

shadowking, yes there may not be a 100% solution for what I seek. Well, I never said I was expecting a perfect solution in the first place, I am just investigating a new option with WavPack. To be honest, WavPack exceeded my expectations greatly but it still has some problems.

Regarding the bitrate I choose, right now I say that I want to use -b384, but this wasn't always the case. Earlier I said to halb27 in PM that I was planning to use -b480 on everything (right before I discovered the disasters). It has to do with my philosophy. If I think that WavPack is robust and deserving I will give it more bits because I know I am getting extremely good, near-perfect quality. But now I know that there can theoretically be a disaster even at -b480 (try the stereo high-freq sine wave test at 100% amplitude to hear how disastrous, although it is unfair...but even that may not be the worst possible case). So that's why I am changing my approach and now prefer -b384, which is usually transparent anyway...and I will have to use my own ears/eyes (looking at spectra works well) to make sure I don't encode any disaster songs with WavPack.
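(For anyone who wants to reproduce a test of this kind, here is a minimal sketch that builds a full-scale stereo tone with a different high frequency in each channel. The exact frequencies used in the original test are not stated in this thread, so 17 kHz and 18.5 kHz are assumptions, as are the function names.)

```python
import math
import struct
import wave

def stereo_sine_frames(freq_left_hz, freq_right_hz, seconds=1.0,
                       sample_rate=44100, amplitude=1.0):
    """Return interleaved 16-bit PCM frames, one tone per channel."""
    peak = int(amplitude * 32767)  # 1.0 means 100% of full scale
    frames = bytearray()
    for n in range(int(seconds * sample_rate)):
        t = n / sample_rate
        left = int(round(peak * math.sin(2 * math.pi * freq_left_hz * t)))
        right = int(round(peak * math.sin(2 * math.pi * freq_right_hz * t)))
        frames += struct.pack("<hh", left, right)  # little-endian L, R
    return bytes(frames)

def write_wav(path, frames, sample_rate=44100):
    """Write interleaved 16-bit stereo frames to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

Something like `write_wav("hf_stereo_test.wav", stereo_sine_frames(17000, 18500))` would then produce a file to feed to the encoder. Keep the playback volume low when listening to it.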

BTW I haven't found any disaster songs so far. I don't need to encode to check anymore so I can search way faster, too. I just look at the spectra, it's obvious to me now what a disaster will be without having to actually encode to test. I need to find a high-freq tonal sample with large stereo separation (different instruments/notes in both channels) but the last part is the most difficult, it rarely occurs with real music (but there is no reason it cannot occur). If not, just invert the right channel of 'furious' and there's your practical disaster (I think, I didn't try it).

A good VBR mode would fix everything though. Because to me WavPack's main problem is lack of robustness/consistency. I have no innate preference for hiss vs artifacts. Maybe later on I will try OptimFrog since it uses VBR (but there's no guarantee the VBR works perfectly...do you know how it works? I would prefer a VBR mode that uses psychoacoustics to determine the audibility of the noise). But that could be much later, couple months away. Sorry I am so slow, I am the type of person who hates installing lots of software into a new computer all at once, and I still have plenty of things I could test with WavPack or MP3.
When you play back Optimfrog files, can you see the current bitrate in realtime as it is varying? How high/low does it vary between? (I would prefer that it must go very high on a problem sample) On the Optimfrog webpage I see VBR bitrate ranges for the various Quality settings, but are those the practical ranges for files that were produced (in which case they look fine to me), or the realtime temporary values the bitrate is allowed to take (in which case it's not enough variance).

BTW, I declared from the very beginning (but not to you, I think) that to me the lossless modes and hybrid correction files are useless (85% compression ratio, such as I've sometimes encountered, is not useful to me). I have no desire for them. However, that they exist does give me peace of mind that the codec is well-written and should be mostly or perfectly bug-free.

BTW, shadowking did you do your previous listening tests in -s0 or -s-0.5? Just curious.

Yes I should listen to a few mp3 problem samples too. I have been curious about 'serioustrouble' from a while back. But I don't have a FLAC decoder (and I'm not going to install one, it's useless software to me. I have WavPack now. I hate filling my computer with extraneous software). Sorry for being stubborn.


Reply #30 – 2007-05-18 00:19:39

> I think when looking for disasters you have to search within artificial (electronic) music.

Agree. Or semi-artificial. Most of the music I listen to is what I would call semi-artificial. All 3 samples I posted are semi-artificial, I think. Is that what you would call them, too?

> The principle of a lossless based lossy encoder essentially involves the quality of the predictor. With the bitrates you consider the prediction error is coded with roughly 4 bits of accuracy, so in order that encodings are fine the predictor has to work pretty well. It usually is the case with natural music...

Yes, I look at things the same way. Therefore the problem is with artificial and semi-artificial music when the high-freqs (which are impossible to predict) are too loud. The predictor only works on the low and middle-low frequencies. However, you can only hear the error if the song is tonal/soft...so the added requirement is that the high-freqs must be tonal.
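(A toy illustration of why loud high frequencies are hard on a simple predictor. It uses the crudest possible fixed predictor, "the next sample equals the previous one", which is an assumption made for illustration and far simpler than WavPack's adaptive prediction. A good adaptive predictor does much better on a pure tone, so this only shows how sample-to-sample differences grow with frequency.)

```python
import math

def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def residual_to_signal_ratio(freq_hz, sample_rate=44100, n=4410):
    """RMS of the first-order prediction error divided by RMS of the signal.

    Toy predictor: predicted x[k] = x[k - 1]. Not WavPack's algorithm.
    """
    x = [math.sin(2 * math.pi * freq_hz * k / sample_rate) for k in range(n)]
    residual = [x[k] - x[k - 1] for k in range(1, n)]
    return rms(residual) / rms(x[1:])

for f in (1000, 5000, 17000):
    print(f"{f} Hz: residual/signal RMS = {residual_to_signal_ratio(f):.2f}")
```

For a steady sine the ratio is 2*sin(pi*f/fs), so about 0.14 at 1 kHz but about 1.87 at 17 kHz: with this toy predictor the "error" at 17 kHz is larger than the signal itself.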

Regarding the noise being ignorable even in the rare case where it is audible...I have considered that also. It is hard for me to say. I need to train my ability to hear artifacts in 320 kbps MP3 better, otherwise even the slightest noticeable hiss in WavPack would be worse.

I'm curious about OptimFrog too. Well, I want to rest a couple months and still play with WavPack and MP3 before moving on to something new. Maybe I will try it sooner than expected (like I did with WavPack), though. It depends on my mood. I've been reading the Optimfrog webpage and documentation.

> But more so than with wavPack, where you say wavPack doesn't deserve a 450 kbps or so usage, it's to me with mp3 CBR 320. mp3 is efficient at a lower bitrate and you can't expect to get an essential improvement when going from say 256 kbps to 320 kbps. Even 256 kbps is unnecessary overhead most of the time.

I understand what you are trying to say. Although I have no ABX tests to prove anything, it's been my experience that 320 kbps MP3 is worth it. Most of my experience is with the Fraunhofer encoder at 256 kbps and below, and LAME at 320 kbps. Fraunhofer 256 kbps Stereo and 192 kbps Joint-Stereo both usually sound almost the same to me (BTW, I like Fraunhofer's Forced Joint Stereo implementation much more than LAME's Joint Stereo switching, which I think is foolish...but I doubt anyone else agrees with me). So in some sense, if I am paranoid of Joint-Stereo, then 256 kbps is necessary. 192 kbps Stereo is definitely not sufficient at many times. Adding upon that, Fraunhofer 256 kbps Stereo to me is fairly distinguishable from the original still (distinguishable extreme low and high freqs), while LAME 320 kbps Stereo is indistinguishable to me so far. It could just be that LAME is better than Fraunhofer, maybe LAME 256 kbps Stereo is indistinguishable to me too but I've never tried. But in theory, I think LAME needs 320 kbps because it encodes high-freqs much more carefully than Fraunhofer. So I personally am happy with 320 kbps mp3 and don't feel like I am wasting...but I have no intention of convincing others of my habits, this is just my personal feeling.

The main problem with WavPack is lack of consistency. I don't think MP3 can have a disaster as bad as WavPack can. I have never heard 'serioustrouble' though, is it really that bad (as 320 kbps Stereo mp3)?

> Maybe you'll never want to use mp3 again if you've heard an eig mp3 version.

Yes indeed! I am very afraid of that myself. If you'd like, you can send me Eig and SeriousTrouble (.WAV or .WV please) and I will listen. If it's bad enough, I will change my mind back to WavPack.

> eig for instance has no real influence on my choice of encoder and setting for the mere fact that I don't listen to such a music.

I share this viewpoint a little. But I have to be worried about WavPack because the disasters can potentially occur on the kind of music I like. My first problem sample was the 4th song I ever tried to encode (and one of my favorite songs in all the world, maybe you will laugh at me for that, but the rest of the song is a little different). My next two problem samples came from a CD where the whole CD is like that. So far, no real disasters but those were close calls. (And I feel a true disaster is not that unreasonable, a stereo version of 'furious' is definitely something I could encounter one day given the kind of music I listen to).

My favorite kind of music in all the world is like those problem samples I gave. That possibly-synthetic instrument in Track03entreaty, whatever it is called ('freshair' maybe), is my favorite instrument in the world. I wish I had more music of that type but I don't know where to find it. All those songs came from TV shows I watched; that is how I find all my music.

> As for mp3 I personally would use quite some quality headroom, but the 250 kbps range is the maximum I would allow. mp3 doesn't deserve more.

I understand what you mean. But for me, the high freqs (perhaps not 20+ kHz, but definitely 16 - 20 kHz) are critically important (even though my left ear cannot hear them well anymore, which makes me sad). I won't even listen to music without high freqs. After I damaged my hearing 2 years ago, I did not listen to any music for an entire year (except for on the TV) because I was sad and music did not sound good to me anymore.

Fraunhofer doesn't deserve more than 256 kbps because it does not encode frequencies above 16 kHz well. It encodes them all up to 21 kHz, but with terrible quantization that is easily audible (I claim, but I have no ABX test so I will retract if you want me to. Also, maybe I cannot ABX well anymore because I hurt my ears, but before I'm sure I could). I laugh at Fraunhofer 320 kbps mp3s; they are wasting bitrate. But LAME I think deserves it. But I would agree that LAME 400 kbps would not deserve it, if it were normally possible.

> If things aren't fine at such a bitrate (which has a probability close to zero) it won't be at 320 kbps. BTW depending on bit reservoir usage strategy (restrictions on 320 kbps frames) CBR 256 can temporarily allow for a higher audio data bitrate than CBR 320.

I don't know about LAME 256 compared to LAME 320, but to me Fraunhofer 256 to LAME 320 there was a significant difference. I have wondered if LAME 256 can have higher bitrate than LAME 320 also, since only LAME 256 has bit reservoir. But I am guessing that if LAME turns off bit reservoir at 320 kbps, it will still not allow > 320 kbps frames even at 256 CBR (although the bit reservoir could be very full at most times). But I am just guessing that. But as for Fraunhofer, who knows? But to me the Fraunhofer 256 kbps is inferior to LAME anyways (BTW, I still like Fraunhofer...I think it produces better quality overall than LAME at 192 kbps Joint-Stereo and lower...but that's just my personal feeling).

I am 100% in agreement with you that LAME VBR is not a good idea. There are too many potential problems with it. One is that the M/S frames are overcoded by 20% bits compared to the L/R frames (if it's true, which the recent MP3 thread and several of my tests seem to suggest)...that is very dumb to me. I usually use CBR. Sometimes I even use 256 CBR with LAME but only when the song is very simple (like a piano or orchestra, very easy for mp3 but above average difficulty for WavPack). I used LAME VBR for some tracks I encoded with just people talking though (conversations). For such things VBR is ideal because when the people stop talking the bitrate drops to 32 kbps.

> My current mp3 encodings were made with FhG CBR 192

Yeah I knew. I like FhG 192 kbps Joint-Stereo. I think theirs is the best at that setting. I made a reasonable number of such files long ago when I had restricted HD space (but I quickly moved on to FhG 256 Stereo when I had more space....I like Stereo better, but I agree FhG (forced) Joint-Stereo is good. LAME Joint-Stereo is stupid to me because switching to L/R frames to me defeats the point of Joint-Stereo, philosophically. But other people don't agree.)

> Use wavPack 350...400 kbps with a high quality setting for natural music....Use lossless (wavPack, FLAC, TAK, Monkey, OptimFrog or whatever you like best) for any kind of electronic music.

Lossless is not an option to me; I would rather use WAV. Also, many kinds of electronic music compress great with WavPack too (nasty Rock music compresses at 200 kbps transparently to me). The good thing about me investigating WavPack carefully is that now I can tell what the problem songs are without even having to test encode. So it's easier for me to choose something else when WavPack lossy is in danger, as you said. I may continue to test to see what gives WavPack trouble, though. Right now I know most of it, but I still don't know what gives the switches (-x4, -h, etc) trouble....the switches are very useful on a 17 kHz sine wave, so how come they're not as useful on real music?


Reply #31 – 2007-05-18 21:26:32

Quote from: Porcupine on 2007-05-18 00:19:39

.. So I personally am happy with 320 kbps mp3 and don't feel like I am wasting...
...Yes indeed! I am very afraid of that myself. If you'd like, you can send me Eig and SeriousTrouble (.WAV or .WV please) and I will listen. ...
...My first problem sample was the 4th song I ever tried to encode (and one of my favorite songs in all the world, maybe you will laugh at me for that, but the rest of the song is a little different). My next two problem samples came from a CD where the whole CD is like that. So far, no real disasters but those were close calls....
...Right now I know most of it, but I still don't know what gives the switches (-x4, -h, etc) trouble....the switches are very useful on a 17 kHz sine wave, so how come they're not as useful on real music?

I just gave you my thoughts on mp3 bitrate, but everything's fine if you prefer Lame CBR 320.
I'll give you eig and SeriousTrouble in wav form the way I did with the wavPack samples. But though I really understand being restrictive about installing software, I suggest you give foobar a try. It makes encoding and decoding (and both things together) so much easier. FLAC decoding is directly available after installing foobar. And you get more benefits like replay gain support, tag support, etc. etc. (which you may prefer to take care of later). The GUI is something I didn't like much in the beginning, but once you're acquainted with it, it's fine.

Yeah, I admire your ability to find out weaknesses of wavPack so quickly. But IIRC when using the upper part of the 300...400 kbps range the tracks from your collection were fine at least when using high quality settings.

As for the switches, to me it's rather simple. As long as you don't care about encoding or decoding time, prefer -hh over -h, and -h over -b. With regular music a higher quality setting allows you to lower the bitrate or alternatively gives you more quality headroom for problematic samples (though it might not help in specific cases). The -x settings up to -x3 will improve quality in many cases, and even -x3 doesn't slow the encoding procedure down a lot for my taste, but the higher you go the more painful encoding time becomes, and it rarely provides an improvement with regular music. It can have a significant effect with problem samples however.

In the end for PC use and a rather fast PC like your new one decoding effort will presumably play no role so from the decoding side you can use the highest quality setting -hh. The -x switch has no influence on decoding effort anyway.
For the encoding side it's just up to you what you are willing to allow for encoding time. Formerly I used -hhx5. Encoding was real slow on my rather old machine, but I didn't care much as I encoded when I was asleep. But with -hh and a bitrate close to 400 kbps -x5 isn't really necessary, and you may prefer -x4 or -x3 for the sake of encoding speed.
With -h it's similar. With -b (normal mode) I personally would use a high -x setting like -x5.
I am about to use -f to take load off my DAP's CPU, and with it I even use -x6 right now because encoding is fast even with -x6. But I just do it because it's the best I can do and because I can easily afford it, though I don't expect to get an essential improvement from going from -x5 to -x6.
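(To make the switch combinations discussed above concrete, here is a small sketch that assembles a command line from them. Only switches that appear in this thread are used: -b, -f, -h, -hh, -x and -s. The executable name `wavpack`, the file name and the helper function itself are assumptions for illustration.)

```python
import shlex

def wavpack_lossy_command(infile, bitrate_kbps=384, mode="", extra=4,
                          noise_shaping=None):
    """Build an argument list for a WavPack lossy encode (hypothetical helper).

    mode: "" (normal), "f", "h" or "hh"; extra: the -x level (0 = omit);
    noise_shaping: value for -s, or None to leave the switch out.
    """
    if mode not in ("", "f", "h", "hh"):
        raise ValueError(f"unknown mode: {mode!r}")
    args = ["wavpack", f"-b{bitrate_kbps}"]
    if mode:
        args.append(f"-{mode}")
    if extra:
        args.append(f"-x{extra}")
    if noise_shaping is not None:
        args.append(f"-s{noise_shaping}")
    args.append(infile)
    return args

# Porcupine's preferred settings, then a slower high-quality variant
print(shlex.join(wavpack_lossy_command("song.wav", noise_shaping=0)))
print(shlex.join(wavpack_lossy_command("song.wav", 400, mode="hh", extra=3)))
```

The resulting list could be handed to `subprocess.run` for batch encoding overnight, as described above.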


Reply #32 – 2007-05-19 03:54:21

I still have much testing to do, but I'm considering bitrates of up to -b480 again with WavPack. Hard to say, that will probably be the last decision I make, between 384 and 480 most likely.

I think I've already decided on my preferred quality switches: -x4s0. Like you said, -x5 and -x6 hardly ever do anything (maybe they work better with -f, though, but I won't use -f), and even when they do it is next to insignificant. -h and -hh can be good, but they often work badly together with the -x switches. -hx4 is usually an improvement over -x4 alone but even that's not guaranteed, so I'd rather use -x4 alone. Encoding time is not a factor to me but decoding time is, slightly: if I choose to play WavPack files on my old PC (200 MHz) I don't want it to feel a load. If -hx4 and -hhx4 were guaranteed to benefit over -x4 on everything I would use them, but sometimes they don't, so I'm not willing to sacrifice something (decoding speed) for a "maybe."

I've also been testing the -j0 switch a lot. There are several benefits and drawbacks to WavPack-style Joint-stereo that I've noticed. But I think in general WavPack benefits greatly from Joint-Stereo, and it's essential to prevent 98% of disasters from occurring (if you use -j0, "furious" is not acceptable). Also the smart Joint-stereo switching, activated with any -x# switch, is pretty smart; I did a few tests on it and was very pleased with how intelligently it switches. Also WavPack Joint-stereo can benefit on many samples that MP3 cannot benefit on...because after the prediction algorithm finishes, the left and right channels are more similar than the original version...as long as just the high-freqs are fairly mono then WavPack Joint-stereo benefits, while MP3 requires that the whole signal be fairly mono to get a large benefit.
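(A generic mid/side sketch to show why joint stereo helps when the two channels are similar. This is the textbook transform, not a claim about how WavPack implements it internally, and the sample values are made up for illustration.)

```python
def mid_side(left, right):
    """Generic mid/side transform: mid = (L + R) / 2, side = L - R."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def left_right(mid, side):
    """Inverse transform: recover L and R from mid and side."""
    left = [m + s / 2 for m, s in zip(mid, side)]
    right = [m - s / 2 for m, s in zip(mid, side)]
    return left, right

def energy(samples):
    """Sum of squares, a rough stand-in for how costly a channel is to code."""
    return sum(s * s for s in samples)

# Made-up samples: both channels share a loud component and differ slightly
left = [100, -80, 60, -40, 20]
right = [98, -83, 61, -38, 19]
mid, side = mid_side(left, right)
print("energy of right channel:", energy(right))
print("energy of side channel: ", energy(side))
```

When the channels are nearly identical the side channel carries very little energy, so most of the bits can go to the mid channel. When the channels are unrelated, the side channel is as expensive as either original channel and nothing is gained.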


Reply #33 – 2007-05-19 04:58:39

For robust quality I'd use -s0hhx as you would have fast encoding and still respectable decoding. In theory more complex decoding = better quality. It might not be that apparent in normal music, but I am sure that on artificial waves etc there will be a difference. IMO the high x values aren't a practical solution.

I still don't understand what is wrong with your high bitrate mp3 and your decision to go this route. Also I don't see any abx tests, so it's hard to say what your transparency thresholds are for mp3 or even wavpack.


Reply #34 – 2007-05-20 03:17:46

I have no problems with high bitrate mp3. It is halb27 who is trying to convince me to switch from high bitrate mp3 to high bitrate WavPack.

My transparency thresholds for mp3 are indeed such that 320 kbps is transparent to me on everything I've ever heard (until today...Eig). So I'm not complaining about mp3 at all, in regards to the sound quality I can perceive. I do complain a lot about LAME for being disorganized, having terrible documentation, not doing what it says it is doing, etc, but that is different.


Reply #35 – 2007-05-20 03:48:09

halb, thanks for sending me 'Eig'. Unfortunately, I think that sample had the reverse of the effect you intended. After hearing it, I was very impressed by mp3 and feel more encouraged to use mp3 over WavPack than ever before.

I didn't hear anything wrong at all with 'SeriousTrouble.' Even at 128 kbps Joint-Stereo it sounds fairly transparent to the original to me, on LAME 3.95. At 128 kbps Stereo there were obvious distortions, but this is typical performance for most average music samples in the genres I like (semi-contemporary, but not obnoxious like rock music). At higher bitrates 'SeriousTrouble' is flawless to me. Perhaps I don't know what to listen for in this sample?

'Eig' on the other hand was much more interesting. When I first heard it (the original) I nearly fell out of my chair. Eig is basically machinegun noises. The low/medium tones don't cause any artifacts but they serve to break up the Mono-ness of this sample. Without them, mp3 could use Joint-Stereo. Those stereo notes make LAME use 100% L/R frames at 320 kbps (I checked) and mostly L/R frames in other modes, which makes the machinegun noises, the real problem, harder to encode.

I listened to this sample for several hours. Mostly I only listen to the first 4 seconds, that is enough for me to hear the problems. The loud noises that appear later in the sample are interesting but they aren't the real problem, as far as I could tell. 'Eig' is basically the most severe pre-echo test one could find. I did all my encodings with -m s to make all results consistent, even at lower bitrates (which would otherwise force Joint-Stereo frames and cheat on this).

No official ABX tests on anything that follows (if for some reason I say something extremely remarkable and shocking to others, I can ABX myself but I'll only do so if I feel there is a point), but all the differences are fairly obvious to me (where I said there were differences), so I don't feel ABX tests are necessary. When 2 things sound similar to me I just say 'transparent' but that's being conservative (possibly still ABXable with difficulty). I'm not trying to prove anything to others, only to myself. Since I could ABX pretty much most of the following mp3 versions from each other, I can only use my own judgement about what sounds "worse" and what sounds "better" (you cannot prove what is worse and better if both can be ABXed from the original and each other, it's subjective).

Like I said before, 'Eig' is a pre-echo test of mp3. The machinegun noises turn mushier, more like typewriter noises, due to the pre-echo induced by mp3. My typical LAME 3.95 encoding parameters are -k --noath, which is very unusual. I also tested without those parameters sometimes as well. BTW, 'Eig' is the first sample I've ever heard which is not transparent to me as 320 kbps mp3, and it's also the first sample where I heard a clear difference between my LAME 3.95 -k --noath and normal LAME 3.95 without screwy parameters (other than -m s, on everything below).

LAME 3.95 -k --noath 128 kbps: terrible pre-echo
LAME 3.95 -k --noath 128 kbps --allshort: exactly the same as above (not ABXable I think)
LAME 3.95 -k --noath 320 kbps: MUCH better than 128 kbps, proving that increased bitrate helps pre-echo, still somewhat different from original WAV
LAME 3.95 -k --noath 320 kbps --allshort: identical to above
LAME 3.95 -k --noath 320 kbps --noshort: sounds terrible, as bad as the 128 kbps versions
LAME 3.95 -k --noath 480 kbps (freeformat): identical to original! as far as I could tell
LAME 3.95 320 kbps: sounds identical to the -k --noath 320 version most of the time, but the very last two gun sounds at 4s sound quite different, it's WORSE with the -k --noath.
LAME 3.95 192 kbps: sounds worse than the -k --noath 320 version most of the time, but sounds better on the last two gun fires again (I could be wrong on this, a difficult and subjective test...ABX testing won't help unless I chop only the last 2 gunfires and break up the test into parts).
LAME 3.95 128 kbps: just sounds bad again everywhere, can't really compare objectively with the -k --noath 320 kbps anymore, sounds too different at all times
LAME 3.92 320 kbps: identical to the original!
LAME 3.92 -k --noath 320 kbps: identical to the original! (btw, --noath doesn't do anything on "old" LAME 3.92, tested elsewhere, I just put it for consistency)
LAME 3.92 320 kbps --noshort: sounds terrible, as expected
LAME 3.92 320 kbps --allshort: surprisingly, sounds terrible also!!
LAME 3.92 128 kbps: sounds much worse than even LAME 3.95 128 kbps
LAME 3.92 192 kbps: sounds much worse than LAME 3.95 192 kbps

Reply #36 – 2007-05-20 04:13:13

A number of things about those results are quite confusing but in the end I think I can explain most of them.

Tested elsewhere than here: LAME 3.92 still has a bit reservoir at 320 kbps CBR, while LAME 3.95 does not. Previously I did not know whether that meant LAME 3.92 has no frame size cap (LAME 3.95 caps the frame size at 320 kbps). All I knew was that the bit reservoir was active (I tested in a strange, unrelated way). However, this test now strongly suggests (or proves) that LAME 3.92 has no frame size cap....in practice probably a 480 kbps maximum framesize (for 320 kbps CBR) due to the 511 bytes maximum bit reservoir size. This explains why LAME 3.92 320 kbps CBR is far better than LAME 3.95 320 kbps CBR, and is as good as LAME 3.95 480 kbps CBR. (BTW, LAME 3.97 supposedly has a bit reservoir again at 320 kbps, but LAME 3.98 supposedly does "not", at least effectively). This also explains why LAME 3.92 is worse than LAME 3.95 at lower bitrates.

Eig is a very "unfair" sample regarding bit reservoir because it is machine gun noises with silence in between. Therefore the bit reservoir is always full because it builds up completely in 1 frame after the gunshot, due to the silence. In real music the bit reservoir may not be as amazing all the time.

It's strange that LAME 3.92 pre-echo performance degenerated when I added --allshort. That should improve pre-echo performance or leave it the same. My explanation is that --allshort makes LAME 3.92 be confused or stupid, and NOT build the bit reservoir, but that may not be the correct explanation. Anyways, for whatever odd reason, --allshort clearly hurts the pre-echo performance of LAME 3.92 (not true for LAME 3.95).

Adding --noath -k to LAME 3.95 seems to worsen the pre-echo on the last 2 shots only, I don't know why. The high-freq spectrum is also weird for only those 2 shots (I can "see" the same thing I can "hear"). LAME 3.92 is also bad at 192 kbps on those last 2 shots, regardless of whether -k --noath or no switches are specified. And they sound bad in the same way: there is a high-pitched pre-echo "puff" that affects only those last 2 shots (it also affects the 2 previous to that to a lesser degree, the other shots are not affected, I don't know why). At first I thought that this test proves that --noath -k is harmful to my LAME 3.95 encodings, but after thinking more I am not so certain. The switches actually make LAME 3.95 sound like LAME 3.92 (with normal switches, or just -k). I think it might be LAME 3.95 without switches (using the questionable ATH system) that is "cheating" in a sense, although in this case the cheating may be helpful in the end.

I also made quite a few lowpassed versions of mp3s, at various bitrates, and the lowpassed versions usually have less pre-echo...up to a point. If you lowpass too much it sounds bad again. --lowpass 12000 -b320 with LAME 3.95 was terrible.

The --noath -k caused high-freq amplitudes to appear on the last 2 puffs which are clearly audible to me (but the amplitudes were not all above 20 kHz, it is strange but other amplitudes around 17 kHz suddenly appeared too). In this case, having superhigh-freq amplitudes with insufficient bitrate to encode them differed more from the original than just having highfreq-filtered silence (either that or LAME has bugs regarding high-freq encoding). But they were audible, and if they were supposed to be there then they should be there I guess (I can't hear them in the original, though, but both LAME 3.95 and LAME 3.92 say there is supposed to be some high-freqs there for whatever reason...and also the problem goes away at 480 kbps when the high-freqs are encoded with less quantization, even though the high-freqs still show on a spectrum).

Overall though I'm impressed with how well mp3 handles 'Eig'. It's mainly because I theorize (by looking at both the time-domain waveform and Fourier-transform waveform of Eig) that no worse signal can possibly be given to mp3. 'Eig' is a bit like a delta-function sample...the worst case for a freq-domain encoder. Whereas high-freq sine wave is close to the worst for a time-domain encoder like WavPack. So comparing mp3's performance on 'Eig' to WavPack's performance on the '17 kHz disaster' seems fair to me. And I was shocked how well mp3 did on 'Eig.' Even though I'd never heard this kind of sample before, when I heard it, I predicted a bigger disaster for mp3 than it was in the end. I never expected 480 kbps mp3 to be transparent (or close). In contrast, WavPack needs around 800 kbps on the stereo 17/17.5 kHz disaster.
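
To illustrate the delta-function argument above, here is a minimal sketch. It uses a plain DFT as a stand-in for the MDCT that mp3 actually uses, and the block length and sample positions are hypothetical values chosen only for illustration. It counts how many transform bins carry significant energy for a single click versus a steady tone.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Return the magnitude of each DFT bin of a real-valued block."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]

def significant_bins(signal, threshold=0.01):
    """Count bins holding more than `threshold` times the largest bin magnitude."""
    mags = dft_magnitudes(signal)
    peak = max(mags)
    return sum(1 for m in mags if m > threshold * peak)

BLOCK = 64  # hypothetical block length, for illustration only

# A single click in an otherwise silent block (an idealized gunshot).
impulse = [0.0] * BLOCK
impulse[20] = 1.0

# A steady tone that fits exactly 5 cycles into the block.
tone = [math.sin(2 * math.pi * 5 * t / BLOCK) for t in range(BLOCK)]

print("impulse: significant bins =", significant_bins(impulse))
print("tone:    significant bins =", significant_bins(tone))
```

The click needs every bin to be coded, while the tone needs only a bin and its mirror image, which is roughly why a transform codec has nothing to "throw away" on a sample like Eig.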

Reply #37 – 2007-05-20 09:10:13

I did a couple more tests and I'm angry at LAME again for doing so many undocumented and (in this case) stupid things.

I tested Eig with LAME 3.95 -k --noath 256 kbps and the result seems superior to LAME 3.95 -k --noath 320 kbps. I am not 100% certain of this though, it was reasonably subtle so I admit an ABX test is necessary if I want to prove this claim to others.
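
For reference, a rough sketch of how an ABX result could be scored, assuming each trial is an independent 50/50 guess under the null hypothesis. The function name and the 12-of-16 example are hypothetical, not results from this thread.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` by pure guessing."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: 12 correct out of 16 trials.
print(f"{abx_p_value(12, 16):.4f}")
```

A value below the usual 0.05 threshold would suggest the difference was really heard rather than guessed.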

But assuming that's correct, it means that halb27 was correct before when he said that a 256 kbps mp3 can be better quality than a 320 kbps mp3 sometimes, due to the bit reservoir. I had incorrectly assumed earlier that LAME would be smart enough to cap off the maximum framesize to correspond to a 320 kbps frame...even if there were a bit reservoir. But if I believe my ears now, then that's wrong. LAME is stupid and a 256 kbps mp3 can be better quality than 320 kbps mp3 sometimes, like halb27 said. That makes me angry, I can't believe how poorly documented, and in this specific case poorly thought-out by the developers, LAME is.

I also tested the above hypothesis with an entirely different sample and method, with the same conclusion. My other sample was a typical song where there's no way I can ABX an audible difference between 256 kbps and 320 kbps (I can only do so for Eig because it's extreme). But in the other sample I can instead look at a spectrogram, and I believe I see (not conclusive, but suggestive) evidence that the 320 kbps version is more quantized at times (when transients are struck and short blocks trigger). For Eig, looking at a spectrogram was inconclusive; it was only by hearing that the difference was noticeable.

I also tried LAME 3.95 -k --noath -b256 --nores, to turn off the reservoir, and the result is bad again. I tried to test LAME 3.95 192 kbps vs 320 kbps again...but it is pointless because LAME does too many different things at 192 kbps (extra quantization routines of high-freqs that cannot be turned off), so trying to ABX a difference between the two doesn't prove anything. I wrote my thoughts on LAME 3.95 192 kbps in the earlier post and stick to them. But that doesn't mean anything because 192 kbps is too different.

Reply #38 – 2007-05-20 10:37:15

Quote from: Porcupine on 2007-05-20 03:17:46

It is halb27 who is trying to convince me to switch from high bitrate mp3 to high bitrate WavPack.

Correct, but when I did, I wanted to draw you away from these -k, -ath etc. theoretical considerations, which IMO are pretty useless. I hoped WavPack lossy would bring you peace of mind. But you're a champion at finding WavPack problems (and also a champion at being pessimistic enough to fear that 800 kbps may be necessary for WavPack).

There's absolutely nothing wrong with using mp3 and looking for an encoder which is best for your purpose.

Maybe your Eig experience has shown you that Lame's default settings, like a slight lowpass, have a positive effect overall. That's why each mp3 encoder does lowpassing AFAIK. It's especially important with mp3, and Lame does it very well IMO.
So hopefully you have more trust in what encoder developers usually do.

As for the restricted bit reservoir usage of 320 kbps frames: though I personally don't like it either, the devs do it for a reason. The mp3 standard isn't very clear on this point. The standard can be interpreted in accordance with the restriction. And in practice, Fraunhofer decoders installed on Windows systems do behave according to these restrictions, so it's a responsible thing that the devs want the mp3s from their encoders to be playable without problems on every Joe's Windows machine. And it's rare that mp3 can benefit from such a high audio bitrate. Maybe it's best to have this behavior as a default, but have a switch to allow 320 kbps frames to make full use of the bit reservoir.

Everything's fine (with mp3, wavPack, and whatever you like): you just can't have everything at perfection.
The devs don't do stupid things. Of course perfection is out of their reach too.

Quite interesting that 3.92 CBR 320 came out so fine with Eig. Guess I'll try 3.92. With 3.90.3 (which I had supposed to be more or less identical to 3.92) things weren't so fine. But anyway Eig is just a sample to see how bad mp3 can be, and it's not very important for usual listening practice (certainly different for lovers of such machine gun music).

ADDED:

Yeah, just tried 3.92 --alt-preset cbr 320 on eig: it's astonishingly good and isn't a problem (to me) in practical listening situations.
As for SeriousTrouble BTW to me it's not a problem either at high bitrate (using 3.97 as suggested by shadowking).

So why don't you use 3.92 CBR 320? You may also give 3.98b1 a try. It has overcome serious (to me) problems 3.97 had.
Don't care so much about rather unimportant details you can't seriously improve upon anyway. Decide on an encoder (an old one may be fine) and enjoy the music.

Reply #39 – 2007-05-22 03:14:12

Yeah, I will probably do my future encodings in LAME 3.92, and maybe eventually also go back and re-encode my LAME 3.95 stuff in 3.92.

I think (not sure, and definitely not rigorously tested) that "nspsytune" introduced in LAME 3.94/3.95 is better than "gpsycho", though. Newer LAME clearly outperformed older LAME to me on the Eig test, but only when the bit reservoir was active on both. If bit reservoir was active on LAME 3.95 320 kbps, the result should be better than LAME 3.92. I wish there had been a switch to turn it on, like you said. Supposedly LAME 3.97 has an unrestricted bit reservoir at 320 kbps, maybe it would do the best on 'Eig'? On my computer I currently have only 3.92 and 3.95 installed, I erased 3.97, too lazy to get it again. Earlier I had tried 3.98 alpha 11 as well, long ago (but documentation says the bit reservoir is capped at 320 kbps again, so I don't see a big incentive for me to use it).

I could use --nspsytune with LAME 3.92, that is another option. But I am afraid of using an older (non-default) version of nspsytune, perhaps it still has problems that I wouldn't discover until it was too late.

I don't have any problems with a 320 kbps maximum frame size either, because it is more compliant with the standard. The only thing that makes me angry at LAME 3.95/3.96 is that (if my tests are correct) the 256 kbps CBR files do not comply with the standard at all. What is the point in making my 320 kbps files compliant (and sacrificing a little quality) if the 256 kbps is still non-compliant? Some of my albums I encoded in mixed 320/256/224/VBR (depending on the song, I check to see how hard it is to encode, but usually I encode at 320). At the very least, LAME should be consistent, but it seems like it is not.

BTW, I listened to Eig more carefully and even in LAME 3.92 320 kbps CBR it's not perfectly transparent yet, but very close I think. I think LAME 3.95 480 kbps CBR is even better, but that is extremely non-standard, I can't even play back such files at this point in time (I had to use LAME to decode it, and even LAME had some bugs decoding the same file which it created).

Also, I listened some more, and the difference between -k --noath and default parameters on Eig for LAME 3.95 is not as large as I initially wrote. I probably need to ABX myself to be sure I didn't imagine the difference, but I think I was correct.

Reply #40 – 2007-05-22 15:39:49

Using --alt-preset xxx, especially --alt-preset insane for CBR 320 use, makes Lame use nspsytune from 3.90 on.

...if the 256 kbps is still non-compliant? ...

AFAIK there are no issues with 256 kbps frames. It's only for 320 kbps frames that the documentation of the mp3 standard is unclear/strange, which unfortunately made the FhG developers produce a rather stupidly restricted decoder, which even more unfortunately is used as kind of a standard on Windows machines.

Reply #41 – 2007-05-22 22:58:47

Thanks, I didn't know that the --alt-presets used nspsytune in the LAME 3.90~3.93 series. I will need to tinker with LAME 3.92. BTW, I have 3.92 not because I think 3.92 is better than 3.90 or 3.93, it just happened to be the old version I found first. Like you, I expected it would be the same as 3.90~3.93, but who knows.

There are no issues with 256 kbps frames in LAME 3.94~3.96. The issue is with 256 kbps CBR. Encodings made at 256 kbps CBR have the bit reservoir on and the framesize can grow as large as 256 + 156 = 412 kbps frames. I would guess that a 412 kbps frame is not compliant. I've never had any compatibility problems using Winamp, but maybe if I used a restricted decoder then it might have problems with my 256 CBR files I have made with LAME 3.95 (but not my 320 CBR files, which have no bit reservoir). That's the inconsistency which I don't like.
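
The 256 + 156 = 412 figure can be checked with a small calculation. This is only a sketch: it takes the 511-byte reservoir limit quoted earlier in the thread, the 1152 samples per frame of MPEG-1 Layer III, and assumes 44.1 kHz audio (the thread does not state the sample rate). The function names are made up for illustration.

```python
SAMPLES_PER_FRAME = 1152    # MPEG-1 Layer III frame length in samples
MAX_RESERVOIR_BYTES = 511   # reservoir limit as quoted in the thread

def reservoir_kbps(sample_rate=44100):
    """Extra bitrate one frame gains if it drains a completely full reservoir."""
    frames_per_second = sample_rate / SAMPLES_PER_FRAME
    return MAX_RESERVOIR_BYTES * 8 * frames_per_second / 1000

def peak_frame_kbps(nominal_kbps, sample_rate=44100):
    """Largest effective frame bitrate for a given CBR setting."""
    return nominal_kbps + reservoir_kbps(sample_rate)

for nominal in (256, 320):
    print(nominal, "kbps CBR ->", round(peak_frame_kbps(nominal), 1), "kbps peak frame")
```

Under these assumptions the reservoir is worth about 156 kbps per frame, giving roughly 412 kbps for 256 kbps CBR and roughly 476 kbps (the "480" mentioned earlier) for an unrestricted 320 kbps CBR.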

When I tested Eig, a LAME 3.95 256 kbps CBR file was better quality than LAME 3.95 320 kbps CBR, due to the bit reservoir difference...my 256 kbps version essentially being like a 412 kbps mp3 file for Eig.

At least LAME 3.92, 3.97, and 3.98 are consistent. LAME 3.94~3.96 is not consistent. For LAME 3.92 the bit reservoir is always on and always unrestricted, and always non-compliant. This is a consistent treatment so it's good. LAME 3.97 claims the same in the documentation (I did not test). LAME 3.98 documentation claims to always have bit reservoir on, but restrict the size of each frame so it's not bigger than 320 kbps + sideinfo, so that's good too (consistent).

Reply #42 – 2007-05-29 09:09:59

I decided to do some intense mp3 testing with real music and some HA samples. So far wavpack, like mpc, is more stable to me than mp3. I am interested in a hifi-portable mp3 solution so I don't have to transcode.

-V5 ~ -V4: After a while nothing is transparent. Artifacts / pre-echo are abxable on lots of things, some of which are annoying. Problem samples are annoying. -V4 shows no improvement with hardcore samples and marginal improvement on less problematic cases.

- V3 : There is a quality boost on some material. Some problems solved, but somehow NEW problems are introduced that are not present at lower presets. Problem samples show some improvement over -V5, still annoying at times.

- V2 : Another quality boost - most low-medium problem cases are reduced or resolved. I had a feeling from the past that it's still not clean on some material. This time around I found pre-echo still present on some drum attacks and clean guitar notes. I could not normally abx these with MPC standard. Problem samples show some improvement over -V3. Some become acceptable, others are still annoying.

- V1: The differences I heard on V2 seem to resolve. My gut feeling is that low-medium problem cases will reach full transparency here. Hardcore problem cases (pre-echo) are still easily abxable at times, but many will sound okay. Quality will not go up on some hardcore samples even up to 320k.

My impressions:

I could not get stable quality below -V2. Quality scales very badly compared to wavpack because throwing bits at wavpack pays off. Throwing bits at mp3 doesn't do anything on hardcore samples. I was more interested in low-medium problem cases that would be common at -V5, which people rate highly. In theory I thought that -V4 or -V3 would make these artifacts disappear or at least reduce them a lot, but in practice this wasn't the case several times. Worse is that higher presets sometimes add artifacts not present on lower presets. -V3 actually fixed several -V5/4 problems. At -V5 ~ -V3 things are like wavpack at 260k - hit / miss.

-V2 is a possible solution, although my gut feeling is that pre-echo can be picked up on guitar strings, drum attacks, etc. Not major problems but a bit annoying. MPC and probably AAC/OGG will be better at 190k.
At higher presets (-V1 or more) the 'little' problems will disappear and bring mp3 in line with mpc and the rest. Bad pre-echo cases won't resolve much. Some electro music cannot be encoded with mp3, but it is also problematic for the other codecs to a lesser extent. I found wavpack high modes do quite well on artificial music - considering mpc and the rest also fail on some of these.

The major mp3 issues are pre-echo and sfb21 bloating, which lead to inferior performance per bitrate. The other big frustration is that quality doesn't scale well. -V1 is a solution for me, although I don't know if I'll do it yet. Bitrate is 200~250k, which is worse than mpc / ogg / aac at 170k. Overall not too bad at all considering the age and limitations of the mp3 format. It's still competitive.

Reply #43 – 2007-05-29 11:36:55

Which version did you test?

3.97 has a serious issue with tonal samples but 3.98b3 has overcome these to a great extent.
Pre-echo especially seems to be a lot better, though this is a problem we have to live with in mp3; if it's not really annoying it's ok to me.
Moreover, 3.98b3's VBR works a lot more robustly than 3.97's.
IMO for tonal samples it's best to use ABR or CBR > 200 kbps for very good quality, but if you're very sensitive towards pre-echo, it looks to me like -V1 (or -V0) is the better choice.

Reply #44 – 2007-05-29 14:13:05

I tested 3.98b3. I still think V3 is a great tradeoff. But on 3 tracks there are new artifacts - ringing / swooshing - but --vbr-old was clean and now I am confused. Also I have a track from Nightwish 'Angels Fall First' and the LAME encoder is destroyed on both 3.97 and 3.98 (acoustic guitar intro).

Thing is, at 250k I am not far from 300k where wavpack is decent. I could stick to wavpack 350k and keep transcoding to -V5, which works great for portable-only use.

Reply #45 – 2007-05-30 00:12:15

Quote from: shadowking on 2007-05-29 09:09:59

Throwing bits at mp3 doesn't do anything on hardcore samples.

I disagree, at least for 'Eig', which is arguably the most hardcore sample theoretically possible for mp3 since it is a delta function (therefore, a "transient" "sine wave" in the transform domain...so it's a disaster...just as a high-freq sine wave in the time domain is a disaster for WavPack). You just have to throw a LOT of bits at the hardcore samples for mp3, more than 320 kbps. You have to throw about the same amount you throw at WavPack to get rid of the real problem samples, like 400-500 kbps, maybe even more. Unfortunately mp3 caps you at 320 kbps if you want to comply with the standard (try some freeformat mp3s if you can get them to work).

The way I see it, there are 2 classes of types of "problems" for mp3. One is the type which goes away almost completely in all cases at 192 to 256 kbps, most "distortions" are in this category I think. The other type is the ones which only go away as you approach to lossless (so you need like 500 kbps to 1000 kbps and can still hear improvement) and apply to the most severe pre-echo cases ('Eig'). The more minor pre-echo cases that affect lower bitrates are other issues probably (failure of LAME to block-switch, etc), sort of in between, or a 3rd problem class.

I think that the reason Eig is so bad with mp3 is that psychoacoustics does not apply to Eig (or pre-echo in general, when there are absolutely no other sounds present and/or temporal masking from other instruments). The psychoacoustic literature is mostly concerned with tones and masking of tones, for stationary tones. To achieve transparency with Eig for an mp3-like encoder, the main solution is to make sure the transformed function is not quantized in any way, therefore still using the same number of bits and still lossless in a sense. There's no "cutting corners" with "psychoacoustics" with something like Eig, is the way I see it.
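
A toy illustration of why quantizing the transform of a transient produces pre-echo: the sketch below uses a plain DFT as a stand-in for mp3's MDCT, a made-up 3-sample "shot" in a silent 64-sample block, and uniform rounding of the coefficients. All sizes, positions and step values are hypothetical. It measures how much noise energy lands in the samples before the shot, which were pure silence in the original.

```python
import cmath
import math

def dft(block):
    """Forward DFT of a real-valued block."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(block))
            for k in range(n)]

def idft(coeffs):
    """Inverse DFT, keeping only the real part of each output sample."""
    n = len(coeffs)
    return [(sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(coeffs)) / n).real
            for t in range(n)]

def quantize(coeffs, step):
    """Round real and imaginary parts to the nearest multiple of `step`."""
    return [complex(round(c.real / step) * step, round(c.imag / step) * step) for c in coeffs]

def pre_transient_noise(step, block_len=64, onset=45):
    """Energy appearing before the onset after coding the block with the given step."""
    block = [0.0] * block_len
    block[onset:onset + 3] = [1.0, -0.6, 0.3]   # short burst after silence
    decoded = idft(quantize(dft(block), step))
    return sum(v * v for v in decoded[:onset])  # original is exactly zero here

coarse = pre_transient_noise(step=0.5)    # few bits per coefficient
fine = pre_transient_noise(step=0.001)    # many bits per coefficient
print("coarse step, noise before the shot:", coarse)
print("fine step,   noise before the shot:", fine)
```

The noise ahead of the shot only shrinks as the quantization step shrinks, which matches the observation that pre-echo on Eig keeps improving with bitrate rather than with any masking trick.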

I also don't think sfb21 bitrate bloat is as big a deal as people say. After a recent discussion I acknowledge it exists now, and think I understand it fairly well. But:

At lower bitrates, I think V3 and lower or 192 kbps CBR and lower (depending on LAME version)....LAME's internal settings do not allow bitrate bloat to have any effect. Instead, LAME chooses to allow the high freqs in the sfb21 range (16+ kHz) to be encoded terribly. The bitrate won't bloat at all with this approach. You'll just have really quantized high freqs. No bits are wasted, you get what you pay for.

At higher bitrates.....LAME will make sacrifices and try to encode sfb21 better at the cost of *perhaps* overusing bits in frequency ranges corresponding to the other scalefactor bands. This is the well-known phenomenon of bitrate bloat. But I am not convinced it is even a bad thing at high bitrate. The key question is the *perhaps* part. The psychoacoustic model will complain and say "Why am I overusing so many bits on the low/middle frequency ranges?" But the psychoacoustic model is not always correct. Since I currently believe that psychoacoustic models are semi-worthless to encode serious pre-echo disaster samples like Eig, it may not even matter whether you listen to your psychoacoustic model or not, at high bitrate. It may just be better to use no psychoacoustic model at all and just quantize all freqs the same (flat quantization). This is highly inefficient, but it doesn't matter most of the time since your bitrate is so high anyway. And for the true disasters, this may actually be the best approach.

------------

BTW, I have always thought that WavPack lossy is competitive with mp3 at similar bitrate. I often said that many samples compressed transparently to me at 200 kbps with WavPack. It just depends on the kind of material, and it's almost the opposite between WavPack lossy and mp3, what performs well and what doesn't. To me though, WavPack really needs a VBR mode while mp3 doesn't. For mp3, the bit reservoir is the savior, as long as the encoder is smart enough to utilize it every time there is a transient (which it might not be always, I dunno). I personally don't like VBR with mp3. Even with V0, the quality may be worse than 256 kbps CBR sometimes because there is no bit reservoir (or a minimally functional one, as claimed by the LAME documentation). The quality will be better than 256 kbps CBR at times also; there are strengths and weaknesses to both approaches. Like halb said, it really seems like the 3.98 VBR is better now from what the LAME developers said, but bleh, I'm still not really interested in VBR for mp3.

WavPack's problems don't really lie with transients so a bit reservoir can't help it. Instead, I think WavPack would be great with a VBR with a very dynamic range. (Sorry, I still haven't tried Optimfrog. I'm not sure what is the dynamic bitrate range for Optimfrog).
