problem samples for WavPack lossy

2007-05-12 10:06:10

Here is my dumping ground for any samples I find to be exceptionally problematic for WavPack lossy mode. Links are in WavPack lossless format. I suggest re-compressing with the lowest possible bitrate (-b24) at first to see how bad it sounds, then gradually working up until transparency is achieved (or, perhaps never achieved until the bitrate hits lossless).
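
The step-up procedure above is easy to script. What follows is a minimal Python sketch that only builds switch strings in the combined style used later in this thread (for example -fb350x5s0.4). The helper names `lossy_switches` and `ladder` are made up for illustration, and how the string is handed to the encoder is deliberately left out.

```python
def lossy_switches(bitrate_kbps, mode="", extra=0, shaping=None):
    """Build a combined switch string such as '-fb350x5s0.4'.

    mode    : '', 'f', 'h' or 'hh' (speed/quality mode letters used in the thread)
    extra   : the number after -x, 0 to leave the switch out
    shaping : the number after -s, None to keep the encoder default
    """
    if mode not in ("", "f", "h", "hh"):
        raise ValueError("unknown mode: %r" % (mode,))
    parts = ["-", mode, "b%d" % bitrate_kbps]
    if extra:
        parts.append("x%d" % extra)
    if shaping is not None:
        parts.append("s%g" % shaping)
    return "".join(parts)


def ladder(bitrates, **options):
    """One switch string per bitrate, lowest first, same options throughout."""
    return [lossy_switches(b, **options) for b in sorted(bitrates)]


if __name__ == "__main__":
    # Start at the minimum and work upward until the sample is transparent.
    for switches in ladder([24, 256, 320, 384, 448, 512]):
        print(switches)
```

Keeping the options fixed across the ladder means only the bitrate changes from one listening pass to the next.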

All comments, or postings of other problem samples, are welcome. I am curious how my sample(s) compare to 'badvilbel' which I've never heard for myself. In the case of my first sample though, it is a problem only for WavPack lossy...mp3 has no problems encoding this same passage transparently at low bitrate. From my limited experience thus far, it seems that what WavPack lossy finds difficult to encode is very different from what mp3 finds difficult to encode, and also very different from what WavPack lossless finds difficult to compress.

Reply #1 – 2007-05-12 10:49:19

Yes that's correct. Different samples and they sound different too. Wavpack and dualstream are expected to do worse on highly tonal signals where there is not much masking from other sounds / frequencies. I have a collection of similar samples.

At 256k fx3s0 the difference is obvious but not startling to me. With the default noise shaping it is more annoying - I can hear noise move around the note. With s0.5 positive shaping a hiss increases and the effect is like pouring sand. I might test auto shaping later.

320 fast mode [fx3] - better , can abx some bits though.
320 hhx mode - like above .. some parts are better though
384 fast mode - Hard. abx sometimes no sometimes yes.
384 hhx mode - no difference
448 fast mode - no difference
DualStream Q0 241k - obvious problems.
DualStream Q1 278k - similar to WV 320
DualStream Q3 359k - I don't know.. 6/8 then 7/8 - a very subtle hiss. Hard to explain exactly.
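
For scores like 6/8 and 7/8 it helps to know how likely they are from guessing alone. A small Python sketch, assuming each ABX trial is an independent 50/50 guess (a standard one-sided binomial test; `abx_p_value` is a name chosen here, not something taken from an ABX tool):

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    for score in (6, 7, 8):
        print("%d/8: p = %.3f" % (score, abx_p_value(score, 8)))
```

Under that assumption 6/8 comes out at about 0.145, 7/8 at about 0.035 and 8/8 at about 0.004, so a single 6/8 run is weak evidence on its own.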

Dualstream vbr works well, but at very high bitrate wavpack normal and fast mode start to look more attractive.

Other encoders like MPC are also not 100% clean around 4.9 secs. Nice sample!

Reply #2 – 2007-05-12 10:59:34

Quote from: shadowking on 2007-05-12 10:49:19

Yes that's correct. Different samples and they sound different too. Wavpack and dualstream are expected to do worse on highly tonal signals where there is not much masking from other sounds / frequencies.

Yup, these are some of the exact same comments I made to halb27 in PM. The positive side to this is that WavPack lossy (and probably OptimFrog dualstream too) can perform extremely well on some other samples that mp3 has problems with, especially things like loud and obnoxious rock music, which has lots of loud noises at all freqs, which often completely masks the quantization noise of WavPack lossy even at the minimal 200 kbps.

Reply #3 – 2007-05-12 11:17:34

The behavior is not unexpected for this class of samples. At 200k I can abx everything. I suppose you want to learn how WavPack behaves by using the lowest bitrate. With these encoders you will need 300~400k for robust performance - you can't have a free ride. Removing filtering, lowpass and noise shaping has a bitrate price. At 300 and over, even if there was a difference you probably wouldn't notice it, but if you train yourself to find the problem at low bitrate then you become more sensitive. Also, since your music doesn't compress much in lossless, you might consider 450k for very large headroom - yet it's still half the lossless bitrate.

When you test a wide range of samples you will probably find 300~350k enough. You could stick with that and dump the correction files to DVD as insurance. Otherwise, you can crank it up to 400~450k. In the beginning I suggest keeping the correction file until you know exactly what you are doing or want.

Reply #4 – 2007-05-12 18:06:32

Nice sample for me too.

It's one of those rather rare samples where a high x value helps. My favorite noiseshaping of s0.4 is slightly helpful for my ears too. With -fb350x5s0.4 it's transparent for me - but this may be different for younger ears.

Reply #5 – 2007-05-12 23:26:21

Thanks for sending me 'badvilbel' and 'Atem-lied' samples, halb27. I've been listening to these samples along with my 'Track03' for 2 hours. The 'furious' link for some reason did not work but that is okay, don't bother fixing it, you have already given me enough. Everything I am about to say I did NOT use proper ABX testing on, only regular (careful) listening, but I think I'm a fairly conservative/accurate/experienced/nottoomuchplacebo listener.

Regarding my own sample, I have very similar evaluations as shadowking. Perhaps just a tiny tiny bit more problematic to me, though, but I could be imagining. I also assume that if shadowking "practiced" his ability to ABX this passage might increase slightly (in regards to me deciding the bitrate I would want to encode this passage at).

I tried -s0 as you suggested (I had played with this setting in general previously, but not exactly as you suggested, and not really for this song) and was surprised. I totally agree with you, -s0 helps this song significantly. Yet, that's not what I found in general, so in practical reality (if/when I encode albums for real, not testing) I might be too dumb to apply -s0 to songs like this (unless I applied it to everything). In general I find -s-0.5 (the default) to be the best, with -s0.5 and -s0 fairly good as well (-s-1 and -s1 are terrible, too extreme). But it depends on the song, as your suggestion encouraged me to realize.

halb27 mentioned to me that he thinks 'Atem-lied' is also better with -s0 or -s(positive#). He said he hears some problems with -s(negative#) such as the default setting. However, I felt differently. I thought 'Atem-lied' was best with the default and worse with -s0 and -s0.5. I never tried to mess with -s for 'badvilbel'.

Now for my thoughts on halb27's samples. Atem-lied is basically a telephone ringing sound plus a woodflute playing a tune, and nothing else (it's a reasonably unbusy/tonal sample, as expected for WavPack lossy problem samples). I found the level of noise to be fairly similar to my problem sample (my sample is softer though, so turn up volume to same level as Atem-lied). But it's interesting because I hear hardly any noise at the end of the sample, when there is only the telephone by itself for a moment then the flute by itself for a moment. It's only when they play simultaneously that the terrible noise appears. I would have expected the telephone noise by itself to be just as bad. Maybe in the future this will give me hints what other kinds of problem samples to search for. In my song also, although the noise is bad at all times, it is noticeably worse during the brief times when more of the tinkly sounds are added (twice in the clip). Perhaps interaction of two problematic sounds creates an extra problematic sound. (but probably four or more problem sounds together would not be a problem anymore, as in that case the song becomes too busy and the noise drowned out).

Badvilbel is a weird noises sample (only some of them are problematic). The problematic parts are the high freq weird noises, which in general have energy in the 15 kHz to 18.5 kHz region. This is not a tonal sound but the sound is confined within a bandwidth and the song is silent otherwise, so it's expected to be a WavPack lossy problem. Badvilbel highly impressed me. The level of noise is way worse than my sample and Atem-lied. At 400 kbps with default settings it is totally obvious to me still (in the really bad parts). Even 450 kbps is still quite noticeable. About 500 kbps was necessary for me to think it was transparent.

But then I discovered that for whatever odd reason, -x3 (or higher) greatly saves the day for Badvilbel. The difference is HUGE when this switch is added (-x and -x2 are near-useless, as seems to generally be the case in my experience). With -x3 alone, Badvilbel is completely and utterly transparent to me at 400 kbps (despite the reported avg/max noise level only being comparable to 450 kbps with default parameters...the -x3 switch is amazing here). With settings like -b384hx4, I have full confidence Badvilbel is safely transparent.

However the flip side to this is that while the -x3 (or higher) switch is generally terrific, the amount of help it gave to Badvilbel was the exception. Even settings like -hhx4 didn't seem to help Atem-lied and my sample by such an amount. Assuming you use the quality switches, I think Atem-lied and my sample are worse than Badvilbel. I sort of concur with shadowking that these samples are also 99% transparent with -b400hx4 or similar...but not 100%. Atem-lied is not so bad though because the noise is fairly constant (you can check at 200 kbps...terrible noise but kind of the same always)...with my sample the noise is varying a bit and more annoying. I'm not sure if I could ABX my sample at such a setting, but I think it could be possible with a lot of practice.

In any case, if these 3 represent the worst kinds of problems I could encounter with WavPack then to me that's not a problem. I am happy to encode at 400 to 450 kbps with the quality switches. I guess my fear is that there are far worse problem samples than these out there. I will continue searching. I already made one problem sample that is ultra terrible (never transparent until lossless is reached) but I artificially generated it so it doesn't count. Also I think there is more than one way to generate really bad samples so I'm going to try to figure out all the ways, after which I can consider how likely they are to be encountered in actual music (my fears were high after I discovered my problem sample because it was only the 4th song I ever tried to encode, but since then I encoded 50 more songs without anything remotely this bad, so my fears are lower now).

Reply #6 – 2007-05-13 04:37:48

Badvilbel sample is a problem for compression. Play with lossless modes and watch the effect -x and high modes have.

Also consider Dualstream which was designed for what you seem to want. Ghido's quality method works very well and is more consistent than -x mode approach. I have a hunch that quality 3 will do the job for you and save you bits.

Reply #7 – 2007-05-13 09:45:34

My experience with Dualstream quality mode is very restricted as I just tried it not long ago after you wrote about it recently, but to me too quality is astonishingly robust throughout the different genres.

Reply #8 – 2007-05-14 03:59:01

Sorry to be blunt, but it will probably be a very long time before I feel an interest in Optimfrog dualstream. Not because I have anything against it, but I only just started using WavPack, and I prefer to concentrate all my free time to testing just one format. WavPack seems to be the more commonly used format and open-source, so I feel it is a fair competitor for its opponent (in my mental world) of MP3.

Similarly, on the transform-side of the "formats war", I've only really tested MP3 heavily over the years. I've encoded/decoded/listened to a few OGG, MP2, and AC3 files...but that's about it. I'm content to choose one representative from the transform-approach, and one representative from the lossless/lossy/prediction approach. Even if MP3 and WV are not the best players on their "teams" I think they should represent the general pros and cons of their side well enough.

I'll still read and think about any comments or comparisons you make between Optimfrog and WavPack though, I just don't want to go into the trouble of testing them against each other myself, because to me they are "friends". Also, if I tested Optimfrog I have to test OGG also, to be fair, and I don't have time.

BTW I was thinking and listening a bit more, and I realized that the volume level you listen at can be important when studying these problem samples for WavPack/Optimfrog. For samples like these where the quantization noise is unmasked, the louder you listen, the easier it is to hear them. If you listen too loud the song itself might become too loud for you, but even then a highly-tonal problem sample will not be masking the noise very effectively so it's still easier to hear. I have no idea what volume levels shadowking and halb27 listen at compared to me. For the most part I've been listening to these problem samples relatively soft, but I turn up the volume when I increase the bitrate, too (but so far I have still tested everything at volume levels a bit below what I sometimes listen to music at). For people who like to listen to music really loud sometimes, they might need to use more bits on the problem samples (for normal music it doesn't matter as the noise gets masked by similar frequencies regardless of volume setting). It's almost like the ATH issue with mp3 files, in a way (not ATH as related to high-frequencies only, but just the idea of having an ATH floor for mp3 encodings).

I also played with lossless compression of Badvilbel as shadowking suggested, but I didn't find anything out of the ordinary. Badvilbel (the short clip that halb sent me) compresses well with default settings (around 40%) and the ratio gets better as better settings are used. I have encountered samples (white noise) where the lossless compression doesn't improve much or gets worse with "better" settings, but Badvilbel isn't one of them.

This is related to one of my ideas regarding my search for problem samples for WavPack, though. So far, the problem samples I've encountered compress reasonably well losslessly (30% to 45%), yet they are highly tonal and perform poorly in lossy mode. A number of my other songs compressed awfully when lossless (only 15% compression) but were transparent at 200 kbps (to me) because they were obnoxious rock songs filled with noise. I'm worried that there could potentially be a tonal problem sample that also compresses awfully when lossless...and if such a thing exists then that could be a problem far worse than what I've heard so far. That's what I'm looking for now. Another idea I had is that a noise-like sample (that compresses awfully when lossless) that is too dynamic and rapidly goes on then off, might also fail to mask the noise and that could be a worst-case disaster as well. 'Atem-lied' might actually be a bit like such a sample (the telephone sound rapidly goes on and off) and that's what gave me the idea.
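
To put those percentages next to the lossy bitrates, here is a rough Python sketch. It assumes 16-bit, 44.1 kHz stereo CD audio (1411.2 kbps uncompressed) and reads "15% compression" as "the file is 15% smaller than the WAV". Both are assumptions, and the function names are illustrative.

```python
# Uncompressed bitrate of 16-bit stereo audio at 44.1 kHz, in kbps (assumed format).
CD_PCM_KBPS = 44100 * 16 * 2 / 1000


def lossless_kbps(compression_percent, pcm_kbps=CD_PCM_KBPS):
    """Bitrate left over after removing compression_percent of the PCM size."""
    return pcm_kbps * (1 - compression_percent / 100)


def lossy_share(lossy_kbps, compression_percent):
    """Lossy bitrate as a fraction of the lossless bitrate of the same track."""
    return lossy_kbps / lossless_kbps(compression_percent)


if __name__ == "__main__":
    for percent in (15, 30, 45):
        print("%d%% compression -> %.0f kbps lossless, 200 kbps is %.0f%% of that"
              % (percent, lossless_kbps(percent), 100 * lossy_share(200, percent)))
```

With these assumptions a sample that only compresses by 15% stays near 1200 kbps when lossless, while one that compresses by 40% ends up near 847 kbps.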

EDIT: BTW, in some ways, the worst problem sample I've heard now is a plain 17000 Hz pure sine wave (use 100% amplitude). I haven't provided a download, but if anyone is interested maybe they can just make one for themselves with whatever WAV-editing program. At least with the WAV-editing program I used, the resultant sine wave turned into a 200 kbps WV file with the worst possible noise level (-10 dB, the worst theoretically possible, white noise yields similar) and was much worse than Badvilbel. As with Badvilbel, the -h or -x4 switches save the day greatly, but even with those switches, more than 500 kbps is required for transparency at a normal volume. However, a 100% amplitude high-freq sine wave is not a reasonable thing to occur in real music. I would say that a 25% amplitude high-freq sine wave is the most that should occur (because any song will contain other sounds as well sometimes, and clipping must be avoided). With a 25% 17 kHz sine wave, -b450x4 is transparent for me (400 not sufficient, and the x4 switch is absolutely critical otherwise still need 500+ kbps).
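
For anyone who wants to reproduce the test tone without a WAV editor, a minimal Python sketch using only the standard library follows. It assumes a 16-bit, 44.1 kHz, single-channel file; the function names and the output file name are made up here, and the tone produced by the editor used above may differ in detail. As discussed elsewhere in this thread, keep the playback volume low with tones like this.

```python
import math
import struct
import wave


def sine_samples(freq_hz, amplitude, seconds, rate=44100):
    """16-bit samples of a sine wave; amplitude is a fraction of full scale."""
    peak = int(round(amplitude * 32767))
    count = int(round(seconds * rate))
    return [int(round(peak * math.sin(2 * math.pi * freq_hz * i / rate)))
            for i in range(count)]


def write_mono_wav(path, samples, rate=44100):
    """Write the samples as a one-channel, 16-bit little-endian WAV file."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))


if __name__ == "__main__":
    # 25% amplitude, 17 kHz, 10 seconds (hypothetical file name).
    write_mono_wav("sine17k_25.wav", sine_samples(17000, 0.25, 10))
```

Passing 1.0 instead of 0.25 gives the 100% amplitude variant.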

The other interesting thing that I noticed with this sample is that the -x switches, especially -x3 and higher, adds a repetitive clicking sound into the noise, while the -h switches don't. Also, for this sample, -hx# is a really bad combination, it degenerates the quality significantly compared to either -h or -x# alone. I'll have to test more on real music before I decide what kind of encoding parameters I favor, but this sine wave test makes me favor -x4 as my final encoding choice. In real music, adding the -x switch to the -h switches almost always helps (not much, though)...but if there is a chance of significant degeneration such as occurs with a plain sine wave, then I don't want to combine them (in which case, I choose -x4 over -h).

Reply #9 – 2007-05-14 04:28:12

I listen and abx at normal volume. But when testing Guruboolez's organ sample I had to turn it up a bit as it's really low volume. Even at 256k it sounded ok. You are right about the volume though. I will never listen to stuff THAT loud as it's bad for hearing and equipment. Also your amp / soundcard will put out more noise as you increase the level.

As for these artificial samples, I am too scared for my hearing and equipment to mess with them. As for discovering problems on CDs: I think listening tests are a waste of time at around 384k and over.

Reply #10 – 2007-05-14 08:42:20

When testing I listen a bit louder than I normally do, but not by much. I want to stay within a realistic listening situation. With a short clip it's sometimes a problem to judge a realistic volume (with bruhns for instance).

Reply #11 – 2007-05-14 08:54:26

Quote from: Porcupine on 2007-05-14 03:59:01

... BTW, in some ways, the worst problem sample I've heard now is a plain 17000 Hz pure sine wave (use 100% amplitude). ...

Badvilbel BTW has a strong signal in the 18+ kHz range, at least at the first spot where the WavPack noise is very loud, and it has no signal in the mid frequency range. I guess that's a plausible reason for WavPack's bad behavior: a strong signal in the hard-or-impossible-to-hear range producing unmasked noise where it's easy to hear.

I'll try a 17 kHz lowpass before encoding tonight. Guess that will help a lot. Not a solution for you, Porcupine, but I can consider going this way in case it really helps, especially as I will go shadowking's way using fast mode and ~350 kbps.

Reply #12 – 2007-05-14 18:31:39

Just encoded a 17 kHz lowpassed badvilbel version using -fb350x5s0, and the difference against the no lowpass version is enormous. It's still not transparent with this setting, but I can easily accept the small differences especially as the original is noisy too.
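
The thread does not say which tool was used for the lowpass. As an illustration only, here is a generic Hamming-windowed sinc FIR lowpass in plain Python; the tap count of 255 and the function names are assumptions, and a real workflow would more likely use an audio editor or a DSP library.

```python
import math


def lowpass_taps(cutoff_hz, rate=44100, num_taps=255):
    """Hamming-windowed sinc lowpass coefficients, scaled to unity gain at DC."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    fc = cutoff_hz / rate              # cutoff as a fraction of the sample rate
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal lowpass impulse response, with the k == 0 limit handled separately.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(samples, taps):
    """Plain convolution; returns only the fully overlapped part of the output."""
    m = len(taps)
    return [sum(taps[j] * samples[i + m - 1 - j] for j in range(m))
            for i in range(len(samples) - m + 1)]
```

With 255 taps at 44.1 kHz the transition band is only a few hundred Hz wide, so a 17 kHz cutoff leaves the audible midrange alone while strongly attenuating content above roughly 17.5 kHz.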

May be a good procedure when it's up to encoding especially electronic music for those who don't care about lowpassing.

Reply #13 – 2007-05-15 00:20:48

Yes, I totally understand being afraid to mess with artificial samples. I am, also. Actually, I carefully try to make sure I understand my equipment, and calculate the dB of the 100% sine wave given my amp setting (it has a numerical dB output, however it's a relative number to something else so I have to do more calculations), and also know the dB/W sensitivity ratio of my speakers and the power handling limits of the tweeter...when I do artificial tests. Even then, I worry because the stupid 17 kHz sine wave generates a huge, deafening transient surge (PAK!) the moment I hit Play and when the sine wave ends. In theory I guess it might be okay because the amplitude of the transient surge should not possibly be that much louder than my 100% amplitude 17 kHz sine wave itself, but maybe that theory is wrong so I'm very afraid, because the surge spark is so loud (btw, a few real songs surge too when the composer does not add perfect silence to the beginning and end). Maybe the surge is Winamp's fault.

Halb, I think maybe your spectrum analyzer and the sound are out of sync, or you didn't look carefully. The "strong" (it's actually only ~5% amplitude I think, but that's still unusually high compared to most music) 18 kHz sine wave does generate lots of noticeable hiss but that's not when the hiss is worst. It's in the 1 second leading up to the 18 kHz sine wave where the Badvilbel hiss is bad. During that time there is a weird, changing, noise-like spread of freqs in the 13 kHz to 18 kHz region of relatively low amplitudes (0.5% amplitude...however this is a loud sound overall, much louder than the 18 kHz sine wave, because it's a spread of noise I think so lots of "energy" when "integrated" overall).
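
The "integrated energy" point can be made concrete with a little arithmetic. The Python sketch below converts amplitude fractions to dB relative to full scale and adds up the power of several components, under the simplifying assumption that the components are uncorrelated sinusoids; the function names are made up, and the numbers are the rough amplitudes quoted above, not measurements.

```python
import math


def amplitude_to_dbfs(fraction):
    """Peak amplitude as a fraction of full scale, expressed in dB."""
    return 20 * math.log10(fraction)


def combined_rms(amplitudes):
    """RMS of a sum of uncorrelated sinusoids with the given peak amplitudes."""
    return math.sqrt(sum(a * a / 2 for a in amplitudes))


if __name__ == "__main__":
    print("5%% tone:    %.1f dBFS" % amplitude_to_dbfs(0.05))
    print("0.5%% tone:  %.1f dBFS" % amplitude_to_dbfs(0.005))
    # How a spread of weak components compares with one stronger tone.
    print("single 5%% tone RMS:          %.4f" % combined_rms([0.05]))
    print("100 components at 0.5%% RMS:  %.4f" % combined_rms([0.005] * 100))
```

Under that assumption a hundred components at 0.5% amplitude carry the same power as a single 5% tone, and a denser spread carries more.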

This is what the Badvilbel clip you sent to me sounds like to me, WAV at top, low bitrate WV on bottom:

BUUUUUUUUUUUUUUUUUUU............byoing....................BUUUUUUUUUUUUUUUUUU......byoing............

BUUUUUUUUUUUUUUUUUUU..........ffFFFFFFFfffffffffffffff....BUUUUUUUUUUUUUUUUUU.....ffFFFFFFFffffff.....

BUUUUUU = stupid loud noise, nothing to do with WavPack's problems
byoing = funny high-frequency sound, hard to describe, but clearly audible and always louder than the FFFFF
FFFFFF = loud WavPack quantization "white" noise/hiss.............ffffff = softer hiss

If I had better hearing I would probably be able to better hear the 18 kHz sine wave going "eeeeeee" after the byoing but I can't really hear much of it so it kind of sounds like nothing (I can hear 18 kHz, but I need more than 5% amplitude at normal listening volume).

At medium bitrates like 300 kbps I already don't hear the ffffff parts anymore, only the FFFFF parts I can still hear and that's what I call the problem parts of Badvilbel.

Maybe you should try a lowpass filter at 14 kHz or even 13 kHz, I would bet all the hiss would completely disappear even at 200 kbps. I haven't tried, I'll let you try if you want to. But the byoing will disappear too so you would ruin Badvilbel if you filter that low.

BTW, I also edited the end of my previous post regarding 17 kHz sine wave test, I just tested it more carefully and may have made some mistakes earlier, not sure. Anyways, I'm going to test real music now again.

Reply #14 – 2007-05-15 08:18:46

Probably you hear a lot more of weird stuff than I do.
Can you give me the exact second with the ...byoing... / ...ffFFFFFFFfffffffffffffff... please?

As for the lowpass I'm only interested in settings that are of practical use (to me). I've been experimenting a bit with a 16.5 kHz lowpass before encoding last night and I'm pretty happy with the results when applied to problem samples though the effect usually is not so positive as with badvilbel.

Over the next weeks I'll try it the other way around and see whether lowpassing lowers my enjoyment of regular music. I guess it won't, and if so, it's a way for me to make WavPack lossy quality more stable.

Reply #15 – 2007-05-15 15:35:37

Quote from: halb27 on 2007-05-14 18:31:39

Just encoded a 17 kHz lowpassed badvilbel version using -fb350x5s0, [..]

May be a good procedure when it's up to encoding especially electronic music for those who don't care about lowpassing.

It may be beyond me, but at the point where lowpassing is involved to get transparency at bit rates around 350 kbps, I think it's time to look at other lossy codecs.

Reply #16 – 2007-05-15 16:01:50

Yeah, if you lowpass, definitely go for other codecs. Things seem stable enough for me using 320 -hx4 even on the rare synthetic stuff. Encoding is very slow but only done once, and decoding in this new high mode is competitive - x17 vs x9 for optimfrog. I think Bryant's manual is quite good.

quote regarding -x:

Because the standard compression parameters are optimized for "normal" CD music audio, this option works best with "non-standard" audio (synthesized sounds, non-standard sampling rates, etc.) where it can often achieve enormous gains.
The default level (n=1) provides a decent improvement with little cost in encoding speed and is recommended for all but the most time critical encoding. Higher levels provide some marginal improvement with an increasing cost of encoding speed. The highest levels (n = 4-6) are extremely slow but can provide significant improvement in special situations (i.e. synthesized sounds).

Reply #17 – 2007-05-15 16:07:16

I admit lowpassing with a high quality lossy codec is a bit strange, but after all with a lossy codec it's always a compromise and it's personal taste what you're willing to give away.

Sure a 16.5 kHz lowpass is only for people who hear next to nothing in this range like me.

Reply #18 – 2007-05-15 16:29:27

Quote from: Porcupine on 2007-05-14 03:59:01

The other interesting thing that I noticed with this sample is that the -x switches, especially -x3 and higher, adds a repetitive clicking sound into the noise, while the -h switches don't. Also, for this sample, -hx# is a really bad combination, it degenerates the quality significantly compared to either -h or -x# alone. I'll have to test more on real music before I decide what kind of encoding parameters I favor, but this sine wave test makes me favor -x4 as my final encoding choice. In real music, adding the -x switch to the -h switches almost always helps (not much, though)...but if there is a chance of significant degeneration such as occurs with a plain sine wave, then I don't want to combine them (in which case, I choose -x4 over -h).

Old high mode is still superior in most cases. Simply use -hhx1 will save you lots of encoding time, yet still give strong compression and quality. If you want a pseudo ultra-high mode use -hhx4.

Reply #19 – 2007-05-15 20:43:19

Yes, the -h alone is superior to the -x4 in general, and especially with sine wave test. However for normal music (Badvilbel and sine wave does not count, but Atem-lied and my sample count) I find that they are comparable (but -h is better) and neither helps much. And in normal music -hx4 helps even more by a tiny tiny bit (but again I don't like the possibility of significant degeneration in rare cases).

I hate -x and -x2. They have rarely done anything good for me on any song and have degenerated quite a few. Only -x3 and higher is worth it to me. But I think I saw you say in another post that -x turns on smart Joint Stereo switching, is that right? (I am thinking of discussing this topic at a later time). If that's the case then -x3 or higher seems good to me (because I hopefully get the smart Joint Stereo), but I still wouldn't use -x or -x2 which so far have degenerated as many songs as they've helped.

I don't care about encoding time much, but I very slightly care about decoding time, enough to make me favor -x4 over -h (assuming they perform almost the same, which is usually the case).

halb27, 'byoing' occurs at approximately 4.7 seconds to 5.0 seconds. In the case of a WV file the FFF fluttering hiss is also worst and occurring at this exact same time. After 5 seconds the hiss is still there but less...this is when the 18 kHz sine wave is active according to my spectrum analyzer. Byoing is a terrible way for me to describe that sound, but I can't think of a good way to describe it. It is a high fluttery whirring sound, a little bit similar to the FFF fluttering hiss itself except much higher.

Reply #20 – 2007-05-15 20:55:37

I tested another CD which was filled with instrumentation (music style is not the same) similar to the first problem sample I found. As I expected, this entire CD consists of somewhat problematic songs, oh well, but they are still handled (I think) with 384 kbps.

Here are a couple clips from a typical song on this album. I don't think they are as bad as the 3 problem samples from before but they are close. The second sample is quite funny at 200 kbps, so I recommend everyone to try it just for a laugh. If the original file is not heard first to know what it is supposed to sound like, it could pass as transparent because the noise sounds like it is supposed to be there.

Reply #21 – 2007-05-16 02:20:06

Interesting samples.

Track 3: some hiss on keys and hiss on violins both WV and dualstream.

WV 256: obvious
WV 320: not obvious, a bit of hiss still on violins
DS Q1: obvious
DS Q2: Better, but still obvious.
DS Q3: Fail to abx

Track 4: Serious hiss for wavpack.. hiss for DS.

WV256: obvious in many parts
WV320: obvious in many parts
WV384: Harder.. Can abx a hiss around 15 secs.
WV448: 8/8 around 15 secs.. Did I make some mistake ?.. tried several more times and failed to abx.
WV448 fast: 7/8 .. its still there, worse on fast mode.
WV512: Nothing wrong.

Dualstream:

Q1: Obvious
Q2: Not bad at all, 1 part abxed.
Q3: Fail to abx.

Again Dualstream quality is robust and impressive across different samples. Quality 3 is IMO a stable solution; resulting bitrates are around 320~360k. Default settings yielded a bitrate of 325k and very good quality. Fast mode gave 347k with bit-identical quality. WV wasn't clean at 448k.

Reply #22 – 2007-05-16 09:07:33

As a side product of developing a (hopefully useful) quality checking program I produced some error files yesterday (difference original wav - encoded wav).

It was quite interesting. As David has always said, it's pure noise. With samples like trumpet, where low bitrate encodings sound like a distortion artefact to me, I formerly had the suspicion that something like a distortion could be heard in the error file. But it's not like that. The low-frequency noise in these cases just seems to interact with the signal in a way that makes it sound like distortion (to me).
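
The checking program itself is not shown in this thread. As a rough idea of what an "error file" involves, here is a minimal Python sketch that subtracts decoded samples from the original and reports the level of the residue. It assumes both signals are already available as equal-length lists of 16-bit sample values; the function names are illustrative.

```python
import math


def error_signal(original, decoded):
    """Sample-by-sample difference: original minus decoded."""
    if len(original) != len(decoded):
        raise ValueError("signals must have the same length")
    return [o - d for o, d in zip(original, decoded)]


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (-inf for pure silence)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / full_scale ** 2)


if __name__ == "__main__":
    original = [1000, -2000, 3000, -4000]
    decoded = [998, -2003, 3001, -4000]
    residue = error_signal(original, decoded)
    print(residue, "%.1f dBFS" % rms_dbfs(residue))
```

Writing the residue out as its own WAV file, as described above, is what makes it possible to listen to the added noise in isolation.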

It was also quite interesting to get an impression of the noise when using default noise shaping, s0, or s0.4.
Well, to my taste s0 sounds definitely the most 'natural'. s0.4's noise is (to my aged ears) not as loud as s0's. But it is already of a pretty 'bright' nature which is not as natural as s0's flat noise.
I didn't feel like that with my practical encodings, but due to these results I think it's better to play it safe and use s0.

Reply #23 – 2007-05-16 21:47:01

Yeah, it's pure noise, but in a dynamic sample, the noise may get very loud for just an instant, so it could be perceived as a distortion in a rare case. When the noise changes with time in a certain way, it can develop a little bit of sonic character. The noise in the Track04.wav I posted sounds to me like cymbals, especially at 200 kbps (I used -s-0.5 the default though, maybe it sounds different with different noise shaping). So I laughed when I heard it because I thought it was cymbals there but it's supposed to be nothing (just the high tinkly noise, which is what triggers the "cymbals").

Actually, I'm very sad today because I learned a shocker about WavPack lossy, partially due to halb27's "furious" sample he sent me. I'm not sure again whether I want to use WavPack lossy at all, or if I do, I must reconsider the bitrate yet again.

To me, "furious" is the worst sample I've heard yet. However, it gets much better with -s0, and even better with -s0.5 (this kind of signal really benefits from positive noise shaping). Also, since "furious" is a relatively simple sample, it benefits greatly from switches, like "Badvilbel", but not quite as much maybe. Oddly, "furious" is the first sample I've encountered where -x4 has a huge benefit while -h has almost no benefit at all. But -hh worked great on "furious"...I don't know what the switches are doing exactly but the behavior is interesting. In the end, to be fairly transparent (possibly still ABX'able, but since I'm not doing rigorous ABX testing I can only make conservative statements) furious requires -b450x4 with default noise shaping, or else -b400x4s0. Without -x4 it requires a bit more.

So I've decided that I have to always use -s0 and cannot use the default noise shaping. The default noise shaping may be the best for semi-tough music such as orchestral, but for the worst case disasters it is very bad and -s0 and -s0.5 are better. I am reluctant to use -s0.5 in general because it's kind of bad on moderately-tough music, but I'm still considering that option because it seems to perform the best on a worst-case disaster like "furious".

shadowking, I forgot to ask, what kind of noise shaping are you using in all your evaluations? Your normally preferred -s0, or the default? Up until now I've always used the default (whenever I state what bitrate is required to be transparent to me).

In any case, the real shocker came to me next. I realized something awful..."furious" is a MONO signal. It is perfectly mono I believe, or extremely extremely close. Try adding the -j0 switch to its lossy tests and DISASTER strikes. Normally, -j0 does not make a huge difference and I think some people might prefer it for various reasons. But here, because "furious" was a true mono signal, it makes a big difference and causes a disaster. With -j0, much more than 500 kbps is required for "furious" to approach transparency.
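
For anyone who wants to check whether a sample is truly mono before testing with -j0, here is a minimal Python sketch (not from the original post). It assumes a 16-bit stereo PCM WAV file, and the function names are made up for illustration. The reasoning is my assumption about why -j0 hurts here: if -j0 forces plain left/right coding instead of mid/side, then a file whose channels are identical loses the "free" empty difference channel.

```python
import struct
import wave


def max_channel_difference(frames: bytes) -> int:
    """Return the largest |left - right| in interleaved 16-bit stereo PCM."""
    n_values = len(frames) // 2
    samples = struct.unpack("<%dh" % n_values, frames[: n_values * 2])
    left, right = samples[0::2], samples[1::2]
    # default=0 covers the empty-input case
    return max((abs(l - r) for l, r in zip(left, right)), default=0)


def is_effectively_mono(path: str, tolerance: int = 0) -> bool:
    """True if no sample pair differs by more than `tolerance` (assumed threshold)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("this sketch expects 16-bit stereo PCM")
        frames = wav.readframes(wav.getnframes())
    return max_channel_difference(frames) <= tolerance
```

A result of 0 from `max_channel_difference` means the two channels are bit-identical, i.e. "perfectly mono"; a small nonzero value would correspond to "extremely close".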

Then I realized I had also been making a big error in my logic stemming from long ago....when I did my sine wave test I forgot that it was mono, too. I artificially generated a new file consisting of a 17000 Hz sine wave of 100% amplitude in the left channel, and a 17500 Hz sine wave of 100% amplitude in the right channel...and all I can say is OH NO THIS IS REALLY BAD. Without switches, it cannot become transparent until close to 1000 kbps, and it's not even reasonable until 800 kbps. And the filesize when lossless is 95% of the original (My previous, flawed test suggested to me that a lone high-freq sine wave was easy to compress, but I stupidly forgot I was making mono files. Apparently a high-freq sine wave is not compressible....which is actually what I had initially guessed should be the case).
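
A rough way to reproduce a test file like this is sketched below in Python using only the standard library. The 17000 Hz / 17500 Hz frequencies and the 100% amplitude come from the description above; the 44.1 kHz sample rate, 16-bit depth, duration, function names, and output filename are assumptions for illustration, not details from the original test.

```python
import math
import struct
import wave


def stereo_sine_frames(freq_left, freq_right, seconds, rate=44100, amplitude=1.0):
    """Build interleaved 16-bit stereo PCM: one sine per channel."""
    peak = int(32767 * amplitude)      # amplitude=1.0 means full scale
    n_frames = round(seconds * rate)
    out = bytearray()
    for n in range(n_frames):
        t = n / rate
        left = round(peak * math.sin(2 * math.pi * freq_left * t))
        right = round(peak * math.sin(2 * math.pi * freq_right * t))
        out += struct.pack("<hh", left, right)
    return bytes(out)


def write_wav(path, frames, rate=44100):
    """Write the frames as a 2-channel, 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)


# Example use (5 seconds is an arbitrary, assumed duration):
# write_wav("stereo_sine_test.wav", stereo_sine_frames(17000.0, 17500.0, 5.0))
```

The resulting WAV can then be fed to the encoder at different bitrates. Because the two channels carry different frequencies, the file is genuinely stereo, unlike a single sine copied to both channels.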

All of my test samples already have significant stereo separation I think (-j0 makes no difference or even improves the sound), but "Badvilbel" is pseudo-mono too, and so is also more dangerous than its tests indicate. If one had a stereo version of "furious" (similar to my new sine wave test), the result would be a disaster.

The one saving grace so far of "furious", "Badvilbel", and "sine wave" is that all are relatively simple signals that gain a reasonable amount of compression and quality from the -x4 and -h switches. My new stereo sine wave test improved a huge amount (a 20 dB reduction in noise) from the -h switch...it somehow figures out that my left and right channels are not all that different and does something with that. But this is a real danger, because it should not be hard to add a little bit of complication to the signal and thereby make the -x4 and -h switches no more useful than they are on ordinary music, in which case it's a disaster.

Said plainly, a stereo version of furious would itself be a disaster, and a slightly more complicated one that defeats the -x4 switch would be the worst-case scenario.

The only good thing to come of this is that I better understand how WavPack compression works now. A dynamic signal (telephone) is not a problem like I previously thought. All that matters is high-amplitude high-freq sounds, at least for the normal mode of WavPack. Defeating the switches I still have to play with a bit. But right now things don't look good to me. I did some rough calculations and think that it should be possible to make a real-life stereo song with reasonable restrictions on high-freqs (no more than 6% amplitude, not 100%) and have it require 500 kbps minimum for transparency (switches wouldn't help much, because real songs don't benefit much). A song that had 12% amplitudes (kind of extreme but not unreasonable by any means) could require more like 550 to 600 kbps and be disastrous.

Most of my test samples have only 2% to 8% amplitudes I estimate; the reason they aren't yet worst-case disasters is that they are still slightly untonal. Furious probably has 8% also, but is much more "tonal" (even though it sounds like videogame shooting noises, "pew pew", it's because the freq is super high), so it is the biggest problem.
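
To put those percentages on the more familiar decibel scale, here is a small Python sketch (my addition, with a made-up helper name). It assumes "amplitude" means peak level as a percentage of digital full scale, which is my reading of the post rather than something it states.

```python
import math


def percent_to_dbfs(percent: float) -> float:
    """Convert a peak amplitude given as % of full scale to dBFS."""
    return 20 * math.log10(percent / 100.0)


for pct in (2, 6, 8, 12, 100):
    print(f"{pct:>3}% of full scale = {percent_to_dbfs(pct):6.1f} dBFS")
```

Under that assumption, 6% is about -24.4 dBFS and 12% is about -18.4 dBFS, so doubling the high-frequency amplitude is a 6 dB increase.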

Reply #24 – 2007-05-17 00:35:49

Hallo Porcupine,

Furious and badvilbel are the worst known samples AFAIK (BTW they're not my samples, I just gave them to you). Sure, with your rigorous approach you can think of samples that are even worse, but in the end that's not real life. You're about to consider using a very high quality setting like -hhb450x4s0, and with this you'll probably get transparency with all your tracks. Even if you really do encounter a track which isn't perfect, the problem should be very small.

No lossy codec can work miracles (though an internal quality control might improve things). If you want absolute security you should go lossless.
