Hydrogenaudio Forums

Lossless Audio Compression => WavPack => Topic started by: bryant on 2007-04-24 03:03:34

Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-24 03:03:34
With any luck, this should be the final beta before 4.41.0 is released:

The Windows builds are here (http://www.wavpack.com/wavpack-4.41.0-beta3.zip) and the SVN repository for building for everything else is here (http://svn.slomosnail.de/wavpack/trunk). Note that for *nix you will want to specify --enable-mmx on the configure line to enable the new MMX code.

As always, thanks in advance for any testing, comments, and suggestions... 
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-04-24 08:57:09
Seems to be even faster than previous versions on my P3 550. I've changed my encoder settings from brutal hhx3 320k to: fx3 350k

Noise is actually 0.5~2 dB better than my old setting and I've gained heaps of speed in encoding and decoding. Now the correction files don't bog down performance either. I've also done some abx tests and the new setting is fully transparent on a wide range of samples. Some extreme artificial samples suffer when not using -hh, but even that isn't perfect at times. I guess only some kind of quality mode would stop that.

Now, is there any chance of adding a switch to activate the new noise shaping algorithm for 4.41 final? Something totally optional like --ans for advanced users like me.

Anyway what I want to say is that Wavpack is now really usable with my new settings thanks to your new x-modes and speed boost.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-04-24 10:15:27
Good news.

I never really considered using -f mode - this sounds real interesting.

I would also welcome very much the possibility to use a hopefully improved noise shaping scheme, or - more generally speaking - features which are of special interest for using lossy mode (thinking of a stronger S/N control for instance).

I don't know, David, whether or not this matches your interests as well.
Anyway: WavPack is great!
Title: WavPack 4.41 beta3 available
Post by: DARcode on 2007-04-24 14:15:32
(http://img482.imageshack.us/my.php?image=wv4410b3testkp8.png)
Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-25 17:52:36
(http://img482.imageshack.us/my.php?image=wv4410b3testkp8.png)

Wow, I'll accept these results, thanks! Too bad the improvement is not the same on all CPUs...

As for the noise shaping / VBR project, I am actually very interested in working on it. The problem has been (and still is) the time required vs. the time available. It's probably something like a 2-3 week project, and doesn't pay for a single can of cat food! But I sometimes lie awake thinking about how I'd go about it, which is a good sign that one day I'm just going to jump in and do it. 

Thanks for the feedback...

David
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-04-25 19:06:01
As for the noise shaping / VBR project, I am actually very interested in working on it.

Sounds fine.
So just take your time and do it when you have the time to do so.
We all appreciate your great work.
Title: WavPack 4.41 beta3 available
Post by: skamp on 2007-04-26 00:37:59
Added --skip and --until commands to WvUnpack to allow decoding a subset of the entire file
I find that feature to be very useful, but it would be really nice if it was available with the encoder as well (like FLAC).
Title: WavPack 4.41 beta3 available
Post by: DARcode on 2007-04-26 13:58:19

(http://img482.imageshack.us/my.php?image=wv4410b3testkp8.png)

Wow, I'll accept these results, thanks! Too bad the improvement is not the same on all CPUs...

[...]

Thanks for the feedback...

David
I used an Intel P4 630 (3 GHz Prescott core); which CPU has given the best results so far, please? I get improvements at home with an AMD Sempron64 3400+ too, but not this consistent.
Title: WavPack 4.41 beta3 available
Post by: GeSomeone on 2007-04-26 14:44:37
As for the noise shaping / VBR project, I am actually very interested in working on it. The problem has been (and still is) the time required vs. the time available.
David

We understand about time, I would even suggest you eat something else than cat food. 

For extra motivation: The lossy/hybrid mode might become more important as TAK is very tempting these days, but is lacking such feature.
Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-26 18:16:21

As for the noise shaping / VBR project, I am actually very interested in working on it.

Sounds fine.
So just take your time and do it when you have the time to do so.
We all appreciate your great work.

Thanks, and I trust that the three of you will be ready for some listening tests when I begin... 

Added --skip and --until commands to WvUnpack to allow decoding a subset of the entire file
I find that feature to be very useful, but it would be really nice if it was available with the encoder as well (like FLAC).

Yes, the plan is to put it in the encoder also at some point. But I put it in the decoder because I think it's important for some music servers (like SlimServer) that use the command-line programs for decoding to a pipe (especially for extracting tracks from images, but also just seeking).

The funny thing is that since I added it, I've been using the feature all the time!
Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-26 18:33:34
I used an Intel P4 630 (3 GHz Prescott core); which CPU has given the best results so far, please? I get improvements at home with an AMD Sempron64 3400+ too, but not this consistent.
I also have a P4 and see improvement like you, so I think that's probably the best. My wife's laptop is a Turion64 and shows less improvement. Synthetic Soul's Athlon XP shows almost no improvement (at least through beta2). I believe that according to one of Josef Pohm's tests even WavPack's normal mode is faster than TAK and FLAC when run on a P4!


We understand about time, I would even suggest you eat something else than cat food.
Well, the cat will only eat sushi (she's a little spoiled) so I can only afford cat food! 
Quote
For extra motivation: The lossy/hybrid mode might become more important as TAK is very tempting these days, but is lacking such feature.
Haha, motivation is not one of the things I'm lacking. Unless you ask my wife, of course! 
Title: WavPack 4.41 beta3 available
Post by: Josef Pohm on 2007-04-26 20:32:34
Evaluation of 4.41.0 Beta3 on my PIV Prescott 2.8 GHz is currently being performed!

Here is a complete evaluation of the latest WavPack binaries, conducted on my A64 3500+ Orleans.
Ratio is calculated on my SetF.

|      | Ratio |    4.40.00   |   4.41.0b1   |   4.41.0b2   |   4.41.0b3   |

|  f   | 65,66 | 122,6  151,2 | 131,3  159,5 | 137,6  179,3 | 147,7  178,1 |
| fx1  | 65,24 |  69,9  151,2 |  77,8  158,5 |  79,3  178,1 |  79,3  176,9 |
| fx2  | 65,17 |  51,2  150,3 |  77,8  158,5 |  59,0  178,1 |  59,2  178,1 |
| fx3  | 65,13 |  30,7  154,8 |  35,7  158,5 |  35,9  179,3 |  36,4  181,8 |
| fx4  | 65,08 |  13,0  152,0 |  15,1  161,5 |  15,1  179,3 |  15,4  180,6 |
| fx5  | 65,07 |  10,5  152,9 |  12,2  159,5 |  12,2  178,1 |  12,6  179,3 |
| fx6  | 65,07 |   9,0  152,0 |  10,4  160,5 |  10,4  178,1 |  10,8  176,9 |

|      | 64,00 | 102,0  128,1 | 113,0  136,1 | 115,0  147,7 | 116,1  150,3 |
|  x1  | 63,56 |  51,5  127,5 |  60,6  131,3 |  61,3  146,1 |  62,7  147,7 |
|  x2  | 63,51 |  33,9  129,4 |  41,3  133,3 |  41,6  144,4 |  42,8  147,7 |
|  x3  | 63,52 |  18,3  126,2 |  23,0  133,3 |  23,1  145,3 |  23,9  149,4 |
|  x4  | 63,27 |   4,8  127,5 |   5,8  130,7 |   5,8  145,3 |   6,1  148,6 |
|  x5  | 63,25 |   3,4  128,1 |   4,0  132,0 |   4,0  145,3 |   4,2  146,9 |
|  x6  | 63,20 |   1,7  127,5 |   2,0  130,7 |   2,0  144,4 |   2,1  147,7 |

|  h   | 62,88 |  74,3  100,4 |  83,3   99,6 |  85,2  106,1 |  83,9  114,5 |
| hx1  | 62,78 |  36,2  101,2 |  43,6  102,4 |  44,1  109,7 |  45,6  114,5 |
| hx2  | 62,70 |  21,8  100,8 |  27,0  101,2 |  27,1  109,2 |  28,2  114,5 |
| hx3  | 62,66 |  10,9  100,4 |  13,8  102,4 |  13,8  110,2 |  14,5  115,0 |
| hx4  | 62,43 |   2,9  100,8 |   3,5  102,8 |   3,5  109,2 |   3,6  114,0 |
| hx5  | 62,39 |   2,2  101,2 |   2,7  102,8 |   2,7  109,2 |   2,8  114,5 |
| hx6  | 62,38 |   1,5  100,8 |   1,8  102,4 |   1,8  110,6 |   1,9  115,0 |

|  hh  | 62,47 |  59,6   81,0 |  67,9   82,0 |  69,5   87,2 |  67,7   91,2 |
| hhx1 | 62,28 |  26,9   81,0 |  33,5   81,5 |  33,6   85,8 |  35,1   91,2 |
| hhx2 | 62,22 |  15,4   81,3 |  19,5   81,8 |  19,6   86,4 |  20,6   91,5 |
| hhx3 | 62,17 |   7,4   81,8 |   9,6   81,8 |   9,6   86,4 |  10,2   91,2 |
| hhx4 | 62,01 |   1,9   82,3 |   2,2   82,0 |   2,2   87,0 |   2,4   91,2 |
| hhx5 | 61,96 |   1,2   82,3 |   1,5   82,0 |   1,5   86,7 |   1,6   91,9 |
| hhx6 | 61,96 |   0,9   82,3 |   1,1   80,0 |   1,1   87,0 |   1,1   91,9 |


4.41.0 Beta3 brings some further minor but nevertheless nice speed improvements over Beta2 (on average, I would say about 3% in both encoding and decoding).
I like the improved decoding speed of the -h and -hh modes.

All in all, from 4.40.0 to 4.41.0b3 the average improvement on this CPU is around 20-25% in encoding speed and 15% in decoding speed.
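
As a rough illustration of how such percentages can be derived from the table, here is a minimal Python sketch. It assumes that the two numbers in each cell are encoding speed and decoding speed (the table itself does not label them), it uses only the "f" row, and the function name is made up for this example.

```python
def improvement_percent(old_speed, new_speed):
    """Relative speed gain of new_speed over old_speed, in percent."""
    return (new_speed / old_speed - 1.0) * 100.0


# Values copied from the "f" row of the A64 table (4.40.00 vs 4.41.0b3).
# Assumption: first number = encoding speed, second = decoding speed.
f_encode = improvement_percent(122.6, 147.7)
f_decode = improvement_percent(151.2, 178.1)

print(f"-f encode: {f_encode:.1f}%  decode: {f_decode:.1f}%")
```

Averaging the same quantity over all rows would give the kind of overall figure quoted above; a single row is of course not representative on its own.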
Title: WavPack 4.41 beta3 available
Post by: Josef Pohm on 2007-04-27 11:57:58
Now I'm also done with the evaluation of the latest WavPack binaries on my PIV Prescott 2.8 GHz.
Ratio refers to my SetF.

|      | Ratio |    4.40.00   |   4.41.0b1   |   4.41.0b2   |   4.41.0b3   |

|  f   | 65,66 |  96,7  115,6 | 100,4  124,4 | 102,4  137,6 | 101,2  139,0 |
| fx1  | 65,24 |  59,9  116,1 |  64,2  123,2 |  65,0  137,6 |  64,8  140,5 |
| fx2  | 65,17 |  45,1  117,1 |  49,1  123,8 |  49,4  137,6 |  50,0  137,6 |
| fx3  | 65,13 |  27,8  117,1 |  31,1  125,6 |  31,2  138,3 |  31,8  139,8 |
| fx4  | 65,08 |  12,0  114,5 |  13,5  124,4 |  13,5  139,8 |  13,4  136,8 |
| fx5  | 65,07 |   9,7  115,0 |  10,9  123,8 |  10,9  139,0 |  11,4  139,0 |
| fx6  | 65,07 |   8,3  115,6 |   9,3  123,2 |   9,3  139,8 |   9,8  136,8 |

|      | 64,00 |  82,0   98,1 |  85,8  105,7 |  87,0  115,0 |  86,7  114,0 |
|  x1  | 63,56 |  44,4   95,9 |  49,5  102,0 |  50,1  110,6 |  51,4  113,5 |
|  x2  | 63,51 |  29,4   95,6 |  34,0  101,6 |  33,9  111,6 |  35,8  114,5 |
|  x3  | 63,52 |  15,9   95,6 |  19,0  102,0 |  18,7  112,6 |  20,2  114,5 |
|  x4  | 63,27 |   4,3   95,9 |   5,1  100,8 |   5,1  110,2 |   5,4  113,0 |
|  x5  | 63,25 |   3,0   95,6 |   3,5  100,0 |   3,5  109,7 |   3,8  114,5 |
|  x6  | 63,20 |   1,5   91,5 |   1,8  100,4 |   1,8  105,3 |   1,9  113,5 |

|  h   | 62,88 |  57,1   75,4 |  64,8   74,7 |  64,8   78,8 |  65,5   87,0 |
| hx1  | 62,78 |  30,0   76,5 |  34,7   78,3 |  34,4   83,1 |  36,6   88,1 |
| hx2  | 62,70 |  18,3   76,2 |  21,9   77,8 |  21,7   82,5 |  23,5   88,1 |
| hx3  | 62,66 |   9,2   76,2 |  11,4   77,8 |  11,3   82,3 |  12,4   88,1 |
| hx4  | 62,43 |   2,6   75,4 |   3,1   77,8 |   3,1   83,1 |   3,4   88,1 |
| hx5  | 62,39 |   1,9   76,2 |   2,3   78,3 |   2,3   82,8 |   2,5   88,4 |
| hx6  | 62,38 |   1,3   76,5 |   1,6   78,3 |   1,6   83,9 |   1,8   87,0 |

|  hh  | 62,47 |  49,7   60,9 |  52,5   62,7 |  53,0   65,8 |  52,6   70,3 |
| hhx1 | 62,28 |  22,2   61,2 |  26,6   61,8 |  26,7   64,8 |  28,4   69,7 |
| hhx2 | 62,22 |  12,9   61,8 |  15,8   61,9 |  15,8   65,0 |  17,1   69,0 |
| hhx3 | 62,17 |   6,3   60,9 |   7,9   61,8 |   7,9   65,0 |   8,6   69,1 |
| hhx4 | 62,01 |   1,7   62,2 |   2,0   61,5 |   2,0   65,3 |   2,2   69,5 |
| hhx5 | 61,96 |   1,0   62,1 |   1,2   61,8 |   1,2   65,5 |   1,4   70,7 |
| hhx6 | 61,96 |   0,7   62,4 |   0,9   61,8 |   0,9   66,0 |   1,0   70,8 |


Improvement of 4.41.0 Beta3 over 4.41.0 Beta2 is (on average) around 6% in encoding speed and 4% in decoding speed.

Improvement from 4.40.0 to 4.41.0 Beta3 is around 20-25% in encoding speed and 15-20% in decoding speed, which is about the same as shown on the A64 in my previous post.
Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-27 20:47:19
Improvement of 4.41.0 Beta3 over 4.41.0 Beta2 is (on average) around 6% in encoding speed and 4% in decoding speed.

Improvement from 4.40.0 to 4.41.0 Beta3 is around 20-25% in encoding speed and 15-20% in decoding speed, which is about the same as shown on the A64 in my previous post.

Thanks, Josef, for your comprehensive results! It's nice to see this much consistency across the two platforms.

BTW, I am planning for the real release in just a few days (with no further changes)... 
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-04-28 07:48:26
I would also welcome very much the possibility to use a hopefully improved noise shaping scheme, or - more generally speaking - features which are of special interest for using lossy mode (thinking of a stronger S/N control for instance).


The quality difference between ABR and VBR shouldn't *normally* be drastic, especially at transparent bitrates. My prefs for VBR would be:

- Maintain quality between the compression modes [fast / normal / high / very high]. The goal is not to have any differences between the modes, but the slower ones would simply save a few bits.

- Better bitrate allocation - pure mono content only needs half the bitrate (currently the adjustment must be made by the user). Different sampling rates and multi-channel need different bitrate to 44.1 stereo and the user needs to calculate this at present. VBR should handle this.

- Noise shaping in wavpack 4.x -alpha- is already performing well on nearly everything I've thrown at it. More improvements are always good, but incorporating the existing 4.x model into a release would be an adequate start.
Title: WavPack 4.41 beta3 available
Post by: bryant on 2007-04-29 21:04:34
- Better bitrate allocation - pure mono content only needs half the bitrate (currently the adjustment must be made by the user). Different sampling rates and multi-channel need different bitrate to 44.1 stereo and the user needs to calculate this at present. VBR should handle this.

You can achieve some of this by using the "bits per sample" option of the -b command instead of the "kbits per second" option. I actually use this more often myself because you don't have to worry about sample rate and number of channels, and I find values between 3.0 and 4.5 the most useful.
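
To illustrate why a bits-per-sample setting scales automatically with the format, here is a small Python sketch of the arithmetic. It assumes the value applies to each sample of each channel, ignores any container overhead, and assumes a 16-bit source for the compression ratio; the function names are invented for this example and are not part of WavPack.

```python
def kbps_from_bits_per_sample(bits_per_sample, sample_rate_hz, channels):
    """Approximate bitrate in kbit/s for an average number of bits per sample."""
    return bits_per_sample * sample_rate_hz * channels / 1000.0


def compression_ratio(bits_per_sample, source_bit_depth=16):
    """Ratio of source bits to encoded bits per sample (16-bit source assumed)."""
    return source_bit_depth / bits_per_sample


formats = [
    ("44.1 kHz stereo", 44100, 2),
    ("44.1 kHz mono", 44100, 1),
    ("48 kHz stereo", 48000, 2),
]
for label, rate, channels in formats:
    kbps = kbps_from_bits_per_sample(4.0, rate, channels)
    print(f"{label}: {kbps:.1f} kbps at 4 bits/sample")

print(f"compression at 4 bits/sample: {compression_ratio(4.0):.0f}:1")
```

Under these assumptions the mono case comes out at half the stereo bitrate without the user having to adjust anything.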



- Noise shaping in wavpack 4.x -alpha- is already performing well on nearly everything I've thrown at it. More improvements are always good, but incorporating the existing 4.x model into a release would be an adequate start.
I'm glad to hear that the alpha is still working out for you. My first goal would be to incorporate basically the same thing, but in a form that would work with the existing correction file format (i.e. without breaking decoders).

The next thing I'd try is to use a simple estimation of tonality to increase or decrease the allocated bitrate, thereby creating a VBR mode.
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-01 04:33:36

- Better bitrate allocation - pure mono content only needs half the bitrate (currently the adjustment must be made by the user). Different sampling rates and multi-channel need different bitrate to 44.1 stereo and the user needs to calculate this at present. VBR should handle this.

You can achieve some of this by using the "bits per sample" option of the -b command instead of the "kbits per second" option. I actually use this more often myself because you don't have to worry about sample rate and number of channels, and I find values between 3.0 and 4.5 the most useful.


Yeah! That is exactly what I was after. I tested normal CD, multichannel, 48 kHz, and mono samples and it works great. I settled on '4', which gives around 4:1 compression. Thank you.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-06 13:33:47
Thanks, David, for your great work, and shadowking too for your inspiration (thinking of s0 and fast mode usage).
I've just finished a small listening test using fast mode at 270 kbps, with which I wanted to find out what kind of noise shaping to use and whether it is worthwhile using a higher x mode than x4 (I don't want to go lower than x4 because encoding performance is fast enough for me).

I used 270 kbps to make it easier to judge which setting to prefer (a preliminary test at 300 kbps showed me that while it's easy to hear differences, it's not easy to prefer one setting over the other, especially when it comes to judging noise). Having done the test, I'm not sure whether these decisions were really easier at 270 kbps - maybe a little bit.

In my preliminary tests I had some more samples on board, especially 'Carol of the Bells'. But after my very first warm-up tests at 250 kbps, where I had trouble in the beginning recognizing the problem with bruhns, I realized that I should stay within a testing environment which is not too far away from practical listening situations. That is, testing shouldn't become too painful: if I can't reliably detect a problem after trying intensively several times, I treat it as non-existent, as I will not hear it in practical listening situations. The bruhns problem of course turned out pretty audible after some intensive trials at 250 kbps. Another thing I realized is the importance of listening volume, and that's why 'Carol of the Bells' left the sample set: the problem at the beginning is at very low volume and only shows up when loudness is turned up in an extreme way, which is not a normal listening situation. I figured out what a very loud listening situation is for the entire track, and with that there is no problem with 'Carol of the Bells' at all. Maybe this applies to bruhns in a similar way, but as I don't have the entire track I don't know whether that's true.

For my 270 kbps test I used 4.41 beta3 as well as the 4.xx experimental version with auto noise shaping.
I used the samples Atem-lied, Furious, florida_seq, eig, herding_calls, trumpet, harp40_1, badvilbel, bruhns, castanets.

I've seen three areas of trouble for the samples in my test:

Low-frequency noise that sounds rather like an ugly artefact or a distortion than just noise
This applies to Atem-lied, herding_calls, trumpet.

Default noise shaping and (with trumpet and herding_calls) auto noise shaping behave particularly badly in this area.
Using no noise shaping or, better still, a positive noise shift makes the problem nearly disappear for herding_calls and trumpet.
This strategy works with Atem-lied too, but except with the highest noise shifts (s0.99) the distortion remains pretty audible.

There is an interesting parallel with OptimFrog DualStream advanced noise shaping which behaves very badly in this area too - especially when looking at trumpet.

Using x6 instead of x4 has a small positive effect for Atem-lied and herding_calls.

Inaccuracy
This applies to Furious, Florida_seq, harp40_1.

Any setting is bad with Furious, the slightly best setting is s0.4 or s0.5 IMO.
With Florida_seq auto noise shaping is best IMO, but s0.4 is very close.
harp40_1: s0 and s0.6 are best IMO, but s0.4 or s0.5 are very similar.

x6 has a positive effect on Florida_seq.

Noise
This applies to badvilbel, bruhns (and to a rather low degree: eig and castanets).

badvilbel: any setting is bad, but auto noise shaping and s0.4 are the best among the bad.
With auto noise shaping the noise is lower in volume than with s0.4, but on the other hand it attracts more attention as the character of the noise keeps changing, and it's a bit unnaturally high in frequency in some parts.
badvilbel gives us a good opportunity to learn about the character of the noise when using various noise shifts. To me s0.4 still sounds pretty much like analogue noise, while using high values for s makes the noise sound a bit artificial and too high in frequency.
bruhns: in this case default noise shaping and auto noise shaping are best. Quality doesn't essentially degrade using s0 or s0.4, however. A higher s setting like s0.6, on the other hand, is noticeably worse.
eig: s0 to s0.6 are best.
castanets: hard to concentrate on the noise behind the castanets. Best are default, auto, and no noise shaping, s0.4 isn't essentially behind, but higher values for s are.

x6 has a positive effect on bruhns.

Conclusion
a) x6 can have a positive effect in practice. For a final test I tried x5 on Atem-lied, herding_calls, florida_seq and bruhns, and to me, in a practical sense, x5 does the job as well as x6 does. So I will use x5.
b) I consider the distortion effects in Atem-lied, herding_calls and trumpet very serious because they can apply to rather simple tonal samples. That's why I don't want to use default noise shaping or auto noise shaping in its current form.
c) x5s0.4 is the best setting for me, and with it only these samples are really problematic (keep in mind the bitrate): badvilbel, Furious, Atem-lied (in this order). Luckily it's all artificial music, which I don't care too much about. Of course using s0.4 is a matter of taste. For those who care a lot about the 'noise problem class' (which can be assumed to be the most common class of problems), using s0, for instance, may be more advantageous.

During the next weeks I'll get real life experience with my archive. I will also think about a bitrate usage strategy cause with the results of my little test I feel using a bitrate in the 300...350 kbps range will be perfect to me most of the time, but I'd like to know when to go higher.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-06 14:40:56
@David:

My little test made me consider some things concerning improvements of lossy mode which were recently in focus: noise shaping and - more important - better control of quality.

Noise shaping
If auto noise shaping didn't have this severe distortion effect it would be quite interesting.
If auto noise shaping were restricted so that it does no negative noise shifts, this problem wouldn't exist (sure, this would also restrict some positive effects of current noise shaping).

Anyway as s0.4 seems to be a good overall noise shift I wonder whether this can be considered an approximation to ATH shape noise shaping or equal loudness curve noise shaping.

I looked up old threads again where you were discussing these things with DickD and other people, and the idea of putting a lot of noise into the highest frequency region where our ears are very insensitive, or even into the ultrasonic region beyond ~18 kHz, sounds very attractive. (To take care of the tweeters when doing extreme ultrasonic noise shaping, a lowpass in the decoder might do the job.)

Noise shaping that roughly follows the equal loudness curve also has the advantage that the noise character isn't dynamically changed, which can be an issue in itself.

Better control of quality
This is the more vital point, and may be less work to implement than doing more advanced noise shaping (where, using s0.4 as a very rough approximation to equal loudness curve considerations, I at least am pretty much satisfied).

According to my test I can use a pretty low bitrate most of the time, but I should know when to go higher or, better still, the encoder should do it on its own. I wouldn't necessarily call that VBR because you already do a kind of quality control now, though it's more like a side effect.

If I understand your mechanism correctly from what you wrote, you quantize the prediction error according to a user-given value (usually computed from the target bitrate). Whenever the predictor works fine, this automatically means that the relative total error is kept under good control. When the predictor doesn't work fine because a transient comes up, this relative error will still be alright for a while, as the error criterion lags a bit behind, thus requiring good accuracy for the transient even though the predicted value was poor. On the falling edge of the transient, however, prediction is poor again, and this time this is not covered by the lagging error control mechanism; on the contrary, the lagging error control increases the resulting error. Due to temporal masking, however, there's a chance we won't hear this error.
I guess this mechanism isn't always sufficient, and given the character of, say, Furious and Atem-lied, I can imagine it's this mechanism that causes the problem. (I have no idea, however, what goes wrong with badvilbel, where the additional noise is larger than the signal in the 300 kbps region.)

I guess it's not a big addition to do explicit error control (naively speaking, of this kind: over a period of, say, 5 ms, average the absolute values of the samples of the effective error [prediction + quantized prediction error against the original sample] and put it in relation to the corresponding average over the original. Measure this relation against a threshold that corresponds to the user's quality demand. Allow this threshold to be exceeded for a small number of consecutive 5 ms windows under certain conditions which correspond to temporal masking. Re-quantize with higher accuracy if necessary. Something like that, together with an initial quantization accuracy which corresponds to the user's quality demand).
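
The explicit error control described above could be sketched roughly as follows. This is only a Python illustration of the windowed comparison as worded in this post, not WavPack's actual code: the function names, the threshold value and the `max_masked_run` parameter (a crude stand-in for the temporal-masking allowance) are assumptions made for this example, and both sample lists are assumed to have the same length.

```python
def windowed_error_ratios(original, decoded, sample_rate_hz=44100, window_ms=5.0):
    """Per-window ratio of mean |error| to mean |original sample|."""
    window = max(1, int(sample_rate_hz * window_ms / 1000.0))
    ratios = []
    for start in range(0, len(original), window):
        orig_win = original[start:start + window]
        dec_win = decoded[start:start + window]
        signal = sum(abs(s) for s in orig_win) / len(orig_win)
        error = sum(abs(o - d) for o, d in zip(orig_win, dec_win)) / len(orig_win)
        if signal == 0:
            # Silent window: any error at all counts as unbounded relative error.
            ratios.append(float("inf") if error > 0 else 0.0)
        else:
            ratios.append(error / signal)
    return ratios


def windows_needing_more_bits(ratios, threshold, max_masked_run=1):
    """Indices of windows in runs that exceed the threshold for too long.

    Runs of at most max_masked_run consecutive windows are tolerated,
    on the assumption that temporal masking hides them.
    """
    flagged, run = [], []
    for index, ratio in enumerate(ratios):
        if ratio > threshold:
            run.append(index)
            continue
        if len(run) > max_masked_run:
            flagged.extend(run)
        run = []
    if len(run) > max_masked_run:
        flagged.extend(run)
    return flagged


# Toy data: 1 kHz sample rate, so each 5 ms window holds 5 samples.
original = [100] * 20
decoded = [100] * 5 + [90] * 10 + [100] * 5
ratios = windowed_error_ratios(original, decoded, sample_rate_hz=1000)
print(ratios)
print(windows_needing_more_bits(ratios, threshold=0.05))
```

An encoder following this idea would re-quantize the flagged windows with higher accuracy; how the threshold maps to a user-facing quality setting is left open here.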

I tried to use your displayed S/N value when using the -n switch for manual quality control.
But the results look strange to me. I got an S/N result for badvilbel which was a lot better than for regular tracks, for both avg. and peak noise values, and yet at the worst spots the noise sounds louder than the signal with badvilbel!
So what does the displayed S/N tell? I can't believe it's the ratio of overall encoded noise (the contents of the correction file) relative to the signal.

The meaning of S/N may be of interest for another attempt to control quality on my own. According to your mechanism as I understand it, quality is bad when the predictor works badly. So from a current WavPack user's point of view it may be worthwhile to compare the worst-case S/N of, say, a fast mode encoding against that of a high mode encoding. An unusually high difference may serve as a crude heuristic for increasing the bitrate.
ADDED: This is nonsense, unfortunately. I just took a track with near-equal reported noise values using fast and high mode, added badvilbel to the middle of the track, and the noise values are still close together.
EDITED: I see: the -n switch displays just the noise level, not the S/N ratio. I misinterpreted this. Displaying S/N as well would be quite welcome as it may be the basis for a crude bitrate decision.

To do some investigation of the error on my own, is there an easy way to get a wav file from the correction file?
EDITED: Not necessary. I can read the decompressed wavs with and without the correction file and do all the analysis on these in case I really want to do that (not sure yet - I guess I prefer comparing worst S/N between fast and normal mode).
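
A minimal sketch of that kind of analysis, using only the Python standard library: read the two decoded WAV files (one decoded with the correction file, one without), treat their sample-wise difference as the noise, and report an overall S/N figure. It assumes 16-bit PCM files of equal length; the helper names are made up for this example and the file names in the usage note are placeholders.

```python
import math
import struct
import wave


def read_pcm16(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV, channels interleaved."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2
    return rate, list(struct.unpack("<%dh" % count, frames))


def rms(values):
    """Root mean square of a sequence of numbers (0.0 for an empty sequence)."""
    if not values:
        return 0.0
    return math.sqrt(sum(v * v for v in values) / len(values))


def snr_db(lossless, lossy):
    """S/N in dB, treating the sample-wise difference as the noise."""
    noise_rms = rms([a - b for a, b in zip(lossless, lossy)])
    signal_rms = rms(lossless)
    if noise_rms == 0.0:
        return float("inf")
    if signal_rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(signal_rms / noise_rms)


# In-memory demonstration with made-up sample values.
print(snr_db([1000, -1000, 1000, -1000], [990, -990, 990, -990]))
```

With real files one would call `read_pcm16("with_correction.wav")` and `read_pcm16("lossy_only.wav")` (hypothetical names) and pass the two sample lists to `snr_db`; applying the same function to short slices would give a worst-case figure instead of an overall one.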
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-06 16:14:47
Thanks for your tests. It's interesting that you favour positive shaping. I know it has some benefits on those weirdo samples, but overall s0 and positive shaping increase the hiss, which affects a lot of the normal stuff. With tonal samples and long notes s-0.5 is better without a doubt if I were to test 256k. I usually don't pick it above 300k, but I remember Guruboolez saying that he heard something quite difficult to hear. I think he wouldn't hear it with negative shaping. It's also really weird that herding calls is better for you with positive shaping, as for me it is the opposite - hiss gets louder. It could be that some ears work differently. I have also noted problems with Dualstream --ans below quality 3. Now -x5 might give better results on very few samples (I usually favour -x4 over it), but on wide-range testing I've decided that it is not the way to go and too crude. I have narrowed it down to the following: for transcoding, rockbox use and correction file efficiency go for normal mode -x or fast mode -x3. For a bit more headroom use -x4 normal. If decoding speed isn't really important go -hhx. With fast mode anything above -x4 has no effect anyway. Also, high modes (esp. -hh) handle sweeps and tones better [furious etc.].

Now the S/N values of -n don't mean anything by themselves, but differences between the modes might be audible. It would be good if you could test your CDs at these same settings. When I say 'wide-range' of samples I mean stuff on my CDs and most of the stuff from the HA public listening tests.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-06 19:04:42
...but overall s0 and positive shaping increases the hiss which affects a lot of the normal stuff . With tonal samples and long notes s-0.5 is better without a doubt if I were to test 256k. I usually don't pick it above 300k...

Yes, when the "problem" sounds like "normal" noise, the default setting usually is better than a positive noise shift. I also believe this is the most usual error behavior, but:
a) even at 270 kbps the "problem" usually is rather negligible to me, and going productive I will use something like 320 kbps or more.
b) even at 270 kbps the audible difference between using default setting and s0.4 isn't a big one to me
c) this "normal" noise kind of error is something that I can accept rather easily cause it sounds rather natural if audible at all.
On the other hand the distortion kind of problem may be rather rare but it sounds rather terrible to me, and I can hear it easily way above 300 kbps - especially Atem-lied. That's why I easily accept a small increase in hiss while avoiding these artefact like problems. But of course it's a matter of taste and where to put the emphasis.
Quote
It also really weird that herding calls is better for you with positive shaping as for me it is the opposite- hiss gets louder.
I didn't concentrate on hiss [aka "normal" noise to me - but maybe I have a language problem here because I'm not a native English speaker] but on the noise which sounds like a distortion or strange artefact. This is the overwhelming issue to me - maybe I missed some hiss issue. I just looked up my protocol for herding_calls, and to me default NS is worst, auto NS is slightly better, and a zero or positive noise shift is best, with no essential difference between the amounts of noise shift - it just keeps away from this extra low frequency artefact.
Quote
Now -x5 might give better result on very few samples (I usually favour -x4 over it), but on wide range testing I've decided that it is not the way to go and too crude.
Sounds worse than 'not worthwhile' - a judgement I can easily understand because to me, too, -x5 is only slightly advantageous on rare occasions compared to -x4. But do you know of disadvantages when using -x5 (I don't care much about performance)?
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 01:56:30
Tried to abx between 3 herding lossy encodes. I failed at -ans vs s0, abxed 7/8 between -ans and s0.4 and 7/8 with s0 vs s0.4. Positive shift hiss was more noticeable. On the other hand -ans noise might be more blocky and maybe sound less natural to you than a pure hiss. BTW I agree that pure hiss is better than blocky noise. I was testing 4.3x vs 4.41 and probably should be testing only 4.31 encoders.
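
For readers unsure how to weigh ABX scores of the kind reported in this thread, the usual reading is a one-sided binomial test: the probability of getting at least that many trials right by pure guessing. The sketch below assumes independent trials with a 50% chance per guess; the function name is made up for this example, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` pure guesses."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


for score in (8, 7, 5, 3):
    print(f"{score}/8: p = {abx_p_value(score, 8):.3f}")
```

A 7/8 result has roughly a 3.5% chance of occurring by guessing alone, so with only 8 trials it is suggestive rather than conclusive.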
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-07 09:07:07
Tried to abx between 3 herding lossy encodes. I failed at -ans vs s0, abxed 7/8 between -ans and s0.4 and 7/8 with s0 vs s0.4.

At 270 kbps the differences of the settings for the serious problems were often abxable but that doesn't tell which setting is the less annoying one in an overall sense. Which of course is a matter of taste to some extent.
But I will go testing herding again cause you are writing about hiss ("normal" noise?) which was not something in my focus. Maybe I missed something.
BTW Atem-lied and trumpet are probably the better samples for this distortion-like noise. trumpet is definitely the most outstanding sample to show how extremely bad things can go when making noise shaping dynamic (using OptimFrog DualStream in this case - but basically it's the same thing with wavPack auto noise shaping).
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 09:23:07
Yeah, I'll try trumpet, which BTW never struck me as an annoying sample, at least using 300k. Dualstream --ans has issues below the quality 3 setting; at Q3 --ans seems to work well. Does that make sense? Guruboolez also wrote something similar.
Title: WavPack 4.41 beta3 available
Post by: xmixahlx on 2007-05-07 09:32:49
i packaged beta3 for rarewares/debian - available now.


thanks, david!
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 09:40:54
Ok, trumpet -ans sounds like DOLBY C and trumpet s0.4 like no DOLBY.
Ha! It's really weird. I abxed 7/8 -ans vs original and 7/8 s0.4 vs original. s0.4 is more hissy but -ans more 'loud'. Failed to abx between -ans and s0.4 (3/8).
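
For readers weighing scores like 7/8 or 3/8: a quick way to judge an ABX result is the chance of getting at least that many trials right by pure guessing. The sketch below is only an illustration, assuming each trial is an independent 50/50 guess; the function name abx_p_value is made up for this example and is not part of any ABX tool.

```python
from math import comb


def abx_p_value(correct, trials):
    """Chance of scoring at least `correct` out of `trials` by guessing alone.

    Assumes independent trials with a 0.5 success probability each
    (one-sided binomial tail).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    hits = sum(comb(trials, k) for k in range(correct, trials + 1))
    return hits / 2 ** trials


if __name__ == "__main__":
    # Scores of the kind reported in this thread, all out of 8 trials.
    for score in (8, 7, 6, 5, 3):
        print(f"{score}/8: p = {abx_p_value(score, 8):.3f}")
```

Under these assumptions 7/8 comes out at roughly 3.5% and 8/8 at roughly 0.4%, while 5/8 and 3/8 are well within what guessing produces. With only 8 trials per comparison the results are suggestive rather than conclusive.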
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-07 10:34:31
You're talking about WavPack, fast mode and a bitrate of 270 kbps (or similar)?
As you never write about anything like an artefact or a distortion, I really wonder whether we're talking about the same thing. Or maybe our ears and minds are really very different in categorizing this error. To me this error falls into a totally different class than hiss (hiss as comparable to tape noise). Maybe you want to try trumpet with OptimFROG fast mode ans quality 1 (in order to match ~270 kbps) to make sure we're talking about the same thing.
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 12:23:37
That Dualstream setting is bad, I agree, but that is not what I hear with WavPack. Maybe you borked some WavPack settings?

My wavpack file is 293k with the following command:

-i -qfx5b270 - %d 
MD5:C2B49CA7788135916141374B2E1976C6


ABX wavpack -ans 270k vs DS --ans Q3: 8/8 [WV wins]
ABX wavpack 4.3 -ans vs. wavpack 4.3 s0: 5/8 [failed to pick them apart again].


Also, the Dualstream --ans problems are severe: I easily abx 8/8 at quality 3 --ans 332k. So --ans can cause problems even at Q3, at least on this sample. Anyway, Florin himself favours flat noise with VBR over --ans. Again, I don't want to draw conclusions from 1 or 2 samples.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-07 13:14:33
Quote
That Dualstream setting is bad I agree, but that is not what I hear with wavpack. You may have borked some wavpack settings maybe ?

My wavpack file is 293k with the following command:

-i -qfx5b270 - %d 
MD5:C2B49CA7788135916141374B2E1976C6

Hmmm, to me it's essentially the same kind of noise with WavPack, only not as bad as Dualstream's trumpet result. So I guess you really don't have a distortion-like problem with WavPack.
In my test I used WavPack 4.41b3 -fb270x4s0mi (or another value for s, or no s switch at all) and
WavPack 4.xx experimental -fb270x4mi for the auto noise shaping variant.
What is the -q switch?
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 13:26:40
The difference may be down to the 4.4 encoder performing better than 4.3. I tried 4.3 -ans vs 4.41 using your settings and felt that 4.4 was cleaner. I failed again at 6/8 though. The difference is not great. I think -ans or negative shift might add a subtle 'whooo' noise as opposed to the normal 'tsssss' hiss. But with my current abx test it doesn't show. So far Bryant's -ans is stable enough for me - better than Ghido's and better than the current WavPack default. Maybe manual tunings are sometimes better, but the point is to provide some compromise between cryptic command lines (which I hate) and quality. An advanced user can always override it anyway.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-07 14:49:23
... I think -ans or negative shift might add a subtle 'whooo' noise as opposed to the normal 'tsssss' hiss. ...
Guess we're talking about the same thing now, and it's just subtle to you and rather annoying to me which may be a matter of taste, as is this:
... So far Bryant's -ans is stable enough for me - better than Ghido's and better than the current wavpack default. Maybe manual tunings are sometimes better ...
which depends for instance on how to weigh the distortion-like problems against the hiss problems, how to judge the difference between ans and s0.4 for better and for worse, and how to weigh the fact that with ans the noise character can change rapidly within a track. It's individual judgement, that's all.
Anyway, I'd prefer an equal loudness curve based noise shaping over individual parameters, but when it comes to a wish list I feel better quality control is more of a concern. No matter which noise shaping preference, WavPack lossy usually is very good even with fast mode at ~ 300 kbps, and it would be great to have a machinery which uses more bits in situations where the effective S/N ratio is bad at ~ 300 kbps. And from my understanding this is easier to implement than a complicated noise shaping scheme (though a noise shaping which keeps noise flat in frequency for say 35% of the noise and puts the other 65% into the area above say 12 kHz [to a strongly increasing degree the higher the frequency, pretty soft at ~ 12 kHz, still rather soft at ~ 15 kHz, but strong at ~ 21 kHz] sounds rather appealing to me).
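
To make the 35% / 65% idea a bit more concrete, here is a rough sketch of what such a target noise distribution could look like across frequency bins. It only computes a target spectrum, not an actual noise shaping filter, and it has nothing to do with how WavPack works internally. The quadratic ramp, the 22.05 kHz Nyquist frequency (44.1 kHz audio), the 64 bins and the name target_noise_shares are all assumptions made up for this illustration.

```python
def target_noise_shares(freqs_hz, flat_share=0.35, knee_hz=12000.0,
                        nyquist_hz=22050.0, exponent=2.0):
    """Return one noise power share per frequency bin; the shares sum to 1.

    flat_share of the noise power is spread evenly over all bins.
    The remainder is placed above knee_hz, growing with frequency.
    The ramp shape (exponent) is an arbitrary choice for this sketch.
    """
    n = len(freqs_hz)
    span = nyquist_hz - knee_hz
    # Ramp weight: zero up to the knee, then rising towards Nyquist.
    ramp = [max(0.0, (f - knee_hz) / span) ** exponent for f in freqs_hz]
    ramp_total = sum(ramp)
    if ramp_total == 0.0:
        raise ValueError("no frequency bin lies above the knee")
    flat = flat_share / n
    return [flat + (1.0 - flat_share) * r / ramp_total for r in ramp]


if __name__ == "__main__":
    # 64 evenly spaced bins from 0 Hz up to Nyquist.
    freqs = [i * 22050.0 / 63 for i in range(64)]
    shares = target_noise_shares(freqs)
    for f, s in zip(freqs[::9], shares[::9]):
        print(f"{f:8.0f} Hz: {100 * s:5.2f} % of noise power")
```

With the quadratic ramp the extra noise is barely above the flat floor near 12 kHz, still small around 15 kHz, and dominant near 21 kHz, which is the general shape described above. A different exponent or knee would shift that balance.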
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-07 16:04:18
I too favour some quality control VBR over anything else, and yeah, I did stumble onto another example where -ans adds some unnatural noise. I still think it's a good thing to have for the lower 'rockbox' bitrates and even for higher bitrates as long as it's predictable enough. Maybe something there could still use some tuning.
Title: WavPack 4.41 beta3 available
Post by: halb27 on 2007-05-09 10:23:01
I'm about to figure out how to do some quality control on my own, and I found something I didn't know which may be the reason why I prefer s0.4 over s0: So s0.4 may be the best compromise for me (and other people of a certain advanced age) weighing the distortion problem against the hiss problem, but for younger people this may be different.

ADDED later:
Taking into account that according to the equal loudness curve we're pretty sensitive in roughly the 700 Hz ... 6 kHz area (below and above, sensitivity starts dropping rather steeply), and that we're sensitive to noise especially in the 6 kHz area, a noise shaping scheme that keeps noise out of the ~ 700 Hz ... 10 kHz area to a large extent would be very welcome IMO.
Title: WavPack 4.41 beta3 available
Post by: shadowking on 2007-05-09 10:49:08
There is some truth there. It also depends on the music you listen to. For electronic, HF-aggressive sounds, positive shaping may be helpful in some situations to reduce 'noise'. For classical music and some instrumental music, negative shaping will reduce the hiss that might be perceived on long notes - flute, violin, solo vocal etc.
Title: WavPack 4.41 beta3 available
Post by: Porcupine on 2007-05-14 03:26:21
Nice lively discussion here. I'm only a new user of WavPack but so far I pretty much agree with most of what was said here.

halb27, oh so that is how old you are. I was wondering, but I would never ask such a thing directly because it's impolite.  I also discovered something strange about high-freq hearing just yesterday that I didn't know previously (I was trying to artificially create more problem samples for WavPack). High-freq sensitivity greatly depends on whether my ear is directly facing the speaker (as would be the case with headphones, but not normally the case with loudspeakers which is what I use) or off-angle a bit. If the ear faces directly then the high-freq sounds can be shockingly louder, possibly perceptually a +30 dB gain at least for me. I discovered this in personal testing with sine waves and white noise in the 15 - 19 kHz region. I don't know if that effect would apply down to 2 - 14 kHz though, I think probably not, but it might be something fun to try for yourself. BTW I'm talking about making sure the ear faces the speaker, not making sure the speaker faces the ear (which is also critical to high-frequency response, but that is speaker acoustics, not ear physiology).

I also discovered just how badly damaged my left ear is compared to my right ear. I had mentioned this earlier in my initial controversial thread in the mp3 section (where everyone got mad at me), but I had no idea it was as bad as I just discovered. Well, my right ear is damaged also (I damaged my hearing badly a couple years ago) but the fact that it is so different from my left ear now proves to me the extent of what I did (if both ears were damaged the same I might 'forget' what music used to sound like to me...even now I'm sure I forgot somewhat because my damaged right ear is now my only currently available reference). My left ear is almost completely deaf in the 15 kHz to 17 kHz region (it's probably utterly deaf above). Assuming I understand my hardware setup correctly, at 60 dB my left ear was deaf (my amp says -50 dB on the digital readout, which I *think* corresponds to 60 dB sound level given my speaker/amp/computer settings). But my right ear was blasted with sound, as loud as typical listening volume of most things. Then I discovered my left ear still works okay (+30 dB) if I turn my ear facing the speaker...yay.  Otherwise, I can kind of still hear those sounds in my left ear "normally" if I crank my amp up +30 dB.

I don't think my right ear, even as it is now, is that much less sensitive at 15 kHz as it is at 3 - 6 kHz.

Anyway, for me at least the bottom line is that I think different people (age groups, genetics, etc) respond to high freqs too differently, so while I agree that "equal loudness" noise-shaping might be good, I'm not overly interested in it. I've seen some amazingly different looking equal-loudness curves when conducted by different studies, so I have no idea which to believe. I feel satisfied with the basic noise-shaping options as they are.

I'd be a lot happier to see a semi-psychoacoustic VBR quality control for WavPack, as I mentioned before in PM. Even though I am willing to spend the bits, I'd still feel foolish encoding everything at 400-450 kbps just because of the occasional highly-tonal, completely unmasked problem sample...when most of the time I think WavPack lossy is transparent (for the typical kind of music I listen to) at 256 - 320 kbps.