lossyWAV Development

Reply #900 – 2008-03-03 09:03:41

Quote from: halb27 on 2008-03-03 09:01:49

Your new v0.8.0 settings are very attractive to me.
A well-spaced differentiation in quality parameters IMO, and everybody's needs should be satisfied by one of these settings.

Casual listening to v0.8.0 -7 is not revealing any glaring problems, so I'm happy!

Reply #901 – 2008-03-03 12:04:18

I take back my previous concerns. With Skew (is it fixed internally at 36?) and SNR, it's much harder to make the quality fall off a cliff, at least for samples where the lowest bins are at lower (more audible) frequencies. I'm guessing any "problems" will be for samples where the lowest bins are at higher frequencies (typically less audible).

I'm really impressed with the way all this tuning has come together - well done Nick.C, halb27, and other listeners.

You do realise that you've engineered a kind of crude psychoacoustic model?

Cheers,
David.

Reply #902 – 2008-03-03 12:33:01

Quote from: 2Bdecided on 2008-03-03 12:04:18

I take back my previous concerns. With Skew (is it fixed internally at 36?) and SNR, it's much harder to make the quality fall off a cliff, at least for samples where the lowest bins are at lower (more audible) frequencies. I'm guessing any "problems" will be for samples where the lowest bins are at higher frequencies (typically less audible).

I'm really impressed with the way all this tuning has come together - well done Nick.C, halb27, and other listeners.

You do realise that you've engineered a kind of crude psychoacoustic model?

Cheers,
David.

Oops - that wasn't what was meant to happen!! It does seem to work though. Skew is indeed fixed at 36dB.

I think the final element which has allowed the bitrate to be reduced to the level that it has at v0.8.0 -7 is the addition of the variable maximum_bits_to_remove.

Very happy with the results - will move to v0.8.1 RC3 after a couple of days delay for problem reports.

Reply #903 – 2008-03-04 00:37:16

Quote from: 2Bdecided on 2008-03-03 12:04:18

I take back my previous concerns. With
It's much harder to make the quality fall off a cliff...
You do realise that you've engineered a kind of crude psychoacoustic model?

I wonder how it will sound at 200kbps. Has there been any experimentation at that low a bitrate? I'm fairly certain it would be inferior to the average mp3, but I'm starting to get curious as to just how far the bitrate can be lowered...

So here's my suggestion. Why not go all the way to the bottom of the bitrate barrel, and tune your way up? That's what Aoyumi did/does with Vorbis, which gave it (literally) the best lossy quality in the world. Apparently, you can scale up the changes you make in the lower bitrates to the higher ones, and all bitrates would end up with the benefit.

The point is, it's much easier to catch and tune for artifacts at low bitrates. Once tuned, though, the tuning would apply to practically all bitrates, making all quality levels better...see what I'm saying? There's no way I can abx 350kbps, but if you sent me down to 200, I could, and we can "tune things up."

[edit]On a different note, on many files, the difference between -7c and -6c is under 7kbps. This doesn't seem like what was intended...

Reply #904 – 2008-03-04 08:30:30

Quote from: The Sheep of DEATH on 2008-03-04 00:37:16

Quote from: 2Bdecided on 2008-03-03 12:04:18
I take back my previous concerns. With
It's much harder to make the quality fall off a cliff...
You do realise that you've engineered a kind of crude psychoacoustic model?
I wonder how it will sound at 200kbps. Has there been any experimentation at that low a bitrate? I'm fairly certain it would be inferior to the average mp3, but I'm starting to get curious as to just how far the bitrate can be lowered...

So here's my suggestion. Why not go all the way to the bottom of the bitrate barrel, and tune your way up? That's what Aoyumi did/does with Vorbis, which gave it (literally) the best lossy quality in the world. Apparently, you can scale up the changes you make in the lower bitrates to the higher ones, and all bitrates would end up with the benefit.

The point is, it's much easier to catch and tune for artifacts at low bitrates. Once tuned, though, the tuning would apply to practically all bitrates, making all quality levels better...see what I'm saying? There's no way I can abx 350kbps, but if you sent me down to 200, I could, and we can "tune things up."

[edit]On a different note, on many files, the difference between -7c and -6c is under 7kbps. This doesn't seem like what was intended...

I do not really want to try to go that low.... Some of the albums I've processed using v0.8.0 -7 are coming in at about 280kbps - with no glaring artifacts. I think that the main objectives of the development process have been met (or exceeded) and I am content with the current -7.

Overall, as the encoded processed file will carry the file extension of the encoder, I want to make sure that the quality of any processed output will not negatively skew public opinion against the lossless encoder.

I too am interested in "how low can we go?" - so I'll post beta v0.8.1 with a revised -nts maximum value.

On the -7c / -6c bitrate delta, I think that that means that we are approaching a limit imposed by the combination of the parameters used to maintain quality and therefore it is working perfectly. Always remember, lossyWAV is pure VBR.

lossyWAV beta v0.8.1 attached to post #1 in this thread.

From a test using my 53 problem sample set:

|-----|-----------|-----------|-----------|
| SNR |  NTS=18   |  NTS=21   |  NTS=24   |
|-----|-----------|-----------|-----------|
|   6 | 305.8kbps | 295.2kbps | 287.8kbps |
|   7 | 307.3kbps | 297.1kbps | 289.9kbps |
|   8 | 309.2kbps | 299.2kbps | 292.3kbps |
|   9 | 311.2kbps | 301.6kbps | 294.9kbps |
|  10 | 313.6kbps | 304.2kbps | 297.8kbps |
|  11 | 316.3kbps | 307.3kbps | 301.1kbps |
|  12 | 319.7kbps | 311.1kbps | 305.2kbps |
|  13 | 323.8kbps | 315.6kbps | 310.1kbps |
|  14 | 328.3kbps | 320.6kbps | 315.4kbps |
|  15 | 333.2kbps | 326.0kbps | 321.1kbps |
|-----|-----------|-----------|-----------|

From which, -snr 15 -nts 18 and -snr 14 -nts 21 might be reasonable. I listened to -snr 6 -nts 24 and it was awful and -snr 9 -nts 24 wasn't much better.... I would consider the lower limit for -snr to be 12 and the upper limit for -nts to be 21.

Reply #905 – 2008-03-04 12:39:48

Don't forget, Sheep, that lossyWAV can only add spectrally flat noise. If you push it far enough, you'll just end up with something that's a very complex way of delivering a 5-bit LPCM file!

Tuning at a point where you can hear the noise, and then cranking the bitrate up, does have merit. However, it makes more sense when the noise is shaped to match the music. lossyWAV doesn't do that. It still makes some sense, however.
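To put a rough number on the "5-bit LPCM" remark, here is a minimal sketch of plain bit removal by rounding. It is not lossyWAV's code: `remove_bits` and `noise_power` are names made up for this illustration, and the random input is only a stand-in for 16-bit audio. The step-squared-over-12 figure is the textbook power of uniform rounding error, which is roughly flat across frequency as long as the signal is much larger than the step.

```python
import random

def remove_bits(samples, bits):
    """Round each integer sample to the nearest multiple of 2**bits."""
    step = 1 << bits
    half = step // 2
    return [((s + half) // step) * step for s in samples]

def noise_power(original, processed):
    """Mean squared difference between two equal-length sample lists."""
    return sum((p - o) ** 2 for o, p in zip(original, processed)) / len(original)

if __name__ == "__main__":
    random.seed(0)
    # Stand-in for 16-bit audio; the range is kept clear of full scale so
    # rounding cannot overflow (a real tool has to deal with clipping).
    x = [random.randint(-30000, 30000) for _ in range(20000)]
    for bits in (4, 8, 11):  # removing 11 of 16 bits leaves 5 bits
        y = remove_bits(x, bits)
        step = 1 << bits
        # Measured noise power next to the textbook step**2 / 12 estimate.
        print(bits, noise_power(x, y), step * step / 12)
```

Every extra bit removed doubles the step and so raises the added noise by about 6 dB, whatever the music is doing at that moment.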

Cheers,
David.

Reply #906 – 2008-03-04 13:26:56

I don't think we should go lower than -7 at this stage. My guess is that -7, and maybe higher settings, can be ABXed on a quiet passage with the volume cranked right up. It's not normal listening, but it's something to consider. Quality will collapse below 240 kbps or somewhere near that. With WavPack and DualStream it's possible to get good output at 235 kbps, especially on louder music, but audible hiss/noise in quiet passages will be there and will not be hard to hear on some critical samples. I don't know how lossyWAV will sound with a quality collapse - it could be spurts of offensive noise rather than just hiss. 280 kbps can yield mostly transparent results, I think, but 235 kbps is pushing it to the limit, and lossyWAV doesn't need a bad rep. There are better solutions below 250 kbps.

Reply #907 – 2008-03-04 13:36:49

Quote from: shadowking on 2008-03-04 13:26:56

I don't think we should go lower than -7 at this stage. ...

Exactly what I am thinking. We've reached an average bitrate of ~310 kbps with very good quality, and quality drops more than bitrate when trying to achieve significantly more - at least with the current techniques.

Nick, I think it was me who made you stop investigating the noise shaping approach any further. But that was in a different situation. I still wouldn't like a development with a weak basis when it comes to the -3 or -2 quality region, especially as this approach isn't intrinsically safe - unlike using -skew and -snr or the RMS-oriented max_to_remove_bits, which only make the basic approach more defensive. But now things have changed and there's interest in going rather low in bitrate while allowing the utmost quality to be missed a bit. Moreover, I think it's safe to say the techniques used so far have matured. In this situation I'd like to encourage you to continue with what you once started, in case you are interested.

Reply #908 – 2008-03-04 13:44:16

Quote from: shadowking on 2008-03-04 13:26:56

I don't think we should go lower than -7 at this stage. My guess is that -7, and maybe higher settings, can be ABXed on a quiet passage with the volume cranked right up. It's not normal listening, but it's something to consider. Quality will collapse below 240 kbps or somewhere near that. With WavPack and DualStream it's possible to get good output at 235 kbps, especially on louder music, but audible hiss/noise in quiet passages will be there and will not be hard to hear on some critical samples. 280 kbps can yield mostly transparent results, I think, but 235 kbps is pushing it to the limit, and lossyWAV doesn't need a bad rep. There are better solutions below 250 kbps.

I hear what you are saying - especially about not needing a bad reputation.....

The hiss on quiet passages may already be mitigated by the variable maximum_bits_to_remove which takes into account the RMS value of the codec_block being processed.
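The thread does not give the actual formula behind maximum_bits_to_remove, so the following is only a guess at the general shape of such a rule: cap the number of removable bits by how loud the block is, so that quiet blocks keep their low-order bits. The function names, the `headroom_bits` margin and the `hard_cap` value are all assumptions made for this sketch, not lossyWAV parameters.

```python
import math

def block_rms(block):
    """Root-mean-square level of one block of integer samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))

def max_bits_to_remove(block, headroom_bits=3, hard_cap=12):
    """Hypothetical cap: keep at least `headroom_bits` bits above the
    rounding step relative to the block's RMS level."""
    rms = block_rms(block)
    if rms < 1.0:
        return 0  # silence or near-silence: remove nothing
    limit = int(math.log2(rms)) - headroom_bits
    return max(0, min(hard_cap, limit))

if __name__ == "__main__":
    quiet = [10, -10, 10, -10]
    loud = [8192, -8192] * 8
    print(max_bits_to_remove(quiet), max_bits_to_remove(loud))
```

With a rule of this kind, the bit removal chosen by the FFT analysis would be clamped per codec_block, which is one way the hiss on quiet passages could be held back.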

Off at a tangent, at the moment there are 3 spreading-function strings for -1, -2 and -3 (-4 to -7 being copies of -3). As the spreading-function string from -3 has done so well for -3 to -7, is there any merit in making all the spreading function strings the same as -3?

If this happened, then I could envisage a modification where quality could be specified between 0 and 1 where 0 = -7 and 1 = -1, using say 3 decimal points resolution, with 0.5 equating to the current -3.

Also, would it be beneficial to shift to 3 FFT analyses for quality preset -1? Possibly: if quality < 0.5 then FFT analyses = 2; if quality >= 0.5 then FFT analyses = 3.

Using the -3 spreading function, the revised -1 would produce 504.1kbps for my 53 problem sample set using the original 4 FFT analyses (501.3kbps with 3 FFT analyses) and the revised -2 would produce 468.2kbps using 3 FFT analyses.
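The continuous quality scale floated above could be sketched as follows. Only the three anchor points (0 = -7, 0.5 = -3, 1 = -1), the 3-decimal resolution and the FFT analyses rule come from the post; the straight-line interpolation between the anchors and the function names are assumptions for illustration.

```python
def preset_equivalent(quality):
    """Map quality in [0, 1] to a position on the -1..-7 preset scale
    (returned as a positive number, e.g. 3.0 for preset -3)."""
    q = round(quality, 3)  # 3 decimal places of resolution
    if not 0.0 <= q <= 1.0:
        raise ValueError("quality must be between 0 and 1")
    if q <= 0.5:
        return 7.0 - 8.0 * q          # 0 -> 7, 0.5 -> 3
    return 3.0 - 4.0 * (q - 0.5)      # 0.5 -> 3, 1 -> 1

def fft_analyses(quality):
    """Number of FFT analyses under the proposed rule."""
    return 2 if round(quality, 3) < 0.5 else 3

if __name__ == "__main__":
    for q in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(q, preset_equivalent(q), fft_analyses(q))
```

Parameters such as -nts and -snr would then have to be interpolated along the same scale, which is where the discrete presets are arguably easier to reason about.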

Reply #909 – 2008-03-04 13:44:33

The way I see it, there are basically three ranges of bitrates in mainstream music: 64-320 kbps (the upper limit being that of MP3 CBR); 600-1000 kbps (lossless codecs); and the gap in between, which lossy codecs such as WavPack Hybrid, OptimFROG DualStream and lossyWAV would fill quite nicely, IMO. I don't see much point in competing in two fields where there's already quite a lot of competition.

At 320 kbps, it doesn't take more space than the highest quality MP3s that some people swear by, so if it's transparent and more suitable for transcoding than psychoacoustic codecs, I'm happy with it.

Reply #910 – 2008-03-04 13:50:19

Quote from: halb27 on 2008-03-04 13:36:49

Quote from: shadowking on 2008-03-04 13:26:56
I don't think we should go lower than -7 at this stage. ...
Exactly what I am thinking. We've reached an average bitrate of ~310 kbps with very good quality, and quality drops more than bitrate when trying to achieve significantly more - at least with the current techniques.

Nick, I think it was me who made you stop investigating the noise shaping approach any further. But that was in a different situation. I still wouldn't like a development with a weak basis when it comes to the -3 or -2 quality region, especially as this approach isn't intrinsically safe - unlike using -skew and -snr or the RMS-oriented max_to_remove_bits, which only make the basic approach more defensive. But now things have changed and there's interest in going rather low in bitrate while allowing the utmost quality to be missed a bit. Moreover, I think it's safe to say the techniques used so far have matured. In this situation I'd like to encourage you to continue with what you once started, in case you are interested.

My noise shaping attempt was in retrospect agricultural to say the least, including a bit of guesswork - it was quite rightly consigned to the recycler. I would really like to be able to understand how noise shaping works and, more importantly, how to implement it in this context - however, I haven't yet found any sources which are understandable to me.

To use noise shaping which relates to the music may be an infringement of the patents David mentioned some time ago however.

Reply #911 – 2008-03-04 14:46:12

Quote from: shadowking on 2008-03-04 13:26:56

My guess is that -7, and maybe higher settings, can be ABXed on a quiet passage with the volume cranked right up. [..] Quality will collapse below 240 kbps or somewhere near that.
[..] 280 kbps can yield mostly transparent results, I think, but 235 kbps is pushing it to the limit, and lossyWAV doesn't need a bad rep.

Although this could be true, there is a bit of guessing involved.
Two points to keep in mind:
- (as 2Bdecided keeps telling us) the bitrates are not fixed, so the resulting bitrate for a loud track can be quite different from that of a quieter track (280k may be OK for one track while another might need 380k)
- lossyWAV does a good job of avoiding problems in quiet passages.

I agree there is no sense in having an awful sounding pre-set at the achieved bit rates. It seems that 0.8.0b hit a fairly good range of workable settings.

Reply #912 – 2008-03-04 15:58:08

Quote from: Nick.C on 2008-03-04 13:44:16

... As the spreading-function string from -3 has done so well for -3 to -7, is there any merit in making all the spreading function strings the same as -3? ...

I see sense in having the spreading a little bit more demanding with -2 and especially -1, because these settings are meant to provide a certain safety margin. I wouldn't put this only into the -nts value.
IMO there's no need for a change, but instead of changing the spreading I'd rather use 3 analyses instead of 4 with -1 and maybe just 2 with -2. This would speed things up, and I don't think this many analyses are really necessary.

I personally don't like a continuous quality scale but prefer it the way it is. Discrete values make me feel better as the quality details are more transparent.

Reply #913 – 2008-03-04 17:11:52

Quote from: Nick.C on 2008-03-04 13:50:19

My noise shaping attempt was in retrospect agricultural to say the least, including a bit of guesswork - it was quite rightly consigned to the recycler. I would really like to be able to understand how noise shaping works and, more importantly, how to implement it in this context - however, I haven't yet found any sources which are understandable to me.

Do you want me to dig out my fixed noise shaping version? I think it worked properly. It was a long time ago!

Cheers,
David.

Reply #914 – 2008-03-04 17:54:23

Quote from: 2Bdecided on 2008-03-04 17:11:52

Quote from: Nick.C on 2008-03-04 13:50:19
My noise shaping attempt was in retrospect agricultural to say the least, including a bit of guesswork - it was quite rightly consigned to the recycler. I would really like to be able to understand how noise shaping works and, more importantly, how to implement it in this context - however, I haven't yet found any sources which are understandable to me.
Do you want me to dig out my fixed noise shaping version? I think it worked properly. It was a long time ago!

Cheers,
David.

That would be wonderful - I can understand your code.

Reply #915 – 2008-03-05 16:43:53

Nick,

Here it is. Hope it's some use to you. I'm sure SebG could explain noise shaping pretty well.

No claims that this is correct, but it seems to work. It's "optimised" for debugging, not reading or running!

NOTE: This is only provided to demonstrate fixed noise shaping. Don't use it to encode anything - it's a hack of two old versions and the rest of the code probably doesn't work properly.

Note too that I don't think it handles zero bits to remove properly. Without dither, it's easy to get limit cycles in this instance.

You'll have to figure out how much noise shaping "buys" you - obviously it depends on the input signal, which is why I didn't use fixed noise shaping - but it's probably useful if you're aiming for lower bitrates.

Cheers,
David.

Reply #916 – 2008-03-05 17:09:33

Quote from: 2Bdecided on 2008-03-05 16:43:53

Nick,

Here it is. Hope it's some use to you. I'm sure SebG could explain noise shaping pretty well.

No claims that this is correct, but it seems to work. It's "optimised" for debugging, not reading or running!

NOTE: This is only provided to demonstrate fixed noise shaping. Don't use it to encode anything - it's a hack of two old versions and the rest of the code probably doesn't work properly.

Note too that I don't think it handles zero bits to remove properly. Without dither, it's easy to get limit cycles in this instance.

You'll have to figure out how much noise shaping "buys" you - obviously it depends on the input signal, which is why I didn't use fixed noise shaping - but it's probably useful if you're aiming for lower bitrates.

Cheers,
David.

Thanks very much David, I'll try to get my teeth into it tonight....

Reply #917 – 2008-03-05 17:34:58

Hi all,

Does anyone know if there are issues using FLCDrop with the latest version of lossyWAV? The reason I ask is all these new settings. In FLCDrop, AFAIK, there's just the old 1, 2, 3, and I was wondering if those command switches are still relevant, with -3c and -7a et al.?

Thanks.
C.

Reply #918 – 2008-03-05 19:17:12

I'm 2 inches from releasing an updated version of the batch file and the front end. So far, the changelog looks like this, but it's not guaranteed final yet.

lFLCDrop Change Log:
v1.2.0.5
- presets updated to -1 through -7
- all presets create correction files, except custom

lFLC.bat Change Log:
v1.0.0.7
- added automatic functionality for the -merge option
- new variable in custom preset to enable/disable automatic merging
- custom preset defaults match normal -2 preset functionality

I'm just dealing with a possible bug (or screw up on my part) for the automatic -merge function, and then merging that code into the custom preset section, and it should be fully updated and synced with current lossyWAV "goings-on".

Re: the automatic merge function: if the FLAC file to decode has custom metadata, it will check the decoded WAV file for the lossyWAV "tag". If it's a lossyWAV file, it will see if a .lwcdf.flac exists and decode it to .lwcdf.wav; if no .lwcdf.flac exists, it will check for a .lwcdf.wav, or else exit. In the first two cases, where a lossyWAV correction file exists, it will ultimately run the -merge option and delete the two lossyWAV files (the .wav files, not the source .flac files).
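The decision flow described above can be written out as a small pure function. This is not the batch file's code, just a sketch of the described logic: `plan_merge`, its arguments and the action tuples are hypothetical, and the tag check is assumed to have been done already and passed in as a boolean.

```python
def plan_merge(base, is_lossywav, existing):
    """Return the ordered actions for one decoded file.

    base        -- file name without extensions, e.g. "track01"
    is_lossywav -- True if the decoded WAV carries the lossyWAV tag
    existing    -- set of file names present next to the decoded file
    """
    lossy_wav = base + ".lossy.wav"
    corr_flac = base + ".lwcdf.flac"
    corr_wav = base + ".lwcdf.wav"

    if not is_lossywav:
        return []  # ordinary FLAC: nothing to merge
    actions = []
    if corr_flac in existing:
        # Case 1: correction file is still FLAC-encoded, decode it first.
        actions.append(("decode", corr_flac, corr_wav))
    elif corr_wav not in existing:
        return []  # no correction file at all: exit
    # Cases 1 and 2: a correction WAV is now available.
    actions.append(("merge", lossy_wav, corr_wav))
    actions.append(("delete", lossy_wav, corr_wav))  # WAVs only, not the FLACs
    return actions

if __name__ == "__main__":
    print(plan_merge("track01", True, {"track01.lossy.flac", "track01.lwcdf.flac"}))
```

Keeping the plan separate from the file operations makes it easy to check each branch (FLAC correction file, WAV correction file, none) without touching the disk.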

[edit] yep, already thought of a needed change to the changelog... to add a custom preset variable to toggle the deleting of the pre-merged .lossy.wav files. and i also realized that i'm not handling the encoding of a .lwcdf.wav file to .lwcdf.flac file (if it exists) when encoding an already lossy .wav file. wowzahs. now i'm more than 2 inches away. [/edit]

Reply #919 – 2008-03-05 20:36:48

Quote from: jesseg on 2008-03-05 19:17:12

now i'm more than 2 inches away

Thanks jesseg for the update. Regardless of how many inches, I shall wait for your new release.

Good luck with it.

C.

Reply #920 – 2008-03-05 21:09:48

Quote from: jesseg on 2008-03-05 19:17:12

Re: the automatic merge function: if the FLAC file to decode has custom metadata, it will check the decoded WAV file for the lossyWAV "tag". If it's a lossyWAV file, it will see if a .lwcdf.flac exists and decode it to .lwcdf.wav; if no .lwcdf.flac exists, it will check for a .lwcdf.wav, or else exit. In the first two cases, where a lossyWAV correction file exists, it will ultimately run the -merge option and delete the two lossyWAV files (the .wav files, not the source .flac files).

Thanks for the PM: -merge function now appears to be working (if both files are in the same place.....).

I've had a play with the method David supplied for noise shaping and early though it is, I'd like to get a second opinion from better ears.

Static noise shaping has been employed and is not optional (at this time - I'll make it optional later, v0.8.1 is still available). I have listened to -7 -nts 30 -snr 12 and it's "acceptable" but I have limited allowable volume (kids in bed) and would like more ears to listen in. For my 53 problem sample set it produces 301.8kbps(!).

lossyWAV beta v0.8.2 attached to post #1 in this thread.

Reply #921 – 2008-03-06 15:02:04

Interesting. You might be aware of this, but the first post says 0.83 is actually available already (unannounced!), but no download links other than 0.6.7rc2 are available!

Maybe you're just in the process of updating?

Reply #922 – 2008-03-06 15:35:55

Quote from: The Sheep of DEATH on 2008-03-06 15:02:04

Interesting. You might be aware of this, but the first post says 0.83 is actually available already (unannounced!), but no download links other than 0.6.7rc2 are available!

Maybe you're just in the process of updating?

Oops - what a mistake.

lossyWAV beta v0.8.3 attached to post #1 in this thread.

Reply #923 – 2008-03-06 16:29:10

I've noticed that with shaping turned on (originally just -7c), the resulting FLAC is actually 1.5% larger. Is this intentional? Or are tweaked -snr and -nts options a must with shaping?

I tried -7 -nts 30 -snr 12 -shaping, but quality was very scratchy (read: added noise) on the piano sample I tested with. In terms of artifacts, -snr 12 -nts 21 without shaping actually produced the better result on this sample, at roughly the same bitrate.

Maybe I got a b0rked build? I guess I can upload the sample a bit later. Cheers!

Reply #924 – 2008-03-06 18:23:18

Hi, 2B!

I just skimmed through LossyFLAC.m and noticed that there's a misunderstanding regarding filter coefficients. The filter coefficients from "the book" are b=[2.033 -2.165 1.959 -1.590 0.6149]; which corresponds to H(z)=2.033-2.165*z^-1...+0.6149*z^-4. But this isn't actually the noise shaping filter in this case. 1-z^-1*H(z) is. It's common and popular to write the transfer function of noise shaping filters as 1-z^-1*H(z). So, in case you have the filter coefficients for H(z) and want to plot the frequency response of the actual noise shaping filter, you need to use freqz([1 -b]) for the FIR cases. Since you're removing the leading coefficient and inverting signs, you just need to skip this part for the "book filter".

You'll see that the response of the filter isn't that bad after all. Its deviation from the one I was suggesting is within +/-5 dB at nearly all frequencies.
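For anyone without MATLAB to hand, the freqz([1 -b]) step can be approximated in a few lines of plain Python. This is only a sketch: the function names are invented here, and the 44.1 kHz sample rate is an assumption (the post does not state one). It evaluates 1 - z^-1*H(z) on the unit circle for the "book" coefficients.

```python
import cmath
import math

# H(z) coefficients quoted in the post ("the book" filter).
B_BOOK = [2.033, -2.165, 1.959, -1.590, 0.6149]

def shaping_filter_taps(h):
    """Taps of 1 - z^-1 * H(z), the equivalent of MATLAB's [1 -b]."""
    return [1.0] + [-c for c in h]

def magnitude_db(taps, freq_hz, fs=44100.0):
    """Magnitude response of an FIR filter at one frequency, in dB."""
    w = 2.0 * math.pi * freq_hz / fs
    resp = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(taps))
    return 20.0 * math.log10(abs(resp))

if __name__ == "__main__":
    taps = shaping_filter_taps(B_BOOK)
    for f in (0, 1000, 4000, 10000, 16000, 22050):
        print(f, round(magnitude_db(taps, f), 1))
```

The response is below 0 dB at low frequencies and well above 0 dB near Nyquist, i.e. the filter moves noise out of the region where hearing is most sensitive and into the top of the band.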

Just to confuse you a bit more I'm rewriting the transfer function's expression of the filter I was suggesting:

1 -1.1474 z^-1 +0.5383 z^-2 -0.3520 z^-3 +0.3475 z^-4
-----------------------------------------------------  =
1 +1.0587 z^-1 +0.0676 z^-2 -0.6054 z^-3 -0.2738 z^-4


            2.2061 -0.4707 z^-1 -0.2534 z^-2 -0.6213 z^-3
1 - z^-1 -----------------------------------------------------
         1 +1.0587 z^-1 +0.0676 z^-2 -0.6054 z^-3 -0.2738 z^-4

The new numerator is simply a-b with the leading zero removed (polynomial division + factoring out -z^-1). This form has its advantages when it comes to implementing noise shaping. The following image is a "DSP circuit picture" explaining how noise shaping can be done:

[Image: "DSP circuit picture" of the noise shaping loop]
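As a stand-in for the diagram, here is a rough sketch of an error-feedback loop built on the rewritten form. It first recomputes the new numerator from the two coefficient rows in the post, then quantises a signal while feeding the past quantisation error back through that numerator over the denominator. The function names and the plain rounding quantiser (no dither) are assumptions of this sketch, and David's caveat about limit cycles without dither applies.

```python
import math

A_NUM = [1.0, -1.1474, 0.5383, -0.3520, 0.3475]  # numerator from the post
B_DEN = [1.0, 1.0587, 0.0676, -0.6054, -0.2738]  # denominator from the post

def feedback_numerator(num, den):
    """Return c such that num/den == 1 - z^-1 * c/den."""
    diff = [d - n for n, d in zip(num, den)]  # leading entry is 1 - 1 = 0
    return diff[1:]

def shaped_quantize(x, step, c, den):
    """Quantise x to multiples of `step`, shaping the error by num/den."""
    e_hist = [0.0] * len(c)          # e[n-1], e[n-2], ...
    u_hist = [0.0] * (len(den) - 1)  # u[n-1], u[n-2], ...
    out, errs = [], []
    for s in x:
        # u[n]: past errors filtered by z^-1 * c/den (recursive part on u).
        u = (sum(ck * ek for ck, ek in zip(c, e_hist))
             - sum(bk * uk for bk, uk in zip(den[1:], u_hist)))
        v = s - u                               # quantiser input
        q = step * math.floor(v / step + 0.5)   # plain rounding, no dither
        e = q - v                               # error made by the quantiser
        e_hist = [e] + e_hist[:-1]
        u_hist = [u] + u_hist[:-1]
        out.append(q)
        errs.append(e)
    return out, errs

if __name__ == "__main__":
    c = feedback_numerator(A_NUM, B_DEN)
    print([round(v, 4) for v in c])
    x = [1000.0 * math.sin(0.05 * n) for n in range(200)]
    y, e = shaped_quantize(x, 16, c, B_DEN)
    print(y[:8])
```

The output equals the input plus the quantiser error filtered by the full ratio shown above, so the quantiser itself never needs to know about the filter; only the feedback path does.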

Still, I think the use of fixed shaping for this purpose is very limited. You could do much better with some easy signal adaptive filters like H(z)/A(z) where H(z) is some fixed filter and 1/A(z) is the LPC synthesis filter for the current frame or something like that.

Cheers,
SG
