lossyWAV Development

Reply #775 – 2008-01-12 17:29:26

Quote from: Bourne on 2008-01-12 15:50:28

can we expect full transparency when it reaches 1.0 final ? This is pretty cool.

The stated aim is full transparency for -3, with -2 and -1 being more conservative options for the user. At present -3 is pretty near transparent (to my ears it already is, but my ears certainly aren't the best on the planet....), but we're trying to iron out the problem that Alex B identified in v0.6.4 RC1 with Bruce Springsteen's Livin In The Future. Beta v0.6.5 was a pretty good comeback, as Alex B's ABX'ing was inconclusive (I take it that means somewhere between being able to and not being able to ABX the resulting WAV file).

With more finely tuned ears listening out for artefacts we'll get closer and closer to transparent (though to get there absolutely would probably take an infinite amount of time).

As I said, -3 is currently transparent for me, but transparency is in the ear of the beholder....

Reply #776 – 2008-01-12 21:46:06

Quote from: Alex B on 2008-01-11 11:06:10

...
Your samples 00000_00595ms, 09400_10400ms, 19800_21000ms, 21600_23100ms
...

Finally I found the time to abx your samples (I had a lot of trouble trying to bring my system to an up-to-date state - now I'm back to my old configuration).

With your 00000_00595 samples I got to 6/7, which in the end was 7/10.
With 19800_21000 I also have the suspicion that something's wrong but could not abx it.
With 21600_23100 I got to 6/8 and ended up at 6/10.

Though these aren't good results I think it's enough for a confirmation.
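To put the scores above in perspective, here is a minimal sketch of the usual one-sided binomial calculation for ABX results: the chance of getting at least that many trials right by pure guessing. The function name `abx_p_value` is made up for this illustration, and the 5% level mentioned afterwards is only the commonly used convention, not something stated in the post.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """Probability of at least `correct` right answers out of `trials` by guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

# Scores quoted in the post: interim and final results for two samples.
scores = [
    ("00000_00595 interim", 6, 7),
    ("00000_00595 final", 7, 10),
    ("21600_23100 interim", 6, 8),
    ("21600_23100 final", 6, 10),
]

for label, correct, trials in scores:
    print(f"{label}: {correct}/{trials} -> p = {abx_p_value(correct, trials):.3f}")
```

None of these scores falls below the commonly used 5% level, which fits the description of them as "not good results" that still hint at a real difference.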

I tried 0.6.6 on your samples. The results are better, but with 00000_00595 I got to 7/9 and ended up at 7/10.
So the problem is still there.

I went back to 0.6.4RC1 and used a setting of -3 -nts 0.
Now I can't abx the problem any more.

So this is evidence that 2Bdecided is right and it's just a -nts problem.

As for this I suggest we default -3 to -nts 0, -2 to -nts 2 and -1 to -nts 4, and keep -spf the way it was done with 0.6.4RC1 (IMO the high frequency range is covered already well by the short FFT with its low spreading value).

I still feel uncomfortable with abrupt noise level changes, but maybe this is a wrong idea. At least it's not backed up by this sample.

Average bitrate will increase again - something which isn't liked, especially with -3. In the wiki there's already encouragement to use a higher -nts value than default for people who prefer a smaller filesize and accept minor errors. Maybe we should find a formulation which reinforces this encouragement.

Reply #777 – 2008-01-13 01:48:10

Quote from: halb27 on 2008-01-12 21:46:06

Quote from: Alex B on 2008-01-11 11:06:10

...
Your samples 00000_00595ms, 09400_10400ms, 19800_21000ms, 21600_23100ms
...

Finally I found the time to abx your samples (I had a lot of trouble trying to bring my system to an up-to-date state - now I'm back to my old configuration).

With your 00000_00595 samples I got to 6/7, which in the end was 7/10.
With 19800_21000 I also have the suspicion that something's wrong but could not abx it.
With 21600_23100 I got to 6/8 and ended up at 6/10.

Though these aren't good results I think it's enough for a confirmation.

I tried 0.6.6 on your samples. The results are better, but with 00000_00595 I got to 7/9 and ended up at 7/10.
So the problem is still there.

I went back to 0.6.4RC1 and used a setting of -3 -nts 0.
Now I can't abx the problem any more.

So this is evidence that 2Bdecided is right and it's just a -nts problem.

As for this I suggest we default -3 to -nts 0, -2 to -nts 2 and -1 to -nts 4, and keep -spf the way it was done with 0.6.4RC1 (IMO the high frequency range is covered already well by the short FFT with its low spreading value).

I still feel uncomfortable with abrupt noise level changes, but maybe this is a wrong idea. At least it's not backed up by this sample.

Average bitrate will increase again - something which isn't liked, especially with -3. In the wiki there's already encouragement to use a higher -nts value than default for people who prefer a smaller filesize and accept minor errors. Maybe we should find a formulation which reinforces this encouragement.

[Vino Rosso]Meh - oh well, just back from my company's Christmas party to a variation order for lossyWAV - no problem..... On the plus side, if v0.6.4 RC1 with -3 -nts 0 solves the problem then we will all benefit from the 50% speedup found when I started investigating Alex B's problem and potential solutions. Not the end of the world then - just a few kbps extra.....

On the face of it, maybe -nts 0 is the only acceptable starting point for the lowest quality option - so -nts -2 for -2 and -nts -4 for -1?

Ouch - 462kbps for my 53 sample set (40.98MB). But, we want transparency at all quality presets - so be it.[/Vino Rosso]

Reply #778 – 2008-01-13 11:25:19

I've tried 0.6.4.RC1 -3 -nts 0 on my small regular track sample set, which has nevertheless proven to be pretty representative of regular music. The average bitrate is 402 kbps.

I was a little fast last night with conclusions, probably because I was so happy having been able to abx the problem finally. What is missing at the moment IMO is AlexB's opinion towards -3 -nts 0.
AlexB, do you mind trying 0.6.4.RC1 -3 -nts 0?

Reply #779 – 2008-01-13 17:23:08

Quote from: halb27 on 2008-01-13 11:25:19

I've tried 0.6.4.RC1 -3 -nts 0 on my small regular track sample set, which has nevertheless proven to be pretty representative of regular music. The average bitrate is 402 kbps.

I was a little fast last night with conclusions, probably because I was so happy having been able to abx the problem finally. What is missing at the moment IMO is AlexB's opinion towards -3 -nts 0.
AlexB, do you mind trying 0.6.4.RC1 -3 -nts 0?

Spooky - my 10 album test set got 402kbps as well [edit] at -3 -nts 0; 450kbps at -2 -nts -2 and 494kbps at -1 -nts -4 [/edit] .....

Reply #780 – 2008-01-13 20:40:43

This is an adequate and pretty evenly spread increase in bitrate to me for -3, -2, -1.

Reply #781 – 2008-01-13 21:18:36

Quote from: halb27 on 2008-01-13 20:40:43

This is an adequate and pretty evenly spread increase in bitrate to me for -3, -2, -1.

Ok, I'll post v0.6.7 RC2 in the thread. You should notice a fairly impressive improvement in processing throughput.

Reply #782 – 2008-01-13 22:02:53

Thank you, Nick.
Speed is very good.
Guess -nts defaults are -nts 0 for -3, -nts -2 for -2, and -nts -4 for -1. Right?
But what else is different compared to 0.6.4RC1? Average bitrate for my regular sample set is now 403 kbps.

Reply #783 – 2008-01-13 22:10:30

Quote from: halb27 on 2008-01-13 22:02:53

Thank you, Nick.
Speed is very good.
Guess -nts defaults are -nts 0 for -3, -nts -2 for -2, and -nts -4 for -1. Right?
But what else is different compared to 0.6.4RC1? Average bitrate for my regular sample set is now 403 kbps.

If you were one of the first two to download v0.6.7 RC2 then you downloaded a version which still had the "maximum additional bits_to_remove increase per codec_block" mechanism active, with a delta of +2 bits. Sorry, I tried to remove it as quickly as I could - try re-downloading....

Reply #784 – 2008-01-13 22:49:32

It looks fine now (I tried with AlexB's sample).

I'll change the wiki as I described the -nts defaults.

Reply #785 – 2008-01-14 00:29:22

Sup all, I've got a couple of questions about lossyWAV:

1) the wiki is angling at standard lossless decoders (like flac, etc) decoding lossy.flac/etc., but will a standard WAV decoder decode lossyWAV correctly?

2) if all level settings are aiming at transparency... why have level settings?

Reply #786 – 2008-01-14 01:02:01

Quote from: lexor on 2008-01-14 00:29:22

2) if all level settings are aiming at transparency... why have level settings?

I second this.

If -3 is being tuned to be transparent under any known condition, it would make sense to me to have one safer setting which handles possibly unknown problem files better. Being the more paranoid one, I would probably choose this (-2). But I would never want to go even higher (-1). For me there is also some kind of a psychological barrier: for my taste lossy (wave) files should not have more than half the size of lossless files (on average)...

But this is just my taste...

Reply #787 – 2008-01-14 02:12:03

Quote from: TBeck on 2008-01-14 01:02:01

I second this.

I don't.

I hope you'll keep the 3 levels.

So far I've been using lossy.wav -2 then encoding to flac (testing with vinyl restoration projects and results are very good).

For me it's like this:

-1 when it HAS to be transparent (eg. if I'd spent many many hours working on a piece in whatever capacity)
-2 when I really want it to be transparent (and figure that only in extreme cases it won't be, -2 is the perfect setting between MP3 320 and Lossless, and for me preferable to WavPack Hybrid).
-3 when I'd like it to be transparent, but I'm not too fussed if it isn't (I've got plenty of music which springs to mind).

So please keep the 3 levels -- and thanks for all your hard work.

C.

By the way -- has anyone done listening tests to MP3s transcoded from lossy.wav versus .wav?

In theory should there be any perceptual difference?

C.

Reply #788 – 2008-01-14 02:48:43

Since it is still "lossy", I think having options to choose from is the better way to go. I mean, lossless codecs have options even though they would work just fine without them if the developer decided not to include any, and still we have a lot of options anyway (which is good, BTW).

Oops, nearly forgot what I'm here for. I dropped by to show my gratitude & encouragement to everyone involved in this (2Bdecided, Nick.C, halb27 and anyone else I haven't mentioned). Thanks for your time and effort.

Reply #789 – 2008-01-14 07:43:59

Quote from: lexor on 2008-01-14 00:29:22

1) the wiki is angling at standard lossless decoders (like flac, etc) decoding lossy.flac/etc., but will a standard WAV decoder decode lossyWAV correctly?

The WAV file is still a WAV file - there is no decoding to do, as the only difference between the original lossless WAV file and the lossyWAV file is that some LSBs are zero.
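As a rough illustration of why no special decoder is needed, the sketch below rounds a few signed 16-bit samples to a multiple of 2**bits_to_remove. The helper name `zero_lsbs` and the sample values are invented for this example, and samples near full scale (where rounding could overflow) are deliberately left out; the real program decides bits_to_remove per codec_block from its spectral analysis.

```python
def zero_lsbs(sample: int, bits_to_remove: int) -> int:
    """Round a sample to the nearest multiple of 2**bits_to_remove."""
    half_step = (1 << bits_to_remove) >> 1
    # Python's >> floors for negative numbers, so this rounds halves upwards.
    return ((sample + half_step) >> bits_to_remove) << bits_to_remove

block = [1234, -5678, 301, -17]          # hypothetical 16-bit PCM samples
processed = [zero_lsbs(s, 3) for s in block]

print(processed)                          # [1232, -5680, 304, -16]
print([p & 0b111 for p in processed])     # [0, 0, 0, 0]: three LSBs are zero
```

The output is still ordinary PCM data, so any WAV player reads it as-is, and a lossless codec can take advantage of the zeroed low bits.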

Quote from: lexor on 2008-01-14 00:29:22

2) if all level settings are aiming at transparency... why have level settings?

Every lossy codec I've come across has quality settings - all presets aim at transparency, some fail with some tracks, with reducing likelihood as output bitrate increases.

Quote from: carpman on 2008-01-14 02:12:03

So please keep the 3 levels -- and thanks for all your hard work.

By the way -- has anyone done listening tests to MP3s transcoded from lossy.wav versus .wav?

In theory should there be any perceptual difference?

I found a post on anythingbutipod.com which tends to suggest that an OGG file transcoded from lossyWAV was bigger than lossless > OGG. As to perceptual differences, I think that's a question for David....

Quote from: TBeck on 2008-01-14 01:02:01

I second this.

If -3 is being tuned to be transparent under any known condition, it would make sense to me to have one safer setting which handles possibly unknown problem files better. Being the more paranoid one, I would probably choose this (-2). But I would never want to go even higher (-1). For me there is also some kind of a psychological barrier: for my taste lossy (wave) files should not have more than half the size of lossless files (on average)...

But this is just my taste...

It is as you say, but -3 at v0.6.4 RC1 was proven *not* to be transparent within a couple of days of release. I can't say I was very happy, but I was delighted that Alex B's ears are so good that he was able to identify a problem with the track in question. So, -1 for the paranoid, -2 for most people and -3 for DAP users (my preference being -3).

Quote from: buktore on 2008-01-14 02:48:43

Since it is still "lossy", I think having options to choose from is the better way to go. I mean, lossless codecs have options even though they would work just fine without them if the developer decided not to include any, and still we have a lot of options anyway (which is good, BTW).

Oops, nearly forgot what I'm here for. I dropped by to show my gratitude & encouragement to everyone involved in this (2Bdecided, Nick.C, halb27 and anyone else I haven't mentioned). Thanks for your time and effort.

Thanks for the appreciation - we've all had fun with this project!

Reply #790 – 2008-01-14 11:03:14

Quote from: Nick.C on 2008-01-14 07:43:59

I found a post on anythingbutipod.com which tends to suggest that an OGG file transcoded from lossyWAV was bigger than lossless > OGG. As to perceptual differences, I think that's a question for David....

I saw that too. It matches my early tests with mp3. It's not a big deal.

What is interesting is taking mp3 problem samples, and trying to ABX WAV>mp3 vs lossy.WAV>mp3. It would be nice if -1 (at least) could make that difference unABXable - but this might be unrealistic. I should get back to playing around with trumpet.wav or whatever it was called.

Cheers,
David.

Reply #791 – 2008-01-14 21:08:32

Quote from: TBeck on 2008-01-14 01:02:01

to my taste lossy (wave) files should not have more than half the size of lossless files (on average)..

[pedantic]I think you mean lossyFlac (or lossyTAK), as lossyWav files are the same size as the source wavs[/pedantic]

Yes, I would wish that too, but I found out that the nature of the source file makes a big difference.
just some examples:
a reasonably quiet track (a singer and a guitar) that rates 553 with FLAC -8 and 429 with lossyFlac -3 -nts 0
a lot louder track (another singer with just a guitar) rates 857 in FLAC -8 but 347 with lossyFlac -3 -nts 0

go figure
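The two examples above work out as follows. This is just the arithmetic, assuming the quoted figures are average bitrates in kbps (the post does not state the unit); the dictionary keys and the helper name `relative_size` are made up for the illustration.

```python
def relative_size(lossless: int, lossy: int) -> float:
    """Size of the lossyFlac file as a fraction of the plain FLAC file."""
    return lossy / lossless

# (FLAC -8, lossyFlac -3 -nts 0) figures quoted in the post.
tracks = {
    "quiet track": (553, 429),
    "louder track": (857, 347),
}

for name, (lossless, lossy) in tracks.items():
    share = relative_size(lossless, lossy)
    print(f"{name}: {share:.1%} of lossless, {lossless - lossy} saved")
```

The quiet track stays at roughly three quarters of its lossless size, while the louder one drops to well under half.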

Reply #792 – 2008-01-14 22:19:52

Quote from: GeSomeone on 2008-01-14 21:08:32

Yes, I would wish that too, but I found out that the nature of the source file makes a big difference.
just some examples:
a reasonably quiet track (a singer and a guitar) that rates 553 with FLAC -8 and 429 with lossyFlac -3 -nts 0
a lot louder track (another singer with just a guitar) rates 857 in FLAC -8 but 347 with lossyFlac -3 -nts 0

It seems counter-intuitive, but looking at the nearly 3700 tracks that I've processed, the higher the initial bitrate, the lower the processed bitrate and vice-versa (subject to the usual caveats about tracks which do not follow the generalisation) [both processed bitrates less than the lossless bitrate].

Reply #793 – 2008-01-15 07:24:01

Quote from: GeSomeone on 2008-01-14 21:08:32

... a reasonably quiet track (a singer and a guitar) that rates 553 with FLAC -8 and 429 with lossyFlac -3 -nts 0
a lot louder track (another singer with just a guitar) rates 857 in FLAC -8 but 347 with lossyFlac -3 -nts 0 ...

When there are only very few instruments, the probability is high that parts of the spectrum have low energy. The lossyWAV principle is based on preserving the low energy parts with reasonable accuracy. So 'simple' music needs more bits as a rule.
The more instruments, the more noise-like the music becomes - technically speaking - and the harder it gets for a lossless codec.

lossyWAV looks worst compared to pure lossless with quiet 'simple' music. lossyWAV has no chance to save a significant amount of bits in this case.

I see it in a positive way: in many cases lossyWAV saves a lot of bits compared to lossless. In those cases where the relation isn't so good it's for the most part because lossless is already very efficient.

Reply #794 – 2008-01-15 12:27:06

Question: David mentioned in another thread about the number of actual bits remaining after rounding.

Is there any perceived benefit to be gained by implementing a(nother) safety net as follows:

When filling FFT array, OR a mask variable with the absolute value of each sample. This will allow the determination of the maximum set bit in the codec_block for that channel (max_bit).

Limit the bits_to_remove to the lower of the calculated value and Max(0,(max_bit-minimum_bits_to_keep)), thereby retaining at least minimum_bits_to_keep bits of actual resolution in that codec_block.

[edit] Also, if the number of clipped samples were restricted to, say, 5 per channel per codec_block (i.e. max of 10 for stereo, 0.977% of samples in the codec_block), would that seem reasonable? Even if they were all in series that would only be 0.1134 milliseconds. The reason I ask is that when I apply this to the livin_in_the_future problem track, although it clips, the bits_to_remove lost due to clipping is zero with only 196 clipping samples in the whole file (1323000 samples x 2 channels). [/edit]

[edit2] Say, -1 = 0 clips; -2 = 1 clip; -3 = 5 clips? [/edit2]
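The percentages quoted for the clipping allowance can be reproduced with a few lines. This sketch assumes a codec_block of 512 samples per channel and a 44.1 kHz sample rate; neither is stated in the post, both are inferred from the "0.977%" and "0.1134 milliseconds" figures. The helper names are hypothetical.

```python
SAMPLES_PER_BLOCK = 512   # assumption: per channel, inferred from "10 clips = 0.977%"
SAMPLE_RATE = 44100       # assumption: CD audio, inferred from "0.1134 milliseconds"
CHANNELS = 2

def clip_share(clips: int, samples: int) -> float:
    """Fraction of samples that clip."""
    return clips / samples

def clip_duration_ms(clips: int, sample_rate: int = SAMPLE_RATE) -> float:
    """Duration of `clips` consecutive samples, in milliseconds."""
    return 1000 * clips / sample_rate

# 5 clips per channel per codec_block, stereo.
print(f"{clip_share(5 * CHANNELS, SAMPLES_PER_BLOCK * CHANNELS):.3%} of a block")
print(f"{clip_duration_ms(5):.4f} ms if all 5 are consecutive")

# 196 clipping samples in the whole problem track (1323000 samples x 2 channels).
print(f"{clip_share(196, 1323000 * CHANNELS):.4%} of the file")
```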

Reply #795 – 2008-01-15 15:14:01

I guess it doesn't hurt, but I also think it won't reduce bitrate in a significant way.

Reply #796 – 2008-01-15 15:26:09

Quote from: halb27 on 2008-01-15 15:14:01

I guess it doesn't hurt, but I also think it won't reduce bitrate in a significant way.

The first will increase the bitrate, the second certainly reduces it. I will post beta v0.6.8 in the first post of this thread, using minimum_bits_to_keep=5 and maximum_clips = (0,1,5).

Reply #797 – 2008-01-15 15:56:41

Sorry for not being clear. I only addressed your second suggestion.
As for your first, it's certainly another defensive action, but it looks a bit like not having confidence in the lossyWAV principle.

Reply #798 – 2008-01-15 16:59:25

Are you using minimum_bits_to_keep in a defensive way already? Sorry, I'm not keeping up. If it's key to maintaining quality as it is, then maybe you should add what you propose. If it's not, then extending downwards to help quieter blocks doesn't seem necessary. If it is necessary, it would be better to keep the noise floor at least x dB below the peaks in the spectral domain, rather than in the time domain - which is what I was trying to get at in a post on the last page.

I'm not sure what you're getting at with the clipping. If you let one sample clip in a block, then there are no wasted bits in that block, surely? The sample is 1111111111111111 so no zeros, so wasted_bits=0. Not sure how other codecs handle it - I remember Bryant saying wavpack was different.
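The concern about wasted bits can be illustrated with a small sketch. It assumes the FLAC-style notion of wasted bits, i.e. the number of low-order zero bits shared by every sample in a block; the function name, the sample values and the choice to return 0 for an all-zero block are assumptions made for this example only.

```python
def wasted_bits(samples, bits_per_sample: int = 16) -> int:
    """Number of trailing zero bits common to all samples in the block."""
    word_mask = (1 << bits_per_sample) - 1
    combined = 0
    for s in samples:
        combined |= s & word_mask       # two's-complement view of each sample
    if combined == 0:
        return 0                        # digital silence: arbitrary convention here
    lowest_set_bit = combined & -combined
    return lowest_set_bit.bit_length() - 1

rounded_block = [0x0100, 0x0340, -0x0080]   # every sample is a multiple of 64
print(wasted_bits(rounded_block))            # 6

# One sample stuck at positive full scale has all its low bits set.
print(wasted_bits(rounded_block + [0x7FFF])) # 0
```

Under this definition a single sample left at full scale is enough to cancel the saving for the whole block.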

Cheers,
David.

Reply #799 – 2008-01-15 17:58:03

Quote from: 2Bdecided on 2008-01-15 16:59:25

Are you using minimum_bits_to_keep in a defensive way already? Sorry, I'm not keeping up. If it's key to maintaining quality as it is, then maybe you should add what you propose. If it's not, then extending downwards to help quieter blocks doesn't seem necessary. If it is necessary, it would be better to keep the noise floor at least x dB below the peaks in the spectral domain, rather than in the time domain - which is what I was trying to get at in a post on the last page.

I'm not sure what you're getting at with the clipping. If you let one sample clip in a block, then there are no wasted bits in that block, surely? The sample is 1111111111111111 so no zeros, so wasted_bits=0. Not sure how other codecs handle it - I remember Bryant saying wavpack was different.

Cheers,
David.

On the clipping front, if bits_to_remove=6 then what would have been 10000000 00000000 (assuming there was no sign bit - that bit is done with floats) would be clipped to 01111111 11000000, i.e. as if it had been rounded down not up.
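The clipping behaviour described above can be sketched in integer arithmetic. This is only an illustration: the post says the real rounding is done with floats, and the function name `round_and_clip` and the sample value 32750 are invented for the example.

```python
def round_and_clip(sample: int, bits_to_remove: int, bits_per_sample: int = 16):
    """Round to a multiple of 2**bits_to_remove, clipping to the valid range.

    Returns the output sample and a flag saying whether clipping occurred.
    """
    half_step = (1 << bits_to_remove) >> 1
    rounded = ((sample + half_step) >> bits_to_remove) << bits_to_remove

    full_scale = (1 << (bits_per_sample - 1)) - 1            # 32767 for 16 bits
    # Largest multiple of the step that still fits below full scale.
    top = (full_scale >> bits_to_remove) << bits_to_remove
    bottom = -(1 << (bits_per_sample - 1))

    clipped = min(max(rounded, bottom), top)
    return clipped, clipped != rounded

value, did_clip = round_and_clip(32750, 6)   # would round up to 32768
print(format(value, "016b"), did_clip)       # 0111111111000000 True
```

The clipped value keeps its six low bits at zero, so the block's bits_to_remove is preserved; the sample simply ends up rounded down instead of up.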

On the minimum_bits_to_keep front, at present maximum_bits_to_remove=bits_per_sample-minimum_bits_to_keep = 16-5 = 11 for *all* codec_blocks. With the new proposal, if the highest filled bit (taking the ABS of -ve numbers first) is the 8th then at most 3 bits would be removed, regardless of what the algorithm produced.
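A minimal sketch of the proposed safety net, using the numbers from this post. The function names and the sample block are hypothetical; only `minimum_bits_to_keep`, `bits_to_remove` and the OR-mask idea come from the thread, and this is not lossyWAV's actual code.

```python
def max_set_bit(samples) -> int:
    """1-based position of the highest set bit across the block's magnitudes."""
    mask = 0
    for s in samples:
        mask |= abs(s)              # OR the absolute value of each sample
    return mask.bit_length()        # 0 for an all-zero block

def limit_bits_to_remove(calculated: int, samples, minimum_bits_to_keep: int = 5) -> int:
    """Cap bits_to_remove so that minimum_bits_to_keep bits of resolution remain."""
    cap = max(0, max_set_bit(samples) - minimum_bits_to_keep)
    return min(calculated, cap)

BITS_PER_SAMPLE = 16
global_cap = BITS_PER_SAMPLE - 5    # present behaviour: 11 for all codec_blocks
print(global_cap)                   # 11

quiet_block = [200, -37, 5, -128]   # highest filled bit is the 8th
print(limit_bits_to_remove(global_cap, quiet_block))   # 3
```

For a block that uses the full 16 bits the cap is 11, the same as the present global limit, so the extra check only changes anything for quieter blocks.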

I do have faith in the method, I just like belt, braces and hands in pockets keeping trousers up......
