lossyWAV 1.2.0 Development Thread

Reply #500 – 2009-08-21 20:21:08

People with more knowledge than you told you that LAME -V2 is transparent, and you were naive enough to trust them. The truth is that no lossy codec is transparent on all samples. lossyWAV is no exception, but because of the technique it uses, the probability that a given sample is not transparent can be (much) lower with lossyWAV than with LAME, depending on the input.

The above holds only if you give lossyWAV enough bitrate; if you want to go below ~400 kbps, stay with a classic lossy codec.

Advantages:
1: Splittable & joinable (usable as CD image + cue)
2: Not affected by the usual DCT killer samples
3: The above means lossyWAV is much better than ANY other lossy codec (including Musepack) on electronic & live music.
4: Perfect with the Sansa Fuze DAP
5: Not tied to a standard (MPEG)

Flaws:
1: Twice the bitrate of other lossy codecs for the same quality on 95% of samples (non-electronic & non-live music)
2: Not tied to a standard (MPEG) (for me this is an advantage, but some people like the safety of big standards)

"I want the files to be as small as possible without quality loss."

For this, Nero AAC at 256 kbps VBR is (much) better than LAME, but you need a DAP that supports AAC.

If you don't already have a DAP, then the best choice IMHO is to get a Sansa Fuze and use lossyWAV --portable. This is where lossyWAV really shines.

Reply #501 – 2009-08-21 22:57:24

Quote from: Nick.C on 2009-08-20 20:12:27

lossyWAV beta 1.1.4g attached to post #1 in this thread.

Thank you, Nick.
I encoded several tracks and carefully listened to them. Everything was fine.

Bitrate is a little lower than with v1.1.3e, which I have used so far for production purposes (379 kbps vs. 381 kbps for --portable with my standard test set, 338 kbps vs. 341 kbps for -q 1.5).
The -p parameter targets something new, which IMO is quite interesting for the lower quality settings.
With -p, bitrate goes up by 1 kbps for --portable and -q 1.5. For -q 0 the increase is 3 kbps.
Looking at this, I can imagine we can afford to have -p aim for an even better S/N ratio.
I'll try to ABX -p vs. non -p over the next few days.

Reply #502 – 2009-08-22 18:42:32

Hallo Nick,

If you can find the time, do you mind allowing the --limit parameter to go below 16000 Hz?

I found a 128 kbps mp3 track in my collection, originating from web radio, with a dropout in it. I converted it to WAV, edited the WAV file, and wanted to convert the result to lossyFLAC -P. The bitrate was an inadequate 580 kbps, which is certainly because the mp3 track was lowpassed at roughly 15 kHz. So I'd like to use --limit 15000 (or maybe a bit lower) in order to arrive at an adequate bitrate.

Apart from this, it would IMO be interesting to see how much the bitrate drops when using --limit 15000 or similar, and of course what impact this has on quality. It may be interesting for the very low quality settings, and is quite in line with the effect of noise shaping (allowing for a higher S/N ratio in the very high frequency range), which is used to a minor extent with the very low quality settings.

Reply #503 – 2009-08-22 19:17:05

Hi everybody
I also record from the radio (FM broadcast, in fact), and I have found that I get the best results (i.e. the smallest resulting FLAC file for a given quality level) when I cut frequencies at 16300 Hz. To apply the lowpass filter to the WAV I use the free Stereotool (already mentioned here on Hydrogenaudio). In that case I usually get a lossyFLAC bitrate between 430 and 470 kbps (lossyWAV 1.1.4g -p -q 2).

I have been using lossyWAV/FLAC for several months now to archive FM broadcast recordings.

Reply #504 – 2009-08-22 20:11:04

Quote from: halb27 on 2009-08-22 18:42:32

If you can find the time, do you mind allowing the --limit parameter to go below 16000 Hz?

Wishes, commands, that sort of stuff.... lossyWAV beta 1.1.4h attached to post #1 in this thread.

Quote from: zorzescu on 2009-08-22 19:17:05

.... To apply lowpass filter on WAV I use the free Stereotool....

I don't know that you really need to lowpass before processing with lossyWAV.

Reply #505 – 2009-08-22 21:25:18

Quote from: Nick.C on 2009-08-22 20:11:04

I don't know that you really need to lowpass before processing with lossyWAV.

I did a test:

                             lossyWAV + FLAC               FLAC only
                             bitrate      file size        bitrate      file size
---------------------------------------------------------------------------------
without lowpass              578 kbps     26.6 MB          1081 kbps    49.8 MB
after lowpass at 16300 Hz    494 kbps     22.7 MB           967 kbps    44.6 MB

After lowpassing the WAV file, the resulting bitrate drops. Cutting frequencies above 16 kHz does not remove any audio content from an FM radio broadcast, so I decided that I do need to lowpass before any further processing of the WAV file.
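To put the saving into relative terms, here is a small sketch that only re-uses the four bitrates from the table above; the helper name `reduction_percent` is made up for this illustration.

```python
# Bitrates in kbps, copied from the table above.
lossy_before, lossy_after = 578, 494   # lossyWAV + FLAC
flac_before, flac_after = 1081, 967    # FLAC only

def reduction_percent(before, after):
    """Relative saving in percent when going from `before` to `after`."""
    return 100.0 * (before - after) / before

lossy_saving = reduction_percent(lossy_before, lossy_after)
flac_saving = reduction_percent(flac_before, flac_after)

print(f"lossyWAV + FLAC: {lossy_saving:.1f} % lower bitrate after lowpass")
print(f"FLAC only:       {flac_saving:.1f} % lower bitrate after lowpass")
```

For this one sample the lowpass saves roughly 14.5 % with lossyWAV + FLAC and roughly 10.5 % with plain FLAC.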

Edit:
lossyWAV q=3, FLAC -5

Reply #506 – 2009-08-22 21:49:19

I compressed the same sample as above (6 mins WAV, FM broadcast) with lossyWAV 1.1.4h and --limit 15000
Here are the results:

                             lossyWAV + FLAC
                             bitrate      file size
---------------------------------------------------
without lowpass              441 kbps     20.3 MB
after lowpass at 16300 Hz    390 kbps     17.9 MB

A quick check shows that I can hear no difference. In fact, I am not very young.

Reply #507 – 2009-08-23 00:22:01

Quote from: Nick.C on 2009-08-22 20:11:04

... lossyWAV beta 1.1.4h attached to post #1 in this thread. ....

Incredible, you must be a wizard! Thank you very much.

Bitrate for my mp3 track at portable quality comes down from an exact 594 kbps (the 580 kbps was from memory, so not exact) to 485 kbps when using --limit 14500. So it helps a lot.

I also tried zorzescu's trick of lowpassing before running lossyWAV, but as expected this didn't improve things further, because the mp3 track is already lowpassed.

I was curious about the bitrate reduction and tried --limit 14500 on my usual test set. Average bitrate for --portable comes down to 361 kbps.
As -q 2.0 can so far be considered to yield extremely good quality too, I also tried -q 2.0 -p --limit 14500. The result is 348 kbps, and quality is fine judging from careful listening alone.
I must say, though, that I had a suspicion that at the beginning of Simon & Garfunkel's 'I Am a Rock', which is pretty noisy in the original, the very high frequency part of the noise was a tiny bit louder. I got to 4/5 in ABX but lost it afterwards (ending at 4/8). I will look into this more in the next few days.

Reply #508 – 2009-08-23 00:30:18

Quote from: zorzescu on 2009-08-22 21:49:19

I compressed the same sample as above (6 mins WAV, FM broadcast) with lossyWAV 1.1.4h and --limit 15000
Here are the results:

                             lossyWAV + FLAC
                             bitrate      file size
---------------------------------------------------
without lowpass              441 kbps     20.3 MB
after lowpass at 16300 Hz    390 kbps     17.9 MB

A quick check shows that I can hear no difference. In fact, I am not very young.

FM radio bandwidth usually is around 15.5 kHz (though I'm not sure about the exact situation in your country). As you do have audio content above 16.3 kHz I guess that's noise from the transmission or your radio.
390 kbps is what can be expected on average from quality -q 3.

Reply #509 – 2009-08-23 10:10:13

Hallo Nick,

I think the idea behind using --limit 14500, namely not analysing frequency regions that hold no audio content, can be put on a much better footing.

2Bdecided's principle is to look at the frequency with the lowest audio energy and to choose the number of bits to remove so that this energy is not suppressed by the noise the method adds.
If I see it correctly, in the actual computation so far 'frequency with lowest audio energy' is replaced by 'frequency with lowest energy, no matter whether there is audio content or not'. 2Bdecided was aware that this leads to a poor number of bits-to-remove, and softened it by using a spreading function to get usable results.
This approach has a basic flaw which tends to yield results that are too pessimistic. That is very clear with lowpassed music when the lowpass is lower than --limit. But I think it is also the reason why music from one or a few instruments is not encoded efficiently: there will be frequency regions without musical content that drive the current machinery to keep bits-to-remove needlessly low.
IMO a simple solution would be to ignore the result of the spreading function (or let the spreading function return an artificially large value) whenever the audio content of the bins involved is zero. Because of rounding errors in the FFT and/or the spreading calculation, we probably cannot compare the spreading result to an exact zero; instead, anything below a (very low) threshold would have to count as zero audio content for the bins involved.
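A minimal sketch of that idea, not lossyWAV's actual code: the function name, the threshold value and the example numbers are all assumptions chosen for illustration. It takes the spreading results of one block and picks the lowest one while skipping values that are effectively zero.

```python
def lowest_relevant_level(spread_levels, silence_threshold=1e-9):
    """Return the minimum spreading result, skipping values that are
    effectively zero (below `silence_threshold`, a made-up value).

    Returns None when every value is below the threshold, i.e. the block
    offers no audio content to base a decision on.
    """
    audible = [v for v in spread_levels if v >= silence_threshold]
    return min(audible) if audible else None

# Hypothetical spreading results for a lowpassed block: the last three
# values are nothing but rounding noise from the FFT.
levels = [4.0e-3, 2.5e-3, 9.0e-4, 3.0e-13, 1.0e-14, 7.0e-15]

print("plain minimum:      ", min(levels))
print("minimum of non-zero:", lowest_relevant_level(levels))
```

With the plain minimum, the rounding noise dictates bits-to-remove; with the threshold, the lowest value that still represents audio does.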

I can imagine this will help a lot in situations like mine, when re-encoding modified mp3 tracks, or when encoding FM radio like zorzescu does. No special --limit values are necessary; it's all automatic.
I guess it will also help a lot in the currently unfortunate situation where bitrate is inadequately high with 'simple' music (simple with respect to the audio spectrum).
For other kinds of music it might help locally on occasion, and thus may bring bitrate down a little as well.
No drawback seen.

Thinking about this and about the low quality settings: IMO the lower we go in quality demand, the more it can be tolerated to ignore the ends of the audio spectrum in the noise analysis. The idea is that in real-world music (ignoring very artificial music) it is the not-too-extreme frequency range that dominates the musical content, so the audio analysis can concentrate on that. So I'd welcome experimental --lolimit and --hilimit parameters which cause the result of the spreading function to be ignored (or make the spreading function return an artificially high value) whenever the frequency of one of the bins involved is below --lolimit or above --hilimit.
If this proves helpful, we may consider using it internally depending on the -q value (only for low -q values).
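As a rough illustration of what such limits would mean in terms of FFT bins, here is a sketch. The FFT length of 1024, the 44.1 kHz sample rate and the function name are assumptions for the example, and --lolimit / --hilimit are only the parameters proposed above, not existing options.

```python
def bins_in_range(fft_length, sample_rate_hz, lo_limit_hz, hi_limit_hz):
    """Indices of FFT bins whose centre frequency lies within the limits."""
    bin_width_hz = sample_rate_hz / fft_length
    return [k for k in range(fft_length // 2 + 1)
            if lo_limit_hz <= k * bin_width_hz <= hi_limit_hz]

# Hypothetical settings: --lolimit 150 --hilimit 10000
kept = bins_in_range(fft_length=1024, sample_rate_hz=44100,
                     lo_limit_hz=150, hi_limit_hz=10000)

bin_width_hz = 44100 / 1024
print(f"bin width: {bin_width_hz:.2f} Hz")
print(f"bins analysed: {kept[0]} .. {kept[-1]} ({len(kept)} of 513)")
```

Bins outside the returned list would simply not take part in the search for the spectral minimum.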

While I'm making wishes: would you mind giving the -p option a parameter that demands the error to be not merely at or below the signal level, but a certain amount below it? Ideally the parameter would have the meaning of an S/N ratio (so a value of 2 means that the noise energy is half the signal energy in the worst case).
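To make the meaning of such a parameter concrete, a tiny sketch under the assumption that it is a plain energy ratio; the function names are invented for this example and nothing here reflects how -p is actually implemented.

```python
import math

def max_noise_energy(signal_energy, sn_ratio):
    """Largest noise energy allowed for a requested signal-to-noise ratio."""
    return signal_energy / sn_ratio

def ratio_to_db(sn_ratio):
    """The same energy ratio expressed in decibels."""
    return 10.0 * math.log10(sn_ratio)

for ratio in (1, 2, 4):
    print(f"S/N ratio {ratio}: noise <= {max_noise_energy(1.0, ratio):.2f} "
          f"x signal energy ({ratio_to_db(ratio):.1f} dB below the signal)")
```

A value of 1 corresponds to "error at the signal level or below", and a value of 2 asks for the noise to be about 3 dB lower than that.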

Reply #510 – 2009-08-23 13:50:04

The mechanism already disregards bins which correspond to frequencies lower than 20Hz or higher than 16kHz (or --limit).

You've given me a few ideas - I'll have a think then post some proposals....

--lolimit and --hilimit would be fairly easy to implement - --hilimit exists already (as --limit).

If we are talking of ignoring very low bin values then I would probably scrap the existing spreading mechanism(s) - an average would be taken then every bin below (as yet unknown)% of the average would be disregarded.
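A sketch of how that average-based rule might look. The percentage is explicitly still unknown in the post, so the 1 % used here is a pure placeholder, and the function name and bin energies are invented for the example.

```python
def relevant_bins(bin_energies, percent_of_average):
    """Keep only bins whose energy is at least the given percentage of the
    block's average bin energy; the rest are disregarded."""
    average = sum(bin_energies) / len(bin_energies)
    cutoff = average * percent_of_average / 100.0
    return [e for e in bin_energies if e >= cutoff]

# Hypothetical bin energies; the last two bins are nearly empty.
energies = [8.0, 6.0, 5.0, 1.0, 0.001, 0.0002]
kept = relevant_bins(energies, percent_of_average=1.0)

print("kept bins:", kept)
print("minimum used for the decision:", min(kept))
```

Note that the cutoff scales with the block's own average, so it is a relative criterion rather than an absolute one.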

Reply #511 – 2009-08-23 14:08:32

Quote from: Nick.C on 2009-08-23 13:50:04

... an average would be taken then every bin below (as yet unknown)% of the average would be disregarded.

I'm not sure whether this is what I have in mind. It disregards bins with low energy relative to something. But whatever 'something' is: if there is nearly no audio content in 'something', the bins compared to it aren't ignored. What I have in mind is ignoring the spreading outcome based on an absolute criterion (being below a very low threshold which only accounts for the rounding errors that make the spreading result non-zero even for zero-valued bins).
OK, if 'something' is related to the RMS of the block or similar, this may be useful. I guess, however, that some kind of spreading will still be useful for the higher frequencies.

As --limit already works as I described for --hilimit: can you allow for a lower value than 14500?

Reply #512 – 2009-08-23 14:30:37

Easily - it's a matter of changing the help text and one value in the parameter input checking routine.

Reply #513 – 2009-08-23 17:02:02

I wrote:
'This approach has a basic flaw which tends to yield results that are too pessimistic.'
and
'No drawback seen.'.

Not exactly true, I'm afraid. The noise added by the method still has to be low even where there is no audio content in a certain frequency region.
So it's not a basic flaw, rather a basic disadvantage.
I don't think this makes these new ideas worthless, because the fact remains that with no audio content in a certain range there is no good basis for choosing bits-to-remove.
My feeling is that we should apply such bin-ignoring tactics to bins with zero audio content towards the outer edges of the frequency range (not necessarily only at the extreme edges), for instance above 8 kHz and below 150 Hz, and only if the RMS of the block's total signal is above a certain value (as a very basic form of masking).
It's all heuristics of course, but that already applies to the spreading function.
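The combined rule could be sketched like this. Only the 150 Hz and 8 kHz edges come from the post; the function names, the silence threshold and the minimum RMS are placeholders, and samples are assumed to be normalised to the range -1..1.

```python
import math

def block_rms(samples):
    """Root mean square of a block of samples (normalised to -1..1)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def may_ignore_bin(freq_hz, bin_energy, rms, *,
                   lo_edge_hz=150.0, hi_edge_hz=8000.0,
                   silence_threshold=1e-9, min_rms=0.05):
    """True if the bin may be left out of the noise analysis: it lies
    towards an edge of the spectrum, holds no audio, and the block as a
    whole is loud enough to offer some (very crude) masking."""
    at_edge = freq_hz < lo_edge_hz or freq_hz > hi_edge_hz
    empty = bin_energy < silence_threshold
    loud_enough = rms > min_rms
    return at_edge and empty and loud_enough

rms = block_rms([0.5, -0.5, 0.5, -0.5])
print(may_ignore_bin(12000, 1e-12, rms))   # empty HF bin, loud block
print(may_ignore_bin(1000, 1e-12, rms))    # empty bin in the midrange
print(may_ignore_bin(12000, 1e-12, 0.01))  # empty HF bin, quiet block
```

Only the first case is ignored; an empty midrange bin or a quiet block still goes through the normal analysis.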

Reply #514 – 2009-08-23 17:15:18

Thanks for the added input - still thinking about how best to modify the --postanalyse parameter and spreading function.

lossyWAV beta 1.1.4j attached to post #1 in this thread.

Reply #515 – 2009-08-23 21:26:42

Thank you, Nick.

Reply #516 – 2009-08-23 22:45:39

I couldn't resist and tried --portable --limit 10000 to get an idea of the bitrate we may reach when following the current ideas. For my usual test set of various pop music, bitrate came down to 317 kbps!

I tried to ABX 'I Am a Rock' again but didn't succeed (3/3 turned into 5/7, then 7/10). Of course we don't know whether I'm just bad at ABXing, at least right now. The track didn't sound suspicious to me today, though; I only tried because of yesterday's experience.
My suspicion today was more with another track (Friedemann's Sentimental Elegance), but an initial 4/4 turned into 6/10. The suspicion remains, though.
I went on to my problem sample set, and the first thing to note is that despite --limit 10000 the sensitivity towards difficult tracks isn't seriously reduced. For instance, the sample 'Mandylion' that shadowking provided recently has a bitrate of 443 kbps, which is a lot higher than the average of 317 kbps and only a bit lower than with 1.1.3e --portable, which takes 465 kbps.
I tried to ABX these samples, but again without success. The best result was with eig (7/10). With eig I had the impression most of the time that the pitch is shifted up very slightly, an impression I remember also having in some earlier lossyWAV listening tests.
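For reference, ABX scores like these can be turned into the probability of doing at least that well by pure guessing. This is a sketch assuming every trial is an independent 50/50 guess and the number of trials was fixed in advance; the function name is made up.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` of `trials` right
    when every answer is a coin flip (one-sided binomial)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Scores mentioned above.
for correct, trials in [(4, 4), (6, 10), (7, 10)]:
    p = abx_p_value(correct, trials)
    print(f"{correct}/{trials}: p = {p:.3f}")
```

By this measure 7/10 corresponds to p of about 0.17, so it would not normally be counted as a successful ABX.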

To be clear, I don't think that --portable --limit 10000 is a transparent setting; I'm just bad at ABXing.
But I think the tactic of ignoring bins when there is no audio is a promising way to go.
Even with what we've got right now, I personally prefer --portable --limit 10000 over a low -q setting with the same resulting average bitrate.

Reply #517 – 2009-08-24 08:07:55

Found a few things I wanted to change in the --postanalyse function - it now uses the existing spreading function among others.

lossyWAV beta 1.1.4k attached to post #1 in this thread.

Reply #518 – 2009-08-24 10:09:27

Quote from: halb27 on 2009-08-23 10:10:13

If I see it correctly in real world computation so far 'frequency with lowest audio energy' is replaced by 'frequency with lowest energy - no matter whether there is audio contents or not'. 2Bdecided was aware of this leading to a poor amount of bits-to-remove, and weakened things by use of a spreading function to get at usable results.
This approach has a basic flaw which tends to yield results that are too pessimistic. Things are very clear with lowpassed music when lowpass is lower than --limit. But I think this is also the reason why music originating from one or few instruments is not encoded efficiently. There will be frequency areas without musical contents driving the current machinery to uselessly hold bits-to-remove low.

Well, it's "useless" if it's masked, but not "useless" if it's not masked.

...and to determine whether it's masked or not, you need a proper psychoacoustic model.

Cheers,
David.

Reply #519 – 2009-08-24 10:14:33

Quote from: zorzescu on 2009-08-22 21:25:18

Quote from: Nick.C on 2009-08-22 20:11:04
I don't know that you really need to lowpass before processing with lossyWAV.

I did some test:
                             lossyWAV + FLAC               FLAC only
                             bitrate      file size        bitrate      file size
---------------------------------------------------------------------------------
without lowpass              578 kbps     26.6 MB          1081 kbps    49.8 MB
after lowpass at 16300 Hz    494 kbps     22.7 MB           967 kbps    44.6 MB
After lowpassing WAV file, the resulting bitrate lowers. Cutting frequencies above 16 kHz does not cut sound data in FM radio broadcast. So I decided I need lowpass before further WAV file manipulations

Yes, you should - almost all FM receivers allow the 19kHz pilot tone through, and various noise either side of that frequency. It's not loud enough for (almost) anyone to hear it, but it's certainly there, visible on a spectrogram, and wastes bits when encoding. This is a fundamental FLAC (and also mp3!) issue - more high frequency data = more bits needed. The bitrate increase/decrease is incidental to the use of lossyWAV in this case - lossyWAV doesn't care.

It is worth lowering the frequency at which lossyWAV stops checking for spectral low points in this case - but for efficiency, it's better still to resample to 32kHz.
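The arithmetic behind the 32 kHz suggestion can be sketched as follows. The 44.1 kHz / 16-bit / stereo source format is an assumption (the thread does not state the recording format), and the helper names are invented.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def pcm_kbps(sample_rate_hz, bits_per_sample=16, channels=2):
    """Bitrate of uncompressed PCM in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000

PILOT_TONE_HZ = 19000  # FM stereo pilot tone mentioned above

for rate in (44100, 32000):
    limit = nyquist_hz(rate)
    pilot = "can contain" if PILOT_TONE_HZ < limit else "cannot contain"
    print(f"{rate} Hz: Nyquist {limit:.0f} Hz, {pilot} the pilot tone, "
          f"raw PCM {pcm_kbps(rate):.1f} kbps")
```

At 32 kHz nothing above 16 kHz can be stored, so a properly resampled file no longer carries the pilot tone and the noise around it, and there are also about 27 % fewer samples to encode.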

Cheers,
David.

Reply #520 – 2009-08-24 12:44:44

Quote from: 2Bdecided on 2009-08-24 10:09:27

Quote from: halb27 on 2009-08-23 10:10:13
....There will be frequency areas without musical contents driving the current machinery to uselessly hold bits-to-remove low.
Well, it's "useless" if it's masked, but not "useless" if it's not masked. ...

Yes, and the current approach certainly plays it safe, but it is also very pessimistic. Maybe we should leave it like that for --standard quality or better.

And yes, the suggestions I gave are over-simplistic.
But I guess it's worthwhile to strive for a crude psychoacoustic model which may still be defensive for quality levels below but close to --standard, yet not as super-pessimistic as what is done right now. With --portable we allow for a very small chance of subtly audible errors. Below --portable we allow for more.

A simple psychoacoustic model could be purely ATH-related (ATH: absolute threshold of hearing), along these lines:
if the outcome of the computed spreading function is below a roughly ATH-related, frequency-dependent threshold, let the spreading function return the threshold value instead. The threshold can be chosen so that at, say, -q 4.0 or above it follows the ATH exactly, so that the added noise stays below the ATH.
This is still pessimistic, as there is usually some masking.
We can account for that in a crude way by raising the threshold values as the -q value decreases. This brings an increasing chance of audible errors, but that's what we expect from lower -q values (of course the chance of errors should be very low at --portable, and audible issues should be subtle).
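A sketch of such an ATH floor, with several assumptions spelled out: the ATH curve is the commonly used Terhardt approximation (in dB SPL), digital full scale is arbitrarily mapped to 96 dB SPL, and the threshold is raised by an invented 3 dB per -q step below 4.0. None of these numbers come from lossyWAV itself.

```python
import math

def ath_db_spl(freq_hz):
    """Terhardt's approximation of the absolute threshold of hearing."""
    f = freq_hz / 1000.0
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def floored_spreading_result(spread_db, freq_hz, q, *,
                             full_scale_db_spl=96.0,  # assumed playback level
                             q_reference=4.0,         # ATH followed exactly from here up
                             db_per_q_step=3.0):      # assumed relaxation below that
    """Return the spreading result in dBFS, but never less than an
    ATH-related threshold that rises as -q decreases."""
    threshold_db = ath_db_spl(freq_hz) - full_scale_db_spl
    threshold_db += max(0.0, q_reference - q) * db_per_q_step
    return max(spread_db, threshold_db)

for q in (4.0, 2.0, 0.0):
    level = floored_spreading_result(-120.0, 16000, q)
    print(f"-q {q}: an empty 16 kHz region counts as {level:.1f} dBFS")
```

Because the ATH rises steeply towards high frequencies, the floor has most effect in the HF region, where nearly empty bins would otherwise hold bits-to-remove down.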

Maybe some day a more elaborate method can account for masking more precisely by measuring the energy in frequency bands related to the masking effect, as is done with mp3 etc.

For the moment, a rough ATH-related mechanism can already lead to an improvement IMO, especially in the HF region.

Reply #521 – 2009-08-25 07:40:29

Resultant bitrate tables updated for beta 1.1.4k with / without --postanalyse.

Reply #522 – 2009-08-25 08:04:32

The bitrate increase for -p is now higher for --portable and --standard than for the lower quality settings, unlike with the previous mechanism. I thought --postanalyse takes care of something like the S/N ratio as a whole, unrelated to the usual lossyWAV mechanism (except for implementation details). What exactly does --postanalyse do?

Reply #523 – 2009-08-25 12:20:43

The --postanalyse parameter simply carries out a single windowed FFT analysis per channel of the original codec-block and calculates the spreading function average and minimum. These two values are modified by noise_threshold_shift_average and noise_threshold_shift_minimum (were snr_value and noise_threshold_shift respectively). The minimum value of these two is stored.

Once the first pass remove_bits procedure has produced a clipping limit compliant set of correction data, that data is analysed using the same windowed FFT analysis. The spreading function results are calculated and only the average is used, unmodified.

If the correction data spreading average exceeds the previously stored minimum value for the original data then the bits_to_remove value is reduced by 1, the remove_bits procedure is repeated and the correction data FFT process is repeated until the added noise is below the acceptable level.
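The control flow of that loop can be sketched as below. This is a heavy simplification and not the real implementation: the windowed FFT and spreading function are replaced by a plain mean energy of the correction data, clipping is ignored, and all names, sample values and the noise limit are invented for the example.

```python
def remove_bits(samples, bits_to_remove):
    """Round integer samples to the nearest multiple of 2**bits_to_remove."""
    if bits_to_remove == 0:
        return list(samples)
    step = 1 << bits_to_remove
    return [((s + step // 2) // step) * step for s in samples]

def mean_energy(values):
    """Stand-in for the 'spreading function average' of the real analysis."""
    return sum(v * v for v in values) / len(values)

def post_analyse(samples, bits_to_remove, noise_limit):
    """Lower bits_to_remove one step at a time until the energy of the
    correction data (processed minus original) is within noise_limit."""
    while bits_to_remove > 0:
        processed = remove_bits(samples, bits_to_remove)
        correction = [p - s for p, s in zip(processed, samples)]
        if mean_energy(correction) <= noise_limit:
            break
        bits_to_remove -= 1
    return bits_to_remove

block = [1000, -730, 415, -212, 98, -45, 17, -3]
final_bits = post_analyse(block, bits_to_remove=6, noise_limit=4.0)
print("bits removed after post-analysis:", final_bits)
```

In the process described above, `noise_limit` would be the stored minimum of the shifted average and shifted minimum computed from the original block.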

Reply #524 – 2009-08-25 12:55:15

I wonder which part of this process depends on -q. There must be something, because otherwise the bitrate increase for the very low quality settings would be expected to be much higher than the increase for, say, --standard. Is it the noise_threshold_shift values? If so, wouldn't it be better to always use the noise_threshold_shift values that belong to, say, --portable here, in order to maintain a certain minimum quality? I guess the main benefit of this recent addition is for the low quality settings. Quality from --portable upwards is excellent already.
