lossyWAV 1.2.0 Development Thread

Reply #475
Quote
I'm looking again at window functions - specifically, implementing the Blackman window which takes a parameter alpha in the range 0..1. Zero is equivalent to the existing Hanning window function used, 0.57 is approximately equal to the Flat-Top window. I have been re-calculating noise threshold values for values in the permissible range (although I am thinking of limiting the user selectable alpha range to 0.0 to 0.6) and have spotted a relationship between some constants specific to each alpha value.

Ah, cool. So do you aim for high-resolution or for high-dynamic-range? The article claims that this graph might also be useful.
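
For anyone following along, here is a minimal Python sketch of the alpha-parameterised Blackman family mentioned in the quote. It uses the textbook coefficients a0 = (1 - alpha)/2, a1 = 1/2, a2 = alpha/2; the function names and the symmetric (length - 1) denominator are my own assumptions, not taken from the lossyWAV source.

```python
import math

def blackman_window(length, alpha):
    """Symmetric Blackman-family window (hypothetical helper, not lossyWAV code)."""
    a0 = (1.0 - alpha) / 2.0
    a1 = 0.5
    a2 = alpha / 2.0
    window = []
    for n in range(length):
        phase = 2.0 * math.pi * n / (length - 1)
        window.append(a0 - a1 * math.cos(phase) + a2 * math.cos(2.0 * phase))
    return window

def hann_window(length):
    """Plain Hann (Hanning) window, for comparison with alpha = 0."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]
```

With alpha = 0 the a2 term vanishes and the result matches the Hann window sample for sample, which is the equivalence described above.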

Reply #476
Nick.

This weekend I've been working on jLossyWav in order to upgrade it to 1.1.4.

I've implemented buffered I/O (speed has increased considerably due to this), and partially copied the console interface (most of the switches, help, etc.).

On the algorithm itself, I haven't got much further, though. There are a couple of things I'd like to get your opinion on.

First, about the Hanning window:
I've implemented it so that it is equal on both sides, while your implementation is off by one (starts at zero, and then it's equal). I remember I had a problem and I modified the window to be like it is now, but I can't remember exactly what the problem was.

Next, about the procedures Post_Process_FFT_Results and Sort_FFT_Results. I see they are for the new sortspread.
About the first, it seems it gets the Real value out of the Complex one and stores it, skewed, in fft_result, similar to what the beginning of Spreading_Complex does. Is that it?
About the second, is that a bubble sort algorithm, so that fft_result is ordered from smallest to biggest? Meaning, the goal being to get the smallest ones?

Spread_Complex also changed (I remember you tried different versions, so this is to be expected). If I get it right, it gets the minimum of the x-1 to x+1 vs the minimum of the x to x+somewidth, plus the accumulators of the averages of these values plus.. well.. it's quite complicated to explain :S

I still have some work to readapt the things that have changed, as well as to add the missing things. I hope for a release during july.



Btw.. have you considered refactoring the source a bit? That lossywav.dpr file is getting too complex. Especially the massive amount of variables. (naming conventions, constants placed as vars, local variables put as global variables...). This would greatly help other developers.

Reply #477
[JAZ],

Glad that you're still interested in the project - it's straining my brain a bit at the moment (I'm trying to clearly understand how the reference_threshold values are made up when the bit-removal noise is calculated in an unpublished routine).

Hanning window: I changed it to the zero start to allow index shift access (store all arrays in one) - however I realise that that is not such a good idea and have reverted (in the unreleased 1.1.4e) to fully symmetrical values;
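
To illustrate the difference being discussed, here is a rough Python sketch of a "zero start" Hann window next to a fully symmetrical one. Both formulas are standard conventions; which exact form either lossyWAV or jLossyWav uses is an assumption on my part, and the function names are made up.

```python
import math

def hann_zero_start(length):
    # w[0] == 0, and from index 1 onwards the values mirror: w[k] == w[length - k].
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length)
            for n in range(length)]

def hann_symmetric(length):
    # Half-sample offset, so w[n] == w[length - 1 - n] for every n.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 0.5) / length)
            for n in range(length)]
```

The first form is "off by one" in the sense described in Reply #476: the leading zero has no partner at the far end of the array.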

Post_Process_FFT_Results: Yes - you got it in one;

Sort_FFT_Results: Again correct.

Spread_Complex combines the old_spread and new_spread algorithms and calculates min_old, min_new, average_old and average_new concurrently. Then the lower minimum has the noise_threshold_shift applied and the lower average has the snr_value applied. Then the lower of the modified minimum and modified average is used as the single result. Each discrete old_value averages the range result[x] to result[x+somewidth] as you said and new_value = ((result[x-1] + result[x+1])*factor + result[x]) / (1+2 * factor).

I'll try to make the code more comprehensible.

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)
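
A rough Python sketch of the combination described in this reply, for readers porting the code. It is not the Delphi routine: the function name, the treatment of the edges, and the sign conventions (shift added to the minimum, snr_value subtracted from the average, all values in dB) are assumptions.

```python
def spread_sketch(result, width, factor, noise_threshold_shift, snr_value):
    """Combine 'old' and 'new' spreading over a list of FFT results (assumed dB)."""
    if width < 1 or len(result) < width + 2:
        raise ValueError("need width >= 1 and at least width + 2 results")
    old_values = []
    new_values = []
    # Only visit x where both neighbourhoods fit inside the list.
    for x in range(1, len(result) - width):
        span = result[x:x + width + 1]  # result[x] .. result[x+width]
        old_values.append(sum(span) / len(span))
        new_values.append(((result[x - 1] + result[x + 1]) * factor + result[x])
                          / (1 + 2 * factor))
    min_old, min_new = min(old_values), min(new_values)
    average_old = sum(old_values) / len(old_values)
    average_new = sum(new_values) / len(new_values)
    # Assumed sign conventions: shift is added, snr_value is subtracted.
    modified_minimum = min(min_old, min_new) + noise_threshold_shift
    modified_average = min(average_old, average_new) - snr_value
    return min(modified_minimum, modified_average)
```

For a flat spectrum of 10 dB with a shift of -3 and an snr_value of 20, the modified average (10 - 20) is the lower of the two candidates and becomes the result.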

Reply #478
In looking at the reference threshold values again I have been playing (yet again) with dither.

Previously the order was:

take each sample, divide by 2^bits-to-remove, add optional dither, round, multiply by 2^bits-to-remove.

This increased the bitrate quite dramatically.

However, each bit that is removed from the sample is random, so why not use:

take each sample, add triangular dither, divide by 2^bits-to-remove, round, multiply by 2^bits-to-remove

instead?

This does not change the reference threshold values beyond 4 bits-to-remove (higher than undithered for 1, 2 & 3).

Thoughts?
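
A small Python sketch of the proposed order (dither first, then divide and round). The triangular dither is generated as the difference of two uniform random numbers; the amplitude of +/-1 LSB and the function names are assumptions, since the post does not give them.

```python
import random

def triangular_dither(rng, amplitude=1.0):
    # Difference of two uniform variates: triangular PDF on (-amplitude, +amplitude).
    return amplitude * (rng.random() - rng.random())

def remove_bits_dither_first(raw_sample, bits_to_remove, rng):
    # Proposed order: add dither, divide, round, multiply back.
    step = 2 ** bits_to_remove
    return round((raw_sample + triangular_dither(rng)) / step) * step
```

For example, `remove_bits_dither_first(1003, 3, random.Random(0))` always returns a multiple of 8, whatever dither value happens to be drawn.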

Reply #479
Quote
take each sample, divide by 2^bits-to-remove, add optional dither, round, multiply by 2^bits-to-remove.

Doesn't that effectively multiply your dither by 2^bits-to-remove? Seems wrong.

As for your question, what is the distribution of a bit removed from the sample before calling it "random"?

Reply #480
Quote
Doesn't that effectively multiply your dither by 2^bits-to-remove? Seems wrong.

As for your question, what is the distribution of a bit removed from the sample before calling it "random"?

Music will *probably* follow a fairly random distribution, at least in the lowest few bits.

In code, no dither:

processed_sample:=Round(raw_sample/powersoftwo(bits_to_remove))*powersoftwo(bits_to_remove)

Old dithering:

processed_sample:=Round(raw_sample/powersoftwo(bits_to_remove)+dither)*powersoftwo(bits_to_remove)

Proposed dithering:

processed_sample:=Round((raw_sample+dither)/powersoftwo(bits_to_remove))*powersoftwo(bits_to_remove)
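
The three one-liners above, transcribed into Python so they can be tried side by side. This is only a sketch: `powersoftwo(n)` is assumed to mean 2^n, the dither value is passed in explicitly rather than generated, and the function names are mine.

```python
def no_dither(raw_sample, bits_to_remove):
    step = 2 ** bits_to_remove
    return round(raw_sample / step) * step

def old_dither(raw_sample, bits_to_remove, dither):
    # Dither is added after the division, i.e. in units of the new LSB.
    step = 2 ** bits_to_remove
    return round(raw_sample / step + dither) * step

def proposed_dither(raw_sample, bits_to_remove, dither):
    # Dither is added before the division, i.e. in units of the original LSB.
    step = 2 ** bits_to_remove
    return round((raw_sample + dither) / step) * step
```

With raw_sample = 1003 and 3 bits removed, a dither of 0.25 moves the old scheme from 1000 to 1008 but leaves the proposed scheme at 1000. The proposed scheme needs a dither of 0.25 * 2^3 = 2.0 to give the same 1008, which is the scaling by 2^bits-to-remove raised in Reply #479.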

Reply #481
To me this sounds healthier than old dithering.
As I understand it, with the old scheme the dither was amplified according to bits-to-remove, though it still remained at the supposed noise level.
Anyway I thought dithering was expected not to be useful. Is bitrate lower with this kind of dithering than when not doing dithering at all?
lame3995o -Q1.7 --lowpass 17

Reply #482
Quote
Anyway I thought dithering was expected not to be useful.

Wasn't it even working against the idea of lossyWav?
In theory, there is no difference between theory and practice. In practice there is.

Reply #483
Quote
Old dithering:

processed_sample:=Round(raw_sample/powersoftwo(bits_to_remove)+dither)*powersoftwo(bits_to_remove)

Proposed dithering:

processed_sample:=Round((raw_sample+dither)/powersoftwo(bits_to_remove))*powersoftwo(bits_to_remove)

...but if the "dither" is at the LSB level in both cases, then in the proposed case, the dither isn't doing much once you're removing more than a couple of bits.

What is the point of the "dither"?

Cheers,
David.

Reply #484
Quote
Anyway I thought dithering was expected not to be useful.

Wasn't it even working against the idea of lossyWav?

Exactly. The point of lossyWAV is to remove trailing n bits which are estimated to be just noise. The point of dithering is to transfer the noise from the removed bits to the last remaining bit (in order to mask the artefacts produced by removing the said bits). So basically, dithering makes the last bit which was estimated not to be noise noisy.

Reply #485
Just got into some more lossy codec testing recently. I was interested to see how more tonal music fares and there seem to be some issues with quality/bitrate. Note I only tested latest 'stable' release.

1 - On the attached sample: http://www.hydrogenaudio.org/forums/index....showtopic=73344

Heavy noise @ Q0 (360 k) , still noisy @ Q1 (402 k ). In contrast wavpack lossy v4.5 ABR 230k performs better - nearly transparent.


2 - On an album I have, 'black tape for a blue girl'. The lossless bitrate is 647 k. Lossywav -S managed to produce bigger files than the original (2 tracks) while lossywav -P didn't achieve any saving for those 2 tracks. The rest were smaller though.


I cannot draw a solid conclusion but a few thoughts: VBR rate may be too unstable for more silent music (very high bitrate and if quality lowered - noise and still high bitrate). -P has to produce saving in all tracks (at least not make them bigger) while being fully or very close to transparent. I believe -P is doing okay in that regard.

Reply #486
Thank you for your sample, shadowking.
I'll listen to it when I'm back from holidays.
We have known that -q 0 doesn't provide good quality. Real good quality starts in the -q 1.5 ... 2.0 zone.
It has also been known that simple tonal tracks are handled inefficiently compared to lossless codecs, because lossless codecs work very efficiently in this case when using a large blocksize, whereas the lossyWAV blocksize is only 512 samples (using a multiple of 512 samples helps only on rare occasions, as it makes the lossyWAV procedure less efficient). Moreover, when FLAC is used, FLAC often works inefficiently for this kind of music. With simple tonal music, TAK/wavPack lossless/Monkey are often as efficient as, or even more efficient than, lossyFLAC -P.
Your sample seems to combine these worst case scenarios for lossyWAV (bitrate is unusually high). As such it looks like it's a good example for testing potential improvements of lossyWav.

Reply #487
I'm sure we've been round this loop at some point before.

Solutions are

1. Simple: don't use the lossy version when the lossless is smaller
2. Complex: try different blocksizes
3. Very complex: use dynamic blocksizes

3 is a complete redesign of lossyWAV and FLAC, so isn't going to happen.
2 and 1 require file functionality that's outside of lossyWAV itself - maybe in the .bat file, front-end, whatever. lossyWAV as stands can't do it because it never sees the final FLAC file.
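
Solution 1 could live in a small wrapper script run after both encodes have finished. A minimal Python sketch, assuming the lossless FLAC and the lossyWAV + FLAC file already exist on disk; the function name and file paths are hypothetical.

```python
import os

def pick_smaller(lossless_path, lossy_path):
    """Return the path of whichever encoded file is smaller (ties go to lossless)."""
    if os.path.getsize(lossy_path) < os.path.getsize(lossless_path):
        return lossy_path
    return lossless_path
```

Usage would be something like `keep = pick_smaller("track.flac", "track.lossy.flac")`, after which the front-end deletes the other file.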


I don't think there are significant improvements needed to the "model" for tonal signals - it already knows that it shouldn't touch them - hence the high bitrate. Wavpack lossy does better because it can use appropriate block lengths.

Cheers,
David.


Reply #488
lossyWAV beta 1.1.4e attached to post #1 in this thread.

Reply #489
Thank you very much for the new version, Nick.
Can you tell a bit about the new parameters?

Reply #490
The new --maxsnr parameter works in the same way as the old -nts parameter did, except it is a threshold below the maximum spread FFT result for a particular analysis rather than above or below the minimum spread FFT result. Another name for it could be upper_noise_threshold_shift. SNR is still in there as a threshold below the average spread FFT result.
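
One possible reading of how the three thresholds sit together, as a Python sketch. The post does not say how the --maxsnr threshold is combined with the other two, so taking the lowest candidate is an assumption, as are the function name and the dB units.

```python
def combined_threshold(spread_results, noise_threshold_shift, snr_value, maxsnr_value):
    """Lowest of three candidate thresholds (assumed combination rule)."""
    minimum = min(spread_results)
    maximum = max(spread_results)
    average = sum(spread_results) / len(spread_results)
    candidates = (
        minimum + noise_threshold_shift,  # may sit above or below the minimum
        average - snr_value,              # a threshold below the average
        maximum - maxsnr_value,           # a threshold below the maximum
    )
    return min(candidates)
```

For spread results of 10, 20 and 30 dB with a shift of 0, an snr_value of 5 and a maxsnr_value of 25, the candidates are 10, 15 and 5, so the maximum-based threshold would be the one that applies.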

Reply #491
Bugfix release. lossyWAV beta 1.1.4f attached to post #1 in this thread.

Reply #492
There have been so many releases recently, and to be honest, I don't understand much of what is being done in LossyWAV these days.
I use LossyWAV all the time, but I'm currently using version 1.0.1.0. Is there a reason to upgrade or are all the updates experimental fine tuning?

I'm confused.

C.
PC = TAK + LossyWAV  ::  Portable = Opus (130)

Reply #493
Quote
... Is there a reason to upgrade or are all the updates experimental fine tuning? ...

Not really, I think.
The simple fact is that lossyWAV, following 2Bdecided's principle, worked from the very start. It is to Nick.C's credit that he added some internal precautionary procedures that enabled lossyWAV to work excellently at the 'portable' quality level at ~380 kbps. Since the 'portable' quality level was established there have been variants of certain technical aspects, none of which really led to an improvement. Probably no improvement can be expected from mere variants of the current principles. This is my personal opinion, but if I understand some of 2Bdecided's remarks correctly this is the way he thinks too. There is also no need for improvement IMO, the results are great (maybe with the exception of the encoding of simple tonal music, where lossyWAV + a lossless codec is pretty inefficient compared to pure lossless, which is very efficient here. But again, this is inherent in the lossyWAV principles, and when using 'portable' the situation is usually acceptable even in this case).

Reply #494
Thanks halb27, that's very clear.
I agree with what you said: "There is also no need for improvement IMO, the results are great".

C.

Reply #495
Quote
Just got into some more lossy codec testing recently. I was interested to see how more tonal music fares and there seem to be some issues with quality/bitrate. Note I only tested latest 'stable' release.

1 - On the attached sample: http://www.hydrogenaudio.org/forums/index....showtopic=73344

Heavy noise @ Q0 (360 k) , still noisy @ Q1 (402 k ). In contrast wavpack lossy v4.5 ABR 230k performs better - nearly transparent. ...
-P has to produce saving in all tracks (at least not make them bigger) while being fully or very close to transparent. I believe -P is doing okay in that regard.

Finally I managed to listen to your sample. It is not hard to ABX at -q 0 (I used v 1.1.0c as I thought this is what you call the latest stable release). But I couldn't ABX -q 1.
My -q 0 bitrate is 332 kbps when using FLAC -b 512 -8. I guess you used piping, which produces some overhead that is significant for a short track snippet. Ignoring this difference, the bitrate is unusually high anyway.

wavPack lossy is an alternative to lossyWAV of course, if player support isn't a restriction. Especially in the 300 kbps area and below it is expected to be the more appropriate choice IMO, while in the 380 kbps range I'd prefer lossyWAV quality-wise, though wavPack lossy is great too. I admire David Bryant's work, but unfortunately there is no DAP support except for those players that can make use of Rockbox.

Reply #496
Dumb question, but what type of noise does lossywav use for reducing the file size? Could another variant of lossywav use the different types of sound "color variants"? Could an implementation for specific genre types be created, producing better compression for different frequencies? Could it allow lower bitrates?
Just seems interesting...

Reply #497
The added noise is "created" by the bit-reduction of codec blocks. At its simplest it is white noise but, with noise shaping linked to the quality scale, it becomes more heavily shaped as the quality setting increases.

Since lossyWAV carries out all of its modification to the audio on raw samples (simply rounding the sample to a number of bits) there is no possibility of treating different frequencies differently.

Some time ago the --awful (-A) and --nasty (-N) presets (equivalent to --quality -4 and -2 respectively) were introduced to allow users to *very* easily create artifacts in the audio. This was to demonstrate the reasoning behind the present --quality 0 (--zero or -Z) settings. Forcing the bitrate down really hurts the quality and introduces glaring artifacts at --awful, less so at --nasty and less still at --zero.

Reply #498
lossyWAV beta 1.1.4g attached to post #1 in this thread.

Reply #499
I sniffed at LossyWAV yesterday, and it seems like a good idea. But I have a question: suppose my need is to have the music files compressed for a portable device. I want the files to be as small as possible without quality loss. LossyWAV considers the setting --standard to be transparent. LAME MP3 files are considered to be transparent at the setting -V2. And the LAME files are much smaller than a LossyWAV file at these settings. If "transparent" means the same here, what advantages would LossyWAV give over LAME?