Wonderful. Something like this is what I expected.
Hi Nick,
I just started examining the behavior of -3 with respect to -skew and -snr. I only started using -snr because I think there's something wrong:

         | -skew 0 | -skew 12| -skew 24| -skew 36
-snr 0   |390 / 510|390 / 510|390 / 510|390 / 510

These values can't be identical to my former test, because I used FLAC -b 1024 then and FLAC -b 512 now. But I wonder what's wrong here: identical results with various -skew values are not what I expected. 390/510 is a good result IMO, but I expected it to be achieved at around -skew 24.
@Mitch 1 2 - Excellent find! Should extend the userbase of David's method......
Quote from: Nick.C on 29 October, 2007, 03:54:40 AM
@Mitch 1 2 - Excellent find! Should extend the userbase of David's method......

Nice to see that WMALSL is working. I gave it a quick run and, with my old version of WMALSL, it looks like the best frame size for that codec is 2048. Once somebody else can confirm that this is also the case with newer versions, we may want to add a dedicated switch, to avoid people using it with frame size 512 or 1024.

By the way, @2048 WMALSL performs halfway between TAK and FLAC.

Set F, WMALSL-WMP9, 0.3.18 11236-FFFFF-1246D-FFFFF

 ------- ----- ----- -----
|       |  1  |  2  |  3  |
 ------- ----- ----- -----
|  512  | 434 | 427 | 425 |
| 1024  | 432 | 427 | 425 |
| 2048  | 430 | 424 | 422 |
| 4096  | 460 | 453 | 451 |
 ------- ----- ----- -----
Quote from: halb27 on 29 October, 2007, 05:51:56 PM
Hi Nick,
I just started examining the behavior of -3 with respect to -skew and -snr. ...

I've run some skew tests on my 52 sample set:

-3 -skew 0 -snr 0 > 433.0kbps;
-3 -skew 6 -snr 0 > 435.3kbps;
-3 -skew 12 -snr 0 > 439.1kbps;
-3 -skew 18 -snr 0 > 446.1kbps;
-3 -skew 24 -snr 0 > 458.7kbps;
-3 -skew 30 -snr 0 > 479.8kbps;
-3 -skew 36 -snr 0 > 511.3kbps.

Is it possible that *none* of your samples have a minimum result below 3.45kHz?
So, a -wm parameter to set codec_block_size to 2048 for all quality levels for WMALSL?
Just one more idea:
Though I love the idea of deciding (at least in principle) for each individual sample about the number of bits to remove, we can look at it a bit more practically.

Overall view:
Our stage 1 process provides blocks of 512 samples, and all samples within a block have the same number of bits removed. We do this under all circumstances, in particular for -1, -2 and -3. This way we are free to use any multiple of 512 as the blocksize for the stage 2 encoder, and to the best of our knowledge so far it's easy to find an appropriate blocksize (for instance 512 for FLAC and TAK, 1024 for WavPack, 2048 for WMAlossless).

Detail view for stage 1:
With a 512 sample block we can easily let it consist of several consecutive length-64-FFT and length-256-FFT windows. For each 512 sample block we can build an individual length-1024-FFT so that the 512 sample block lies in the middle of the 1024 sample FFT window. (Looking at only the length-1024-FFT windows: these cover the entire track with overlap.) Maybe it's good to apply a complex FFT window function for the length-1024-FFT, but I guess the simple approach is good enough.

The length-1024-FFT window contains information from 256 samples before and 256 samples after the block, which is a source of inaccuracy. These excess window parts correspond to ~5.8 msec each - a pretty short period IMO. Moreover, as long as the shorter FFTs have an independent influence on the number of bits to remove, I don't think this is a dangerous inaccuracy. What I mean is: if one of the shorter FFTs yields a very low value bin, and there's no lower one in the length-1024-FFT, this low value bin from the shorter FFT decides on the number of bits to remove.

But this is the place IMO where we should say goodbye to length-2048-FFTs.
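To put numbers on the centred length-1024-FFT idea above, here is a minimal Python sketch. It is only an illustration of the layout described in the post, not code from lossyWAV: the function names are made up, and the 44.1 kHz sample rate is an assumption that happens to match the ~5.8 msec figure.

```python
SAMPLE_RATE = 44100  # Hz; assumed (CD audio), consistent with the ~5.8 msec figure


def centred_window(block_index, block_size=512, fft_length=1024):
    """Return (start, end) sample indices of an FFT window centred on a block.

    `end` is exclusive. `start` can be negative for the first block, which
    means the window reaches in front of the beginning of the track.
    """
    excess = (fft_length - block_size) // 2  # samples outside the block, per side
    start = block_index * block_size - excess
    return start, start + fft_length


def excess_ms(block_size=512, fft_length=1024, rate=SAMPLE_RATE):
    """Duration in milliseconds of the window part outside the block, per side."""
    return (fft_length - block_size) / 2 / rate * 1000.0


if __name__ == "__main__":
    for i in range(3):
        print("block", i, "window", centred_window(i))
    print("excess per side: %.2f ms" % excess_ms())
```

With these defaults, consecutive windows start 512 samples apart, so neighbouring length-1024 windows overlap by 50% and together cover the whole track.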
The way it's done, energy from ~11.5 msec before and after the block makes it into the decision making for the block. So a potential min bin may lose its min status due to energy from outside the block.
It's inefficient to remove more bits than a given lossless encoder can take advantage of.
So say, for example, you run lossyWAV with 512 and FLAC with 1024. ... That means, within any FLAC block, half the samples might have more zeros than FLAC can take advantage of (because the other half have fewer zeros, defining and limiting the number of "wasted_bits" within that FLAC block). ...
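The point about "wasted_bits" can be illustrated with a small Python sketch. It assumes the usual meaning of the term in FLAC, i.e. the number of trailing zero bits shared by every sample in a block; the helper names and the sample values are invented for the example.

```python
def trailing_zero_bits(sample, max_bits=16):
    """Count trailing zero bits of one integer sample (max_bits for silence)."""
    if sample == 0:
        return max_bits
    count = 0
    while sample & 1 == 0:
        sample >>= 1
        count += 1
    return count


def wasted_bits(block, max_bits=16):
    """Zero LSBs common to all samples: the sample with the fewest zeros decides."""
    return min(trailing_zero_bits(s, max_bits) for s in block)


# Two hypothetical lossyWAV blocks (tiny stand-ins for 512 samples each):
half_a = [s << 5 for s in (3, -7, 11, 5)]  # 5 bits removed
half_b = [s << 2 for s in (9, -1, 13, 7)]  # 2 bits removed

if __name__ == "__main__":
    print("first half alone :", wasted_bits(half_a))
    print("second half alone:", wasted_bits(half_b))
    print("one encoder block:", wasted_bits(half_a + half_b))
```

Encoded as one block, the two halves only share 2 zero bits, so the 3 extra bits removed from the first half added noise without saving anything.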
Quote from: halb27 on 30 October, 2007, 10:41:02 AM
The way it's done, energy from ~11.5 msec before and after the block makes it into the decision making for the block. So a potential min bin may lose its min status due to energy from outside the block.

That's intentional. One of the FFT analyses is usually concentrated on the block boundary, which for a 1024-point FFT at 44.1kHz is, as you say, about +/-11.6ms - though the windowing means the effect at the edges is pretty small. The reason for doing this is to catch low energy moments near the block boundary, which could otherwise be completely missed. If you miss them, you add too much noise; worse still, you can put a hefty transition in there as you switch to more bits removed.

More generally, if you don't overlap analysis blocks, then there are moments of the audio that you never check, so you won't know if the noise you're adding is above or below the noise floor during those moments.

Cheers,
David.
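A rough Python sketch of the two figures mentioned above: the +/-11.6ms reach of a 1024-point FFT at 44.1kHz, and how little weight a Hann window gives to its edges. The periodic form of the Hann window is an assumption made for the example (the thread does not say which form lossyWAV uses), and the function names are invented.

```python
import math

SAMPLE_RATE = 44100  # Hz, as stated in the post above
FFT_LENGTH = 1024


def hann(n, length):
    """Periodic Hann window weight at sample n of a window of `length` samples."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * n / length))


def half_window_ms(length=FFT_LENGTH, rate=SAMPLE_RATE):
    """Time covered by half of the FFT window, in milliseconds."""
    return (length / 2) / rate * 1000.0


def central_weight_fraction(length=FFT_LENGTH):
    """Share of the total window weight that lies in the central 50% of samples."""
    weights = [hann(n, length) for n in range(length)]
    central = weights[length // 4: 3 * length // 4]
    return sum(central) / sum(weights)


if __name__ == "__main__":
    print("half window: %.1f ms" % half_window_ms())
    print("weight at window edge  :", hann(0, FFT_LENGTH))
    print("weight at window centre:", hann(FFT_LENGTH // 2, FFT_LENGTH))
    print("central 50%% share of weight: %.2f" % central_weight_fraction())
```

Under these assumptions roughly four fifths of the window weight sits in the central half, which is why samples far from the window centre have only a small say in the result.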
BTW is there a windowing function like hanning used? With the overlapping it would be most welcome I think and it would reduce potential negative side effects of the 'foreign' samples. It would also reduce errors resulting from a rectangular window.
The Hanning window is used. I did toy with the idea of the centred analysis previously, but at that time I was more concerned with being able to duplicate exactly the results from David's Matlab script.
lossyWAV alpha v0.3.20 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>
Example : lossyWAV musicfile.wav

Quality Options:
-1            extreme quality level (-cbs 1024 -nts -3.0 -skew 30 -snr 24)
-2            default quality level (-cbs 1024 -nts -1.5 -skew 24 -snr 18)
-3            compact quality level (-cbs 512 -nts -0.5 -skew 18 -snr 12)
-o <folder>   destination folder for the output file
-force        forcibly over-write output file if it exists; default=off

Advanced / System Options:
-nts <n>      set noise_threshold_shift to n dB (-18dB<=n<=0dB)
              (reduces overall bits to remove by 1 bit for every 6.0206dB)
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (0dB<=n<=48dB)
-skew <n>     skew fft analysis results by n dB (0db<=n<=48db) in the
              frequency range 20Hz to 3.45kHz
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 16=0)
-overlap      enable aggressive fft overlap method; default=off
-spf <3x5chr> manually input the 3 spreading functions as 3 x 5 characters;
              e.g. 44444-44444-44444; Characters must be one of 1 to 9 and
              A to Z (zero excluded).
-clipping     disable clipping prevention by iteration; default=off
-dither       dither output using triangular dither; default=off
-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode
-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:
Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
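Two of the numbers in the help text above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names are made up, and nothing here is taken from the lossyWAV source. The 6.0206dB per bit is simply 20*log10(2), and the -cbs rule is the range and divisibility constraint as printed.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.0206 dB, the figure quoted for -nts


def nts_in_bits(nts_db):
    """Express an -nts value (in dB) as an equivalent number of bits."""
    return nts_db / DB_PER_BIT


def is_valid_cbs(n):
    """Check a -cbs value against the stated rule: 512<=n<=4608, n mod 16=0."""
    return 512 <= n <= 4608 and n % 16 == 0


if __name__ == "__main__":
    print("dB per bit: %.4f" % DB_PER_BIT)
    for nts in (-3.0, -1.5, -0.5):  # the -nts presets of -1, -2 and -3
        print("-nts %4.1f is about %.2f bits" % (nts, nts_in_bits(nts)))
    for n in (512, 1024, 2048, 4608, 500, 520):
        print("-cbs", n, "valid" if is_valid_cbs(n) else "invalid")
```

So the three quality presets differ by well under half a bit of threshold shift; how lossyWAV turns that into whole bits per block is not covered by this sketch.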
OOPs, you're so fast!!
I've read a lot about fft overlapping windows, and I haven't seen anybody doing less than 50% overlapping. I had just removed this part from my post, and a second later saw that you had already realized it.
Thanks for your version and sorry for the confusion!
But now that you've done it: let's see what 2Bdecided and other people have to say about it.
Anyway, for the 1024 sample FFT I think we should do the 1 FFT center approach - at least as long as we're happy with a 50% overlapping of the other FFTs, as this has pretty much the same feasibility background.
Or, what about a fixed proportion of the largest FFT_length as the end_overlap? Say, 256, i.e. 0.25 of the 1024, for *all* analyses?
Quote from: Nick.C on 30 October, 2007, 06:32:38 PM
Or, what about a fixed proportion of the largest FFT_length as the end_overlap? Say, 256, i.e. 0.25 of the 1024, for *all* analyses?

I guess your concern is the same as mine: for the starting and ending 'overlap', half the FFT_length for the area outside the lossyWav block is a bit much and brings in wrong information to a major extent.

Your approach of 25% seems appropriate to me and corresponds to the 50% overlap between adjacent FFT windows (meaning the most central 50% of the samples in each FFT window are taken good care of by the Hanning windowed FFT analysis).

But why do you want to relate it to the longest FFT? IMO it should be 25% of the current FFT length. This more general procedure matches perfectly with the 1 FFT center approach for a lossyWav blocksize of 512 and a 1024 sample FFT.
         | -skew 0 | -skew 12| -skew 24| -skew 36
-snr 0   |382 / 480|383 / 490|390 / 510|421 / 547
-snr 12  |382 / 480|383 / 490|390 / 510|421 / 547
-snr 24  |387 / 486|393 / 501|402 / 524|429 / 560
Although, niggling at the back of my mind is the thought that if it holds that you should overlap by 50% inside a codec block, why would we change that when looking outside the codec block in the end_overlap area?
Quote from: Nick.C on 30 October, 2007, 07:01:53 PM
Although, niggling at the back of my mind is the thought that if it holds that you should overlap by 50% inside a codec block, why would we change that when looking outside the codec block in the end_overlap area?

If you overlap 50% inside the lossyWav block, this means you have confidence that the region 25% to either side of the FFT window center carries the necessary information. Let's take this as a valid assumption (otherwise we would have to increase the overlapping).

With a 50% overlap these '25% away from the center' regions cover the lossyWav block consecutively and without overlapping. At the start (and analogously at the end) this means you need to start the first window just 25% before the current lossyWav block. The lossyWav block then starts at the very beginning of the trusted region of the first FFT window.

This matters most for the long FFT, as otherwise a lot of foreign energy would make it into the current lossyWav block analysis.
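The tiling argument above can be checked with a short Python sketch. It is a model of the layout as described in this discussion (50% overlap between windows, end_overlap given as a fraction of the current FFT length), not lossyWAV's actual analysis loop, and all names are hypothetical.

```python
def window_starts(block_size, fft_length, end_overlap_fraction):
    """Start offsets (relative to the block start) of 50%-overlapped FFT windows.

    The first window begins end_overlap_fraction * fft_length before the block,
    and the last one ends the same distance after the block.
    """
    hop = fft_length // 2
    end_overlap = int(fft_length * end_overlap_fraction)
    starts = []
    start = -end_overlap
    while start + fft_length <= block_size + end_overlap:
        starts.append(start)
        start += hop
    return starts


def trusted_regions(block_size, fft_length, end_overlap_fraction):
    """Central 50% of each window: the region 25% to either side of its centre."""
    quarter = fft_length // 4
    return [(s + quarter, s + 3 * quarter)
            for s in window_starts(block_size, fft_length, end_overlap_fraction)]


if __name__ == "__main__":
    for fft_length in (64, 256, 1024):
        for fraction in (0.25, 0.5):
            starts = window_starts(512, fft_length, fraction)
            print("fft %4d, end_overlap %.2f: %2d windows, first start %d"
                  % (fft_length, fraction, len(starts), starts[0]))
    print(trusted_regions(512, 256, 0.25))
```

With a 25% end_overlap the trusted regions tile the 512 sample block exactly, and the 1024 sample FFT collapses to a single window centred on the block. With 50% the 1024 sample FFT needs two windows, centred on the two block boundaries.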
....lossyWAV alpha v0.3.20 attached: -overlap parameter added to reduce the end_overlap of FFT analyses to 25% FFT_length rather than 50%....