
lossyWAV Development

Reply #500
What do you get for your test set resampled to 32kHz, processed with -2?

Does 32k resampling followed by ReplayGain (only negative values applied) help even more?

It makes sense to have a -3 along the lines you're proposing, but I suspect the above will be dramatically more efficient, and still artefact-free (though with a 16k LPF and, with RG, loud tracks becoming quieter).

Cheers,
David.
I tried it with revised -3 settings (see below) and got:
WAV: 125.11MB;
FLAC: 69.36MB, 782kbps;
-1: 52.59MB, 593kbps;
-2 @ 44.1kHz: 45.10MB, 509kbps;
-2 @ 32.0kHz: 38.97MB, 440kbps;
-3 @ 44.1kHz: 38.49MB, 434kbps;
-3 @ 32.0kHz: 33.95MB, 383kbps.
That is a further saving of 13.6% at -2, or 11.8% at -3, from resampling to 32kHz! The results didn't sound bad at all.
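As a quick check of those percentages, here is a minimal sketch that recomputes them from the file sizes listed above. The helper name `saving_percent` is made up for illustration and is not part of lossyWAV.

```python
def saving_percent(before_mb: float, after_mb: float) -> float:
    """Relative size reduction, in percent, going from before_mb to after_mb."""
    return (before_mb - after_mb) / before_mb * 100.0

# Sizes in MB as listed above: (44.1kHz result, 32kHz result) per preset.
sizes = {"-2": (45.10, 38.97), "-3": (38.49, 33.95)}

for preset, (at_44k, at_32k) in sizes.items():
    print(f"{preset}: {saving_percent(at_44k, at_32k):.1f}% smaller at 32kHz")
```

Both figures come out as quoted (13.6% and 11.8%).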

[edit] I tried a couple of albums and the results were a bit of a surprise: FLAC: 773MB, 914kbps; -3 @ 44.1kHz: 321MB, 381kbps; -3 @ 32.0kHz: 313MB, 371kbps. The size difference is welcome, but the resampling has a time overhead and the 16kHz LPF. [/edit]

Target b) for -3: OK, so we should think about the details.
Just following your dialog here...
This seems the right basic choice: there has to be a benefit for giving up a (little) bit of quality. IMO that means a significantly lower bit rate for -3 (compared with -2).

(Would -skew of -12 -18 -24 (for -3 -2 -1) be too aggressive?)
I am wondering about the clipping reduction method. At the moment, if it finds 1 or more samples which clip after rounding, it reduces bits_to_remove by one and tries again; once bits_to_remove=0 it just stores the original values. Is 0 permissible clipping samples a bit too harsh? At the time that the iterative clipping was introduced, I put in an "allowable" variable, implying that a number of clipping (but rounded) samples may be permitted.
I suppose you mean consecutive samples of the maximum (or minimum) value? To me in this case 0, 1 or 2 would make sense; only music that is already badly clipped would be affected by other values.
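The iterative clipping reduction described above can be sketched roughly as follows. This is not the actual Delphi code: the function name `reduce_block`, the round-to-nearest rule, the 16-bit range and the clamping of tolerated clips are all assumptions made for illustration. Only the "reduce bits_to_remove by one and retry, store the originals at zero" loop and the "allowable" count come from the description.

```python
def reduce_block(samples, bits_to_remove, allowable=0, bit_depth=16):
    """Round one channel of one codec block, backing off while too many samples clip.

    Returns (output_samples, bits_actually_removed).
    """
    lo = -(1 << (bit_depth - 1))
    hi = (1 << (bit_depth - 1)) - 1
    while bits_to_remove > 0:
        step = 1 << bits_to_remove
        # Assumed rounding rule: nearest multiple of step, halves rounded up.
        rounded = [((s + step // 2) // step) * step for s in samples]
        clips = sum(1 for r in rounded if r < lo or r > hi)
        if clips <= allowable:
            # Any tolerated clips are clamped to the legal range (assumption).
            return [min(max(r, lo), hi) for r in rounded], bits_to_remove
        bits_to_remove -= 1  # too many clips: try removing one bit fewer
    return list(samples), 0  # nothing could be removed: keep the original values


block = [32767, 100, -200, 5]  # one full-scale sample among quiet ones
print(reduce_block(block, 3, allowable=0))
print(reduce_block(block, 3, allowable=1))
```

With `allowable=0` the single full-scale sample forces the block all the way back to 0 bits removed, whereas `allowable=1` keeps the full 3 bits. That is the kind of case where zero tolerance might be considered too harsh.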

And yes, the dither function is obsolete as you no longer opt to lower the amplitude.
I also tried -3 [..] which yielded 420kbps [..] [edit]Maybe 400kbps for "real music" should be the target rather than approaching that for my problem set. [/edit]
The problem with this is that from the outset this method aims for constant quality (I like that BTW), so the bit rate will vary. I found, for example, that music that already compresses well losslessly (bit rates in the 600s) will not get down to half that bit rate with the help of lossyWav, but rather stays around 420.
I've settled on a set of settings which are in some ways similar to -2 but using different fft lengths, -nts and -spf, see below. 434kbps is a reasonable bitrate at a reasonable quality (I can't hear anything wrong, but my ears are 39 years old......). The -allowable parameter only counts individual clips, it doesn't look for multiples (although it could, at a slight speed penalty). The -window parameter hasn't made it into this revision as I have to check the bit reduction noise calculations for each new spreading function to ensure that I'm not adding the "wrong" amount of noise per bit removed.

Feedback, as always, is requested and valued.

lossyWAV alpha v0.4.5 attached: Superseded.

-3 settings tweaked;

-allowable parameter implemented to allow a number of clips per codec block (total per block per channel).
Code: [Select]
lossyWAV alpha v0.4.5 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            extreme quality [5xFFT] (-cbs 512 -nts -3.0 -skew 30 -snr 24
              -spf 11124-11125-11225-11225-11236 -fft 11111)
-2            default quality [4xFFT] (-cbs 512 -nts -1.5 -skew 24 -snr 18
              -spf 11235-11236-11336-12348-1234D -fft 11101)
-3            compact quality [3xFFT] (-cbs 512 -nts -0.5 -skew 24 -snr 18
              -spf 22236-22237-22347-22358-2234E -fft 01010)

-o <folder>   destination folder for the output file
-force        forcibly over-write output file if it exists; default=off

Advanced / System Options:

-nts <n>      set noise_threshold_shift to n dB (-18dB<=n<=0dB)
              (reduces overall bits to remove by 1 bit for every 6.0206dB)
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (0dB<=n<=48dB)
-skew <n>     skew fft analysis results by n dB (0db<=n<=48db) in the
              frequency range 20Hz to 3.45kHz
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)
-fft <5xbin>  select fft lengths to use in analysis, using binary switching,
              from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-overlap      enable conservative fft overlap method; default=off

-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 44444-44444-44444-44444-44444 (Characters must be one of
              1 to 9 and A to F (zero excluded).
-clipping     disable clipping prevention by iteration; default=off
-allowable    select allowable number of clipping samples per codec block
              before iterative clipping reduction; (0<=n<=64, default=0).
-dither       dither output using triangular dither; default=off

-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
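For anyone puzzling over the "binary switching" of -fft and the limits on -cbs in the help text above, here is a small sketch of how those two rules read. The function names are invented for illustration; the FFT lengths, the 01001 example and the -cbs limits are taken from the help text.

```python
FFT_LENGTHS = (64, 128, 256, 512, 1024)  # order given in the -fft help line


def decode_fft_mask(mask: str) -> list[int]:
    """Turn a 5-character 0/1 string such as '01001' into the FFT lengths it enables."""
    if len(mask) != 5 or set(mask) - {"0", "1"}:
        raise ValueError("expected 5 characters, each 0 or 1")
    return [n for bit, n in zip(mask, FFT_LENGTHS) if bit == "1"]


def valid_cbs(n: int) -> bool:
    """-cbs rule from the help text: 512<=n<=4608 and n mod 32=0."""
    return 512 <= n <= 4608 and n % 32 == 0


print(decode_fft_mask("01001"))  # the help text's own example
print(decode_fft_mask("11101"))  # -2 preset
print(decode_fft_mask("01010"))  # -3 preset
print(valid_cbs(512), valid_cbs(500))
```

Decoded this way, the -3 preset's 01010 enables two lengths (128 and 512).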
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #501
Code: [Select]
-allowable    select allowable number of clipping samples per codec block
              before iterative clipping reduction; (0<=n<=64, default=0).

I tried with/without -allowable 1 on a track that hits full scale.
Doesn't make a lot of difference here.
Code: [Select]
%lossyWAV Warning% : Codec_block_size forced to 512 bytes.
%lossyWAV Warning% : Allowable clipping samples set to 1 per codec block.
%lossyWAV Warning% : Process priority set to low.
temp-6605F1A4E1877D7AA8BA0D93BF92EA95.wav;5.2624;67932;12909;8.92x
%lossyWAV Warning% : 9 sample(s) clipped to maximum +ve amplitude.

%lossyWAV Warning% : Codec_block_size forced to 512 bytes.
%lossyWAV Warning% : Process priority set to low.
temp-6605F1A4E1877D7AA8BA0D93BF92EA95.wav;5.2606;67909;12909;9.01x
%lossyWAV Warning% : 23 bits not removed due to clipping.


BTW, shouldn't the logging say 512 samples?
In theory, there is no difference between theory and practice. In practice there is.

Reply #502
Code: [Select]
-allowable    select allowable number of clipping samples per codec block
              before iterative clipping reduction; (0<=n<=64, default=0).

I tried with/without -allowable 1 on a track that hits full scale.
Doesn't make a lot of difference here.

BTW, shouldn't the logging say 512 samples?
Erm, yes, you would be correct in that assertion!

-allowable 1 will only allow 1 sample per channel per codec_block to clip - try with -clipping instead to see what the maximum bits to remove for the track in question would be (this will also give you a count of samples which clip over or under) and then play about with -allowable. The parameter will take up to 64 permitted clips per channel per codec_block.

Reply #503
I was thinking today that it would be nice to be able to just drop a wav (or multiple wavs) onto a small app and get lossyFLAC files in return.

So after about 8 hours of opcode hexing, and batch file scripting...  lFLCDrop is born.    See attached.


You will have to download and/or copy flac.exe and lossyWAV.exe into the folder you will use lFLCDrop in, because I'm not sure what the licenses are for redistributing that stuff yet.  I'll have to check that out, but if anyone knows off hand, you could let me know to save me the hassle. 

I should note that lFLC.bat for lFLCDrop v1.0.0.0 is forcing 576 sample blocks for lossyWAV and FLAC, due to Winamp's in_flac plugin not showing the spectrum in the classic visualization when 512 sample blocks are used (I don't have modern skins installed to test). If this fix is not OK with you, feel free to change it in the batch file. For quick reference, here's the command line used for both:
lossyWAV [input] [quality] -o [output] -nowarn -cbs 576
flac -8 -o [output] --delete-input-file -f -b 576 [input]
Note that FLAC is deleting a temp file, not your source file. If you want to delete your source files, the option is available if you right-click on the lFLCDrop GUI.
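For anyone who would rather script the same two steps than use the batch file, here is a rough sketch that only builds the two command lines quoted above (it does not run them). The function name and the file names are hypothetical, and since the thread does not say what lossyWAV names its output file, the path of the processed temp WAV is left for the caller to supply.

```python
import shlex


def build_commands(src_wav, quality, work_dir, processed_wav, out_flac, block=576):
    """Return (lossyWAV argv, flac argv) mirroring the two lines used by lFLC.bat.

    processed_wav is the temp file lossyWAV writes into work_dir; its name is an
    assumption the caller must provide.
    """
    lossywav_cmd = ["lossyWAV", src_wav, quality,
                    "-o", work_dir, "-nowarn", "-cbs", str(block)]
    flac_cmd = ["flac", "-8", "-o", out_flac,
                "--delete-input-file", "-f", "-b", str(block), processed_wav]
    return lossywav_cmd, flac_cmd


# Hypothetical file names, for illustration only.
lossy, flac = build_commands("track01.wav", "-3", "tmp",
                             "tmp/track01.processed.wav", "track01.flac")
print(shlex.join(lossy))
print(shlex.join(flac))
```

Keeping the block size as one parameter makes it harder for the lossyWAV and FLAC block sizes to drift apart if you change 576 back to 512.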


The next thing I plan to do is create an lFLC.bat for use with EAC, including passing in variables for use in tagging.  It might take a bit longer to test due to the possibility of it being impossible to get around certain characters being passed in.  Mainly double-quotes & percent signs, but it will need some testing for sure.


At any rate, enjoy.

p.s. thanks to all of the people involved with lossyWAV and of course FLAC, and to Layer3Maniac for making the original FlacDrop.  Without all of you, this would not have been possible and I take no credit for anything I've done which belongs with you all.  This is mentioned in the readme file, but I thought it would be considerate to have here as well.

[edit] removed, newer version posted later in the thread [/edit]

Reply #504
So after about 8 hours of opcode hexing, and batch file scripting...  lFLCDrop is born.    See attached.
Nice to hear that you think enough of the processor to create a method of using it! lossyWAV is LGPL (although exactly what that means, I still need to get my head round.....), by the way.

Possible bug report:

I am in the process of batch converting circa 1500 tracks in Foobar2000 v0.9.5 beta1 using FLAC v1.2.1 and lossyWAV v0.4.5. I got a bit concerned when after a while I noticed that the total time of the output files is less than that of the input files. Narrowing it down, I find that some tracks are exactly 8 codec blocks (4096 samples / 16kB) shorter than they should be. I am at a loss as to why this is occurring.

[edit] I've looked at the throughput as one album with 2 affected tracks processes: the input and processed WAV files are the same length..... [/edit]
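To put the "8 codec blocks (4096 samples / 16kB)" figure in context, here is the arithmetic as a small sketch. It assumes 16-bit stereo audio at 44.1kHz (i.e. CD rips) and the default 512-sample codec block; the thread does not state those explicitly, but the 16kB figure fits them.

```python
CODEC_BLOCK = 512       # samples per codec block (default -cbs)
CHANNELS = 2            # assumption: stereo
BYTES_PER_SAMPLE = 2    # assumption: 16-bit PCM
SAMPLE_RATE = 44100     # assumption: CD audio

missing_samples = 8 * CODEC_BLOCK
missing_bytes = missing_samples * CHANNELS * BYTES_PER_SAMPLE
missing_ms = missing_samples / SAMPLE_RATE * 1000

print(missing_samples, "samples")
print(missing_bytes // 1024, "kB")
print(round(missing_ms, 1), "ms")
```

So each affected track would be short by a little under a tenth of a second.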

[edit2] As an aside, I'm 623 tracks in and the processing (-3) has brought the bitrate down from 854kbps to 392kbps (8.27GB / 18.0GB). [/edit2]

Reply #505
In reply to PM regarding conversion in Foobar2000: [edit] See wiki article [/edit]

I'm still working on the user selectable window function parameter, this should be ready tonight or tomorrow.

Reply #506
lossyWAV alpha v0.4.6 attached: Superseded - bug report.

Added noise due to bit reduction calculations re-done, now calculated for each of the seven user-selectable window functions. Slightly more bits_to_remove than in v0.4.5;

"-nts n" parameter now valid in the range -18dB to +6dB;

"-window n" parameter (0<=n<=6) selects window function to use in FFT analysis.
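As a rough guide to what the wider -nts range means in bits, here is a sketch based on the v0.4.5 help text, which stated that -nts changes the overall bits to remove by 1 bit for every 6.0206dB. Treating that as a simple linear conversion is an assumption; the real effect per block also depends on the rest of the analysis. The helper name `nts_to_bits` is invented.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.0206 dB: one bit doubles the amplitude


def nts_to_bits(nts_db: float) -> float:
    """Approximate shift in bits_to_remove for a given -nts value (linear assumption)."""
    return nts_db / DB_PER_BIT


for nts in (-18.0, -3.0, -1.5, 0.0, 6.0):
    print(f"-nts {nts:+5.1f} dB -> {nts_to_bits(nts):+.2f} bits")
```

On that reading the old lower limit of -18dB is about three bits fewer removed, and the new upper limit of +6dB is about one bit more.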

Code: [Select]
lossyWAV alpha v0.4.6 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            extreme quality [5xFFT] (-cbs 512 -nts -3.0 -skew 30 -snr 24
              -spf 11124-11125-11225-11225-11236 -fft 11111)
-2            default quality [4xFFT] (-cbs 512 -nts -1.5 -skew 24 -snr 18
              -spf 11235-11236-11336-12348-1234D -fft 11101)
-3            compact quality [2xFFT] (-cbs 512 -nts -0.5 -skew 24 -snr 18
              -spf 22236-22237-22347-22358-2234E -fft 01010)

-o <folder>   destination folder for the output file
-force        forcibly over-write output file if it exists; default=off

Advanced / System Options:

-nts <n>      set noise_threshold_shift to n dB (-18dB<=n<=+6.0dB)
              (-ve values reduces bits to remove, +ve value increase)
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (0dB<=n<=48dB) Increasing value reduces bits to remove.
-skew <n>     skew fft analysis results by n dB (0db<=n<=48db) in the
              frequency range 20Hz to 3.45kHz
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)
-fft <5xbin>  select fft lengths to use in analysis, using binary switching,
              from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-overlap      enable conservative fft overlap method; default=off

-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 44444-44444-44444-44444-44444 (Characters must be one of
              1 to 9 and A to F (zero excluded).
-clipping     disable clipping prevention by iteration; default=off
-allowable    select allowable number of clipping samples per codec block
              before iterative clipping reduction; (0<=n<=64, default=0).
-window       select windowing function n (0<=n<=6, default=0); 0=Hanning
              1=Bartlett-Hann; 2=Blackman; 3=Nuttall; 4=Blackman-Harris;
              5=Blackman-Nuttall; 6=Flat-Top.
-dither       dither output using triangular dither; default=off

-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
The "-wmalsl" parameter, which forces codec_block_size to 2048 samples, will be implemented in the next revision.

[edit] Possible candidate for -3: -3 -nts 6 -skew 36 -snr 21. Currently processing FLAC > lossyFLAC 1496 tracks, 859kbps > 337kbps. 40.8GB > 16.0GB

Really quite palatable to listen to. I think the interplay between the -nts +6 (take the minimum value found and add 6dB) and -snr 21 (take the average of all relevant bins and subtract 21dB), then take the lower of the modified minimum and the modified average, produces quite a robust check against added noise. I am listening to a lot of the output (4d17h27m27.333s) trying to find the artifacts I *really* expect to be there at that bitrate. None yet. Quite pleased.[/edit]
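The interplay described in that edit can be sketched like this. It is a simplified reading of the description, not the real implementation: the function name is invented, and `bins_db` stands for the already skewed and spread FFT results (in dB) for one analysis.

```python
from statistics import mean


def noise_threshold_db(bins_db, nts=6.0, snr=21.0):
    """Lower of (minimum bin + nts) and (average of bins - snr), all in dB."""
    modified_min = min(bins_db) + nts   # -nts: shift the minimum value found
    modified_avg = mean(bins_db) - snr  # -snr: stay this far below the average
    return min(modified_min, modified_avg)


# Fairly flat spectrum: the average-based check is the stricter one.
print(noise_threshold_db([-20.0, -35.0, -50.0, -30.0]))
# One deep notch: the minimum-based check wins despite the +6dB.
print(noise_threshold_db([-20.0, -22.0, -60.0, -21.0]))
```

Whichever check is stricter for a given block is the one that limits the added noise, which is presumably why a positive -nts can be tolerated as long as -snr is set high enough.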

Reply #507
Thanks to everyone for bettering LossyWAV!! I don't know exactly what is happening here, but when I try to run version 0.4.6 it just outputs a wav header and no data. I attached a screenshot of the commandline, and it appears that LossyWAV doesn't even try to render any audio for me. I'm running an Intel Celeron processor @ 2.4GHz (the P4 based style) and I'm wondering if something SSE-wise just isn't meshing with my processor. If anybody has any answers they will be greatly appreciated, but for now I'm gonna hope that newer versions will work again for me.

Thanks!
-808

Reply #508
Did v0.4.5 work properly with the same settings? I thought that I had got rid of all SSE instructions in v0.4.5 and don't think that I've added any into v0.4.6 (although I'll check anyway). I'll look for bugs and revert.

[edit]There's a bug in the -detail parameter which seems to prematurely end the process. I'll amend and include in the next revision.[/edit]

Reply #509
lossyWAV alpha v0.4.7 attached: Superseded.

-detail bug corrected.

Thanks BGonz808!

Reply #510
Using lossyWAV -3 -nts 6 -skew 36 -snr 21, my (small) test set achieved an average 344kbps with FLAC, compared to ~400 with -3 alone. Some files were smaller using FLAC, while others were smaller using WMALSL, and the difference between the two codecs over the whole set was negligible.
lossyFLAC (lossyWAV -q 0; FLAC -b 512 -e)

Reply #511
I tried 0.4.7 on my regular/problem test set and got the following average bitrates:

-1: 512/585 kbps for my regular/problem sample set
-2: 430/539 kbps for my regular/problem sample set
-3: 388/481 kbps for my regular/problem sample set
-3 -nts 6 -skew 36 -snr 21: 338/468 kbps for my regular/problem sample set.

To me these are very attractive bitrate variations for the various quality levels, and the average bitrate differences between regular and problem samples show, at least in a statistical sense, that lossyWav can differentiate well what to do according to the different situations.

Your new -3 candidate looks extremely attractive judging from the statistics, Nick.
Statistics, however, don't really tell us about quality, so I tried -3 -nts 6 -skew 36 -snr 21 on my problem samples as well as on some tracks of regular music.
To my surprise, the only issue I found was with badvilbel at ~sec. 19.0, where I could abx the added hiss 8/10. This added hiss is so negligible to me that it is well within the excellent quality I'd like to see with -3.
I never thought that lossyWav would be that good at an average bitrate of ~340 kbps with regular music.
Great work, Nick.

So this is the way to go for -3 IMO as long as we don't get bad news. Maybe even for -2 in an adapted and more cautious way.
lame3995o -Q1.7 --lowpass 17

Reply #512
Well, I'm very glad to hear that you like the new -3 proposal. I will implement this in v0.4.8. I was pretty astonished when I got to the end of the 1496 track processing and the output was 16GB from 40.8GB input.

Reply #513
lossyWAV alpha v0.4.8 attached: Superseded.

-wmalsl parameter implemented : sets codec_block_size to 2048 samples, incompatible with -cbs parameter;

-3 quality level changed to -cbs 512 -fft 01010 -snr 21 -skew 36 -nts +6.0 -spf 22236-22237-22347-22358-2246E;

Code sped up a bit further - I still don't understand the speed increases available by properly aligning variables......

Reply #514
Thinking further on what Halb27 was saying about hiss in badvilbel, I have been iterating with the -3 settings and have arrived at:

-3:  -fft 10001 -spf 22235-22236-22347-22358-2247F -snr 21 -skew 36 -nts 6

This gives a lossyFLAC output of 35.95MB / 405.5kbps with a fairly significant reduction in bits_to_remove for badvilbel, and also reverts to the original two fft lengths in David's script. This is in contrast to 34.62MB / 390.5kbps for alpha v0.4.8 -3 settings. Slightly more conservative, but if it reduces noticeable hiss, then I'm all for it (however, I haven't heard any added hiss on my iPAQ at existing -3 settings, but the noise floor for audio output is not great on it).

I intend to implement these settings for the next revision, unless of course anyone feels strongly that I shouldn't (alternative settings welcomed).

Nick.

Reply #515
Hmm,

If it's only about the (to me) negligible added hiss above hiss that is already there in the original badvilbel I personally wouldn't care about it. I've grown to love your current -3 setting. I've been listening to a lot of music with current -3 trying to abx problems on suspicious spots, and I'm very happy with it. To me it's a very good solution for people who want great quality on a FLAC enabled DAP.
Sure it's all within the usual restriction of experience so far. But remember it's about -3 here.
Everybody can increase -3 quality to his liking by lowering -nts.

Anyway I'll try your new -3 proposal tomorrow.

I've tried a lot of settings for -2 with your -3 idea in mind: using a rather high -skew and -snr value, a rather high -nts value, and being very restrictive with using spreading_length = 1, and I ended up with

-2 -fft 11011 -spf 33335-22236-22348-123FF-23FFF -nts 0.0 -skew 36 -snr 24

It yields an average of 405/549 kbps on my regular/problem sample set which compares favorably with the 430/539 kbps of the current -2 setting.
Moreover, -nts 0 should be the default for safety IMO, but I guess using a positive user-chosen -nts value is fine. Using -nts 2.5 instead of -nts 0, for instance, yields 388/540 kbps for my regular/problem sample set.

I will do a listening test with it (using -nts 2.5) tomorrow.

The idea behind the -spf setting is (apart from merging current setting with your -3 setting):
a) Make the 64 and 128 sample FFT the primary decision basis for deciding on the 2 highest frequency ranges. Give the 64 sample FFT a minor influence on decision making for the 3 lower frequency ranges.
b) Make the 512 and 1024 sample FFT the primary decision basis for deciding on the 2 lowest frequency ranges. Give the 1024 sample FFT a negligible influence on decision making for the 3 higher frequency ranges. Same for the 512 sample FFT with respect to the 2 highest frequency ranges.
c) Make the 128 and 512 sample FFT the primary decision basis for deciding on the 3rd of your 5 frequency ranges.
d) Details are chosen on a cost consideration. For instance the 2s in the 128 sample FFT setting cost next to nothing (at first I wanted to have them as a 3 as with the 64 sample FFT setting).
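To make points a) to d) easier to follow, here is a sketch that unpacks an -spf string into one row per FFT length. The parsing follows the help text (5 groups of 5 characters, each 1 to 9 or A to F, for FFTs of 64 to 1024 samples). Reading each character as a spreading length, and the columns as running from the lowest to the highest frequency range, is my interpretation of the points above rather than something stated outright. The function name is invented.

```python
FFT_LENGTHS = (64, 128, 256, 512, 1024)
ALLOWED = "123456789ABCDEF"  # zero is excluded, per the help text


def parse_spf(spf: str) -> dict[int, list[int]]:
    """Map each FFT length to its five values (assumed order: lowest to highest range)."""
    groups = spf.upper().split("-")
    if len(groups) != 5 or any(len(g) != 5 for g in groups):
        raise ValueError("expected 5 groups of 5 characters")
    table = {}
    for fft_len, group in zip(FFT_LENGTHS, groups):
        if any(ch not in ALLOWED for ch in group):
            raise ValueError(f"invalid character in group {group!r}")
        table[fft_len] = [int(ch, 16) for ch in group]
    return table


for fft_len, row in parse_spf("33335-22236-22348-123FF-23FFF").items():
    print(f"{fft_len:>4}: {row}")
```

Laid out this way, the large values (F = 15) sit in the higher ranges of the 512 and 1024 sample rows, which matches the intent in b) of giving the long FFTs negligible influence there.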

I will report on the listening test.

BTW I've found a little bug: -nts 1 doesn't do what it should do: -nts 0.99 is fine as is -nts 1.01, but with -nts 1.00 bits removed are far too low (less than with -nts 0.0).

Reply #516
I couldn't resist trying your new -3 proposal this morning, Nick.

The statistics say 343/473 kbps on average for my regular/problem set, which is very close to the 338/468 kbps of the current -3 setting.
I also tried the 'hiss spot' of badvilbel, and I can't abx the difference.

Looking more closely at the new setting it is a bit of what I have in mind with -2: let the short FFT do the main decision job for the high frequencies (the short FFT is good at that), and let the long FFT do the main decision job on the low frequencies (the short FFT isn't good at that).

Sorry for having been pretty negative about the new -3 setting. Guess I was a bit upset cause I've done a lot of listening effort with the current -3 setting. But I think this wasn't useless when switching to the new setting. The major principle is the same, and it is a little bit more defensive. Sure I'll try the new setting with my usual problem samples tonight. To me this is sufficient and I won't go through part of my regular collection again.

What's more relevant IMO: why is this -nts x setting, with x>0 to a rather high degree so good? Can we trust it so much to use a positive -nts also for the higher quality settings?

A high -skew value is a good thing for differentiating good and bad spots (with respect to 'number of bits to remove') in the music. But -skew is effective only at rather low frequencies. Together with a high -skew value -snr also does a good job differentiating. But because of this interconnection I'm afraid -snr is effective also only in the low to lower medium frequency range below ~3 kHz.
If this is correct using a positive -nts value leaves the high frequency range under reduced noise control.
However from what we experienced so far this doesn't seem to have a practical negative impact.
Maybe dropping the same amount of LSBs in an entire block usually gives a noise floor with frequencies below 3 kHz which is caught well by the skew/snr machinery even with a rather high positive -nts value?
Or maybe the ATH curve is relevant here, which gives reduced sensitivity to the 3+ kHz range for low level signals?

In either case it would be very welcome if younger members could contribute listening. If for instance everything's fine in the high frequency range to my old ears this doesn't say a lot.

Reply #517
I'm glad that the badvilbel hiss has disappeared - I tried quite a few permutations before arriving at this latest proposal - I have also done quite a bit of listening at current -3 .

I think that the high skew value, and the fact that it weights in favour of the lower frequencies, will produce minimum values at low frequencies quite often. As these are artificially weighted, adding 6dB to them has no major detrimental effect on the output.

-snr is currently the average of the skewed & spread fft results. I had thought about making it the plain average of the relevant bins (pre-skewing) to see what effect that has, but put it off as I feel that this will effectively weight the higher frequencies. Another option would be to take the average of the skewed results, pre-spreading.

I'll take a look to see what's wrong with -nts 1.0.

Ditto your request for younger ears to test the output - it would be very much appreciated.

Reply #518
lossyWAV alpha v0.4.9 attached: Superseded.

-3 quality settings modified to: -fft 10001 -nts 6.0 -snr 21 -skew 36 -spf 22235-22236-22347-22358-2246C;

-nts 1.0 bug rectified.

This results in 406.9kbps / 36.08MB for my 53 sample set. [edit] Currently re-processing my 1496 track set, 536 tracks in: 5.18GB / 345kbps output from 13.4GB / 895kbps input. [/edit]

lossyWAV Development

Reply #519
I see you made things a little more defensive for -3.

I've been thinking about listening tests. To make listening experience transferable across our quality levels, and given the very good quality of these -3 settings, I think we should make -2 a more defensive version of -3 in every detail (and -1 a more defensive version of -2 in every detail).
This way everybody can try -3, where problems can be heard most easily if they exist. The resulting improvements to -3 can then be carried over analogously to -2 and -1.
It would be different if a certain, say, -2 detail wasn't necessarily more defensive than the corresponding -3 detail.
Moreover, I now think we can use a slightly positive -nts value with -2 too when using a high -skew and -snr value.
I also feel that 3 analyses should be enough for -2, so speed can be improved compared to the current 4 analyses used.
So I have to change my -2 suggestion I wanted to listen to tonight.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #520
Well, while we are talking about default settings... these days I've been working, just for the fun of it, on a very simple algorithm which, using the official defaults as a base, applies some morphing between them and slowly moves towards pure lossless, so that you can input a floating point value in the range [0.00 .. 4.00] instead of (-1;-2;-3) as a quality setting.

Here are some examples (please note that 1.0, 2.0 and 3.0 are the official defaults). Though the numbers look fine, it is also possible that many of these combinations are worth nothing, as they are obtained by pure morphing. All this is just to show a possible feature.

SetF, LossyWAV 0.4.8, Tak 1.02 -p3m
-------------------------------------------------------------------------
Qual. String                                              Rem.Bits  kb/s
-------------------------------------------------------------------------
0,2 | 4096 -15,0 44,4 43 1111211112111121111211112 11111 | 0,3797 | 830 |
0,4 | 2048 -12,0 40,8 38 1111211113111131111311123 11111 | 1,1347 | 770 |
0,6 | 2048  -9,0 37,2 34 1112311123112231122311224 11111 | 2,2097 | 678 |
0,8 | 1024  -6,0 33,6 29 1112311124112241122411235 11111 | 3,3363 | 593 |
1,0 |  512  -3,0 30,0 24 1112411125112251122511236 11111 | 4,3924 | 523 |
1,4 |  512  -2,4 27,6 22 111351113611236112381124D 11111 | 4,7898 | 491 |
1,6 |  512  -2,1 26,4 20 112351123611336113481134D 11101 | 5,2201 | 458 |
2,0 |  512  -1,5 24,0 18 112351123611336123481234D 11101 | 5,4594 | 440 |
2,4 |  512   1,5 28,8 19 112361123711347123581234E 11010 | 5,9919 | 401 |
2,7 |  512   3,8 32,4 20 122361223712347123581234E 01010 | 6,4358 | 370 |
3,0 |  512   6,0 36,0 21 222362223722347223582234E 01010 | 6,9055 | 337 |
3,4 |  512   6,0 21,6 13 7778A7778A7788A7789B7788E 01010 | 7,6182 | 295 |
3,8 |  512   6,0  7,2  4 CCCDDCCCDDCCDDDCCDDECCDDF 00100 | 8,1760 | 269 |
-------------------------------------------------------------------------
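
The rows between 1,0 and 3,0 look consistent with plain linear interpolation between neighbouring defaults. The following Python sketch reproduces that for three of the parameters only. It is a guess at the idea, not the poster's algorithm: it assumes the three numeric columns after the first are -nts, -skew and -snr (they match the posted -3 setting at 3,0), it ignores the spreading and FFT strings, and the names `ANCHORS` and `morph` are hypothetical.

```python
# Anchor values copied from the 1,0 / 2,0 / 3,0 rows of the table.
# Assumed column meaning: (-nts, -skew, -snr).
ANCHORS = {
    1.0: (-3.0, 30.0, 24.0),
    2.0: (-1.5, 24.0, 18.0),
    3.0: (6.0, 36.0, 21.0),
}


def morph(quality):
    """Linearly interpolate (-nts, -skew, -snr) for 1.0 <= quality <= 3.0."""
    if not 1.0 <= quality <= 3.0:
        raise ValueError("this sketch only covers the range 1.0 to 3.0")
    lower = 1.0 if quality < 2.0 else 2.0
    upper = lower + 1.0
    t = quality - lower  # position between the two neighbouring defaults
    nts, skew, snr = (a + t * (b - a)
                      for a, b in zip(ANCHORS[lower], ANCHORS[upper]))
    # Rounding chosen to match the precision shown in the table.
    return round(nts, 1), round(skew, 1), round(snr)


for q in (1.4, 1.6, 2.4):
    print(q, morph(q))
```

For 1.4, 1.6 and 2.4 this gives the same -nts, -skew and -snr values as the corresponding table rows. Outside 1.0 to 3.0 the table follows a different progression, which this sketch does not attempt to model.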


lossyWAV Development

Reply #522
Well, while we are talking about default settings... these days I've been working, just for the fun of it, on a very simple algorithm....
If you could post / pm / em a copy of the algorithm to me (which language?), I will certainly have a look. This could have interesting possibilities.

[edit] Proposal for -2: -fft 10101 -nts 1.5 -snr 24 -skew 36 -spf 11224-12235-12346-22357-22459; 44.23MB / 498.9kbps for my 53 sample set. [/edit]

[edit] Proposal for -1: -fft 11111 -nts -3.0 -snr 27 -skew 36 -spf 11124-11125-11225-11226-11236; 53.21MB / 600.2kbps for my 53 sample set. [/edit]

lossyWAV Development

Reply #523
My statistics for the 0.4.9 -3 setting: 345/474 kbps on average for my regular/problem sample set which is very fine to me: pretty low bitrate for the regular samples and probably sufficiently high bitrate for the problems which is confirmed also by the listening experience so far.

As for your new proposals for -2 and -1: honestly speaking, I don't like them very much.
Your -2 proposal yields 420/543 kbps on average for my regular/problem samples, and this is not a lot better than the 430/539 kbps on average of the current -2 setting, which doesn't 'suffer' from the somewhat questionable positive -nts setting. I favor a positive -nts value for -2 as much as you do, but when using one I would expect a lower bitrate for regular tracks and/or a higher bitrate for problematic tracks.
With -1 it's 523/601 kbps for regular/problematic tracks, and this too isn't real progress from the 512/585 kbps of the current -1 setting.

I tried a lot of variations to find a hopefully improved -2 and -1 setting as well.
As you do, I also favor a small positive -nts value together with -skew 36 -snr 24 when it comes to -2. I decided on -nts 2, but I really don't care whether it's 1.5 or 2.0.
With the fft setting, however, my approach is different. I want to let the longer FFTs decide on the low frequencies, because only they have good resolution there. This also improves the differentiation between good and problematic spots, which is enhanced by the high skew/snr setting. A spreading length of 1 with the short FFTs, by contrast, tends to be rather counterproductive in this sense. So in principle a 64 and a 1024 sample FFT should do the job, but I'm still a bit worried about the 1024 sample FFT stretching so far beyond the block borders. So I decided to use a 64, 512, and 1024 sample FFT.
I tuned the details and ended up with

-2 -skew 36 -snr 24 -fft 10011 -spf 22235-22236-22347-12359-1236C -nts 2

which yields 395/551 kbps for my regular/problematic tracks.

With -1 I also wanted to use a negative -nts value like you do (more exactly: 0 as the utmost limit).
I found that differentiation between good and bad spots still improves a bit when going to -skew 40, but there's no real improvement in good/bad spot differentiation when using a higher -snr value. Going from -snr 21 to -snr 24 to -snr 27 raised the bitrate by the same amount for the regular as well as the problematic set. Going to -snr 30 was already counterproductive. So I used -snr 21 and decided on -nts -1 (with a larger -snr value I preferred -nts 0).
I added a 128 sample FFT because even for the higher mid frequency range the resolution of the 64 sample FFT is a bit restricted.
So I ended up with

-1 -skew 40 -snr 21 -fft 11011 -spf 22224-22225-11235-11246-12358 -nts -1

which yields 452/576 kbps for my regular/problematic set.
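
As a small illustration of the FFT choices above, the following Python sketch decodes a -fft string into FFT lengths and prints the width of one frequency bin for each. The mapping of the five digits to 64, 128, 256, 512 and 1024 samples is read off this post (10011 described as 64, 512 and 1024; 11011 as adding 128), with 256 assumed for the middle digit. The 44.1 kHz sample rate and the function names are also assumptions.

```python
FFT_LENGTHS = (64, 128, 256, 512, 1024)  # assumed digit order, shortest first


def decode_fft_mask(mask):
    """Return the FFT lengths enabled by a five-digit -fft string like '10011'."""
    if len(mask) != len(FFT_LENGTHS) or set(mask) - {"0", "1"}:
        raise ValueError("expected five characters, each '0' or '1'")
    return [n for n, bit in zip(FFT_LENGTHS, mask) if bit == "1"]


def bin_width_hz(fft_length, sample_rate=44100):
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate / fft_length


for mask in ("10011", "11011"):
    for n in decode_fft_mask(mask):
        print(f"-fft {mask}: {n:4d} samples, {bin_width_hz(n):6.1f} Hz per bin")
```

At 44.1 kHz a 64 sample FFT has bins about 689 Hz wide while a 1024 sample FFT has bins about 43 Hz wide, which is the resolution argument for letting the longer FFTs decide on the low frequencies.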

lossyWAV Development

Reply #524
Your -2 proposal yields 420/543 kbps on average for my regular/problem samples, and this is not a lot better in comparison to the 430/539 kbps on average of the current -2 setting.....
With -1 it's 523/601 kbps for regular/problematic tracks, and this too isn't real progress from the 512/585 kbps of the current -1 setting......
I wasn't trying for a revolutionary change in bitrate, rather a slight evolutionary reduction - I also tried to keep a logical progression in parameters between quality levels (i.e. -nts 6.0, 1.5, -3.0, step -4.5; -snr 21, 24, 27, step 3.0; -skew 36, 36, 36, step 0.0).
-2 -skew 36 -snr 24 -fft 10011 -spf 22235-22236-22347-12359-1236C -nts 2
which yields 395/551 kbps for my regular/problematic tracks.

-1 -skew 40 -snr 21 -fft 11011 -spf 22224-22225-11235-11246-12358 -nts -1
which yields 452/576 kbps for my regular/problematic set.
Personally, I would prefer to keep -skew constant. I think I see where you're coming from with respect to not using spread length=1 at short fft lengths. Time for more iterations....