Topic: lossyWAV Development

lossyWAV Development

Reply #600
Axon,

I share your unease at the way pseudo-psychoacoustics have been arrived at for lossyWAV. I wouldn't put it any stronger than that though. I don't have the time to get involved, and am very grateful to Nick and halb27 for pushing this forward with such enthusiasm.

It seems like 2BDecided's original code had some artifact problems... which makes no sense if it was purely by the book.
The basic algorithm is just "find the noise floor, and quantise at or below it".

The fundamental flaw in my implementation was that it couldn't "see" dips in the noise floor at low frequencies which are audible to human listeners - so it would happily fill them with noise. The "resolution" I used wasn't sufficient for low frequencies. The solution is either to skew the results, or modify the spreading, or both (I haven't taken the time to figure out which is the "right" approach) - the current version does both, to great effect. The reason my original script got away with it most of the time is because there are very few recordings where the noise floor is lowest at low frequencies - normally, the lower limit is at a high frequency, so inaccuracies in estimating it at low frequencies have no effect on the result for most recordings.

There was also a bug in later lossyFLAC MATLAB scripts which caused it to analyse the tail end of the "noise it had just added to the previous block" when assessing the noise floor of the current block. Nick spotted that, and corrected it in his code. I haven't generated a "fixed" MATLAB version.
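
As a rough illustration of the "find the noise floor, and quantise at or below it" idea, the sketch below takes one block of integer samples, finds the quietest FFT bin, and works out how many low-order bits could be zeroed so that the added rounding noise stays at or below that level. It is not the lossyFLAC script or the lossyWAV code: the Hann window, the naive DFT, the white-noise model for the rounding error, and the names `bin_levels_db` and `bits_to_remove` are all assumptions made for this example, and no spreading or low-frequency skew is applied.

```python
import cmath
import math


def bin_levels_db(block):
    """Return (per-bin power in dB, window energy) for one block of samples."""
    n = len(block)
    # Hann window (an assumption; the thread does not say which window is used).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    windowed = [s * w for s, w in zip(block, window)]
    levels = []
    for k in range(1, n // 2):  # skip the DC and Nyquist bins
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        levels.append(10 * math.log10(abs(acc) ** 2 + 1e-12))
    return levels, sum(w * w for w in window)


def bits_to_remove(block, nts_db=0.0):
    """Largest bit count whose rounding noise sits at or below the quietest bin."""
    levels, window_energy = bin_levels_db(block)
    floor_db = min(levels) + nts_db  # a positive shift allows more noise
    # Rounding to a multiple of 2**b adds roughly white noise of variance
    # (2**b)**2 / 12; its expected power in each windowed DFT bin is that
    # variance times the window energy.
    noise_db_at_zero_bits = 10 * math.log10(window_energy / 12)
    bits = math.floor((floor_db - noise_db_at_zero_bits) / (20 * math.log10(2)))
    return max(0, bits)
```

Because only the single quietest bin decides the result, a dip that the analysis cannot resolve (for example at low frequencies with a short FFT) would simply be missed, which is the kind of failure described above.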


The obvious "extras" for lossyWAV are a hybrid/lossless mode (quite possible), and a noise-shaped mode (already implemented, but not released for IP reasons). Finally, it might make sense to delineate between a proper psychoacoustic model (borrow one?) and a non-psychoacoustic implementation (close to now, but tamed a little).


btw Nick, I don't have any objections to you leaving switches in the final release for testing - just hide them well away in the depths of the manual! And please don't feel like you have to respect my wishes or anything - you've well and truly adopted my baby now!

Cheers,
David.

lossyWAV Development

Reply #601
Thanks David, I'll look after her..... As to switches, I agree with the consensus that they should remain, although hidden from the attention of casual users. I would also probably limit the input ranges so that truly awful results can be avoided.

A hybrid / lossless mode is entirely possible, either at the same time as the processing or as a stand-alone program. If I venture down the piping route, it would have to be at the same time.

I corrected the Matlab script as well as my code and posted it as LossyFLAC6_x (I think).

All the best, and thanks again.

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #602
I corrected the Matlab script as well as my code and posted it as LossyFLAC6_x (I think).
Yes, you did, thanks. I didn't get the chance to merge the fix back into what I had.

It would be interesting to put all your tweaks into the noise shaping version, but the wait for (a) time and (b) the IP to expire means I'm looking at, er, sometime after I retire! (I'm currently 30-something!). I think I'll just release what I have when the IP expires and let someone else play with it. It would be so cool to have the option of going from true lossless to virtually lossless to high VBR mp3-like lossy (but with fewer problem samples) in the one codec.

Mind you, you're pretty much there already, without the noise shaping!

Cheers,
David.

lossyWAV Development

Reply #603
Attached File  Spread___Skew.zip ( 7.99k )

It is very hard to see the effect of a parameter change because of the random Log FFT output 

Having read 2Bdecided's comment it might be best to ditch the -0 settings as they emulate a flawed implementation.
In theory, there is no difference between theory and practice. In practice there is.

lossyWAV Development

Reply #604

Attached File  Spread___Skew.zip ( 7.99k )

It is very hard to see the effect of a parameter change because of the random Log FFT output 

Having read 2Bdecided's comment it might be best to ditch the -0 settings as they emulate a flawed implementation.
You could copy the random number column and paste it in place as values to fix it. That would allow you to see the effects more clearly on a static example. Try looking again at the relativities between the two lines for minimum and the two lines for average....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #605
Having read 2Bdecided's comment it might be best to ditch the -0 settings as they emulate a flawed implementation.


I agree to some extent. But perhaps a command-line switch, something like -allowbadsettings, could allow people to use -0 as well as remove the limits on the restricted settings. This would of course be another great option to hide deeeep in the manual.

lossyWAV Development

Reply #606
The -0 setting is no longer required as it can be re-created from the relevant parameters. -clipping, -dither and -allowable will also be removed at the next revision.

I have started the coding for correction files and can now create a WAV file (.lwcdf.WAV : lossyWAV correction data file) of the difference between the source and bit_removed data. It's basically just hiss and compresses less well than the lossy.WAV file.
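
A minimal sketch of the correction-file idea, assuming bit removal is modelled as rounding each sample to a multiple of 2^bits: the correction data is just the sample-by-sample difference, and adding it back restores the source exactly. The function names are hypothetical, and clipping at the 16-bit limits is ignored here.

```python
def remove_bits(samples, bits):
    """Round each sample to the nearest multiple of 2**bits (ties round up)."""
    step = 1 << bits
    return [((s + step // 2) // step) * step for s in samples]


def make_correction(source, lossy):
    """Difference between the source and the bit-removed data."""
    return [s - l for s, l in zip(source, lossy)]


def restore(lossy, correction):
    """Add the correction back to recover the source."""
    return [l + c for l, c in zip(lossy, correction)]


source = [1000, -5, 32, 7]
lossy = remove_bits(source, 2)
correction = make_correction(source, lossy)
```

The correction values are small and noise-like, which fits the observation that the file is "basically just hiss" and compresses poorly.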

There's still a lot to do on the correction file side of things, but it shouldn't be too difficult, just time-consuming.

I'm a bit concerned about the two-file route (one lossy WAV, one lwcdf): if a WAV file is processed more than once, what happens if the wrong correction file is added to the lossy file? Probably something not too good......

@Halb27: I've narrowed down my variations to -3: -snr 18 -skew 36 -nts 6 -spf 22235-22236-22347-22358-22469 -fft 10001 -cbs 512.

This permutation yields 34.77MB / 392.2kbps on my 53 sample set.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #607
I'm a bit concerned as to how, if I go down the route of two WAV files: one lossy; one lwcdf; that if a WAV file is processed more than once, then what happens if the wrong correction file is added to the lossy file? Probably something not too good....

I'd store a checksum of the lossyWAV result in the correction file so you can figure out a wrong combination in the bring-it-back-to-lossless application.
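
The checksum guard could look something like the sketch below. Everything in it is an assumption for illustration: MD5 as the checksum, 16-bit little-endian packing of the samples, a plain dictionary standing in for the correction file, and the names `make_correction_record` and `restore_checked`.

```python
import hashlib
import struct


def pcm_bytes(samples):
    """Pack 16-bit samples as little-endian PCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)


def make_correction_record(source, lossy):
    """Store the difference together with a checksum of the lossy data."""
    return {
        "lossy_md5": hashlib.md5(pcm_bytes(lossy)).hexdigest(),
        "diff": [s - l for s, l in zip(source, lossy)],
    }


def restore_checked(lossy, record):
    """Refuse to combine a correction record with the wrong lossy data."""
    if hashlib.md5(pcm_bytes(lossy)).hexdigest() != record["lossy_md5"]:
        raise ValueError("correction data does not belong to this lossy file")
    return [l + d for l, d in zip(lossy, record["diff"])]
```

With this in place, a lossy file that has been processed a second time no longer matches the stored checksum, so the mismatch is reported instead of producing a wrong "lossless" result.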

Other than that I'm having a hard time with listening tests resulting from your -snr -215 approach.
I easily found that there's no magic with negative snr values: for my sample sets -snr -215/-100/-10/0 all gave the same average bitrate, and the result of -snr 10 was close by. So it's just the same machinery as with positive snr values: modifying the FFT min if the snr offset from the FFT average is lower. With -snr -215 or similar there's simply no modification of the FFT min, and -snr -215 simply works as if there was no snr machinery at all.

-3 -snr -215 yields 313/430 kbps with my regular/problem samples set. While this is welcome with regular tracks, it looks a bit low with the problem samples.
I listened to it (to get used to the problems I started with -nts 16), and I added more problem samples. The result wasn't good with badvilbel, bibilolo, bruhns, dithernoise_test, eig, furious, keys_1644ds, utb: there are clearly audible artifacts/distortions. Admittedly, that was with an insane setting of -nts 16 as a warm-up.
Using -nts 9 and -nts 6 improves things a lot: the distortion-like noise is gone and I'd even call the results 'acceptable', but I can still ABX furious, dithernoise_test, keys, utb, and badvilbel.

My usual approach for improving is to bring bitrate up for the problem set but to a minor degree with the regular set. From current -3 setting and previous experience I know a '1' instead of the '2' for the first frequency zone of the 1024 sample FFT should do the job. It does, but only for the statistics, my listening experience yielded pretty much the same not totally satisfying quality.

That's my current state. The interesting question is: if -2 -snr -215 is a bit poor for some problem samples, what is the most effective way to improve it? Maybe a higher -skew value will do it; maybe the most basic control of the entire machinery, a lower -nts value (which would match the idea of going back a bit towards the pure basics); or maybe the snr machinery really does play an essential part in preserving quality (after all, the current -3 quality is very good). Quite interesting questions, but the answers will take some time.

And of course I'll try your new suggestion for the -3 setting.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #608
I'd store a checksum of the lossyWAV result in the correction file so you can figure out a wrong combination in the bring-it-back-to-lossless application.
Not sure how I will achieve this inside a WAV file....
Other than that I'm having a hard time with listening tests resulting from your -snr -215 approach.
I easily found that there's no magic with negative snr values: for my sample sets -snr -215/-100/-10/0 all gave the same average bitrate, and the result of -snr 10 was close by. So it's just the same machinery as with positive snr values: modifying the FFT min if the snr offset from the FFT average is lower. With -snr -215 or similar there's simply no modification of the FFT min, and -snr -215 simply works as if there was no snr machinery at all.
That was exactly the point, to be able to switch off the -snr setting.
-3 -snr -215 yields 313/430 kbps with my regular/problem samples set. While this is welcome with regular tracks, it looks a bit low with the problem samples.
I listened to it (to get used to the problems I started with -nts 16), and I added more problem samples. The result wasn't good with badvilbel, bibilolo, bruhns, dithernoise_test, eig, furious, keys_1644ds, utb: there are clearly audible artifacts/distortions. Admittedly, that was with an insane setting of -nts 16 as a warm-up.
Using -nts 9 and -nts 6 improves things a lot: the distortion-like noise is gone and I'd even call the results 'acceptable', but I can still ABX furious, dithernoise_test, keys, utb, and badvilbel.

My usual approach for improving is to bring bitrate up for the problem set but to a minor degree with the regular set. From current -3 setting and previous experience I know a '1' instead of the '2' for the first frequency zone of the 1024 sample FFT should do the job. It does, but only for the statistics, my listening experience yielded pretty much the same not totally satisfying quality.

That's my current state. The interesting question is: if -2 -snr -215 is a bit poor for some problem samples, what is the most effective way to improve it? Maybe a higher -skew value will do it; maybe the most basic control of the entire machinery, a lower -nts value (which would match the idea of going back a bit towards the pure basics); or maybe the snr machinery really does play an essential part in preserving quality (after all, the current -3 quality is very good). Quite interesting questions, but the answers will take some time.

And of course I'll try your new suggestion for the -3 setting.
I've come to the realisation that the -snr setting is what (along with -skew and -nts) makes -3 so acceptable. Before -snr, we didn't have a way to stop minimum values which were close to the average introducing noise close to the average. Now we do - if we set -snr to 21 then we will never add noise above the average -21dB level.
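
Read literally, the -snr behaviour described here amounts to capping the FFT minimum at (average minus snr). The one-liner below is a sketch of that reading only; the function name and the dB example values are made up for illustration, and the real code may differ in detail.

```python
def effective_minimum(fft_min_db, fft_avg_db, snr_db):
    """Use the FFT minimum unless (average - snr) is lower still."""
    return min(fft_min_db, fft_avg_db - snr_db)


# Minimum close to the average: -snr 21 pulls the working level down.
close = effective_minimum(-40.0, -30.0, 21.0)    # -51.0
# Minimum already far below the average: -snr 21 changes nothing.
deep = effective_minimum(-60.0, -30.0, 21.0)     # -60.0
# -snr -215 never wins the min(), so the machinery is effectively off.
off = effective_minimum(-40.0, -30.0, -215.0)    # -40.0
```

This also matches the earlier observation that -snr -215, -100, -10 and 0 all give the same bitrate: for any value at or below zero the cap is at or above the average, which the minimum can never exceed.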

I think that -3 is close to finished - I await your listening results with anticipation!

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #609

I'd store a checksum of the lossyWAV result in the correction file so you can figure out a wrong combination in the bring-it-back-to-lossless application.
Not sure how I will achieve this inside a WAV file....

Depends on the overall procedure. I guess you want to compress the correction file (though the compression ratio may be small - which just says that lossyWAV is working efficiently), so the final representation of the correction file won't have a WAV format. If you compress by your own method you can take care of the checksum easily, and if you use FLAC or similar, you can use tags to store the checksum of the lossyWAV result.

I've finished my investigations on -3.
First I wave edited all the old and new serious problem samples so that they consist only of the problematic spots. This way I hope to get more meaningful statistics. With current -3 the average bitrate of this problem essence is 464 kbps.
When using -3 -snr -215 I got good, but not perfect results qualitywise, and I tried already without success to increase quality by using a spreading length of 1 for the lowest frequency zone of the 1024 sample FFT.
Next I tried to improve by using a higher -skew value. But this also doesn't bring the solution: using -3 -snr -215 -skew 44 yields an average bitrate of 422 kbps for my problem essence which is too low.
Next I lowered the -nts value, and -3 -snr -215 -nts 3 yields a bitrate of 444 kbps for my problem essence. I listened to it and was content with the result though to me it's a bit much on the cutting edge as my furious result was 7/10 and I also have the suspicion that utb isn't perfect though my ABX results don't back this up. With my regular sample set the average bitrate is 344 kbps which is nearly identical to the 345 kbps of current -3. Qualitywise the current -3 setting is more secure IMO, so I prefer it.
Then I used your new -3 proposal, but with the -spf value of current -3, that is I used -3 -snr 18, and the statistics are 331 kbps for my regular set and 445 kbps for my problem essence. Listening to the problem samples showed that nearly everything is fine to me, with the exception of dithernoise_test, which was easy to ABX 10/10 due to one spot where the noise-like sound suddenly changes in the lossyWAV result, unlike in the original. With utb I again have the suspicion that it's not totally correct, though I couldn't ABX it and thus may be wrong.
Finally I tried your very new -3 proposal -3 -snr 18 -spf 22235-22236-22347-22358-22469. dithernoise_test is better now, it was harder for me to abx, and I arrived at 8/10. For utb my suspicion for being not perfect is gone.

So your new proposal is within the quality demand, which to me is fine for -3 though it's on the cutting edge. But that's just my listening with my old ears to not very many samples (considered to be extraordinarily problematic though). The average bitrate for my regular sample set is 335 kbps, which is only 10 kbps lower than that of current -3. The average bitrate of the problem essence, however, is 446 kbps, and that's 18 kbps less than that of current -3.
So we lose a lot more kbps in the problem area where a higher degree of kbps is wanted than we gain in the regular area. Once sensitive for especially dithernoise_test I tested it again with current -3, and everything is fine to me. As is utb.

So in the end IMO we should stick with current -3. An average bitrate of ~350 kbps for regular music is very good I think, and it seems we can't do essentially better with our weaponry without sacrificing safety margin to a considerable extent.
What the investigation has shown is that -snr has its own specific part in preserving quality. It's not just an amplification of the merits of the -skew option.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #610
So in the end IMO we should stick with current -3. An average bitrate of ~350 kbps for regular music is very good I think, and it seems we can't do essentially better with our weaponry without sacrificing safety margin to a considerable extent.
What the investigation has shown is that -snr has its own specific part in preserving quality. It's not just an amplification of the merits of the -skew option.
Thank you very much my friend for spending a lot of time on settings validation. I was nearly at the same conclusion when you posted. Therefore, -3 is fixed - permanently (unless we find a particularly awkward sample......).

I am tidying up the code and removing redundant parameters. Will post beta v0.5.5 tonight or tomorrow.

Thanks again.

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #611
So your new proposal is within the quality demand which to me is fine for -3 though it's on the cutting edge.

Isn't that exactly where -3 should be? And -2 being "transparent as far as could be determined"?

Quote
The average bitrate for my regular sample set is 335 kbps which is only 10 kbps lower than that of current -3. Average bitrate however of the problem essence is 446 kbps, and that's 18 kbps less than that of current -3.

3% to 4% extra compression is something lossless codecs would have to work very hard for, so nothing to give away easily, except for a reason of course.
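
For reference, the "3% to 4%" figure follows from the bitrates quoted above: 345 kbps down to 335 kbps for the regular set and 464 kbps down to 446 kbps for the problem set. A quick check (the helper name is just for this example):

```python
def percent_saving(old_kbps, new_kbps):
    """Bitrate reduction as a percentage of the old bitrate."""
    return 100.0 * (old_kbps - new_kbps) / old_kbps


regular = percent_saving(345, 335)   # regular sample set
problem = percent_saving(464, 446)   # problem sample set
print(f"regular set: {regular:.1f}%")   # regular set: 2.9%
print(f"problem set: {problem:.1f}%")   # problem set: 3.9%
```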

Thanks, for your testing and observations.
In theory, there is no difference between theory and practice. In practice there is.

lossyWAV Development

Reply #612
So your new proposal is within the quality demand which to me is fine for -3 though it's on the cutting edge.
Isn't that exactly where -3 should be? And -2 being "transparent as far as could be determined"?
Quote
The average bitrate for my regular sample set is 335 kbps which is only 10 kbps lower than that of current -3. Average bitrate however of the problem essence is 446 kbps, and that's 18 kbps less than that of current -3.
3% to 4% extra compression is something lossless codecs would have to work very hard for, so nothing to give away easily, except for a reason of course.

Thanks, for your testing and observations.
I take on board what you're saying, but I agree with Halb27 that we're aiming for transparency at -3 with increasing resilience at -2 and -1. The initial aim of the process was to "slightly" reduce bitrate - what we have currently with -3 is significant reduction using the interplay of -nts, -skew and -snr. Maybe -3 -snr 18 -nts 7.5 would produce adequate results, maybe not. However, while there's only really Halb27 doing the ABX'ing, I will unconditionally accept his opinion.

Anyway,

lossyWAV beta v0.5.5 attached: Superseded.

-allowable, -dither, -clipping and -overlap removed;

Reference_threshold values used to determine bits_to_remove from calculated minimum_value have been re-calculated. Very slight increase in bitrate (406.9 v0.5.4 vs 407.3 v0.5.5 for my 53 sample set).

Code tidied.

Code:
lossyWAV beta v0.5.5 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            extreme settings [4xFFT] (-cbs 512 -nts -2.0 -skew 36 -snr 21
              -spf 22224-22225-11235-11246-12358 -fft 11011)
-2            default settings [3xFFT] (-cbs 512 -nts +1.5 -skew 36 -snr 21
              -spf 22224-22235-22346-12347-12358 -fft 10101)
-3            compact settings [2xFFT] (-cbs 512 -nts +6.0 -skew 36 -snr 21
              -spf 22235-22236-22347-22358-2246C -fft 10001)

Standard Options:

-o <folder>   destination folder for the output file
-nts <n>      set noise_threshold_shift to n dB (-48.0dB<=n<=+48.0dB)
              (-ve values reduce bits to remove, +ve values increase)
-force        forcibly over-write output file if it exists; default=off

Codec Specific Options:

-wmalsl       optimise internal settings for WMA Lossless codec; default=off

Advanced / System Options:

-snr <n>      set minimum average signal to added noise ratio to n dB;
              (-215.0dB<=n<=48.0dB) Increasing value reduces bits to remove.
-skew <n>     skew fft analysis results by n dB (0.0db<=n<=48.0db) in the
              frequency range 20Hz to 3.45kHz
-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 22235-22236-22347-22358-2246C (Characters must be one of
              1 to 9 and A to F (zero excluded).
-fft <5xbin>  select fft lengths to use in analysis, using binary switching,
              from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)

-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

David Robinson for the method itself and motivation to implement it in Delphi.
Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
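
To make the -fft, -spf and -cbs formats in the help text above concrete, here is a small parsing sketch. It only follows what the help text states (five binary switches, five groups of five hex characters with zero excluded, block size between 512 and 4608 in multiples of 32); the function names and error messages are hypothetical and this is not how lossyWAV itself parses its command line.

```python
FFT_LENGTHS = (64, 128, 256, 512, 1024)


def parse_fft(switches):
    """'01001' -> [128, 1024]: one binary switch per FFT length."""
    if len(switches) != 5 or set(switches) - {"0", "1"}:
        raise ValueError("-fft expects five binary digits, e.g. 01001")
    return [n for n, s in zip(FFT_LENGTHS, switches) if s == "1"]


def parse_spf(text):
    """Map each FFT length to its five spreading values (hex 1..F)."""
    groups = text.split("-")
    if len(groups) != 5 or any(len(g) != 5 for g in groups):
        raise ValueError("-spf expects 5 groups of 5 characters")
    table = {}
    for n, group in zip(FFT_LENGTHS, groups):
        values = [int(c, 16) for c in group]  # ValueError on non-hex input
        if 0 in values:
            raise ValueError("-spf characters must be 1-9 or A-F")
        table[n] = values
    return table


def check_cbs(n):
    """Codec block size: 512 <= n <= 4608 and a multiple of 32."""
    if not (512 <= n <= 4608 and n % 32 == 0):
        raise ValueError("-cbs out of range")
    return n
```

For example, the -3 preset's `-fft 10001` selects the 64 and 1024 sample FFTs, and the last group of its -spf string, `2246C`, gives the 1024 sample FFT the values 2, 2, 4, 6 and 12.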
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #613
... 3% to 4% extra compression is something lossless codecs would have to work very hard for, so nothing to give away easily, except for a reason of course. ...

Please keep in mind that we did not have a big amount of testing so far, and I did abx dithernoise_test 8/10 with my 58 year old ears for this 10 kbps saving setting. We are not in the situation of lossless codecs where lossless is lossless after all, but also with -3 IMO we should be pretty safe qualitywise, cause otherwise there's no good distinction from mp3 etc. If after years of lossyWAV usage a sample should come up which isn't totally transparent but has a negligible issue this is an acceptable situation for -3 IMO, but we should take some care not to be in this situation at the very lossyWAV start. Not for the advantage of just having an average bitrate of 340 kbps instead of 350.

As -nts is an official option you can easily save some kbps by increasing the -nts value as Nick said if you prefer to be a little bit adventurous.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #614
newer version of lFLCDrop, check the last page(s)

lossyWAV Development

Reply #615
It's competition time for all the graphically creative users out there.... As the wiki is now up and running (many thanks to Mitch 1 2!), complete with Foobar2000 converter settings, I/we need an icon for lossyWAV.

Answers on the back of used large denomination currency of your choice to: this thread.....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #616
It's not very important, but those who use lossyWAV together with FLAC may find this useful:

Synthetic Soul found already that FLAC -5 yields nearly the same file size as -8. I can confirm and extend this:

For FLAC used in our context, in many respects it makes nearly no difference whether we use -8, -5, or -3.
What's important to many tracks is the -m parameter (defaulted with -8 and -5, but not with -3).
To a small degree also the -e parameter makes a difference (defaulted with -8, but not with -5 and -3).

So -8, -5 -e, or -3 -m -e all yield an identical file size in a practical sense (at least with my test set), and -3 -m -e is the fastest encoding procedure among these.
If you allow for another option -3 -m -e -r 2 speeds things up a bit more while not really sacrificing file size (with my test set).
Dropping -e speeds things up further. File size increases a bit more noticeably than with the -m -e variants, but to most users it's probably still negligible. Use -3 -m for an amazing speed (together with -r 2 if you like), or -5. File size for -3 -m and -5 usually is identical in a practical sense.

Keep in mind though that with these speed settings overall encoding time is dominated by lossyWAV. So it may not be wise to hunt for the ultimate FLAC speed.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #617
I can also confirm it. The increase in speed justifies the consistently negligible (<1kbps) increase in bitrate.
lossyFLAC (lossyWAV -q 0; FLAC -b 512 -e)

lossyWAV Development

Reply #618
I can also confirm it. The increase in speed justifies the consistently negligible (<1kbps) increase in bitrate.
Good find Halb27, thanks for the confirmation Mitch 1 2! So, for those on a time budget, flac -3 -e -m -r 2 -b 512 is the way to go.....

On my 53 sample set, this increases the bitrate from 407.5 (-8) to 413.4 (-3 -e -r 2 -m -b 512) - Fast though......
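
For anyone scripting this, the two FLAC invocations being compared can be built as argument lists like so. This is only a convenience sketch: the helper name and the input file name are made up, and the command is assembled but not run.

```python
def flac_args(wav_path, fast=True):
    """Build a flac command line for a lossyWAV-processed file."""
    if fast:
        # Fast variant from the thread: -3 with mid-side (-m), exhaustive
        # model search (-e) and a maximum rice partition order of 2 (-r 2).
        options = ["-3", "-m", "-e", "-r", "2"]
    else:
        options = ["-8"]
    # -b 512 matches the 512 sample codec block size used by lossyWAV.
    return ["flac", *options, "-b", "512", wav_path]


fast_cmd = flac_args("track.lossy.wav")
slow_cmd = flac_args("track.lossy.wav", fast=False)
```

Either list can be handed to `subprocess.run(...)` if flac is on the path.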
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #619
...On my 53 sample set, this increases the bitrate from 407.5 (-8) to 413.4 (-3 -e -r 2 -m -b 512) - Fast though......

That's not negligible to me, but I hope that's due to the nature of your more or less problematic snippets set (guess that's still your 53 sample set). With full sized regular music as Mitch_1_2 said I expect the difference to be <1 kbps on average.

If somebody finds that on a real life sample set of several full length tracks difference is > 1 kbps please let us know. For getting the precise difference we can look at the total size of the files under consideration. I expect difference to be ~0.1%.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #620
...On my 53 sample set, this increases the bitrate from 407.5 (-8) to 413.4 (-3 -e -r 2 -m -b 512) - Fast though......
That's not negligible to me, but I hope that's due to the nature of your more or less problematic snippets set (guess that's still your 53 sample set). With full sized regular music as Mitch_1_2 said I expect the difference to be <1 kbps on average.

If somebody finds that on a real life sample set of several full length tracks difference is > 1 kbps please let us know. For getting the precise difference we can look at the total size of the files under consideration. I expect difference to be ~0.1%.
I'll run a "real-world" conversion test - the same as my previous set, about 10 albums - and revert.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #621
...
Jean Michel Jarre - Oxygene         / 773kbps / 454kbps / 372kbps / 377kbps
...So, overall an average of 850kbps / 410kbps / 350kbps / 351kbps

Thanks for your test, Nick. So in an overall sense FLAC -3 -m -e -r 2 is fine IMO, though it's quite interesting that with an album like Oxygene things aren't totally satisfying.
Do you mind trying FLAC -3 -m -e -r 3 and FLAC -3 -m -e on Oxygene?
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #622

...
Jean Michel Jarre - Oxygene         / 773kbps / 454kbps / 372kbps / 377kbps
...So, overall an average of 850kbps / 410kbps / 350kbps / 351kbps

Thanks for your test, Nick. So in an overall sense FLAC -3 -m -e -r 2 is fine IMO, though it's quite interesting that with an album like Oxygene things aren't totally satisfying.
Do you mind trying FLAC -3 -m -e -r 3 and FLAC -3 -m -e on Oxygene?
Apologies: using the revised calculated constants for Reference_Threshold in beta v0.5.5, Oxygene has increased to 372kbps, with a 5kbps increase when using -3 -e -m -r 2 -b 512. I forgot I did the last comparison using a previous version. I will do it again with vanilla -3 / -8.

Columns: Artist - Album / FLAC / lossyFLAC -2 / lossyFLAC -3 / lossyFLAC -3 encoded with FLAC -3 -e -m -r 2 -b 512

Code:
AC/DC - Dirty Deeds Done Dirt Cheap    / 781kbps / 398kbps / 331kbps / 332kbps
B52's - Good Stuff                     / 993kbps / 408kbps / 361kbps / 362kbps
David Byrne - Uh-Oh                    / 937kbps / 398kbps / 344kbps / 345kbps
Fish - Songs From The Mirror           / 854kbps / 384kbps / 336kbps / 336kbps
Gerry Rafferty - City To City          / 802kbps / 400kbps / 338kbps / 338kbps
Iron Maiden - Can I Play With Madness  / 784kbps / 422kbps / 371kbps / 372kbps
Jean Michel Jarre - Oxygene            / 773kbps / 454kbps / 372kbps / 377kbps
Marillion - The Thieving Magpie        / 790kbps / 404kbps / 344kbps / 344kbps
Mike Oldfield - Tr3s Lunas             / 848kbps / 421kbps / 365kbps / 366kbps
Scorpions - Best Of Rockers N' Ballads / 922kbps / 421kbps / 354kbps / 354kbps

So, overall an average of 850kbps / 410kbps / 351kbps / 351kbps
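
As a cross-check, the plain per-album means of the four columns can be computed from the table. They come out close to, but not exactly at, the overall figures posted, which were presumably weighted by album length or total file size (that weighting is an assumption; the post does not say).

```python
# kbps per album: FLAC, lossyFLAC -2, lossyFLAC -3, lossyFLAC -3 (fast FLAC)
albums = {
    "AC/DC - Dirty Deeds Done Dirt Cheap": (781, 398, 331, 332),
    "B52's - Good Stuff": (993, 408, 361, 362),
    "David Byrne - Uh-Oh": (937, 398, 344, 345),
    "Fish - Songs From The Mirror": (854, 384, 336, 336),
    "Gerry Rafferty - City To City": (802, 400, 338, 338),
    "Iron Maiden - Can I Play With Madness": (784, 422, 371, 372),
    "Jean Michel Jarre - Oxygene": (773, 454, 372, 377),
    "Marillion - The Thieving Magpie": (790, 404, 344, 344),
    "Mike Oldfield - Tr3s Lunas": (848, 421, 365, 366),
    "Scorpions - Best Of Rockers N' Ballads": (922, 421, 354, 354),
}

columns = list(zip(*albums.values()))
means = [sum(col) / len(col) for col in columns]
print([round(m, 1) for m in means])  # [848.4, 411.0, 351.6, 352.6]
```

On this unweighted basis the fast FLAC settings cost about 1 kbps on average, with Oxygene (5 kbps) as the only album showing more than a 1 kbps difference.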

I'm not worried about one spurious result - Oxygene, after all, is a fairly specific type of music.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #623
...
So, overall an average of 850kbps / 410kbps / 351kbps / 351kbps

This matches perfectly my experience with -3 -e -m -r 2 -b 512 as well as that of Mitch 1 2 as of his post.
You're right: we shouldn't care too much about specific music, especially as the result isn't extraordinarily bad.

Thanks again for your test.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #624
Having listened to the comments on noise shaping, I had a look on wikipedia and found the basic principles.

As I already have a mechanism to store the difference between the original sample and the bit_removed sample, I have some of a noise shaping algorithm already in place.

The coefficients have so far eluded me.

One simple possibility that springs to mind is to start with zero at the codec block / channel start and then add the first difference then divide by two. Then add the next difference and divide by two. And so on.
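
Taken literally, that recipe is the recurrence below: a running value that starts at zero for each codec block / channel and is averaged with each new difference. It behaves as a simple first-order low-pass of the differences with a coefficient of 0.5. How the running value would then be fed back into the bit-removal step is not described here, so the sketch stops at the recurrence; the function name is hypothetical.

```python
def running_half_average(differences):
    """state = (state + difference) / 2, starting from zero at block start."""
    state = 0.0
    history = []
    for d in differences:
        state = (state + d) / 2.0
        history.append(state)
    return history


# A single difference of 4 followed by zeros decays by half each sample.
decay = running_half_average([4, 0, 0])   # [2.0, 1.0, 0.5]
```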

We'll see how it sounds.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)