Topic: lossyWAV Development

lossyWAV Development

Reply #476
Thanks for the responses guys..... It seems to fail on some AMD and older Intel CPUs.

Sometimes no bits will be removed - that's the beauty of David's method - nothing is removed if it is not safe to do so.

No SSE / SSE2 instructions used, only 80x87 FPU instructions. I will try to revert to v0.4.1 with the functionality of v0.4.3 and attach.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #477
I will try to revert to v0.4.1 with the functionality of v0.4.3 and attach.
lossyWAV alpha v0.4.3e attached: Superseded.

Spread and Remove_Bits procedures have been rolled back to v0.4.1;

"-fft " parameter functionality remains.

Where's the smiley for "fingers-crossed" when you want it....?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #478
Yeah, it works.
Thank you.

BTW: From my personal experience with performance optimization, the most important thing is to have a good and adequate software architecture. Low-level optimization is often important only in isolated spots.
Sure this needn't necessarily apply to lossyWav.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #479
Yeah, it works.
Thank you.

BTW: From my personal experience with performance optimization, the most important thing is to have a good and adequate software architecture. Low-level optimization is often important only in isolated spots.
Sure this needn't necessarily apply to lossyWav.
I *was* only optimising the most frequently called procedures / functions. FFT, Spread and Remove_Bits are the functional core of the whole method. With all three converted to assembler I got an extra 10% speed compared to just FFT.

However, thankfully it is now working, and close to optimal speed. Have fun with your testing / transcoding!
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #480
lossyWAV 0.4.2 has no issues with my laptop's AMD Mobile Sempron 3000+ (32-bit) CPU, except it crashes when the specified output folder doesn't exist.


This problem still hasn't been fixed.
lossyFLAC (lossyWAV -q 0; FLAC -b 512 -e)

lossyWAV Development

Reply #481
... and close to optimal speed. ...

Yes, I'm very pleased by the speed.

ADDED:

As a result of the new possibilities of 5 fft lengths:

lossyWav -2 -cbs 512 -nts -1.0 -fft 11111 -spf 11235-11236-11336-12348-1234D    (my favorite for going productive)

followed by FLAC --best -e -f -b 512

yields 438 kbps for my regular set and 546 kbps for my problem set. I'm very pleased with this ratio.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #482
It works now.

lossyWAV Development

Reply #483
lossyWAV 0.4.2 has no issues with my laptop's AMD Mobile Sempron 3000+ (32-bit) CPU, except it crashes when the specified output folder doesn't exist.
This problem still hasn't been fixed.
Sorry Mitch, I will endeavour to fix it for the next revision. Thanks for the feedback people!

[edit]Thinking about the crashing - maybe it was an infinite loop....... Much investigation to come.[/edit]

[edit2] Some moron was using the FISTTP to store a truncated real to a mem32 integer....  .... which is apparently an SSE3 instruction. I will rework the routines to avoid using this instruction and re-attach as alpha v0.4.3f (hopefully with the output directory crashing bug rectified). [/edit2]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #484
lossyWAV alpha v0.4.4 attached: Superseded.

Use of FISTTP instruction (SSE3!) eradicated; Thanks for the pointer [JAZ] - I found it very quickly when I googled "80x87 instruction set" and FISTTP isn't on the list........

Spread and Remove_Bits procedures now assembler (again....);

Now checks for access to output directory if specified.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #485
I encoded part of my collection using v0.4.4 without any problem, and according to my listening experience so far everything is very fine.
I used a variant of -2 which made me think more deeply afterwards about what's really important.

I'd like to suggest a discussion on two points concerning default behavior:

1)
I would welcome - as I said before - a general default cbs of 512 samples. On one hand this will make most lossless codecs behave more efficiently, and on the other hand I can't see a logical reason not to use it. If it's about holding the average bitrate up for defensive reasons, we should use a more direct approach aimed at overcoming potential weaknesses.

2)
With -2 I suggest to use an additional 128 sample FFT, to be precise I'd like to see a default behavior according to -fft 11101 -spf 11235-11236-11336-FFFFF-1234D.
The 64 sample FFT yields only a few bins in the low and lower mid frequency range, so it is welcome IMO to have another rather short FFT which significantly improves the situation in the important lower mid frequency range.
So I think it's a meaningful addition to use a 128 sample FFT.
Moreover it doesn't really hurt as lossyWav is very fast now, and the increase in average bitrate is very low.
With -1 btw (not much in my focus) I suggest to use the full 5 analyses.
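
To put rough numbers on point 2), here is a small sketch of how coarse the short FFTs are at low frequencies. It assumes 44.1 kHz input (the sample rate is not stated above) and the helper names are made up for illustration; it is not taken from lossyWAV itself.

```python
# Assumption: 44.1 kHz sample rate (CD audio); the post does not state it.
SAMPLE_RATE_HZ = 44100


def bin_spacing_hz(fft_length, sample_rate_hz=SAMPLE_RATE_HZ):
    """Distance in Hz between adjacent FFT bins."""
    return sample_rate_hz / fft_length


def bins_below(freq_hz, fft_length, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of FFT bins (excluding DC) centred at or below freq_hz."""
    return int(freq_hz // bin_spacing_hz(fft_length, sample_rate_hz))


if __name__ == "__main__":
    for n in (64, 128, 1024):
        print(n, round(bin_spacing_hz(n), 1), bins_below(2000, n))
```

Under that assumption the 64 sample FFT has only 2 bins at or below 2 kHz and the 128 sample FFT has 5, which is the kind of improvement in the lower mid range meant above.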

What do you think?
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #486
I encoded part of my collection using v0.4.4 without any problem, and according to my listening experience so far everything is very fine.
I used a variant of -2 which made me think more deeply afterwards about what's really important.

I'd like to suggest a discussion on two points concerning default behavior:

1)
I would welcome - as I said before - a general default cbs of 512 samples. On one hand this will make most lossless codecs behave more efficiently, and on the other hand I can't see a logical reason not to use it. If it's about holding the average bitrate up for defensive reasons, we should use a more direct approach aimed at overcoming potential weaknesses.

2)
With -2 I suggest to use an additional 128 sample FFT, to be precise I'd like to see a default behavior according to -fft 11101 -spf 11235-11236-11336-FFFFF-1234D.
The 64 sample FFT yields only a few bins in the low and lower mid frequency range, so it is welcome IMO to have another rather short FFT which significantly improves the situation in the important lower mid frequency range.
So I think it's a meaningful addition to use a 128 sample FFT.
Moreover it doesn't really hurt as lossyWav is very fast now, and the increase in average bitrate is very low.
With -1 btw (not much in my focus) I suggest to use the full 5 analyses.

What do you think?
Sounds entirely reasonable. I have no problem with a 512 sample codec_block_size. I will implement the changes to the -2 and -1 quality levels.

On another topic, do we *really* need a -dither option - I have no problems with the quality of the output? Similarly, the -clipping option to switch off the iterative clipping reduction method also seems redundant. This would increase throughput a bit which would in turn offset the increased processing time due to the extra analyses.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #487
I personally don't see a real reason for the -dither option.
But as it's not defaulted I don't care much about it. You created a good separation between standard options and advanced options, and -dither is well situated in the advanced options IMO.
Good reasons for eventually saying good bye to the -dither option are IMO
- if you should run into trouble with your software architecture keeping up the -dither option (guess you won't) when at the same time nobody seems to use -dither.
- if it comes to cleaning up all the advanced options - but as they're separated well into 'advanced options' there's no real need for such a cleaning procedure IMO. Sure the time may come where these things may be thought of being obsolete.

As we're talking about default behavior: what about -3?
I see two targets for -3:

a) -3 as a minor variant of -2, expected to be excellent under all circumstances as we expect it from -2, but with a detail behavior which is not as defensive as is -2. Your choice of using the same -spf values as that of -2 points in this direction. If we want to have it like this I suggest we increase the -skew value a bit.

b) as a seriously less defensive alternative to -2, targeting a larger average bitrate gap than what we have at the moment. To be more precise: if -2 yields say ~440 kbps on average, -3 should yield ~400 kbps. I guess it's achievable while still getting excellent quality. Maybe an even larger gap makes sense, as long as we are aware that quality may be sacrificed on hopefully rare occasions.
For b) the default setting should change quite a lot IMO.
Having extremely good encoding speed (like with your doing just 1 FFT) as a target fits rather well into this framework.

I personally don't have a favorite for a) or b).
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #488
I totally forgot about the -clipping option.
If there wasn't David Bryant's remark about wavPack being able to make use of the MSBs being 1 I would easily say -clipping makes no sense. It looks like the 'iterative' anti-clipping strategy does not only preserve quality but also doesn't impact efficiency in a global sense.
David Bryant brought this wavPack feature back to mind recently so I think it's not so simple to drop the -clipping option (keeping in mind it was David Bryant who brought us the idea of taking care of the critical bands, and I think this idea was one of the major improvements in the progress of lossyWav).
My personal feeling however is that, as the 'iterative' anti-clipping strategy doesn't have a negative impact on efficiency in a global sense, wavPack won't benefit significantly from letting clipping happen. Moreover, even if it did, it would do so only because clipping is allowed to occur. But I'd like to see David Bryant comment on this. Maybe I understand this wavPack feature totally wrong.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #489
As we're talking about default behavior: what about -3?
I see two targets for -3:

a) -3 as a minor variant of -2, expected to be excellent under all circumstances as we expect it from -2, but with a detail behavior which is not as defensive as is -2. Your choice of using the same -spf values as that of -2 points in this direction. If we want to have it like this I suggest we increase the -skew value a bit.

b) as a seriously less defensive alternative to -2, targeting a larger average bitrate gap than what we have at the moment. To be more precise: if -2 yields say ~440 kbps on average, -3 should yield ~400 kbps. I guess it's achievable while still getting excellent quality. Maybe an even larger gap makes sense, as long as we are aware that quality may be sacrificed on hopefully rare occasions.
For b) the default setting should change quite a lot IMO.
Having extremely good encoding speed (like with your doing just 1 FFT) as a target fits rather well into this framework.

I personally don't have a favorite for a) or b).
My preference would be for b). Thinking about it, if at the end of the day the only options were -1, -2, -3, -nts and -fft; with -skew, -snr & -spf fixed according to the quality settings, then the user could decide how aggressive the processing was by using -fft and -nts alongside the -1, -2 or -3 quality setting.

On the other hand, maybe all of the analyses should use the same -skew, -snr and -spf values?

However, taking David's preference for only 4 command line options (-1, -2, -3 & -nts) then *maybe* other parameters should only be available when using the -3 quality option. The thinking being: "I've already accepted that I want reduced quality by selecting quality level -3, so the program will now let me foul it up myself rather than using presets....."

On -dither and -clipping: judging by how the undithered output sounds, and given that the process never reduces amplitude, -dither seems to be expendable. Similarly, the iterative approach used in the current clipping prevention method has little impact on bitrate, so the -clipping parameter also seems to be expendable.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #490
Target b) for -3: OK, so we should think about the details.

Identical -spf and -skew values for all of the three quality levels? I don't like the idea.

From my tests when finding useful values for -spf I know some values really hurt bitrate efficiency (most of all the bold 1 in '11124' for the 64 sample FFT of -1) but may be vital for being really defensive with respect to the critical band at the lower edge of the corresponding frequency range. So I think it's necessary for -1 (and would be most welcome for -2 too, but it's expensive and the more economic way of treating this within -2 may be by doing the additional 128 sample FFT).

With -skew it's similar. -skew is important for differentiating the resulting bitrate between regular and problematic spots, but with a value >24 the improved defensiveness gets more and more expensive. So I think a value of 24 is very appropriate for -2, but it should be significantly higher only for -1. For -3 it should be <24.

Using very high values for -snr helps differentiating between regular and problematic spots too but with these values there's a rather high price to pay bitrate wise. So again high values of -snr should be used with -1 only IMO.

So I strongly think -1, -2, and -3 should consist of different -fft, -spf, -skew, -snr, and -nts settings in such a way that the overkill defensiveness, standard defensiveness, reduced defensiveness are represented best.

If you want to keep -1 and -2 clean of user options I suggest you do it for -3 as well, and instead create an experimental quality option -x which enables all the advanced options. advanced options = any option except for -1, -2, -3, -nts x (and -flac etc. in case these are ever needed - guess they won't).
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #491
Target b) for -3: OK, so we should think about the details.

Identical -spf and -skew values for all of the three quality levels? I don't like the idea.

From my tests when finding useful values for -spf I know some values really hurt bitrate efficiency (most of all the bold 1 in '11124' for the 64 sample FFT of -1) but may be vital for being really defensive with respect to the critical band at the lower edge of the corresponding frequency range. So I think it's necessary for -1 (and would be most welcome for -2 too, but it's expensive and the more economic way of treating this within -2 may be by doing the additional 128 sample FFT).

With -skew it's similar. -skew is important for differentiating the resulting bitrate between regular and problematic spots, but with a value >24 the improved defensiveness gets more and more expensive. So I think a value of 24 is very appropriate for -2, but it should be significantly higher only for -1. For -3 it should be <24.

Using very high values for -snr helps differentiating between regular and problematic spots too but with these values there's a rather high price to pay bitrate wise. So again high values of -snr should be used with -1 only IMO.

So I strongly think -1, -2, and -3 should consist of different -fft, -spf, -skew, -snr, and -nts settings in such a way that the overkill defensiveness, standard defensiveness, reduced defensiveness are represented best.

If you want to keep -1 and -2 clean of user options I suggest you do it for -3 as well, and instead create an experimental quality option -x which enables all the advanced options. advanced options = any option except for -1, -2, -3, -nts x (and -flac etc. in case these are ever needed - guess they won't).
I like the idea of the -x quality parameter (-0?) enabling the advanced options and also keeping -1, -2 & -3 "clean". This would start as a copy of -2: only those settings that the user supplies would be overwritten, the rest being taken as per -2 for the processing.

On the -skew, -spf and -snr settings I am inclined to agree with you. The only difficult bit being agreeing what those settings will be.....
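
The "copy of -2 with user overrides" idea could be sketched roughly as follows. The dictionary layout and the function name are hypothetical and only illustrate the merge; the -2 values shown are the ones suggested so far in this discussion, not final settings.

```python
# Hypothetical representation of the -2 preset; the values are the ones
# suggested earlier in this discussion, the structure is an assumption.
PRESET_2 = {
    "fft": "11101",
    "spf": "11235-11236-11336-FFFFF-1234D",
    "cbs": 512,
    "skew": 24,
}


def experimental_settings(user_options):
    """Start from -2 and overwrite only what the user explicitly supplied."""
    unknown = set(user_options) - set(PRESET_2)
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    settings = dict(PRESET_2)      # copy, so the clean preset is never modified
    settings.update(user_options)  # user input wins, the rest stays as per -2
    return settings


if __name__ == "__main__":
    print(experimental_settings({"skew": 18}))
```

Copying the preset before updating it keeps -2 itself "clean" no matter what is passed to the experimental level.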
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #492
I like the idea of the -x quality parameter (-0?) enabling the advanced options and also keeping -1, -2 & -3 "clean".

On the -skew, -spf and -snr settings I am inclined to agree with you. The only difficult bit being agreeing what those settings will be.....

When first thinking of the experimental option I also thought of -0 because it matches the current naming scheme. But with the current scheme it makes the experimental quality level look superior to the standard quality levels. Though hopefully somebody might find a great setting this way, I think -x (or an explicit -experimental) is more appropriate.

'The only difficult bit being agreeing what those settings will be.....'. Maybe, let's see, but with -2 I think we're pretty much done already (better ideas always welcome):

-2 = -fft 11101 -spf 11235-11236-11336-FFFFF-1234D -cbs 512 -nts -1.5 -skew 24 -snr 18

With -1 I suggest to use

-1 = -fft 11111 -spf 11124-11125-11225-11225-11236 -cbs 512 -nts -3.0 -skew 30 -snr 24.

Most disputable may be -3.
Due to the 'significantly reduced defensiveness' target I suggest we use those -spf values I found in my -spf value testing. I think it's necessary for a significantly reduced average bitrate, and it still provided excellent quality. So the mixture of this and the current setting is

-3 = -fft 10001 -spf 11236-FFFFF-FFFFF-FFFFF-1246E -cbs 512 -nts -0.5 -skew 18 -snr 12.

All these settings are pretty much what they are right now, and IMO they're just working out a little bit more what the various accents of the different quality levels stand for.
I don't care much about such details like whether -skew value for -3 should be rather 20 and -snr value 0 (my very personal preference but worth nothing).
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #493
-1 = -fft 11111 -spf 11124-11125-11225-11225-11236 -cbs 512 -nts -3.0 -skew 30 -snr 24.
-2 = -fft 11101 -spf 11235-11236-11336-FFFFF-1234D -cbs 512 -nts -1.5 -skew 24 -snr 18
-3 = -fft 10001 -spf 11236-FFFFF-FFFFF-FFFFF-1246E -cbs  512 -nts -0.5 -skew 18 -snr 12.

I don't care much about such details like whether -skew value for -3 should be rather 20 and -snr value 0 (my very personal preference but worth nothing).
The quality settings in the next revision will reflect those above (unless anyone else indicates a strong preference for something different).
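
For anyone trying to read the -fft / -spf pairs above, here is a rough sketch of how the mask appears to line up with the -spf groups. It assumes the five digits map to FFT lengths of 64, 128, 256, 512 and 1024 samples from left to right; only the 64 and 128 sample lengths are named in this thread, so the other lengths and the ordering are assumptions, as is the function name.

```python
# Assumed FFT lengths, shortest first; only 64 and 128 are named in the thread.
FFT_LENGTHS = (64, 128, 256, 512, 1024)


def active_analyses(fft_mask, spf):
    """Pair each enabled FFT length with its -spf group."""
    groups = spf.split("-")
    if len(fft_mask) != len(FFT_LENGTHS) or len(groups) != len(FFT_LENGTHS):
        raise ValueError("expected 5 mask digits and 5 -spf groups")
    if set(fft_mask) - {"0", "1"}:
        raise ValueError("mask may only contain 0 and 1")
    return [
        (length, group)
        for flag, length, group in zip(fft_mask, FFT_LENGTHS, groups)
        if flag == "1"
    ]


if __name__ == "__main__":
    print(active_analyses("10001", "11236-FFFFF-FFFFF-FFFFF-1246E"))
```

Read this way, the FFFFF groups in the -2 and -3 lines sit exactly on the analyses that the mask switches off.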

I've been playing with the -fft parameter again and -3 -fft 00100 -spf ....-23346-..... -skew 24 yields 403kbps on my problematic sample set with no immediately apparent artifacts. I say immediately apparent because I don't believe that ABX'ing -3 is useful - to me -3 is the equivalent of listening in a car or on a train or plane - there is background noise already, so some minor changes to the original may / will be obscured by the noise floor of the listening environment. My "acceptability" testing takes place in an open-plan office environment with earbuds & DAP.

I am wondering about the clipping reduction method - at the moment, if it finds 1 or more samples which clip after rounding then it reduces bits_to_remove by one and tries again, until bits_to_remove=0, at which point it just stores the original values. Is 0 permissible clipping samples a bit too harsh? At the time that the iterative clipping was introduced, I put in an "allowable" variable, implying that a number of clipping (but rounded) samples may be permitted. I think that I should implement a "-allowable" parameter (1<=n<=64 (maximum permissible codec_block_size)) to set the allowable value as a clipping detection "threshold".
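
As a minimal sketch of the loop described above (not the actual lossyWAV code, which is Delphi / assembler): the function name, the 16-bit default, the round-to-nearest step and the final clamping of any tolerated clipping samples are all assumptions made for illustration.

```python
def remove_bits_with_clip_check(block, bits_to_remove, allowable=0, bit_depth=16):
    """Round a block to fewer bits, backing off while too many samples clip.

    Returns (processed_block, bits_actually_removed).
    """
    max_val = (1 << (bit_depth - 1)) - 1
    min_val = -(1 << (bit_depth - 1))
    while bits_to_remove > 0:
        step = 1 << bits_to_remove
        # Round each sample to the nearest multiple of 2**bits_to_remove.
        rounded = [((s + step // 2) // step) * step for s in block]
        clipped = sum(1 for r in rounded if r > max_val or r < min_val)
        if clipped <= allowable:
            # Assumption: tolerated clipping samples are clamped to full scale.
            return [min(max(r, min_val), max_val) for r in rounded], bits_to_remove
        bits_to_remove -= 1  # too many clips: try again with one bit fewer
    return list(block), 0    # nothing can be removed safely: keep the original


if __name__ == "__main__":
    block = [32767, 100, -200]
    print(remove_bits_with_clip_check(block, 3, allowable=0))
    print(remove_bits_with_clip_check(block, 3, allowable=1))
```

With allowable=0 the full-scale sample forces the example block all the way back to 0 bits removed, whereas allowable=1 lets 3 bits be removed, which is the trade-off in question.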
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #494
The quality settings in the next revision will reflect those above (unless anyone else indicates a strong preference for something different).

I've been playing with the -fft parameter again and -3 -fft 00100 -spf ....-23346-..... -skew 24 yields 403kbps on my problematic sample set with no immediately apparent artifacts.  ....

Thanks a  lot.

As for your -3 approach (just 1 FFT, targeting a significantly lower bitrate than ~400 kbps for regular music ) I can try to help and do listening tests, especially with your setting. I wouldn't lower quality demand extremely however cause after all we will stay with pretty high bitrate, and with that I think we should have a distinction from what we can get with mp3 at moderate bitrate (though this is always a matter of taste).
Sorry I won't be able to do it within this week as I'm leaving for my father in law's 90th birthday (got some trouble at the moment producing a photo based dvd movie, and neither my old nor the new dvd player (present for my father in law) are playing it fine).
lame3995o -Q1.7 --lowpass 17

 

lossyWAV Development

Reply #495
As for your -3 approach (just 1 FFT, targeting a significantly lower bitrate than ~400 kbps for regular music ) I can try to help and do listening tests, especially with your setting. I wouldn't lower quality demand extremely however cause after all we will stay with pretty high bitrate, and with that I think we should have a distinction from what we can get with mp3 at moderate bitrate (though this is always a matter of taste).
Sorry I won't be able to do it within this week as I'm leaving for my father in law's 90th birthday (got some trouble at the moment producing a photo based dvd movie, and neither my old nor the new dvd player (present for my father in law) are playing it fine).
Don't worry about the timescale, I will keep on trying to optimise the code..... I hope you have a great time at the party! Have you checked whether the DVD is written as UDF or not? This may make a difference.

I also tried -3 -fft 01100 -spf ffff-22335-22346-fffff-fffff -skew 24 which yielded 420kbps - not too bad at all. Second opinion definitely required. [edit] I will test some "real" music tomorrow and see what the bitrate comes out at. Maybe 400kbps for "real music" should be the target rather than approaching that for my problem set. [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #496
... Have you checked whether the DVD is written as UDF or not? This may make a difference. ...

The DVD plays well on my PC so I think the DVD is fine. My own dvd player simply is broken and doesn't play any dvd any more. The new player plays the 'movie', but from time to time it skips the current spot a bit which especially sounds very ugly as the music skips. Guess it's a VBR problem and that's what I'm playing with all evening long but with limited success. Guess we'll exchange the player tomorrow.

As for your new -3 setting I like the new one better as it's more demanding. Let's hear how it sounds.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #497
As for your new -3 setting I like the new one better as it's more demanding. Let's hear how it sounds.
Another variation:

At the moment the method uses the Hanning window function on the input to the FFT analysis. Looking for "window function" in my favourite resource (Wikipedia) gives quite a long list. I have added a "-window" parameter to select which one to use. This allows the selection of 7 window functions (for evaluation / elimination at this stage): Hanning, Bartlett-Hann, Blackman, Nuttall, Blackman-Harris, Blackman-Nuttall and Flat-Top.
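
Several of these windows are plain cosine sums, so they can share one routine. The sketch below uses the commonly published coefficients for three of them and the symmetric (N-1) form; whether lossyWAV uses the symmetric or periodic form is not stated here, and the names are invented for the example.

```python
import math

# Commonly published cosine-sum coefficients a0, a1, a2, ...
COSINE_SUM_COEFFS = {
    "hann": (0.5, 0.5),
    "blackman": (0.42, 0.5, 0.08),
    "blackman-harris": (0.35875, 0.48829, 0.14128, 0.01168),
}


def window(name, length):
    """Symmetric cosine-sum window: w[n] = sum_k (-1)^k * a_k * cos(2*pi*k*n/(N-1))."""
    coeffs = COSINE_SUM_COEFFS[name]
    return [
        sum((-1) ** k * a * math.cos(2 * math.pi * k * n / (length - 1))
            for k, a in enumerate(coeffs))
        for n in range(length)
    ]


if __name__ == "__main__":
    for name in COSINE_SUM_COEFFS:
        w = window(name, 64)
        print(name, round(w[0], 5), round(max(w), 5))
```

The windows with more terms taper harder towards the block edges, trading a wider main lobe for lower side lobes, which is presumably what the evaluation will weigh up.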

Will post revision tonight.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #498
What do you get for your test set resampled to 32kHz, processed with -2?

Does 32k resampling followed by ReplayGain (only negative values applied) help even more?

It makes sense to have a -3 along the lines you're proposing, but I suspect the above will be dramatically more efficient, and still artefact-free (though with a 16k LPF and, with RG, loud tracks becoming quieter).

Cheers,
David.

lossyWAV Development

Reply #499
Target b) for -3: OK, so we should think about the details.

Just following your dialog here..
This seems the right basic choice: there has to be a benefit for giving up a (little) bit of quality. IMO that means a significantly lower bit rate for -3 (compared with -2).

(Would -skew of -12 -18 -24 (for -3 -2 -1) be too aggressive?)
I am wondering about the clipping reduction method - at the moment, if it finds 1 or more samples which clip after rounding then it reduces bits_to_remove by one and tries again, until bits_to_remove=0, at which point it just stores the original values. Is 0 permissible clipping samples a bit too harsh? At the time that the iterative clipping was introduced, I put in an "allowable" variable, implying that a number of clipping (but rounded) samples may be permitted.

I suppose you mean consecutive samples of the maximum (or minimum) value?  To me in this case 0, 1 or 2 would make sense, only already badly clipping music would be affected by other values.

And yes, the dither function is obsolete as you no longer opt to lower the amplitude.
I also tried -3 [..] which yielded 420kbps [..] [edit]Maybe 400kbps for "real music" should be the target rather than approaching that for my problem set. [/edit]

The problem with this is that from the outset this method aims for constant quality (I like that BTW), so the bit rate will vary. I found for example that music that already compresses well losslessly (in the 600s, say) will not get down to half that bit rate with the help of lossyWav but will rather still end up around 420.
In theory, there is no difference between theory and practice. In practice there is.