How would a piped cmd look for Win? (flac to lossyWAV and back to flac)
P.S. I can survive without metadata.
cool, tnx, what makes you think it will be easier with foobar?
lossyWAV is very interesting! Is it like...
Album in WAV - 230 MB
Album in lw - 196 MB
Album in flac - 99 MB
...?
...then I tried Q2.0 & I failed 100% ... I failed so badly that I decided to edit the wav to focus on the 3 easiest seconds to ABX ... I retried & put the sound level up & ... I failed 100% ...so well done Nick I cannot ABX Ginnungagap at Q2.0 anymore & the flaw is slightly reduced at lower settings.
I have three questions:
1. Is noise shaping only used with the portable preset (and below) or also with the standard preset (and maybe higher presets)? It seems that it's always used, scaled with respect to the chosen quality value (q/10).
2. A comparison of ReplayGain values indicates that peak sample values increase with lossyWAV, reaching 1.0 where the originals were below 1.0. Might that indicate clipped samples?
3. Can lossyWAV safely be used with 96 kHz material without any disadvantages? Does it depend on whether noise shaping is used or not?
2) Peak samples will sometimes increase due to rounding off LSBs. This causes some clipping at +32768 for 16-bit, which will be changed to (32767 shr bits-to-remove) shl bits-to-remove.
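A minimal sketch of the clamp described above, in Python rather than lossyWAV's own source. The function name `round_off_lsbs` and the round-half-up rule are assumptions made for illustration; only the ceiling expression `(32767 shr bits-to-remove) shl bits-to-remove` comes from the post.

```python
def round_off_lsbs(sample: int, bits_to_remove: int, bit_depth: int = 16) -> int:
    """Round a sample to a multiple of 2**bits_to_remove, clamped at positive full scale.

    Hypothetical helper: the rounding rule (half rounds up) is an assumption,
    not taken from lossyWAV itself.
    """
    if bits_to_remove == 0:
        return sample
    step = 1 << bits_to_remove
    max_sample = (1 << (bit_depth - 1)) - 1  # 32767 for 16-bit
    # Round to the nearest multiple of step; Python's >> floors for negatives too.
    rounded = ((sample + (step >> 1)) >> bits_to_remove) << bits_to_remove
    # Largest multiple of step that still fits: (32767 shr bits) shl bits.
    ceiling = (max_sample >> bits_to_remove) << bits_to_remove
    return min(rounded, ceiling)


if __name__ == "__main__":
    # A full-scale positive peak would round up to +32768, so it is clamped.
    print(round_off_lsbs(32767, 5))
```

No clamp is needed on the negative side, since -32768 is already a multiple of every step size up to 2^15.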
If I disable noise shaping, I should enable dithering, right?
So you think that noise shaping should also be used with higher sample rates instead of normal dithering or no dithering at all?
What would be best practice in terms of settings (dithering, noise shaping) for these types of audio:
- 16-bit, 44.1/48 kHz
- 24-bit, 44.1/48 kHz
- 16-bit, 88.2/96/176.4/192 kHz
- 24-bit, 88.2/96/176.4/192 kHz
I created four test samples containing white noise: 16-bit/48 kHz, 16-bit/96 kHz, 24-bit/48 kHz and 24-bit/96 kHz. Then I did a frequency analysis of the difference between the original and lossy conversions (default, shaping 0, shaping 0 + dither 1). I conclude three things:
- Dithering seems to have no benefit. It just adds noise on top of the existing quantization noise, thus increasing the noise floor.
- The bit depth doesn't seem to matter.
- Noise shaping benefits from higher sample rates because the noise is moved even further into inaudible frequency ranges.
Click here for a graphical comparison between 48 kHz and 96 kHz (portable preset).
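The first conclusion can be illustrated with a rough sketch, assuming a fixed number of bits-to-remove and plain rectangular dither added before rounding. This is not lossyWAV's algorithm (which chooses bits-to-remove per block); the helpers `quantise` and `rms_error` are hypothetical names used only for this example.

```python
import math
import random


def quantise(sample, bits_to_remove, dither, rng):
    """Round to a multiple of 2**bits_to_remove, optionally adding rectangular dither.

    dither is a level between 0 (off) and 1 (noise spanning one full step).
    """
    step = 1 << bits_to_remove
    noise = rng.uniform(-0.5, 0.5) * step * dither
    return math.floor((sample + noise) / step + 0.5) * step


def rms_error(bits_to_remove, dither, n=20000, seed=1):
    """RMS difference between original and quantised white-noise-like samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        s = rng.randint(-20000, 20000)
        e = quantise(s, bits_to_remove, dither, rng) - s
        total += e * e
    return math.sqrt(total / n)


if __name__ == "__main__":
    print("no dither  :", rms_error(5, 0.0))
    print("dither = 1 :", rms_error(5, 1.0))
```

With the same bits removed, the dithered error comes out higher by roughly a factor of the square root of 2, which matches the raised noise floor seen in the difference spectrum. It says nothing about what happens when bits-to-remove is adjusted to compensate for the dither.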
So (2 ^ bits-to-remove) = 32 is the clipping error, which is 1/1024th of the target signal amplitude in this case, and represents a smaller error than 32 from the original signal (presumably between 16 and 31), where the target bits-to-remove would have generated a rounding error of 15 if it had 17 bits available to round upwards instead of having to round downwards.

In this 5-bit case, this clipping error is equivalent to clipping caused by increasing gain by 0.0085 dB above full scale, which is very low-level, and might reassure other users (e.g. try amplifying a full-scale signal by 0.0085 dB and ABX the clipping distortion, which will exceed the distortion in lossyWAV when 5 bits are removed). The sample error adds energy at about -60 dB relative to a full-scale sample in this case, which is only mildly indicative of what scale of event may happen in the frequency domain to which the ear responds.

Things get more complicated with noise shaping in use, though presumably there's a feed-forward of accumulated error (like with error-diffusion dither in imaging) which enables the always-negative clipping adjustment to be offset by greater likelihood of positive shifts in following samples, and presumably the instantaneous clipping is likely to be incorporated into the high frequency end of the shaped noise unless there happen to be numerous successive clipped samples, which naturally means lower frequencies.
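The figures quoted for the 5-bit case can be checked with a few lines of Python. The reading of "0.0085 dB" as the gain that lifts full scale by one step, i.e. 20·log10(1 + 1/1024), is an assumption about how the number was derived.

```python
import math

bits_to_remove = 5
full_scale = 32768                        # 16-bit positive full scale

step = 2 ** bits_to_remove                # clipping error in sample units
ratio = step / full_scale                 # error relative to full scale
gain_db = 20 * math.log10(1 + ratio)      # gain that overshoots full scale by one step
level_db = 20 * math.log10(ratio)         # error level relative to a full-scale sample

if __name__ == "__main__":
    print(f"step = {step}, ratio = 1/{full_scale // step}")
    print(f"equivalent gain  = {gain_db:.4f} dB")
    print(f"error level      = {level_db:.1f} dB")
```

Under that assumption the results agree with the post: a step of 32, a ratio of 1/1024, roughly 0.0085 dB of equivalent gain and an error level of about -60 dB.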
Nick,
1. The FFT size should vary with sample rate, though good luck making that happen easily with all the optimisation you've done!
2. IMO and IIRC, enabling dither shouldn't raise the noise floor much on average - the noise floor should stay the same, but more bits will have to be kept to achieve this.
I don't enable dither either.
Cheers,
David.
Basically this means that at present I am not confident in the high (i.e. >48kHz) sample rate performance of lossyWAV 1.1.0 and would caution anyone using it at these sample rates against using it for anything other than testing purposes.
My intention is to understand and implement SebastianG's new noise shaping method, but for that I will also have to introduce / find a PSY model of some kind.
I would hope that by using the new noise shaping method some additional bits can be removed for the same apparent quality level of output, thereby further reducing the bitrate.
Do you refer to lossyWAV in general or to how noise shaping is applied?
What would happen if noise shaping is disabled via --shaping 0? Would that be taken into account by *not* removing those "additional bits" then? Otherwise the non-shaped results might be pretty bad in comparison if used with lower quality settings.
I'm also wondering if trading further removal of bits for better noise shaping really yields useful results, as both methods seem to cancel each other out:
Removing more bits:
- lower filesize
- more noise
Stronger noise shaping:
- higher filesize
- less (perceived) noise
Reference threshold constants for rectangular dither and triangular dither have been calculated, so added noise should be the same for dither off and any dither level between 0 and 1 - the number of bits-to-remove will however reduce with "increasing" dither.
I expect to post lossyWAV 1.1.0b tonight.
I was referring to lossyWAV in general, as the 64/1024 sample FFT lengths are fixed at present.
The noise shaping implementation results in a trade-off between bits removed and filesize. That is why the option remains for the user to disable noise shaping.