Re: Downsample before encoding or just use lowpass?
Reply #9 – 2020-08-16 22:56:05
> I then tried to listen for the artifacts you mention on transitions and really couldn't hear anything at all. I don't want to sound obtuse (and generally use something else for 'serious' batch conversion) but I do use fb2k fairly often for quick and dirty conversion and wanted to try to get to the bottom of this. Could the issue you mention have been fixed within SoX since the discussion in that old thread took place?

I haven't read the source code, so I can't say for sure (and I couldn't even if I tried, it's not open), but from what I noticed some time ago and have read on the Web, if you reset the DSP state between tracks, the resampler can't completely avoid this problem because it sees each track in isolation. I don't know if it actually does this, but in theory it can try to be smarter than usual: instead of assuming that the undefined samples are zero, it can extrapolate the data before the start and after the end of the track, and then the errors at track boundaries can be hard or impossible to notice in most places. IIRC I tested this a long time ago and could hear clicks, but to be 100% certain I'd have to try again, because foobar2000 and the resampler DSP plugins could have improved a lot since then.

However, if you don't reset the DSP state between tracks, then this issue simply cannot happen, so it's safer in this regard. (But then it may cause different issues if you convert a lot of albums at once, because it concatenates them all. There may also be a performance penalty: everything is done sequentially, so unless the encoder is multi-threaded, it'll use only 1 thread for encoding, because it sees just 1 huge track.)

> Again there was no difference in track times down to 1/1000th of a second

Individual track times won't change significantly, and neither will the full album length.
But the issue is that each time you go to the next track, the relative delay of all subsequent samples changes, because the 0th sample of the next track is forced to correspond to the 0th sample of its output. If there were no track change there, that sample's position in the resampled output would almost never be an integer number of samples. So it's a sudden sharp jitter, at most half a sample in size.
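
To put a number on that jitter, here is a rough sketch in Python. It is not taken from foobar2000 or SoX. The function name boundary_shift is made up for illustration, and it assumes the per-track output length is rounded to the nearest whole sample (a real resampler might truncate or round up instead, which would change the bound).

```python
from fractions import Fraction


def boundary_shift(track_len_in, rate_in, rate_out):
    """Shift (in output samples) of the next track's first sample when the
    resampler is reset at the track boundary instead of running continuously.

    Hypothetical helper: assumes the per-track output length is rounded to
    the nearest integer number of samples.
    """
    # Where the next track's sample 0 would land in a continuous resample.
    exact = Fraction(track_len_in * rate_out, rate_in)
    # Where it actually lands when each track starts at output sample 0.
    nearest = round(exact)
    return nearest - exact


if __name__ == "__main__":
    # A 10,000,000-sample track resampled from 44.1 kHz to 48 kHz.
    shift = boundary_shift(10_000_000, 44_100, 48_000)
    print(f"shift at boundary: {float(shift):+.4f} output samples")
```

With 44100 -> 48000 the ratio is 160/147, so the shift is only zero when the track length happens to be a multiple of 147 input samples. Otherwise it's some fraction of a sample, and under the rounding assumption it never exceeds 1/2.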