Track timing with Don't reset DSP between tracks

If I select the option "Don't reset DSP between tracks" in Converter in order to process a gapless album, and have either the SoX or the SSRC resampler active, the track split points are offset by a random amount. This doesn't happen with all effects: track times are unchanged with IIR Filter, and only occasionally changed with PPHS Resampler. Why aren't the splits as precise as the output sampling rate permits?

I tried the latest beta 17.


Re: Track timing with Don't reset DSP between tracks

Reply #1
and only occasionally changed with PPHS Resampler.
Even when it's in Ultra mode?

DSP plugins are allowed to change the duration of audio (add silence, remove silence, crossfade, etc.), so foobar2000 cannot know precisely when one track ends and the next starts.

Re: Track timing with Don't reset DSP between tracks

Reply #2
In Ultra mode the duration changes by less than 1 CD sector, but it still sometimes changes. That is unexpected, because these resamplers preserve duration exactly when used on one track at a time. If foobar2000 cannot know at all whether a plugin changes the duration, perhaps a note should be added on the converter page near the checkbox, and maybe the approximate splits should be rounded to 1/75 of a second.
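A minimal sketch of the rounding suggested above, assuming the split position is known in samples and that the sample rate is a whole multiple of 75 Hz (true for 44100 and 48000). The function name is hypothetical and is not part of foobar2000.

```python
def round_to_cd_frame(sample_pos: int, sample_rate: int) -> int:
    """Round a split position in samples to the nearest 1/75 s boundary."""
    if sample_rate % 75 != 0:
        # Assumption of this sketch: a CD frame is a whole number of samples.
        raise ValueError("sample rate must be a multiple of 75 Hz")
    frame = sample_rate // 75          # 588 samples at 44100 Hz
    return ((sample_pos + frame // 2) // frame) * frame


# A split 353 samples past a frame boundary is moved up to the next boundary.
print(round_to_cd_frame(25087373, 44100))
```

Rounding like this would hide sub-sector drift, but it would not make the split sample-accurate.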

Re: Track timing with Don't reset DSP between tracks

Reply #3
You have length issues when using the setting with DSPs that buffer samples.

A DSP gets N samples from the input component at a time and sends them to a buffer. Once the buffer contains enough samples to be processed, processing is performed and as a result the DSP now has S samples to output.

The chunk sizes a DSP gets depend on the input format and on any DSPs placed before it. Usually decoders output chunks at the format's native frame size. And at the end of a track, chunk sizes can vary wildly.

I don't know the logic foobar2000 uses to decide when to change the output track with this setup, but I'm pretty sure it doesn't split the chunks it receives from the DSP chain. As the chunk sizes from the inputs are unlikely to match the chunk sizes coming out of the buffer, the lengths will differ.
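To illustrate the buffering effect described above, here is a toy model in Python. It is not the foobar2000 DSP API: the class, the block size of 6000 and the short final chunk of 1234 samples are all made up for the illustration. The DSP passes audio through unchanged but only emits whole blocks, so when the last chunk of a track has been fed in, part of that track is still sitting in the buffer.

```python
class BlockBufferDSP:
    """Toy DSP: length-preserving, but emits audio only in whole blocks."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size
        self.pending = 0  # samples accepted but not yet emitted

    def process(self, n_samples: int) -> int:
        """Accept n_samples of input and return how many samples come out."""
        self.pending += n_samples
        emitted = (self.pending // self.block_size) * self.block_size
        self.pending -= emitted
        return emitted


dsp = BlockBufferDSP(block_size=6000)      # hypothetical block size
track_chunks = [4096] * 10 + [1234]        # hypothetical: short chunk at track end
fed = out = 0
for n in track_chunks:
    fed += n
    out += dsp.process(n)

# Samples still buffered here would end up in the next track's file
# if the converter switched files at this moment.
print(fed, out, dsp.pending)
```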

It could be hacked around for a special case where DSPs don't alter lengths. The converter could simply monitor duration going in and split things after getting the same duration out. But plenty of DSPs change lengths.
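The special case could look roughly like the sketch below. It assumes a DSP chain that preserves duration (apart from a known sample rate change) and a converter that is allowed to cut an output chunk in two. The function name and its arguments are hypothetical; this is not how foobar2000 is known to work internally.

```python
from fractions import Fraction


def split_output(track_lengths_in, out_chunks, in_rate, out_rate):
    """Assign output samples to tracks by the cumulative input duration.

    Returns the number of output samples written to each track.
    """
    ratio = Fraction(out_rate, in_rate)
    # Cumulative track ends, converted to output samples and rounded.
    boundaries = []
    total_in = 0
    for n in track_lengths_in:
        total_in += n
        boundaries.append(round(total_in * ratio))

    lengths = [0] * len(track_lengths_in)
    track = 0
    pos = 0  # output samples consumed so far
    for chunk in out_chunks:
        while chunk > 0 and track < len(lengths):
            take = min(chunk, boundaries[track] - pos)
            lengths[track] += take
            pos += take
            chunk -= take
            if pos == boundaries[track]:
                track += 1  # boundary reached: the rest goes to the next track
    return lengths


# Two tracks of 1 s and 2 s at 44100 Hz, resampled to 48000 Hz.
print(split_output([44100, 88200], [50000, 50000, 44000], 44100, 48000))
```

Any output left over after the last boundary is dropped in this sketch, and a DSP that adds or removes audio would break the assumption entirely.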

The best option for you is not to tick the checkbox and instead to use SoX or SRC resamplers that feature signal extrapolation. That way track lengths don't get mutilated and the extrapolation should prevent glitches.

Re: Track timing with Don't reset DSP between tracks

Reply #4
I tested how the SoX resampler deals with chunks when foobar2000 resamples a 44100 Hz FLAC file to 48000 Hz.

Sizes of input chunks : 4096; 4096; 4096; 4096; 4096; 4096; 4096; 4096; 4096; 4096;...
Sizes of output chunks : 7238; 5786; 3857; 3857; 5787; 3857; 3857; 3858; 5786; 3857; ...

(Apparently the decoder reads a FLAC frame, decodes it and sends an audio chunk to the foobar2000 core, so every input chunk is 4096 samples long.)
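A small sketch using the output chunk sizes listed above shows why a split point rarely coincides with a chunk edge. The helper `locate` and the split position 20000 are made up for the illustration; only the chunk sizes come from the test.

```python
from bisect import bisect_right
from itertools import accumulate

# Output chunk sizes reported in the test above (first ten chunks only).
out_chunks = [7238, 5786, 3857, 3857, 5787, 3857, 3857, 3858, 5786, 3857]
ends = list(accumulate(out_chunks))  # cumulative end position of each chunk


def locate(sample_index: int):
    """Return (chunk number, offset inside that chunk) for an output sample."""
    i = bisect_right(ends, sample_index)
    start = ends[i - 1] if i else 0
    return i, sample_index - start


# A hypothetical split at output sample 20000 lands inside a chunk,
# so the converter would have to cut that chunk to split exactly there.
print(locate(20000))
```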


Re: Track timing with Don't reset DSP between tracks

Reply #5
Thank you, Case, for the detailed explanation. I guess we have to accept what the current plugin architecture supports. Maybe in the free space below "slow, needed for crossfading", a warning could appear once the box is checked, such as "Track durations may change slightly". Does "slow" refer to the process becoming single-threaded, or is there another reason why conversion would become slow without a reset?

I was looking to combine IIR Filter, or possibly a VST EQ, with a resampler in one pass. In my experiment I had uncompressed WAV files as input. The maximum deviation of track length depends on whether Normal or Best mode is selected, which makes sense, as Best would have higher latency. With the FLAC block size bumped to 16384 (only for this test), the highest observed difference on one album increases from 3 to 12 CD frames.


Re: Track timing with Don't reset DSP between tracks

Reply #7
In Ultra mode the duration changes by less than 1 CD sector, but it still sometimes changes. That is unexpected, because these resamplers preserve duration exactly when used on one track at a time. If foobar2000 cannot know at all whether a plugin changes the duration, perhaps a note should be added on the converter page near the checkbox, and maybe the approximate splits should be rounded to 1/75 of a second.
foobar2000 needs an API for DSP plugins so that they can signal the delay; then foobar2000 can compensate for it later.

Re: Track timing with Don't reset DSP between tracks

Reply #8
There is. It's required to synchronize visualizations with audio.

Re: Track timing with Don't reset DSP between tracks

Reply #9
Then why not let Converter use that information as well? ;)

Re: Track timing with Don't reset DSP between tracks

Reply #10
Issue noted, thanks for reporting.

Re: Track timing with Don't reset DSP between tracks

Reply #11
While "Don't reset DSP between tracks" in 1.4.1 no longer changes the length of tracks, it leads to another strange result. See below.
I took 6 tracks (96 kHz/16 bit) and converted them to 44.1 kHz/16 bit using SoX resampler 0.8.3 as a DSP. No dithering. First, I converted them with "Don't reset DSP between tracks" enabled. Then I converted them with "Don't reset DSP between tracks" disabled.
Then I used foo_bitcompare to compare the two sets of tracks.
Results (folder "dont" contains tracks converted with "Don't reset DSP between tracks" enabled; folder "ass" contains tracks converted with "Don't reset DSP between tracks" disabled):
Code:
Differences found in 5 out of 6 track pairs.
Zero offset detected in 5 out of 5 non-identical track pairs.

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\01. Brothers in the Stars.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\01. Brothers in the Stars.flac"
Compared 25087373 samples.
No differences in decoded data found.
Channel peaks: 1.0000000 1.0000000

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\02. Black Hole in Human Form.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\02. Black Hole in Human Form.flac"
Compared 22982803 samples.
Differences found: 45214179 values, starting at 0:00.000181, peak: 0.2111206 at 6:16.936780, 2ch
Channel difference peaks: 0.1607666 0.2111206
File #1 peaks: 1.0000000 1.0000000
File #2 peaks: 1.0000000 1.0000000
Detected offset as 0 samples.

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\03. My Hands were Made to Hold the Wind.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\03. My Hands were Made to Hold the Wind.flac"
Compared 20613869 samples.
Differences found: 183 values, starting at 0:00.000000, peak: 0.0053711 at 7:47.434649, 1ch
Channel difference peaks: 0.0053711 0.0005493
File #1 peaks: 1.0000000 1.0000000
File #2 peaks: 1.0000000 1.0000000
Detected offset as 0 samples.

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\04. The Birth of Tragedy.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\04. The Birth of Tragedy.flac"
Compared 23453830 samples.
Differences found: 46444670 values, starting at 0:00.000000, peak: 0.2778625 at 7:32.503583, 2ch
Channel difference peaks: 0.2015381 0.2778625
File #1 peaks: 1.0000000 1.0000000
File #2 peaks: 1.0000000 1.0000000
Detected offset as 0 samples.

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\05. Individuation.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\05. Individuation.flac"
Compared 22744310 samples.
Differences found: 44823568 values, starting at 0:00.000000, peak: 0.0425415 at 2:53.957438, 1ch
Channel difference peaks: 0.0425415 0.0370483
File #1 peaks: 1.0000000 1.0000000
File #2 peaks: 1.0000000 1.0000000
Detected offset as 0 samples.

Comparing:
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\dont\06. A Nest of Ravens in the Throat of Time.flac"
"D:\Downloaded\Obsidian Tongue - A Nest of Ravens in the Throat of Time\ass\06. A Nest of Ravens in the Throat of Time.flac"
Differences found: length mismatch - 8:46.570680 vs 8:46.570658, 23221767 vs 23221766 samples.
Compared 23221766 samples, discarded last 1 samples from the longer file.
Differences found within the compared range: 45662127 values, starting at 0:00.027778, peak: 0.6002197 at 7:22.850952, 2ch
Channel difference peaks: 0.5693970 0.6002197
File #1 peaks: 1.0000000 1.0000000
File #2 peaks: 1.0000000 1.0000000
Detected offset as 0 samples.



Total duration processed: 52:11.609
Time elapsed: 0:24.200
129.40x realtime
I thought this might just be an offset, so I examined both variants of track #2 in Audacity and found that there is NO offset. Or am I mistaken? See picture:

...yet the difference between the variants of track #2 is pretty big (21% peak). Also note that the peak of the difference is not at the beginning or end of the file: it occurs at 6:16, while track #2 is 8:41 long.
This is how the spectrum of the difference looks in Audacity:

If anyone is interested in source files, original FLAC is available on Bandcamp for free - https://hypnoticdirgerecords.bandcamp.com/album/a-nest-of-ravens-in-the-throat-of-time

Re: Track timing with Don't reset DSP between tracks

Reply #12
I see a subsample delay. The difference consisting of primarily high frequencies is a clue.

Track 2 begins at sample 25087372.8 (54611968/96*44.1). If I upsample both results 5x to 220500 Hz with iZotope RX 2 and shift the "dont-reset" clip forward by 1 sample, they null to a peak of 0.0307736 and an RMS of -89.2 dB, which I'd guess is a reasonable difference for 2 resampling passes and some clipping in the first pass. I am quite certain that most of the peaks in the difference are due to clipping in 16-bit. Feel free to repeat the test in float.

'dont-reset'  http://i.imgur.com/zOUc5KD.png
'do-reset'  http://i.imgur.com/25XgoLU.png

Zoomed in difference: http://i.imgur.com/Xt5hjsH.png

Track 3 begins at sample 48070176 precisely, and therefore nulls except for a few samples at the ends.
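The boundary arithmetic can be checked with exact fractions. This is only a sketch of the calculation: the function name is made up, and the second check converts the track 3 start back to 96 kHz to confirm it falls on a whole source sample.

```python
from fractions import Fraction


def boundary_at_output_rate(src_samples: int, src_rate: int, dst_rate: int) -> Fraction:
    """Exact position of a source-rate sample boundary in output-rate samples."""
    return Fraction(src_samples) * dst_rate / src_rate


# Start of track 2: 54611968 samples at 96 kHz, expressed at 44.1 kHz.
b = boundary_at_output_rate(54611968, 96000, 44100)
whole = b.numerator // b.denominator
frac = b - whole
print(whole, frac)        # whole samples and the leftover subsample fraction

# Start of track 3: 48070176 samples at 44.1 kHz, converted back to 96 kHz.
back = boundary_at_output_rate(48070176, 44100, 96000)
print(back.denominator == 1)   # True when the boundary is sample-aligned at both rates
```

A nonzero fraction means one variant has to be shifted by a fraction of a sample relative to the other, which matches the high-frequency difference seen for track 2.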

Thank you very much for addressing this issue, Developer. I will report if there are any problems. So far there don't seem to be.

 