I found a problem sample for gapless encoding. Whatever I do, there is an audible click between the tracks. I've analyzed its cause and found that it comes from edge effects: a waveform that terminates far from zero causes ringing, which shows up as an audible click even when the files are played gapless.
I saw the problem with MP3 and MusePack.
MP3 : Lame 3.90.3 --alt-preset standard ; Foobar2000 disc writer
MPC : Mppenc 1.14 --xlevel ; Mppdec
You can get the file here: http://www.hydrogenaudio.org/forums/index.php?act=ST&f=35&t=18207&st=0
Look at the attached picture to see the problem in the waveform.
Amazingly Ogg Vorbis at -q5 seems to handle this just fine.
How do I decode?
Edit: Never mind, I found the foobar2000 input plugin.
I confirm: vorbis is OK. DualStream is perfectly gapless too.
I've tried faac MP4 (gapless playback with foobar2000), and the gap 'click' is audible.
It's not the first time that a gapless problem was found with mpc. Could someone try with an old version (like 1.00) of mppenc.exe ? I can't do this for the moment.
I've tried with WMA9 :
-standard encoder isn't gapless (serious click between files)
-wmaPRO is perfectly gapless (VBR 2 pass tested).
The click is evident with mppenc 1.14/1.15r until --quality 6.99. It disappears at --quality 7.
With Nero AAC it can be heard with the Normal (VBR, of course) setting. Not at Extreme.
Good point. --insane is working flawlessly on my side too. Does anyone have Frank's phone number?
Since the problem exists with FAAC all the way up to the highest setting, I wouldn't suggest passing phone numbers around just yet. The problem should be studied carefully, and only then should conclusions be reached.
With Vorbis, there is a soft click up to around -q 3. At -q 3.20 it's gone. LAME sounds pretty bad with all presets, although the click is slightly weaker with --alt-preset insane.
One last observation: mppenc 1.95z67 fails at up to --quality 7.49. --quality 7.50 is fine.
Isn't the problem similar to encoding a "step" signal ? I mean a positive constant DC signal (0 Hz) that would suddenly fall to zero.
This would sound as a click, like a single pulse. After a lossy encoding, it would still sound similar. The loss would be inaudible.
But now consider an infinite DC signal. It never makes any noise.
If this signal is divided into two tracks, won't the encoder encode the end of the first track as a "DC-to-zero" step, then the beginning of the next as a "zero-to-DC" step ?
I haven't tried it yet, but here is what I expect. When the tracks are played separately, there is a click at both ends of each file, and no difference from the original should be heard in the lossy versions. When the two tracks are played gaplessly, however, the two steps cancel each other and reconstruct the constant DC signal. What remains is the loss caused by the psychoacoustic process applied to the steps, which we can hear over the silent DC signal, just as we can hear the encoding noise when we subtract a lossy version of a file from its original.
I'm going to bed right now. So if someone wants to do it before me tomorrow: create a silent wave file, apply a strong DC offset correction, then cut the result into two files.
If it behaves the same (audible click in gapless lossy playback), paste the two parts in the middle of a silent file, so as to create real steps, and see if the lossy waveform of the steps looks the same as the lossy waveform of the DC files.
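A toy model of the hypothesis above, in case it helps to see it in numbers. It is only a sketch: a centered moving average stands in for the lossy codec, which is my assumption (a real encoder is not a simple linear filter), and the names `smooth` and `HALF` are made up for the illustration. Each track is processed as if it were surrounded by silence, the results are butted together, and the difference from processing the uncut signal is taken.

```python
def smooth(x, half):
    """Centered moving average of width 2*half+1.

    Everything outside x is treated as silence (zeros), which is what
    an encoder sees when it is handed one track on its own.
    """
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(x[lo:hi]) / (2 * half + 1))
    return out

HALF = 20                      # hypothetical filter half-width, in samples
whole = [1.0] * 400            # constant DC signal
first, second = whole[:150], whole[150:]   # cut into two "tracks"

# Process each track separately, then play them back to back.
joined = smooth(first, HALF) + smooth(second, HALF)
# Process the uncut signal for comparison.
reference = smooth(whole, HALF)

# Delta between the gapless join and the uncut reference.
delta = [a - b for a, b in zip(joined, reference)]
bad = [i for i, d in enumerate(delta) if abs(d) > 1e-12]
```

In this model the delta is zero everywhere except for a cluster of `2 * HALF` samples around the cut, where the joined signal dips to about half the DC level. That dip is the kind of disturbance that would be heard as a click over an otherwise constant signal.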
Added the following files to the uploads
silent1.wav - 2 secs at 0dB
silent2.wav - 3 secs at 0dB
prepared by cutting a single 5sec .wav
and
steps.wav - 2 secs of digital silence, 2 secs at 0dB, 2 secs digital silence, 3 secs at 0dB, 1 sec of digital silence.
You should hear clicks at 2,4,6 and 9 secs.
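For anyone who wants to rebuild these files, here is a minimal sketch in Python using only the standard library. The 44.1 kHz mono 16-bit format and the reading of "0dB" as a constant positive full-scale sample are my assumptions; the uploaded files may differ.

```python
import struct
import wave

RATE = 44100         # assumed sample rate, not stated in the post
FULL_SCALE = 32767   # assumed meaning of "0dB": constant positive full scale

def dc(seconds, level=FULL_SCALE):
    """Return `seconds` of a constant signal at `level` (0 = digital silence)."""
    return [level] * int(seconds * RATE)

def write_wav(path, samples):
    """Write 16-bit mono PCM samples to a WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

# One 5 second DC file, cut into a 2 second and a 3 second part.
whole = dc(5)
silent1, silent2 = whole[:2 * RATE], whole[2 * RATE:]

# The same two parts pasted into digital silence, creating real steps.
steps = dc(2, 0) + dc(2) + dc(2, 0) + dc(3) + dc(1, 0)
```

Calling `write_wav("silent1.wav", silent1)`, and likewise for `silent2` and `steps`, writes the files to disk. The level changes in `steps` fall at 2, 4, 6 and 9 seconds, which is where the clicks are expected.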
I've found that this sort of click is associated with abrupt changes in level. It is, for example, the reason square waves sound harsh. It is not restricted to square edges by any means, and the same effect can be heard at the end of a sine wave. If the last sample is at a peak the click is very sharp, but even if it terminates at a zero crossing there is still a definite 'wump'. This is usually dealt with by adding a small amount of fade.
An mp3 adds digital silence and a few bad samples at the beginning and end of a file, and if the threshold for silence detection is set too low then there will be a few bad samples that will cause a click at the join. I don't have foobar installed, but have been able to play gapless in Winamp with a little tweaking. Does foobar use the same technique ?
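As a rough illustration of the fade mentioned above, here is a small Python sketch. The raised-cosine shape, the 5 ms length and the 441 Hz test tone are my own choices for the example, not something taken from a particular editor.

```python
import math

RATE = 44100  # assumed sample rate

def fade_out(samples, n):
    """Return a copy whose last n samples are scaled by a raised-cosine
    ramp that falls smoothly from about 1 down to exactly 0."""
    out = list(samples)
    n = min(n, len(out))
    start = len(out) - n
    for i in range(n):
        gain = 0.5 * (1.0 + math.cos(math.pi * (i + 1) / n))
        out[start + i] *= gain
    return out

# A 441 Hz sine has a period of 100 samples at 44.1 kHz, so stopping
# after 1026 samples cuts it exactly at a positive peak.
tone = [math.sin(2 * math.pi * 441 * t / RATE) for t in range(1026)]

# About 5 ms of fade brings the last sample down to zero.
faded = fade_out(tone, 220)
```

Before the fade the tone ends at full amplitude and then drops straight to silence; after it, the last sample is zero and the envelope gets there gradually.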
It seems to me there will always be an artefact (whether audible or not) using any technique that involves stripping silence and butting together mp3s. A look at the delta file shows that there is a cluster of inaccurate samples either side of the terminal points, so even if the silence is stripped from start and end there still exists a cluster of bad samples at the join.
(edit) Added waveform of stripped and butted silent1 and silent2 mp3 files to uploads. This is essentially equivalent to the delta file in this case (/edit)
UJ
Not to go off subject here Pio, but what proggy did you use in those screen captures? Cool Edit?
Thank you for the samples, Hujay, I'll check them tonight. I've got an idea to solve the problem for MP3, that I will test on your files.
SchockWave, I used SoundForge 4.5, I don't have CoolEdit.
I'll be very interested to see what you can come up with Pio. With the step file you've certainly devised what must be the severest test of mp3 encoding.
The screen shot of a perfect 'cut and shut' (glitch.jpg) that I posted clearly shows the artefacts, which are audible because of the unvarying background. In a normal music file they will still be present (as shown by looking at the delta file) but will be masked by the music itself. This is distinct from the louder artefact that is generated by imperfect silence detection and shown in the attached 'imperfect.jpg'.
Note the difference in scale !
UJ
I didn't manage to process the samples like I wanted to, but here is the idea :
Before the beginning and after the end of the file, a little part of audio would be appended so as to make a smooth transition between the signal and the silence.
Since the problem comes from the step, that appears to have all frequencies present in the sonogram view, the ideal transition should be the one leading to a sonogram where all the frequencies present in the signal would fade to silence. I thought about the "replace" option for click removal in SoundForge noise reduction plugin to achieve this, but it generates a windowed transition waveform, faded at both ends, and I can't append it to the signal since it starts and ends with zero. It was designed to be mixed in the middle of a signal. I can't use it here.
Once the file is processed, it is longer than the original. Then the encoder encodes all the audio, but sets only the original length of the file in the header used by Foobar for gapless playback, so that when Foobar removes the gap, it also removes the transitions. This should get rid of the ringing.
As a result, the gaps present in the MP3 files have been increased, because of the addition of transition data, but the Foobar-like gapless playback has been improved.
Unfortunately, this method can't be applied to already gapless codecs without breaking compatibility: the inserted fades would turn them "gappy" with regular playback. With MP3, it doesn't matter since it is already gappy anyway.
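Here is a toy sketch of the pad-then-trim idea in Python, to show the effect on a DC signal cut into two tracks. Several things in it are assumptions of mine: a centered moving average stands in for the codec, the transition is simply the edge sample held and faded linearly (not the "all frequencies fade" transition described above), and the names `pad_with_fades`, `encode_decode_trimmed`, `HALF` and `PAD` are hypothetical.

```python
def smooth(x, half):
    """Centered moving average of width 2*half+1; samples outside x are
    treated as silence. A crude stand-in for a lossy codec."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(x[lo:hi]) / (2 * half + 1))
    return out

def pad_with_fades(track, pad):
    """Prepend a fade-in and append a fade-out built from the edge samples."""
    ramp = [(k + 1) / (pad + 1) for k in range(pad)]   # rises from near 0 to near 1
    lead_in = [track[0] * g for g in ramp]
    lead_out = [track[-1] * g for g in reversed(ramp)]
    return lead_in + track + lead_out

def encode_decode_trimmed(track, pad, half):
    """Process the padded track, then keep only the original length,
    as a player would when it trims to the length stored in the header."""
    processed = smooth(pad_with_fades(track, pad), half)
    return processed[pad:pad + len(track)]

HALF, PAD = 20, 100            # hypothetical filter half-width and pad length
whole = [1.0] * 400
first, second = whole[:150], whole[150:]

plain = encode_decode_trimmed(first, 0, HALF) + encode_decode_trimmed(second, 0, HALF)
padded = encode_decode_trimmed(first, PAD, HALF) + encode_decode_trimmed(second, PAD, HALF)

# Worst deviation from the DC level in the region around the cut.
join = range(150 - HALF, 150 + HALF)
worst_plain = max(abs(plain[i] - 1.0) for i in join)
worst_padded = max(abs(padded[i] - 1.0) for i in join)
```

In this model the dip at the join shrinks from roughly half the DC level to about 5% of it. It does not vanish completely, because the fade is still a departure from the true continuation of the signal, but the trimmed transitions absorb most of the edge effect.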