HydrogenAudio

Hydrogenaudio Forum => Scientific Discussion => Topic started by: Rescator on 2013-01-10 03:36:01

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-10 03:36:01
Pathological example of a intersample peak that was artificially created:

~0dB peak, ~-20dBFS RMS (squarewave), +10.87dB intersample peak, 44.1kHz, 32bit float.

http://www.hydrogenaudio.org/forums/index.php?showtopic=98752

Please keep any discussion of the test sample in this thread, rather than where it's simply "stored".


The problem:
If oversampled, the true peak is revealed to be almost +11dB.
A DAC would need 11dB headroom (or alternatively ~12dB which equals 2 bits) to handle this wav correctly.

The solution?:
A "quick fix" for a 24bit (or float) audio chain, would be to reduce the volume by 12dB somewhere.
Volume loss can be later compensated by simply increasing the analog volume (the user turning the knob a little higher).
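
As a rough check of the "12dB equals 2 bits" figure above, the conversion can be sketched in Python. The function names are made up for illustration; the only fact used is that one bit of word length corresponds to 20*log10(2), about 6.02dB.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit of word length


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def headroom_bits(db):
    """Bits of word length that `db` of headroom corresponds to (hypothetical helper)."""
    return db / DB_PER_BIT


print(round(headroom_bits(10.87), 2))  # 1.81
print(round(headroom_bits(12.0), 2))   # 1.99
```

So the +10.87dB peak needs a little under 2 bits, and rounding up to a whole 2 bits gives the ~12dB figure.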

*** The rest is somewhat opinionated. ***


Thoughts:
As such the "bottom 3 bits" of audio could be considered waste-able, 2 bits to handle pathological intersample peaks, + 1 bit due to quantization/noisefloor/dither.
A "24bit" DAC would have no issues, 21bits to use is a lot. Likewise a "20bit" DAC would still have 17bits to use.
Ideally the 11 (or 12) dB volume reduction would be done by the DAC just before the reconstruction stage.

Issues?:
For a 12dB headroom DAC one would need to crank up the playback volume, so such a DAC would sound quieter than most other DACs.
Noisefloor of the amplifier and other parts of the equipment/audio chain is also an issue.
But even "cheap" gear has around -80dBFS to -100dBFS noisefloor.
Also consider that a normal living room can easily have a 50dB noisefloor, so losing the quietest 12dB or so of the audio is not an issue.

So if taking CD audio as an example, a 12dB adjustment would cause the content in the -96dBFS to -84dBFS range to be lost.
The loss can be avoided by simply passing the 16bit audio as 24bit or 32bit float instead.
Under Windows Vista, Windows 7 and Windows 8 all audio is converted to 32bit float, so this is a non-issue.

How to avoid intersample peaks on gear without the needed headroom?:
On Windows you can simply make sure that you never raise the volume (in Windows) above -12dBFS (~45% volume),
and use the analog volume knob (if there is one on your system or gear) instead.

11dB really?
Yep! Then again this is a pathological example.
"Normally" the intersample peak is within 1dB of the digital peak, and in some rare cases up to 2 to 3dB higher.
If you make/master music, then the final mix/pressing master/encoding/exporting should have 2 or 3dB headroom.
So as long as no sample peaks go above -3dBFS you should be pretty darn safe from causing any clicks or distortion for the end user.

The example here is a pure spike, and humans tend not to like to listen to pops, clicks, static, test tones, or similar.
So encountering anything like this "in the wild" is very rare.

Is it really that bad?:
Please remember that intersample peaks do not damage equipment, at least I've never heard or read about such happening, and the CD was invented ages ago.
So if this was a practical issue we'd have heard about it a long time ago as equipment got fried etc. And we'd have had a solution years ago as well.

The only thing it does is damage the audio quality, that is if you actually can hear/notice it at all. You are more likely to hear crackling/distortion from overly compressed music.
And ironically it is that type of overly compressed music that has the most intersample peaks that go above 0dBFS.
Solution? Stop compressing the hell out of music. Use 20dB or more headroom and intersample peaks will most likely never be an issue.

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Glenn Gundlach on 2013-01-10 05:24:17
Quote from: Rescator
Pathological example of an intersample peak that was artificially created:

~0dB peak, ~-20dBFS RMS (squarewave), +10.87dB intersample peak, 44.1kHz, 32bit float.


Discussion thread for this sample: http://www.hydrogenaudio.org/forums/index.php?showtopic=98753

The red line is 0dBFS, the white line is -1dBFS, even without oversampling one can see the intersample peak is at ~+8.5dBFS, and settles on +10.87dBFS after 2x (or more) oversampling.


Does this ever happen in the real world or does the equipment have to be broken to do this?

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-10 06:51:25
http://forums.digitalspy.co.uk/showthread.php?p=56472825#post56472825
Quote from: Martin Watkins
From 1999 to 2003 the BBC adopted the very sensible policy on DSat of leaving about 12 dB of headroom above PPM 6 so that if peaks did slip through they were well handled.


Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Kees de Visser on 2013-01-10 06:57:39
At a first glance it seems that this test signal isn't properly band-limited and therefore isn't a valid signal in the strict sense. Although signals like this can be created in the digital domain, they won't appear in the output of an ADC with proper anti-aliasing filtering.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: lvqcl on 2013-01-10 14:52:09
http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=59392&view=findpost&p=533436

Theoretically, you can't guarantee such a limit unless the reconstruction filter's impulse response has a finite extent.

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-01-10 15:06:49
While we're quoting past debates, I remember one where I said that if someone releases audio with such huge inter-sample overs, then those audio fans who value the intentions of the original engineer should leave them as they are - the original engineer, in all likelihood, monitored a sound with horrible inter-sample clipping from a standard DAC which clips inter-sample overs.

The EBU loudness group is recommending to stay below -1 dBTP - i.e. measure the maximum inter-sample over, and ensure it is below -1dBFS.
section 3.4. of https://tech.ebu.ch/docs/tech/tech3344.pdf (https://tech.ebu.ch/docs/tech/tech3344.pdf)
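
A minimal sketch of what a true-peak (dBTP) measurement does: oversample the signal and take the largest absolute value found between as well as on the samples. The Hann-windowed sinc kernel, the 32-sample half width and the function names are assumptions for illustration; this is not the interpolation filter the EBU/ITU documents specify, so readings will differ slightly from a compliant meter.

```python
import math


def windowed_sinc(t, half_width):
    """Hann-windowed sinc kernel; zero for |t| >= half_width (assumed kernel)."""
    if abs(t) >= half_width:
        return 0.0
    if t == 0:
        return 1.0
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return window * math.sin(math.pi * t) / (math.pi * t)


def sample_peak_db(samples):
    """Largest absolute sample value in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def true_peak_db(samples, oversample=4, half_width=32):
    """Estimate the inter-sample peak in dB relative to full scale (1.0)."""
    count = len(samples)
    peak = 0.0
    for i in range((count - 1) * oversample + 1):
        t = i / oversample  # position in units of original samples
        first = max(0, math.floor(t) - half_width + 1)
        last = min(count, math.floor(t) + half_width + 1)
        value = sum(
            samples[n] * windowed_sinc(t - n, half_width)
            for n in range(first, last)
        )
        peak = max(peak, abs(value))
    return 20 * math.log10(peak) if peak > 0 else float("-inf")
```

With floating point samples scaled so that full scale is 1.0, a track would meet the recommendation above if true_peak_db(samples) came out at or below -1.0. Pure Python is slow, so this is only practical for short test signals.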

They're doing some other great work which I keep meaning to report on.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: John_Siau on 2013-01-10 15:10:55
When designing Benchmark's new DAC2 HGC D/A converter, we chose to add 3.5 dB of digital headroom to accommodate inter-sample overs. We are working with a 32-bit fixed-point conversion system, and a 32-bit fixed-point gain control. The conversion subsystem has a 133 dB SNR, so we can afford to throw away 3.5 dB SNR to eliminate the clipping of inter-sample overs.

A survey of our in-house music library showed inter-sample overs reaching peak levels of +1.5 to +2 dBFS worst-case.  However, please note that our entire library is ripped in lossless formats.  I suspect that inter-sample overs could be higher in amplitude, and more frequent, when the audio is reconstructed with an MP3 decoder.  Does anyone have test results for MP3 audio sources?

I believe 3.5 dB of headroom above 0 dBFS is sufficient to handle all continuous waveforms, including square waves.  Can anyone provide examples or calculations to prove otherwise?

3.5 dB of headroom should also be more than sufficient to handle music (but my tests are limited to lossless rips at standard sample rates between 44.1 and 192 kHz).

The example cited in this thread is a high-amplitude high-frequency transient - something we are unlikely to see in a typical recording. It should not be necessary to provide the full 11 dB of headroom required for this pathological example.
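
One calculation that bears on the question above, limited to steady sine waves with a whole number of samples per cycle (that restriction is an assumption of this sketch; square waves and transients are not covered): for each starting phase, compare the largest sample value with the sine's real peak of 1.0 and keep the worst ratio. The function name and the 0.1 degree phase grid are made up for illustration.

```python
import math


def worst_case_over_db(samples_per_cycle, phase_steps=3600):
    """Worst true-peak to sample-peak ratio (dB) for a sine at fs / samples_per_cycle."""
    worst_ratio = 0.0
    for step in range(phase_steps):
        phase = 2 * math.pi * step / phase_steps
        # Largest sample magnitude over one full cycle at this phase offset.
        sample_peak = max(
            abs(math.sin(phase + 2 * math.pi * n / samples_per_cycle))
            for n in range(samples_per_cycle)
        )
        worst_ratio = max(worst_ratio, 1.0 / sample_peak)
    return 20 * math.log10(worst_ratio)


print(round(worst_case_over_db(4), 2))  # 3.01 (sine at fs/4, sampled 45 degrees off its peaks)
```

A full-scale-normalised sine at fs/4 can therefore read about +3.01 dB above its sample peak, which sits just inside a 3.5 dB margin.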
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: saratoga on 2013-01-10 16:07:54
Quote from: John_Siau
I suspect that inter-sample overs could be higher in amplitude, and more frequent, when the audio is reconstructed with an MP3 decoder. Does anyone have test results for MP3 audio sources?


This is true. For rockbox we needed an extra dB or two IIRC for lossy audio to account for rounding errors (since we wrap around rather than clip!).

That said, if your system can clamp rather than wrap around, I doubt lossy is a big concern. I've never seen evidence that clipping the quantization error added by mp3 is audible.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bennetng on 2013-01-10 17:19:39
I've seen +4.8dB intersample peak in a song, but it is in mp3 format.

Legal file by an indie artist, feel free to download the whole song

http://zonble.net/MIDI/orz.mp3 (http://zonble.net/MIDI/orz.mp3)

EDIT: I would say after decoding the mp3 into 32-bit float wav the file has a +4.8dB peak, so the case may not be directly relevant to this discussion.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-01-10 18:56:29
I don't see how on-sample values above 0dB FS in an mp3 file are relevant to the DAC - they can't get to the DAC (unless the mp3 decoder is built in to the DAC, or you re-define "FS").

I guess some (most?) clipressed/trashed music, encoded to mp3, decoded and re-clipped in/after the decoder at 0dB FS, and then fed to a DAC may have higher inter-sample overs than the same music in original lossless form - but I wonder if the worst case inter-sample overs all come from mp3 encoded tracks?

Beware of mp3s of unknown provenance - some people increase the gain of their mp3s after encoding, meaning you have something that could never have been encoded from LPCM at that level.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bennetng on 2013-01-10 19:26:55
I just generated another synthetic example waveform, but yes, we can hardly see this in real life, for discussion only.
http://www.sendspace.com/file/rg2ur5 (http://www.sendspace.com/file/rg2ur5)


Analog loopback of the above waveform with my soundcard:
http://www.sendspace.com/file/ha7tl2 (http://www.sendspace.com/file/ha7tl2)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-11 10:27:24
Quote from: John_Siau
A survey of our in-house music library showed inter-sample overs reaching peak levels of +1.5 to +2 dBFS worst-case. However, please note that our entire library is ripped in lossless formats. I suspect that inter-sample overs could be higher in amplitude, and more frequent, when the audio is reconstructed with an MP3 decoder. Does anyone have test results for MP3 audio sources?

I believe 3.5 dB of headroom above 0 dBFS is sufficient to handle all continuous waveforms, including square waves.  Can anyone provide examples or calculations to prove otherwise?

3.5 dB of headroom should also be more than sufficient to handle music (but my tests are limited to lossless rips at standard sample rates between 44.1 and 192 kHz).


I'm seeing similar figures mentioned elsewhere too; normally you would never see above +3. Usually it's the same as the sample peak or +1 to +2, very rarely +3, and above +3 probably almost never, so 3.5 is a good margin. Anything above that is either test/synthetic material like my test sample, or a single pop (or similar) that an end user rarely hears (usually data corruption during transmission). Although vinyl (being not just analog but mechanical) could cause high intersample peaks by mistake. (vinyl music is usually 40Hz-16KHz)

I'm curious about raw number tests as well, but I can't find an R128 scanner with True Peak and log generation for later processing.
If I do find such a tool, I'd be happy to scan and provide the results, as I'm sure others would be.

I tried with Sox, but with upsampling (if that's the correct way, or is rate the better option?), even if it's correct, the stats option seems to clip the peak at full scale, and there seems to be no way to change that. Using the vol option to reduce by 12dB before the upsample does make things better, but at -5.20dB (+6.80dB once the 12dB is added back) it is still nowhere close to the actual 10.87dB.
Shame, as the stats that Sox outputs are otherwise pretty nice.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bandpass on 2013-01-11 12:13:58
$ sox InterSamplePeak.wav -n gain -11.9 stats

Pk lev dB      -12.00


$ sox InterSamplePeak.wav -n gain -11.9 rate 441k stats

Pk lev dB      -5.24


$ sox InterSamplePeak.wav -n gain -11.9 rate -vb 99.7 441k stats

Pk lev dB      -0.08
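
Because each command above applies gain -11.9 before measuring, the reading has to have that gain removed again to get the level of the untouched file. A small sketch of the arithmetic, using only the three readings shown above (the function name is made up):

```python
APPLIED_GAIN_DB = -11.9  # the `gain -11.9` effect used in the sox commands above


def original_level_db(measured_db, applied_gain_db=APPLIED_GAIN_DB):
    """Undo the gain applied before measurement (hypothetical helper)."""
    return measured_db - applied_gain_db


print(round(original_level_db(-12.00), 2))  # -0.1   no resampling: the sample peak
print(round(original_level_db(-5.24), 2))   # 6.66   rate 441k with default settings
print(round(original_level_db(-0.08), 2))   # 11.82  rate -vb 99.7 441k
```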
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Banned on 2013-01-12 11:24:52
Quote from: Rescator
it is still nowhere close to the actual 10.87dB.

What is "actual" here? It depends on what interpolation formula you use. I could say that your figure of 10.87dB is nowhere close to actual 15.59dB (obtained by using unwindowed sinc interpolation). This is as actual as it gets.

http://imgur.com/uQctz
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-12 12:14:32
Quote
it is still nowhere close to the actual 10.87dB.

What is "actual" here? It depends on what interpolation formula you use. I could say that your figure of 10.87dB is nowhere close to actual 15.59dB (obtained by using unwindowed sinc interpolation). This is as actual as it gets.


Nice! Do you have a test wav? After all, that is the purpose of my test wav: to see how/if the intersample peak is detected by software, and if so what it's measured at. The +10.87dB is from Adobe Audition 1.5, with 999 quality setting and post processing off, tested with 2,4,8,16,32,64x resampling. (Note! Audition 1.5 crashed when trying to resample to 128x, but the other checks were very consistent (i.e. the same) +/- 0.01dB)
If you meant a peak scanner showed +15.59dB then I'm very curious indeed as to what tool that is, if it's a sample you are able to generate then that is awesome as the highest I could make was 10.87dB, though that was hand edited rather than mathematically generated. Color me curious.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-12 12:22:44
Quote from: bennetng
I just generated another synthetic example waveform, but yes, we can hardly see this in real life, for discussion only.

Analog loopback of the above waveform with my soundcard:


Wow! Although the intersample peaks do not look that bad, I'm more surprised by what is going on with the waveform itself, it seems to be increasing in volume, what is  actually going on there with your gear?
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Banned on 2013-01-12 14:54:24
Quote from: Rescator
Nice! Do you have a test wav? After all, that is the purpose of my test wav: to see how/if the intersample peak is detected by software, and if so what it's measured at. The +10.87dB is from Adobe Audition 1.5, with 999 quality setting and post processing off, tested with 2,4,8,16,32,64x resampling. (Note! Audition 1.5 crashed when trying to resample to 128x, but the other checks were very consistent (i.e. the same) +/- 0.01dB)
If you meant a peak scanner showed +15.59dB then I'm very curious indeed as to what tool that is, if it's a sample you are able to generate then that is awesome as the highest I could make was 10.87dB, though that was hand edited rather than mathematically generated. Color me curious.

Example wav is here: http://filesmelt.com/dl/upsample1.wav (88200 Hz, 32-bit float). It should be your wav upsampled using windowed sinc with a very large window - it almost doesn't affect the peak value. Check it yourself, I can't trust myself. I just had a funny bug when constructing the filter kernel for 2x upsampling - I stuffed zeros in every second value. Though it doesn't change the peak value here.
As I said already, peak value entirely depends on the interpolation filter used in the upsampler. Though no reasonable filter I can think of (not many) should give more than 15.59dB here. If you make your sample longer ( I don't mean stuffing it with zeros at beginning or end), you can get even bigger peak with a suitable upsampler.
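
The point about longer samples giving bigger peaks can be illustrated with unwindowed sinc interpolation. In this sketch (an illustration only, not a reconstruction of either test file) every sample is at full scale, +1 or -1, with its sign chosen to match the sign of the sinc weight it gets at the midpoint t = 0.5, so all contributions add up. The function names are made up.

```python
import math


def sinc(t):
    """Normalised sinc: sin(pi t) / (pi t)."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)


def worst_case_midpoint_db(n_samples):
    """Interpolated level at t = 0.5 for n_samples adversarial full-scale samples."""
    half = n_samples // 2
    total = 0.0
    for n in range(-half + 1, half + 1):
        weight = sinc(0.5 - n)
        sample = 1.0 if weight >= 0 else -1.0  # full scale, sign chosen to add up
        total += sample * weight
    return 20 * math.log10(total)


for length in (2, 16, 128, 1024, 8192):
    print(length, round(worst_case_midpoint_db(length), 2))
```

The printed level keeps rising as the length grows (roughly with the logarithm of the length), which is why no fixed headroom figure can cover every possible sample sequence under ideal sinc reconstruction.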
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bennetng on 2013-01-12 15:36:53
Quote
I just generated another synthetic example waveform, but yes, we can hardly see this in real life, for discussion only.

Analog loopback of the above waveform with my soundcard:


Wow! Although the intersample peaks do not look that bad, I'm more surprised by what is going on with the waveform itself, it seems to be increasing in volume, what is  actually going on there with your gear?


My soundcard works fine. It gives excellent results in RMAA.
You will get a similar result if you upsample the original file with a good resampler like sox, adobe audition, foobar's PPHS and so on, try it yourself
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-12 17:14:32
Quote from: Banned
Example wav [...] It should be your wav upsampled using windowed sinc with a very large window - it almost doesn't affect the peak value.


That is not the same waveform; whatever upsampling method you used, it significantly altered the waveform. Audition retained the original waveform shape regardless of what samplerate it resampled to.

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Banned on 2013-01-12 19:52:29
Quote
Example wav [...] It should be your wav upsampled using windowed sinc with a very large window - it almost doesn't affect the peak value.


That is not the same waveform, whatever upsampling method you used it significantly altered the waveform. Audition retained the original waveform shape regardless what samplerate it resampled to.

Before we can have a meaningful discussion, we need to agree about terms. Specifically:
- what is "true value" between samples. It can only be obtained by interpolation. Many different interpolation filters are possible. You seem to think that what's built into Audition is Final Truth™. I disagree - it's only a practical compromise. In my opinion, if we are to use the term "true value", we should define it as the value given by the Whittaker-Sannon interpolation formula, as it offers perfect reconstruction for signals satisfying the conditions of the sampling theorem.
- what is "waveform shape" and when it becomes sufficiently different. Your own post says that before upsampling the amplitude is 8.5dB, and 10.87 after. I think that's sufficiently different.
Please also do a null test.
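
For reference, a null test in its simplest form subtracts one signal from the other and looks at what is left. A minimal sketch, assuming both signals are lists of floats at the same sample rate, already time-aligned and at the same gain (the function name is made up):

```python
import math


def null_residual_db(signal_a, signal_b):
    """Peak of the sample-by-sample difference, in dB relative to full scale (1.0)."""
    residual = max(abs(a - b) for a, b in zip(signal_a, signal_b))
    return 20 * math.log10(residual) if residual > 0 else float("-inf")


# Identical signals null completely.
print(null_residual_db([0.5, -0.5, 1.0], [0.5, -0.5, 1.0]))  # -inf
```

To compare a 2x upsample against the 44.1kHz original, one option is to pass every second sample of the upsample (upsampled[::2]) as the second argument; that only makes sense if the resampler adds no delay, or if the delay has been trimmed off first.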
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-12 21:53:30
Quote
Example wav [...] It should be your wav upsampled using windowed sinc with a very large window - it almost doesn't affect the peak value.
That is not the same waveform, whatever upsampling method you used it significantly altered the waveform. Audition retained the original waveform shape regardless what samplerate it resampled to.
Before we can have a meaningful discussion, we need to agree about terms. Specifically:
- what is "true value" between samples. It can only be obtained by interpolation. Many different interpolation filters are possible. You seem to think that what's built into Audition is Final Truth™.

Stop acting like a nincompoop, I never said final truth or anything like that, please do not try to imply I said something that I did not actually say. As for talking about Audition please see further below.

Quote
if we are to use the term "true value", we should define it as the value given by the Whittaker-Sannon interpolation formula, as it offers perfect reconstruction for signals satisfying the conditions of the sampling theorem.

Again, I never said true value. I said true peak, a term I did not make up but which the industry has been using for quite some years now; do not point the finger at me on that one.
I also assume you mean Whittaker-Shannon ? I am not familiar with that, nor do I have a way to practically test that. And be careful to claim "perfect", people get spanked for less on HA. 

Quote
- what is "waveform shape" and when it becomes sufficiently different. Your own post says that before upsampling the amplitude is 8.5db, and 10.87 after. I think that's sufficiently different.


Yes! Because detection of intersample peaks is only possible after upsampling. The 8.5 was from looking at the waveform rendering. After upsampling it matches visually (on the dB scale) with the peak analysis.

As to the waveform shape and me referring to Audition it's simple. I created the test wav by hand in Audition. I also tested upsampling. (999 quality, and no pre/post filter) If any pre/post filtering is done the waveform is altered.

To see what I mean look at this:
Original 44.1KHz waveform http://imageshack.us/photo/my-images/163/originaly.png/ (http://imageshack.us/photo/my-images/163/originaly.png/)
Upsampled/resampled to 88.2KHz with Audition 1.5, 999 quality, no pre/post filter. http://imageshack.us/photo/my-images/571/upsampled.png/
And this is yours http://imageshack.us/photo/my-images/96/upsampledb.png/ (http://imageshack.us/photo/my-images/96/upsampledb.png/)

To my eyes it is clear which upsample simply interpolated, and which actually altered the shape of the wave.

Quote
Please also do a null test.


I just did, but Audition is being an ass, seems that regardless what settings I'm using it insists on reducing the intersample peak to around 0dB. (Normally I'd welcome this, but it ruins a null test)
Regardless, the result is a waveform shape almost identical to the first two. (with the variation being how extreme the intersample peak actually is).
Note! There is no windowing (or if there is it's fixed) to be set here, not sure if that is significant.

I'm not sure what you are getting so upset for here. But we're getting slightly off topic now. This thread is not about Audition.
This thread is about extreme intersample peaks, an example of such for test cases, and detecting such, the headroom needed to handle it (if needed at all).

I'd also like to see an EBU R128 compliant test of the original wav, as I'm curious how (and by how much) its 4x upsampling true peak detection picks it up, but sadly Sox does not have this and I'm still trying to find a tool that does.

And don't get me wrong, I welcome other test wavs or examples of extremes, especially if they are even more extreme than my wav.
It's just that you claimed that your wav was an upsample of mine, which may be true, but the interpolation is way off on yours.
Looking at the spectral view I see that my original (and its upsample) has its energy spread evenly through all frequency bands, with some minor clustering near 44.1 (both original and upsample), as does yours, but next is where things differ.
The length of the original wave is ~0:00.220, and the length of my upsampled wave is the same. The length of your upsampled wave is ~0:03.823, a duration increase of over 3 and a half seconds.
You may have upsampled the original, but the result is not just an upsample, but a modification of the wave. If the wave is different then obviously the intersample peak(s) will be as well.

Now if you excuse me, I'm off to battle with Sox, (@bandpass, thanks for the Sox settings tip BTW!)
it seems to be morphing the wave a little during upsampling as well. (or perhaps it's resampling when I really should have it upsample instead?)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bandpass on 2013-01-12 22:17:26
Here's how to do the EBU upsampling/filtering with sox: http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85978&view=findpost&p=816856
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-12 22:22:28
Quote from: bandpass
Here's how to do the EBU upsampling/filtering with sox: http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85978&view=findpost&p=816856


Dude! Did anyone tell you, you rock? Thanks!
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-13 23:51:57
bandpass, 2bdecided, saratoga and John Siau will hopefully find this interesting.


4th post here http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=98752&view=findpost&p=820447
has a .CSV with the raw data from a scan.

Here are the highlights:
Quote
5824 tracks.
Peak -1.18 dBFS (Min -28.03, Max 0.00)
ISP -0.45 dBFS (Min -28.02, Max 9.54)
ISP/Peak Delta -1.63 dB (Min -56.05, Max 9.54)
RMS -16.89 dBFS (Min -46.18, Max -6.34)

27.59% ISP <-1 dBFS
72.41% ISP >-1 dBFS

44.33% ISP -1 to 0 dBFS
19.95% ISP 0 to 1 dBFS
3.71% ISP 1 to 2 dBFS
0.79% ISP 2 to 3 dBFS
0.21% ISP 3 to 4 dBFS
0.34% ISP 4 to 5 dBFS
0.31% ISP 5 to 6 dBFS
1.56% ISP 6 to 7 dBFS
1.18% ISP 7 to 8 dBFS
0.00% ISP 8 to 9 dBFS
0.02% ISP 9> dBFS


The tracks are of all genres from "typical" to really weird. WAV, FLAC, Mp3, AAC/MP4, Ogg, 44.1KHz & 48KHz, 16 and 24/32bit. Soundtracks, pop, techno, anime, classic, game, computer music, my own composed music, standup, stereo and mono tracks, multi-channel tracks, spanning multiple decades. It took 4+ hours using 6 cores each at 100% (would have taken 24hrs if only one core had been used).
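
A breakdown like the one quoted above can be produced from a list of per-track ISP readings with a few lines of Python. This is only a sketch: the function name, the 1dB-wide buckets and the way the ends are labelled are assumptions, and reading the values out of the .csv is left out because its column layout is not described here.

```python
import math
from collections import Counter


def isp_histogram(isp_values_db, lowest=-1, highest=9):
    """Percentage of tracks per 1 dB bucket of intersample peak level."""
    if not isp_values_db:
        return {}
    counts = Counter()
    for value in isp_values_db:
        if value < lowest:
            label = f"<{lowest}"
        elif value >= highest:
            label = f">={highest}"
        else:
            lower = math.floor(value)
            label = f"{lower} to {lower + 1}"
        counts[label] += 1
    total = len(isp_values_db)
    return {label: 100.0 * count / total for label, count in counts.items()}


# Made-up readings, only to show the call:
print(isp_histogram([-3.0, -0.5, 0.2, 0.7, 9.54]))
```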

I am really surprised that only around 30% of the tracks are actually below the -1.0 dBFS for "true peaks" that EBU R128 wants.
I am even more surprised to see around 20% in the 0 to +1 dBFS range, as that is (in my opinion) a rather high percentage of ISP "overs" to pass to a DAC or an audio chain.
The 3.71% in the +1 to +2 range and the 0.79% in the +2 to +3 range is very worrying, but a DAC like that mentioned by John Siau should handle this fine.

Where it really gets creepy is the >3dBFS ISPs: 3.62% in total in the +3 to >+9 dBFS range, with the highest ISP at 9.54.
Those are probably really messed up tracks. Stupidly enough I did not also log the filepaths (I just created the ids), so I can't easily find out which tracks are this bad.
I may rescan everything in the future (maybe with better tools; an R128 scan and log tool would be ideal for this stuff), and if the tracks were commercial tracks I'll post which ones (and if it turns out to be my music I'll provide samples for testing).

Note! I used an intermediary tool that called sox; the tool itself was called/used by foobar2000 as if it was an encoder/converter, and 32bit wav audio data was passed to sox. So it's unlikely the very high ISPs detected are corrupted file data, or if they are then foobar2000 treated it as part of the audio, in which case it matches real world situations (playing of corrupted data as audio).

Hopefully you guys find the .csv interesting, 5824 tracks is a rather large sample of data and thus hopefully useful.
Maybe others could scan and make similar .csv files, and we can start gathering large amounts of data for number crunching (a foobar plugin or a standalone scan tool would make this easier for folks, *hint hint* !).
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-01-14 11:47:08
Quote from: Rescator
To see what I mean look at this:
Original 44.1KHz waveform http://imageshack.us/photo/my-images/163/originaly.png/
Upsampled/resampled to 88.2KHz with Audition 1.5, 999 quality, no pre/post filter. http://imageshack.us/photo/my-images/571/upsampled.png/
And this is yours http://imageshack.us/photo/my-images/96/upsampledb.png/ (http://imageshack.us/photo/my-images/96/upsampledb.png/)

To my eyes it is clear which upsample simply interpolated, and which actually altered the shape of the wave.
No, you can't say that from this view. I assume your version of Audition works the same as my old Cool Edit Pro - when zoomed out, the solid envelope is formed from the maximum excursion of the actual sample values, no interpolation. The whole point of this topic is that actual sample values can be very different from nearby reconstructed values (and sample value peaks can be very different from intersample peaks). It's wrong to claim that a resampling algorithm is better because the between-original-sample reconstructed (resampled) sample values are closer to the original sample values.

"Banned" is right to say that sinc reconstruction is theoretically perfect in this sense, and with a short audio sample it's not that difficult to get as close to theoretical perfection as you wish - bounded only by the rounding error in the mathematical operations you use, and (if you're seeking the absolute true peak) the number of discrete inter-sample time points you wish to calculate. With any time-bound audio signal this "perfect" reconstruction is possible, though with longer signals a) it's a pain, and b) it's pointless (not least because the sinc function falls off to irrelevantly small values pretty quickly compared with, say, the length of a typical music track).

Whether "Banned" is performing the sinc reconstructions correctly or not, I don't know - I haven't tried it.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-01-14 12:00:15
Quote from: Rescator
bandpass, 2bdecided, saratoga and John Siau will hopefully find this interesting.

4th post here http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=98752&view=findpost&p=820447
has a .CSV with the raw data from a scan.

Thanks for this.

Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

I see two series emerge - one effectively a Y=X line, but another a Y=X+8dB line. What's that? Am I doing something wrong, or not thinking, or is there an error in some of the data (maybe in the decoding of one format)?

From the X=Y line, and where the points start to depart from it (i.e. where inter-sample peaks start to become significantly higher than on-sample peak values), it seems that inter-sample peaks are (mostly) only a problem for tracks in your collection with on-sample peaks above -1dB. i.e. it's only tracks that were (nearly) clipped which generate intersample overs (with a small handful of exceptions).
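
Without a plotting tool, the two series can also be separated numerically by looking at the difference between intersample peak and on-sample peak for each track. A rough sketch: the (sample peak, ISP) pairs are assumed to have been read from the two CSV columns being plotted, and the 6dB threshold for "belongs to the upper series" is an arbitrary assumption, as is the function name.

```python
def split_by_delta(rows, suspect_threshold_db=6.0):
    """Split (sample_peak_db, isp_db) pairs by how far the ISP sits above the sample peak."""
    plausible, suspect = [], []
    for peak_db, isp_db in rows:
        delta = isp_db - peak_db
        target = suspect if delta >= suspect_threshold_db else plausible
        target.append((peak_db, isp_db, delta))
    return plausible, suspect


# Made-up rows, only to show the call:
plausible, suspect = split_by_delta([(-0.5, -0.3), (-3.0, 5.0), (0.0, 0.8)])
print(len(plausible), len(suspect))  # 2 1
```

Listing the suspect rows next to their file formats would be one way to check whether the upper series comes from a single decoder.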

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-14 15:49:26
Quote from: 2Bdecided
Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

No! I do not have MATLAB or whatever (can't afford to pay thousands).
A quick search for tools on the net turned up nothing for Windows (and I'm not keen to start compiling gnuplot, or doing CSV to JSON conversion to use Google's API etc.).
So if you could point me to a tool for windows, or show an image that would be helpful

As to the anomalies you seem to point out. Sox just processed whatever foobar gave it.
It is possible that either foobar passed bad audio/wav data to sox (my code did not alter any data at all, clean passthrough), or that Sox choked on some bitdepth/frequency/channel combo.
I probably won't rerun this test anytime soon (6 cores at 100% for 4 hours is a little on the heavy side) until a proper toolset is available. Maybe the R128 gain tool could be modified to append csv data lines to a file.

Scanning for "true peaks" the way that EBU R128 recommends is of particular interest obviously.
I could make a tool myself, but I have not found any example code on upsampling and peak scanning, and I don't really read "math" equations.
Making the tiny tool, hooking it up between foobar and sox, and running the scan took me a day. If I'm to spend any more time then it's better to do it right, read up and code a proper program, and then it's suddenly about a week (or more) of work involved (modifying an existing tool might be easier for existing maintainers; for me it would probably take a week or so, if unlucky, to learn/read the existing code enough to modify it).

I'd love to see hundreds of people on HA scan their collections, provide the csv and then someone can do some serious number crunching and present the results.
Depending on how many would do the scan and the size of people's collections/test size, the resulting data could number anywhere from a hundred thousand to a million tracks, which is definitely statistically significant.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bandpass on 2013-01-14 16:06:31
Using LibreOffice (free and available on Windows):

(http://i50.tinypic.com/jzz1w5.png)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-01-14 16:20:01
Yep, that's exactly it.

(I just used Excel).

That upper "series" (Y=X+8dB) must be wrong, and is contaminating your results for the number of intersample overs above 0dB FS.


I'm coming at this from the other direction - I suspect that, in the context of EBU R128, consideration of intersample peak is irrelevant for content that reaches the consumer. Any consumer-targeted audio track that is loudness matched to -23LUFS is very unlikely to have any content near clipping, and as long as the actual samples sit below 0dB FS I bet the intersample peaks are safe too (except on a track intentionally created to disprove this statement!).

For pop CDs, which are often 10-15dB louder than EBU R128 requires, and often smashed/clipressed to be as loud as possible, then of course intersample overs are a real issue.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: John_Siau on 2013-01-14 17:22:18
bandpass, 2bdecided, saratoga and John Siau will hopefully find this interesting.

The 3.71% in the +1 to +2 range and the 0.79% in the +2 to +3 range is very worrying, but a DAC like that mentioned by John Siau should handle this fine.

Where it really gets creepy is the >3dBFS ISPs, 3.62% total in the +3 to >+9 dBFS range.

Hopefully you guys find the .csv interesting, 5824 tracks is a rather large sample of data and thus hopefully useful.

Wow, nice work.  I am somewhat surprised to see anything over +3.1 dB.  It would be very nice to take a closer look at the tracks that exceed +3 dB.  It would also be interesting to compare raw tracks to mp3 versions of the same track.  I suspect that mp3 compression and reconstruction may increase the occurrence of inter-sample overs (due to phase distortions in the mp3 compression process).

The Benchmark DAC2 HGC can handle a +3.5 dB inter-sample peak without clipping while the gain control is fully clockwise.  It can tolerate higher levels when the gain control is rotated to a lower gain setting.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: John_Siau on 2013-01-14 17:40:45
The "+11 dB" test signal (that started this thread) is proving very useful for testing the overload characteristics of DSP code.  If the DSP process is working properly, the inter-sample peak should pass at full amplitude (when there is sufficient headroom), or should be clipped when there is insufficient headroom.  The ES9018 D/A conversion IC seems to invert the inter-sample peak in some modes of operation.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-14 18:37:02
The "+11 dB" test signal (that started this thread) is proving very useful for testing the overload characteristics of DSP code.  If the DSP process is working properly, the inter-sample peak should pass at full amplitude (when there is sufficient headroom), or should be clipped when there is insufficient headroom.  The ES9018 D/A conversion IC seems to invert the inter-sample peak in some modes of operation.

Cool to hear, although it's rare to see such in normal music, I guess it's nice to be able to test how software/hardware handles the outliers (where usually odd things can occur).
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2013-01-14 19:29:26
(@bandpass Darn you, stop teaching me new tricks. PS! Libre crashed 3 times trying to plot this stuff. Heh! and thanks, had no idea Libre or OO could do that...)

I'm coming at this from the other direction - I suspect that, in the context of EBU R128, consideration of intersample peak is irrelevant for content that reaches the consumer. Any consumer-targeted audio track that is loudness matched to -23LUFS is very unlikely to have any content near clipping, and as long as the actual samples sit below 0dB FS I bet the intersample peaks are safe too (except on a track intentionally created to disprove this statement!).

For pop CDs, which are often 10-15dB louder than EBU R128 requires, and often smashed/clipressed to be as loud as possible, then of course intersample overs are a real issue.

*nod* My last 3 released albums have an RMS of around -23 dBFS; future projects of mine will "target" an RMS of around -26 dBFS, as that seems to be pretty close to EBU R128's -23 LUFS.

That upper "series" (Y=X+8dB) must be wrong, and is contaminating your results for the number of intersample overs above 0dB FS.


Yeah! *sigh* Looks like I have to revisit this later as this is really irking me now.
Check this out: http://imageshack.us/photo/my-images/688/scanuy.jpg/ (http://imageshack.us/photo/my-images/688/scanuy.jpg/)
Sorted by peak from lowest to highest.

Top left are peaks, bottom left is RMS, and the right/big one is the intersample peaks.
Both the Peak and RMS seem to correlate as expected, and even as the peaks max out (at 0.0 dBFS) the RMS shows the continued squashing going on.
And the ISP seems to match (if we ignore that "shadow" hanging over it there for a moment), and it's not until the very last ~0.70% of tracks that the ISP's go above 3.5 dBFS.

But back to that shadow (or cloud is perhaps more appropriate) hanging there, if one assumes they are 8dB "off" and adjust them "down" then they seem to match with the rest of the curve. Which is most likely correct.

Then again something else may be going on. I'll definitely get back to this again later (with an updated/corrected csv for you guys), I just do not know when. If I'm going to waste a day on this again I might as well make sure it's correct, and that any sox errors/failures or foobar2000 issues can be handled. I'll also grab more data (like channels, codec (mp3, flac, wav, ogg, m4a, etc.), bitdepth, frequency, and anything else I can think of/grab at the same time).
And if that anomaly rears its head again, I'll make sure I track the filepaths so I can check the offenders if it's either damaged files/bad encodings or something else. (my guess is it was weird output from sox that my tool wasn't programmed to parse).

For those curious, it looked for "Pk lev dB" and "RMS lev dB" from the first stat and just "Pk lev dB" from the second stat. And only the first number was grabbed (for multichannel 2 or more numbers would be presented and intentionally ignored), any sox output that did not contain this info would get ignored.
Also if any Pk or RMS was NOT grabbed properly but still went into the csv then those will show up as either -999 or +999 values, and I see no such values, so it was either the wav passed to sox or sox itself that provided dodgy data.
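
The parsing described above might look roughly like the following. This is only a sketch in Python, not the actual tool: the function names are hypothetical, and the sample text is made-up illustrative data laid out the way sox's stats effect prints its table (label first, then one column per channel, with the overall figure first).

```python
import re

# A number as printed in the stats table, or "-inf" for digital silence.
NUMBER = re.compile(r"-?(?:inf|\d+(?:\.\d+)?)")

def values_for(stats_text, label):
    """First-column value of every line starting with `label`,
    one entry per stats block, in the order the blocks appear."""
    found = []
    for line in stats_text.splitlines():
        line = line.strip()
        if line.startswith(label):
            match = NUMBER.search(line[len(label):])
            found.append(float(match.group(0)) if match else None)
    return found

def parse_track(stats_text):
    """Peak and RMS from the first stats block, ISP from the second.
    Returns None when the expected lines are missing, so the caller
    can skip the track instead of logging a bogus value."""
    peaks = values_for(stats_text, "Pk lev dB")
    rms = values_for(stats_text, "RMS lev dB")
    if len(peaks) < 2 or not rms or None in peaks[:2] or rms[0] is None:
        return None
    return {"peak_db": peaks[0], "rms_db": rms[0], "isp_db": peaks[1]}

# Illustrative text only, not real output from the scan.
SAMPLE = """\
             Overall     Left      Right
Pk lev dB      -0.10     -0.10     -0.35
RMS lev dB    -14.20    -14.05    -14.36
             Overall     Left      Right
Pk lev dB       1.27      1.27      0.98
RMS lev dB    -14.21    -14.06    -14.37
"""

print(parse_track(SAMPLE))
```

Returning None for unparseable output, rather than a sentinel such as 999, makes it harder for a bad value to slip into the csv unnoticed.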

But I will revisit this, I can not promise when though. I need to set aside a day, and if possible use a different tool than Sox (using foobar2000 as the "decoder" is very practical); peak, RMS and some way to get ISPs or upsample and gather the peaks is all I need. Heck, even an upsampler with support for piping is all I need, I can code a 32bit or even 64bit float peak scanner myself fairly quickly.

I could probably code something similar to sox's "upsample 4 sinc -a 40 -t 8k -24k" if I got some pointers/help though. (no idea how/where to start making an upsampler at all, any ANSI C code out there?).
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Wombat on 2013-01-14 21:31:12
After reading about that topic a bit I found that iZotope offers a feature built into their limiter that has intersample detection for "True Peaks".
Since Alexey Lukin is a member here and to my understanding is part of the iZotope team, he may give us some idea how they reached their conclusion of the peaks in music hitting above 3dB. Interesting is that their limiter now seems to be able to prevent this directly while mixing it hot.
http://www.izotope.com/support/help/ozone/...s_maximizer.htm (http://www.izotope.com/support/help/ozone/pages/modules_loudness_maximizer.htm)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bandpass on 2013-01-15 08:57:48
if that anomaly rears its head again, I'll make sure I track the filepaths so I can check the offenders if it's either damaged files/bad encodings or something else.

I'd do this as a matter of course as it's needed to further investigate this anomaly, any other anomaly that might occur in future, and any track with an otherwise interesting result.

I could probably code something similar to sox's "upsample 4 sinc -a 40 -t 8k -24k" if I got some pointers/help though.

With the above in place I don't think it will be difficult to find and fix the problem, but otherwise see http://www.dspguru.com/dsp/faqs/multirate/interpolation (http://www.dspguru.com/dsp/faqs/multirate/interpolation) (includes c-source) and http://www.itu.int/dms_pubrec/itu-r/rec/bs...;!PDF-E.pdf (http://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1770-3-201208-I!!PDF-E.pdf) for the filter coefs.
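
For anyone wanting to see the idea before reading those links, here is a minimal sketch of "upsample, then scan for the peak". It is Python rather than ANSI C, and it uses a generic Hann-windowed sinc as the interpolator, which is an assumption made for brevity and not the BS.1770 coefficient table; the function names and the half_width default are hypothetical too. It is far too slow for scanning a whole collection, but it shows the principle.

```python
import math

def windowed_sinc(t, half_width):
    """Hann-windowed sinc at offset t (in input-sample units)."""
    if abs(t) >= half_width:
        return 0.0
    if t == 0.0:
        return 1.0
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return window * math.sin(math.pi * t) / (math.pi * t)

def to_db(value):
    return 20.0 * math.log10(value) if value > 0 else float("-inf")

def sample_peak_db(samples):
    """Ordinary on-sample peak in dBFS."""
    return to_db(max(abs(s) for s in samples))

def true_peak_db(samples, factor=4, half_width=16):
    """Estimate the inter-sample peak in dBFS by evaluating the
    interpolated waveform at `factor` points per input sample."""
    peak = 0.0
    n = len(samples)
    for i in range(n * factor):
        t = i / factor
        centre = int(math.floor(t))
        lo = max(0, centre - half_width + 1)
        hi = min(n, centre + half_width + 1)
        acc = 0.0
        for k in range(lo, hi):
            acc += samples[k] * windowed_sinc(t - k, half_width)
        peak = max(peak, abs(acc))
    return to_db(peak)

# A sine at fs/4 sampled 45 degrees off its crests: every sample is
# +/-0.707, but the waveform between the samples reaches full scale.
tone = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(64)]
print(f"sample peak {sample_peak_db(tone):+.2f} dBFS")
print(f"true peak   {true_peak_db(tone):+.2f} dBFS")
```

A longer kernel or a higher oversampling factor gets closer to the ideal reconstruction, which is also why different tools can report slightly different true peaks for the same file.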
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: knutinh on 2013-02-04 08:20:56
Since inter-sample overshoot is a problem for the analog stage of a DAC, what would happen at the corresponding stage in an ADC, assuming the same analog/digital waveform? Would it clip (producing different digital samples, implying that the samples can only be created digitally), or would it just pick non-clipped samples? I guess that depends on if the ADC is essentially a text-book passive analog filter hooked up to a point-sampler, or if it is a multirate (digitally filtered using fixed-point arithmetic) design.

It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified. Are you using an accurate approximation of the ideal sinc filter when discussing this? I guess that a different filter (e.g. lower bandwidth, non-linear phase) could produce fairly different results.

-h
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bug80 on 2013-02-04 11:45:44
Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

No! I Do not have MatLAB or whatever. (can't afford to pay thousands).

Off-topic, but take a look at GNU Octave (http://www.gnu.org/software/octave/). It's a free Matlab-clone, with the same syntax (so your old .m scripts still work).
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-02-04 12:08:23
Since inter-sample overshoot is a problem for the analog stage of a DAC...
It's also a problem for the digital section, i.e. the over sampling + reconstruction filter.
Quote
...what would happen at the corresponding stage in an ADC, assuming the same analog/digital waveform?
No sane person digitally samples at levels near clipping - they leave sufficient headroom. Insane people who push the levels like that will probably get clipping, either due to the analogue electronics, the digital processing (oversampling ADC and digital anti-alias filter), or the fact that the peak happens to occur on-sample rather than between samples.

The concern is almost completely with audio that has been processed after the ADC to increase the apparent loudness.

Quote
It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified.
The EBU R128 definition is pretty strict, though it doesn't necessarily give the absolute highest possible true peak.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: John_Siau on 2013-02-04 17:08:46
Since inter-sample overshoot is a problem for the analog stage of a DAC...
It's also a problem for the digital section, i.e. the over sampling + reconstruction filter.


Inter-sample overs are also a problem for any sample-rate conversion process.  ASRC devices will produce many spurious tones when inter-sample clipping occurs.  The solution is to reduce the signal level before executing the SRC process.  In our DAC2 HGC converter, we reduce the signal level by 3.5 dB before the upsampling.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Alexey Lukin on 2013-02-04 18:49:28
After reading about that topic a bit i found that iZotope offers a feature build into their limiter that has intersample detection for "True Peaks"
Since Alexey Lukin is member here and to my understanding is part of the iZotope team he may give us some idea how they reached their conclusion of the peaks in music hitting above 3dB. Interesting is their limiter now seems to be able to prevent this directly while mixing it hot.
http://www.izotope.com/support/help/ozone/...s_maximizer.htm (http://www.izotope.com/support/help/ozone/pages/modules_loudness_maximizer.htm)

This +3 dB figure is pretty arbitrary. I think that in practice maybe some 1% of mastered records will show this true peak level. The absolute maximally possible true peak overshoot cannot be specified precisely because it depends on the length and phase response of the DAC's reconstruction filter. If filters are long enough and the signal is specially crafted, there's no theoretic limit for the level of TP overshoot.
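
The "no theoretic limit" point can be illustrated with a small sketch. It assumes an ideal (truncated) sinc reconstruction kernel, not any particular DAC filter, and the function name is hypothetical. Every sample is at full scale (+1 or -1) with its sign chosen so that all contributions add up at the point midway between two samples; the resulting peak grows roughly with the logarithm of the kernel length.

```python
import math

def worst_case_overshoot_db(taps_each_side):
    """Level (dB re full scale) midway between two samples when every
    sample is +/-1 and each sign matches the sign of the sinc kernel
    there, so the peak is the sum of the kernel's absolute values."""
    total = 0.0
    for n in range(-taps_each_side + 1, taps_each_side + 1):
        t = 0.5 - n  # offset from sample n to the midpoint
        total += abs(math.sin(math.pi * t) / (math.pi * t))
    return 20.0 * math.log10(total)

for taps in (1, 10, 100, 1000, 10000):
    print(f"{taps:6d} samples each side: {worst_case_overshoot_db(taps):+.2f} dB")
```

The sum only creeps upward, which fits the observation that very high overshoots need a specially crafted signal and a long filter.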
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: knutinh on 2013-03-04 12:58:08

It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified.
The EBU R128 definition is pretty strict, though it doesn't necessarily give the absolute highest possible true peak.

Cheers,
David.


I vaguely remember something about "phase scrambling" peaks in radio transmission - i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?

Or, one could have a two-path filtering, switching to a cruder interpolation in those few segments where intersample overs are an issue (linear interpolation?)

-k
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: John_Siau on 2013-03-04 16:51:44
I vaguely remember something about "phase scrambling" peaks in radio transmission - i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?


The simple solution is to reduce the signal amplitude prior to the SRC and DAC.  With a 24-bit data path, a 3 to 6 dB reduction in gain is of little consequence.  The 24-bit data path has a dynamic range of approximately 144 dB, and a loss of 3 to 6 dB should be insignificant.  Please note that this digital gain reduction must be made up after the DAC to achieve the same playback levels.  This means that there are higher demands on the performance of the DAC. 

Throwing away 3 to 6 dB of SNR is probably the best choice given the fact that DAC ICs are available with very good SNR specifications.
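
The arithmetic behind the 144 dB figure can be checked with a few lines. This is only the textbook 20*log10(2^bits) estimate for an ideal data path (it ignores dither and the converter's real analog noise floor), and the function names are hypothetical.

```python
import math

def dynamic_range_db(bits):
    """Theoretical range of an ideal fixed-point path: 20*log10(2**bits)."""
    return 20.0 * math.log10(2 ** bits)

def remaining_db(bits, attenuation_db):
    """Range left after giving up `attenuation_db` of digital headroom."""
    return dynamic_range_db(bits) - attenuation_db

for cut in (3.0, 3.5, 6.0, 12.0):
    print(f"24-bit path with {cut:4.1f} dB cut: {remaining_db(24, cut):6.2f} dB "
          f"(16-bit source needs {dynamic_range_db(16):.2f} dB)")
```

Even a 12 dB cut on a 24-bit path leaves far more range than a 16-bit source occupies, so on paper the cost is small; the practical limit is the analog performance of the DAC, as noted above.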
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2013-03-04 17:21:37
I vaguely remember something about "phase scrambling" peaks in radio transmission
Yes. It's also sometimes called Phase Rotation. It's in Optimods and the like. It helps to make asymmetric waveforms more symmetric.

One problem is that real-world clipressed audio has probably already been through one. Using this technique again might not generate lower peaks.

Quote
- i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?
I don't think anyone would want it in-circuit all the time, and switching could introduce audible transients. There might be a way around it.

Quote
Or, one could have a two-path filtering, switching to a cruder interpolation in those few segments where intersample overs are an issue (linear interpolation?)
If you have no headroom, and it clips, the only choice is to clip it. There is no room even for linear interpolation.

You can use soft or hard limiting, or even a gentle AGC that only acts in the presence of inter-sample overs. I think this would sound worse than just clipping. Anything you do in a typical DAC will be very fast acting (not always desirable) because they introduce so little delay.

Like John, I think the better choice is simply to "throw away" a little headroom. I have never heard of the DAC noise floor being a practical problem (except in older systems without an analogue volume control - i.e. all digital volume control).

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2015-11-30 10:59:39
I'm performing some topic necromancy to let this thread come to a conclusion.

I recently re-ran the test here, I mostly did this out of my own curiosity.

But I also felt that John Siau, 2bdecided, bandpass, and others that contributed to this thread deserved some more answers.
The plot shown by bandpass showed several tracks with very high Intersample Peaks aka True Peaks.


Apologies that I do not have a tool others can easily use, but I'll describe what I did instead.
I created a tiny program that acted as a custom CLI encoder for Foobar2000, the only thing it did was pipe the audio from foobar2000 to sox and parse and log the output.
I also configured foobar2000 to put the artist/album/track/trackno/replaygain gain/peak info as part of the file name, so that my tiny tool could parse/add that to the log as well.

The latest foobar2000 version was used (v1.3.9), and all tracks were rescanned with the ReplayGain scanner in v1.3.9 with the scanner set to 2x oversampling for peaks (anything higher seemed to report the same peaks so 4x oversampling as suggested in EBU R128 specs is not possible). The peaks reported by SOX are thus used instead; I also confirmed visually using Adobe Audition 3 that the wave peaks do reach the oversampled peak values reported by SOX.

The true magic in the test is due to SOX (big thanks to bandpass for giving me clues on how to use the command line properly to do this).
The SOX command line used is the following: sox.exe - -n stats upsample 4 sinc -a 40 -t 8k -24k stats
The audio piped from foobar is piped as 32bit (with wav header).

I had to run 4 instances dividing the workload of 6528 tracks, otherwise it would have taken over 16 hours to run this scan; instead it only took around 6 hours (causing 80% load on a 6 core CPU).

Here are some stats from a little helper program I created to crunch some numbers from the CSV log:

Code: [Select]
6528 tracks.
Avg. Peak -1.24 dBFS (Min -24.55, Max 0.00)
Avg. ISP -0.56 dBFS (Min -25.71, Max 9.54)
ISP/Peak Delta -1.79 dB (Min -49.06, Max 9.54)
RMS -16.79 dBFS (Min -42.05, Max 0.00)

24.08% (1572) ISP <-1 dBFS
75.92% (4956) ISP >-1 dBFS

18.57% (1212) ISP -1 to 0 dBFS
43.49% (2839) ISP 0 to 1 dBFS
8.70% (568) ISP 1 to 2 dBFS
1.53% (100) ISP 2 to 3 dBFS
0.57% (37) ISP 3 to 4 dBFS
0.17% (11) ISP 4 to 5 dBFS
0.23% (15) ISP 5 to 6 dBFS
0.46% (30) ISP 6 to 7 dBFS
1.44% (94) ISP 7 to 8 dBFS
0.75% (49) ISP 8 to 9 dBFS
0.02% (1) ISP 9 to 10 dBFS
0.00% (0) ISP 10> dBFS


Immediate conclusion is that 10 dB headroom is needed to avoid any Intersample Peaks or True Peaks from causing distortion (if the distortion is audible or not is a different discussion).
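
For anyone who wants to produce the same kind of breakdown from their own csv, the 1 dB bucketing in the listing above could be done along these lines. This is a sketch, not the helper program that was actually used; the function names are hypothetical and the handling of values that land exactly on a boundary (counted in the higher bucket) is an assumption.

```python
import math
from collections import Counter

def bucket_label(isp_db, top=10):
    """Name of the 1 dB bucket an ISP value falls into
    (half-open buckets: lo <= value < lo + 1)."""
    if isp_db < -1:
        return "<-1"
    if isp_db >= top:
        return f"{top}>"
    lo = math.floor(isp_db)
    return f"{lo} to {lo + 1}"

def histogram(isp_values):
    """Map bucket label -> (track count, percent of all tracks)."""
    counts = Counter(bucket_label(v) for v in isp_values)
    total = len(isp_values)
    return {label: (n, round(100.0 * n / total, 2)) for label, n in counts.items()}

# Tiny made-up input, just to show the shape of the result.
for label, (count, percent) in histogram([-25.71, -0.56, 0.3, 0.9, 9.54]).items():
    print(f"{percent:.2f}% ({count}) ISP {label} dBFS")
```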

A few notes on the tested tracks. They are my personal collection, collected over many years. They are in MP3, Ogg, AAC and FLAC formats/encodings. Some of it is mainstream. Some of it I've composed myself (3 published albums plus some extra released and non-released stuff).
Some of the tracks are not mainstream, are rare, or are not purchasable. Some have been ripped by me from games (as no official soundtrack existed).
In other words it's a weird, even eclectic, mix of tracks and sourced stuff. Thus it may or may not be representative of the common listener. Then again, audiophiles or engineers/techs that know or worry about intersample peaks aren't that common either, certainly not mainstream.

I'll list specific problem tracks if they can be found on the net (legally) somewhere, or instructions on how/where they can be found (you may need to rip it from a game/source yourself for example). If a series of related tracks have high True Peaks then I'll list the artist/source/album so you can search for it yourself. Also note that, depending on the year/region a release was made in/for, there may be differences between releases.
A few years ago I copied/encoded all my CDs and recycled all my CD covers and CD inlays and threw the discs in the trash (can't be recycled, at least not at the time) so I no longer have proof of purchase for them (in retrospect kinda stupid, I could have put the CD inlays in a box somewhere) so I'm not giving a full list of the tested collection because of that (sorry). Also not all of them are FLAC (many years ago I ripped most of my music, I never got around to re-ripping it all as FLAC, drive space was not that cheap back then) so the encoding may be the source of the intersample peaks rather than the mastering of the track, I'll mention if this is the case.

Code: [Select]
The only track that had a True Peak above +9 dBFS:
Jayce and the Wheeled Warriors, opening theme (mp3) +9.54 dBFS, -11.88 RMS
I can't recall where it's from, I think I ripped this from a Youtube video. I watched this show when I was young, so it's in my collection for nostalgia reasons. Can't share the track for legal reasons, sorry. But it should be searchable on Youtube so try there first (you'll also be treated to a cheesy 80s animated intro as well).


Code: [Select]
Quite a few tracks have surprisingly high intersample peaks that are above +8 dBFS.
Legendary standup comedian George Carlin's performances/recordings/albums "Back in Town", "Complaints And Grievances", "Playin With Your Head" all have true peaks at/above +8.0 dBFS, the RMS varies from -17 dBFS to -21 dBFS, so even if EBU R128 or ReplayGain was used the true peaks would still be above 0 dBFS afterwards.

Michael Land's Monkey Island III OST and RockStar Games Grand Theft Auto Liberty City Stories OST also have very high true peaks, their RMS is higher than George Carlin's stuff for various reasons (talking vs music being one, but also the years they were mastered, and the way they were mastered).
Liberty City Stories OST should be somewhat available (check WIMP, Spotify, iTunes, Google Play, Amazon etc) but I'm unsure if they match the game audio rip or not. (the in-game radio channels are sometimes mastered differently from the individual tracks).

Frank Klepacki's Blade Runner The Game soundtrack also shows similarly high true peaks.
His website is at http://www.frankklepacki.com/ you can find some tracks in the flash player on the page http://www.frankklepacki.com/portfolio/game-BR.html
You'll have to rip the tracks from there (or use a live True Peak meter), the rest of the tracks you'll have to rip/convert from Blade Runner The Game itself.


The majority of the other tracks with true peaks above +3 dBFS were also from the same collections as those above.
(This explains the high anomaly on bandpass's plot/chart that 2bdecided was curious about.)


Code: [Select]
A few other collections are above +3 dBFS though.
A few tracks from the "Trilogy" album by "Carpenter Brut" (mp3) for example (check the various outlets or bandcamp or Youtube https://carpenterbrut.bandcamp.com/album/trilogy )
A few tracks from Yuki Kajiura's "Noir" soundtrack OST2 disc (mp3). Eminem's album "Relapse" (mp3). Savant's album "ISM" (FLAC)



Code: [Select]
Tracks with True Peaks in the +2 dBFS to +3 dBFS range.
Eminem's album "Relapse", Type O Negative's "World Coming Down" album, Pendulum's album "Immersion", Savant's "ISM" album, Vader's "XXV" album, Ramin Djawadi's season 1 "Game of Thrones" soundtrack. Some of this stuff is more mainstream so tests should hopefully be more easily reproducible by others.


A note to John Siau:
Your DAC headroom of +3.5 dBFS seems to be a pretty good choice. Although in some few cases like Eminem's album "Relapse" it's just barely enough. And artists like Savant (ISM album) or Carpenter Brut (Trilogy album) actually break that +3.5 dBFS headroom. Whether that is audible or not is another matter. Savant's ISM album actually has true peaks above 0 dBFS as FLAC. The title track "Ism" from the album has a true peak of +4.20 dBFS, though the RMS is at -8 dBFS so EBU R128 or ReplayGain would luckily bring the true peak below 0 dBFS in this case; the track "Mystery" has a true peak of +3.59 dBFS.


Closing notes:
For the most part it seems that MP3 or AAC lossy encoding is a common trend among these.
The only exception is Savant's album ISM, which has very high true peaks even as lossless FLAC. The current official place to get the album seems to be http://savantofficial.bandcamp.com/album/ism (http://savantofficial.bandcamp.com/album/ism); the official website (and the shop there) seems to have issues currently. The preview there is lossless, though I have no idea if the true peak is the same or worse with that.

For those curious the particular track in question clips a lot and has lots of 8bit type of sounds in it. The rest of the album is similar.
The track Ism (from the album of the same name), if upsampled from 44.1kHz to 192kHz, ends up with a max peak at +5.41 dBFS and over a million possibly clipped samples (or so states Adobe Audition at least). That track may be worth using as a test for DACs or upsampling or True Peak testers as it satisfies the criteria of being in the wild and a real world example (rather than the artificial one as linked to in the first post of this thread).



I hope this rather long post is of some use to people out there.
It would be nice if someone could make a True Peak scanner tool (maybe base it around EBU R128 and libsoxr?) either as a foobar2000 plugin or a standalone tool (that you can easily pipe audio to) and log True Peak and RMS and other stats in a convenient file that people could upload/put online.


As I said earlier, whether one can hear True Peaks above 0 dBFS or not is another discussion, but I can confidently state that True Peaks above 0 dBFS certainly do not improve the sound quality. And when music does exist in the wild that some of the current very high headroom DACs do not handle, then that is at the very least of interest for further study. Should DACs have +6 dBFS headroom, or maybe +10 dBFS headroom? A +10 dB headroom would handle my entire collection without any clipping, then again I use Foobar and ReplayGain so peaks never go that high anyway.
Except for those George Carlin recordings, if I apply replaygain to those then for one of them the gain is +8.19 pushing the true peak from +5.19 to +13.6 dBFS, now the digital peak is at -2.94 dBFS so at most the applied gain would be +2.94 which would make the true peak +8.13 which is still a lot. Would +10 dB headroom be enough for "all cases" then?
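
The gain arithmetic in that example is just addition in dB, sketched below with the figures quoted in the post (the function names are hypothetical). Note that the straight sum of the quoted gain and true peak comes to +13.38 dBFS, a little under the +13.6 quoted, so one of the quoted figures may be rounded or mistyped; the +8.13 figure for the capped gain does match.

```python
def clip_limited_gain(gain_db, sample_peak_db):
    """Largest gain (dB) that keeps the on-sample peak at or below 0 dBFS."""
    return min(gain_db, -sample_peak_db)

def true_peak_after(true_peak_db, gain_db):
    """A gain in dB simply adds to a peak level in dB."""
    return true_peak_db + gain_db

requested = 8.19     # ReplayGain value quoted in the post
sample_peak = -2.94  # on-sample ("digital") peak quoted in the post
true_peak = 5.19     # inter-sample peak quoted in the post

unlimited = true_peak_after(true_peak, requested)
limited = true_peak_after(true_peak, clip_limited_gain(requested, sample_peak))
print(f"full gain applied:  true peak {unlimited:+.2f} dBFS")
print(f"gain capped at {-sample_peak:+.2f} dB: true peak {limited:+.2f} dBFS")
```

Capping the gain by the on-sample peak only protects the samples themselves; a cap based on the true peak would have to be a reduction of 5.19 dB rather than any boost.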


Note! As I've changed my digital "life" some time ago (new email etc. yay almost 0 spam now) I'm not using this email/account anymore. Hence the reason I wanted to follow up on this thread. I may be posting under a new nick in the future (no idea when, I've hardly been active at all on HA for a long time now, we'll see I guess).
So this will probably be the last post by this user/nick.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2015-11-30 11:28:31
I uploaded a 5 sec clip from Savant's Ism here https://www.hydrogenaud.io/forums/index.php?showtopic=110695 (https://www.hydrogenaud.io/forums/index.php?showtopic=110695)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: 2Bdecided on 2015-12-01 08:22:50
Hi Rescator,

Good to see you back, even if it is a short stay. That's an impressive test you have run. At first I assumed the 32bit output was floating point, but now I wonder if it is fixed point because...

The log you posted says that the max peak is 0dBFS. I guess the true max peak of your mp3s (ignoring inter sample peaks) is higher than that, but the process you used clipped the decoded mp3s before checking for peaks. My guess is that this will increase the discrepancy between peak and intersample peak. I agree it is the way most people would do it, but it is making the situation worse in a completely avoidable way.

I worry that some of your extreme examples are simply mp3s that have been gained too high, rather than genuine examples of the kind of intersample overs that the lossless original would generate.

Cheers,
David.
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: bennetng on 2015-12-01 08:55:06
Since Rescator mentioned SoX a lot, I would like to point out that SoX is not fully compatible with floating point audio data.

Example:
https://www.hydrogenaud.io/forums/index.php...amp;mode=linear (https://www.hydrogenaud.io/forums/index.php?showtopic=101850&mode=linear)
Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2015-12-02 05:01:28
I did not know that about Sox and 32bit float, thanks bennetng for that.

However in this case it only means that at worst there is 24bit of precision; considering that all the audio is 24bit at most, that is not an issue really, the upsampling will reveal any ISPs regardless.

2bdecided your questions made me dig deeper in the tested collection and I found some interesting things...

First of all none of the mp3 have had any gain adjustments, I always preserve the original file and use tags (ReplayGain) instead.

Jayce and the Wheeled Warriors opening theme is an mp3, and taking a look at it in Adobe Audition 3 (its mp3 decoder being used) it turns out it is mono with a 16kHz sampling frequency. The spectral display shows me frequencies only up to about 5kHz (looks like some form of brickwall filter was applied around there); there are also vertical "lines" in a few places in the 5kHz-8kHz range.
Loading/decoding to 16bit and upsampling to 176kHz showed no ISPs above +1.05 dBFS, and the same happened when decoding to 32bit float first.
This made me question the mp3 decoder in Audition 3, so I used foobar2000 (same converter as in my mass scan) and the ISP ended up at +1.27 dBFS. Next I tried upsampling to 64kHz (4x 16kHz) in case 176kHz was squashing the ISPs somehow; the results were almost identical though.

Then I scratched my head and looked at the sox parameters I used and I realized a few things.

#1. I am/was ignorant of certain Sox parameters; just using parameters without understanding them is bound to cause mistakes.
#2. I think I used the wrong parameters for the right reason.
#3. Mimicking EBU R128 is wrong: its True Peaks are measured after filtering, while my concern is ANY True Peaks/ISPs, the kind that a DAC needs to handle rather than what is audible.
#4. I did not check my tools/setup from 2013 and simply re-used them. A leftover compensation for a gain reduction of 12 dB was applied, but the gain reduction itself was never actually applied with the sox parameters, so this tainted the results. (imagine a major facepalm here).


So I'm gonna rerun the whole thing one more (a third) time.
And I'll be dropping the lowpass/highpass stuff and no longer trying to match EBU R128 filtering. After all, a DAC has to handle/process the whole frequency spectrum anyway.

I will test foobar2000 v1.3.9's new resampler which is slow but good against Sox and see which is better/fastest.
If the speed is very similar I may just drop Sox and let foobar2000 do the upsampling; in that case I'll recode my tool to not use Sox at all but instead scan the floating point data directly (all I need to do is check for the highest (now upsampled) peak and gather RMS stats).
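A scan of that kind could look roughly like the Python sketch below. This is not the actual tool, only an illustration. `LevelScanner` is a hypothetical name, samples are assumed to be floats with 1.0 as full scale, and RMS is referenced so that a full-scale square wave reads 0 dBFS (an assumption, other conventions exist).

```python
import math

class LevelScanner:
    """Accumulates peak and RMS over chunks of float samples (1.0 = full scale)."""

    def __init__(self):
        self.peak = 0.0
        self.sum_squares = 0.0
        self.count = 0

    def feed(self, chunk):
        """Process one chunk; can be called repeatedly while streaming a file."""
        for s in chunk:
            a = abs(s)
            if a > self.peak:
                self.peak = a
            self.sum_squares += s * s
            self.count += 1

    @staticmethod
    def _to_db(value):
        return 20.0 * math.log10(value) if value > 0.0 else float("-inf")

    def peak_dbfs(self):
        return self._to_db(self.peak)

    def rms_dbfs(self):
        if self.count == 0:
            return float("-inf")
        return self._to_db(math.sqrt(self.sum_squares / self.count))

scanner = LevelScanner()
scanner.feed([1.0, -1.0, 1.0, -1.0])  # first chunk of a full-scale square wave
scanner.feed([1.0, -1.0, 1.0, -1.0])  # second chunk
print(scanner.peak_dbfs(), scanner.rms_dbfs())  # 0.0 0.0
```

Feeding chunks keeps memory use flat, which matters once the data has been upsampled to 192 kHz float.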

If it turns out Sox is still the better option then I'll have to go back to my old idea of applying a negative gain (of about -12 dB) to sox so it won't clip any peaks when upsampling and then compensate for that in my stats gathering tool by adding +12 to the results (and make sure that neither of the two steps is forgotten again).
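The attenuate-then-compensate idea can be sketched in Python as follows. The function names are hypothetical, and the single shared constant is a design choice meant to make it harder to apply one step without the other.

```python
import math

HEADROOM_DB = 12.0  # attenuation applied before resampling (value from the post)

def db_to_linear(db):
    return 10.0 ** (db / 20.0)

def linear_to_db(x):
    return 20.0 * math.log10(x) if x > 0.0 else float("-inf")

def attenuate(samples, headroom_db=HEADROOM_DB):
    """Step 1: scale down so later processing stays below full scale."""
    g = db_to_linear(-headroom_db)
    return [s * g for s in samples]

def compensated_peak_dbfs(samples, headroom_db=HEADROOM_DB):
    """Step 2: measure the attenuated data and add the headroom back."""
    peak = max(abs(s) for s in samples)
    return linear_to_db(peak) + headroom_db

# Hypothetical upsampled data with a peak of 1.8 (about +5.1 dBFS).
original = [0.3, -1.8, 0.9]
safe = attenuate(original)
print(max(abs(s) for s in safe) < 1.0)        # True: nothing left to clip
print(round(compensated_peak_dbfs(safe), 2))  # 5.11
```

If step 2 is applied without step 1, every result is inflated by 12 dB, which is the kind of error described in #4 above.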

Once done I'll post again here and I may start another thread with just the resulting anonymized CSV (I'll try to add a plot graph as well) so people can see the spread and the percentage of certain high peak values. Maybe others will add their own results to such a thread (would be nice to see how frequent high ISPs are in peoples collections).


One would think that with so many years of programming experience I would not make "beginner" mistakes like this. I'm sure some professional statisticians out there are deservedly laughing their asses off right now though.

It just goes to show that unless your test is correct the results mean crap all. Check, check, then triple check. Study anomalies individually and fiddle with the test to see if the anomaly was caused by the data tested or by your way of testing. And question/never trust yourself; assume you did make a mistake even if you are sure you didn't.


Anyway, I'm sorry for "wasting" you folks' time; I'll be back in about half a day with some correct and proper stats.

Savant's Ism though is still the track with the highest True Peak I've seen so far that is a "normal" in the wild music track/recording, so at least "something" came out from this mess.

Title: Pathological example of a intersample peak, 11dB, discussion.
Post by: Rescator on 2015-12-02 15:30:56
Now things look a bit more "normal".

Code:
6486 tracks.
Avg. ISP -0.78 dBFS (Min -24.55, Max 5.75)
RMS -16.87 dBFS (Min -42.05, Max -5.88)

27.03% (1753) ISP <-1 dBFS
72.97% (4733) ISP >-1 dBFS

20.35% (1320) ISP -1 to 0 dBFS
34.84% (2260) ISP 0 to 1 dBFS
13.61% (883) ISP 1 to 2 dBFS
3.25% (211) ISP 2 to 3 dBFS
0.62% (40) ISP 3 to 4 dBFS
0.25% (16) ISP 4 to 5 dBFS
0.05% (3) ISP 5 to 6 dBFS
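
For anyone who wants to produce the same kind of breakdown from their own scan, here is a small Python sketch of the binning. It is not the tool used for the numbers above. `bucket_isps` and `percentages` are hypothetical names, and putting a value of exactly -1.0 dBFS into the "-1 to 0" bin is an assumption. The posted counts are used only to check that the percentages add up.

```python
import math
from collections import Counter

def bucket_isps(isp_values_dbfs, floor_db=-1):
    """Count ISPs per 1 dB bin, keyed by the lower bin edge.
    Everything below floor_db is collected under the key None."""
    counts = Counter()
    for v in isp_values_dbfs:
        low = math.floor(v)
        counts[low if low >= floor_db else None] += 1
    return counts

def percentages(counts):
    """Share of each bin in percent, rounded to two decimals."""
    total = sum(counts.values())
    return {k: round(100.0 * n / total, 2) for k, n in counts.items()}

# Bin counts as posted above (None = below -1 dBFS).
posted = {None: 1753, -1: 1320, 0: 2260, 1: 883, 2: 211, 3: 40, 4: 16, 5: 3}
print(sum(posted.values()))    # 6486
print(percentages(posted)[0])  # 34.84
```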


After testing the new dBpoweramp/SSRC resampler in foobar2000 which is very slow (would take me two days to run the scans) and the PPHS resampler (with and without Ultra) I found that using PPHS without ultra enabled let me run each scan in about 2-3 hours (the processing was split into 5 parts using 5 cores so that's about 3 hours times five if it was on a single core).
There are some differences between the resamplers, but they are minimal when looking for Inter-Sample Peaks.

PPHS (no ultra) set to resample to 192000 Hz was used as the setting. Peak (ISP) and RMS were then scanned using my own tool.
Of particular interest is that foobar2000 v1.3.9's Replay Gain scanner set to " Peak scan oversample factor : 4 " is actually pretty close to upsampling and then checking for the highest peaks as I did. Unfortunately foobar2000 and its scanner do not show the peak relative to dBFS, and RMS is not shown either (which I always find useful as RMS(Z) has no loudness curve).


I had to remove a few 5.1 tracks that produced invalid values (I have not checked what caused that; I might later, but for now I just dropped them from the test results).
Likewise a few tracks ended up with +infinity in the results and a mangled filename. Here I suspect that foobar's convert process choked on the filename for some reason (I was using/misusing it to pass along album/artist/track/title and ReplayGain peak and gain details), so I had to remove those results as well.

And for the record I did do a full ReplayGain rescan of all the files using factor 4 (previously I had only used factor 2). A factor of 4 added a little more precision to the True Peak detection it seems for certain tracks. Also of note is that ReplayGain is "faster" than using a tool like mine combined with foobar's convert with PPHS as the overhead is way less.


Now back to the results...
Of the 6486 tracks in the result none have a True Peak/Inter-Sample Peak above +6 dBFS.
Carpenter Brut's Trilogy, Eminem's Relapse and Savant's Ism albums are all in the +3 or higher area.
As mentioned before, Trilogy and Eminem are mp3s and Ism is FLAC, so very high ISPs are not restricted to lossy only.

I think I'll just end these tests here.
I'll just end with the fact that in my tested collection of music none went above +6 dBFS, so a DAC with 6dB headroom would have no clipping with any of my music.
I'd also like to note that using ReplayGain the majority of the tracks with high ISPs do usually get a negative gain adjustment so the tracks usually end up below 0 dBFS,
and with the new scan factor in foobar2000 1.3.9 combined with clipping protection enabled that will probably never be an issue.
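
The arithmetic behind that can be sketched in a few lines of Python. This only shows the general idea of gain plus peak-based clipping prevention and is not foobar2000's actual implementation. The function names and the gain values in the example are hypothetical.

```python
def peak_after_gain_dbtp(true_peak_dbtp, gain_db):
    """Gains in dB simply add to the measured true peak."""
    return true_peak_dbtp + gain_db

def clip_safe_gain_db(true_peak_dbtp, gain_db, ceiling_dbtp=0.0):
    """Lower the gain just enough that the peak does not exceed the ceiling."""
    return min(gain_db, ceiling_dbtp - true_peak_dbtp)

# A loud track: +5.75 dBFS ISP with a hypothetical gain of -9 dB.
print(peak_after_gain_dbtp(5.75, -9.0))  # -3.25, no clipping protection needed
print(clip_safe_gain_db(5.75, -9.0))     # -9.0, gain is left untouched

# A quiet track: hypothetical +1.3 dBTP peak and a positive gain of +2 dB.
print(clip_safe_gain_db(1.3, 2.0))       # -1.3, gain is capped by the peak
```

The better the stored peak reflects the true (oversampled) peak, the more reliable the cap is, which is why the scan factor matters.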

But for DAC designers/audiophiles it may be of interest that there exists actual music in the wild with ISPs as high as +5.75 dBFS, which could be of some concern (how would a DAC or speaker handle that?).
Title: Re: Pathological example of a intersample peak, 11dB, discussion.
Post by: Porcus on 2018-05-21 22:49:57
More necromancy. I was curious enough to test a little bit, and found a +11 dBTP measurement in one of my CDs using one of the algorithms. Merzbow, not surprising - so you may wonder if all clipping is part of the music the way the artist intended it :-o

But I also tested a selection of albums with positive album gain, because here is where the peak value stored will limit upon playback. (The Merzbow album has an album gain of -21 dB, so the +11 won't bring it across zero.) General findings: a couple of dB suffices on everything I tested. Which excluded classical music (where there is too much that peaks too low, it would take much more work), test signal CDs, HDCDs and pre-emphasis CDs.  All were lossless CDs/Bandcamp downloads in the FLAC format.


The "positive gain" selection: Initially I set some criteria for entire albums, but ended up searching up tracks with high peaks from albums with positive RG album gain (calculated by fb2k, new algorithm), which is where clipping prevention according to peak would kick in.  I ended up scanning some 163 tracks, but that was because I was too lazy to remove stuff.
Then I picked a handful of tracks and scanned them with several "True peak" settings using foobar2000 1.4beta13. 
Music: Lots of prog.rock and the like. Diamanda Galás, Pink Floyd: "Mother" from the Shine On version of The Wall, The opening track from Demon's "British Standard Approved", Jonas Hellborg - and Bobby McFerrin (the "Voice" album).
Some general remarks:
* Turns out that in my setup, the SoX resampler does not find any intersample peaks at all.  Maybe it could have something to do with my setup, using SoX resampler to get rid of some odd sample frequencies.  Anyway, SoX excluded.
* Auto 2x/4x/8x return precisely the same figures. 

Results:
* Nothing in this selection went above +1.30 dBTP.  That track is White Willow: "John Dee's Lament" from their debut Ignis Fatuus, RG track peak was 0.98something.
* Others were close to or above +1 dBTP: McFerrin, Demon, Hellborg.
* Those who are worried about their Pink Floyd: +0.48 dBTP.
* Differences between algorithms are smallish, less than 0.1 dB.

Learnings: a couple of dB seems to take care of everything I tested, and algorithms make little difference on these tracks.


Then the LOUD albums: 12 albums (112 tracks) with album gain -16.00 dB (fb2k, new algorithm) and below.  Quite a lot of industrial/noise (three involving Merzbow),  a couple of black and death metal albums, and the infamous Stooges "Raw Power" reissue.  All tracks 16/44.1 but one track 24/44.1 in a Bandcamp purchase; that didn't turn out to matter.
* "No oversampling" track peaks overview: Six of twelve albums at full (.9999969 or 1 for all tracks).  Four more albums entirely within -0.12 dB.  The last two a bit particular: Deathstorm: We are Deathstorm, all at -0.2 dB and one -0.48 dB - and then a 41 minutes Merzbow concert in a single track, at -1 dB.
* Every track "above" the Stooges: Raw Power peak has Deathstorm or Merzbow involved.  Only here the choice of algorithm makes big differences in numbers:  The "worst" is Merzbow: Venereology, and here a track ranges +7.15 dBTP to +11.30 dBTP  depending on algorithm (the highest using dBpoweramp/SSRC - both are still way short of the album gain of -21.76 dB though).
* Stooges: Raw Power: variation among tracks in the interval [+1.95 dBTP, +2.98 dBTP] for PPHS default. Variations among algorithms: from PPHS default and .3 to .4 dBTP upwards (PPHS ultra, dBpoweramp/SSRC), with "auto Nx" in between.
* None of the albums get album peaks below +1.77 dB (PPHS, both default and Ultra) to +1.84 (auto Nx). 
* Two to three of the 112 tracks stay below 0.  All Deathstorms. The "third" of these range -0.02 dBTP to +0.14 dBTP.

Learnings: All these albums bump up ~ 2dB or more, up to a whopping 11 dB. Only on one album did the algorithm really matter for the number - but remember again, the large negative RG album gain will more than compensate.