Topic: Pathological example of a intersample peak, 11dB, discussion. (Read 49979 times)

Reply #25
bandpass, 2bdecided, saratoga and John Siau will hopefully find this interesting.

4th post here http://www.hydrogenaudio.org/forums/index....st&p=820447
has a .CSV with the raw data from a scan.

Thanks for this.

Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

I see two series emerge - one effectively a Y=X line, but another a Y=X+8dB line. What's that? Am I doing something wrong, or not thinking, or is there an error in some of the data (maybe in the decoding of one format)?

From the X=Y line, and where the points start to depart from it (i.e. where inter-sample peaks start to become significantly higher than on-sample peak values), it seems that inter-sample peaks are (mostly) only a problem for tracks in your collection with on-sample peaks above -1dB. i.e. it's only tracks that were (nearly) clipped which generate intersample overs (with a small handful of exceptions).
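For anyone without a plotting tool, here is a minimal sketch of how the two series could be separated numerically. It assumes column B is the on-sample peak and column D the intersample peak, both in dBFS; the column meanings, the 4 dB threshold and the sample rows are illustrative assumptions, not taken from the actual CSV.

```python
def split_series(rows, threshold_db=4.0):
    """Split (peak_db, isp_db) pairs by how far the ISP sits above the on-sample peak."""
    near_line, offset_line = [], []
    for peak_db, isp_db in rows:
        delta = isp_db - peak_db
        # Rows far above the Y=X line go into the suspicious "offset" series.
        target = offset_line if delta >= threshold_db else near_line
        target.append((peak_db, isp_db, delta))
    return near_line, offset_line


# Hypothetical rows, not real data from the scan.
rows = [(-6.0, -5.8), (-0.1, 1.2), (-3.0, 5.1), (0.0, 8.2)]
near, offset = split_series(rows)
print(len(near), "rows near Y=X,", len(offset), "rows in the offset series")
```

Listing the offset rows together with their file paths would show quickly whether they share a format or source.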

Cheers,
David.

Reply #26
Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

No! I do not have MATLAB or whatever (can't afford to pay thousands).
A quick search for tools on the net turned up nothing for Windows (and I'm not keen to start compiling gnuplot, or doing CSV to JSON conversion to use Google's API etc.).
So if you could point me to a tool for Windows, or show an image, that would be helpful.

As to the anomalies you seem to point out. Sox just processed whatever foobar gave it.
It is possible that either foobar passed bad audio/wav data to sox (my code did not alter any data at all, clean passthrough), or that Sox choked on some bitdepth/frequency/channel combo.
I probably won't rerun this test anytime soon (6 cores at 100% for 4 hours is a little on the heavy side) until a proper toolset is available. Maybe the R128 gain tool could be modified to append csv data lines to a file.

Scanning for "true peaks" the way that EBU R128 recommends is of particular interest obviously.
I could make a tool myself, but I have not found any example code on upsampling and peak scanning, and I don't really read "math" equations.
The tiny tool I made, hooking it up between foobar and sox, and the scanning took me a day. If I'm to spend any more time then it's better to do it right, read up and code a proper program, and then it's suddenly about a week (or more) of work involved (modifying an existing tool might be easier for existing maintainers; for me it would probably take a week or so, if unlucky, to learn/read the existing code enough to modify it).

I'd love to see hundreds of people on HA scan their collections, provide the csv and then someone can do some serious number crunching and present the results.
Depending on how many would do the scan and the size of people's collections/test sets, the resulting data could number anywhere from a hundred thousand to a million tracks, which is definitely statistically significant.

Reply #27
Using LibreOffice (free and available on Windows):


Reply #28
Yep, that's exactly it.

(I just used Excel).

That upper "series" (Y=X+8dB) must be wrong, and is contaminating your results for the number of intersample overs above 0dB FS.


I'm coming at this from the other direction - I suspect that, in the context of EBU R128, consideration of intersample peak is irrelevant for content that reaches the consumer. Any consumer-targeted audio track that is loudness matched to -23LUFS is very unlikely to have any content near clipping, and as long as the actual samples sit below 0dB FS I bet the intersample peaks are safe too (except on a track intentionally created to disprove this statement!).

For pop CDs, which are often 10-15dB louder than EBU R128 requires, and often smashed/clipressed to be as loud as possible, then of course intersample overs are a real issue.

Cheers,
David.

Reply #29
bandpass, 2bdecided, saratoga and John Siau will hopefully find this interesting.

The 3.71% in the +1 to +2 range and the 0.79% in the +2 to +3 range is very worrying, but a DAC like that mentioned by John Siau should handle this fine.

Where it really gets creepy is the >3dBFS ISPs, 3.62% total in the +3 to >+9 dBFS range.

Hopefully you guys find the .csv interesting, 5824 tracks is a rather large sample of data and thus hopefully useful.

Wow, nice work. I am somewhat surprised to see anything over +3.1 dB. It would be very nice to take a closer look at the tracks that exceed +3 dB. It would also be interesting to compare raw tracks to mp3 versions of the same track. I suspect that mp3 compression and reconstruction may increase the occurrence of inter-sample overs (due to phase distortions in the mp3 compression process).

The Benchmark DAC2 HGC can handle a +3.5 dB inter-sample peak without clipping while the gain control is fully clockwise.  It can tolerate higher levels when the gain control is rotated to a lower gain setting.
John Siau
Vice President
Benchmark Media Systems, Inc.

Reply #30
The "+11 dB" test signal (that started this thread) is proving very useful for testing the overload characteristics of DSP code.  If the DSP process is working properly, the inter-sample peak should pass at full amplitude (when there is sufficient headroom), or should be clipped when there is insufficient headroom.  The ES9018 D/A conversion IC seems to invert the inter-sample peak in some modes of operation.
John Siau
Vice President
Benchmark Media Systems, Inc.

Reply #31
The "+11 dB" test signal (that started this thread) is proving very useful for testing the overload characteristics of DSP code.  If the DSP process is working properly, the inter-sample peak should pass at full amplitude (when there is sufficient headroom), or should be clipped when there is insufficient headroom.  The ES9018 D/A conversion IC seems to invert the inter-sample peak in some modes of operation.

Cool to hear, although it's rare to see such in normal music, I guess it's nice to be able to test how software/hardware handles the outliers (where usually odd things can occur).

Reply #32
(@bandpass Darn you, stop teaching me new tricks. PS! Libre crashed 3 times trying to plot this stuff. Heh! and thanks, had no idea Libre or OO could do that...)

I'm coming at this from the other direction - I suspect that, in the context of EBU R128, consideration of intersample peak is irrelevant for content that reaches the consumer. Any consumer-targeted audio track that is loudness matched to -23LUFS is very unlikely to have any content near clipping, and as long as the actual samples sit below 0dB FS I bet the intersample peaks are safe too (except on a track intentionally created to disprove this statement!).

For pop CDs, which are often 10-15dB louder than EBU R128 requires, and often smashed/clipressed to be as loud as possible, then of course intersample overs are a real issue.

*nod* My last 3 albums released have an RMS of around -23 dBFS; future projects of mine will "target" an RMS of around -26 dBFS, as that seems to be pretty close to EBU R128's -23 LUFS.

That upper "series" (Y=X+8dB) must be wrong, and is contaminating your results for the number of intersample overs above 0dB FS.


Yeah! *sigh* Looks like I have to revisit this later as this is really irking me now.
Check this out: http://imageshack.us/photo/my-images/688/scanuy.jpg/
Sorted by peak from lowest to highest.

Top left are peaks, bottom left is RMS, and the right/big one is the intersample peaks.
Both the Peak and RMS seem to correlate as expected, and even as the peaks max out (at 0.0 dBFS) the RMS shows the continued squashing going on.
And the ISP seems to match (if we ignore that "shadow" hanging over it there for a moment), and it's not until the very last ~0.70% of tracks that the ISP's go above 3.5 dBFS.

But back to that shadow (or cloud is perhaps more appropriate) hanging there, if one assumes they are 8dB "off" and adjust them "down" then they seem to match with the rest of the curve. Which is most likely correct.

Then again something else may be going on. I'll definitely get back to this again later (with an updated/corrected csv for you guys), I just do not know when. If I'm going to waste a day on this again I might as well make sure it's correct, and that any sox errors/failures or foobar2000 issues can be handled. I'll also grab more data (like channels, codec (mp3, flac, wav, ogg, m4a, etc.), bitdepth, frequency, and anything else I can think of/grab at the same time).
And if that anomaly rears its head again, I'll make sure I track the filepaths so I can check the offenders, whether it's damaged files/bad encodings or something else. (My guess is it was weird output from sox that my tool wasn't programmed to parse.)

For those curious, it looked for "Pk lev dB" and "RMS lev dB" from the first stat and just "Pk lev dB" from the second stat. And only the first number was grabbed (for multichannel 2 or more numbers would be presented and intentionally ignored), any sox output that did not contain this info would get ignored.
Also if any Pk or RMS was NOT grabbed properly but still went into the csv then those will show up as either -999 or +999 values, and I see no such values, so it was either the wav passed to sox or sox itself that provided dodgy data.
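The parsing described above could look roughly like the sketch below. It is not the original tool: the function name, the sentinel constant and the sample text are hypothetical, and the sample only imitates the general layout of sox `stats` output (a label followed by one number per column, overall first). Each occurrence of a label corresponds to one `stats` effect in the chain, so the first hit is before upsampling and the second is after.

```python
import re

SENTINEL = 999.0  # hypothetical "could not parse" marker, like the +/-999 idea above
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def first_values(stats_text, label):
    """For every line starting with `label`, return the first number on that line."""
    values = []
    for line in stats_text.splitlines():
        if line.startswith(label):
            match = NUMBER.search(line[len(label):])
            values.append(float(match.group()) if match else SENTINEL)
    return values


# Illustrative text only, shaped like two consecutive stats reports.
sample = """\
             Overall     Left      Right
Pk lev dB      -0.10     -0.10     -0.35
RMS lev dB    -12.41    -12.30    -12.63
             Overall     Left      Right
Pk lev dB       1.27      1.27      0.98
RMS lev dB    -12.43    -12.32    -12.65
"""

peaks = first_values(sample, "Pk lev dB")    # [on-sample peak, upsampled peak]
rms = first_values(sample, "RMS lev dB")
print(peaks, rms[0])
```

Logging the file path next to each parsed row would make it possible to go back to any track whose numbers look odd.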

But I will revisit this, I can not promise when though. I need to set aside a day, and if possible use a different tool than Sox (using foobar2000 as the "decoder" is very practical); peak, RMS and some way to get ISPs (or upsample and gather the peaks) is all I need. Heck, even an upsampler with support for piping would do, I can code a 32bit or even 64bit float peak scanner myself fairly quickly.

I could probably code something similar to sox's "upsample 4 sinc -a 40 -t 8k -24k" if I got some pointers/help though. (No idea how/where to start making an upsampler at all, any ANSI C code out there?)

Reply #33
After reading about that topic a bit I found that iZotope offers a feature built into their limiter that has intersample detection for "True Peaks".
Since Alexey Lukin is a member here and to my understanding is part of the iZotope team, he may give us some idea how they reached their conclusion of the peaks in music hitting above 3dB. Interestingly, their limiter now seems to be able to prevent this directly while mixing it hot.
http://www.izotope.com/support/help/ozone/...s_maximizer.htm

Reply #34
if that anomaly rears its head again, I'll make sure I track the filepaths so I can check the offenders if it's either damaged files/bad encodings or something else.

I'd do this as a matter of course as it's needed to further investigate this anomaly, any other anomaly that might occur in future, and any track with an otherwise interesting result.

I could probably code something similar to sox's "upsample 4 sinc -a 40 -t 8k -24k" if I got some pointers/help though.

With the above in place I don't think it will be difficult to find and fix the problem, but otherwise see http://www.dspguru.com/dsp/faqs/multirate/interpolation (includes c-source) and http://www.itu.int/dms_pubrec/itu-r/rec/bs...;!PDF-E.pdf for the filter coefs.
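As a starting point for the "upsample and scan for peaks" idea, here is a rough pure-Python sketch. It uses a Hann-windowed sinc interpolator, which is an assumption made for illustration: it is not the filter from the ITU document linked above and not what sox does internally, and the function names and the `half_width` of 32 taps per side are made up. It is slow, but it shows the principle.

```python
import math


def true_peak(samples, factor=4, half_width=32):
    """Estimate the inter-sample peak with Hann-windowed sinc interpolation."""
    n = len(samples)
    peak = 0.0
    for i in range(n):
        lo = max(0, i - half_width + 1)
        hi = min(n, i + half_width + 1)
        for k in range(factor):
            t = i + k / factor  # position in units of input samples
            acc = 0.0
            for j in range(lo, hi):
                x = t - j
                if x == 0.0:
                    weight = 1.0
                else:
                    sinc = math.sin(math.pi * x) / (math.pi * x)
                    hann = 0.5 * (1.0 + math.cos(math.pi * x / (half_width + 1)))
                    weight = sinc * hann
                acc += samples[j] * weight
            peak = max(peak, abs(acc))
    return peak


def to_db(value):
    return 20.0 * math.log10(value)


# Sine at fs/4 with a 45 degree phase offset: every sample lands at +/-0.707,
# while the underlying waveform peaks at 1.0 between the samples.
tone = [math.sin(math.pi / 2 * n + math.pi / 4) for n in range(256)]
sample_peak = max(abs(s) for s in tone)
overshoot_db = to_db(true_peak(tone)) - to_db(sample_peak)
print(f"overshoot: {overshoot_db:.2f} dB")
```

For this test tone the estimate should land close to +3 dB above the on-sample peak. A real tool would use a polyphase FIR in C for speed; the inner loop here maps directly onto one.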

Reply #35
Since inter-sample overshoot is a problem for the analog stage of a DAC, what would happen at the corresponding stage in an ADC, assuming the same analog/digital waveform? Would it clip (producing different digital samples, implying that the samples can only be created digitally), or would it just pick non-clipped samples? I guess that depends on if the ADC is essentially a text-book passive analog filter hooked up to a point-sampler, or if it is a multirate (digitally filtered using fixed-point arithmetic) design.

It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified. Are you using an accurate approximation of the ideal sinc filter when discussing this? I guess that a different filter (e.g. lower bandwidth, non-linear phase) could produce fairly different results.

-h

Reply #36
Are you seeing what I'm seeing? i.e. do an X-Y plot of column B against column D.

No! I do not have MATLAB or whatever (can't afford to pay thousands).

Off-topic, but take a look at GNU Octave. It's a free Matlab-clone, with the same syntax (so your old .m scripts still work).

Reply #37
Since inter-sample overshoot is a problem for the analog stage of a DAC...
It's also a problem for the digital section, i.e. the over sampling + reconstruction filter.
Quote
...what would happen at the corresponding stage in an ADC, assuming the same analog/digital waveform?
No sane person digitally samples at levels near clipping - they leave sufficient headroom. Insane people who push the levels like that will probably get clipping, either due to the analogue electronics, the digital processing (oversampling ADC and digital anti-alias filter), or the fact that the peak happens to occur on-sample rather than between samples.

The concern is almost completely with audio that has been processed after the ADC to increase the apparent loudness.

Quote
It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified.
The EBU R128 definition is pretty strict, though it doesn't necessarily give the absolute highest possible true peak.

Cheers,
David.

Reply #38
Since inter-sample overshoot is a problem for the analog stage of a DAC...
It's also a problem for the digital section, i.e. the over sampling + reconstruction filter.


Inter-sample overs are also a problem for any sample-rate conversion process.  ASRC devices will produce many spurious tones when inter-sample clipping occurs.  The solution is to reduce the signal level before executing the SRC process.  In our DAC2 HGC converter, we reduce the signal level by 3.5 dB before the upsampling.
John Siau
Vice President
Benchmark Media Systems, Inc.

Reply #39
After reading about that topic a bit I found that iZotope offers a feature built into their limiter that has intersample detection for "True Peaks".
Since Alexey Lukin is a member here and to my understanding is part of the iZotope team, he may give us some idea how they reached their conclusion of the peaks in music hitting above 3dB. Interestingly, their limiter now seems to be able to prevent this directly while mixing it hot.
http://www.izotope.com/support/help/ozone/...s_maximizer.htm

This +3 dB figure is pretty arbitrary. I think that in practice maybe some 1% of mastered records will show this true peak level. The absolute maximum possible true peak overshoot cannot be specified precisely because it depends on the length and phase response of the DAC's reconstruction filter. If filters are long enough and the signal is specially crafted, there's no theoretical limit for the level of TP overshoot.
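To illustrate the "no limit" point, here is a small sketch under a strong assumption: ideal (truncated) sinc reconstruction, which no real DAC filter matches exactly. Take a sequence of full-scale samples (+1 or -1) whose signs follow the sinc kernel as seen from a point midway between two samples. Every term then adds in phase at that point, and the sum keeps growing (roughly logarithmically) as the sequence gets longer. The function name is made up for illustration.

```python
import math


def worst_case_overshoot_db(half_length):
    """Level (dB re full scale) midway between the two centre samples of a
    +/-1 sequence of 2*half_length samples with sinc-matched signs."""
    total = 0.0
    for k in range(half_length):
        # Samples sit at distance k + 0.5 on either side of the evaluation
        # point, and |sinc(x)| equals 1 / (pi * x) at those half-integer offsets.
        total += 2.0 / (math.pi * (k + 0.5))
    return 20.0 * math.log10(total)


for half_length in (1, 4, 16, 64, 256):
    print(half_length, round(worst_case_overshoot_db(half_length), 2))
```

With only two samples the result is 20*log10(4/pi), about +2.1 dB, and each increase in length adds a bit more, which is why a filter-independent maximum cannot be quoted.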

Reply #40

It seems that inter-sample over values are quoted with great accuracy and confidence, even though the exact reconstruction filter is not specified.
The EBU R128 definition is pretty strict, though it doesn't necessarily give the absolute highest possible true peak.

Cheers,
David.


I vaguely remember something about "phase scrambling" peaks in radio transmission - i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?

Or, one could have a two-path filtering, switching to a cruder interpolation in those few segments where intersample overs are an issue (linear interpolation?)

-k

Reply #41
I vaguely remember something about "phase scrambling" peaks in radio transmission - i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?


The simple solution is to reduce the signal amplitude prior to the SRC and DAC.  With a 24-bit data path, a 3 to 6 dB reduction in gain is of little consequence.  The 24-bit data path has a dynamic range of approximately 144 dB, and a loss of 3 to 6 dB should be insignificant.  Please note that this digital gain reduction must be made up after the DAC to achieve the same playback levels.  This means that there are higher demands on the performance of the DAC. 
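The arithmetic behind this can be checked in a few lines. This is only a worked example: the helper names are made up, and the 3.5 dB value is the DAC2 HGC figure quoted earlier in the thread.

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def dynamic_range_db(bits):
    """Ratio between full scale and one LSB step of a fixed-point word, in dB."""
    return 20.0 * math.log10(2.0 ** bits)


attenuation = db_to_gain(-3.5)             # linear factor applied before the SRC
range_24bit = dynamic_range_db(24)         # roughly 144.5 dB
remaining = range_24bit - 3.5              # what is left after the gain reduction
peak_after = db_to_gain(3.5) * attenuation # a +3.5 dBFS ISP scaled back to full scale

print(f"gain factor {attenuation:.3f}, 24-bit range {range_24bit:.1f} dB, "
      f"remaining {remaining:.1f} dB, ISP after attenuation {peak_after:.3f}")
```

So a +3.5 dBFS intersample peak ends up exactly at full scale, at the cost of about 3.5 dB out of roughly 144 dB.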

Throwing away 3 to 6 dB of SNR is probably the best choice given the fact that DAC ICs are available with very good SNR specifications.
John Siau
Vice President
Benchmark Media Systems, Inc.

Reply #42
I vaguely remember something about "phase scrambling" peaks in radio transmission
Yes. It's also sometimes called Phase Rotation. It's in Optimods and the like. It helps to make asymmetric waveform more symmetric.

One problem is that real-world clipressed audio has probably already been through one. Using this technique again might not generate lower peaks.

Quote
- i.e. messing with the phase so as to minimize peaks while keeping the average levels (or, effectively maximizing the average levels with minimal audible distortion).

Could this be done in a DAC/SRC application? If complexity/delay was of no concern, one could choose between a set of prototype filters that sounded equally good, selecting the filter that minimized intersample overs? Is not this a neater (although certainly overkill) solution than throwing away a few dB of SNR for all material?
I don't think any one would want it in-circuit all the time, and switching could introduce audible transients. There might be a way around it.

Quote
Or, one could have a two-path filtering, switching to a cruder interpolation in those few segments where intersample overs are an issue (linear interpolation?)
If you have no headroom, and it clips, the only choice is to clip it. There is no room even for linear interpolation.

You can use soft or hard limiting, or even a gentle AGC that only acts in the presence of inter-sample overs. I think this would sound worse than just clipping. Anything you do in a typical DAC will be very fast acting (not always desirable) because they introduce so little delay.

Like John, I think the better choice is simply to "throw away" a little headroom. I have never heard of the DAC noise floor being a practical problem (except in older systems without an analogue volume control - i.e. all digital volume control).

Cheers,
David.

Reply #43
I'm performing some topic necromancy to let this thread come to a conclusion.

I recently re-ran the test here, I mostly did this out of my own curiosity.

But I also felt that John Siau, 2bdecided, bandpass, and others that contributed to this thread deserved some more answers.
The plot shown by bandpass showed several tracks with very high Intersample Peaks aka True Peaks.


Apologies that I do not have a tool others can easily use, but I'll describe what I did instead.
I created a tiny program that acted as a custom CLI encoder for Foobar2000, the only thing it did was pipe the audio from foobar2000 to sox and parse and log the output.
I also configured foobar2000 to put the artist/album/track/trackno/replaygain gain/peak info as part of the file name, so that my tiny tool could parse/add that to the log as well.

The latest foobar2000 version was used (v1.3.9), and all tracks were rescanned with the ReplayGain scanner in v1.3.9 with the scanner set to 2x oversampling for peaks (anything higher seemed to report the same peaks, so 4x oversampling as suggested in the EBU R128 specs is not possible). The peaks reported by SOX are thus used instead. I also confirmed visually using Adobe Audition 3 that the wave peaks do reach the oversampled peak values reported by SOX.

The true magic in the test is due to SOX (big thanks to bandpass for giving me clues on how to use the command line properly to do this).
The SOX command line used is the following: sox.exe - -n stats upsample 4 sinc -a 40 -t 8k -24k stats
The audio piped from foobar is piped as 32bit (with wav header).

I had to run 4 instances dividing the workload of 6528 tracks, otherwise it would have taken over 16 hours to run this scan; instead it only took around 6 hours (causing 80% load on a 6 core CPU).

Here are some stats from a little helper program I created to crunch some numbers from the CSV log:

Code:
6528 tracks.
Avg. Peak -1.24 dBFS (Min -24.55, Max 0.00)
Avg. ISP -0.56 dBFS (Min -25.71, Max 9.54)
ISP/Peak Delta -1.79 dB (Min -49.06, Max 9.54)
RMS -16.79 dBFS (Min -42.05, Max 0.00)

24.08% (1572) ISP <-1 dBFS
75.92% (4956) ISP >-1 dBFS

18.57% (1212) ISP -1 to 0 dBFS
43.49% (2839) ISP 0 to 1 dBFS
8.70% (568) ISP 1 to 2 dBFS
1.53% (100) ISP 2 to 3 dBFS
0.57% (37) ISP 3 to 4 dBFS
0.17% (11) ISP 4 to 5 dBFS
0.23% (15) ISP 5 to 6 dBFS
0.46% (30) ISP 6 to 7 dBFS
1.44% (94) ISP 7 to 8 dBFS
0.75% (49) ISP 8 to 9 dBFS
0.02% (1) ISP 9 to 10 dBFS
0.00% (0) ISP 10> dBFS


Immediate conclusion is that 10 dB headroom is needed to avoid any Intersample Peaks or True Peaks from causing distortion (if the distortion is audible or not is a different discussion).
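The 1 dB binning behind the table above could be sketched like this. It is not the original helper program: the function names and the example values are hypothetical, and only the 2839-of-6528 figure is taken from the log.

```python
import math
from collections import Counter


def bin_isps(isp_values_db):
    """Count values per 1 dB bin; each key is the lower edge of the bin in dB."""
    return Counter(math.floor(value) for value in isp_values_db)


def share(count, total):
    """Percentage of `total`, rounded to two decimals like the log output."""
    return round(100.0 * count / total, 2)


# Hypothetical ISP values in dBFS, not real scan data.
example = [-3.2, -0.4, 0.3, 0.9, 1.5, 9.54]
counts = bin_isps(example)
for lower_edge in sorted(counts):
    print(f"{share(counts[lower_edge], len(example)):6.2f}% "
          f"({counts[lower_edge]}) ISP {lower_edge} to {lower_edge + 1} dBFS")

# Cross-check of one row from the log: 2839 of 6528 tracks.
print(share(2839, 6528))
```

Using the floor of the value keeps negative bins consistent, so -0.4 dBFS falls in the "-1 to 0" bin rather than the "0 to 1" bin.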

A few notes on the tested tracks. They are my personal collection, collected over many years. They are MP3, Ogg, AAC, FLAC formats/encodings. Some of it is mainstream. Some of it I've composed myself (3 published albums plus some extra released and non-released stuff).
Some of the tracks are not mainstream, rare or not purchasable. Some have been ripped by me from games (as no official soundtrack existed).
In other words it's a weird, even eclectic, mix of tracks and sourced stuff. Thus it may or may not be representative of the common listener. Then again, people, audiophiles or engineers/techs that know or worry about intersample peaks aren't that common either, certainly not mainstream.

I'll list specific problem tracks if they can be found on the net (legally) somewhere, or instructions on how/where they can be found (you may need to rip it from a game/source yourself for example). If a series of related tracks have high True Peaks then I'll list the artist/source/album so you can search for it yourself. Also note that there may be differences depending on which year/region the release was made in/for.
A few years ago I copied/encoded all my CDs, recycled all my CD covers and CD inlays, and threw the discs in the trash (can't be recycled, at least not at the time), so I no longer have proof of purchase for them (in retrospect kinda stupid, I could have put the CD inlays in a box somewhere). I'm not giving a full list of the tested collection because of that (sorry). Also, not all of them are FLAC (many years ago I ripped most of my music and I never got around to re-ripping it all as FLAC, drive space was not that cheap back then), so the encoding may be the source of the intersample peaks rather than the mastering of the track; I'll mention if this is the case.

Code:
The only track that had a True Peak above +9 dBFS:
Jayce and the Wheeled Warriors, opening theme (mp3) +9.54 dBFS, -11.88 RMS
I can't recall where it's from, I think I ripped this from a YouTube video. I watched this show when I was young, so it's in my collection for nostalgia reasons. Can't share the track for legal reasons, sorry. But it should be searchable on YouTube so try there first (you'll also be treated to a cheesy 80s animated intro).


Code:
Quite a few tracks have surprisingly high intersample peaks that are above +8 dBFS.
Legendary standup comedian George Carlin's performances/recordings/albums "Back in Town", "Complaints And Grievances", "Playin With Your Head" all have true peaks at/above +8.0 dBFS, the RMS varies from -17 dBFS to -21 dBFS, so even if EBU R128 or ReplayGain was used the true peaks would still be above 0 dBFS afterwards.

Michael Land's Monkey Island III OST and RockStar Games Grand Theft Auto Liberty City Stories OST also have very high true peaks, their RMS is higher than George Carlin's stuff for various reasons (talking vs music being one, but also the years they were mastered, and the way they were mastered).
Liberty City Stories OST should be somewhat available (check WIMP, Spotify, iTunes, Google Play, Amazon etc) but I'm unsure if they match the game audio rip or not. (the in-game radio channels are sometimes mastered differently from the individual tracks).

Frank Klepacki's Blade Runner The Game soundtrack also shows similarly high true peaks.
His website is at http://www.frankklepacki.com/ you can find some tracks in the flash player on the page http://www.frankklepacki.com/portfolio/game-BR.html
You'll have to rip the tracks from there (or use a live True Peak meter), the rest of the tracks you'll have to rip/convert from Blade Runner The Game itself.


The majority of the other tracks with true peaks above +3 dBFS were also from the same collections as those above.
(This explains the high anomaly on bandpass's plot/chart that 2bdecided was curious about.)


Code:
A few other collections are above +3 dBFS though.
A few tracks from the "Trilogy" album by "Carpenter Brut" (mp3) for example (check the various outlets or bandcamp or Youtube https://carpenterbrut.bandcamp.com/album/trilogy )
A few tracks from Yuki Kajiura's "Noir" soundtrack OST2 disc (mp3). Eminem's album "Relapse" (mp3). Savant's album "ISM" (FLAC)



Code:
Tracks with True Peaks in the +2 dBFS to +3 dBFS range.
Eminem's album "Relapse", Type O Negative's "World Coming Down" album, Pendulum's album "Immersion", Savant's "ISM" album, Vader's "XXV" album, Ramin Djawadi's season 1 "Game of Thrones" soundtrack. Some of this stuff is more mainstream so tests should hopefully be more easily reproducible by others.


A note to John Siau:
Your DAC headroom of +3.5 dBFS seems to be a pretty good choice. Although in some few cases like Eminem's album "Relapse" it's just barely enough. And artists like Savant (ISM album) or Carpenter Brut (Trilogy album) actually break that +3.5 dBFS headroom. If that is audible or not is another matter. Savant's ISM album actually has true peaks above 0 dBFS as FLAC. The title track "Ism" from the album has a true peak of +4.20 dBFS, though the RMS is at -8 dBFS so EBU R128 or ReplayGain would bring the true peak to below 0 dBFS in this case luckily; the track Mystery has a true peak of +3.59 dBFS.


Closing notes:
For the most part it seems that MP3 or AAC lossy encoding is a common trend among these.
The only exception is Savant's album ISM, which has very high true peaks even as lossless FLAC. The current official place to get the album seems to be http://savantofficial.bandcamp.com/album/ism (the official website and the shop there seem to have issues currently). The preview there is lossless, though I have no idea if the true peak is the same or worse with that.

For those curious, the particular track in question clips a lot and has lots of 8bit type of sounds in it. The rest of the album is similar.
The track Ism (from the album of the same name), if upsampled from 44.1kHz to 192kHz, ends up with a max peak at +5.41 dBFS and over a million possibly clipped samples (or so states Adobe Audition at least). That track may be worth using as a test for DACs, upsamplers or True Peak testers, as it satisfies the criteria of being in the wild and a real world example (rather than the artificial one linked to in the first post of this thread).



I hope this rather long post is of some use to people out there.
It would be nice if someone could make a True Peak scanner tool (maybe base it around EBU R128 and libsoxr?) either as a foobar2000 plugin or a standalone tool (that you can easily pipe audio to) and log True Peak and RMS and other stats in a convenient file that people could upload/put online.


As I said earlier, whether one can hear True Peaks above 0 dBFS or not is another discussion, but I can confidently state that True Peaks above 0 dBFS certainly do not improve the sound quality. And when music does exist in the wild that some of the current very high headroom DACs do not handle, then that is at the very least of interest for further study. Should DACs have +6 dBFS headroom, or maybe +10 dBFS headroom? A +10 dB headroom would handle my entire collection without any clipping, then again I use Foobar and Replaygain so peaks never go that high anyway.
Except for those George Carlin recordings: if I apply ReplayGain to those, then for one of them the gain is +8.19 dB, pushing the true peak from +5.19 to +13.38 dBFS. Now the digital peak is at -2.94 dBFS, so at most the applied gain would be +2.94 dB, which would make the true peak +8.13 dBFS, which is still a lot. Would +10 dB headroom be enough for "all cases" then?
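The George Carlin numbers can be worked through in a short sketch, using the figures quoted above (true peak +5.19 dBFS, gain +8.19 dB, digital peak -2.94 dBFS). The function name and the clip-prevention rule (cap the gain so the on-sample peak does not pass 0 dBFS) are assumptions about how a player might behave, not a description of foobar2000's exact logic.

```python
def peak_after_gain(true_peak_db, gain_db, sample_peak_db=None):
    """True peak after applying gain. If the on-sample peak is given, the gain
    is capped so that the on-sample peak does not exceed 0 dBFS."""
    if sample_peak_db is not None:
        gain_db = min(gain_db, -sample_peak_db)
    return true_peak_db + gain_db


uncapped = peak_after_gain(5.19, 8.19)        # full gain applied
capped = peak_after_gain(5.19, 8.19, -2.94)   # gain limited to +2.94 dB

print(f"uncapped: {uncapped:+.2f} dBFS, capped: {capped:+.2f} dBFS")
```

Even with clip prevention based on the on-sample peak, the true peak stays more than 8 dB above full scale, because the limit only looks at the sample values.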


Note! As I've changed my digital "life" some time ago (new email etc., yay, almost 0 spam now) I'm not using this email/account anymore. Hence the reason I wanted to follow up on this thread. I may be posting under a new nick in the future (no idea when, I've hardly been active at all on HA for a long time now, we'll see I guess).
So this will probably be the last post by this user/nick.


Reply #45
Hi Rescator,

Good to see you back, even if it is a short stay. That's an impressive test you have run. At first I assumed the 32bit output was floating point, but now I wonder if it is fixed point because...

The log you posted says that the max peak is 0dBFS. I guess the true max peak of your mp3s (ignoring inter sample peaks) is higher than that, but the process you used clipped the decoded mp3s before checking for peaks. My guess is that this will increase the discrepancy between peak and intersample peak. I agree it is the way most people would do it, but it is making the situation worse in a completely avoidable way.

I worry that some of your extreme examples are simply mp3s that have been gained too high, rather than genuine examples of the kind of intersample overs that the lossless original would generate.

Cheers,
David.


Reply #47
I did not know that about Sox and 32bit float, thanks bennetng for that.

However, in this case it only means that at worst there are 24 bits of precision. Considering that all the audio is 24-bit at most, that is not really an issue; the upsampling will reveal any ISPs regardless.

2bdecided your questions made me dig deeper in the tested collection and I found some interesting things...

First of all, none of the mp3s have had any gain adjustments; I always preserve the original file and use tags (ReplayGain) instead.

Jayce and the Wheeled Warriors opening theme is an mp3, and taking a look at it in Adobe Audition 3 (using its mp3 decoder) it turns out it is mono with a 16 kHz sampling frequency. The spectral display shows frequencies only up to about 5 kHz (it looks like some form of brickwall filter was applied around there), and there are also vertical "lines" in a few places in the 5 kHz-8 kHz range.
Loading/decoding to 16-bit and upsampling to 176 kHz showed no ISPs above +1.05 dBFS, and the same happened when decoding to 32-bit float first.
This made me question the mp3 decoder in Audition 3, so I used foobar2000 (same converter as in my mass scan) and the ISP ended up at +1.27 dBFS. Next I tried upsampling to 64 kHz (4x 16 kHz) in case 176 kHz was squashing the ISPs somehow, but the results were almost identical.

Then I scratched my head and looked at the sox parameters I used and I realized a few things.

#1. I am/was ignorant of certain Sox parameters; just using parameters without understanding them is bound to cause mistakes.
#2. I think I used the wrong parameters for the right reason.
#3. Mimicking EBU R128 is wrong: its True Peaks are measured after filtering, while my concern is ANY True Peaks/ISPs, the kind that a DAC needs to handle rather than what is audible.
#4. I did not check my tools/setup from 2013 and simply re-used it. A leftover compensation for a gain reduction of 12 dB was still applied, but the gain reduction itself was never actually used with the sox parameters, so this tainted the results (imagine a major facepalm here).


So I'm gonna rerun the whole thing one more (a third) time.
And I'll be dropping the lowpass/highpass stuff and will no longer try to match EBU R128 filtering. After all, a DAC has to handle/process the whole frequency spectrum anyway.

I will test foobar2000 v1.3.9's new resampler which is slow but good against Sox and see which is better/fastest.
If the speed is very similar I may just drop Sox and let Foobar do the upsampling. In that case I'll recode my tool to not use Sox at all but instead scan the floating point data directly (all I need to do is check for the highest (now upsampled) peak and gather RMS stats).
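
A minimal sketch of what "scan the floating point data directly" could look like, in Python. This is not the actual tool from the post: the class name is made up, samples are assumed to arrive as blocks of floats where 1.0 is full scale, and RMS is referenced so that a mean square of 1.0 reads as 0 dBFS (no weighting curve).

```python
import math


class PeakRmsScanner:
    """Accumulates the absolute peak and the mean square over blocks of floats."""

    def __init__(self):
        self.peak = 0.0
        self.sum_squares = 0.0
        self.count = 0

    def feed(self, block):
        # One pass per block: track the largest magnitude and the energy sum.
        for s in block:
            self.peak = max(self.peak, abs(s))
            self.sum_squares += s * s
            self.count += 1

    def peak_dbfs(self):
        return 20.0 * math.log10(self.peak) if self.peak > 0.0 else float("-inf")

    def rms_dbfs(self):
        if self.count == 0 or self.sum_squares == 0.0:
            return float("-inf")
        return 10.0 * math.log10(self.sum_squares / self.count)


scanner = PeakRmsScanner()
for block in ([0.5, -0.5], [0.5, -0.5]):   # made-up, already upsampled data
    scanner.feed(block)
print(f"peak {scanner.peak_dbfs():+.2f} dBFS, RMS {scanner.rms_dbfs():+.2f} dBFS")
```

Because the samples stay in floating point, values above 1.0 are kept as they are and show up as positive dBFS figures instead of being clipped.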

If it turns out Sox is still the better option then I'll have to go back to my old idea of applying a negative gain (of about -12 dB) in Sox so it won't clip any peaks when upsampling, and then compensate for that in my stats gathering tool by adding +12 dB to the results (and make sure that neither of the two steps is forgotten again).
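
The attenuate-then-compensate idea can be sketched as below. The `clipping_stage` function is only a stand-in for any processing step whose output is limited to full scale; it is not Sox, and no Sox options are implied. Keeping the pre-gain and the compensation in one function is one way to avoid forgetting either step.

```python
import math

PRE_GAIN_DB = -12.0   # attenuation applied before the clipping-prone stage


def clipping_stage(samples):
    """Hypothetical stage whose output is hard-limited to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, s)) for s in samples]


def measured_peak_dbfs(samples, pre_gain_db=PRE_GAIN_DB):
    """Attenuate, run the stage, then add the attenuation back to the result."""
    scale = 10.0 ** (pre_gain_db / 20.0)
    processed = clipping_stage([s * scale for s in samples])
    peak = max(abs(s) for s in processed)
    if peak == 0.0:
        return float("-inf")
    return 20.0 * math.log10(peak) - pre_gain_db


overs = [0.3, -1.6, 1.9, 0.8]   # made-up samples with peaks above full scale
print(f"no pre-gain: {measured_peak_dbfs(overs, 0.0):+.2f} dBFS")
print(f"-12 dB pre-gain, compensated: {measured_peak_dbfs(overs):+.2f} dBFS")
```

Without the pre-gain the stage flattens everything to 0 dBFS; with it, the peak of 1.9 survives and is reported at its real level.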

Once done I'll post again here and I may start another thread with just the resulting anonymized CSV (I'll try to add a plot graph as well) so people can see the spread and the percentage of certain high peak values. Maybe others will add their own results to such a thread (would be nice to see how frequent high ISPs are in people's collections).


One would think that with so many years of programming experience I would not make "beginner" mistakes like this. I'm sure some professional statisticians out there are deservedly laughing their asses off right now though.

It just goes to show that unless your test is correct the results mean crap all. Check, check, then triple check. Study anomalies individually and fiddle with the test to see if the anomaly was caused by the data tested or by your way of testing. And question/never trust yourself; assume you did make a mistake even if you are sure you didn't.


Anyway, I'm sorry for "wasting" you folks' time. I'll be back in about half a day with some correct and proper stats.

Savant's Ism though is still the track with the highest True Peak I've seen so far that is a "normal" in the wild music track/recording, so at least "something" came out from this mess.


Reply #48
Now things look a bit more "normal".

Code:
6486 tracks.
Avg. ISP -0.78 dBFS (Min -24.55, Max 5.75)
RMS -16.87 dBFS (Min -42.05, Max -5.88)

27.03% (1753) ISP <-1 dBFS
72.97% (4733) ISP >-1 dBFS

20.35% (1320) ISP -1 to 0 dBFS
34.84% (2260) ISP 0 to 1 dBFS
13.61% (883) ISP 1 to 2 dBFS
3.25% (211) ISP 2 to 3 dBFS
0.62% (40) ISP 3 to 4 dBFS
0.25% (16) ISP 4 to 5 dBFS
0.05% (3) ISP 5 to 6 dBFS
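
A breakdown like the one above could be produced along these lines. This is a sketch, not the tool that generated those numbers: the function names and the input values are made up, and it assumes 1 dB wide buckets with everything under -1 dBFS lumped together.

```python
import math
from collections import Counter


def bucket_label(isp_dbfs, floor_db=-1):
    """Name of the 1 dB bucket a peak falls into; values under floor_db share one bucket."""
    if isp_dbfs < floor_db:
        return f"< {floor_db}"
    low = math.floor(isp_dbfs)
    return f"{low} to {low + 1}"


def summarise(isps):
    """Map each bucket label to (count, percentage of all tracks)."""
    counts = Counter(bucket_label(v) for v in isps)
    total = len(isps)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


example = [-3.2, -0.4, 0.3, 0.9, 1.5, 5.75]   # made-up ISP values in dBFS
for label, (n, pct) in summarise(example).items():
    print(f"{pct:.2f}% ({n}) ISP {label} dBFS")
```

As a sanity check on the posted figures, 1753 of 6486 tracks is 27.03% and the seven buckets from -1 dBFS upwards add up to 4733, so the table is internally consistent.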


After testing the new dBpoweramp/SSRC resampler in foobar2000 which is very slow (would take me two days to run the scans) and the PPHS resampler (with and without Ultra) I found that using PPHS without ultra enabled let me run each scan in about 2-3 hours (the processing was split into 5 parts using 5 cores so that's about 3 hours times five if it was on a single core).
There are some differences between the resamplers, but they are minimal when looking for Inter-Sample Peaks.

PPHS (no ultra) set to resample to 192000 Hz was used as the setting. Peak (ISP) and RMS were then scanned using my own tool.
Of particular interest is that foobar2000 v1.3.9's ReplayGain scanner set to "Peak scan oversample factor: 4" is actually pretty close to upsampling and then checking for the highest peaks as I did. Unfortunately neither it nor foobar shows the peak relative to dBFS, nor is RMS shown (which I always find useful as RMS(Z) has no loudness curve).
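
To illustrate why oversampling and then taking the highest peak reveals inter-sample peaks, here is a small pure-Python sketch. It is not the algorithm used by foobar2000, PPHS, SSRC or Sox; it assumes a plain Hann-windowed sinc interpolator, and the function names, kernel width and test signal are chosen only for illustration. The test signal is the classic worst case: a tone at a quarter of the sample rate whose samples all sit at full scale while the waveform peaks between them.

```python
import math


def windowed_sinc(x, half_width):
    """Hann-windowed sinc kernel, zero for |x| >= half_width."""
    if x == 0.0:
        return 1.0
    if abs(x) >= half_width:
        return 0.0
    window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
    return math.sin(math.pi * x) / (math.pi * x) * window


def oversample(samples, factor, half_width=16):
    """Interpolate `factor` output points per input sample (zero-padded at the ends)."""
    out = []
    for m in range(len(samples) * factor):
        t = m / factor                      # position in input-sample units
        centre = math.floor(t)
        lo = max(0, centre - half_width + 1)
        hi = min(len(samples) - 1, centre + half_width)
        out.append(sum(samples[k] * windowed_sinc(t - k, half_width)
                       for k in range(lo, hi + 1)))
    return out


def peak_dbfs(samples):
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")


# fs/4 tone sampled 45 degrees off its crests: every sample is exactly +/-1.0.
signal = [1.0, 1.0, -1.0, -1.0] * 64
sample_peak = peak_dbfs(signal)
true_peak = peak_dbfs(oversample(signal, 4))
print(f"on-sample peak: {sample_peak:+.2f} dBFS")
print(f"4x oversampled peak: {true_peak:+.2f} dBFS")
```

The on-sample peak reads 0 dBFS while the oversampled peak comes out at roughly +3 dB, since the underlying sine is about 1.41 times larger than its samples. A real resampler uses a longer and better filter, so exact figures will differ a little between implementations.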


I had to remove a few 5.1 tracks that produced invalid values (I have not checked what the cause was, I might later; I just dropped them from the test results instead).
Likewise a few tracks ended up with +infinity in the results and a mangled filename. Here I suspect that foobar's convert process choked on the filename for some reason (I was using/misusing it to pass along album/artist/track/title and ReplayGain peak and gain details), so I had to remove those results as well.

And for the record I did do a full ReplayGain rescan of all the files using factor 4 (previously I had only used factor 2). A factor of 4 added a little more precision to the True Peak detection it seems for certain tracks. Also of note is that ReplayGain is "faster" than using a tool like mine combined with foobar's convert with PPHS as the overhead is way less.


Now back to the results...
Of the 6486 tracks in the result none have a True Peak/Inter-Sample Peak above +6 dBFS.
Carpenter Brut's Trilogy, Eminem's Relapse and Savant's Ism albums are all in the +3 or higher area.
As mentioned before, Trilogy and Eminem are mp3s and Ism is FLAC, so very high ISPs are not restricted to lossy only.

I think I'll just end these tests here.
I'll just end with the fact that in my tested collection of music none went above +6 dBFS, so a DAC with 6dB headroom would have no clipping with any of my music.
I'd also like to note that with ReplayGain the majority of the tracks with high ISPs usually get a negative gain adjustment, so they usually end up below 0 dBFS, and with the new scan factor in foobar2000 1.3.9 combined with clipping protection enabled that will probably never be an issue.

But for DAC designers/audiophiles it may be of interest that there exists actual music in the wild with ISPs as high as +5.75 dBFS, which could be of some concern (how would a DAC or speaker handle that?).

 

Reply #49
More necromancy. I was curious enough to test a little bit, and found a +11 dBTP measurement in one of my CDs using one of the algorithms. Merzbow, not surprising - so you may wonder if all clipping is part of the music the way the artist intended it :-o

But I also tested a selection of albums with positive album gain, because here is where the peak value stored will limit upon playback. (The Merzbow album has an album gain of -21 dB, so the +11 won't bring it across zero.) General findings: a couple of dB suffices on everything I tested. Which excluded classical music (where there is too much that peaks too low, it would take much more work), test signal CDs, HDCDs and pre-emphasis CDs.  All were lossless CDs/Bandcamp downloads in the FLAC format.


The "positive gain" selection: Initially I set some criteria for entire albums, but ended up searching up tracks with high peaks from albums with positive RG album gain (calculated by fb2k, new algorithm), which is where clipping prevention according to peak would kick in.  I ended up scanning some 163 tracks, but that was because I was too lazy to remove stuff.
Then I picked a handful of tracks and scanned them with several "True peak" settings using foobar2000 1.4beta13. 
Music: Lots of prog.rock and the like. Diamanda Galás, Pink Floyd: "Mother" from the Shine On version of The Wall, The opening track from Demon's "British Standard Approved", Jonas Hellborg - and Bobby McFerrin (the "Voice" album).
Some general remarks:
* Turns out that in my setup, the SoX resampler does not find any intersample peaks at all.  Maybe it could have something to do with my setup, using SoX resampler to get rid of some odd sample frequencies.  Anyway, SoX excluded.
* Auto 2x/4x/8x return precisely the same figures. 

Results:
* Nothing in this selection went above +1.30 dBTP.  That track is White Willow: "John Dee's Lament" from their debut Ignis Fatuus, RG track peak was 0.98something.
* Others were close to or above +1 dBTP: McFerrin, Demon, Hellborg.
* Those who are worried about their Pink Floyd: +0.48 dBTP.
* Differences between algorithms are smallish, less than 0.1 dB.

Learnings: a couple of dB seems to take care of everything I tested, and algorithms make little difference on these tracks.


Then the LOUD albums: 12 albums (112 tracks) with album gain -16.00 dB (fb2k, new algorithm) and below.  Quite a lot of industrial/noise (three involving Merzbow),  a couple of black and death metal albums, and the infamous Stooges "Raw Power" reissue.  All tracks 16/44.1 but one track 24/44.1 in a Bandcamp purchase; that didn't turn out to matter.
* "No oversampling" track peaks overview: Six of twelve albums at full (.9999969 or 1 for all tracks).  Four more albums entirely within -0.12 dB.  The last two a bit particular: Deathstorm: We are Deathstorm, all at -0.2 dB and one -0.48 dB - and then a 41 minutes Merzbow concert in a single track, at -1 dB.
* Every track "above" the Stooges: Raw Power peak has Deathstorm or Merzbow involved.  Only here the choice of algorithm makes big differences in numbers:  The "worst" is Merzbow: Venereology, and here a track ranges +7.15 dBTP to +11.30 dBTP  depending on algorithm (the highest using dBpoweramp/SSRC - both are still way short of the album gain of -21.76 dB though).
* Stooges: Raw Power: variation among tracks in the interval [+1.95 dBTP, +2.98 dBTP] for PPHS default. Variations among algorithms: from PPHS default and .3 to .4 dBTP upwards (PPHS ultra, dBpoweramp/SSRC), with "auto Nx" in between.
* None of the albums get album peaks below +1.77 dB (PPHS, both default and Ultra) to +1.84 (auto Nx). 
* Two to three of the 112 tracks stay below 0.  All Deathstorms. The "third" of these range -0.02 dBTP to +0.14 dBTP.

Learnings: All these albums bump up ~ 2dB or more, up to a whopping 11 dB. Only on one album did the algorithm really matter for the number - but remember again, the large negative RG album gain will more than compensate.