Topic: pfpf v0.1

• Axon
• Members (Donating)
pfpf v0.1
13 January, 2008, 10:35:38 PM
http://audiamorous.blogspot.com/2008/01/pf...dynamic_13.html
Quote
The dynamic range of a selection of music is dependent on both estimating the time-varying loudness of the music and the timescale used for loudness evaluation. I propose a numerical method of estimating dynamic range that satisfies those dependencies using a modified ITU-R 1770 loudness filter and three moving windows to estimate loudness across three different timescales. The goal is to more accurately measure and compare dynamic range between different music genres and different masterings and processing techniques for the same music.

Summary of algorithm:
• Apply ITU-R 1770 filters to convert amplitude to instantaneous loudness.
• Estimate loudness across three different timescales by computing 10ms ("short term"), 200ms ("medium term") and 3000ms ("long term") windowed RMS power.
• Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.
• Threshold loudness at each timescale to remove silence (optional).
• Compute a histogram for each loudness estimate.
• Dynamic range = range between the 50th and 97.7th percentiles, for each timescale.
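The last three steps of the summary can be sketched in a few lines of Python. This is only an illustration, not pfpf's code (pfpf is a LabVIEW program): it assumes the loudness values are already available in dB, skips the ITU-R 1770 filtering and windowing, and uses interpolated percentiles of the sorted values in place of a histogram. The function names and the example numbers are made up for the sketch.

```python
import math


def percentile(sorted_values, fraction):
    """Linearly interpolated percentile of a sorted list (fraction in 0..1)."""
    if not sorted_values:
        raise ValueError("no values left to take a percentile of")
    position = fraction * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    low_value, high_value = sorted_values[lower], sorted_values[upper]
    return low_value + (high_value - low_value) * (position - lower)


def dynamic_range_db(loudness_db, low=0.5, high=0.977, silence_db=None):
    """Spread in dB between two percentiles of a loudness series.

    silence_db is an optional gate: values below it are dropped first, so
    silence cannot pile up at the bottom of the distribution.
    """
    kept = sorted(v for v in loudness_db if silence_db is None or v >= silence_db)
    return percentile(kept, high) - percentile(kept, low)


# Hypothetical loudness series: mostly -20 dB with some -14 dB passages,
# then the same series with a long stretch of -100 dB "silence" added.
music = [-20.0] * 80 + [-14.0] * 20
padded = [-100.0] * 150 + music

print(dynamic_range_db(music))                     # 6.0
print(dynamic_range_db(padded))                    # 86.0, the silence drags the median down
print(dynamic_range_db(padded, silence_db=-80.0))  # 6.0 again once the silence is gated
```

The padded example shows why the optional threshold step matters: without it, enough silence moves the 50th percentile onto the silence itself.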
I've been kicking this around for almost a year, but I finally broke down and wrote the thing for real in an afternoon last November (it's been extensively tuned since then). The recent discussions about dynamic range have forced my hand, because so many important things were touched upon, and really, you can think of pfpf as an extremely elaborate reply to that topic.

This is a better way to measure dynamic range, for the following reasons:
• It measures dynamic range as a ratio of loudnesses. Peak-to-average cannot claim this (it is fundamentally a comparison of two different units). ReplayGain comparisons cannot claim this.
• It uses a real loudness model (flawed though it is) for the basis of loudness estimation. Waveform comparisons (especially for loudness-war-related discussions) are fundamentally flawed for this reason - what you get out of Audacity has a relatively tenuous connection to real perceived loudness.
• Dynamic range is estimated across three different timescales - 3000ms, 200ms, and 10ms - and each scale is fully decorrelated from the others. So pfpf can tell the difference between a quiet passage with a loud transient and a loud passage with a sudden pause. The timescales are configurable.
• It uses a percentile approach on a histogram for estimating dynamic range, instead of min/max/avg. This makes the technique much more resilient to differences in mastering and medium; pops and ticks should not affect results, nor should small bits of digital silence, like in greynol's Tool example. (Yes, greynol, you can distinguish ppp from fff now.) The percentiles are configurable.
• Background noise (when no music is playing) can be masked with a fixed threshold, so that silence won't pile up on one side of the histogram and distort the numbers, and the results should be invariant to any extra silence padding before/after the music (this should make CD/vinyl comparisons a lot easier). The threshold is configurable.

• greynol
• Global Moderator
pfpf v0.1
Reply #1 – 14 January, 2008, 01:47:24 AM
Quote
This makes the technique much more resilient to differences in mastering and medium; pops and ticks should not affect results, nor should small bits of digital silence, like in greynol's Tool example. (Yes, greynol, you can distinguish ppp from fff now.)
Easy now, killer.  There were no small bits of digital silence in the track I presented.

Anyway, I look forward to checking this out.

Great post!
13 February 2016: The world was blessed with the passing of a truly vile and wretched person.

• Axon
• Members (Donating)
pfpf v0.1
Reply #2 – 14 January, 2008, 02:51:02 AM
OK, replace "digital silence" in that sentence with "milliseconds of extremely quiet sound in the middle of a loud passage".

• greynol
• Global Moderator
pfpf v0.1
Reply #3 – 14 January, 2008, 11:42:44 AM
Quote
OK, replace "digital silence" in that sentence with "milliseconds of extremely quiet sound in the middle of a loud passage".

A 4-second window revealed an average RMS power of -57.4dB!

• Axon
• Members (Donating)
pfpf v0.1
Reply #4 – 14 January, 2008, 12:55:55 PM
Oh. My bad. Well, run it through pfpf and lemme know what you see.

• Vitecs
pfpf v0.1
Reply #5 – 15 January, 2008, 08:50:53 AM
Tried to play a little with it.

1. Values are too small for pop music, even with good dynamics.

2.  I made two files, 1k Sine tone with following distribution:
File1: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;
File2: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;

So, File2 is just two concatenated File1s.
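For anyone who wants to repeat this test, files like these can be generated with a short script. This is a sketch under a few assumptions that the description above does not state: a 44100 Hz sample rate, mono 16-bit PCM, and levels read as peak level relative to full scale. The function names are hypothetical.

```python
import math
import struct
import wave


def stepped_sine(levels_db, seconds_each, rate=44100, freq_hz=1000.0):
    """Sine tone whose peak level steps through levels_db (dB re full scale)."""
    samples = []
    per_step = int(round(seconds_each * rate))
    for step, level_db in enumerate(levels_db):
        amplitude = 10.0 ** (level_db / 20.0)
        for n in range(per_step):
            t = (step * per_step + n) / rate  # phase stays continuous across steps
            samples.append(amplitude * math.sin(2.0 * math.pi * freq_hz * t))
    return samples


def write_wav(path, samples, rate=44100):
    """Write mono 16-bit PCM; 0 dB maps to the largest positive sample value."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)


# The level pattern of File1; File2 is the same pattern twice.
PATTERN = [-12.0, 0.0, -12.0, 0.0]
```

File1 would then be `write_wav("file1.wav", stepped_sine(PATTERN, 5.0))` and File2 `write_wav("file2.wav", stepped_sine(PATTERN * 2, 5.0))`.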

I expect to have equal reports but:

File1:
ITU-R1770 loudness: 1.355045 db
Long term dynamics: 4.244141 db
Medium term dynamics: 4.674053 db
Short term dynamics: 0.024499 db

File2:
ITU-R1770 loudness: 1.355051 db
Long term dynamics: 3.323366 db
Medium term dynamics: 4.674758 db
Short term dynamics: 0.024645 db

Difference in "Long term dynamic". Is it predictable and OK?

• Axon
• Members (Donating)
pfpf v0.1
Reply #6 – 15 January, 2008, 05:35:34 PM
Quote
Tried to play a little with it.

1. Values are too small for pop music, even with good dynamics.
The numbers are not directly comparable to other metrics. You can't compare them to RG numbers or peak-to-average numbers. You need to evaluate them on their own.

That said, a lot of the constrained range is because of the percentiles I'm choosing. For the long term time scale, I could make a strong case for ignoring the 50th percentile entirely, and defining the range as between, say, the 5th and 95th percentiles. I suspect the same case could be made for the shorter timescales.

Changing from 0.5-0.977 to 0.05-0.95 would increase the results by roughly 65% if the histograms are normal (and the medium/short time scales are): the first range spans about 2 standard deviations, the second about 3.3.
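The effect of changing the percentiles can be checked directly for an ideal normal distribution. The snippet below is just that arithmetic, using Python's `statistics.NormalDist`; it says nothing about how close real histograms are to normal, and the helper name is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def percentile_span(low, high):
    """Width, in standard deviations, between two percentiles of a normal curve."""
    return standard_normal.inv_cdf(high) - standard_normal.inv_cdf(low)


current = percentile_span(0.5, 0.977)   # about 2.0 standard deviations
proposed = percentile_span(0.05, 0.95)  # about 3.3 standard deviations
ratio = proposed / current              # about 1.65
print(current, proposed, ratio)
```

So for a strictly normal histogram the wider percentile pair scales the reported range by about 1.65, somewhat short of a full doubling.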

Quote
2.  I made two files, 1k Sine tone with following distribution:
File1: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;
File2: -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec, -12dB 5sec, 0 dB 5sec;

So, File2 is just two concatenated File1s.

I expect to have equal reports but:

File1:
ITU-R1770 loudness: 1.355045 db
Long term dynamics: 4.244141 db
Medium term dynamics: 4.674053 db
Short term dynamics: 0.024499 db

File2:
ITU-R1770 loudness: 1.355051 db
Long term dynamics: 3.323366 db
Medium term dynamics: 4.674758 db
Short term dynamics: 0.024645 db

Difference in "Long term dynamic". Is it predictable and OK?

The change in long term dynamics is expected. The basic problem is that the loudness computations maintain state over several seconds of music, and at the start of the file, that state must be initialized to something. There are three options for initialization:
1. Set it to zero
2. Initialize it with the music at the very start of the file
3. Initialize it with the music at the very end of the file
Choosing #3 would effectively stop the problem you are seeing with differing dynamics measurements, because you're essentially treating the .wav as a giant loop containing a periodic signal, and repeating the signal will not change the results at all. But I would argue that such a situation simply does not exist with real-world music, and it is not as important to tune for it as you think.

#1 means that every analyzed file starts from a long-term volume of zero - and I believe that's wrong for most situations where music is played, where loudness is fairly equalized with what was played beforehand. A similar problem exists for #3 - what happens if the music ends at maximum loudness, but starts very quietly? The loudness will incorrectly be initialized to a very high level. #2 avoids this issue, but results in the issue you see, where repeating the signal yields a different result.
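A toy model makes the trade-off between the three options visible. The sketch below is not pfpf's implementation: it uses a plain trailing moving average over made-up block powers, and it reads "initialize with the start of the file" as filling the window with copies of the first block, which is only one possible interpretation.

```python
def moving_mean(power, window, init):
    """Trailing moving average of block powers with a chosen start-up state.

    init selects what fills the window before the first block arrives:
    "zero" -> silence, "start" -> copies of the first block,
    "end" -> the last blocks of the file, as if the file looped.
    """
    fill = window - 1
    if init == "zero":
        history = [0.0] * fill
    elif init == "start":
        history = [power[0]] * fill
    elif init == "end":
        history = power[len(power) - fill:]
    else:
        raise ValueError("unknown init: " + init)
    padded = history + power
    return [sum(padded[i:i + window]) / window for i in range(len(power))]


# Hypothetical block powers: a quiet stretch then a loud stretch, and the
# same thing played twice, mirroring the File1/File2 test.
once = [0.0625] * 6 + [1.0] * 6
twice = once * 2

results = {}
for init in ("zero", "start", "end"):
    single = sorted(moving_mean(once, 4, init))
    double = sorted(moving_mean(twice, 4, init))
    # Repeating the file leaves the distribution unchanged only if every
    # value simply appears twice as often.
    results[init] = double == sorted(single * 2)

print(results)  # {'zero': False, 'start': False, 'end': True}
```

Only the looped initialization gives the repeated file the same distribution of long-term values, and therefore the same percentiles, which matches the reasoning above.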

---

In theory, a gated sine wave should have a dynamic range of zero, because the silence is masked in any listening environment. That is, the dynamic range of a recording is connected to the dynamic range of the listening environment. In reality, the thresholds should probably be raised from -80 dB because they are grossly generous to the listening environment.

Also, I think I see a bug in the histogram calculations that generate the long term and medium term dynamics calculations. If you look at the histograms and the percentile lines, they are way out in the middle of nowhere; they're interpolating between the high points on the histogram, when they probably ought to be clamped somewhere. I'll look into a nice way of fixing this.

Thank you for testing this!  Anybody else interested?

• 2Bdecided
• Developer
pfpf v0.1
Reply #7 – 16 January, 2008, 04:47:11 AM
This is very interesting. Most of it makes perfect sense, but can you explain this part in a little more detail please...

•     Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.

...I think I know what you mean, but I'm not 100% sure.

Cheers,
David.

• Axon
• Members (Donating)
pfpf v0.1
Reply #8 – 16 January, 2008, 12:44:57 PM
Quote
This is very interesting. Most of it makes perfect sense, but can you explain this part in a little more detail please...

•     Decouple timescales by scaling 10ms loudness by 200ms loudness, and 200ms loudness by 3000ms loudness.

...I think I know what you mean, but I'm not 100% sure.

Cheers,
David.

Loudness, as a perceptual quality, is scale-dependent. It can vary across very large timescales (seconds to minutes), and it can vary across very short timescales (milliseconds), and the variation can be unrelated between timescales. This is important information that should be captured numerically, but capturing short term loudness also captures the long term loudness - one needs to isolate that out in order to estimate the short term dynamic range accurately.

Example: Say you have two recordings of two guys in a quiet field. One guy is speaking into the microphone at a varying volume from 1 meter away for a bit. The other guy yells at the microphone from 100 meters away after that, saying the same things that the first guy said, at the same volumes. Clearly, the overall, or long-term, loudness changes dramatically between the different speakers, and each of the two loudnesses is fairly constant. But at a smaller timescale, they're both guys who are yelling the same thing. If you remove the large-scale loudness difference, the short-term loudness varies dramatically (alternating between yelled words and silence), and the variation is going to be the same for the two speakers. In other words, the long term loudness differs greatly between the two speakers, but the long term dynamic range is very low; the short term loudness, when equalized for long term loudness, is the same for the two speakers, and the short term dynamic range is higher.

In comparison, a simple program-wide loudness estimation at a small timescale, like 50ms, with a percentile measurement (50th for ITU-R1770, 95th for ReplayGain) would lock onto either the loudness of the closer guy, or average out at some ill-defined region of loudness that doesn't correspond to any actual loudness in the recording. This is correct for a program loudness equalization system, which those systems are designed for, but for estimating dynamic range, estimations of this kind lose meaning.

However, the same kind of problem exists with peak-to-average measurements, because it also uses a program-wide loudness estimation. And those are used to estimate dynamic range.

pfpf solves this by scaling shorter-term loudness by longer-term loudness. RMS power is first calculated in blocks of the smallest size (10ms). This represents the loudness at the short term timescale. Then it holds two moving windows over the last several 10ms blocks - one window is for 200ms, the other window is for 3000ms. Computing RMS power for these windows yields the medium term and long term loudnesses. Then, I divide the 10ms loudness by the 200ms loudness, and the 200ms loudness by the 3000ms loudness. This is how I claim to decouple the timescales. It's hokey, but it seems to work ok.
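The scaling step described above might look roughly like this in Python. It is a sketch of the idea rather than a port of pfpf: it assumes the samples have already been through the loudness filter, uses trailing windows that are simply shorter at the start of the file (a start-up choice made for the example), and all function names are hypothetical. Averaging the 10ms block powers over a window gives the same mean-square power as computing it over the whole window, because the blocks are all the same size.

```python
import math


def block_power(samples, block):
    """Mean-square power of consecutive, non-overlapping blocks."""
    usable = len(samples) - len(samples) % block
    return [sum(s * s for s in samples[i:i + block]) / block
            for i in range(0, usable, block)]


def trailing_mean(values, window):
    """Mean over the last `window` values (fewer at the very start)."""
    out = []
    for i in range(len(values)):
        recent = values[max(0, i - window + 1):i + 1]
        out.append(sum(recent) / len(recent))
    return out


def to_db(ratio, floor=1e-12):
    """Power ratio in dB, floored to avoid taking the log of zero."""
    return 10.0 * math.log10(max(ratio, floor))


def decoupled_loudness(samples, rate, short_ms=10, medium_ms=200, long_ms=3000):
    """Return (short, medium, long) series in dB, one value per short block.

    short is relative to medium, medium is relative to long, long is absolute.
    """
    short = block_power(samples, rate * short_ms // 1000)
    medium = trailing_mean(short, medium_ms // short_ms)
    long_ = trailing_mean(short, long_ms // short_ms)
    short_rel = [to_db(s / max(m, 1e-12)) for s, m in zip(short, medium)]
    medium_rel = [to_db(m / max(lg, 1e-12)) for m, lg in zip(medium, long_)]
    return short_rel, medium_rel, [to_db(lg) for lg in long_]


# Toy input: 5 s at a constant low level, then 5 s at a level 20 dB higher.
rate = 1000
signal = [0.1] * (5 * rate) + [1.0] * (5 * rate)
short_rel, medium_rel, long_db = decoupled_loudness(signal, rate)
print(short_rel[499], short_rel[500], short_rel[999])  # near 0, about 12, near 0 dB
print(long_db[499], long_db[999])                      # near -20 and near 0 dB
```

In the toy input the 20 dB step shows up as a brief short-term excursion right after the change, while the settled parts read about 0 dB at the short timescale regardless of their absolute level.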

---

On a different note: Is Blogger a crappy way to publish this? Should I put this up on a different site, or just throw up my own HTML file, or make a PDF?

• Axon
• Members (Donating)
pfpf v0.1

• carpman
• Developer
pfpf v0.1
Reply #10 – 19 January, 2008, 01:54:59 AM
Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

It could be just me being lazy, but I guess I've got used to apps being less of a deal to run.

I wonder if this in part explains the low response to what I would have thought (due to the whole loudness war issue) is a pretty hot topic on HA.

Just a thought.

I'm not familiar with LabView -- do a lot of applications use it?

C.
PC = TAK + LossyWAV  ::  Portable = Lame MP3

• edwardar
pfpf v0.1
Reply #11 – 19 January, 2008, 03:53:39 PM
Hi, Just wanted to say I'm really interested in this!
I've downloaded all the stuff (just enter junk details for the labview runtime), but haven't had time to check stuff out yet. Will post back soon.

Ed

• Axon
• Members (Donating)
pfpf v0.1
Reply #12 – 20 January, 2008, 04:42:44 PM
Quote
Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

It could be just me being lazy, but I guess I've got used to apps being less of a deal to run.

I wonder if this in part explains the low response to what I would have thought (due to the whole loudness war issue) is a pretty hot topic on HA.

Just a thought.
Oh, yeah - I guess that could be a downer.

Here's a direct link to the small runtime installer - it's designed for web browser integration but I think it has enough to run pfpf. It's 23MB and doesn't require registration.

http://ftp.ni.com/support/softlib/labview/...vruntimeeng.exe

Otherwise, I could build an installer .exe that has pfpf and the runtime included, but then the download size jumps from 2MB to 64MB (!).

Quote
I'm not familiar with LabView -- do a lot of applications use it?

C.

It's used in a wide variety of scientific and engineering applications, but it's generally used more for institutional use than end-user use. (One notable exception is Lego Mindstorms NXT, albeit in a radically altered form.) I use it because it's the best tool I have available for the job.

(Full disclosure: that's largely because I work for NI.)

• carpman
• Developer
pfpf v0.1
Reply #13 – 20 January, 2008, 07:45:10 PM
Quote
Oh, yeah - I guess that could be a downer.

Here's a direct link to the small runtime installer - it's designed for web browser integration but I think it has enough to run pfpf. It's 23MB and doesn't require registration.

http://ftp.ni.com/support/softlib/labview/...vruntimeeng.exe

Otherwise, I could build an installer .exe that has pfpf and the runtime included, but then the download size jumps from 2MB to 64MB (!).

64MB is better than the full 90MB + registration.

Also, if you have the space (1) a 64MB installer.exe could be one option, along with (2) the standalone program (2MB) as well as (3) the other alternative 23MB (browser integration) runtime which doesn't require registration

Quote
It's used in a wide variety of scientific and engineering applications, but it's generally used more for institutional use than end-user use. (One notable exception is Lego Mindstorms NXT, albeit in a radically altered form.) I use it because it's the best tool I have available for the job.

(Full disclosure: that's largely because I work for NI.)

Thanks for the info.

As for me -- if it was a 64MB all in one job (runtime + program) I'd download it and give it the test run it surely deserves.

Do you think this program would be helpful in working out audio levels for a release? i.e. if I was attempting to get db levels right across tracks of varying compression (not in the lossless/lossy sense) -- currently I use wavgain and then my ears for fine tuning -- can you see your app having a role in this kind of process?

C.

• Raiden
• Developer
pfpf v0.1
Reply #14 – 03 February, 2008, 09:23:26 AM
Thank you very much for this great tool.
One minor issue with the UI: I could not adjust it to smaller resolutions like 1024x768.
Also, one feature request:
http://img211.imageshack.us/img211/8289/declipperjd8.png
http://img87.imageshack.us/img87/3199/declipper2cz2.png
It's the declipper from Izotope RX which features a so-called "histogram of waveform levels" where you can see the sample distribution over the bit range. However, it is very limited, as it only shows values from 0 to -8 dB and does not have a horizontal scale.
An improved version would help in estimating the amount of clipping.

I took some albums and calculated their values (formatted as csv):
Code:
`;ITU-R1770;Long term;Medium term;Short term
AFX - Hangable Auto Bulb;-7.788082;4.36652;3.337994;6.887517
Aphex Twin - Come to Daddy;-10.227442;5.942758;3.847455;7.825321
Aphex Twin - Drukqs;-9.917538;7.071786;4.388686;6.813219
Aphex Twin - I Care Because You Do;-7.774041;4.54145;3.247972;6.422187
Aphex Twin - Richard D James;-6.508191;6.523617;3.569301;6.64566
Aphex Twin - Windowlicker;-5.959195;4.973649;3.996707;7.377451
Autechre - Peel Session 2;-10.18724;5.11148;4.102982;6.688941
Boards of Canada - Music Has the Right to Children;-11.859337;3.819474;3.979695;8.030439
Daft Punk - Discovery;-9.502911;3.125146;3.895449;8.225183
Daft Punk - Human After All;-4.745425;2.467026;2.417565;6.070163
Depeche Mode - Playing The Angel (CD);-4.63066;5.671811;2.821454;5.167589
Depeche Mode - Playing The Angel (vinyl);-13.377421;5.556798;2.950576;5.357894
Kraftwerk - Aerodynamik;-9.119114;2.115024;2.772039;9.152209
Led Zeppelin - Led Zeppelin IV;-14.694527;3.665771;2.850409;4.360807
Miles Davis - Live around the World;-12.152856;6.635845;4.818088;7.318425
Palais Schaumburg - Palais Schaumburg;-14.740098;3.944125;4.198906;8.591792
Pink Floyd - Dark Side of the Moon;-11.909041;8.187197;3.667342;5.059493
Pink Floyd - Wish You Were Here;-13.876637;5.944166;3.749432;5.118362
Rage Against the Machine - Rage Against the Machine;-8.409433;3.806114;3.175354;6.820433
The Orb - Orbus Terrarum;-11.880174;6.653551;3.57129;5.094658
Underworld - 1992-2002 [JPN promo] (disc 1&2);-7.115458;2.978977;2.716756;6.985416
Underworld - A Hundred Days Off;-8.154587;2.849434;2.973633;6.86929
Underworld - Born Slippy Nuxx 2003;-5.42971;2.25276;2.278929;5.826855
Underworld - Dark & Long;-7.047832;2.039863;2.558234;7.347276
Underworld - Dark & Long [DNK];-12.963083;2.773522;2.618432;7.58829
Underworld - Dirty Epic / Cowgirl;-13.302865;4.780328;2.772149;6.770922
Underworld - Dirty Epic [DEU];-12.523822;4.366223;3.216058;6.722689
Underworld - Dubnobasswithmyheadman;-17.540799;3.483473;3.178938;7.227787
Underworld - Everything, Everything;-8.519916;3.4288;2.325238;5.567491
Underworld - I'm A Big Sister, And I'm A Girl, And I'm A Princess, And This Is My Horse;-14.42569;4.771379;3.08573;5.026963
Underworld - Live in Tokyo 25th November 2005 (disc 1&2&3);-11.736943;3.626;2.779501;5.980281
Underworld - Lovely Broken Thing;-9.08797;3.564376;3.698982;8.525503
Underworld - Mmm... Skyscraper I Love You;-16.179067;3.672679;3.428056;7.967833
Underworld - Oblivion with Bells;-9.540373;3.765357;3.382677;6.600661
Underworld - Pearl's Girl [USA];-9.761997;3.271767;2.887699;7.549634
Underworld - Pizza for Eggs;-10.008908;4.363326;3.692727;5.965214
Underworld - Second Toughest in the Infants [DEU];-15.170278;4.424356;3.175443;7.09311
Underworld - Spikee/Dogman Go Woof;-15.873933;2.642126;3.138517;7.768897
Venetian Snares - 2370894;-6.556472;6.74378;4.157356;6.369249
Venetian Snares - A Giant Alien Force More Violent & Sick Than Anything You Can Imagine;-2.344575;3.570357;3.421717;5.569379
Venetian Snares - Cavalcade of Glee and Dadaist Happy Hardcore Pom Poms;-2.515518;5.610292;3.641683;6.789516
Venetian Snares - Doll Doll Doll;-2.246614;8.641083;4.100379;5.388812
Venetian Snares - Find Candace;-6.297612;8.911154;3.858491;5.863097
Venetian Snares - Higgins Ultra Low Track Glue Funk Hits 1972-2006;-3.985018;6.850304;3.618574;5.453436
Venetian Snares - Huge Chrome Cylinder Box Unfolding;-5.650552;6.310211;4.741762;7.284746
Venetian Snares - My Downfall;-4.995728;14.48013;4.618486;4.298872
"Venetian Snares - printf(''shiver in eternal darkness-n'');";-4.088835;5.297741;3.762588;5.77607
Venetian Snares - Rossz Csillag Allat Született;-5.816221;10.262339;4.676324;5.850983
Venetian Snares - Songs about My Cats;-1.748199;6.035857;4.253491;6.424096
Venetian Snares - Winter in the Belly of a Snake;-9.070008;10.172085;5.213827;6.626628
Venetian Snares + Speedranch - Making Orange Things;3.147053;3.302486;2.562315;2.985406`

• soultrain
pfpf v0.1
Reply #15 – 03 February, 2008, 05:15:15 PM
Quote
Looks like a pretty interesting project. Was going to give it a go but put off due to 90MB download of LabVIEW 8.2.1 Run-Time Engine -- then thought - no big deal -- but then was put off by the grand registration process just to download the runtime environment.

C.

Yep, that was the killer for me too... Now downloading the smaller runtime; I think one complete package would be better.

• soultrain
pfpf v0.1
Reply #16 – 03 February, 2008, 05:25:27 PM
Downloaded the small library and pfpf, installed both and rebooted. I got all sorts of missing-resource errors: cannot load front panel, etc. So maybe look into an all-in-one package.

PS: Would it be useful on good-quality MP3 files?

• Axon
• Members (Donating)
pfpf v0.1
Reply #17 – 27 June, 2008, 12:01:38 PM
Yeah, I really should have replied to y'all sooner. Chromatix's work has convinced me to get off my butt. I just fixed all the links, so everybody can download pfpf again from the usual location.

Quote
Downloaded the small library and pfpf, installed both and rebooted. I got all sorts of missing-resource errors: cannot load front panel, etc. So maybe look into an all-in-one package.
That's bizarre. Are you running a non-English version of Windows? You might need to download a bigger (or different) runtime in that event. Did you unzip everything before you ran pfpf.exe?

Quote
PS: Would it be useful on good-quality MP3 files?
In theory, the lossiness of a sample should not impact the measurements, because lossy encoding (with very few exceptions) should not affect the loudness or dynamic range of music.

Quote
Thank you very much for this great tool.
And thank you for taking all the trouble to run all those numbers. They may come in handy for spotting problem samples, where too little or too much dynamic range is estimated.

Quote
One minor issue with the UI: I could not adjust it to smaller resolutions like 1024x768.
I'll see what I can do to reduce the resolution requirements, but I can't guarantee much. I may just punt and say that a 1680x1050 screen is required. I've already split the UI up into several different tabs and I think it's really important to keep all the histogram and loudness plots large and on the same page.

Quote
Also, one feature request:
http://img211.imageshack.us/img211/8289/declipperjd8.png
http://img87.imageshack.us/img87/3199/declipper2cz2.png
It's the declipper from Izotope RX which features a so-called "histogram of waveform levels" where you can see the sample distribution over the bit range. However, it is very limited, as it only shows values from 0 to -8 dB and does not have a horizontal scale.
An improved version would help in estimating the amount of clipping.
Clipping analysis really isn't what this is all about. There's a lot more meaning in trying to estimate how the ear is actually responding to dynamic range manipulations than simply pointing out the level characteristics of the signal.

That said... it wouldn't be hard to add.

Quote
Do you think this program would be helpful in working out audio levels for a release? i.e. if I was attempting to get db levels right across tracks of varying compression (not in the lossless/lossy sense) -- currently I use wavgain and then my ears for fine tuning -- can you see your app having a role in this kind of process?
It could play a role for that, yes, although I would imagine that for pop masterings wavgain would give you great results. I'd love to hear from you as to which of the two tools best matches your perceptions of the dynamic range. Certainly pfpf is more (over)engineered for that purpose, but whether it performs better is entirely untested.

• Chromatix
pfpf v0.1
Reply #18 – 27 June, 2008, 04:34:47 PM
I think I'm getting the same "missing resources" errors as others mentioned, so it looks like the full-size runtime is needed.  I'm running English Windows, but it's 2K not XP.

• Axon
• Members (Donating)
pfpf v0.1
Reply #19 – 27 June, 2008, 04:50:27 PM
Boo! Well, the first thing I would suggest (unfortunately) is downloading the full installer from the NI website. Warning, registration required, etc etc.

• Raiden
• Developer
pfpf v0.1
Reply #20 – 27 June, 2008, 05:28:28 PM

• Axon
• Members (Donating)
pfpf v0.1
Reply #21 – 27 June, 2008, 06:29:09 PM
Hell, you can just download it from the FTP site. I'm mostly just deferring to the Web interface for deciding which installer to use, since my first pick was wrong.

I broke down and uploaded a pfpf zip with an installer, including a runtime. I haven't tested it, I just hit "build". It's 64MB. Don't overdownload it.

• Chromatix
pfpf v0.1
Reply #22 – 28 June, 2008, 07:37:01 AM
Quote
Hell, you can just download it from the FTP site. I'm mostly just deferring to the Web interface for deciding which installer to use, since my first pick was wrong.

I broke down and uploaded a pfpf zip with an installer, including a runtime. I haven't tested it, I just hit "build". It's 64MB. Don't overdownload it.

Ah yes, that works much better.  (Note to others: if you installed the "miniature" runtime, uninstall it first, otherwise it won't get replaced.)

Your default timescales for long and medium are somewhat longer than mine, I think.  So my averaging-meter is coming in somewhere between your long and medium term measurements, and my peak-meter somewhere between your medium and short term measurements.  With that said, we're getting respectably similar-shaped graphs, I think.

Because I'm measuring things in a different way, I get a kind of DC-offset on my medium-term graph, which I also factor into my measurements.  This has the neat side-effect of eliminating the enormous negative spikes I see on your graphs, though I get (smaller?) positive spikes instead.  I don't think the human ear is as sensitive to sudden decreases in amplitude as it is to sudden increases, which is why I am comfortable with using a 300ms/99% decay rate on both meters.

Unfortunately, it's very difficult to read anything from your medium-term graph, for several reasons.  Probably the biggest difference to usability would be if the X-axes for all three graphs were linked, so that it was easier to zoom in on the detail.  It would also be neat to listen to the track while watching meter needles, as an engineer would - perhaps I can write a tool to do that in Linux.

I've been trying to find out what ITU-R1770 actually is, in detail, but all of the useful free links I can easily find seem to have gone dead.  Any pointers here?

• Axon
• Members (Donating)
pfpf v0.1
Reply #23 – 28 June, 2008, 05:05:24 PM
Quote
Your default timescales for long and medium are somewhat longer than mine, I think.  So my averaging-meter is coming in somewhere between your long and medium term measurements, and my peak-meter somewhere between your medium and short term measurements.  With that said, we're getting respectably similar-shaped graphs, I think.
Yeah, I think the main differences are going to reside in how transients are handled, so the overall graphs are going to be really similar.

Quote
Because I'm measuring things in a different way, I get a kind of DC-offset on my medium-term graph, which I also factor into my measurements.  This has the neat side-effect of eliminating the enormous negative spikes I see on your graphs, though I get (smaller?) positive spikes instead.  I don't think the human ear is as sensitive to sudden decreases in amplitude as it is to sudden increases, which is why I am comfortable with using a 300ms/99% decay rate on both meters.
That is a very good point - you could make a convincing case for this sort of asymmetry based solely on temporal masking. I suppose I could implement that in a windowed fashion by shifting the window forwards in time a bit, but exponential decay is certainly easier (and potentially more accurate).
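The asymmetry under discussion can be sketched as a meter with instant attack and exponential release. This is neither Chromatix's code nor pfpf's; reading "300ms/99%" as "the reading loses 99% of its value over 300ms" is an assumption, as are the function names.

```python
def decay_coefficient(rate_hz, decay_s=0.3, remaining=0.01):
    """Per-sample multiplier that leaves `remaining` of a reading after decay_s."""
    return remaining ** (1.0 / (decay_s * rate_hz))


def peak_meter(levels, rate_hz, decay_s=0.3, remaining=0.01):
    """Instant attack, exponential release: rises at once, falls slowly."""
    coeff = decay_coefficient(rate_hz, decay_s, remaining)
    reading = 0.0
    out = []
    for level in levels:
        reading = max(level, reading * coeff)
        out.append(reading)
    return out


# One full-scale value followed by 300 ms of silence at 1000 values per second.
trace = peak_meter([1.0] + [0.0] * 300, rate_hz=1000)
print(trace[0], trace[300])  # starts at 1.0 and has fallen to about 0.01
```

A meter like this follows sudden increases immediately but smooths over sudden drops, so brief dips in level do not show up as large negative spikes.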

Quote
Unfortunately, it's very difficult to read anything from your medium-term graph, for several reasons.  Probably the biggest difference to usability would be if the X-axes for all three graphs were linked, so that it was easier to zoom in on the detail.  It would also be neat to listen to the track while watching meter needles, as an engineer would - perhaps I can write a tool to do that in Linux.
Hmm, I thought I added code to link the X-axes together - I'll need to revisit that. Doing live playback is reasonable enough.