HydrogenAudio

Hydrogenaudio Forum => Scientific Discussion => Topic started by: Woodinville on 2011-05-29 00:24:12

Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 00:24:12
I've been doing histograms of audio stuff, as well as calculating spectral flatness measure, so I can reproduce some of the information I had back when I was starting on the PAC/NBC/AAC work.

I've found an album that has a fair number of missing codes.

I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.

I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

I'll put up some photos of the various histograms when I get a bit more done.

Wow, batman, doesn't anyone look at this stuff before pressing!?
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 03:32:03
http://s238.photobucket.com/albums/ff228/j...l%20Histograms/ (http://s238.photobucket.com/albums/ff228/jj_0001/Level%20Histograms/)

Some histograms.  My ghu, is octave SLOW.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 04:33:31
ETA:

octave wavread returns numbers 1 level out of 32768 below -1, and exactly +1 on the positive side.

Does anybody have a WORKING wavread fer bleep's sake?  Anyone? Help?  wavread should never go below -1, or above 1 - 2^-(nbits-1)

Also, try this in the latest octave 

round(-.0001)
ans=-0

Yep, minus zero.

I guess I'll spring for my own personal copy of ratlab and signal processing package, that's intolerable.

So, all of these histograms are "almost right": it's pretty clear that they are right, except that they might be missing more missing codes, since one can't count on the actual wavread scaling.

Bleepity blip bleep blip.

Yours in annoyance,

jj
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 07:12:46
Ok, if you're using wavread from octave, be sure to change the constants for 16 and 24 bit files from 2^(nbits-1)-1 to 2^(nbits-1)

Otherwise you waste a lot of time getting incorrect results.

Plots above will be changing as they get done. Expletive.
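In Python terms, the corrected scaling amounts to this (a sketch of the fix, not the Octave patch itself; the helper name is hypothetical):

```python
def normalize_pcm(samples, nbits=16):
    # Divide by 2^(nbits-1) -- not 2^(nbits-1)-1 -- so that integer 0 maps
    # exactly to 0.0 and the most negative code maps exactly to -1.0.
    scale = 2 ** (nbits - 1)
    return [s / scale for s in samples]

print(normalize_pcm([-32768, 0, 32767]))
# [-1.0, 0.0, 0.999969482421875]
```

With the wrong divisor (32767), integer 0 still maps to 0.0, but -32768 lands slightly below -1.0, which is exactly the out-of-range behavior described above.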
Title: Interesting Histograms.
Post by: C.R.Helmrich on 2011-05-29 09:30:38
Thanks for debugging Octave's wavread, JJ!

Hmm, looking at the Robert Lee (piano) histogram, it looks like someone, as the last step before pressing, applied a positive-dB gain on the 16-bit data? 

Chris
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 09:47:03
Thanks for debugging Octave's wavread, JJ!

Hmm, looking at the Robert Lee (piano) histogram, it looks like someone, as the last step before pressing, applied a positive-dB gain on the 16-bit data? 

Chris


It's a nearly-homemade recording of a really good stride pianist from the middle of Indiana.

If you want to hear youtube-quality, listen to "Allen Dale" on Youtube. But try to ignore it when he talks politics. Listen to 12th St Rag, and just watch the hands fly.

Oh, and you know, I'm not under any IP restrictions right now, so I could post the histogram script for Octave if anyone gave two whits. It's not like it's even remotely close to rocket science; I think anyone here could write it in a half-hour, as long as you also fix the wavread constant. &*(&( that was irritating. Oh, and the Spectral Flatness Measure is for a 2048-point Hann window.  Green Day and Avril Lavigne now up, Chumbawamba coming up soon.
Title: Interesting Histograms.
Post by: xnor on 2011-05-29 11:47:34
Interesting plots, thanks.

I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.


So it's basically a 15-bit album? 

I'm interested in the script.
Title: Interesting Histograms.
Post by: romor on 2011-05-29 16:25:14
Ok, if you're using wavread from octave, be sure to change the constants for 16 and 24 bit files from 2^(nbits-1)-1 to 2^(nbits-1)

Otherwise you waste a lot of time getting incorrect results.

Plots above will be changing as they get done. Expletive.

Excuse my ignorance if I misunderstand you, but why should Octave (or anything else) scale normalized data by 2^(n-1)-1 instead of 2^(n-1)?
Title: Interesting Histograms.
Post by: Paulhoff on 2011-05-29 17:44:06
I've been doing histograms of audio stuff, as well as calculating spectral flatness measure, so I can reproduce some of the information I had back when I was starting on the PAC/NBC/AAC work.

I've found an album that has a fair number of missing codes.

I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.

I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

I'll put up some photos of the various histograms when I get a bit more done.

Wow, batman, doesn't anyone look at this stuff before pressing!?


I don't have anything, and I'm not at all sure what you want to begin with.


My Bad???

Paul

   
Title: Interesting Histograms.
Post by: xnor on 2011-05-29 18:17:36
Excuse my ignorance if I misunderstand you, but why should Octave (or anything else) scale normalized data by 2^(n-1)-1 instead of 2^(n-1)?


because 2^15-1 = 32767 decimal = 7FFF hex = highest possible value for a signed, 16-bit integer

and -32768 decimal = 8000 hex

see two's complement (http://en.wikipedia.org/wiki/Two%27s_complement)
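A quick sanity check of those limits (a plain-Python sketch, not from the thread):

```python
# Signed 16-bit two's-complement range, as described above.
nbits = 16
lo = -2 ** (nbits - 1)       # -32768, bit pattern 8000 hex
hi = 2 ** (nbits - 1) - 1    # +32767, bit pattern 7FFF hex

print(lo, hi)            # -32768 32767
print(hex(hi))           # 0x7fff
print(hex(lo & 0xFFFF))  # 0x8000 (16-bit two's-complement pattern of -32768)
```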
Title: Interesting Histograms.
Post by: romor on 2011-05-29 18:39:09
If the problem is with the histogram function, and that's why the need for a +1 bin (number of samples + 1), maybe that makes some sense to me.

The data returned by Octave's wavread is the same as the text output of Cool Edit Pro, or the output from the libsndfile library on my PC.
Title: Interesting Histograms.
Post by: Axon on 2011-05-29 19:36:25
OT:

OH MY FUCKING GOD JJ you read my mind about octave.

I tried my *best* to like it a month ago, when I attempted to use it for "real" audio work. I really did. Right before I discovered


At which point I concluded that it is plainly obvious that nobody actually does useful audio processing work in Octave, and it would likely be a waste of my time to attempt to do so rather than look at other free numerical processing solutions. (I'm thinking mostly about F# nowadays.)

Then I posted "GNU Octave fucking blows" on Facebook. Now see jj, if I friended you, you would have seen that comment, and you wouldn't have had to go through all that trouble, eh
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 20:15:07
Ok, if you're using wavread from octave, be sure to change the constants for 16 and 24 bit files from 2^(nbits-1)-1 to 2^(nbits-1)

Otherwise you waste a lot of time getting incorrect results.

Plots above will be changing as they get done. Expletive.

Excuse my ignorance if I misunderstand you, but why should Octave (or anything else) scale normalized data by 2^(n-1)-1 instead of 2^(n-1)?


I don't know, but that's what it did. !?

You can go into the library and change it to the proper 2^(n-1) without much trouble.
Title: Interesting Histograms.
Post by: Axon on 2011-05-29 20:18:29
I don't know, but that's what it did. !?

You can go into the library and change it to the proper 2^(n-1) without much trouble.


I vaguely remember doing this too.

Yet another reason to give Octave the Viking funeral it richly deserves, by dousing it with lighter fluid, and setting it on fire.
Title: Interesting Histograms.
Post by: Axon on 2011-05-29 20:52:08
I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.

I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

In some meaningful sense, couldn't this be considered a "13-bit" recording, rather than a "15-bit" recording?

And I suppose, more generally: when we are evaluating coding efficiency, are we "literally" evaluating Kolmogorov complexity, and using that as an estimator for sound quality?
Title: Interesting Histograms.
Post by: xnor on 2011-05-29 20:56:18
Forgive my previous reply, I didn't read properly.

in the Octave 3.4.0 sources, wavread.m, line 201 onward:

Code: [Select]
    ## Normalize samples.
    switch (bits_per_sample)
      case 8
        yi = (yi - 128)/127;
      case 16
        yi /= 32767;
      case 24
        yi /= 8388607;
      case 32
        yi /= 2147483647;


This seems to be the problem you're talking about; it should be corrected to 32768 for 16-bit, for example.

Or patch it not to normalize at all.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 21:00:27
Forgive my previous reply, I didn't read properly.

in the Octave 3.4.0 sources, wavread.m, line 201 onward:

Code: [Select]
    ## Normalize samples.
    switch (bits_per_sample)
      case 8
        yi = (yi - 128)/127;
      case 16
        yi /= 32767;
      case 24
        yi /= 8388607;
      case 32
        yi /= 2147483647;


This seems to be the problem you're talking about; it should be corrected to 32768 for 16-bit, for example.

Or patch it not to normalize at all.


I patched it to 32768 and 2147483648

Then I could get the results I expected on a file that was half -max and half +max without using wrong multipliers

I think 8-bit is mulaw, and if I'm right, that's just totally wrong, but I'm not entirely sure. Been a while since I used 8-bit.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 21:02:01
Interesting plots, thanks.

I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.


So it's basically a 15-bit album? 

I'm interested in the script.


Actually, as Axon points out, it's 13 bits, because there are so many missing levels, as well as it running under half.
Title: Interesting Histograms.
Post by: xnor on 2011-05-29 21:10:17
I think 8-bit is mulaw, and if I'm right, that's just totally wrong, but I'm not entirely sure. Been a while since I used 8-bit.


Nah, it's just that 8-bit PCM is unsigned, so the range of values is 0 to 255 with a centerpoint of 128.
So if you subtract 128 you get the same result you'd get with signed 8-bit integers (-128 to 127). Of course the division by 127 should be corrected to 128, like in the other cases.

edit: oh you already filed a bug, great!
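Sketched in Python, with the divisor corrected to 128 as suggested (hypothetical helper name):

```python
def normalize_u8(sample):
    # 8-bit WAV PCM is unsigned (0..255) with the midpoint at 128:
    # recenter by subtracting 128, then divide by 128 (not 127).
    return (sample - 128) / 128.0

print(normalize_u8(128))  # 0.0  (midpoint -> digital zero)
print(normalize_u8(0))    # -1.0
print(normalize_u8(255))  # 0.9921875
```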
Title: Interesting Histograms.
Post by: Axon on 2011-05-29 21:23:10
If anybody's interested, I still have my mostly-complete-and-working rewrite of wavread to implement buffered WAV file reading (along with other mucho bug fixes of course).

EDIT: Ah, I saw jj's bug too. I guess I should stop being so antisocial and submit a few of my own too.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 21:28:55
If anybody's interested, I still have my mostly-complete-and-working rewrite of wavread to implement buffered WAV file reading (along with other mucho bug fixes of course).

Well, I'd be interested if it was working.



I suppose I should just attach the histogram/sfm calculation here.

Nothing fancy, I assure you.

--- this is freeware, nothing is guaranteed at all---
Code: [Select]
clear all
close all
clc

fname='02.wav';
x=wavread(fname);
x=x';

len=length(x)

his(1:65536)=0;

low=32768;
high=32768;
windd=hann(2048)';

sfmmean=0;
nmeas=0;

for ii=1:len
  ## per-sample histogram, with out-of-range samples clamped
  for jj=1:2
    t=x(jj,ii);
    if (t < -1)
      t=-1
      fflush(stdout);
    end
    if (t > 65535/65536)
      t=65535/65536
      fflush(stdout);
    end

    ## map [-1, 1) to histogram bin 1..65536
    t=round(t*32768+32769);
    his(t)=his(t)+1;
    if (t < low)
      low=t;
    end
    if (t > high)
      high=t;
    end
  end

  ## spectral flatness measure over 2048-sample Hann-windowed blocks
  if (mod(ii,1024)==0)
    if (ii>1024)
      for kk=1:2
        w=x(1,ii-2047:ii);
        w=w .* windd;
        wt=fft(w);
        wt=wt .* conj(wt);
        wt=max(wt,.0000000001);
        am=sum(wt)/2048;            ## arithmetic mean
        gm=exp(sum(log(wt))/2048);  ## geometric mean
        sfm=gm/am;
        if (am > .5)
          nmeas=nmeas+1;
          sfm=10*log10(sfm);
          sfmmean=sfmmean+sfm;
        end
      end
    end
    if (mod(ii,1024*16)==0)
      ii/len
      fflush(stdout);
    end
  end
end

tot=sum(his);
his=his/tot;
his=max(his, .000000000001);
xax=-32768:32767;

semilogy(xax,his);
axis([-40000 40000 1e-10 1e-1]);
nmeas=nmeas
if (nmeas > 0)
  sfmmean=sfmmean/nmeas
end
low=low-32769
high=high-32769
fname
---end of freeware---

Change fname as you will

It's slow. The time is spent not in the FFT part but in the very simple histogram part. No idea why. It's slower than a snail glued to a rock.
Title: Interesting Histograms.
Post by: Axon on 2011-05-29 21:52:58
Mods, could one of you please add *.m as an allowed upload extension?
Title: Interesting Histograms.
Post by: ExUser on 2011-05-29 22:34:04
I'll bug the admins. I can't do this. In the interim, zip the .m and post that?
Title: Interesting Histograms.
Post by: googlebot on 2011-05-29 22:34:10
Then I posted "GNU Octave fucking blows" on Facebook. Now see jj, if I friended you, you would have seen that comment, and you wouldn't have had to go through all that trouble, eh


I somewhat associate 'Axon' with wooing Woodinville for quite some time now, but I think that's a new level.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-29 23:41:53
I'll bug the admins. I can't do this. In the interim, zip the .m and post that?


Well, screen-copying the text above into octave works like a champ

It's not like there's any special character stuff in it.
Title: Interesting Histograms.
Post by: Axon on 2011-05-30 02:08:45
Then I posted "GNU Octave fucking blows" on Facebook. Now see jj, if I friended you, you would have seen that comment, and you wouldn't have had to go through all that trouble, eh


I somewhat associate 'Axon' with wooing Woodinville for quite some time now, but I think that's a new level.

You have no idea.


I'll bug the admins. I can't do this. In the interim, zip the .m and post that?


Well, screen-copying the text above into octave works like a champ

It's not like there's any special character stuff in it.


Right, but, my hacked audio package is 380 lines of code, my test for said hack was another 50... it starts to add up.

In any case. I have uploaded everything I've got HERE (http://www.hydrogenaudio.org/forums/index.php?showtopic=88852). Have at it.
Title: Interesting Histograms.
Post by: ExUser on 2011-05-30 06:14:29
Right, but, my hacked audio package is 380 lines of code, my test for said hack was another 50... it starts to add up.
[ codebox ] will work.
Title: Interesting Histograms.
Post by: bandpass on 2011-05-30 06:53:14
You can go into the library and change it to the proper 2^(n-1) without much trouble.

2^(n-1) is still problematic (though perhaps not for histograms). E.g. a wav containing a minimum value cannot be inverted.
Title: Interesting Histograms.
Post by: ExUser on 2011-05-30 07:15:44
More generally,

Code: [Select]
float scale(int value){
  float offset=(MAX_INT+MIN_INT)/2;
  float range=(MAX_INT-MIN_INT)/2;
  return (value+offset)/range;
}


MAX_INT and MIN_INT are the maximum and minimum values for the int type, which is a fixed-point number.
Title: Interesting Histograms.
Post by: bandpass on 2011-05-30 07:31:47
But not for wav (or au, or aiff) files, where 0 is defined to be the midpoint ("offset" in the code).

Edit: not sure if the code has representational issues: mathematically, (MAX_INT+MIN_INT)/2 = -0.5
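The arithmetic wrinkle can be checked directly (a Python sketch; int16 limits stand in for MAX_INT/MIN_INT):

```python
INT_MAX, INT_MIN = 32767, -32768   # int16 limits, as in the code above

print((INT_MAX + INT_MIN) / 2)     # -0.5 in real arithmetic
print((INT_MAX + INT_MIN) // 2)    # -1 with floor division; C's truncating
                                   # integer division would give 0 instead
```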
Title: Interesting Histograms.
Post by: ExUser on 2011-05-30 08:41:22
That code should map int values to 0..1 inclusive, ideally, ignoring external definition of midpoint. Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values. Seems to be a really weird engineering decision. Do you have a citation for defining 0 to be the midpoint?
Title: Interesting Histograms.
Post by: romor on 2011-05-30 08:59:26
To me it seems that Woodinville found that formula as a workaround, and as xnor showed later, Octave's wavread does have an issue, only of a different kind.

The libsndfile data I was referring to used e-notation, and at first everything seemed OK against Octave's 16-digit floats (never mind Cool Edit's 4-6 digits), but it wasn't: values were way off, sometimes by two decimal places. After changing the constant as xnor noted, formatting everything to 16-digit floats, and double-checking, it seems fine again.

An apologetic IPython version of those histograms:
http://dl.dropbox.com/u/30782742/ipython.html

I used "whos" and slicing in both Octave and IPython to check that the variables match and everything is OK.

I'm not sure what the purpose of the "kk" for-loop is: kk is never used, and if it's intended to process both channels' data, the "jj" for-loop already seems to take care of that. Or perhaps I just use Octave too rarely.
Title: Interesting Histograms.
Post by: bandpass on 2011-05-30 09:10:34
That code should map int values to 0..1 inclusive, ideally, ignoring external definition of midpoint. Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values. Seems to be a really weird engineering decision. Do you have a citation for defining 0 to be the midpoint?

http://msdn.microsoft.com/en-us/library/ms...audiodataformat (http://msdn.microsoft.com/en-us/library/ms707957.aspx#pcmwaveformaudiodataformat)

Weird, certainly: lots of special-case code needed and/or DC-offsets creeping in.
Title: Interesting Histograms.
Post by: googlebot on 2011-05-30 09:44:45
Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values.


It doesn't have to imply that. The negative range can also be interpreted to just have more headroom. When you convert from a balanced to an unbalanced encoding, the max negative symbol stays unused. When you convert from an unbalanced to a balanced encoding, there indeed needs to be special case handling. Best general guidance would be not using the max negative symbol at all and asking for user feedback when you encounter it on the input pipeline.

It is a good thing that 0 is defined as the midpoint. Else detecting silence would be a PITA. When the first PCM formats were developed, these differences were probably too far below the analog noise floor of the best converters to really cause any concern.
Title: Interesting Histograms.
Post by: bandpass on 2011-05-30 10:20:18
Best general guidance would be not using the max negative symbol at all and asking for user feedback when you encounter it on the input pipeline.

Should an ADC request user intervention whenever its input is < 1/2^n of its range?

Quote
It is a good thing, that 0 is defined as the midpoint. Else detecting silence would be a PITA.

I'm not sure that's useful though; in practice, silence is defined in (finite) dB.
Title: Interesting Histograms.
Post by: xnor on 2011-05-30 11:04:30
I don't see the problem. PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).

Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.
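A minimal sketch of that clipped inversion (Python; the helper name is hypothetical):

```python
def invert_clipped(x):
    # Polarity inversion with saturation: -(-32768) does not fit in int16,
    # so clip to the highest representable value, +32767, instead of wrapping.
    return max(-32768, min(32767, -x))

print(invert_clipped(-32768))  # 32767
print(invert_clipped(12345))   # -12345
```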
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-30 11:10:43
You can go into the library and change it to the proper 2^(n-1) without much trouble.

2^(n-1) is still problematic (though perhaps not for histograms). E.g. a wav containing a minimum value cannot be inverted.


That is correct, and proper.

That is the nature of 2's complement: there is only one '0' entry, and thus one extra negative entry.

And, yes, if the quantizer is done as a standard PCM quantizer, there must be a zero reconstruction level.

That is, after all, the definition. Yes. Really.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-30 11:17:51
I'm not sure what the purpose of the "kk" for-loop is: kk is never used, and if it's intended to process both channels' data, the "jj" for-loop already seems to take care of that. Or perhaps I just use Octave too rarely.


It's for doing the spectral flatness measure, but I may have indeed messed up, let me look.

Yep, messed up. The '1' in the array reference should be 'kk'.

Sigh.  I'll see if the sfm is much different. It's unlikely.

Oh, and yes, octave is astonishingly slow, even when you only do the histogram stuff. In fact, the SFM stuff adds surprisingly little to the run time, which is bizarre.

Even without bounds checking it's slow, slow, slow. No idea why.

Matlab is quite a bit faster, but I don't have it here.
Title: Interesting Histograms.
Post by: googlebot on 2011-05-30 13:00:23
Should an ADC request user intervention whenever its input is < 1/2^n of its range?


No, just fall back to the usual behavior defined for all out-of-range values.

I'm not sure that's useful though; in practice, silence is defined in (finite) dB.


It's not a bug, it's a feature! From an analog perspective, a good place for the midpoint would have been between the two smallest symbols and both ranges would have been in perfect symmetry. Silence would then be encoded as some form of noise alternating between the smallest symbols, which is fine, since PCM encoding doesn't make any promise better than that.

Giving '0' the privileged meaning of 'digital silence', at the cost of one usual symbol in the positive range, enables scenarios where you signal something like: "don't try to replicate my primitive approximation of silence but replace it with the best silence you have available". The cost (1 symbol) isn't significant in contrast to the gained possibility.

In practice you dither, so a privileged '0' symbol is unnecessary, since silence is encoded as noise anyway. Some dithering tools have something like an "auto-black" feature, though.

I don't see the problem. PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).


[-1.0, 1.0] is a perfectly fine range for PCM encoding in float representation. Why should artificial constraints from a legacy storage format be carried over to a better format, which doesn't benefit from that constraint in any way? Conversion to and from [-1, 1] isn't black magic, after all.
Title: Interesting Histograms.
Post by: xnor on 2011-05-30 18:15:03
[-1.0, 1.0] is a perfectly fine range for PCM encoding in float representation.

I (we?) were talking about the PCM format (format tag 1) in RIFF WAVE files, which is integer only, and the normalization issue.
With normalized floats (format tag 3) the range is of course, like you posted, -1.0 <= y <= +1.0 and normalization is not needed.

I don't think those non-floating point formats are legacy at all. I know a couple of recording engineers that do not use floats as storage format.
And I think it's common practice to keep the level at least a fraction of a dB below full scale. Even if you're only 0.01 dB below full scale you're down to something like 32730 with 16-bit integers.
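That 32730 figure checks out (a quick Python sketch, not from the thread):

```python
# Peak sample value for a signal 0.01 dB below 16-bit full scale.
full_scale = 32768        # 16-bit full scale
headroom_db = -0.01       # a hundredth of a dB of headroom
peak = full_scale * 10 ** (headroom_db / 20)
print(round(peak))        # 32730
```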
Title: Interesting Histograms.
Post by: ExUser on 2011-05-30 18:40:51
Once again I put in my amateur two-bits worth and receive sound instruction in return. I <3 you guys. </off-topic>
Title: Interesting Histograms.
Post by: Axon on 2011-05-30 20:37:34
Yeah, I also simplified jj's Octave code (into something like 10 lines IIRC), and always attempted to use native Octave functions wherever I could, and it still ran like a dog.

mmmm... Python for numeric work. Crunchy.
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-30 20:49:09
Yeah, I also simplified jj's Octave code (into something like 10 lines IIRC), and always attempted to use native Octave functions wherever I could, and it still ran like a dog.

mmmm... Python for numeric work. Crunchy.


Nah, dogs run fast. It's not that fast
Title: Interesting Histograms.
Post by: bandpass on 2011-05-31 07:49:55
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

This approach introduces a new mathematics, where inversion is non-linear, which is madness, or at least highly undesirable.

Either the analogue signal is biased with ½ LSB, in which case the digital signal range is -32767 to +32767 (-32768 can never occur and would clip if sent to a corresponding DAC), or the digital signal is biased with ½ LSB, in which case the inverse of -32768 is 32767.

Note that even though microsoft claims that the midpoint is zero, a WAV file cannot know how your ADC is biased-up.

That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?
Title: Interesting Histograms.
Post by: Nick.C on 2011-05-31 08:03:31
.... but can you hear a 0.5 lsb offset?
Title: Interesting Histograms.
Post by: Woodinville on 2011-05-31 08:24:21
Can you provide a source for the definition?


It goes all the way back to about 1960. I don't recall the source presently, but in fact zero is zero. It all boils down to that.

There have been a variety of scalings for fix-to-float conversion, but the most common is that -1 is the largest negative.  Given the reality of integer 2's complement math, that's really how it all works out.
For sign-magnitude integers, you wind up with one zero that has two codes for it in the integer.

It is possible to do midriser quantizers instead of midtread quantizers, but then the following bites you:

When you start to do integer math and floating-point math and expect something to work out the same way, you have to have zero be, in fact, zero, and nothing else. Otherwise you have very different domains for your signals.
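The midtread/midriser distinction jj mentions can be sketched like this (hypothetical helper names, unit step size):

```python
import math

def midtread(x, step=1.0):
    # Midtread quantizer: reconstruction levels at integer multiples of the
    # step, so an exact zero reconstruction level exists.
    return step * round(x / step)

def midriser(x, step=1.0):
    # Midriser quantizer: levels at odd multiples of step/2, so there is
    # no exact zero level.
    return step * (math.floor(x / step) + 0.5)

print(midtread(0.0), midriser(0.0))  # 0.0 0.5
```

The midtread version is the one consistent with "zero is zero": an all-zero input reconstructs exactly to zero, while the midriser version turns silence into a half-step offset.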
Title: Interesting Histograms.
Post by: Axon on 2011-05-31 08:26:41
0.5 LSB is an issue for spectrum analysis, and in principle, might also exacerbate potential stability issues in lowpass filters. But more importantly, IT'S JUST WRONG.

PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

Quote
That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?

The wikipedia entry for two's complement arithmetic (https://secure.wikimedia.org/wikipedia/en/wiki/Two%27s_complement)?
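The invert-all-bits-and-add-1 rule can be checked in a fixed 16-bit word (a Python sketch; the wraparound at -32768 is the point under discussion):

```python
def negate_wrap16(x):
    # Two's-complement negation in a fixed 16-bit word: invert all the bits,
    # add 1, keep 16 bits, then reinterpret as signed. The one asymmetric
    # code, -32768, maps back to itself.
    u = (~x + 1) & 0xFFFF
    return u - 0x10000 if u >= 0x8000 else u

print(negate_wrap16(1))       # -1
print(negate_wrap16(-32768))  # -32768
```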
Title: Interesting Histograms.
Post by: Axon on 2011-05-31 08:30:50
And just to flesh this discussion out some, yes, having a negative value which cannot be inverted to a positive value *is* the cleanest and most efficient solution. Unless anybody here would instead prefer negative zero. Hands?
Title: Interesting Histograms.
Post by: C.R.Helmrich on 2011-05-31 09:19:36
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

Meaning all -1s are inverted to 2, and there will be no 1s? Btw, the way xnor described it is how e.g. Audition inverts.

Chris
Title: Interesting Histograms.
Post by: bandpass on 2011-05-31 10:13:14
0.5 LSB is an issue for spectrum analysis, and in principle, might also exacerbate potential stability issues in lowpass filters. But more importantly, IT'S JUST WRONG.

Indeed, hence the discussion—it's a small but annoying issue if it's not handled consistently.

Quote
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

But that's also highly undesirable for DSP.

At the ADC, if there is no bias, inverting an analogue signal that converts to -32768 would produce an analogue signal that converts to 32767; digital inversion should give the same result (at the DAC output that is).

Quote
Quote
That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?

The wikipedia entry for two's complement arithmetic (https://secure.wikimedia.org/wikipedia/en/wiki/Two%27s_complement)?

It doesn't mention ADC/DAC biasing.  A better place to look might be IEC 60908 or somesuch.

If there is no ADC bias (and 16-bit ADC values are stored unmodified or with just the top bit flipped), then a valid DSP solution is:

Code: [Select]
float dsp_sample = (adc_sample + 0.5) / 32767.5;

If there is ½ LSB ADC bias then a valid DSP solution is:

Code: [Select]
float dsp_sample = adc_sample / 32767.0;

and -32768 is an unused value.  The code:

Code: [Select]
float dsp_sample = adc_sample / 32768.0;

doesn't seem to map to any real world ADC scenario.

In practice, as has been mentioned, recordings are made with headroom and probably have any DC-offset (w.r.t. digital 0) removed with post-processing; this however has the same result as biasing the ADC, which again means that -32768 should be an unused value.
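For comparison, the two bias conventions above evaluated at the extremes (a sketch; function names are mine, not from the post):

```python
def unbiased(adc):
    # No ADC bias: shift by half a code, divide by 32767.5.
    return (adc + 0.5) / 32767.5

def half_lsb(adc):
    # Half-LSB ADC bias: divide by 32767; -32768 is an unused code.
    return adc / 32767.0

print(unbiased(-32768), unbiased(32767))   # -1.0 1.0
print(half_lsb(-32767), half_lsb(32767))   # -1.0 1.0
```

Both map their usable extremes symmetrically onto [-1, 1], which is what distinguishes them from the plain divide-by-32768 mapping.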
Title: Interesting Histograms.
Post by: romor on 2011-05-31 17:33:37
Oh, and yes, octave is astonishingly slow, even when you only do the histogram stuff. In fact, the SFM stuff adds surprisingly little to the run time, which is bizarre.

Even without bounds checking it's slow, slow, slow. No idea why.

Matlab is quite a bit faster, but I don't have it here.

Seems that array creation is the main bottleneck here. I tried to speed up the histogram, and this code:

Code: [Select]
import numpy
from numpy import zeros, maximum, arange
from matplotlib.pyplot import semilogy

def histo(fn):
    import scikits.audiolab as au
    (sp, sf, b) = au.wavread(fn)

    sp = sp.transpose()
    his = zeros((65536,), dtype=numpy.int)

    for i in range(len(sp[0])):
        his[int(sp[0, i] * 32768) + 32768] += 1
        his[int(sp[1, i] * 32768) + 32768] += 1

    his = maximum(his / float(sum(his)), 10e-12)
    x = arange(-32768, 32768)
    semilogy(x, his)


is ~15 times faster than Octave, or ~3 times faster than previously posted.

This is bare-bones NumPy - simple numpy array processing - so other than Python and NumPy nothing more is required (except SciPy or audiolab for reading WAV data; edit: and matplotlib for graphs).
An MKL build isn't noticeable here, as there are no linalg functions. However, this can of course be optimized further by various approaches, e.g.: http://www.scipy.org/PerformancePython (http://www.scipy.org/PerformancePython) The example there is totally different (a scheme on a grid, testing its convergence), but it shows how NumPy/SciPy can be dissected further. I just thought to link to it, as I'm not aware how many users here are familiar with this Python approach to numerical problems.
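One such optimization, for instance: the per-sample loop can be replaced wholesale with np.bincount (a NumPy sketch of mine, not from the thread; the function name is hypothetical):

```python
import numpy as np

def histo_fast(x):
    # Vectorized histogram over all 65536 possible 16-bit codes:
    # map each sample to a bin index 0..65535, then count every bin
    # in one call instead of one Python iteration per sample.
    idx = np.round(np.asarray(x) * 32768).astype(np.int64) + 32768
    return np.bincount(idx.ravel(), minlength=65536)

h = histo_fast([-1.0, 0.0, 0.0, 32767 / 32768])
print(h[0], h[32768], h[65535])  # 1 2 1
```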

I'm wondering also about Matlab performance. My main PC PSU went down and I can't test Matlab 2009b these days. Can someone compare performance between Octave and Matlab, just for histogram code:

Code: [Select]
fname='some.wav';
x=wavread(fname);
x=x';
len=length(x)
his(1:65536)=0;
for ii=1:len
  for jj=1:2
    t=x(jj,ii);
    t=round(t*32768+32769);
    his(t)=his(t)+1;
  end
end
Title: Interesting Histograms.
Post by: xnor on 2011-05-31 19:30:50
Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

I was just describing that inverting negative full-scale will result in a 16-bit PCM value of 32767.
You're talking about the implementation. If you blindly add 1 you run into the same overflow problem you just described.
(Nitpicking..  )


In practice, as has been mentioned, recordings are made with headroom and probably have any DC-offset (w.r.t. digital 0) removed with post-processing; this however has the same result as biasing the ADC, which again means that -32768 should be an unused value.


Well it's not unused in practice, just take a look at some recordings, DAWs etc.

And here's an excerpt from the CS5550 ADC datasheet:

Quote
These signed registers contain the last value of the measured results of AIN1 and AIN2. The results will be within the range of -1.0 ≤ AIN1,AIN2 < 1.0. The value is represented in two's complement notation, with the binary point placed to the right of the MSB (MSB has a negative weighting). These values are 22 bits in length. The two least significant bits (located at the far right side) have no meaning, and will always have a value of “0”.


ADS1232:
Quote
A positive full-scale input produces an output code of 7FFFFFh and the negative full-scale input produces an output code of 800000h.
[...]
Ideal Output Code vs Input Signal: 0 Vin = 000000h out
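The overflow being discussed earlier in the thread is easy to demonstrate. A small illustrative Python sketch of 16-bit two's-complement negation (invert all the bits, add one, wrap to 16 bits):

```python
def negate_int16(x):
    """Two's-complement negation in a 16-bit register: ~x + 1, wrapped."""
    r = ((~x) + 1) & 0xFFFF          # invert all the bits, add 1, keep 16 bits
    return r - 0x10000 if r >= 0x8000 else r

# Every value except -32768 negates cleanly; -32768 wraps back onto itself,
# which is why a naive "multiply by -1" overflows at negative full scale.
```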
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-01 04:23:05
Code: [Select]
float dsp_sample = adc_sample / 32768.0;

doesn't seem to map to any real world ADC scenario.

In practice, as has been mentioned, recordings are made with headroom, and any DC offset (w.r.t. digital 0) is probably removed in post-processing; this has the same result as biasing the ADC, which again means that -32768 should be an unused value.


I must differ: zero must be zero. Otherwise how do you justify converting filter coefficients between fixed and floating point, etc.? Consider the differences in rounding and how they work out in terms of zeros when you can NOT represent ZERO in the fixed-point case.

It's ugly, ugly, ugly. You must have zero, and don't forget, two's complement has a single zero. Ergo, you need to map that zero to zero.

Introducing ADCs is simply a complete distraction here. Zero is zero. Not "almost zero".
Title: Interesting Histograms.
Post by: romor on 2011-06-01 05:45:06
I hope I'm not bothering you all with my Python snippets, but this morning I tried to replace the slow loop with FORTRAN code (the first time I've tried this), and the result was code ~3000 times faster than Octave.

Back to back: (http://i.imgur.com/YOGP8.png) (http://i.imgur.com/9dupd.png)

Here is Fortran code, which was used as fhist module within Python: http://pastebin.com/suMgN5QB (http://pastebin.com/suMgN5QB)

I guess inline C/C++ with scipy.weave could give similar result
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-01 07:53:17
I hope I'm not bothering you all with my Python snippets, but this morning I tried to replace the slow loop with FORTRAN code (the first time I've tried this), and the result was code ~3000 times faster than Octave.

Back to back: (http://i.imgur.com/9dupd.png)
Title: Interesting Histograms.
Post by: romor on 2011-06-01 08:31:30
Not sure I follow, but as mentioned, my guess is that array creation within that loop (while iterating over an array) is what is slowing it down
I also have no idea what Octave is doing at that level, or NumPy, but as suggested in the posted link about performance, it's not that hard to inline C/C++ (or, slightly harder, Fortran) and then gain some unbelievable speedup, while still in a friendly environment
Even if it is only about a couple of plots, a simple extension while writing the script seems worth the effort

Of course, I'm not suggesting you install Python/NumPy and everything it needs if you aren't already familiar with it, just showing a potential possibility

Matlab should perform better than Octave, but I'm guessing it would still be very slow
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-01 11:17:13
Not sure I follow, but as mentioned, my guess is that array creation within that loop (while iterating over an array) is what is slowing it down


Except that I create the arrays outside the loop and then iterate over them.  It's well known that repeatedly growing an array is a disaster, but I'm not doing that.

Maybe octave is, for some reason, but one would hope that it would take the original his(bounds)=0 and allocate the whole array then and there, since I am setting the whole array to a value.

Array creation (i.e. malloc-ing it) is not the same as iterating over an already created array, which is what I set up.
Title: Interesting Histograms.
Post by: bandpass on 2011-06-01 12:20:04
Quote
A positive full-scale input produces an  output  code  of  7FFFFFh


Yes, but a lower voltage might also produce the same value.  And in my experience, IC data sheets are frequently inaccurate; here's one from Analog Devices (AD5399) which immediately contradicts itself:

(http://i52.tinypic.com/w1tbhc.gif)

Introducing ADC's is simply a complete distraction here.

How so? The numbers in a wav file typically have their origins in an ADC and will be presented to a DAC.

If your ADC works like this:

(http://i54.tinypic.com/2q2n2he.png)

then you don't have a central value to label as zero (in two's complement).

If you bias by ½LSB:

(http://i51.tinypic.com/b4erf6.png)

you cause (effective) clipping to occur at the top end.  And there are other possible ways that it can be done.
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-01 17:59:59
Introducing ADC's is simply a complete distraction here.

How so? The numbers in a wav file typically have their origins in an ADC and will be presented to a DAC.



The problem comes about when doing signal processing or transmission with format changes, or when the actual meaning of the PCM signal is the issue.

The ½ LSB offset just does not matter at all in the real world. The difference in clipping levels is beyond minuscule: it's a clipping-level change of 2e-4 dB.

That is utterly, completely, and without a doubt irrelevant.

And, once more, with feeling: the DEFINITION of the integer PCM signal is a mid-tread quantizer, NOT a mid-riser quantizer.  Can we please just stick to the definitions that were set up somewhere around 1960? Please?
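The distinction between the two quantizer definitions can be sketched in a few lines (illustrative Python; delta is the step size):

```python
import numpy as np

def midtread(x, delta):
    """Mid-tread quantizer: reconstruction levels at 0, +-delta, +-2*delta, ...
    Zero input maps to zero output - the PCM definition."""
    return delta * np.round(x / delta)

def midriser(x, delta):
    """Mid-riser quantizer: levels at +-delta/2, +-3*delta/2, ...
    There is no zero output level; an all-zero input comes out at +delta/2."""
    return delta * (np.floor(x / delta) + 0.5)
```

With the mid-riser form, even digital silence acquires a ½-step offset, which is exactly the "you can NOT represent ZERO" problem described above.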
Title: Interesting Histograms.
Post by: romor on 2011-06-02 05:18:56
Not sure I follow, but as mentioned, my guess is that array creation within that loop (while iterating over an array) is what is slowing it down


Except that I create the arrays outside the loop and then iterate over them.  It's well known that repeatedly growing an array is a disaster, but I'm not doing that.

Maybe octave is, for some reason, but one would hope that it would take the original his(bounds)=0 and allocate the whole array then and there, since I am setting the whole array to a value.

Array creation (i.e. malloc-ing it) is not the same as iterating over an already created array, which is what I set up.

Absolutely the wrong wording. I thought to edit it after posting yesterday, but left it. I meant that on every loop iteration the whole 'his' array is re-created, not just indexed - but that is a wild guess

So, every new day, new things learned. I got a tip that there is already a function in Octave that does the hard work - histc(). So, for the sake of completeness in the discussion of why Octave is so slow with these interesting histograms, this speeds it up by a couple of orders of magnitude (x150):

Code: [Select]
x=wavread(filename);
x=x';

t = round(x(:) * 32768 + 32769);
his = histc(t, 1:65536)';

his=max(his/sum(his), 1e-12);

xax=-32768:32767;

semilogy(xax,his);
axis([-40000 40000 1e-10 1e-1])
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-02 06:53:36
So, every new day, new things learned. I got a tip that there is already a function in Octave that does the hard work - histc(). So, for the sake of completeness in the discussion of why Octave is so slow with these interesting histograms, this speeds it up by a couple of orders of magnitude (x150):

Code: [Select]
x=wavread(filename);
x=x';

t = round(x(:) * 32768 + 32769);
his = histc(t, 1:65536)';

his=max(his/sum(his), 1e-12);

xax=-32768:32767;

semilogy(xax,his);
axis([-40000 40000 1e-10 1e-1])


Heck, do you even have to do that, then? (goes to read that manual page. I know there was some reason I did not use the matlab function when I did this before...)

Hmm.  It's a bit funky but ought to be workable. Working.

OK, the following gets apparently identical results, and takes 10 seconds for a 7-minute song.

I declare histc a winner.

Code: [Select]
clear all
close all
clc

fname='13.wav'
x=wavread(fname);
x=round(x*32768);    % back to integer 16-bit codes

len=length(x)

his=histc(x(:,1),-32768:32767); % channel 1
his=his+histc(x(:,2),-32768:32767); %channel 2

tot=sum(his);
his=his/tot;          % normalize to probabilities
his=max(his, 1e-12);  % floor so empty bins show on the log plot
xax=-32768:32767;

semilogy(xax,his);
axis([-40000 40000 1e-10 1e-1]);
big=max(max(x))
small=min(min(x))

fname
Title: Interesting Histograms.
Post by: romor on 2011-06-02 07:21:36
 yeah, the double transposing was useless
at the beginning I thought the various hist() functions probably weren't a good idea when you hadn't used them in the first place, and my experience with statistics is very basic - I almost failed the exam last year
Title: Interesting Histograms.
Post by: Woodinville on 2011-06-02 10:33:44
yeah, the double transposing was useless
at the beginning I thought the various hist() functions probably weren't a good idea when you hadn't used them in the first place, and my experience with statistics is very basic - I almost failed the exam last year


I used what worked best in Matlab, at least for the versions I've timed. So much for that idea
Title: Interesting Histograms.
Post by: jlohl on 2014-03-24 11:02:50
Quote
I've found an album that has a fair number of missing codes.
I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.
I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

Hi all,
back to this old but interesting topic: I did the same kind of histogram as JJ, and I also found lots of missing codes in 16-bit analyses of many tracks of various music. I'm wondering where this comes from.
Because even if the music is produced at 14 or 15 bits (?), some dither is generally used and should more or less fill the missing codes, no?

Thanks
JLO
Title: Interesting Histograms.
Post by: Woodinville on 2014-03-27 10:36:22
Quote
I've found an album that has a fair number of missing codes.
I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.
I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

Hi all,
back to this old but interesting topic: I did the same kind of histogram as JJ, and I also found lots of missing codes in 16-bit analyses of many tracks of various music. I'm wondering where this comes from.
Because even if the music is produced at 14 or 15 bits (?), some dither is generally used and should more or less fill the missing codes, no?

Thanks
JLO


You would hope so.  But as you've noticed, there does seem to be a lot of "interesting" processing going on.
Title: Interesting Histograms.
Post by: Arnold B. Krueger on 2014-03-28 11:28:55
Quote
I've found an album that has a fair number of missing codes.
I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.
I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

Hi all,
back to this old but interesting topic: I did the same kind of histogram as JJ, and I also found lots of missing codes in 16-bit analyses of many tracks of various music. I'm wondering where this comes from.
Because even if the music is produced at 14 or 15 bits (?), some dither is generally used and should more or less fill the missing codes, no?

Thanks
JLO


You would hope so.  But as you've noticed, there does seem to be a lot of "interesting" processing going on.


What do you think is the cause of this, JJ?

Is this a consequence of perceptual coding?
Title: Interesting Histograms.
Post by: jlohl on 2014-03-30 18:23:03
Quote
Is this a consequence of perceptual coding?
I've seen missing codes on tracks that stayed in the PCM world and that have never been perceptually encoded.
Title: Interesting Histograms.
Post by: Woodinville on 2014-04-01 07:07:19
Quote
I've found an album that has a fair number of missing codes.
I've also found an album that has a max under +-16384 and has over 75% missing codes inside of that.
I didn't originally set up to calculate ratio of missing codes to all codes. But I think I shall have to.

Hi all,
back to this old but interesting topic: I did the same kind of histogram as JJ, and I also found lots of missing codes in 16-bit analyses of many tracks of various music. I'm wondering where this comes from.
Because even if the music is produced at 14 or 15 bits (?), some dither is generally used and should more or less fill the missing codes, no?

Thanks
JLO


You would hope so.  But as you've noticed, there does seem to be a lot of "interesting" processing going on.


What do you think is the cause of this, JJ?

Is this a consequence of perceptual coding?


That's the least likely source. The way any kind of lossless coding works (well, beyond direct truncation or reduction of the signal's bit length) would not cause missing codes.

Missing codes are more likely to come from a lack of dithering, a bad ADC, or bad arithmetic in a processor. Alternatively, not enough bits in the processor.
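One such mechanism is easy to reproduce: apply a gain to integer samples and round, with no dither. A hypothetical NumPy sketch:

```python
import numpy as np

# Feed every 16-bit code once through a 1.3x gain applied in the integer domain.
x = np.arange(-32768, 32768)
y = np.round(x * 1.3).astype(np.int64)   # no dither before the rounding

# 65536 distinct inputs now sit in an output span of ~85196 codes, so roughly
# 23% of the codes inside the used range can never occur in the output.
span = int(y.max() - y.min() + 1)
missing = 1.0 - len(np.unique(y)) / span
```

Similar gaps appear for other undithered arithmetic; the exact pattern depends on the gain and rounding mode.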
Title: Interesting Histograms.
Post by: jlohl on 2014-04-02 20:51:55
To be sure of my histograms, I did some tests generating and analysing noise files:

- white noise generated by WaveLab at 44.1 kHz, 16 bits
- reduction of the number of bits by Distorder, recorded back at 44/16
- dithered by WaveLab with its internal dither, noise type 1, noise shaping 3, output at 44/16
- histogram analysis at 16 bits, with missing codes counted as a percentage of the actually used range

wav 16 bits : 0% missing codes
wav 15 bits : 50% missing codes
wav 14 bits : 75% missing codes
wav 13 bits : 87% missing codes
wav 13 bits dithered : 0% missing codes
wav 12 bits : 94% missing codes
wav 12 bits dithered : 0% missing codes
wav 11 bits : 97% missing codes
wav 11 bits dithered : 0% missing codes
wav 10 bits : 99% missing codes
wav 10 bits dithered : 33% missing codes

All values seem in conformity with theory.
Here are the histograms for 10 bits, also with a detailed picture: blue is undithered, red is dithered.
We see that dither is really helpful in filling missing codes!
It would now be interesting to analyse all the steps of a real production, from recording to the final pressed CD.

(http://i56.servimg.com/u/f56/11/21/77/49/10bits10.png)
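jlohl's figures can be roughly reproduced with plain TPDF dither in place of WaveLab's shaped dither (a hypothetical NumPy sketch; a 13-bit source lives on multiples of 8 at the 16-bit scale):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
step = 8                                        # 16 - 13 bits: codes every 8 LSBs
x16 = np.round(rng.uniform(-16000, 16000, n))   # 16-bit-scale test signal

def missing_fraction(v):
    """Share of codes between min and max that never occur."""
    v = v.astype(np.int64)
    span = int(v.max() - v.min() + 1)
    return 1.0 - len(np.unique(v)) / span

coarse = np.round(x16 / step) * step   # undithered 13-bit version: ~87% missing
tpdf = (rng.uniform(-step / 2, step / 2, n)
        + rng.uniform(-step / 2, step / 2, n))  # TPDF dither, peak = one coarse step
dithered = np.round(coarse + tpdf)     # re-dithered to 16 bits: ~0% missing
```

With dither whose amplitude is matched to the coarse step, the gaps fill in; a fixed-amplitude dither sized for an ~11-bit source is too small for a 10-bit source (step 64), which is consistent with the 33% figure above.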
Title: Interesting Histograms.
Post by: Kees de Visser on 2014-04-03 08:45:03
wav 16 bits : 0% missing codes
wav 13 bits dithered : 0% missing codes
wav 12 bits dithered : 0% missing codes
wav 11 bits dithered : 0% missing codes
wav 10 bits dithered : 33% missing codes
Isn't it strange that 10 bits dithered has 33% missing codes? Were the dither settings identical to the other versions?
Title: Interesting Histograms.
Post by: jlohl on 2014-04-03 21:02:08
Isn't it strange that 10 bits dithered has 33% missing codes? Were the dither settings identical to the other versions?

The dither was the same for all files. You can see in the lower-right picture that the dither only fills about 2/3 of all the codes: the dither amplitude is not high enough to fill the missing codes of a 10-bit file, but is enough to fill an 11-bit file.
Title: Interesting Histograms.
Post by: Woodinville on 2014-04-03 23:44:51
Isn't it strange that 10 bits dithered has 33% missing codes? Were the dither settings identical to the other versions?

The dither was the same for all files. You can see in the lower-right picture that the dither only fills about 2/3 of all the codes: the dither amplitude is not high enough to fill the missing codes of a 10-bit file, but is enough to fill an 11-bit file.


Dither should be set relative to the quantization level.
Title: Interesting Histograms.
Post by: pdq on 2014-04-04 14:00:53
Dither should be set relative to the quantization level.

If you are decreasing the number of bits, then the dither should be determined by the destination bits; but if you are increasing the resolution, then dither should not be applied at all, should it? In this case, why not set the dither based on the source?
Title: Interesting Histograms.
Post by: jlohl on 2014-04-04 15:44:58
Quote
Dither should be set relative to the quantization level.
The above tests are just for demo purposes. But they also show that the dither amplitude and type should be set depending not only on the final quantization but also on the input signal's quantization.
In the above tests, this particular WaveLab dither is fine for 11-bit > 16-bit dithering but would have an unnecessarily high amplitude for a 15-bit > 16-bit requantization.
Title: Interesting Histograms.
Post by: Woodinville on 2014-04-05 06:50:39
Quote
Dither should be set relative to the quantization level.
The above tests are just for demo purposes. But they also show that the dither amplitude and type should be set depending not only on the final quantization but also on the input signal's quantization.
In the above tests, this particular WaveLab dither is fine for 11-bit > 16-bit dithering but would have an unnecessarily high amplitude for a 15-bit > 16-bit requantization.


Hmm, that would suggest that the input noise floor is not always being accurately reproduced. It is arguable whether that is audible, of course.