Topic: Interesting Histograms.

Interesting Histograms.

Reply #25
Then I posted "GNU Octave fucking blows" on Facebook. Now see jj, if I friended you, you would have seen that comment, and you wouldn't have had to go through all that trouble, eh


I somewhat associate 'Axon' with wooing Woodinville for quite some time now, but I think that's a new level.

You have no idea.


I'll bug the admins. I can't do this. In the interim, zip the .m and post that?


Well, screen-copying the text above into Octave works like a champ.

It's not like there's any special character stuff in it.


Right, but my hacked audio package is 380 lines of code, and my test for said hack was another 50... it starts to add up.

In any case, I have uploaded everything I've got HERE. Have at it.

Interesting Histograms.

Reply #26
Right, but my hacked audio package is 380 lines of code, and my test for said hack was another 50... it starts to add up.
[ codebox ] will work.

Interesting Histograms.

Reply #27
You can go into the library and change it to the proper 2^(n-1) without much trouble.

2^(n-1) is still problematic (though perhaps not for histograms). E.g. a WAV containing the minimum value cannot be inverted.
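For example, with 16-bit samples scaled by 2^15 = 32768, the minimum value -32768 maps to exactly -1.0, but its inverse +1.0 would have to map back to +32768, one step beyond the largest representable value of +32767.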

Interesting Histograms.

Reply #28
More generally,

Code: [Select]
float scale(int value){
  float offset = (MAX_INT + MIN_INT) / 2;
  float range  = (MAX_INT - MIN_INT) / 2;
  return (value - offset) / range;
}


MAX_INT and MIN_INT are the maximum and minimum values for the int type, which is treated as a fixed-point number.

Interesting Histograms.

Reply #29
But not for WAV (or AU, or AIFF) files, where 0 is defined to be the midpoint ("offset" in the code).

Edit: not sure if the code has representational issues: mathematically, (MAX_INT+MIN_INT)/2 = -0.5
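To spell that out, a minimal sketch (assuming the 16-bit limits 32767 and -32768 for MAX_INT and MIN_INT): the integer expression truncates to 0, while the mathematical midpoint is -0.5.

Code: [Select]
int   i_mid = (32767 + (-32768)) / 2;    /* integer division: -1 / 2 == 0    */
float f_mid = (32767 + (-32768)) / 2.0f; /* float division:   -1 / 2 == -0.5 */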

Interesting Histograms.

Reply #30
That code should map the int range to -1..+1 inclusive, ideally, ignoring any external definition of the midpoint. Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values. That seems like a really weird engineering decision. Do you have a citation for defining 0 to be the midpoint?

Interesting Histograms.

Reply #31
To me it seems that Woodinville found that formula as a workaround, and, as shown later by xnor, Octave's wavread does have an issue, only of a different kind.

The libsndfile data I was referring to was in e-notation, and it seemed like everything was OK with Octave's 16-digit floats (never mind the 4-6 digits from Cool Edit), but it wasn't - values were way off, sometimes to two decimal places.
After switching to what xnor noticed, formatting everything as 16-digit floats, and double-checking, it all seems fine again.

An apologetic IPython (IPy) version of those histograms:
http://dl.dropbox.com/u/30782742/ipython.html

I used "whos" and slicing in both Octave and IPy to check that the variables match, and everything is OK.

I'm not sure what the purpose of the "kk" for loop is: kk is never used. If it's meant to process both channels' data, it seems to me the "jj" for loop could take care of that. Or perhaps it's just that I use Octave rarely.

Interesting Histograms.

Reply #32
That code should map the int range to -1..+1 inclusive, ideally, ignoring any external definition of the midpoint. Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values. That seems like a really weird engineering decision. Do you have a citation for defining 0 to be the midpoint?

http://msdn.microsoft.com/en-us/library/ms...audiodataformat

Weird, certainly: lots of special-case code needed and/or DC-offsets creeping in.

Interesting Histograms.

Reply #33
Defining the midpoint to be int 0 implies that the encoding is slightly biased towards negative values.


It doesn't have to imply that. The negative range can also be interpreted to just have more headroom. When you convert from a balanced to an unbalanced encoding, the max negative symbol stays unused. When you convert from an unbalanced to a balanced encoding, there indeed needs to be special case handling. Best general guidance would be not using the max negative symbol at all and asking for user feedback when you encounter it on the input pipeline.

It is a good thing that 0 is defined as the midpoint. Else detecting silence would be a PITA. When the first PCM formats were developed, these differences were probably too far below the analog noise floor of the best converters to really cause any concern.
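A minimal sketch of that special-case handling (my own illustration, assuming 16-bit PCM): a symmetric float value in [-1.0, +1.0] is scaled by 32767 and clipped so that the max negative symbol -32768 is never emitted.

Code: [Select]
short float_to_pcm16_sym(float x)
{
    float v = x * 32767.0f;           /* symmetric ("balanced") scaling       */
    if (v >  32767.0f) v =  32767.0f; /* clip positive overshoot              */
    if (v < -32767.0f) v = -32767.0f; /* leave the max negative symbol unused */
    return (short)v;
}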

 

Interesting Histograms.

Reply #34
Best general guidance would be not using the max negative symbol at all and asking for user feedback when you encounter it on the input pipeline.

Should an ADC request user intervention whenever its input is < 1/2^n of its range?

Quote
It is a good thing that 0 is defined as the midpoint. Else detecting silence would be a PITA.

I'm not sure that's useful though; in practice, silence is defined in (finite) dB.

Interesting Histograms.

Reply #35
I don't see the problem. PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).

Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.
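A minimal sketch of that convention for 16-bit samples (an illustration, not xnor's code):

Code: [Select]
float pcm16_to_float(short s) { return s / 32768.0f; }   /* -1.0 <= y < +1.0 */

short float_to_pcm16_clip(float y)
{
    float v = y * 32768.0f;
    if (v >  32767.0f) v =  32767.0f;  /* +1.0 cannot be represented; clip */
    if (v < -32768.0f) v = -32768.0f;
    return (short)v;
}
/* Inverting the minimum: -(-32768 / 32768.0f) = +1.0, which clips back to +32767. */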
"I hear it when I see it."

Interesting Histograms.

Reply #36
You can go into the library and change it to the proper 2^(n-1) without much trouble.

2^(n-1) is still problematic (though perhaps not for histograms). E.g. a WAV containing the minimum value cannot be inverted.


That is correct, and proper.

That is the nature of 2's complement: there is only one '0' entry, and thus one extra negative entry.

And, yes, if the quantizer is done as a standard PCM quantizer, there must be a zero reconstruction level.

That is, after all, the definition. Yes. Really.
-----
J. D. (jj) Johnston

Interesting Histograms.

Reply #37
I'm not sure what the purpose of the "kk" for loop is: kk is never used. If it's meant to process both channels' data, it seems to me the "jj" for loop could take care of that. Or perhaps it's just that I use Octave rarely.


It's for doing the spectral flatness measure, but I may have indeed messed up, let me look.

Yep, messed up. The '1' in the array reference should be 'kk'.

Sigh. I'll see if the SFM is much different. It's unlikely.

Oh, and yes, Octave is astonishingly slow, even when you only do the histogram stuff. In fact, the SFM stuff adds surprisingly little to the run time, which is bizarre.

Even without bounds checking it's slow, slow, slow. No idea why.

Matlab is quite a bit faster, but I don't have it here.
-----
J. D. (jj) Johnston

Interesting Histograms.

Reply #38
Should an ADC request user intervention whenever its input is < 1/2^n of its range?


No, just fall back to the usual behavior defined for all out-of-range values.

I'm not sure that's useful though; in practice, silence is defined in (finite) dB.


It's not a bug, it's a feature! From an analog perspective, a good place for the midpoint would have been between the two smallest symbols and both ranges would have been in perfect symmetry. Silence would then be encoded as some form of noise alternating between the smallest symbols, which is fine, since PCM encoding doesn't make any promise better than that.

Giving '0' the privileged meaning of 'digital silence', at the cost of one usual symbol in the positive range, enables scenarios where you signal something like: "don't try to replicate my primitive approximation of silence, but replace it with the best silence you have available". The cost (one symbol) is insignificant compared to the possibility gained.

In practice you dither, so a privileged '0' symbol is unnecessary, since silence is encoded as noise anyway. Some dithering tools have something like an "auto-black" feature, though.

I don't see the problem. PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).


[-1.0, 1.0] is a perfectly fine range for PCM encoding in float representation. Why should artificial constraints from a legacy storage format be carried over to a better format, which doesn't benefit from that constraint in any way? Conversion to and from [-1, 1] isn't black magic, after all.

Interesting Histograms.

Reply #39
[-1.0, 1.0] is a perfectly fine range for PCM encoding in float representation.

I (we?) were talking about the PCM format (format tag 1) in RIFF WAVE files, which is integer only, and the normalization issue.
With normalized floats (format tag 3) the range is of course, like you posted, -1.0 <= y <= +1.0 and normalization is not needed.

I don't think those non-floating-point formats are legacy at all. I know a couple of recording engineers who do not use floats as a storage format.
And I think it's common practice to keep the level at least a fraction of a dB below full scale. Even if you're only 0.01 dB below full scale you're down to something like 32730 with 16-bit integers.
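As a quick check of that figure: 32767 * 10^(-0.01/20) ≈ 32767 * 0.99885 ≈ 32729.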
"I hear it when I see it."

Interesting Histograms.

Reply #40
Once again I put in my amateur two-bits worth and receive sound instruction in return. I <3 you guys. </off-topic>

Interesting Histograms.

Reply #41
Yeah, I also simplified jj's Octave code (into something like 10 lines IIRC), and always attempted to use native Octave functions wherever I could, and it still ran like a dog.

mmmm... Python for numeric work. Crunchy.

Interesting Histograms.

Reply #42
Yeah, I also simplified jj's Octave code (into something like 10 lines IIRC), and always attempted to use native Octave functions wherever I could, and it still ran like a dog.

mmmm... Python for numeric work. Crunchy.


Nah, dogs run fast. It's not that fast.
-----
J. D. (jj) Johnston

Interesting Histograms.

Reply #43
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

This approach introduces a new mathematics, where inversion is non-linear, which is madness, or at least highly undesirable.

Either the analogue signal is biased with ½ LSB, in which case the digital signal range is -32767 to +32767 (-32768 can never occur and would clip if sent to a corresponding DAC), or the digital signal is biased with ½ LSB, in which case the inverse of -32768 is 32767.

Note that even though Microsoft claims that the midpoint is zero, a WAV file cannot know how your ADC is biased up.

That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?

Interesting Histograms.

Reply #44
.... but can you hear a 0.5 lsb offset?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Interesting Histograms.

Reply #45
Can you provide a source for the definition?


It goes all the way back to about 1960. I don't recall it presently, but in fact zero is zero. It all boils down to that.

There have been a variety of scalings for fixed-to-float conversion, but the most common is that -1 is the largest negative value. Given the reality of integer 2's complement math, that's really how it all works out.
For sign-magnitude integers, you wind up with one zero that has two codes for it in the integer representation.

It is possible to do midriser quantizers instead of midtread quantizers, but then the following bites you:

When you start to do integer math and floating point math and expect something to work out the same way, you have to have zero be, in fact, zero and nothing else. Otherwise you have very different domains for your signals.
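A minimal sketch of the two quantizer shapes (my own illustration, with step size d): the midtread quantizer has a reconstruction level at exactly zero, the midriser does not.

Code: [Select]
#include <math.h>

/* Midtread: reconstruction levels at 0, +/-d, +/-2d, ...  (zero is a level). */
float midtread(float x, float d) { return d * floorf(x / d + 0.5f); }

/* Midriser: reconstruction levels at +/-d/2, +/-3d/2, ...  (no zero level).  */
float midriser(float x, float d) { return d * floorf(x / d) + d / 2.0f; }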
-----
J. D. (jj) Johnston

Interesting Histograms.

Reply #46
A 0.5 LSB offset is an issue for spectrum analysis, and in principle might also exacerbate potential stability issues in lowpass filters. But more importantly, IT'S JUST WRONG.

PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.
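A minimal sketch of that wrap-around (my own illustration, working on the raw bit pattern so the behaviour stays well-defined):

Code: [Select]
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int16_t  s        = -32768;                    /* bit pattern 0x8000               */
    uint16_t bits     = (uint16_t)s;               /* 0x8000                           */
    uint16_t inverted = (uint16_t)~bits;           /* invert all the bits: 0x7FFF      */
    uint16_t negated  = (uint16_t)(inverted + 1u); /* add 1: 0x8000, i.e. -32768 again */
    printf("0x%04X\n", (unsigned)negated);         /* prints 0x8000                    */
    return 0;
}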

Quote
That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?

The wikipedia entry for two's complement arithmetic?

Interesting Histograms.

Reply #47
And just to flesh this discussion out some, yes, having a negative value which cannot be inverted to a positive value *is* the cleanest and most efficient solution. Unless anybody here would instead prefer negative zero. Hands?

Interesting Histograms.

Reply #48
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

Meaning all -1s are inverted to 2, and there will be no 1s? Btw, the way xnor described it is how e.g. Audition inverts.

Chris
If I don't reply to your reply, it means I agree with you.

Interesting Histograms.

Reply #49
A 0.5 LSB offset is an issue for spectrum analysis, and in principle might also exacerbate potential stability issues in lowpass filters. But more importantly, IT'S JUST WRONG.

Indeed, hence the discussion—it's a small but annoying issue if it's not handled consistently.

Quote
PCM cannot represent exactly +1.0, it's as simple as that. The valid range is -1.0 <= y < +1.0 where 1.0 maps to 2^(nbits-1).
Anything above that range is simply clipped to the highest possible value, so inverting -32768 results in +32767.

Not quite -- inverting (multiplying by -1) -32768 results in -32768. Invert all the bits, and add 1.

But that's also highly undesirable for DSP.

At the ADC, if there is no bias, inverting an analogue signal that converts to -32768 would produce an analogue signal that converts to 32767; digital inversion should give the same result (at the DAC output that is).

Quote
Quote
That is correct, and proper.
That is, after all, the definition. Yes. Really.

Can you provide a source for the definition?

The wikipedia entry for two's complement arithmetic?

It doesn't mention ADC/DAC biasing.  A better place to look might be IEC 60908 or somesuch.

If there is no ADC bias (and 16-bit ADC values are stored unmodified or with just the top bit flipped), then a valid DSP solution is:

Code: [Select]
float dsp_sample = (adc_sample + 0.5) / 32767.5;

If there is ½ LSB ADC bias then a valid DSP solution is:

Code: [Select]
float dsp_sample = adc_sample / 32767.0;

and -32768 is an unused value.  The code:

Code: [Select]
float dsp_sample = adc_sample / 32768.0;

doesn't seem to map to any real world ADC scenario.

In practice, as has been mentioned, recordings are made with headroom and probably have any DC-offset (w.r.t. digital 0) removed with post-processing; this however has the same result as biasing the ADC, which again means that -32768 should be an unused value.
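A minimal sketch of that kind of post-processing (my own illustration): removing the DC offset by subtracting the mean sample value.

Code: [Select]
#include <stddef.h>

/* Subtract the mean (the DC offset w.r.t. digital 0) from a block of samples, in place. */
void remove_dc(float *x, size_t n)
{
    double mean = 0.0;
    if (n == 0) return;
    for (size_t i = 0; i < n; i++) mean += x[i];
    mean /= (double)n;
    for (size_t i = 0; i < n; i++) x[i] -= (float)mean;
}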