
General signal path questions

Hi all
I pose here some thoughts on foobar and digital audio in general: some trivial questions, as well as some rather more wondrous speculation nearer the end. I'm sure to some it's basic stuff, but it's only long because I wanted to be concise.

Background,
I'm currently experimenting with plug-ins to familiarize myself with the reconstruction process and its signal path, albeit through a sound card (which I'm aware is delta-sigma internally) and some portable speakers, only for the time being.

I realise that for the following to affect the sound as desired, the converter (a DAC chip) must accept and run 'natively' at the input rate, which can be done using a multi-bit/ladder DAC via a digital interface capable of 192 kHz+.

I’m currently trying out the 'foo_dsp_multiresampler' plugin made available by kode (hydrogenaudio), in conjunction with filters made using the freeware app 'rephase' (available at diyaudio), to suppress folds/imaging created by the ZOH function.

My first question surrounds ‘zero order hold’ up-sampling.
I read this exhibits a droop characteristic, not to be confused with the 'slow roll-off' filter also found in digital reconstruction.
I ask if -3dB~ at 20k is really that critical for home listening and whether a universal filter preset/plugin exists to compensate for if it is?
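
As a rough check on that figure, here is a minimal Python sketch, assuming an ideal zero-order hold whose magnitude response is sin(pi*f/fs)/(pi*f/fs). The function name and the 44.1 kHz hold rate are illustrative assumptions, not anything taken from multiresampler.

```python
import math

def zoh_droop_db(freq_hz, hold_rate_hz):
    """Magnitude of an ideal zero-order hold at freq_hz, in dB (assumed sinc model)."""
    x = math.pi * freq_hz / hold_rate_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(x) / x))

# Hypothetical case: each 44.1 kHz sample is simply held until the next one.
droop_20k = zoh_droop_db(20_000, 44_100)
print(f"ZOH droop at 20 kHz with a 44.1 kHz hold: {droop_20k:.2f} dB")
```

Under this model the droop at 20 kHz comes out a little over 3 dB, and it shrinks as the hold rate rises.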

In any case, material with different sample rates still needs to be streamlined through a single FIR convolved at a common rate. Being unaware of a convolver with asymmetric rate detection, I’m using the BLEP function of multiresampler as a temporary solution.

The BLEP synthesis function of multiresampler produces images, much like ZOH; however, unlike ZOH, BLEP doesn’t alias when upsampling by a non-integer factor like 44.1 > 192.

Therefore BLEP is useful as a temporary solution for streamlining multiple rates to a common rate and FIR. The FIR itself was made using rephase.

Designing FIR with rephase.
There are the usual parameters in rephase, and I’ve yet to read up on what they do, particularly taps, FFT length, centering and windowing, though I wish to focus this question on centering. INT/FLOAT and MIDDLE/ENERGY are all interchangeable options. I was unsure what combination to use for foobar for starters, but an educated guess says FLOAT/ENERGY. There's more regarding ENERGY specifically further down…

Transversal filter stage.
Take for example 48k material upsampled with ZOH to 384k and streamed direct to a multi-bit DAC running at 384k...
I ask if the residual/suppressed images reaching the DAC and its counterpart analog filter (intact) produce a desirable effect in the first instance, and/or if the process of convolving lower rate material with a filter at a higher rate can improve things alone?
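
To put a number on the images in that 48k > 384k case, here is a small sketch in plain Python. It assumes "ZOH up-sampling" means repeating each sample 8 times, and it uses a 1 kHz test tone; the helper names are made up for illustration.

```python
import cmath
import math

def zoh_upsample(samples, factor):
    """Repeat every sample `factor` times (assumed meaning of ZOH upsampling)."""
    return [s for s in samples for _ in range(factor)]

def tone_level(samples, freq_hz, rate_hz):
    """Amplitude of the component at freq_hz, via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / rate_hz)
              for k, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

rate_in, factor = 48_000, 8
rate_out = rate_in * factor

# One full period of a 1 kHz sine at 48 kHz (48 samples).
tone = [math.sin(2 * math.pi * 1_000 * k / rate_in) for k in range(48)]
held = zoh_upsample(tone, factor)

fundamental = tone_level(held, 1_000, rate_out)
image = tone_level(held, 47_000, rate_out)   # first image at 48k - 1k
image_db = 20 * math.log10(image / fundamental)
print(f"First image relative to the 1 kHz tone: {image_db:.1f} dB")
```

In this toy case the first image at 47 kHz sits roughly 33 dB below the tone, which is the energy a following FIR (or the analog post filter) would have to deal with.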

To reword, can a signal still benefit from the digital process of up sampling > convolution alone if it’s subsequently down sampled to say 96k (from 384) afterwards and sent to the dac at 96k?
I understand the purpose of the ‘classic’ slow roll off is to keep audible band intact as possible but I wonder if convolution can do anything for signal in the digital domain separately from the analog effect of residual/suppressed images reaching post filter.

I've always wondered if an upsampled signal merged with imaging (energy), as is also present via, say, an IIR filter, affects the sound independently of what may be heard from the post filter.

...Finally, quantization and bit depth.
While keeping in mind most ‘post-production’ VSTs can be used with foobar, they are designed to work with 24bit material/recordings.
Then realise the following focuses more on when a signal of lower resolution, ie 16, is to be processed in 32float container (IIRC is equivalent to 24bit).

Because it's probable foobar is designed to be transparent/bit perfect, allowing the receiving sound device to magic-up the signal, I feel it's missing input/decoding options.

I say this firstly because I’ve seen instances of the word "compander" used to describe very separate processes: one quantization related and the other related to expanding ‘compressed’ music (compansion, upward compression). I believe “xfi crystalizer” does just this but it isn’t useful being internal.

Secondly, call me old school but earlier recordings particularly electronic, trance and the like composed using samplers/digi-synths containing dacs from pre 1992 era (PCM56, TDA1541 which I call 1st generation) tend to sound fuller with more ‘bottom'.
I won’t limit the probable reasons to, perhaps, a resonating high-pass filter and a higher noise floor in these systems.

By researching specifications of older converters, I approximated a timeline spanning 3 generations of converters;  1st generation being multi bit converters, 2nd generation ‘bitstream’ from 1992 onwards and 3rd being ‘advanced delta sigma segment’ or 24/96+ chips introduced circa 1998 or thereabout.

When you see measurements of Sony PS1 online for example, they show low level non linearity. I class this as 2nd gen IMO, whereas 3rd generation chips feature what appears to be only a slightly better SNR (-100dB) than 2nd, until you realise that’s measured with a full-scale sinusoid and not at a lesser scale as with older chips.

My point being that 1st gen, found in early samplers and digital synthesizers could only do -60 at best.
Interestingly I read somewhere a floor beyond this is questionable due to Johnson or thermal noise which should be physically present at all times. Extensive over sampling has removed it entirely and it’s dubious whether this is a good thing.

What further piques my curiosity is the +48dB control seen on these samplers.
An educated guess says it's to attenuate (relatively) high levels of DC present when sampling LPs, though I hope someone can enlighten me on this.

Really, all of this leads me back to the ENERGY centering option and the -24dB control in rephase. I'm unsure whether these can be used in the same fashion as the 48dB control, in sequence with perhaps a higher (relative) SNR and perhaps associated low level non-linearity emulation in the signal path.

Of course I could buy a tape deck and a sampler and re-digitize my music collection, but I'm confident there could be a more interesting approach.

Thoughts and feedback welcome.

Thanks for reading


Reply #1
ZOH is a toy problem used in textbooks to explain the math behind sampling. It's not a real thing used for audio.



Reply #3
I don't understand most of what you are talking about....

There is rarely any advantage to upsampling.  You can't add information that's not there...  Well, you can add information but you can't restore any lost information.  ... And, 16-bit/44.1kHz is better than human hearing anyway.

Quote
I ask if -3dB~ at 20k is really that critical for home listening and whether a universal filter preset/plugin exists to compensate for if it is?
Most audio DACs & ADCs are better than that...  Usually, you'll have a higher cutoff frequency and a sharp filter (assuming you're at a sample rate of 44.1kHz or higher).

But, -3dB at 20kHz may not be audible with music.  Generally only young people can hear to 20kHz.  And, any 20kHz musical harmonics are low-level and tend to be masked by lower-frequency sounds.    In a 20kHz hearing test, you are listening to a "loud" 20kHz signal against a silent background.  Even if you can hear that, play a lower-level 20kHz sound while music is playing and you probably won't hear it.

Quote
Then realise the following focuses more on when a signal of lower resolution, ie 16, is to be processed in 32float container (IIRC is equivalent to 24bit).
Floating point makes DSP "easier".  There's a lot of summing in DSP and if you are doing mathematical operations on 16-bit values, the results or interim operations can require more than 16-bits. 

Floating point also allows you to go over 0dBFS without clipping.  You can do things in an audio editor like mixing or  boosting the bass that push the peaks over 0dB and as long as you reduce the level before rendering to an integer format, or before sending the signal to your DAC, you can "recover" from going over 0dB.
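
Here is a minimal Python sketch of that headroom point. It assumes a full scale of 1.0 for the float path and uses a made-up `clip_int16` helper to stand in for an integer pipeline; Python's own floats are 64-bit, but the same reasoning applies to 32-bit float.

```python
def clip_int16(value):
    """Round to the nearest integer and clamp to the 16-bit range."""
    return max(-32768, min(32767, round(value)))

peak_int = 30000                 # a loud 16-bit sample
peak_float = peak_int / 32768    # same sample with full scale = 1.0

# Boost by 2x (about +6 dB), then cut by 2x again.
restored_int = clip_int16(clip_int16(peak_int * 2) / 2)   # clipped on the way up
restored_float = (peak_float * 2) / 2                     # briefly above 1.0, no clip

print("integer path:", peak_int, "->", restored_int)
print("float path intact:", restored_float == peak_float)
```

The integer path comes back damaged because the boosted value hit the 16-bit ceiling, while the float path returns the original sample.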

Quote
in 32float container (IIRC is equivalent to 24bit)
32-bit float handles a much-much wider range of values than 24-bit integer.  With floating-point, you can have huge numbers or tiny fractional numbers.  With 24-bit integers you can "only" count about to +/- 4 million and you can't have fractional values less than 1.  In practical audio terms, floating point essentially has infinite dynamic range.


Quote
I believe “xfi crystalizer” does just this but it isn’t useful being internal.
I believe that's a harmonic exciter effect that adds high frequency harmonics.


Reply #4
Quote
32-bit float handles a much-much wider range of values than 24-bit integer. With floating-point, you can have huge numbers or tiny fractional numbers. With 24-bit integers you can "only" count about to +/- 4 million and you can't have fractional values less than 1. In practical audio terms, floating point essentially has infinite dynamic range.

Although floating point numbers can express a near-infinite (in this context) range of values, 32-bit floats only store 23 bits of mantissa (24 bits of precision counting the implicit leading bit).  So in that respect they suffer from quantisation problems at a similar level to 24 bit integers.  However, the variable exponent means that the full precision should always be maintained, whereas a fixed (!) integer might have 24 bits of precision but not all in the range you are trying to work with.  The trivial example is that there is nothing between 0 and 1 expressed as an integer, whereas a float can have a full 23 bits-worth of values between 0 and 1.  So if you are changing levels or summing audio, integer values may become drastically distorted (even 32 bit int) while floats should stay more accurate.
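
A small sketch of that difference, in plain Python. It assumes a full scale of ±1.0 mapped onto ±2**23 for the 24-bit case, and rounds through `struct` to get a true 32-bit float; the helper names and the test value are illustrative only.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_int24(x):
    """Quantise to a 24-bit integer grid (assumed full scale 1.0 = 2**23 steps)."""
    return round(x * 2**23) / 2**23

quiet = 0.001234567   # a quiet sample, roughly -58 dBFS

err_float = abs(to_float32(quiet) - quiet) / quiet
err_int = abs(to_int24(quiet) - quiet) / quiet

print(f"relative error as 32-bit float: {err_float:.2e}")
print(f"relative error as 24-bit int:   {err_int:.2e}")
```

For a quiet sample like this the float keeps its full relative precision, while the fixed 24-bit grid is noticeably coarser relative to the signal.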

The downside of floats, particularly from a computer point of view is rounding.  They are subject to small inaccuracies even with quite trivial operations such as addition.  This is compounded by the methods that computers use to perform floating point arithmetic, so that the same calculation done on different processing units, or even with code generated by different compilers, may produce a different result.  Given the potentially much larger inaccuracies introduced with DSP using integers, or just from the approximations necessary with most DSP, this isn't a big problem.
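
The rounding point can be seen with nothing more than three additions. This is a sketch using Python's 64-bit floats; the same effect appears with 32-bit floats.

```python
# Same three numbers, summed in a different order.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

difference = abs(left_to_right - right_to_left)
print(left_to_right == right_to_left)   # the two sums are not bit-identical
print(f"difference: {difference:.1e}")
```

A compiler or processing unit that reorders the sums can therefore give a slightly different answer, though the gap is far below anything audible.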

P.S.  24 bits allows 16.7 million values.  Give or take a sign, that's still more than 4 million.  32 bits allows around 4 *billion* values, 16 bits only 65,536.


Reply #5
Quote
I'm currently experimenting with plug-ins to familiarize myself with the reconstruction process and its signal path;

I’m currently trying out the 'foo_dsp_multiresampler' plugin made available by kode (hydrogenaudio), in conjunction with filters made using the freeware app 'rephase' (available at diyaudio), to suppress folds/imaging created by the ZOH function.


Why add complexity with plug-ins if you are still trying to understand simple reconstruction?

Quote
My first question surrounds ‘zero order hold’ up-sampling.


Why add complexity with upsampling if you are still trying to understand simple reconstruction?

Quote
I ask if -3dB~ at 20k is really that critical for home listening and whether a universal filter preset/plugin exists to compensate for if it is?


3 dB at 20 kHz is not a complete definition of a frequency response curve.  There are approximately an infinite number of frequency response curves that are 3 dB down at 20 kHz.

Quote
To reword, can a signal still benefit from the digital process of up sampling > convolution alone if it’s subsequently down sampled to say 96k (from 384) afterwards and sent to the dac at 96k?
I understand the purpose of the ‘classic’ slow roll off is to keep audible band intact as possible but I wonder if convolution can do anything for signal in the digital domain separately from the analog effect of residual/suppressed images reaching post filter.


Signals generally don't benefit from upsampling. In general upsampling just spreads the same information over more bits.  If the upsampling is done without mangling the data too badly, it has no audible effect. Upsampling is never perfect so there is at least a minor degradation involved with its use, but this can be minimized and made inaudible.

One good reason to upsample is if you are going to do nonlinear processing in the digital domain. Doing the nonlinear processing at a far higher sample rate can minimize the aliasing that the nonlinear processing would otherwise cause.
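
A toy illustration of this, in plain Python. It assumes cubing a 15 kHz sine as the nonlinear process (an arbitrary choice), which creates a 45 kHz harmonic; at 44.1 kHz that harmonic folds back to 900 Hz, while at 4x the rate it stays where it belongs. The helper names are made up for the example.

```python
import cmath
import math

def tone_level(samples, freq_hz, rate_hz):
    """Amplitude of the component at freq_hz, via a single-bin DFT."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * k / rate_hz)
              for k, s in enumerate(samples))
    return 2 * abs(acc) / len(samples)

def cubed_tone(freq_hz, rate_hz, duration_s=0.01):
    """A sine pushed through x**3, generated directly at rate_hz."""
    count = round(rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) ** 3
            for k in range(count)]

# Level found at 900 Hz, where the 45 kHz harmonic lands if it folds back.
alias_at_1x = tone_level(cubed_tone(15_000, 44_100), 900, 44_100)
alias_at_4x = tone_level(cubed_tone(15_000, 176_400), 900, 176_400)

print(f"900 Hz level at 44.1 kHz:  {alias_at_1x:.3f}")
print(f"900 Hz level at 176.4 kHz: {alias_at_4x:.3f}")
```

At the higher rate the 45 kHz harmonic can still be filtered out before coming back down to 44.1 kHz, which is the benefit being described.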


Quote
Secondly, call me old school but earlier recordings particularly electronic, trance and the like composed using samplers/digi-synths containing dacs from pre 1992 era (PCM56, TDA1541 which I call 1st generation) tend to sound fuller with more ‘bottom'.


Any such belief is unfounded in technology.


Quote
By researching specifications of older converters, I approximated a timeline spanning 3 generations of converters;  1st generation being multi bit converters, 2nd generation ‘bitstream’ from 1992 onwards and 3rd being ‘advanced delta sigma segment’ or 24/96+ chips introduced circa 1998 or thereabout.


If you ABX good converters from any of your 3 generations be prepared for a lot of random guessing.  Been there, done that.


Quote
When you see measurements of Sony PS1 online for example, they show low level non linearity


Evidence?

Here are some technical tests of a PS/1

PS/1 test at stereophile

There's a comment in the text that the converters are 14 bit, and the evidence shows that the converters are properly dithered so there is no actual nonlinearity.  Pretty common for that time and place.


 


Reply #6
There were a few different models of the Playstation. There is the original square model, and the later smaller & rounded model that was sold as "PSone". If I'm not mistaken, there were also some more hardware revisions that Sony didn't update the case for. You need to look at the model number, which is in the format of SCPH-xxxx where xxxx is a number. The article mentions the model they tested.

 