Topic: Have a working 'expander' based on DolbyA (not same design) -- works well.

Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #75
I have been on a distraction -- playing with reorganizing some old stereo recordings (mostly the same ones that I had been working on before.)   I re-oriented the Sergio Mendes examples a bit, and have produced much more interesting stereo images (stuff is still real -- no weird inversions that I can hear -- clearer locality/etc.)  The reoriented examples are on the repository:
01.Mais Que Nada,
03.With A Little Help From My Friends
05.Look Of Love
09.Going Out Of My Head
13.Pretty World
15.Pais Tropical

I'll publish the technique when I totally get it tuned up.   Pais Tropical is special -- it has some kind of 'on-site' miking -- so it has a harsh character.   The others are really fantastic music, and I wouldn't have wasted my time if it wasn't such great stuff.
I have equivalent results with ABBA (actually even more of an improvement), * but not ready to disclose.

* CORRECTION:  I was in a hurry last night and uploaded my ABBA experiments onto the repository without thinking.  Take a listen - they are interesting.  Not 100% tuned yet -- but very promising.



New expander release, example scripts, example results

Reply #76
Semi-great news!!!
Here is a new post of the latest 4-band pseudo-DolbyA expander.   I have added appropriate high pass filters and retuned some of them because they were overly attenuating the LF around 20-30Hz.   Now, there shouldn't be any more than about 2dB of total static attenuation at 20Hz (of course, the dynamic gain loss will be more due to the operation of the expander.)  It is technically necessary to at least do some DC blocking, or input offsets can cause trouble and there will definitely be more disturbances in the output.
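
As a rough illustration of the DC-blocking point, here is a minimal Python sketch of a one-pole DC blocker. It is not the filter used in the expander: the function name `dc_block` and the coefficient `r` are assumptions chosen only for illustration. With r = 0.9995 at a 96 kHz sample rate the corner sits at roughly 8 Hz, so 20 Hz is only mildly attenuated while a constant input offset decays away.

```python
def dc_block(samples, r=0.9995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1].

    r is a hypothetical coefficient (closer to 1.0 means a lower corner
    frequency); it is not a value taken from the expander source.
    """
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out


# A signal that is nothing but a constant offset: the output should decay to ~0.
offset_only = [0.25] * 50000
filtered = dc_block(offset_only)
print("first output sample:", filtered[0])
print("last output sample: ", filtered[-1])
```

The first output sample still carries the full step, which is why an offset that suddenly appears at the input can disturb a level detector for a short while even when a blocker is present.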

I have cleaned up the source code so that a lot of cruft has been removed.  It will be easier to see how simple the code is from a high level viewpoint.  (There is still a lot of low level stuff going on, but it is hidden by the use of C++ classes and by putting a lot of the work into the supporting code.)

The source distribution was previously often not complete.  I had forgotten about the vector math library, and have added it back into the source.  Hopefully, it will build now.

The Windows source distribution and the singular Linux distribution now have some example scripts that I have actually used to decode some anthology album releases -- which were almost definitely DolbyA encoded.   It is important to follow what is going on in the scripts, or you will not get ideal results.  I carefully set the thresholds in the scripts -- even though they are not perfect.  I had to set the thresholds by listening for decoding artifacts (I am well trained to hear them, but still there are some interactions that I sometimes miss.)  Specifically, the thresholds for the Carpenters might not be 100% correct.

I have added a simple example of actual decoding before/after results with the pseudo-DolbyA only.  The results show that the pseudo-DolbyA really works (note the significant decrease in hiss, even some inter-syllable NR.)  Note that the result has a little less treble, which is to be expected (similar to the action of DolbyB.)  Even if you retune the treble back up (adding 3dB here or there at the HF range), there is less hiss than the original.

The resulting NR isn't quite as substantial as the 'restoration processor', but still worthwhile, and the results produced by this decoder are technically more accurate as to the original source material.


Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #77
Regarding the threshold numbers for the Carpenters on the recent release of the expander...
These numbers are variables used in the shell scripts that I sent along with the Windows source release and the singular Linux distribution.

For Carpenters before 1981 (maybe not even accurate for the later part of the 1970s, and definitely not for any Christmas album), the level parameters appear to best be set to:  dt0=0.50 dt1=-3.5 dt2=-3.5 .   The dt2 isn't quite as critical because it applies only to the top two bands for HF enhancement removal.
For Carpenters on/after 1981, the parameters are quite a bit lower:  dt0=-6.0 dt1=-9.0 dt2=0.0.   The last value of dt2 is questionable, but does imply a lot of HF emphasis had been applied.  The much lower parameters imply recording at a level significantly above DolbyA reference level.  The 1981 album appears either to have significant tape saturation or to have been run through a compressor/limiter with poorly designed HF attack/decay behavior (which might be another attempt at de-essing, after too much enhancement with the 2 HF channels of DolbyA.)   I forget what extreme tape saturation at HF @ 15ips sounds like -- but it just might be overload at HF because of recording hot to begin with, then applying approx an additional 6dB of dynamic gain between 3K and 20K with DolbyA.  (The esses in Karen's voice are really corrupted, especially on the 1981 album.)

The source material is the so-called studio anthology album -- which apparently is just a runoff of tapes sent to disk/CD mastering without careful handling of decoding needs.   They were obviously not just using a single DolbyA, but using them in a configuration that enhances the capabilities (or using a similar FET sidechain compressor for compression.)  My shell scripts imply the configuration.


Early, early release of the 'restoration processor'

Reply #78
Here, I am distributing (giving y'all) a copy of the so-called 'restoration processor.'  It is a rather nice (IMO) expander that does reasonably good NR and transient recovery.  It is NOT a tick/pop remover, and in fact will likely make them much worse.

I can only distribute an Intel/AMD 64-bit Linux version right now.  I hope to distribute a Windows64 version before the end of this week, and likely will have it working/distributed tomorrow or the day after.   RENAME THE FILE from ex-avx.out to ex-avx (or use ex-avx.out as the command name.)  I promise that I am really gonna work on completing the Win64 version -- I did have an earlier version working -- but it needs updating.   I wanted to give away something NOW/ASAP because I have been promising/talking about it for a long time.  The biggest last issues were choosing the best default parameters and resolving the commands into something simple for a command line (it used to take a fairly big script to set up.)

The usage is similar to the pseudo-DolbyA, except it only works at 44.1, 48 and 96 kHz (no 192 kHz.)  It is complex, and quite multi-threaded.  It really likes 4 core machines, and needs an i7-3000 series or newer.   I'll be releasing versions that don't require nearly the latest CPUs soon (about the time that the Windows version is done -- which will be an ATOM/Silvermont optimized version that will work on almost any i3/i5/i7 or recent ATOM.)

The readme talks more about the program, but generally it is used in a command line (pipeline or stdin/stdout.)  The command switches are simple:
--info                   provides a nice 'pacifier' that shows the amount of expansion on each band, in/out levels, and limiter status.
--exratio=x         expansion ratio, between 0.00 and 0.99.   Best/most useful values are 0.10 to 0.50.
--exthresh=-x      dB for threshold.  Usually between -20 and -3, with -6 being a good first guess.
--exmingain=-x    dB for minimum gain.  Values between -30 and -3 are good choices, I often use -12
--exmaxgain=x     dB for maximum gain.  Values between 1 and 6 are good choices.  I often use 4
* PLEASE NOTE THAT THESE COMMAND LINE PARAMS MIGHT NOT BE RANGE CHECKED -- insane commands might give insane results (and sore ears.)
* Actually, there is a rather nice built-in limiter that keeps things from going too nuts.

Example usage:
ex-avx <infile.wav >outfile.wav
Uses default exratio of about 0.21, default exthresh of -6.0dB, exmingain of -15dB, exmaxgain of 6.0dB

Another usage:
ex-avx --exratio=0.50 --exmingain=-20 --exmaxgain=3 --info
Do more aggressive expansion with a lower minimum gain (more noise reduction) and limit the maximum gain to avoid overload.  Also, provide a running log of what kind of gain control is being done.
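
The switches above suggest a static gain law along the following lines. This is only a sketch under assumptions: the formula (gain in dB equals exratio times the distance from the threshold, clamped between exmingain and exmaxgain) and the function name `expander_gain_db` are guesses for illustration, not taken from the program. The real processor is multi-band and has attack/release dynamics that this ignores. The defaults are the ones quoted for the no-argument case.

```python
def expander_gain_db(level_db, exratio=0.21, exthresh=-6.0,
                     exmingain=-15.0, exmaxgain=6.0):
    """Hypothetical static gain curve for a single band.

    Assumption: signals below exthresh are pushed down and signals above it
    are pushed up by exratio dB per dB, so the input/output slope is
    1 + exratio. The result is clamped to [exmingain, exmaxgain].
    """
    raw_gain = exratio * (level_db - exthresh)
    return max(exmingain, min(exmaxgain, raw_gain))


# Compare the default settings with the more aggressive example settings.
for level in (-60.0, -30.0, -12.0, -6.0, 0.0):
    default_gain = expander_gain_db(level)
    aggressive_gain = expander_gain_db(level, exratio=0.50,
                                       exmingain=-20.0, exmaxgain=3.0)
    print("in %6.1f dB   default %6.2f dB   aggressive %6.2f dB"
          % (level, default_gain, aggressive_gain))
```

Under this assumed law, a lower exmingain only matters for quiet material (where it sets how far noise can be pushed down), and exmaxgain only matters for material above the threshold.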

I am also including some examples (made directly with the exact program being distributed, with default parameters, and for the 'lots' version with an expansion parameter of 0.60, which is A LOT of expansion.)   Note the relative lack of artifacts even with huge amounts of expansion!!!

**** WITH zero dB expansion (default), there is a several dB loss through the processor.  I added in that loss to avoid clipping or exciting the limiter.   I normally use 32bit floating point, so the loss of precision isn't really all that important.  Using 16bits should really be reserved for quick checks, and 32bit floating point is MUCH BETTER considering the fact that this is an expander and I HAVEN'T ADDED DITHERING!!!

Read the README.txt for a little info about this nice toy.  The Linux binary is ex-avx.out (rename it to ex-avx to be compatible with the examples.)  The example is a 30sec clip from an old song from the 1960's called 'Downtown'; the source was YouTube.  The -noproc version is the raw copy.   The -defaultproc version is using the 'restoration processor' without any args.  The -lotsproc version is using the 'restoration processor' with much more expansion.

Again -- I promise a Windows version soon (hours or days -- not weeks.)   Source code will probably be 2 weeks away because of maintenance/source management issues.  The source code is very big -- 120MB (yes, really), some of it is automatically generated, and it is JUST NOT READY for other people to look at (it is A LOT better than it used to be.)


Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #79
Just got the first build of this version of the FULL SCALE processor working on Windows64.   Bad news is that it runs way too slow on the Silvermont architecture (the ATOM), so I have some optimization work to do.   Will probably take a few more days -- but it is definitely in progress.   Currently, it does run nicely on the Haswell platform on Linux.

The FULL SCALE/capability 'restoration processor' works really nicely with the pseudo-DolbyA.  So, after doing a DolbyA decode, doing some additional expansion/transient recovery really does help some old music (e.g. Brasil'66 type stuff.)   The restoration processor is sometimes used like this:

sox infile --> pseudo-DolbyA -> restoration processor -> outfile

John Dyson

Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #80
This is great news! From my perspective, the restoration processor is icing on the cake. The pseudo-DolbyA is the key. We have some other options for restoration processing (or mastering), such as iZotope Ozone (or iZotope RX Audio Editor), but the Dolby decode is a challenge. I personally like the U-He Satin on some of my tapes; it gives them more life than the hardware decode without being overly bright or pumping.

Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #81
Thanks for the feedback!!!  I am very focused on working on the Windows version -- which is now actually functioning.

When the 'restoration processor' is fully available, I REALLY HOPE that it will help people with their older (and some new) recordings.   When I use the 'restoration processor' (STILL NEED A NAME -- I suck at naming things -- it might just as well be 'processor 1B' or somesuch :-)), I often use it with the pseudo-DolbyA first in a pipeline.  I am currently optimizing the code to work a little more quickly on non-AVX CPUs -- my Windows box is an ATOM 2core Silvermont, so it is quite slow when compared with my 4core Haswell.   The lack of AVX and the smaller SSE registers on the ATOM make the code longer and less efficient.  I am focused on streamlining the code to work much better on slower computers.   I will also probably blindly build an AVX version for Windows (that will need at least an i3/i5/i7 of 3000 series or better, or a recent AMD), and offer both the blindly built AVX version and a tested SSE version on Windows.  For Linux, I'll offer both a verified SSE & AVX version...

A hint for doing DolbyA decoding is that sometimes it SEEMS like they did a DolbyA encode on the normal stereo signal, and additionally on the M+S variant.  I have sometimes been successful doing an L+R decode first, and then an M+S decode with an equal threshold, a 1dB down threshold or a 3dB down threshold.   For the M+S (the second decode), I use the third arg of the parenthesized cmd argument to specify a gain of 0.5dB -- which makes up for the default loss of the first decoder (again, the first is usually L+R.)   The noise reduction and transient behavior are sometimes better when two phases of DolbyA are used (esp when not using the 'restoration processor').  So, essentially, the full pseudo-DolbyA command pipeline (for both L+R & M+S decode) looks something like this:

sox infile.flac --type=wav - rate 96k | da-avx --cmd="(l,-1.0,0.00)" | da-avx --cmd="(m,-4.0,0.50)" | sox - outfile.flac

(which means, L+R decode, threshold -1dB, no input gain and M+S decode, threshold -4.0dB, 0.50dB input gain).
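
For readers unfamiliar with the M+S variant, here is a minimal sketch of a mid/side matrix in Python. It only shows the general idea of turning L/R into M/S and back; the 0.5 scaling and the function names `lr_to_ms` and `ms_to_lr` are assumptions, and nothing here is taken from how the decoder's 'm' mode is actually implemented.

```python
def lr_to_ms(left, right):
    """Convert left/right samples to mid/side (assumed 0.5 scaling)."""
    mid = [(l + r) * 0.5 for l, r in zip(left, right)]
    side = [(l - r) * 0.5 for l, r in zip(left, right)]
    return mid, side


def ms_to_lr(mid, side):
    """Inverse of lr_to_ms: rebuild left/right from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left_in = [0.5, -0.25, 1.0]
right_in = [0.25, 0.75, -1.0]
mid, side = lr_to_ms(left_in, right_in)
left_out, right_out = ms_to_lr(mid, side)
print("mid: ", mid)
print("side:", side)
print("round trip ok:", (left_out, right_out) == (left_in, right_in))
```

An M+S decode pass would run the gain control on the mid and side signals rather than on left and right, then matrix back afterwards.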

Also, it seems like some vocals might have been processed by some kind of sidechain HF enhancement, and the pseudo-DolbyA can help undo that also: instead of using the 'l' command, use the 'hl' command, which only enables the top two bands of the decoder in the standard L+R mode.
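
When trying several threshold combinations it can be handy to generate the decoder stages instead of typing them. The following Python sketch only formats strings in the "(mode,threshold,gain)" layout shown in the pipeline above; the helper names `da_cmd` and `da_stages` are hypothetical, and the 'hl' values used here (-1.0, 0.00) are placeholders rather than recommended settings. It assumes 'hl' takes the same three fields as 'l' and 'm'.

```python
def da_cmd(mode, threshold_db, gain_db):
    """Format one --cmd argument in the "(mode,threshold,gain)" layout shown above."""
    return '--cmd="({},{:.1f},{:.2f})"'.format(mode, threshold_db, gain_db)


def da_stages(passes):
    """Join one da-avx stage per (mode, threshold, gain) tuple with pipes."""
    return " | ".join("da-avx " + da_cmd(*p) for p in passes)


# L+R decode followed by an M+S decode 3 dB further down, as in the example.
two_pass = da_stages([("l", -1.0, 0.0), ("m", -4.0, 0.5)])
# Top-two-bands-only decode; the numbers are placeholders (assumption).
hf_only = da_stages([("hl", -1.0, 0.0)])
print(two_pass)
print(hf_only)
```

The strings still need sox on either side of them to supply and collect the WAV stream.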

For example (using the pseudo-DolbyA & restoration processor both), the Brasil'66 stuff really needs expanding (really badly -- it is INCREDIBLY beautiful when properly decoded), and the DolbyA decode helps a lot -- but the results are still missing the intensity of the percussion.   The 'restoration processor' rebuilds a significant amount of percussion.  IN SOME WAYS, even though the 'restoration processor' is good at NR, it is the 'transient recovery' that one really benefits from much of the time.

The 'restoration' software internals are a bit of a tricky beast, because of trying to find a good balance between the release times and the 'modulation noise/surging' issues.   I think that there will eventually be at least three (3) levels of dynamics management along with the adjustable expansion ratio.  (On the large scale, the expansion ratio is dB linear, but in the short term it is simple linear -- like a FET compressor approximately is -- or like DolbyA, which is approximately simple linear expansion, not really dB linear.)*
  * When I mention dB linear or simple linear, I am speaking more of the domain of the release/attack time characteristics, and less of the actual compression ratio.   A 'linear' release has a kind of exponential characteristic, where it is much faster if it is far away from the target, but slows quickly as it approaches the target.   dB linear moves at a pure dB rate until it hits the goal exactly.  So, the dB linear scheme REALLY hits the target, but the 'linear' scheme just keeps approaching the target closer and closer.   The shape of the dB oriented attack/release appears to match what the ears like to hear, more so than the more common simple linear scheme as used on most direct FET compressor/expanders.
  * An example of the 'linear' sound for a compressor is that as the compression amount increases (the depth of compression increases), the effective release speed also increases -- because the gain has further to recover, the gain change is quicker in dB/sec if the amount of gain release is large.   With dB linear release, no matter how many dB of compression are being done, the release speed is the same in dB/sec, and the ear hears it (the density) as pretty much the same as lighter compression (other than the larger eventual gain change.)   A linear compressor is better to use when you want to increase the 'compressed sound' or 'density' by increasing the depth of compression.   A dB linear compressor is better when you want the sound density to stay similar whether you are using 6dB of compression or 12dB of compression.
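
The difference between the two release shapes can be seen in a small simulation. This is a simplified sketch, not the processor's code: `linear_release` models the 'simple linear' case as the linear gain relaxing exponentially toward unity (RC style), `db_linear_release` moves at a fixed dB/sec, and the time constant, rate and step size are arbitrary assumptions.

```python
import math


def linear_release(depth_db, tau_s=0.1, dt=0.001, stop_db=-0.5):
    """'Simple linear' release: linear gain relaxes exponentially toward 1.0.

    It never lands exactly on 0 dB, so the trace stops at stop_db.
    """
    gain = 10.0 ** (-depth_db / 20.0)
    k = math.exp(-dt / tau_s)
    trace = [20.0 * math.log10(gain)]
    while trace[-1] < stop_db:
        gain = 1.0 - (1.0 - gain) * k
        trace.append(20.0 * math.log10(gain))
    return trace


def db_linear_release(depth_db, rate_db_per_s=60.0, dt=0.001):
    """dB linear release: constant dB/sec, lands exactly on 0 dB."""
    level = -depth_db
    trace = [level]
    while level < 0.0:
        level = min(0.0, level + rate_db_per_s * dt)
        trace.append(level)
    return trace


def initial_rate(trace, dt=0.001):
    """Release speed in dB/sec over the first step of a trace."""
    return (trace[1] - trace[0]) / dt


for depth in (6.0, 12.0):
    print("depth %4.1f dB: linear starts at %6.1f dB/s, dB linear at %5.1f dB/s"
          % (depth, initial_rate(linear_release(depth)),
             initial_rate(db_linear_release(depth))))
```

With these assumed settings the 'linear' release starts noticeably faster from 12dB of gain reduction than from 6dB, while the dB linear release starts at the same dB/sec from either depth, which is the behavior described in the second note above.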

So, usually, when being used at the default expansion ratio of approx 0.20 (it is approx 1:1.20), there is very little need to change the dynamics characteristics on most music material.   However, at much higher expansion ratios (0.30 -> 0.70), the dynamics become super important.  Right now, the dynamics are kind of middle of the road -- but I am going to try to put together a slower and a faster mode.   The dynamics tuning is not just a dB/sec release rate type thing, but is the idea that the first part of the attack/release is fast-linear up to several dB.   After that, it defaults to a dB linear scheme.   It is the mix of 'how much' straight linear and 'how much' dB linear that is a big part of the release scheme.   There appears to be a 'fixed' set of dB/second rates that ears like to hear -- so that rate is currently fixed to a set range of values (it is also dynamically calculated.)
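
One way to picture the 'fast-linear first, then dB linear' idea is the sketch below. It is only an interpretation under assumptions: the function name `hybrid_release`, the 3dB span, the 20ms time constant and the 40dB/sec rate are all made up for illustration, and the real processor calculates its rates dynamically.

```python
import math


def hybrid_release(depth_db, linear_span_db=3.0, tau_s=0.02,
                   rate_db_per_s=40.0, dt=0.001):
    """Release that is fast 'simple linear' for the first few dB, then dB linear.

    All parameter values are hypothetical. If the depth is not larger than
    linear_span_db, the fast phase is skipped entirely.
    """
    level = -depth_db
    trace = [level]
    switch_level = level + linear_span_db if depth_db > linear_span_db else level
    # Phase 1: linear gain relaxes exponentially toward unity (fast).
    gain = 10.0 ** (level / 20.0)
    k = math.exp(-dt / tau_s)
    while level < switch_level:
        gain = 1.0 - (1.0 - gain) * k
        level = min(switch_level, 20.0 * math.log10(gain))
        trace.append(level)
    # Phase 2: constant dB/sec until the target is hit exactly.
    while level < 0.0:
        level = min(0.0, level + rate_db_per_s * dt)
        trace.append(level)
    return trace


release_trace = hybrid_release(12.0)
print("steps in trace:", len(release_trace))
print("first step moved %.2f dB" % (release_trace[1] - release_trace[0]))
print("final level:", release_trace[-1])
```

Changing `linear_span_db` is the 'how much straight linear versus how much dB linear' knob in this toy model.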

Also, the attack scheme is important but does not seem to need as much adjustment -- I think that I have found a good tradeoff where transient recovery is very effective, but it is not overly sensitive to transients (which would cause a gain-hang effect or surging.)   Also, I am using an interesting attack scheme which really minimizes the intermodulation on the attack edge.  I haven't exhaustively tuned it yet, but hope that I can actually measure the intermodulation effects at higher frequencies.  (I do have a scheme/method to measure the effects of attack time on LF ranges -- and they can be very obvious once the intermodulation effects are recovered from the total signal: one can actually hear the grinding/almost static-like noise that is buried in the signal -- otherwise hidden because of masking.)


New version of pseudo-DolbyA -- fixed some thresholds that were a little off

Reply #82
I think that the 3k-9k and the 9k-20k thresholds were a little off.   I changed them -- and made them a little more precise in the code also...  The HF sound in intense situations (e.g. ABBA type stuff, my 'acid test' for severe HF) is now significantly cleaner; it had some irritating problems until now.  The error was not usually noticeable, but it might have been measurable.

New versions of the Linux, Windows and source.

Sorry for the error.

Strong suggestion to try to use the latest pseudo-DolbyA decoder - 23Feb or aft

Reply #83
I just did a comparison between the previous 19Feb version of the pseudo-DolbyA and the 23Feb version (that is, the most recent.)  When checking on my most stringent tests USING REAL DOLBYA SOURCE, it becomes very clear that the various attack/decays of the two highest frequency ranges match MUCH better on the 23Feb version.  So, whether or not the newer version is perfect (I make no such guarantee -- I cannot test against a REAL, guaranteed DolbyA -- but there is someone who might be volunteering to help), the new version has a MUCH MUCH smoother sound quality to it.   That is, it has a smoother sound even though there is less HF attenuation.

Because of the way that DolbyA did the two high frequency ranges, their attack/decays can interact, and it appears to be important to get the match between the two fairly close.  (That match is more the design of the DolbyA attack/release, not so much the external device.)  The best way to describe the improvement between the two recent pseudo-DolbyA versions is that it SEEMS like the 3-9k and the 9-20k attack/decays match more 'hand in glove', so that there aren't as many undesired undulations between the two gain ranges.

So, I am trying to make clear -- the new version of the pseudo-DolbyA does sound much better.  If you are using/playing with the pseudo-DolbyA at all -- please try to use the newest version.

This does prove one thing -- the design of a DolbyA decoder has lots of little pitfalls.   It isn't super-difficult, but it is quite tedious to try to match the behavior of the old HW design.

John Dyson

Re: Have a working 'expander' based on DolbyA (not same design) -- works well.

Reply #84
With some help from another individual (don't have permission yet to divulge the name), I/we found a bug in my specification of the threshold.   I had previously been suggesting a threshold of 0dB (the middle arg to --cmd).   The better value would be 10.0 .   I am in the midst of a re-review of the software based upon these corrected threshold values (which do appear to be producing better NR, and less need to use repeated decoding!!!)

It is so good to get help when needed.  I do learn, and am happy to fix mistakes when they happen.   A new version with a few minor corrections will be released within a day.

The good news is that ALL previously useful results are still valid.  The new results will be better yet!!!

