The 'restoration processor' (PLEASE help me figure out a name) first release is now available. This is missing a few relatively unimportant things -- it is already very complex, and I have vastly simplified the usage when compared with my original (personal) versions.
This program is an expander that works especially well -- it is good for helping undo excessive compression, and for doing some noise reduction and transient recovery. This program WILL NOT help with record ticks/pops, but can help with turntable rumble -- it also has specific low frequency noise reduction (not just hiss reduction.) Frankly, depending upon the way that the recording was made, the NR can be amazingly good at removing hiss and room vibrations on old recordings. The transient recovery is perhaps even more important in some cases, because the low, hiss-prone passages do not occur very often, while the bright parts of the music occur more often.
Often, I have used this program directly in conjunction with the pseudo-DolbyA (the programs super-add to become much more than each individually), but pseudo-DolbyA is not needed to gain benefit from this program.
Behaviors that might surprise you -- it is not tedious to use -- it doesn't have strong expander artifacts -- it does not skew frequency response very much.
Usage that might surprise you -- the attack/release type stuff is not a normally exposed tweak. The only adjustments are "expansion ratio", "threshold", "maximum gain", "minimum gain". All parameters are set in terms of dB, except the expansion ratio which is a number between 0.00 and 0.99 (please don't expect anything above 0.70 to do anything useful, even 0.50 is a lot of expansion.) The attack/release times are about as automatic as they can be.
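To put numbers on the expansion ratio: elsewhere in this thread an --exratio of 0.30 is described as a 1:1.30 expansion, so a level difference of D dB on the input becomes roughly D*(1+exratio) dB on the output. The snippet below is only that static arithmetic, as a rough sketch. The function name expanded_span is made up for illustration and is not part of ex-avx, and since the real program is multi-band and adaptive, the gain changes you see with --info will differ.

```shell
# Hypothetical helper (NOT part of ex-avx): static ratio arithmetic only.
# usage: expanded_span EXRATIO SPAN_DB
expanded_span() {
    # An exratio of r is treated as a 1:(1+r) expansion of a level span in dB.
    awk -v r="$1" -v d="$2" 'BEGIN { printf "%.2f\n", d * (1 + r) }'
}

expanded_span 0.30 10   # 10 dB of level difference at 1:1.30
expanded_span 0.50 10   # the same span at 1:1.50
```

With these inputs a 10 dB span becomes 13.00 dB at 0.30 and 15.00 dB at 0.50, which gives a feel for why 0.50 is already a lot of expansion.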
Simple usage: This is a command line program (sorry), that runs on Linux and Windows (64 bit versions only, even though I do have a 32 bit version for Linux -- very ugly and slow.) The program likes to run on recent Intel/AMD desktop type machines (i5/i7 are the best, and i9 if you have one.) It can benefit from up to 6 cores, but 4 cores are quite usable (which is what I have.)
Here is the simplest command: ex-avx --info <infile.wav >outfile.wav
the .wav file must be 44.1k, 48k or 96k (96k very much preferred.)
must also be 16bit signed (CD format) or 32 bit floating point (very much preferred.)
(the --info switch is optional, but very useful -- you can see the dBs change in each band.)
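If you want to check a file against the format rules above before running the program, something like the following sketch might help. The function check_format is hypothetical (it is not part of ex-avx or sox), and it only tests the numbers you give it. One way to get those numbers is `soxi -r file.wav` for the sample rate and `soxi -b file.wav` for the bit depth.

```shell
# Hypothetical pre-flight check (NOT part of ex-avx).
# usage: check_format RATE_HZ BITS ENCODING   (ENCODING is "int" or "float")
check_format() {
    # Accepted sample rates: 44.1k, 48k or 96k.
    case "$1" in
        44100|48000|96000) ;;
        *) echo "unsupported rate: $1"; return 1 ;;
    esac
    # Accepted sample formats: 16 bit signed integer or 32 bit floating point.
    case "$2/$3" in
        16/int|32/float) ;;
        *) echo "unsupported sample format: $2-bit $3"; return 1 ;;
    esac
    echo "ok"
}

check_format 96000 32 float          # the preferred format
check_format 44100 24 int || true    # rejected: 24 bit is not in the list
```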
Example command to convert to preferred format, process it, then back to CD format:
sox infile.wav --type=wav --encoding=floating-point --bits=32 - rate -v 96k | ex-avx | sox - --encoding=signed-integer --bits=16 CDoutfile.wav rate -v 44.1k
(the long pipeline can be split into single commands if you are uncomfortable with things like this -- I know, GUI is the way to go nowadays, right? :-).)
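As a sketch of what the single commands could look like, the helper below just prints the three steps with temporary files in place of the pipes. The function name single_steps and the temporary file names tmp96.wav and tmp96proc.wav are made up for illustration; the sox and ex-avx arguments are the same ones used in the pipeline above. It assumes file names without spaces.

```shell
# Hypothetical dry-run helper: prints the pipeline as three separate commands.
# usage: single_steps INFILE OUTFILE
single_steps() {
    # Step 1: convert the input to 96k, 32 bit floating point.
    printf 'sox %s --type=wav --encoding=floating-point --bits=32 tmp96.wav rate -v 96k\n' "$1"
    # Step 2: run the expander, reading stdin and writing stdout.
    printf 'ex-avx <tmp96.wav >tmp96proc.wav\n'
    # Step 3: convert the result back to CD format.
    printf 'sox tmp96proc.wav --encoding=signed-integer --bits=16 %s rate -v 44.1k\n' "$2"
}

single_steps infile.wav CDoutfile.wav
```

You can read the printed commands and type them one at a time, or pipe the output into sh once you are happy with it. Remember to delete the temporary files afterwards, since 96k floating point files are large.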
Amazingly simple, right? Well... The most interesting additional switch is the '--exratio=0.xx' switch. So, the default is about 0.20 for the 'exratio'. I suggest that exratio up to 0.40 is usually pretty easy to deal with, like this:
ex-avx --exratio=0.40 --info <infile.wav >outfile.wav
READ THE README.txt FOR MORE INFO...
The bad news -- I am not able to test/verify the AVX version on Windows, but it should work. The SSE3 version on Windows does work (I verified it), but it is very slow except on the faster machines. That is, I tested it on an ATOM silvermont, and it runs approx 6X slower than realtime. The same SSE3 code runs approx 1X realtime on my 4 core Haswell desktop and the AVX version runs approx 2X realtime on my 4 core Haswell desktop. If you have an 8 core recent i9, the AVX version should easily run 3-5X realtime.
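To turn those realtime factors into waiting time: processing time is roughly the track length divided by the speed factor, where 2 means 2X realtime and about 0.1667 means 6X slower than realtime. The helper name processing_minutes is made up for this sketch, and the figures are only as good as the rough speeds quoted above.

```shell
# Hypothetical helper: rough processing time from a realtime speed factor.
# usage: processing_minutes TRACK_MINUTES SPEED
# SPEED is the realtime factor: 2 = 2X realtime, 0.1667 = about 6X slower.
processing_minutes() {
    awk -v t="$1" -v s="$2" 'BEGIN { printf "%.1f\n", t / s }'
}

processing_minutes 4 2        # AVX on a 4 core Haswell
processing_minutes 4 1        # SSE3 on the same machine
processing_minutes 4 0.1667   # SSE3 on the Atom
```

For a 4 minute song this comes out at about 2, 4 and 24 minutes respectively.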
I know that the program runs slowly, and I had to disable some of the quality for SSE3 versions on Windows -- but not much quality is really lost (mostly some transient recovery is weakened.) All AVX versions have full quality.
POST QUESTIONS (or email me through the messaging or firstname.lastname@example.org) if you have troubles or want help.
The attachments include the Linux programs, the Windows programs, and the README. After you unpack them, choose the program that is best for you -- you can try each one to see which one is best. Nothing really bad should happen if you choose the wrong one :-), other than a fault message of some kind. Sorry for not packaging the programs any better -- it is actually possible to merge some of the machine specific code into one program, it is just too much work for right now :-).
GOOD LUCK, and tell me if you want help.
Additional comment -- there are some potentially useful and detailed postings about this full scale expander program in the pseudo-DolbyA thread in this forum. They talk in more detail about some of the command line arguments and some of the details/hints of the expander behavior.
If this expander adhered to any one 'theory' of operation, it wouldn't sound very good -- it would sound very much like a typical expander (pumping, noise modulation, etc.) I have tried to mix the idea of a dB and a linear attack/decay expander, and let the program adapt somewhat between the two modes.
I actually found that it appears that our 'hearing' generally likes a certain attack/decay time range in the long term, and a certain attack/decay time range in the short term. The way (shape) of the attack/decay curve is also different between the two modes (there aren't really two modes, but a kind of adaptive behavior between the linear attack/decay and the dB-linear attack/decay.)
I am trying to figure out if the way that this expander works is unique. I haven't found any resources that talk about the kind of 'tricks' being done in this piece of software. I am thinking that these (kind of one and a half) special techniques might be a proprietary feature in other expanders/noise reducers... I truly DO NOT KNOW.
I question even bringing up the matter about some of the subtle internal operating techniques, but I also know that I'll never make money on this software, so I will eventually be providing all information that makes sense for others to use. Just trying to determine if some of these things in the software are 'special', 'competent' or simply 'stupid' or 'obvious' to those who do this kind of software for a living!!!
Please give me feedback as to the performance of the software, and if you have any troubles or find severely ineffective behavior in the software, please tell me. I have plenty of time to make this thing work well. As quickly as the software stabilizes, I'll be able to consider releasing the source code. Right now, LOTS of things can still change, and just blasting out source code with 100's of lines of changes every several days is not really helpful for anyone. I do intend to release the source code (and all of the potentially interesting tricks in the code.)
I'd be glad to help test and evaluate it, but all my computers are Macs, sooo.... Good luck with it anyway; this kind of thing could be immensely useful for music and speech processing.
Hopefully, if this (expander) works as well as I am trying to make it work -- there will be enough information available for other people who are 'Mac' programming experts to adapt the design to that platform. Maybe someone who knows GUI programming might even do a GUI port and/or plugin port of the code also??? If there is enough interest, I intend to make the code clear enough (and the concepts clear enough) for others to continue with the effort -- if they want.
I have something (I believe) pretty impressive -- a short part of 'casino-royale' from Herb Alpert back in the 1960's. The unprocessed version is louder, and quite compressed sounding. Then.... listen to the fully processed version... The processed version is MUCH MUCH more natural. If you play it at regular listening levels, the cornet comes through incredibly intensely and doesn't sound nearly as compressed. I thought that I detected some slight mutual gain change (between parts of the music), but after review found that the change derived from the original material -- so the expansion artifacts are not noticeable to me. I could have used higher expansion amounts, but this was just a normal amount of useful expansion. This version resulted from both the pseudo-DolbyA and the restoration processor, the latter running at a 1:1.30 expansion ratio (the expansion setting was --exratio=0.30.) The output of the pseudo-DolbyA chain was fed directly into the restoration processor with the following args: --exratio=0.30 --exthresh=-18 --info
I placed the full version on my repository: https://spaces.hightail.com/space/pG4t4ZFnyB
(look for the name 'casino-fullnoproc' and 'casino-fullproc'.)
PS: (From what I have heard -- part of the OS that I had written a significant part of -- the FreeBSD OS was used to create the new Mac OS... however, they didn't use the kernel code that I wrote, but mostly took parts of the userland stuff. I wrote the VM code and various parts of the I/O subsystems and a few other pieces -- AFAIR, did I also write the tty I/O subsystem, or was it my collaborator David??? Been over 20yrs ago, hard to remember it all!!!)
As a matter of 'full disclosure', and trying to avoid confusion -- the 'Casino Royale' example was done using this very long command line. I'll show the full command line, then separate it into its components to make it easier for non-Unix/Linux users to understand. (I know that command line things can be perplexing to pure GUI users -- no biggie, just trying to help here):
Here is the original command line (copied directly from the command line editing feature of bash):
sox "/music/Herb/H*/07*" --type=wav --encoding=floating-point - gain 0 rate -v 96k| /home/jdyson/ap/DolbyA/da-avx --cmd="(l,-1.5,0.0)" --info | /home/jdyson/ap/DolbyA/da-avx --cmd="(m,-4.5,0.50)" --info | /home/jdyson/ap/newagcwork/ex-avx --exratio=0.30 --exthresh=-18 --info | sox - casino-proc.flac gain -bn
Here is each phase of the command -- note that the '|' character means take the output of the previous command and use it as input to the next... Here we go!!!
Read the 'Casino Royale' file from my archives, convert it to a 96k, floating point .wav file with a gain of 0dB:
sox "/music/Herb/H*/07*" --type=wav --encoding=floating-point - gain 0 rate -v 96k
Do a normal L+R DolbyA decode, with a threshold of -1.5dB.
/home/jdyson/ap/DolbyA/da-avx --cmd="(l,-1.5,0.0)" --info
Do a normal M+S DolbyA decode, with a threshold of -4.5dB. Add a 0.50dB makeup gain due to the 0.50dB loss natural to the DolbyA expander design.
/home/jdyson/ap/DolbyA/da-avx --cmd="(m,-4.5,0.50)" --info
Run the 'restoration processor', with an expansion ratio of 0.30 (actually 1.30), and a threshold of -18dB.
/home/jdyson/ap/newagcwork/ex-avx --exratio=0.30 --exthresh=-18 --info
Convert the output of the 'restoration processor' into a flac file, and normalize the output level to 0dB. (.wav files are too big!!!)
sox - casino-proc.flac gain -bn
NOTE: the benefit of doing M+S decode was a surprise to me, but it is quite real. It is my guess that NR was done in that direction to help with QUAD encoding or something like that (or to gain better NR.) The NR benefit of the M+S NR is almost as good as the L+R, making the background hiss much less obvious. It took me probably a month of experimentation to figure out why I kept noticing aspects of DolbyA compression even after doing a normal L+R expansion. I tried a second L+R, and it wasn't always correct -- so I tried M+S on a lark, and it sounded better. This is why I added the M+S capability directly into the pseudo-DolbyA decoder.
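For readers unfamiliar with M+S: the usual mid/side convention is M = (L+R)/2 and S = (L-R)/2, which is undone by L = M+S and R = M-S. Nothing in this thread says which scaling da-avx uses internally, so the factor of 1/2 here is an assumption, and the function names are made up for this sketch. It only shows the conversion on a single pair of sample values.

```shell
# Hypothetical helpers: textbook mid/side conversion on one sample pair.
# The 1/2 scaling is an assumed convention, not taken from da-avx.
# usage: to_mid_side LEFT RIGHT   -> prints "MID SIDE"
to_mid_side() {
    awk -v l="$1" -v r="$2" 'BEGIN { printf "%.2f %.2f\n", (l + r) / 2, (l - r) / 2 }'
}

# usage: to_left_right MID SIDE   -> prints "LEFT RIGHT"
to_left_right() {
    awk -v m="$1" -v s="$2" 'BEGIN { printf "%.2f %.2f\n", m + s, m - s }'
}

to_mid_side 0.50 0.10     # encode an L/R pair
to_left_right 0.30 0.20   # decode it again
```

The point is that a decoder working on M and S sees a different pair of signals than one working on L and R, so a recording whose NR was applied in M+S form would need the expansion done in that form too.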
Expander usage hint -- one of the 'hidden' features. You might notice the depth of expansion is somewhat conservative vs. time, but is still effective. If you want MUCH MORE depth of expansion -- I mean significantly more noise reduction between syllables, really fast -- there is a parameter that you can set. It is one of the temporary commands, because I haven't put together a coherent command structure yet. It will change form in the future, but will effectively be supported forever by a new command. Here is the option that you can use:
When these bits are set, the speed of the expansion is much faster. This is NOT just a tuning of speed, but changes the filtering structure. The 0x100000 bit adds in a relatively fast gain riding filter in the dB domain. The 0x8000 bit disconnects a part of the bigger scale dB release time filter. So, basically the gain filter is changed from a dynamic standard release time, and is changed into a fast gain-riding filter. Eventually, this will be a single feature -- but I leave these switches available for experimentation.
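Setting both bits at once just means OR-ing the two values together. The sketch below shows only that arithmetic. The helper name combine_bits is made up, and the actual switch that takes the value is not shown here, so check the README for how to pass it.

```shell
# Hypothetical helper: OR several bit masks together and print the result in hex.
# usage: combine_bits MASK [MASK...]
combine_bits() {
    acc=0
    for m in "$@"; do
        acc=$(( acc | $m ))
    done
    printf '0x%x\n' "$acc"
}

combine_bits 0x100000 0x8000   # fast gain-riding filter + release filter disconnect
```

For the two bits described above the combined value is 0x108000.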
When running the 'restoration processor' on one example song (Take a Chance On Me from ABBA), the HF1 gain range goes from approx -3dB to -1dB without these bits to approx -8dB to -1dB with them. The processor maintains high quality at extremely fast attack/decay times (esp fast decay times) due to careful management of aliasing and intermodulation distortion.
So, in effect, this modification decreases the 'modulation noise' by about 5dB. It might not always be applicable, so I avoided making it permanent. After running some tests, with a few minor mods, this MIGHT become the default. There are even more aggressive capabilities in the expander, but with this mod, it seems asymptotically about as good as one can do...