Here is one more, probably more typical, song: Mais Que Nada. You can hear the NR there, but of course the transient processing is pretty good also.
I wanted to make sure that there isn't too much loss of interest because of the waterfall style of development (and the ongoing learning exercise on my part). I have done audio signal processors for many years, but this is the most out-of-the-box one that I have done. The current version, which produced this processing example, does a vast NONLINEAR calculation to rebuild transients (and, as a side effect, helps with NR). This is not the typical detector->filter->gaincontrol type of design, even though it does have those aspects. I intend to release this when it settles down, but fortunately/unfortunately, every week or so I find a new change/minor redesign that makes a major improvement. I am working for us all on this. (Sorry for the lack of fade on the example -- I screwed up when making it...) There is another example on the main topic that takes a song distorted with a lot of live-style 1960s recording technique and makes perhaps a 30-50% improvement... That one is more substantial, perhaps a little too aggressive, but shows the ability to do extreme processing when needed.
The previous example, Mais-new.mp3, was processed using almost full-bore processing options. As such, a few minor artifacts were left in (a little freq resp skew, etc.). Yep -- this processor mostly maintains a relatively flat freq response even with extreme expansion -- lots of subtle stuff going on. This version, however, has about 1/2-2/3 of the expansion, but still the full dynamics restoration -- listen to the percussion -- very sharp without a hard edge. If it wasn't mathematically fairly accurate, it would be harsh sounding. Also, this demo was run with a less deep (less complete) series expansion because of the amount of CPU time. The processor can run anywhere from realtime all the way down to approx 1/3 realtime, depending on various parameters including the depth of the transient recovery. If this was a simple edge enhance, it would be very fast and easy, but it isn't.
I have another NON-IDEAL-SITUATION result from the restoration processor. This is "The Lonely Bull" from Herb Alpert. If you listen carefully at the beginning of the ORIGINAL material ON THE LEFT CHANNEL, you'll notice a veil of primarily mid-high frequency hiss. It isn't severe, but it is noticeable throughout the piece. The Bull-nr version has the processor running at a fairly low processing level -- mostly just enough to help with the noise and to add just a bit of dynamic range. Currently, the processor is NOT optimized for NR because I have biased it towards being better for dynamics. I am going to add some more noise handling this week, but even currently it does a noticeable job. Because of the way that the ears work, SOME of the high-level and low-level processing has to be a little different. There currently is NO special handling for hiss, but I have some ideas, and the transient processing does help because it hides the hiss more between syllables (the transient handling works mostly for upward level transients, but has the same basic design mechanisms for instantaneously decreasing gain also). Currently, the transient handling on the low-level side is muted a bit, and I need to design some code that will improve that without adding distortion/etc. The examples are 'Bull-orig.mp3', which has a snippet of the original material, and 'Bull-nr.mp3', which is the noise-reduced and slightly expanded copy. I NORMALLY KEEP PRISTINE FLAC COPIES FOR A WEEK OR SO, so if someone wants to carefully analyze the sound, let me know. Also, I encoded the mp3s at a fairly high rate. AS ALWAYS -- it AINT PERFECT, and constructive criticism is happily accepted.
As I suggested before, I made a few changes to the downward transient handling so that HOPEFULLY there is little increase in distortion, but a significant (okay, barely audible :-)) increase in noise reduction. Well, here it is. Most of the veil of hiss that was left over from the previous version of the processor is removed. I guess the common term might be a 'modulation noise' type of effect -- previously there was more of a lag in handling a downward decrease in level. Even the original version was pretty fast, significantly faster than a typical RC decay, without creating intermod (in this case, intermod would be the effect of gain control on the shape of the waveform, thereby creating sidebands or modulation products). Normally, just depending on the RC product might either allow hiss to get through during the decrease or be so fast as to modulate the signal. This version does neither: for constant (or relatively constant) signals, there is no excess distortion. When the signal level increases or decreases, the change is very rapid (much faster than normal RC delays), but it is not simply accelerated by some kind of peaking or high-pass filter. Before, I had blunted some of the handling of the downside of the signal level, but now I let it go hog-wild. So, even though this has passed my preliminary tests/verification, the full test suite takes almost a day to run, so it might not be 100% correct. I simply sometimes become overly proud of my baby. I have heard nothing that implies that the signal will have more noticeable distortion, BUT IT DOES SOUND QUIETER, especially in the inter-syllable level decreases. I have called this demo version 'Bull-nr3', and the original is exactly the same as the previous one.
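For anyone who wants to see the 'typical RC decay' lag in numbers, here is a rough sketch of the conventional one-pole attack/release envelope follower that the post contrasts against. This is NOT the restoration processor's method; the function names and the time constants (1 ms attack, 100 ms release) are assumptions picked only for illustration.

```python
import math

def rc_envelope(samples, sample_rate, attack_s=0.001, release_s=0.100):
    """Conventional one-pole (RC-style) peak envelope follower."""
    a_att = math.exp(-1.0 / (attack_s * sample_rate))
    a_rel = math.exp(-1.0 / (release_s * sample_rate))
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        # Fast coefficient while rising, slow coefficient while falling.
        coeff = a_att if level > env else a_rel
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

def release_lag_seconds(sample_rate=48000, release_s=0.100, drop_db=20.0):
    """Time the envelope needs to fall by drop_db once the input goes silent."""
    burst = [1.0] * sample_rate      # 1 s of full-scale level so the envelope settles
    silence = [0.0] * sample_rate    # then 1 s of silence
    env = rc_envelope(burst + silence, sample_rate, release_s=release_s)
    target = env[sample_rate - 1] * 10.0 ** (-drop_db / 20.0)
    for n in range(sample_rate, len(env)):
        if env[n] <= target:
            return (n - sample_rate + 1) / sample_rate
    return None

if __name__ == "__main__":
    print(f"20 dB release lag: {release_lag_seconds():.3f} s")
```

With a 100 ms release constant, the detector needs roughly 0.23 s to register a 20 dB drop, and that window is where hiss can get through. Shortening the constant closes the window but makes the gain follow the waveform itself, which is the intermod trade-off described above.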
This demo came from my test run... Geesh, it takes a long time. I got so frustrated that I didn't run the full-depth traversal of the transient recovery... If I run the whole thing, it takes about 20 mins for a 5 min song on a 4-core Haswell CPU. If I run with fewer terms expanded, it runs faster than realtime, so that is what I chose. This demo shows some progress with Pais Tropical from Sergio Mendes & Brasil '66. I tend to use the worst, fuzziest (but fixable) test songs for my benchmark. It is easy to do easy stuff like the Carpenters (yes -- it really improves their recordings -- it helps to lessen the purposeful hard edges on Karen's voice, and clearly shows Richard's 's' problem also -- it kind of gets rid of certain kinds of processing). So, the processed version of Pais Tropical is not going to be smooth and non-crunchy, but the processor has mitigated a LOT of the crunch, along with some EQ (I don't normally play with EQ, but in this case they obviously did something to the audio). If I just try to EQ the original, it is still very ugly sounding (even though it is a wonderful song). With the processing (at a medium/high level) and a little bit (really, not much) of EQ, the sound is more up-to-date. It is still crunchy, but not quite as bad, and the male voices are much better (not perfect). I am NOT demoing perfection, but rather small amounts of incremental progress. I have some really, really good sounding stuff too, but it is so very easy for the processor to remove some 1960s/1970s compressor artifacts (and to remove voice enhancement to an extent). This Pais Tropical and the ABBA Super Trouper are VERY difficult.
Here is a very short Carpenters example where the voice seems a bit clearer. There is absolutely no EQ or anything like that. This is all just the restoration processor (which has no EQ when in restoration mode). The program does have the ability to modify the freq response when compressing, but that feature has been deprecated because even the multi-band compressors have the ability to maintain a fairly flat input/output frequency response. The compressors are currently disabled, however, and the limiters only kick in when the signal starts getting too hot (and the limiters are special beasts also -- not your grandma's limiters) -- but I am not focusing on or caring about the 'finalizing' mode right now. During this clip, there is absolutely no limiting going on, and any time I produce a demo, I seldom let limiting hit the signal (but I have to watch carefully, because I doubt anyone can detect when the processor goes into up to 3dB of limiting). It is just not accurate to let the limiting kick in, so I avoid it.
* One note -- the mix of the restoration/finalizing features is meant to support undoing the old traditional compression and the sometimes associated nasty sound, and replacing it with either very careful compression and/or just a very careful limiting phase... The expander part of the restoration processor (especially the transient recovery) can sometimes push the dynamic range -- yet sound pretty good.
I am thinking about the 'carpenters' example right before... I played the entire piece, and it sounds pretty good except for two things. First, I think that I added a bit too much dynamic range to it -- if I play it at a very low level at the start, it is pretty loud near the end (that is, of the entire Rainy Days and Mondays song). That matter is easily tunable, but the other problem is that there is some ducking in the sibilance of Karen's voice. I heard a rumor that they were using part of a DolbyA to boost the highs in her voice, so I pulled out the trusty pseudo-DolbyA processor and it did a pretty good job of reducing some of the ducking. The bad news is that the pseudo-DolbyA acts on the entire signal and not just her voice. I could play matrix games to bring up her voice and then rematrix it, but that is a singular solution for a single song. Not worth the trouble or tuning effort.
Back to the long-term dynamics being too strong... I did a little experiment where I basically took the gain control signal and split it into two bands -- LF for slow changes in gain and HF for all other changes in gain. I did an effective sqrt on the LF gain signal (well, not quite, but essentially I did), recombined the LF & HF signals (handling time problems, of course), and used the modified gain control on the audio channel... So, I got short-term dynamics of essentially 1:1.5, but long-term dynamics of 1:sqrt(1.5). It kind of worked, and didn't have artifacts except a small amount of excess surging, which I know can be corrected by changing the gain control LF 3dB freq. I would like to have separate control of the long-term and short-term dynamics processing, but I need to think about it, and the benefit seems smaller than the effort might be. The transient processing is pretty much fixed (to a constant) because the math works out correctly only if the proportions are correct, and they must essentially match the expansion factors, attack/decay times, etc. The transient processing used to be variable, but nothing other than the 'middle' or 'normal' value was optimal.
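To make the split-band gain idea above concrete, here is a minimal sketch, not the processor's actual code. It assumes the gain control signal is available in dB at some control rate, uses a simple one-pole low-pass as the LF/HF split (with the HF part taken as the complementary remainder, which sidesteps the time-alignment problem for this toy case), and approximates the 'effective sqrt' by halving the LF gain in dB. The function names, the 0.5 Hz crossover and the 0.5 scale factor are all assumptions.

```python
import math

def one_pole_lowpass(values, sample_rate, cutoff_hz):
    """Simple one-pole low-pass, initialised to the first value to avoid a start-up swing."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = values[0] if values else 0.0
    out = []
    for v in values:
        state += alpha * (v - state)
        out.append(state)
    return out

def soften_long_term_gain(gain_db, sample_rate, cutoff_hz=0.5, lf_scale=0.5):
    """Scale only the slow (LF) part of a gain-control signal given in dB.

    Halving a gain in dB is the same as taking the square root of the
    linear gain, so lf_scale=0.5 stands in for the 'effective sqrt'.
    The fast (HF) part, gain minus its low-passed version, is left alone.
    """
    lf = one_pole_lowpass(gain_db, sample_rate, cutoff_hz)
    return [lf_scale * slow + (g - slow) for g, slow in zip(gain_db, lf)]

if __name__ == "__main__":
    # A steady +6 dB of expansion gain with one fast +6 dB transient on top,
    # sampled at an assumed 1 kHz control rate.
    gain = [6.0] * 2000
    gain[1000] = 12.0
    shaped = soften_long_term_gain(gain, sample_rate=1000)
    print(f"steady part: {shaped[500]:.2f} dB, transient: {shaped[1000]:.2f} dB")
```

In this sketch the steady gain is halved while the fast transient keeps nearly all of its extra 6 dB. The `cutoff_hz` parameter plays the role of the 'gain control LF 3dB freq' mentioned above, so it is the knob one would turn to chase the surging.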
Oh well, I am working on finishing the simplified command structure for the restoration processor. The program will probably have everything in it, like the finalizer/compressors, etc. -- but those won't be accessible through the simplified command line mechanism. The restoration code is complex enough, but when trying to specify all of the parameters of multiple multi-band compressors, a three-band compressor, one pseudo-limiter, one nice hard limiter, and one really hard limiter (plus clipping if that ever happens), it can be a VERY long command line. So, I am keeping things simple right now. (The limiters will be engaged, but only to limit the output to 0 dBFS. Normally they won't be very audible unless someone hits them fairly hard -- some of the limiting is done in a phase-coherent/anti-intermod multi-band way -- it is really difficult to detect unless the real limiter kicks in.)
The display for the expansion/input/output and pseudo-DolbyA decode is very, very primitive, but incredibly useful. It is basically a per-second text update of information about the previous second (max/min gains, etc.)... It can be enabled/disabled. There are internal things that allow producing graphs for gnuplot/etc., but I haven't really needed them, so I have slowly started removing them. I have been planning to produce the programmatic interface for a GUI, but I won't do the GUI code.
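As an illustration of what such a per-second text display could look like, here is a small sketch. The output format, the function name and the choice of reporting only min/max gain in dB are assumptions; the actual program's display is not shown in this thread.

```python
import math

def per_second_gain_report(gain_linear, sample_rate):
    """Return one text line per second with the min/max gain (in dB) of that second."""
    lines = []
    for sec, start in enumerate(range(0, len(gain_linear), sample_rate)):
        block = gain_linear[start:start + sample_rate]
        # Clamp to a tiny positive value so log10 never sees zero.
        lo = 20.0 * math.log10(max(min(block), 1e-12))
        hi = 20.0 * math.log10(max(max(block), 1e-12))
        lines.append(f"t={sec:4d}s  gain min={lo:+6.2f} dB  max={hi:+6.2f} dB")
    return lines

if __name__ == "__main__":
    # Toy gain track at an assumed 4 samples per second: one quiet second,
    # then a second where the gain swings between 0.5x and 2x.
    gains = [1.0, 1.0, 1.0, 1.0, 0.5, 2.0, 1.0, 1.0]
    for line in per_second_gain_report(gains, sample_rate=4):
        print(line)
```

Returning the lines instead of printing them directly keeps the same data usable for a future programmatic interface, with the text display as just one consumer.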
Wrt 'rainy days and mondays'. At the beginning, there are some 'crackles' that are obviously artificially produced (due to an electronic gizmo or overload or something). One is during the sustained 'olllld', and the other is in the word 'rainy', which has some crackles on almost all versions that I have heard except for an old quad version that I listened to. I totally reprocessed the Carpenters' rainy days and mondays, and produced this snippet to show what the processors can do. The entire thing is all-out: I ran the pseudo-DolbyA with the gain control of the first two channels disabled (to overtly fix the voice crackles), then the restoration processor to get rid of the hiss and other distractions like that, but ended up with a wider dynamic range than normal listeners would want. So, to squeeze it back down so that the music generally stays within -20dB to 0dB (mostly sitting between -6 and -3dB), I ran the finalizing processor on it.
The finalizing processor has a 3-band RMS compressor (1.7:1), an 8-band RMS compressor (low compression ratio: 1.2:1), an 8-band linear/RMS compressor (for faster tracking, low compression ratio: 1.2:1), and the limiter bank, which only saw one or two hits at -1 or -2dB near the end of the song. The attack/release times are totally calculated at runtime, but are anywhere between 10msec and 1 second. The result has a slightly different frequency response balance due to the 1/2 pseudo-DolbyA running on it, but actually sounds more natural and less punchy. The highs are still there, so not much is really lost. The full song is at https://spaces.hightail.com/space/bOPBXTkeeT with the name 01.rainy-proc.mp3. The short beginning is attached...
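To give a feel for how gentle those ratios are, here is a sketch of the textbook static compression curve applied as three cascaded broadband stages. This is a deliberate simplification: the real finalizer is multi-band with runtime-calculated time constants, none of which is modelled here. The ratios (1.7:1, 1.2:1, 1.2:1) come from the post; the -20 dB threshold for every stage and the function names are assumptions.

```python
def compressor_gain_db(level_db, threshold_db, ratio):
    """Static hard-knee compressor: gain (in dB, <= 0) applied at a given input level."""
    if level_db <= threshold_db:
        return 0.0
    compressed = threshold_db + (level_db - threshold_db) / ratio
    return compressed - level_db

def chain_gain_db(level_db, stages):
    """Total gain of cascaded stages; each stage sees the previous stage's output level."""
    total = 0.0
    for threshold_db, ratio in stages:
        total += compressor_gain_db(level_db + total, threshold_db, ratio)
    return total

if __name__ == "__main__":
    # Ratios from the post; the shared -20 dB threshold is an assumption.
    stages = [(-20.0, 1.7), (-20.0, 1.2), (-20.0, 1.2)]
    for level in (-30.0, -20.0, -10.0, 0.0):
        print(f"in {level:6.1f} dB -> gain {chain_gain_db(level, stages):6.2f} dB")
```

With a shared threshold the three stages multiply out to an overall ratio of about 2.45:1, so a peak 10 dB over the threshold is pulled down by roughly 5.9 dB in this simplified model.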
Something REALLY WEIRD happened while producing the demo of 'rainy days and mondays' for the full-out processor cleanup example. I am going to chase this down JUST IN CASE someone listens and finds that the distortion in the word 'rainy' is still there. I listened to the mp3 copy of the processed version and found a bit of a buzz again in the word 'rainy'. When I played the direct 96k 24bit floating-point original copy, the buzz wasn't there?!?!? It confounds me right now that the conversion process would cause the error. I will chase that down -- I didn't think that mp3 encoding/decoding would cause that kind of artifact (I always misspell it that way). Later on today, I am going to try various conversions and encodes/decodes of different file types to figure out what is going on.
Sorry if this is any trouble... But the buzz appears to be gone from my raw/uncompressed copy. Hmmm... I have heard the 'swirl' type HF effect from MP3, but this is the first time that I have heard a distortion-like sound...
FOUND RAGGED SOUND SOURCE -- I THINK:
Sorry for so many posts -- I think that I found the 'ragged sound' problem that bothers me badly (but it could be from many sources, including my local HW). Anyway, if I play the example that I had previously posted here, it sounds okay. If I download and play the version that I sent to 'spaces', that is okay too. However, if I play the version on 'spaces' through the built-in player that pops up in my WWW browser, I do get the slight rattle. So, if you want to hear what my examples on the repository REALLY sound like, download them first and use a good player. I do not know the architecture of the pop-up player, and don't really like to play with WWW toys, so all I can say is that the only way to hear my full examples ACCURATELY is to download the copy entirely first and then play it on an accurate/high-quality player that doesn't have math glitches in it!!!
The reason why I am so very picky about how something sounds is that I am writing software that manipulates audio, and I want to have control of any change that I make. My hearing is getting 'old', so I will miss things, but I do listen with full attention so that I hopefully don't miss anything.
Sometimes, I get fixated on a single song or group for my testing. Here is an example that shows before/after and the silky silence (a very slight amount of noise) with the 'restoration processor' at a fairly low level of expansion, but it shows very, very significant noise reduction. One thing about the processor is that it can do expansion without skewing the frequency response (except instantaneously). The difference between previous demos and this demo is that this is with the processor running in the practical maximum mode.... (Well, if you consider one hour per song practical.) I truly wish I could figure out the math to do the transient processing without taking so much time. When doing the series expansion, EACH variable in the series (similar to a Taylor series) is a different value based upon the original. The calculation of each value is based upon the original value, which is yet another series, and so on ad infinitum (well, limited to the depth that I specify). This tends to reconstruct the shape that was hit by a compressor. It is NOT hooked into the actual signal, but rather the gain. Oh well, this is the maximum that makes any sense, and it is only slightly better than the lesser processed versions. I looked at the spectrum and everything, and there is a slight decrease in noise -- mostly the noise is dependent on the gain ratios. Also, I need to figure out how to couple the transient handling into the signal on a downward level trajectory.