Topic: Great killer sample, easy to ABX on most codecs

Re: Great killer sample, easy to ABX on most codecs

Reply #125
> but another approach might simply be to detect the absurdity of the sample (the vast majority of the energy is above 14 kHz) and reject it or fall back to lossless.

"Reject it" meaning reject the noise, i.e. low-pass filter the file, or reject the file entirely?

I don't know what the intended goals of WavPack lossy are, but my opinion on lossy in general is that low-pass should be standard procedure. We are already trading off some quality, so removing what is inaudible (or, if you are young enough, what is not that important) seems like the reasonable thing to do. For that matter, I don't understand why Vorbis, for example, tries to preserve ultrasonics when q >= 6.

> can we assume that dither is used not only to mask quantization, but also for an artistic effect? Hissing and crackling are effects that I often hear in music.

You don't need >= 14k content for that. At least for me, I can low-pass your example at 12k and I still hear the same hiss. Of course, then it can't stay at 8 bits anymore.

> And there are more than a dozen other profiles that generate noise at HF range. I'm sad to think that hybrid mode will only handle basic dither (attached below) and choke on everything else.

This needs the qualifier "... at 8 bits". I don't think there will be any problems with those noise-shaping filters at 16 bits.

Re: Great killer sample, easy to ABX on most codecs

Reply #126
> @bryant, can we assume that dither is used not only to mask quantization, but also for an artistic effect? Hissing and crackling are effects that I often hear in music. As comrade @danadam noted, in this case it seems 8-bit dither with noise-shaping of Shibata High profile was added. And there are more than a dozen other profiles that generate noise at HF range. I'm sad to think that hybrid mode will only handle basic dither (attached below) and choke on everything else.

WavPack lossy does not handle dither of any kind; it simply attempts to encode the entire waveform as closely as possible and makes no further distinctions. The fact that we have identified that noise as Shibata dither is irrelevant. This sample has a ton of ultrasonic noise and, unlike a psycho-acoustic codec, WavPack is not going to remove it just because you can’t hear it. Instead, it will encode it and attempt to hide its own quantization noise underneath the audio, which in this case it is unable to do because its noise-shaping is currently not steep enough.
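For readers unfamiliar with the idea of "hiding quantization noise" via noise shaping, here is a minimal sketch of generic first-order error feedback. This is not WavPack's actual code; the function name, the coefficient `c` and the step size are assumptions chosen for illustration. With `c > 0` the quantization error is tilted toward high frequencies, with `c < 0` toward low frequencies.

```python
import math

def shaped_quantize(samples, step, c):
    """Quantize to multiples of `step` with first-order error feedback.

    The total error (output - input) equals e[n] - c*e[n-1], i.e. the raw
    quantization error filtered by 1 - c*z^-1. c = 0 is plain rounding.
    """
    out = []
    prev_err = 0.0
    for s in samples:
        v = s - c * prev_err          # subtract the fed-back error
        q = round(v / step) * step    # uniform quantizer
        prev_err = q - v              # raw quantization error e[n]
        out.append(q)
    return out

# Illustrative input only: a slow sine, quantized very coarsely.
x = [100.0 * math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
y = shaped_quantize(x, step=16.0, c=1.0)

# With c = 1 the error has a null at DC, so the summed error stays
# bounded by half a step no matter how long the signal is.
dc_error = sum(q - s for q, s in zip(y, x))
```

A first-order filter like this has a gentle slope, which gives a feel for why shaping can be "not steep enough" when the signal itself is concentrated at the top of the spectrum.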

As for your other suggestion, I would certainly not assume that dither would (or should) be used for “artistic effect”. In this sample the audio was truncated to 8-bit at some point, and that was done using dither and noise-shaping to make the conversion artifacts as unobtrusive as possible (that’s what they’re for), and then it was converted back to 16-bit. Obviously there was some reason for this, but if they just wanted to add some hiss there would have been ways to do it without creating havoc with lossy encoders. When this was converted back to 16-bit most of that noise should have been filtered out.

Re: Great killer sample, easy to ABX on most codecs

Reply #127
> > but another approach might simply be to detect the absurdity of the sample (the vast majority of the energy is above 14 kHz) and reject it or fall back to lossless.
>
> "Reject it" meaning reject the noise, i.e. low-pass filter the file, or reject the file entirely?
>
> I don't know what the intended goals of WavPack lossy are, but my opinion on lossy in general is that low-pass should be standard procedure. We are already trading off some quality, so removing what is inaudible (or, if you are young enough, what is not that important) seems like the reasonable thing to do. For that matter, I don't understand why Vorbis, for example, tries to preserve ultrasonics when q >= 6.

I may have been a little rash when I said “reject it”, but I definitely was not referring to filtering out the noise despite it being inaudible (some, most, or all of it, depending on your age). WavPack lossy makes few psychoacoustic assumptions, and frequency is certainly not one of them. It attempts to preserve the waveform faithfully, so it can be useful for non-audio applications like electrophysiology data.

But yes, lowpass filtering the audio, or even reducing the sampling rate, would certainly be an appropriate way to get better performance from WavPack lossy. For example, I use a 32 kHz sampling rate when I record FM broadcasts to WavPack, as I do not consider the 19 kHz pilot tone to be part of the "artistic effect". :)
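The "detect the absurdity of the sample" idea could be prototyped roughly like this. It is only a sketch and not something WavPack does: the function name and the test signals are made up for illustration, and the naive DFT is only practical for short blocks. The 14 kHz boundary is the one mentioned in the thread.

```python
import cmath
import math

def high_band_energy_fraction(samples, sample_rate, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz (naive DFT)."""
    n = len(samples)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):             # skip DC, go up to Nyquist
        acc = 0j
        for i, s in enumerate(samples):
            acc += s * cmath.exp(-2j * math.pi * k * i / n)
        energy = abs(acc) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0

fs = 44100
n = 441  # 10 ms block, so DFT bins fall on multiples of 100 Hz
hiss = [math.sin(2 * math.pi * 16000 * i / fs) for i in range(n)]  # stand-in for HF noise
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]   # stand-in for "real audio"

hiss_fraction = high_band_energy_fraction(hiss, fs, 14000)
tone_fraction = high_band_energy_fraction(tone, fs, 14000)
```

An encoder front end could compare such a fraction against some threshold (the value would have to be tuned, nothing here suggests one) and then warn, low-pass, or fall back to lossless.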

Re: Great killer sample, easy to ABX on most codecs

Reply #128
In this case, 32 kHz makes it worse, as the noise has nowhere to go but down the spectrum. In contrast, 48 kHz works better.

Re: Great killer sample, easy to ABX on most codecs

Reply #129
> In this case, 32 kHz makes it worse, as the noise has nowhere to go but down the spectrum. In contrast, 48 kHz works better.

Do you mean, it subjectively sounds better at the same bitrate? (I don't think it *has* to always be worse in similar cases. All of the noise has to go to the audible range indeed, but the noise floor can be lower because there are more bits per sample available. It's probably very difficult to truly predict how it'll go in general.)

> When this was converted back to 16-bit most of that noise should have been filtered out.

Why/how? I don't think this is how it works. Converting an integer from fewer bits to more bits is lossless (perfectly reversible). Do you mean there is some extra step that you'd expect to always happen in tandem with that?
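To illustrate the "perfectly reversible" point, a minimal sketch assuming signed 8-bit samples widened to 16-bit by a plain left shift (the function names are made up; real converters may scale differently):

```python
def widen_8_to_16(sample8):
    """Scale a signed 8-bit sample (-128..127) to 16-bit by shifting left."""
    return sample8 << 8

def narrow_16_to_8(sample16):
    """Undo widen_8_to_16 by dropping the (all-zero) low byte."""
    return sample16 >> 8

# Every possible 8-bit value survives the round trip unchanged.
round_trip_ok = all(
    narrow_16_to_8(widen_8_to_16(s)) == s for s in range(-128, 128)
)
```

Nothing in this step touches the spectrum, so any dither noise in the 8-bit data is carried over to the 16-bit file as-is.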
a fan of AutoEq + Meier Crossfeed

Re: Great killer sample, easy to ABX on most codecs

Reply #130
There are two different scenarios to consider. The first is that we want to preserve this pathological audio, and the other is we want to salvage it.

Since the vast majority of the energy in the sample is at frequencies that some people can hear and some can’t, this is going to sound very different to different people. But if the goal is to preserve that loud hiss, say for some nostalgia, then it’s sort of hit or miss as to what a lossy codec is going to do (as we’ve seen) because there are all kinds of ways that could fail, either in analysis or processing.

The recommendation is generally to not transcode from one lossy format to another (and converting to 8-bit PCM is a very lossy operation), so my advice would be to play it safe and accept the great lossless compression that you can get (because of the 8 bits). Keep in mind that the pathological amounts of hiss may very well create other distortions in your playback system, from resampling engines to DACs to transducers, but I assume that’s all acceptable and that any ABX testing you do is suspect.

On the other hand, we know what the artist intended this to sound like (from YouTube and Spotify) and it does not have pathological hiss. That was an artifact of converting to 8-bit PCM (for a game audio perhaps?) and so it seems reasonable to me to try to remove that to make everything more palatable for lossy encoders, and that’s very easy to do. The useful audio here goes up to about 14 kHz, so I used my ART program with a gentle 48-tap sinc lowpass at 15 kHz and got the attached file. You can see the lowpass profile here:



That still has some audible hiss (like maybe from a cassette or FM recording), but should be an easy encode because it looks like real audio.
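For anyone curious what a 48-tap sinc lowpass at 15 kHz looks like in practice, here is a rough sketch. It is not taken from the ART program: the Hann window, the function names and the 44.1 kHz rate are assumptions made for illustration.

```python
import cmath
import math

def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate):
    """Hann-windowed sinc lowpass, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate              # cutoff in cycles per sample
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - center
        if t == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [h / dc_gain for h in taps]

def gain_at(taps, freq_hz, sample_rate):
    """Magnitude response of the FIR filter at one frequency."""
    w = 2 * math.pi * freq_hz / sample_rate
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))

taps = windowed_sinc_lowpass(48, 15000, 44100)
passband = gain_at(taps, 5000, 44100)    # should stay close to 1
stopband = gain_at(taps, 21000, 44100)   # should be strongly attenuated
```

With so few taps the transition is gradual rather than a brick wall, which matches the "gentle" description: content well below 15 kHz passes essentially untouched while the top of the spectrum is pushed far down.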

There also seems to be a misconception here. Resampling audio from 44.1 kHz to 32 kHz does not require the noise “to go somewhere else”. Proper downsampling involves an anti-aliasing lowpass filter that removes frequencies that are no longer valid at the new sampling rate (i.e., >= Fs/2), so when I use my tool again to convert this (using default settings) to 32 kHz, it’s fine: the crazy hiss is gone and it should be easy to encode. If some other program is doing something else and making this worse to encode, it’s either not being used as intended or it’s buggy.
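A small sketch of what happens when that anti-aliasing filter is skipped: a component above the new Nyquist frequency folds back into the audible band. The helper name is made up, and it only computes where a single tone would land.

```python
def alias_frequency(freq_hz, sample_rate):
    """Frequency at which a tone appears if sampled without an anti-aliasing filter."""
    folded = freq_hz % sample_rate
    if folded > sample_rate / 2:
        folded = sample_rate - folded
    return folded

# At 32 kHz the Nyquist limit is 16 kHz. An unfiltered 18 kHz component
# would fold down to 14 kHz; a 10 kHz component is unaffected.
folded_hiss = alias_frequency(18000, 32000)
untouched = alias_frequency(10000, 32000)
```

So noise "going down the spectrum" is the signature of a missing or inadequate filter. With a proper resampler, everything above 16 kHz is removed before decimation rather than folded.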

And I’m not suggesting that increasing the bitdepth of integer samples requires any filtering because it is, as you say, lossless. But in this particular case there is a lot of dither noise that is there only because of the reduced bitdepth, so it makes sense to remove it if you can do so without removing too much useful audio information.

And, by the way, if all this weren't true then DSD wouldn't work, because it's completely based on downsampling 1-bit audio and removing the dither noise.

Re: Great killer sample, easy to ABX on most codecs

Reply #131
> Lame 3995o vbr -q1 cannot be better than any 320cbr. Helix is old, good yes but still mp3.

Just in case, for the less informed users among us: it's not -q1 but -Q1, a tweaked approach implemented by @halb27.

Actually, I agree with @maikmerten, who said: “Your LAME 3.995o encode did overall a better job, certainly better than LAME 3.100 for this sample. However, as one can see in the spectrogram, it somewhat avoided most of the sbf21 trouble by effectively lowpassing the high frequency band [using transition band 17960 Hz - 18494 Hz] and thus getting rid of the excessive energy there, which might be the best tradeoff in this case”. Helix without -HF2 used an even lower 16536 Hz limit and also sounds fine (with enough hiss, but without metallic fluctuations and ultrasonic bloat).

Input.


Output. lame3995o -Q1


Output. hmp3 -V150
• Join our efforts to make Helix MP3 encoder great again
• Opus complexity & qAAC dependence on Apple is an aberration from Vorbis & Musepack breakthroughs
• Let's pray that D. Bryant improve WavPack hybrid, C. Helmrich update FSLAC, M. van Beurden teach FLAC to handle non-audio data

Re: Great killer sample, easy to ABX on most codecs

Reply #132
@Kraeved with all due respect I have been on this forum for 2 decades, I know and used lame3995o.

"Lame 3995o vbr -Q1 cannot be better than any 320cbr"

Re: Great killer sample, easy to ABX on most codecs

Reply #133
> @Kraeved with all due respect I have been on this forum for 2 decades, I know and used lame3995o.

Dear @shadowking, it seems my words were open to misconstruction. Speaking of less informed users, I didn't mean you personally, but those who read such in-depth forum threads without having our knowledge of codecs, after which they randomly apply the flags they encounter and then ask why they don't get the desired results. Since -q1 is different from -Q1, I felt it necessary to emphasize that. Also, it would be useful to know what upset you with the result of 3995o.

Re: Great killer sample, easy to ABX on most codecs

Reply #134
OK, no hard feelings at all.

I have been running ABX tests over the last few days, several hours in total, and I think I may have a solution, or at least a partial one.

I even encoded some tracks to 8-bit with dither, and the effect was a bit similar to lullaby. I don't know how that fits into anything, but I discovered that around 5 bits per sample or more is needed. Below that it's like FM radio: to me not annoying at all, but still. So if one is into 8-bit, that is the way. It also translates into 16-bit robustness: around 6 bps, maybe a little less or more, is needed for lossyWAV-like quality.

Back to codectest16: I decided on the normal mode, since that is the workhorse of WV. Interestingly, to my ears the modes seem to converge at 576k, as if it more or less doesn't matter whether you use -x, -h, -s, etc. I remember reading a post by dibrom many years ago regarding MP3 -APS behavior. He said it's a quality-first approach: reach the bitrate first, then try to bring it down if safe. Going by that, with my ears and my test sample set, if I take 576k and add some defensive mechanisms like -x, smart mid-side stereo and -s, I can get down to 500k. Specifically: -b500x4s1, a 'normal' workhorse setting but with a big safety net. Then, for completeness, a 'high' setting of -b550x4s1. In bps that's 5.67 to 6.24, so about 6 on average.

Now to 320-350k, or -b4: use -b4x4s.5 as an alternative. To my ears the difference is not annoying, if audible at all, in day-to-day listening at normal volume. For middle bitrates I have found -s.5 safe so far, although you lose some of the advantage of a full -s1, or even of negative shaping when appropriate; how this actually translates to real-life listening is a different matter. I think it's a good balance anyway. For high bitrates above 400k, ABX results tell me -s1 is the way, or somewhere in the 70-100% upward tilt range (-s.7 to -s1). I am leaning toward -s1.

Interestingly, according to the WV manual, -s1 is the default for high sample rates (> 48 kHz) and hi-res, as the noise is 'pushed up into the inaudible range'. So it seems to me that the simple -j1 -s1 method is also good for very high bitrates at 44-48 kHz, 16-bit.
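The kbps-to-bps figures above can be reproduced with a few lines. The sketch assumes 44.1 kHz stereo, which is what makes 500k come out at 5.67; the helper names are made up for illustration.

```python
def bits_per_sample(bitrate_kbps, sample_rate=44100, channels=2):
    """Average bits spent per sample per channel at a given bitrate."""
    return bitrate_kbps * 1000 / (sample_rate * channels)

def kbps_for(bps, sample_rate=44100, channels=2):
    """Inverse conversion: bitrate in kbps for a given bits-per-sample target."""
    return bps * sample_rate * channels / 1000

normal_bps = round(bits_per_sample(500), 2)   # the 'normal' setting
high_bps = round(bits_per_sample(550), 2)     # the 'high' setting
b4_kbps = round(kbps_for(4), 1)               # what 4 bps means in kbps
```

Under the same assumption, 4 bps works out to about 353 kbps, consistent with the "320-350k or -b4" grouping.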

Anyhow, to make a long story short: two settings for different goals. It's so simple it's almost ridiculous when I write it like this:

350k
-b4x4 (-s optional)

500k
-b5.67x4s1  or round to -b6x4s1

In popular metrics
-b320x4    ~ compact portable
-b500x4s1 ~ archiving normal
-b550x4s1 ~ archiving xtreme

Now the space saved at 500k is still considerable IMO, at least for non-classical. Lacuna Coil is down to 280 MB from 541 MB.
For classical, with a very dynamic title (Fauré's Requiem), it's 235 vs 277 MB, or 434k vs 508k for FLAC. Even here, 18% would be considered very significant in lossless benchmarks between different codecs.
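For reference, the size comparisons can be worked out as below, using only the numbers quoted above (helper names are made up). The 18% figure appears to correspond to the lossless file being about 18% larger than the hybrid one; measured the other way around, the saving relative to lossless is about 15%.

```python
def percent_saved(lossy_size, lossless_size):
    """Space saved relative to the lossless file, in percent."""
    return 100 * (1 - lossy_size / lossless_size)

def percent_larger(lossless_size, lossy_size):
    """How much larger the lossless file is than the lossy one, in percent."""
    return 100 * (lossless_size / lossy_size - 1)

lacuna_saved = round(percent_saved(280, 541), 1)     # MB figures from the post
faure_saved = round(percent_saved(235, 277), 1)
faure_larger = round(percent_larger(277, 235), 1)
```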