Non-expert thoughts on a hypothetical "near-lossless" encoding.

Non-expert thoughts on a hypothetical "near-lossless" encoding.

2017-05-26 07:52:36

For many lossless codecs, there's lossyWAV, with its variable reduction of bit depth, as an approach to near-losslessness.
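To make the bit-depth reduction idea concrete, here is a minimal Python sketch. It is not lossyWAV's actual algorithm: the block size and the number of bits dropped per block are hypothetical, hand-picked inputs, whereas lossyWAV (as far as I understand it) decides how many bits each block can lose by analysing the signal. The sketch only shows the mechanical step of rounding each block so that its lowest bits become zero, which a lossless codec can then store more cheaply.

```python
def reduce_block(samples, bits_to_drop):
    """Round each integer sample to the nearest multiple of 2**bits_to_drop.

    Illustrative only: no clipping to the valid sample range is done here.
    """
    if bits_to_drop <= 0:
        return list(samples)
    half = (1 << bits_to_drop) >> 1
    # Adding half a step before the shift rounds to nearest instead of down.
    return [((s + half) >> bits_to_drop) << bits_to_drop for s in samples]


def reduce_signal(samples, block_size, bits_per_block):
    """Apply a (hypothetical, externally chosen) bit reduction to each block."""
    out = []
    for block_index, start in enumerate(range(0, len(samples), block_size)):
        block = samples[start:start + block_size]
        out.extend(reduce_block(block, bits_per_block[block_index]))
    return out


if __name__ == "__main__":
    pcm = [100, 101, 103, 98, -5, 6, 7, 1000]
    # Drop 2 bits in the first block of four samples, 3 bits in the second.
    print(reduce_signal(pcm, block_size=4, bits_per_block=[2, 3]))
```

The rounding error introduced this way behaves like added noise, which is why the number of bits dropped has to be chosen carefully per block.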
For some time now I've been wondering, from a completely inexpert standpoint, about a different approach based on concepts similar to those used in lossy codecs.
Instead of aiming to replicate a waveform down to the bits, the idea would be to encode the audible band (or a more limited range, if desired) in the same or similar fashion as lossy encoders do, with the exception that there wouldn't be any psychoacoustic evaluation.
The aim would be to reduce the bitrate further than lossless and lossyWAV do, but without raising the noise floor, and also to avoid completely(?) any psychoacoustics-induced artifacts, above or below the threshold of audibility. The result should even survive further processing/DSP and other non-linearities in an audio system that might reveal the otherwise inaudible artifacts present in common lossy compression. For example, on my phone, through headphones I can tolerate very low bitrates without being annoyed by artifacts (or even noticing them), but the same files can become almost unbearably artifacted through the built-in speaker.
I imagine that this could also prove a good target for transcoding existing lossy material for which there's no access to a better source than old/rare/unsupported formats, and for which neither pure lossy nor lossless targets seem appropriate, one introducing further distortion and the other being a waste of bandwidth (in my limited

Of course, all of this is just hypothetical. All that remains is to answer these questions:
Does something like this exist?
Is it feasible to make an encoder that works like this for an existing lossy codec? Preferably something fully open, such as Opus.
Could such an encoding method deliver on the goals of bitrate reduction (compared to lossless), and no artifacting even under harsh conditions (strong non-linearities, heavy use of DSP...)?
Does the concept have any merit, or is it just me daydreaming?

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #1 – 2017-05-26 08:05:15

As far as I understand, this would just mean transforming the input signal from the time domain to the frequency domain. I wouldn't expect this to lead to a bitrate reduction comparable to that of lossyFLAC or WavPack hybrid.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #2 – 2017-05-26 16:19:29

The "wouldn't be any psychoacoustic evaluation" and further coding gain from frequency domain coding are at odds.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #3 – 2017-05-26 16:45:16

Quote

in the same or similar fashion as lossy encoders do, with the exception that there wouldn't be any psychoacoustic evaluation.

That sounds like you'd simply be making the compression less intelligent.

Quote

Does the concept have any merit, or is it just me daydreaming?

What are you trying to accomplish? It seems to me that there are already CODECs that address every conceivable compromise/trade-off...

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #4 – 2017-05-26 17:11:09

It is just you daydreaming.

I get the impression that you think that "psychoacoustic" is a dirty word; something to be avoided. And that you have no idea just how clever lossy compression such as MP3 is. It is actually quite ingenious... and whether you want to use lossy-compressed audio or not, it is worth taking a deeper look into the whole world of digital music and lossy compression.

Being a maths dunce, I fall at the first formulae. But there are people like JJ Johnston who have a knack for explaining some of the concepts of digital music and compression. Seek this stuff out: you will benefit greatly. Or... well, I can only really say that I did!

You seriously need to do some groundwork before letting your imagination loose on this.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #5 – 2017-05-26 20:33:31

Maybe I can make it a bit clearer by pointing to the representation of the signal in the frequency domain.
It means calculating the frequency distribution in a short window of time. One of the decisions the psychoacoustic model has to make is the length of this window. For a transient signal the window has to be short to give good temporal resolution, while it can be long for tonal parts of the signal. Without psychoacoustics it always has to be short to provide good quality for transient signals. But the frequency distribution must also be calculated with high precision in order to give good quality in every situation. So you need a lot of windows, each of which has to be encoded with a lot of bits. Not an appealing approach.
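The trade-off between temporal and frequency resolution can be put into numbers with a small Python sketch. The 44100 Hz sample rate and the window lengths are just illustrative assumptions, not values taken from any particular codec; the only fact used is that a transform over N samples spans N / fs seconds and separates frequencies in steps of about fs / N.

```python
def window_tradeoff(window_len, sample_rate=44100):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_len / sample_rate
    bin_spacing_hz = sample_rate / window_len
    return duration_ms, bin_spacing_hz


if __name__ == "__main__":
    # Hypothetical short, medium and long window lengths in samples.
    for n in (128, 1024, 2048):
        duration_ms, bin_spacing_hz = window_tradeoff(n)
        print(f"{n:5d} samples: {duration_ms:6.2f} ms long, "
              f"{bin_spacing_hz:7.2f} Hz between bins")
```

A short window pins a transient down in time but resolves frequency only coarsely, and a long window does the opposite; the product of the two numbers stays constant.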

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #6 – 2017-05-26 21:10:26

@Thad E Ginathom
If you read my post carefully, perhaps you'll realize that you don't need to be so aggressive against me.
You quickly and baselessly make assumptions that I'm biased against words/concepts, that I have no idea about this and that...
I started this thread explicitly disclaiming any assumption over the validity of the concept.
I didn't say anything suggesting any assumptions about my actual knowledge, presumed knowledge or lack thereof. Now, seeing how you go about it, I'm pretty sure I'm not going to engage that discussion with you in particular.
I just wanted to present my idea and see what people more knowledgeable than me on this subject cared to say about it.
It would have been nice if all these 6 lines you took the time to fill for me contained some actual discussion on the subject that I proposed, explaining, in any depth, what's right or wrong with my ideas.

Something like halb27 just above ^

@halb27
I see. Thank you.
I had a rough understanding of time vs frequency domain, and I've certainly seen the effects of the window size in spectrograms, but I wasn't aware of where they fell inside the mechanism of a lossy codec, much less that the psy model had to make that kind of decision.
I guess that what I meant when I said "psychoacoustics" would be the part that evaluates things like, for example, overlap (if that's the correct term). You know, when certain sounds prevent the listener from clearly hearing other sounds, thus enabling the encoder to dedicate fewer bits to what you can't hear well or at all. It's my impression that information-discarding processes like this one are what cause artifacts in lossily encoded audio (?). Thus, my rationale was that to achieve this "near-losslessness", it could be a good idea to skip, or greatly limit, such discarding compared to normal lossy encoding.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #7 – 2017-05-26 21:20:14

TANSTAAFL

You can't compress audio losslessly beyond the amount of information in the source (very roughly 2:1 for most audio). It's hard to do better than professionals who design such systems for a living - and if you do aspire to doing better than they do, you'll need to do a lot of reading before you speculate.

If you want more compression you need to throw something away - you can use psychoacoustics to pick the things we can't perceive or you can do something else and throw away something we can perceive.
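A rough way to see this in numbers is to estimate how many bits per sample a signal really needs. The Python sketch below computes the zero-order (per-symbol) entropy of a list of sample values. This is only an illustration of the principle: it ignores the correlation between neighbouring samples that real lossless codecs exploit through prediction, so it is not the true limit for audio, and the test signal is a made-up one.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Zero-order Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Made-up signal: every 8-bit value occurs exactly once.
    full = list(range(256))
    # Throw information away by zeroing the two lowest bits of each value.
    coarse = [(s >> 2) << 2 for s in full]
    print(entropy_bits_per_symbol(full))
    print(entropy_bits_per_symbol(coarse))
```

The second figure is lower only because information was discarded; no lossless scheme can go below the first figure for this source without exploiting structure that the zero-order estimate does not see.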

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #8 – 2017-05-26 22:07:43

@radorn:
You can get your approach by simply using a very high bitrate with a lossy codec.
The higher the bitrate, the less the quality relies on the psychoacoustic details about which you obviously have some doubts.
Transform codecs used at very high bitrates are a brute-force approach, pretty much the way you want it to be.
Use, for instance, lame3995o -Q0 (just to do some advertising for my own LAME variant).
Sure, when going to extremes you can use lossyFLAC as well. At 400 kbps and above I'd prefer lossyFLAC over a transform codec.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #9 – 2017-05-26 23:09:11

Resample to 32 kHz and use lossyWAV. Audible band. No psychoacoustics.

Windowing and FFT-ing (or similar) without psychoacoustics may or may not be as efficient (efficiency being a vague concept in this context anyway), and probably risks audible artefacts (easily introduced by dumb application of transform functions).

Cheers,
David.

Re: Non-expert thoughts on a hypothetical "near-lossless" encoding.

Reply #10 – 2017-05-27 13:21:21

Quote from: radorn on 2017-05-26 21:10:26

@Thad E Ginathom
If you read my post carefully, perhaps you'll realize that you don't need to be so aggressive against me.
You quickly and baselessly make assumptions that I'm biased against words/concepts, that I have no idea about this and that...
I started this thread explicitly disclaiming any assumption over the validity of the concept.
I didn't say anything suggesting any assumptions about my actual knowledge, presumed knowledge or lack thereof. Now, seeing how you go about it, I'm pretty sure I'm not going to engage that discussion with you in particular.
I just wanted to present my idea and see what people more knowledgeable than me on this subject cared to say about it.
It would have been nice if all these 6 lines you took the time to fill for me contained some actual discussion on the subject that I proposed, explaining, in any depth, what's right or wrong with my ideas.

Just on the personal note: apologies for perceived aggression. The post was not intended to be rude. This is my interpretation of our conversation:

"I don't know much but I have this idea."
"I don't know much either but your idea sounds wrong to me."

Admittedly, that was not very useful, but I did not intend to offend. Sorry.
