Topic: lossyWAV Development

lossyWAV Development

Reply #1000
Re: Transcodability - lossyWAV does not allow re-processing of an already processed file. I would prefer to keep it that way
That's a good thing (except for testing) - I meant feeding the output of lossyWAV to an mp3 encoder. I think that's what Dynamic was talking about.

Cheers,
David.

lossyWAV Development

Reply #1001
Re: Transcodability - lossyWAV does not allow re-processing of an already processed file. I would prefer to keep it that way
That's a good thing (except for testing) - I meant feeding the output of lossyWAV to an mp3 encoder. I think that's what Dynamic was talking about.

Cheers,
David.
Yes, I see what he meant now. However, it reminds me of a quote on anythingbutipod where someone transcoded from lossyWAV >> OGG and the filesize increased by about 1MB compared to lossless >> OGG (as a %age I have no idea....)

As an aside, -7 -autoshape -snr 9 -nts 36 is nearly palatable on my iPAQ and comes in at 321.1kbps for my 53 problem sample set.

[edit] Iterating, I found that -7 -autoshape -snr 14.35 -nts 19.95 is very close in bitrate to vanilla -7. [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1002
Once there was a good rule,
Thou shalt not transcode

lossyWAV is already made for portable usage: it is compatible with portable devices that support FLAC, while shrinking the size of true lossless music. There is no reason to go lossy a second time, i.e. transcode from lossy to lossy. If somebody wants mp3/mpc/ogg/aac as a small file for portable usage, they should go directly from the lossless source to the small lossy format (mp3/mpc/ogg/aac).
There are already enough programs and scripts to encode in 1 single step to various formats/sizes/bitrates, like mareo.exe.

lossyWAV Development

Reply #1003
I fundamentally disagree with you, user.

Of course you should not aim to transcode, but sometimes it is inevitable, and sometimes it is not worth worrying about.

Of course we should all keep our lossless files and use them everywhere, but sometimes we can't, and sometimes it is not worth worrying about (for some people).


For example...

Modern loud CDs regularly hit 1000kbps+ with lossless codecs.

What's happening is that the mathematical 96dB range is being perfectly preserved, even though the actual dynamic range is about 6dB.

I believe it is pointless keeping the lossless version. It's a "perfect" copy of a mediocre original.

Thankfully, lossyWAV, used less aggressively, allows you to make a near-lossless version.

If I can have something which is half the bitrate (or less), sounds identical, and transcodes identically, then I have no need to keep the lossless version.

To me, this is an argument for dumping the lossless original. As it says on the Monkey's Audio website: lossless is for anal retentives. I'm not one, so if all rational reasons for lossless are removed, I won't use it. YMMV.


Why not create mp3s or whatever from the lossless original?

1. "Why not?" Well, Why? Really, if there's no difference, why? It's OCD-like behaviour.

2. I might not know the lossy format I will need in the future. Shall I create mp3, ogg, AAC, HE-AAC etc?

3. If I'm a radio station, it's the broadcast (FM, mp2, mp3, WMA, whatever) that's the "transcode" - I can hardly avoid that or make it at the same time as I rip the CD.


So, for me, the "transcodability" of the less aggressive lossyWAV modes is very important.

I could give you more examples... "sensible" preservation of 24/96 files; "sensible" preservation of GBs of "working files" from audio sessions which will probably never be used again, but won't be any use at all if converted to mp3; etc etc etc.

That is, if lossyWAV, in its more gentle modes, is "safe".

Cheers,
David.

lossyWAV Development

Reply #1004
I just finished my abx test.

First I followed your suggestion and used -7 -autoshape -nts 20 -snr 14.
This setting yields 309 kbps for my regular set (quite a bit higher than plain -7) and 355 kbps for my problem set (a bit low for problem samples).
Hiss is pretty audible with bruhns (for instance sec. 9.3-10.2), but it's audible also with bibilolo (sec. 4.3-5.5) and badvilbel (sec. 5.9-7.2). There's also a slight inaccuracy with Atemlied (sec. 9.3-10.1) which is best audible at moderately high listening volume.
I didn't test a lot more samples than those mentioned because to me this is not adequate quality for an average of 309 kbps. The hiss (and the inaccuracy) isn't really annoying though, and I listened to some regular music (carefully but without abxing), and was content with it. Anyway, looking at codecs like vorbis (I just tested the new Aoyumi version, and quality is great even at -q4 [~130 kbps]), I personally don't like my abx result at a bitrate of ~310 kbps.

I redid the test using plain -7 autoshape which yields 325/384 kbps for my regular/problem test set.
bruhns 9.3-10.2 is better now to me, though still quite audible. Same goes for bibilolo.
I didn't test more samples as I personally am not content with this as well.

-6 -autoshape yields 337/404 kbps.
bibilolo is ok now, but when abxing bruhns I found added hiss already at sec. 2.3-4.4.

I skipped -5 and went directly to -4 -autoshape as from my last test I know plain -4 is transparent for me with the samples tested. -4 -autoshape yields 369/450 kbps. With this average bitrate for the problem set chances are good that everything is alright now.
bruhns at sec. 9.3-10.2 however still isn't perfect though quite acceptable.

Summing up: to me this isn't a good result for -autoshape, though it's a matter of taste whether or not one is willing to accept the added hiss, which seems to be the major issue when using -autoshape with low bitrate settings.

Maybe a variant of autoshape is more successful: make the frequency up shift of noise also depend on the degree to which there's energy in the input signal's 2 highest frequency zones (that is the range from ~8.2 kHz up). As bruhns is a pretty low volume sample maybe being more conservative at low volume is helpful too. What do you think, Nick?
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1005
I just finished my abx test.

First I followed your suggestion and used -7 -autoshape -nts 20 -snr 14.
This setting yields 309 kbps for my regular set (quite a bit higher than plain -7) and 355 kbps for my problem set (a bit low for problem samples).
Hiss is pretty audible with bruhns (for instance sec. 9.3-10.2), but it's audible also with bibilolo (sec. 4.3-5.5) and badvilbel (sec. 5.9-7.2). There's also a slight inaccuracy with Atemlied (sec. 9.3-10.1) which is best audible at moderately high listening volume.
I didn't test a lot more samples than those mentioned because to me this is not adequate quality for an average of 309 kbps. The hiss (and the inaccuracy) isn't really annoying though, and I listened to some regular music (carefully but without abxing), and was content with it. Anyway, looking at codecs like vorbis (I just tested the new Aoyumi version, and quality is great even at -q4 [~130 kbps]), I personally don't like my abx result at a bitrate of ~310 kbps.

I redid the test using plain -7 autoshape which yields 325/384 kbps for my regular/problem test set.
bruhns 9.3-10.2 is better now to me, though still quite audible. Same goes for bibilolo.
I didn't test more samples as I personally am not content with this as well.

-6 -autoshape yields 337/404 kbps.
bibilolo is ok now, but when abxing bruhns I found added hiss already at sec. 2.3-4.4.

I skipped -5 and went directly to -4 -autoshape as from my last test I know plain -4 is transparent for me with the samples tested. -4 -autoshape yields 369/450 kbps. With this average bitrate for the problem set chances are good that everything is alright now.
bruhns at sec. 9.3-10.2 however still isn't perfect though quite acceptable.

Summing up: to me this isn't a good result for -autoshape, though it's a matter of taste whether or not one is willing to accept the added hiss, which seems to be the major issue when using -autoshape with low bitrate settings.

Maybe a variant of autoshape is more successful: make the frequency up shift of noise also depend on the degree to which there's energy in the input signal's 2 highest frequency zones (that is the range from ~8.2 kHz up). Would you like to try that, Nick?
Ok, I will try again to implement the RMS variability approach that I was trying (but didn't release).

Another approach would be to make the variability of the shaping non-linear with respect to bits-to-remove. At present it increases at 1/13 per bit to remove, i.e. 0=0; 1=1/13; 2=2/13; etc; 12=12/13; 13=13/13. If I was to change this from linear to some power, say for example shaping_factor = 1-((13-bits-to-remove)/13)^n then things may change.

Again, the noise shaping function itself is totally fixed; all the autoshape function does is vary how much of it to apply. It doesn't change the frequency to which the noise is shifted. Think of it as -shaping 0 = pure white noise; -shaping 1 = fully shaped noise; -shaping <n> = something in between.
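For readers following along, here is a rough Python sketch of the two mappings described above: the current linear ramp (1/13 per bit) and the proposed power variant, shaping_factor = 1-((13-bits-to-remove)/13)^n. The function names and the clamping to the 0..13 range are assumptions made for illustration; this is not taken from the lossyWAV source.

```python
MAX_BITS = 13  # from the post: the factor reaches 13/13 at 13 bits to remove


def linear_shaping(bits_to_remove):
    """Current behaviour as described: the factor rises by 1/13 per bit."""
    b = min(max(bits_to_remove, 0), MAX_BITS)  # clamping is an assumption
    return b / MAX_BITS


def power_shaping(bits_to_remove, n=2.0):
    """Proposed variant: 1 - ((13 - bits_to_remove) / 13) ** n."""
    b = min(max(bits_to_remove, 0), MAX_BITS)  # clamping is an assumption
    return 1.0 - ((MAX_BITS - b) / MAX_BITS) ** n


if __name__ == "__main__":
    # Print both curves side by side for n = 2.
    for b in range(MAX_BITS + 1):
        print(b, round(linear_shaping(b), 3), round(power_shaping(b, 2), 3))
```

With n = 1 the power form reduces to the linear ramp; for n > 1 it sits above the ramp everywhere between the end points, i.e. more shaping is applied at a given number of bits to remove.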

I'll get back to the "drawing board" with the autoshape function and post v0.9.1 soon.

As an aside, I have found that TCPMP for my iPAQ plays FLAC *much* better (better = less cpu usage and more accurate output) than GSPlayer / gspflac.dll. In particular dithernoisetest would exhibit some harmonics using GSPlayer which don't exist in TCPMP. TCPMP v0.72 RC1 is still available, google is your friend....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1006
... Again, the noise shaping function itself is totally fixed; all the autoshape function does is vary how much of it to apply. It doesn't change the frequency to which the noise is shifted. Think of it as -shaping 0 = pure white noise; -shaping 1 = fully shaped noise; -shaping <n> = something in between. ...

That's clear. I was trying to bring another thing into focus: masking effects. If there's a lot of HF energy in the input signal, your shaping factor can be close to 1, and if there's no or little HF energy there the shaping factor is better close to 0. To be considered as well as the amplitude considerations.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1007
... Another approach would be to make the variability of the shaping non-linear with respect to bits-to-remove. At present it increases at 1/13 per bit to remove, i.e. 0=0; 1=1/13; 2=2/13; etc; 12=12/13; 13=13/13. If I was to change this from linear to some power, say for example shaping_factor = 1-((13-bits-to-remove)/13)^n then things may change. ....

With bits-to-remove=1 and n=2 this yields a shaping factor of ~0.148 > 1/13~0.077. So this would make the noise shaping more aggressive (in case I understand this correctly).
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1008
... Another approach would be to make the variability of the shaping non-linear with respect to bits-to-remove. At present it increases at 1/13 per bit to remove, i.e. 0=0; 1=1/13; 2=2/13; etc; 12=12/13; 13=13/13. If I was to change this from linear to some power, say for example shaping_factor = 1-((13-bits-to-remove)/13)^n then things may change. ....
With bits-to-remove=1 and n=2 this yields a shaping factor of ~0.148 > 1/13~0.077. So this would make the noise shaping more aggressive (in case I understand this correctly).
Yes, it will make it more aggressive. Using the revised -autoshape -7, my 53 problem sample set yields 378.77kbps. With the v0.9.0 -autoshape -7, 366.21kbps.

lossyWAV beta v0.9.1 attached to post #1 in this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1009
... Another approach would be to make the variability of the shaping non-linear with respect to bits-to-remove. At present it increases at 1/13 per bit to remove, i.e. 0=0; 1=1/13; 2=2/13; etc; 12=12/13; 13=13/13. If I was to change this from linear to some power, say for example shaping_factor = 1-((13-bits-to-remove)/13)^n then things may change. ....

With bits-to-remove=1 and n=2 this yields a shaping factor of ~0.148 > 1/13~0.077. So this would make the noise shaping more aggressive (in case I understand this correctly).
Yes, it will make it more aggressive. Using the revised autoshape, my 53 problem sample set yields 378.77kbps. With the v0.9.0 autoshape, 366.21kbps.

I think in the opposite direction: using for instance something like

bits-to-remove    shaping factor
       0              0
       1              0
       2              0.1
       3              0.15
       4              0.25
       5              0.4
       6              0.55
       7              0.7
       8              0.8
       9              0.85
      10              0.9
      11              0.95
    >=12              1.0

so that as a tendency with low-volume spots shaping is small.
Should be positive for the added hiss of low-volume spots. Reduces the bitrate bloat as well.
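As a sketch only, the table above could be applied as a simple lookup. The values are the ones proposed in this post; the function name and the handling of out-of-range input are assumptions for illustration.

```python
# Shaping factor per bits-to-remove, index 0..12, as proposed in the post.
SHAPING_TABLE = [0.0, 0.0, 0.1, 0.15, 0.25, 0.4, 0.55,
                 0.7, 0.8, 0.85, 0.9, 0.95, 1.0]


def table_shaping(bits_to_remove):
    """Look up the shaping factor; 12 or more bits all map to 1.0."""
    index = min(max(int(bits_to_remove), 0), len(SHAPING_TABLE) - 1)
    return SHAPING_TABLE[index]
```

Compared with the linear 1/13-per-bit ramp, this curve stays lower for small bits-to-remove (quiet passages) and catches up around 7 bits and above.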
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1010
I thought that the problem of hiss you were encountering would be white noise, i.e. shaping too low?

Is full -shaping 1.0 better than -autoshape for the problem samples you identified (bruhns, bibilolo, badvilbel)?

And finally (as if you had nothing better to do) is the v0.9.1 -autoshape any better (if full shaping is better than autoshape v0.9.0)?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1011
David, I totally agree with your post. Well said, as usual. I'm excited for lossyWAV precisely because of its potential as a transcodable source. I'll do some listening tests in this area when I can find the time. This sums it up for me:

If I can have something which is half the bitrate (or less), sounds identical, and transcodes identically, then I have no need to keep the lossless version.


Not to mention that there isn't a problem sample found yet (at more defensive settings).

lossyWAV Development

Reply #1012
1. "Why not?" Well, Why? Really, if there's no difference, why? It's OCD-like behaviour.

2. I might not know the lossy format I will need in the future. Shall I create mp3, ogg, AAC, HE-AAC etc?

3. If I'm a radio station, it's the broadcast (FM, mp2, mp3, WMA, whatever) that's the "transcode" - I can hardly avoid that or make it at the same time as I rip the CD.


So, for me, the "transcodability" of the less aggressive lossyWAV modes is very important.


I'm in agreement. I'd like to use a safe & robust setting in lossyWAV (-1 or -2 perhaps) just as I'd happily pre-process my rips with Album Gain and simple dither (using wavgain or foobar2000) before losslessly compressing them. I'd treat either as an excellent quality source to keep on my hard drive, which I could tag properly and then robustly encode to conventional lossy formats as I need it. Such an archive occupies far less space than straight lossless in the case of those many modern dynamically ultra-compressed albums.

I acquire new playback devices from time to time, and may desire different formats to suit the storage capacity / battery life / format compatibility / gapless support available with each. (Pragmatism frequently leads me to stick to LAME VBR MP3s, however). Also, on occasions, I actually need to have a degree of dynamic compression (foo_vlevel) for soft background music from highly dynamic sources that would get lost entirely in places if I didn't use some volume levelling. This requires processing before encoding (unless we get frame-by-frame volume levelling in mp3gain style).

I wouldn't normally want to transcode lossyWAV -2 to lossyWAV -7, for example, but perhaps if I wanted low battery-drain and fairly good quality on the right device, I'd be willing to do so, pragmatically, (and I'd be tempted to name the file as .transcoded.lossy.flac or with a .lossy7t.flac extension or some such, just in case it should ever find its way back onto my PC).

It seems that I'm part of a minority in being willing to use 'safe' lossyFLAC or lossyWV in place of true lossless as my main PC storage and for generating lossy files pretty-much on the fly, for whatever external device I wish.
Dynamic – the artist formerly known as DickD

lossyWAV Development

Reply #1013
Is there currently any plan/work on dynamic noise shaping? That is, despite some obscure and ancient "patent" (which may/may not apply depending on license/age/obscurity/location/algo differences, and on that topic, imho, any software patent shouldn't last more than 5 years, much less 15)...

Currently (yes, I use tcpmp 0.72rc1 or the 0.8x builds floating around), 320kbps is my max, so I'm looking forward to just the right combination of settings (i.e. -7 with shaping, snr, nts) to produce such a file at that bitrate. 320kbps with noise shaping, whew. This is really heating up!
Copy Restriction, Annulment, & Protection = C.R.A.P. -Supacon

lossyWAV Development

Reply #1014
... It seems that I'm part of a minority in being willing to use 'safe' lossyFLAC or lossyWV in place of true lossless as my main PC storage and for generating lossy files pretty-much on the fly, for whatever external device I wish.

I also think like that. I'd love to have just 1 collection (not a lossless and a lossy one), and -1 or -2 and a good additional noise shaping is a very promising way to go.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1015
I thought that the problem of hiss you were encountering would be white noise, i.e. shaping too low?

Is full -shaping 1.0 better than -autoshape for the problem samples you identified (bruhns, bibilolo, badvilbel)?

And finally (as if you had nothing better to do) is the v0.9.1 -autoshape any better (if full shaping is better than autoshape v0.9.0)?

I'll try your proposals this weekend.
My considerations arise from my WavPack lossy experience. Before David Bryant introduced dynamic noise shaping I preferred to shift noise upwards. This eliminated ugly distortions with samples like keys, but it introduced the risk of audible hiss. This risk was very real when using settings in the 300 to 350 kbps range, especially with high values for the shift.
I think our situation is similar, and I think a strong shifting up should only be done when there's a high chance that the added hiss is masked. I think this is especially so as with the very aggressive settings quality control of our machinery has a weak basis in general and a very weak basis above ~3 kHz (though it works fine to an astonishing extent).
As for controlling the masking of HF hiss I can imagine a crude approach is sufficient.
The very first approach can be: use a shaping factor of 1 for very loud music, and a shaping factor of 0 for quiet music, but do it defensively, meaning: with music of medium loudness use a moderate shaping factor closer to 0 than to 1.
Bits to remove is a rough measure for the loudness of the music. So the noise shaping factor can be computed by something like:

noise shaping factor = 0                             for bits-to-remove <= 5
                     = (bits-to-remove - 5)^2 / 49   for bits-to-remove between 5 and 12
                     = 1                             for bits-to-remove >= 12

With this very crude approach of controlling the masking I think it's best to be very conservative.
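A minimal Python sketch of this piecewise rule, assuming the thresholds 5 and 12 given above (the function and parameter names are made up for illustration):

```python
def conservative_shaping(bits_to_remove, low=5, high=12):
    """0 up to `low` bits, 1 from `high` bits, quadratic in between.

    With low=5 and high=12 the divisor (high - low) ** 2 is the 49
    from the formula above.
    """
    if bits_to_remove <= low:
        return 0.0
    if bits_to_remove >= high:
        return 1.0
    return (bits_to_remove - low) ** 2 / (high - low) ** 2
```

The quadratic keeps the factor small for most of the range: halfway between the thresholds (8.5 bits) it is only 0.25, which is the "closer to 0 than to 1" behaviour asked for.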

A better hiss masking control (which needn't be that defensive) could be not to take into account the loudness of the music (or the number of bits to remove), but the HF energy of the input signal, something like the sum of all the bins in the 2 highest frequency zones of the FFT analyses (~8.2+ kHz) for all the 64 sample FFTs which make up for an entire 512 sample block.
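To make the HF-energy idea concrete, here is a simplified Python sketch. It assumes 44.1 kHz input, non-overlapping unwindowed 64-sample frames and a plain 8.2 kHz cutoff standing in for the "2 highest frequency zones"; the actual lossyWAV analysis may window and overlap its FFTs differently, so treat this as an illustration of the measure rather than the program's method. A naive DFT is used so that the example has no dependencies.

```python
import cmath
import math

FFT_LEN = 64     # short analysis length mentioned in the post
BLOCK_LEN = 512  # codec block length mentioned in the post


def dft_power(frame):
    """Power of DFT bins 0..N/2 of one frame (naive O(N^2) transform)."""
    n = len(frame)
    powers = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        powers.append(abs(acc) ** 2)
    return powers


def hf_energy(block, sample_rate=44100, cutoff_hz=8200.0):
    """Sum the bin powers above cutoff_hz over every 64-sample frame."""
    first_bin = math.ceil(cutoff_hz * FFT_LEN / sample_rate)
    total = 0.0
    for start in range(0, len(block) - FFT_LEN + 1, FFT_LEN):
        powers = dft_power(block[start:start + FFT_LEN])
        total += sum(powers[first_bin:])
    return total
```

A 512-sample block yields 8 frames here, and at 44.1 kHz the cutoff falls on bin 12 of the 64-point transform.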
lame3995o -Q1.7 --lowpass 17

 

lossyWAV Development

Reply #1016
I'll try your proposals this weekend.
My considerations arise from my WavPack lossy experience. Before David Bryant introduced dynamic noise shaping I preferred to shift noise upwards. This eliminated ugly distortions with samples like keys, but it introduced the risk of audible hiss. This risk was very real when using settings in the 300 to 350 kbps range, especially with high values for the shift.
I think our situation is similar, and I think a strong shifting up should only be done when there's a high chance that the added hiss is masked. I think this is especially so as with the very aggressive settings quality control of our machinery has a weak basis in general and a very weak basis above ~3 kHz (though it works fine to an astonishing extent).
As for controlling the masking of HF hiss I can imagine a crude approach is sufficient.
The very first approach can be: use a shaping factor of 1 for very loud music, and a shaping factor of 0 for quiet music, but do it defensively, meaning: with music of medium loudness use a moderate shaping factor closer to 0 than to 1.
Bits to remove is a rough measure for the loudness of the music. So the noise shaping factor can be computed by something like:

noise shaping factor = 0                             for bits-to-remove <= 5
                     = (bits-to-remove - 5)^2 / 49   for bits-to-remove between 5 and 12
                     = 1                             for bits-to-remove >= 12

With this very crude approach of controlling the masking I think it's best to be very conservative.

A better hiss masking control (which needn't be that defensive) could be not to take into account the loudness of the music (or the number of bits to remove), but the HF energy of the input signal, something like the sum of all the bins in the 2 highest frequency zones of the FFT analyses (~8.2+ kHz) for all the 64 sample FFTs which make up for an entire 512 sample block.
So, instead of just calculating the minimum / average of each FFT output for the whole range, 20Hz > 16kHz, I could calculate a minimum / average for each different portion of the spreading frequency list. In this way, the relative outputs in each sub-range could be compared and if the high frequency range was low then apply less shaping as you have already suggested.

[edit] As an aside, I thought that we were getting close to the "end" with respect to v1.0.0, so the release numbers have been climbing rapidly. As we are in (yet another!) potentially fairly fast transitionary period, I will be appending b > z to the beta releases to give me more "time" before v1.0.0.... [/edit]

[edit2] If I was going to really push the processing-time-per-codec-block requirement, I could also carry out a 512 sample FFT on the correction data, i.e. quantization noise, and see where the quantization noise has actually gone.... [/edit2]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1017
So, instead of just calculating the minimum / average of each FFT output for the whole range, 20Hz > 16kHz, I could calculate a minimum / average for each different portion of the spreading frequency list. In this way, the relative outputs in each sub-range could be compared and if the high frequency range was low then apply less shaping as you have already suggested. ...

I do not understand the minimum / average approach.

I think what I have in mind is something else: compute the input signal's HF energy of a block as the sum of all the bins in the 2 highest frequency zones (~8.2+ kHz) of all the 64 sample FFTs which cover the block.
Compare this HF energy to predefined energy levels which tell about the noise shaping factor.
For the predefined energy levels:
Look at the HF energy (computed the same way) of the bibilolo start (at the seconds I mentioned in my last test report) and use a noise shaping factor of 0 for this energy level.
On the other end take loud music with a high amount of HF (for instance 'Living in the future'), and use the computed energy level as a measure for using a noise shaping of 1.
In production, when the energy level lies between these two extremes, use a quadratic function of the form (HF-a)^2/b for interpolation to get the noise shaping value.
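One possible reading of the (HF-a)^2/b interpolation, sketched in Python: take a as the HF energy measured on the quiet reference (the bibilolo start) and b as the squared distance to the loud reference, so that the curve reaches exactly 1 at the loud level. That choice of a and b is an assumption, as are the names used; the reference energies would have to be measured with the same HF-energy routine used in production.

```python
def shaping_from_hf(hf, hf_quiet, hf_loud):
    """Map a block's HF energy to a shaping factor in [0, 1].

    hf_quiet: HF energy of the quiet reference  -> factor 0
    hf_loud:  HF energy of the loud reference   -> factor 1
    Quadratic in between, with a = hf_quiet and b = (hf_loud - hf_quiet)**2.
    """
    if hf_loud <= hf_quiet:
        raise ValueError("hf_loud must be greater than hf_quiet")
    if hf <= hf_quiet:
        return 0.0
    if hf >= hf_loud:
        return 1.0
    return (hf - hf_quiet) ** 2 / (hf_loud - hf_quiet) ** 2
```

Whether the energies are compared on a linear or a logarithmic (dB) scale is left open in the post; the sketch works with either as long as all three values use the same scale.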
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1018
[edit2] If I was going to really push the processing-time-per-codec-block requirement, I could also carry out a 512 sample FFT on the correction data, i.e. quantization noise, and see where the quantization noise has actually gone.... [/edit2]

Why?
Assuming the unfiltered quantization noise acts like a memoryless source of random numbers with a rectangular probability density, the noise power you'll get after shaping can be computed directly with the help of the noise transfer function N. For a frequency f in radians, f = Hz*(2pi/fs), where fs = sampling_frequency_in_Hz, set z = cos(f) + i*sin(f) and compute |N(z)*2^{bits2remove}|, which is proportional to the amplitude spectral density of the filtered noise.
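The computation described here can be sketched in a few lines of Python. The sketch assumes the shaping filter is an FIR error-feedback filter with N(z) = 1 + c1*z^-1 + c2*z^-2 + ...; that form and the coefficients used in the example are illustrative assumptions, not lossyWAV's actual filter.

```python
import math


def shaped_noise_amplitude(coeffs, bits_to_remove, freq_hz, sample_rate=44100):
    """Evaluate |N(z) * 2**bits_to_remove| on the unit circle.

    coeffs: hypothetical taps c1, c2, ... of N(z) = 1 + sum(c_k * z**-k).
    The result is proportional to the amplitude spectral density of the
    shaped quantization noise at freq_hz.
    """
    f = 2 * math.pi * freq_hz / sample_rate   # frequency in radians
    z = complex(math.cos(f), math.sin(f))     # z = cos(f) + i*sin(f)
    n_of_z = 1 + sum(c * z ** (-(k + 1)) for k, c in enumerate(coeffs))
    return abs(n_of_z) * 2 ** bits_to_remove
```

For example, with the single tap [-1.0], N(z) = 1 - z^-1 is a first-order high-pass: the noise amplitude is zero at DC and doubled at the Nyquist frequency, while an empty tap list gives flat (white) noise at 2^bits.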

Usually a psychoacoustic codec determines the amount of tolerable noise in specific time/frequency regions. Seeing "spreading function" popping up here, I assume you're actually doing that computation. For a "codec block" the result of this computation would be a curve describing the spectral power density of the tolerable noise. Then you could try to find the parameters 's' and 'b' for the curve |N(z*s)*2^{b}| so it's still under the tolerable noise curve but maximizes b -- the number of bits to remove. 's' here is the shaping strength parameter.

My 2 cents on optimizing the number of bits to remove and the shaping strength,
SG

lossyWAV Development

Reply #1019
Also, on occasions, I actually need to have a degree of dynamic compression (foo_vlevel) for soft background music from highly dynamic sources that would get lost entirely in places if I didn't use some volume levelling. This requires processing before encoding (unless we get frame-by-frame volume levelling in mp3gain style).
OT: That's possible, but no one has implemented it. It wouldn't be as good or flexible as a separate DRC, but it would often be better than transcoding an mp3 to another mp3.


[edit] As an aside, I thought that we were getting close to the "end" with respect to v1.0.0, so the release numbers have been climbing rapidly. As we are in (yet another!) potentially fairly fast transitionary period, I will be appending b > z to the beta releases to give me more "time" before v1.0.0.... [/edit]
IMO (though others may disagree strongly) your first "stable" release should be without noise shaping.

Also IMO (and again, others may disagree) the more you base your noise shaping on the input signal, the closer you get to that Sony patent.

If fixed noise shaping doesn't buy you much, you should definitely get a stable and (as far as I know) patent-free release out there before playing with noise shaping any more.

If nothing else, having a "stable" release is going to get you a lot more testers! (I would hope!).

Cheers,
David.

lossyWAV Development

Reply #1020
IMO (though others may disagree strongly) your first "stable" release should be without noise shaping.
...
If nothing else, having a "stable" release is going to get you a lot more testers! (I would hope!). ...

As for the current state I also don't see a big advantage of noise shaping.
But we're already talking about things which are very promising, because with loud and HF-rich music (aka most pop/rock music) noise shifting can really lead to very high quality at a rather moderate bitrate. I guess this is an attractive feature for many users. Sure, we're moving into the world of psychoacoustics here, but to a much lesser degree than transform codecs do.

And look at the initial purpose we're struggling at with -2 and -1 where we don't rely on any kind of psy model to assure quality. With a good noise shaping we get a near noiseless frequency range of the fundamentals and a controlled quality in the HF region. To me this is very attractive and we should rather wait a bit yet until final release.

Of course there shouldn't be any patent problems. But is there really an issue with the proposals done so far?
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1021
As we are in (yet another!) potentially fairly fast transitionary period, I will be appending b > z to the beta releases to give me more "time" before v1.0.0....

How about 0.10.1    0.11.1 etc. 
Although lossyWav can be considered beta, the noiseshaping functions might be considered alpha (debatable).
May I ask what the result will be when the "optimum" noise shaping is found: better quality at the same bitrate, or lower bitrate at the same quality? Anything else seems not useful.
In theory, there is no difference between theory and practice. In practice there is.

lossyWAV Development

Reply #1022
As we are in (yet another!) potentially fairly fast transitionary period, I will be appending b > z to the beta releases to give me more "time" before v1.0.0....
How about 0.10.1    0.11.1 etc. 
Although lossyWav can be considered beta, the noiseshaping functions might be considered alpha (debatable).
May I ask what the result will be when the "optimum" noise shaping is found: better quality at the same bitrate, or lower bitrate at the same quality? Anything else seems not useful.
Yes, I can use 0.10.0, etc. - so I will.

Noise shaping makes the processed data less predictable for the lossless codec, thus increasing bitrate. However, its use can allow more aggressive settings to be used before the results are noise shaped.

David: I'm inclined to agree with you - v1.0.0 should be issued with noise shaping code removed.

Horst: from beta v1.0.1, I would expect to improve noise shaping and its application in lossyWAV.

Sebastian: Your understanding of applied mathematics far exceeds mine - I'm not sure what you're getting at.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

lossyWAV Development

Reply #1023
...
David: I'm inclined to agree with you - v1.0.0 should be issued with noise shaping code removed.

Horst: from beta v1.0.1, I would expect to improve noise shaping and its application in lossyWAV.
...

Sounds like a promising road map.
lame3995o -Q1.7 --lowpass 17

lossyWAV Development

Reply #1024
Sebastian: Your understanding of applied mathematics far exceeds mine - I'm not sure what you're getting at.
Simplistically, that the quantisation noise has gone exactly where you've put it in a predictable way - you don't need to check - unless something is broken.

Of course, it's easy to break something, and useful to have that checking code in there for debugging. It also saves having to think the theory through!

Cheers,
David.