
lossyWAV 1.3.0 Development Thread

Reply #150
I just tried 1.2.2p with furious.

Using -Z --adaptive, 1.2.2p is clearly better than 1.2.2m to me. It's ABXable against the original as was expected, but I wouldn't call the result bad.

Using -Z --adaptive --altpreset I ABXed 1.2.2p 10/10 against the original in the 1.8...2.7 second range.
As usual for a direct comparison I also tried -Z --altpreset, and today I could also ABX it against the original 10/10 in the 1.8...2.7 second range.
Obviously my sensitivity varies (I don't think the non-noise shaping machinery has changed).
lame3995o -Q1.7 --lowpass 17

Reply #151
The existing bit-removal and fixed noise-shaped bit-removal processes have not changed at all.

I'm glad that 1.2.2p has given an improvement over 1.2.2m in relation to the hiss encountered in Furious.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #152
lossyWAV beta 1.2.2q attached to post #1 in this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #153
I just tried 1.2.2q with furious.
-Z -A isn't hard to ABX (no surprise). It sounds different from the previous version but I wouldn't call it better (or worse).
-Z -A -t wasn't transparent either. As usual I focused on second 1.8...2.7 and ABXed 9/10 against the original.

Sorry: no improvement IMO.
lame3995o -Q1.7 --lowpass 17

Reply #154
Thanks very much for the testing - I'll take it apart and see if I can think of any improvements.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #155
To simplify the code, I am thinking of removing the compatibility with libfftw3f-3.dll (single-precision) and reverting to double precision for all calculations. Does anyone have any violent objections to this proposal?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #156
To simplify the code, I am thinking of removing the compatibility with libfftw3f-3.dll (single-precision) and reverting to double precision for all calculations. Does anyone have any violent objections to this proposal?

I personally love things becoming simpler. Can't see a good reason for using single precision as lossyWav is pretty fast with double precision.
lame3995o -Q1.7 --lowpass 17

Reply #157
lossyWAV beta 1.2.2r attached to post #1 in this thread.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #158
I will try it when I'm back from holiday.
lame3995o -Q1.7 --lowpass 17

Reply #159
There is a known problem within foobar2000 (although more likely to do with cmd.exe itself) when running an executable within the cmd.exe command line from a path which includes spaces. The suggested fix for this is to enclose the element of the path which contains spaces within double quotation marks ("), e.g. c:\"program files"\directory_where_executable_is\executable_name

Isn't it possible to just put the whole path into ""? Like "c:\program files\directory_where_executable_is\executable_name"? IMHO this should be possible and the easiest way as you just can say "<path>".
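
A minimal sketch of the whole-path quoting suggested here, assuming the command line is being assembled by a script rather than typed by hand. The path is the placeholder from the post and input.wav is a made-up argument. Python's subprocess.list2cmdline applies the Microsoft C runtime quoting rules, so an element containing spaces ends up wrapped in one pair of double quotes:

```python
import subprocess

# Placeholder path from the post above; "input.wav" is a hypothetical argument.
exe = r"c:\program files\directory_where_executable_is\executable_name"
args = [exe, "input.wav"]

# Elements that contain spaces are wrapped in double quotes as a whole;
# elements without spaces are left untouched.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
```

When launching the program from a script, passing the list itself to subprocess.run avoids manual quoting altogether.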

Reply #160
@halb27 - although you may want to wait until beta 1.2.2s is posted - I have had an idea which I am trying to implement that may improve things.

@Big_Berny - if the executable is on the path then you don't need to specify any path to it, just the executable name.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #161
@Big_Berny - if the executable is on the path then you don't need to specify any path to it, just the executable name.

I know. I just wanted to say that you can put the whole path into "" if there are spaces in it. It's easier to make a batch/script then.

Reply #162
@halb27 - although you may want to wait until beta 1.2.2s is posted - I have had an idea which I am trying to implement that may improve things.

OK, I'll wait until 1.2.2s is out.
lame3995o -Q1.7 --lowpass 17

Reply #163
Thanks for your patience.

lossyWAV beta 1.2.2s attached to post #1 in this thread.

[edit] Suggest -A 96 or greater (default=32) [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #164
Thank you, Nick, for the new version.

I tried furious with 1.2.2s -Z --adaptive 96, and - as expected - it is rather easy to ABX.
I tried -q 1.0 --adaptive 96, and ABXing was already difficult for me.

But IMO we shouldn't focus as much on furious as I have been doing. While on holiday (walking the wonderful Cotswold Way BTW) I was thinking about what can be expected from adaptive noise shaping. The big picture with adaptive noise shaping was addressed in this thread, but without a final result. IMO it is important to discuss this.

Before adaptive noise shaping came up we always targeted universal transparency, as this is what is best in line with the lossyWAV principle. We even did so with the portable quality level, though we allowed it to be on the very cutting edge.

With adaptive noise shaping I now think we shouldn't do that, because I think there will always be artificial samples where adaptive noise shaping can't help, which brings us back to the average bitrates we're used to without noise shaping. For those who use lossyWAV preprocessing as an efficient alternative to pure lossless encoding, adaptive noise shaping isn't interesting a priori. Adaptive noise shaping is most interesting for those who want to use lossyWAV-preprocessed lossless as an alternative to very high bitrate mp3 or aac (or similar). To make this usage of lossyWAV attractive, it should compare favorably in some respect. I suggest the following targets:

- no problems with temporal resolution (as opposed to pre-echo problems) in a universal way
- no other kind of artefacts in a universal way
or in other words:
- only a tiny amount of audible added hiss is allowed on problem samples, and added hiss should be inaudible with natural music.

This should be achieved at an attractive average bitrate, say roughly 300 kbps or below.

furious is an artificial sample, so we shouldn't care much about a small amount of added hiss.
However, the quality level -Z --adaptive also shows a certain amount of what we've called added clicks, which can be classified as artefacts. Maybe it's possible to improve upon that. With -q 1.0 these artefacts are next to nothing.

What matters most is listening tests with non-artificial music on a broad scale. I will try that. It would be nice, however, if some more members could contribute so that we get a broader base of experience.
lame3995o -Q1.7 --lowpass 17

Reply #165
I finally did a little, very casual test without headphones.

Serioustrouble - very big improvement - cannot abx Q0 -A but -Q is very easy.
Dear Sir - Maybe a little bit better but still easy to abx.
Mandylion - 5/10 Q0 -A vs 8/10 Q0
Sarah Mac - ice - Both are the same.

In general -A can be a good thing, but I need a much more thorough test with headphones.

Reply #166
First of all: thanks to everyone doing listening tests.
For those who use lossyWAV preprocessing as an efficient alternative to pure lossless encoding, adaptive noise shaping isn't interesting a priori. Adaptive noise shaping is most interesting for those who want to use lossyWAV preprocessed lossless as an alternative to very high bitrate [lossy].

This will depend on what ANS brings us. If it turns out that the same level of transparency (or headroom thereof) can be achieved with maybe 50 kbps less when using ANS, that would appeal to the "efficient alternative to lossless" users too.

Adaptive Noise Shaping only makes sense if it brings an improvement in either lower bitrates at the same quality or higher quality at the same bitrates. These issues are most relevant around the "just transparent" border that could be --portable.

Nick is bringing an interesting experiment to see if Adaptive Noise Shaping can be turned to our advantage.
In theory, there is no difference between theory and practice. In practice there is.

Reply #167
@shadowking:
Does 'Dear Sir' originate from natural instruments or is it artificial music? Is the error added hiss or kind of an artefact?
Is 'Sarah Mac - ice' transparent or not, and if not: is it non-artificial music, and is the error hiss or artefact?
lame3995o -Q1.7 --lowpass 17

Reply #168
Bitrate results for my 55 problem sample set:
Code:
+-------+----------+----------+----------+----------+----------+----------+
|Version| Settings | FLAC -5  |--insane  |--extreme |--standard|--portable|
+-------+----------+----------+----------+----------+----------+----------+
|       | default  | 654kbit/s| 584kbit/s| 510kbit/s| 427kbit/s| 325kbit/s|
| beta  +----------+----------+----------+----------+----------+----------+
|       |--shaping | 658kbit/s| 589kbit/s| 515kbit/s| 431kbit/s| 325kbit/s|
|1.2.2s +----------+----------+----------+----------+----------+----------+
|       |--adaptive| 654kbit/s| 584kbit/s| 509kbit/s| 426kbit/s| 323kbit/s|
+-------+----------+----------+----------+----------+----------+----------+
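
To put the table in relative terms, here is a small sketch that expresses each preset's bitrate as a percentage of the FLAC -5 figure in the same row. The numbers are copied from the table; the function name and dictionary layout are only illustrative.

```python
# Bitrates in kbit/s, copied from the table above (beta 1.2.2s, 55 problem samples).
rows = {
    "default":    {"FLAC -5": 654, "--insane": 584, "--extreme": 510,
                   "--standard": 427, "--portable": 325},
    "--shaping":  {"FLAC -5": 658, "--insane": 589, "--extreme": 515,
                   "--standard": 431, "--portable": 325},
    "--adaptive": {"FLAC -5": 654, "--insane": 584, "--extreme": 509,
                   "--standard": 426, "--portable": 323},
}

def percent_of_lossless(row):
    """Return each preset's bitrate as a percentage of the row's FLAC -5 bitrate."""
    lossless = row["FLAC -5"]
    return {preset: round(100 * kbps / lossless, 1)
            for preset, kbps in row.items() if preset != "FLAC -5"}

for setting, row in rows.items():
    print(setting, percent_of_lossless(row))
```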
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #169
Sorry it took me so long, but finally I've managed to finish my listening test of 120 tracks of regular music from my collection of various genres (pop, classic, singer/songwriter music, instrumental music), and of 13 problem samples.
With adaptive noise shaping, lossyWAV is so good even at low bitrates that the test wasn't much fun, and I allowed myself a lot of breaks.

First I decided about the setting to use. For quality below standard it makes most sense to me to use the altpreset quality scheme (more efficient due to the slightly lower analysis limit, but more defensive otherwise). In case we assume adaptive noise shaping is useful we should also use it for quality below standard IMO. These considerations may or may not apply for standard and above (and are a matter of taste anyway) - I ignore this for the moment.

With our quality steps of 2.5 this means quality levels below standard as of:

--portable (-q 2.5 --altpreset --adaptive 96): 373 kbps (on average for my test set of regular music)
--superportable [my suggestion] (-q 0 --altpreset --adaptive 96): 321 kbps
--maxportable [my suggestion] (-q -2.5 --altpreset --adaptive 96): 280 kbps

For my test I used -q -2.5 --altpreset --adaptive 96.

In short I was very pleased with the quality of this setting. I often thought I heard a problem, but when ABXing I nearly always found that the 'problem' was with the original as well.

In the end I found just the following cases of intransparency among the real world tracks:

Sweet Old World: The sibilants are rougher here, but it's not obvious at all (8/10).
Ta douleur: The French 'r' is rougher. I'm sure it's not transparent though I only got 7/10 (repeatedly).
Köln Concert: Temporal smearing when listening with my electrostatic headphones (8/10), but I couldn't hear the problem with my Alessandro MS2 (modified Grado 325). I should mention I'm not sensitive to temporal smearing, so this may be more of a problem to those who are.
Tout le monde: I'm not quite sure about this sample. It sounded rougher to me on my electrostatics (but only 7/10), but I couldn't hear the issue with my MS2.
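
For context on scores like 7/10 and 8/10, here is a short sketch of the usual one-sided binomial calculation: the chance of getting at least that many trials right by pure guessing. This is standard ABX arithmetic, not something computed in the post, and the function name is made up for illustration.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials` by guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

for score in (7, 8, 9, 10):
    print(f"{score}/10: p = {abx_p_value(score, 10):.3f}")
# 7/10: p = 0.172
# 8/10: p = 0.055
# 9/10: p = 0.011
# 10/10: p = 0.001
```

By this measure a single 7/10 run is weak evidence on its own, which is why repeating the run, as was done for Ta douleur, matters.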

As for the problem samples I first listened to furious again. Now I think intransparency is only due to hiss, not kind of an artefact as I thought before. The hiss just sounds like a bit of 'wobbling' because there is kind of a modulation in the signal.
I also found the following problem samples not transparent:
herding calls: imprecision (quite similar to mp3 behavior), not very obvious though in spite of my 9/10.
eig: slight temporal smearing when listening with my electrostatics (only 7/10), not confirmed when using my MS2. Keep in mind I'm not good at ABXing temporal smearing.

So all in all extremely rare spots of minor intransparency to me.

As the next step I'll try different --adaptive parameters for the identified problems to see if this will improve things. I'm also interested to see how -q 0 --altpreset --adaptive 96 will perform on the problematic spots.
lame3995o -Q1.7 --lowpass 17

Reply #170
Wow! That's quite a test program to have undertaken. I am amazed that --maxportable (-q -2.5 --altpreset --adaptive 96) has proven to be so robust in practice.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #171
Hi Nick, all,

I gave the latest 1.2.2 a try today and noticed that on many files, for -I down to -P, more noise tends to be added (i.e. LSBs removed) on the right channel than on the left channel, even if the input file has the same energy/spectral shape in both channels. Try it with SQAM track 2, for example. There I get about 2 dB more noise (SNR reduction) on the right channel with -P, no noise shaping.
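
For anyone wanting to reproduce this kind of comparison, a rough sketch of a per-channel SNR measurement follows. It uses a synthetic tone, not SQAM track 2, and drop_lsbs is a crude stand-in for bit removal (plain rounding to a coarser step), not lossyWAV's actual processing. The bit counts of 3 and 4 are arbitrary and only serve to make the two channels differ.

```python
import math

def snr_db(original, processed):
    """Signal-to-noise ratio in dB, treating (original - processed) as the noise."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, processed))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

def drop_lsbs(samples, bits):
    """Crude stand-in for bit removal: round each sample to a multiple of 2**bits."""
    step = 1 << bits
    return [round(s / step) * step for s in samples]

# Identical 440 Hz tone on both channels (0.1 s at 44.1 kHz).
tone = [round(10000 * math.sin(2 * math.pi * 440 * n / 44100)) for n in range(4410)]

# Hypothetical outcome: one more bit removed on the right channel than on the left.
left, right = drop_lsbs(tone, 3), drop_lsbs(tone, 4)
snr_left, snr_right = snr_db(tone, left), snr_db(tone, right)
print(f"left: {snr_left:.1f} dB, right: {snr_right:.1f} dB")
```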

Chris
If I don't reply to your reply, it means I agree with you.

Reply #172
Chris,

Many thanks for taking the time to try lossyWAV - and also for highlighting what appears to be a (significant) bug.

Nick.

[edit] Were you testing main quality scale or altpreset and did you use any noise-shaping? [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #173
[edit] Were you testing main quality scale or altpreset and did you use any noise-shaping? [/edit]

I used a minimal command line: lossyWAV.exe input.wav -P. No noise shaping.

If I may ask: How does LossyWAV in non-noise-shaping mode determine when to chop off LSBs? Simply based on the amplitude or energy of the current frame, or something more complicated?

Chris
If I don't reply to your reply, it means I agree with you.

Reply #174
If I may ask: How does LossyWAV in non-noise-shaping mode determine when to chop off LSBs? Simply based on the amplitude or energy of the current frame, or something more complicated?
Each codec block has a number of overlapping FFT analyses carried out on it at two or more lengths (approx 1.5 msec equivalent and 10 msec equivalent and others in between).

The results from each are "skewed" using a curve that reduces the FFT bin results by 36dB at the lowest bin and 0dB at approx 3.4kHz.
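
A sketch of what such a skewing curve could look like. Only the two end points (36dB at the lowest bin, 0dB at approx 3.4kHz) come from the description above. The shape in between is an assumption (linear in log-frequency here), as are the 1024-sample FFT length, the 44.1 kHz sample rate and the choice of bin 1 as the "lowest bin".

```python
import math

SKEW_DB = 36.0        # reduction at the lowest bin (from the description above)
SKEW_TOP_HZ = 3400.0  # frequency at which the reduction reaches 0 dB (approximate)

def skew_db(freq_hz, lowest_hz):
    """Adjustment in dB (<= 0) for a bin at freq_hz.

    Assumption: the curve is linear in log-frequency between lowest_hz and
    SKEW_TOP_HZ; the real curve shape is not stated in the thread.
    """
    if freq_hz >= SKEW_TOP_HZ:
        return 0.0
    freq_hz = max(freq_hz, lowest_hz)
    span = math.log(SKEW_TOP_HZ / lowest_hz)
    return -SKEW_DB * (math.log(SKEW_TOP_HZ / freq_hz) / span)

# Hypothetical analysis parameters for illustration.
sample_rate, fft_length = 44100, 1024
bin_hz = sample_rate / fft_length  # width of one FFT bin in Hz

for bin_index in (1, 4, 16, 64, 128):
    f = bin_index * bin_hz
    print(f"bin {bin_index:3d} ({f:7.1f} Hz): {skew_db(f, bin_hz):6.1f} dB")
```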

In the frequency range of interest (default 20Hz to 16kHz), the lowest result is then found, along with the average (using a pair of simple spreading systems). These are modified by parameters dictated by the selected quality setting.

For each FFT length there is a lookup table to translate the chosen lowest result from each FFT analysis of that length to a number of bits-to-remove.

The lowest allowable bits-to-remove value is chosen between all analyses at all FFT lengths for each channel.
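
The selection step can be sketched as follows. The lookup tables themselves are not given in this thread, so bits_from_level is a hypothetical stand-in that assumes roughly 6.02 dB per bit and a per-FFT-length offset. All the numbers in the example are invented.

```python
import math

def bits_from_level(min_level_db, offset_db):
    """Hypothetical stand-in for the per-FFT-length lookup table.

    Assumes each removed bit raises the added noise by about 6.02 dB.
    """
    return max(0, math.floor((min_level_db - offset_db) / 6.02))

def bits_to_remove(analyses, offsets):
    """Pick the lowest bits-to-remove over all analyses at all FFT lengths.

    analyses: {fft_length: [lowest (skewed) result in dB for each overlapping analysis]}
    offsets:  {fft_length: offset in dB} (illustrative parameters, not lossyWAV's)
    """
    return min(bits_from_level(level, offsets[fft_length])
               for fft_length, levels in analyses.items()
               for level in levels)

# Invented numbers for one channel of one codec block.
analyses = {64: [30.0, 42.0], 1024: [25.0, 50.0]}
offsets = {64: 0.0, 1024: 6.0}
print(bits_to_remove(analyses, offsets))  # 3
```

Taking the minimum means the most demanding analysis in the block decides, so no analysis is given more added noise than its own result allows.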

If channels are not linked then the number of bits removed from the audio data is independent per channel.

There is a clip detection mechanism in place to stop more than <n> clips occurring per codec-block-channel. When the limit is exceeded, the number of bits-to-remove is reduced by one and the bit-removal process is repeated. Some codec-block-channels will not have any bits removed due to clipping.
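
The retry loop described above can be sketched like this. The rounding method, the 16-bit range and the clamping of the few permitted clips are assumptions for illustration, not the actual lossyWAV code.

```python
def remove_bits(samples, bits, max_clips, sample_bits=16):
    """Round samples to multiples of 2**bits, backing off one bit at a time
    while more than max_clips rounded samples fall outside the sample range.

    Returns (processed_samples, bits_actually_removed).
    """
    lo = -(1 << (sample_bits - 1))
    hi = (1 << (sample_bits - 1)) - 1
    while bits > 0:
        step = 1 << bits
        rounded = [round(s / step) * step for s in samples]
        clips = sum(1 for r in rounded if r < lo or r > hi)
        if clips <= max_clips:
            # Assumption: the few clips that are allowed are simply clamped.
            return [min(max(r, lo), hi) for r in rounded], bits
        bits -= 1  # too many clips: try again with one bit fewer
    return list(samples), 0  # no bits can be removed from this block

# A block that peaks at full scale ends up with no bits removed at all.
print(remove_bits([32767, 32766, 100, -200], bits=4, max_clips=0))
# ([32767, 32766, 100, -200], 0)
```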
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)