lossyWAV 1.2.0 Development Thread

2008-08-25 12:16:05

Following the release of lossyWAV 1.1.0b, I feel it is (again) time to kick off development of the next minor release.

Items currently on the list for inclusion in 1.x.0:

1.32.0: Implementation of SG's new noise shaping method;
1.2.0: ~~Checking of S (=L-R) channel for matrix surround content;~~
1.2.0: ~~Revisit the spreading function;~~

If you have any ideas, suggestions, code optimisations, etc., please post them here.

Link to the hydrogenaudio wiki article

lossyFLAC resultant bitrates:

Code:

10 Album Test Set
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|Version| Settings | FLAC -5  |--insane  |--extreme |--standard|--portable|  --zero  | --nasty  | --awful  |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.0.0b| default  | 854kbit/s| 626kbit/s| 539kbit/s| 452kbit/s| 365kbit/s| 295kbit/s| -------- | -------- |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.1.0c| default  | 854kbit/s| 632kbit/s| 548kbit/s| 463kbit/s| 376kbit/s| 285kbit/s| -------- | -------- |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.2.0 | default  | 854kbit/s| 627kbit/s| 544kbit/s| 460kbit/s| 376kbit/s| 288kbit/s| -------- | -------- |
|v1.2.0 |    -t    | 854kbit/s| 582kbit/s| 514kbit/s| 450kbit/s| 385kbit/s| 341kbit/s| 310kbit/s| 283kbit/s|
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+

55 Problem Sample Set
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|Version| Settings | FLAC -5  |--insane  |--extreme |--standard|--portable|  --zero  | --nasty  | --awful  |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.0.0b| default  | 780kbit/s| 655kbit/s| 582kbit/s| 503kbit/s| 417kbit/s| 330kbit/s| -------- | -------- |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.1.0c| default  | 780kbit/s| 654kbit/s| 583kbit/s| 508kbit/s| 425kbit/s| 321kbit/s| -------- | -------- |
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
|v1.2.0 | default  | 780kbit/s| 654kbit/s| 585kbit/s| 510kbit/s| 427kbit/s| 325kbit/s| -------- | -------- |
|v1.2.0 |    -t    | 780kbit/s| 623kbit/s| 565kbit/s| 506kbit/s| 441kbit/s| 391kbit/s| 354kbit/s| 322kbit/s|
+-------+----------+----------+----------+----------+----------+----------+----------+----------+----------+

Suggested foobar2000 converter setup:

lossyFLAC:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.flac
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\flac - -b 512 -5 -f -o%d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyTAK:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.tak
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\takc -e -p2m -fsl512 -ihs - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyWV:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.wv
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\wavpack -hm --blocksize=512 --merge-blocks -i - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

There is a known problem within foobar2000 (although more likely to do with cmd.exe itself) when running an executable within the cmd.exe command line from a path which includes spaces. The suggested fix for this is to enclose the element of the path which contains spaces within double quotation marks ("), e.g. c:\"program files"\directory_where_executable_is\executable_name
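As a rough illustration of the quoting rule above, here is a minimal Python sketch that wraps only the path elements containing spaces in double quotation marks and then assembles a Parameters string shaped like the lossyFLAC example. The helper names `quote_spaced_elements` and `build_parameters` are hypothetical; the flags are copied from the setup above rather than taken from any tool documentation.

```python
def quote_spaced_elements(path: str) -> str:
    """Wrap each backslash-separated path element that contains a space in double quotes."""
    parts = path.split("\\")
    return "\\".join(f'"{p}"' if " " in p else p for p in parts)


def build_parameters(lossywav: str, flac: str, preset: str = "--standard") -> str:
    """Assemble a cmd.exe Parameters string in the shape of the lossyFLAC example."""
    return (
        f"/d /c {quote_spaced_elements(lossywav)} - {preset} --silent --stdout"
        f"|{quote_spaced_elements(flac)} - -b 512 -5 -f -o%d"
    )


# Paths are the ones used in the examples above; adjust them to your own installation.
print(build_parameters(r"c:\program files\bin\lossywav", r"c:\program files\bin\flac"))
```

Quoting only the affected element, rather than the whole path, mirrors the form used in the converter settings above.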

Change log 1.2.0: 16/12/09
Code optimisation;
Removal of negative -q values in default mode. Quality range for --altpreset remains -4 to 10 (--quality -4 --altpreset == --quality 0 --limit 15159).

Change log 1.1.5c: 21/11/09
Minor revision to internal setting for --altpreset.

Change log 1.1.5b: 20/11/09
Major revision to internal setting for --altpreset.

Change log 1.1.5a: 18/11/09
Bugfix: Correction to high sample-rate processing.

Change log 1.1.4s: 07/11/09
Bugfix: manual --limit setting not working as it should.

Change log 1.1.4r: 03/11/09
Bugfix: shaping in altpreset mode was artificially limited to 50% (only affected -q 6.5 and above).

Change log 1.1.4q: 02/11/09
Reversion to use of previous noise pre-calculated constant;
Shaping now OFF by default. To enable shaping use -s or --shaping, without a parameter for automatic shaping or with a value 0<=n<=1 for user specified shaping.

Change log 1.1.4p: 22/10/09
Mutual exclusivity of shaping, hilimit and altpreset removed;
Added noise pre-calculated constant removed in favour of improved derived formula;
--altpreset parameter now also -t.

Change log 1.1.4n: 27/09/09
Mutual exclusivity of shaping, hilimit and altpreset corrected.

Change log 1.1.4m: 26/09/09
--postanalyse function removed;
--limit changed to --hilimit and --lolimit;
--altpreset parameter introduced which changes default behaviour for shaping and hilimit.
[shaping = 0.5*(max(0,q/10) + max(0,q/10)^2.584962); -q 0 = 0; -q 5 = 0.3333; -q 10 = 1]
[hilimit = round((14000 + 2000 * max(0,q/10)) / samplerate * 64) * (64/samplerate)]
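The automatic shaping formula in the 1.1.4m entry can be checked numerically. The following is a minimal sketch, not lossyWAV's actual code; the function name `auto_shaping` is hypothetical, and only the formula and the three reference points (-q 0, -q 5, -q 10) come from the changelog.

```python
def auto_shaping(q: float) -> float:
    """Shaping amount for quality q, following the 1.1.4m changelog formula."""
    x = max(0.0, q / 10.0)          # negative quality values are clamped to zero
    return 0.5 * (x + x ** 2.584962)


for q in (0, 5, 10):
    print(q, round(auto_shaping(q), 4))
```

The exponent 2.584962 is approximately log2(6), which is why -q 5 lands on one third: 0.5 * (0.5 + 1/6).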

Change log 1.1.4k: 24/08/09
--postanalyse function modified to use existing spreading function.

Change log 1.1.4j: 23/08/09
--limit lower range changed to 10000Hz.

Change log 1.1.4h: 22/08/09
--limit lower range changed to 14500Hz.

Change log 1.1.4g: 20/08/09
--maxsnr removed. -p or --postanalyse parameter implemented. Using this parameter checks the noise level of the correction data and compares to the low value derived from the associated source audio. If the correction noise (i.e. that of the difference signal) is greater than the source audio low value then the bits_to_remove value is reduced for the codec-block until the added noise is lower. Code further tidied. -F or --fftw parameter removed as FFTW dll is now automatically used if found (slight speed-up makes this the fastest way to go). Stack error fixed which occurs when libfftw3-3.dll v3.2.2 is used (newly released).

Change log 1.1.4f: 24/07/09
Bug in --maxsnr parameter fixed. Bug in pure Delphi compile fixed.

Change log 1.1.4e: 22/07/09
Major code redevelopment - more units, hopefully clearer. New parameter: -Y, --maxsnr <n> which allows specification of difference between maximum FFT result and added noise. Maxsnr works with both default spreading and --sortspread. Link to FFTW Windows DLL download page.

Change log 1.1.4d: 07/06/09
Bug fixed whereby lossyWAV would crash if 'libfftw3-3.dll' could not be initialised. If --fftw parameter is used and the DLL cannot be found then lossyWAV will revert to the existing FFT routines and output a warning. Link to FFTW Windows DLL download page.

Change log 1.1.4c: 05/06/09
FFTW can now be optionally used for FFT analyses in lossyWAV. Use of FFTW requires the presence of "libfftw3-3.dll" on the host computer, somewhere on the path and the addition of -F or --fftw to the lossyWAV command line. FFT (Delphi and assembler) further optimised. General code tidy-up. Link to FFTW Windows DLL download page.

Change log 1.1.4b: 14/05/09
FFT (Delphi and assembler) further optimised. Radix-4 FFT implemented in assembler and Delphi and Radix-8 in Delphi. Significant speedup of Delphi FFT throughput.
General code tidy-up.

Change log 1.1.4a: 05/05/09
--sortspread parameter no longer takes an additional parameter, now on/off;
spreading function changed slightly - now properly computes old and new averages separately;
FFT Real routine corrected as was giving wrong signs of some complex output values (did not affect magnitude of results);

Change log 1.1.3k: 30/04/09
Fault-finding release #1 to attempt to determine cause of WINE incompatibility. (Successful!!)

Change log 1.1.3j: 15/04/09
--sortspread parameter modified (again), now takes a parameter between 0 and 7, 2 is equivalent to beta 1.1.3i.
--centre parameter removed.
Reference_threshold tables removed in favour of direct calculation of the level of added noise due to bitdepth reduction using derived formula.

Change log 1.1.3i: 07/04/09
New --sortspread parameter modified (again). Bitrate matched with default spreading for my 55 problem sample set. Will revise table for my 10 Album Test Set.

Change log 1.1.3h: 05/04/09
New --sortspread parameter modified. Removed - bug found.

Change log 1.1.3g: 02/04/09
New --sortspread parameter introduced for testing purposes.

Change log 1.1.3f: 31/03/09
New --centre and --underlap <n> parameters introduced for testing purposes; Revised source.

Change log 1.1.3e: 18/03/09
Removal of old and new spreading functions in favour of variant; Code tidy up - speed improvements for pure delphi compile; Revised source.

Change log 1.1.3d: 05/03/09
Bug fix (would crash with a range error sometimes); Speedup of --varspread code. Revised source.

Change log 1.1.3c: 24/02/09
Introduction of -V or --varspread parameter to enable variant spreading function - a hybrid between the old and the new. Revised source.

Change log 1.1.3b: 23/02/09
Bug-fix: high sample rates with 1.1.3 would cause a range-check error or random results. Revised source.

Change log 1.1.3: 22/02/09
Integration of data structures used in new and old spreading functions. Source release.

Change log 1.1.2j: 18/02/09
Implementation of -O or --oldspread parameter to enable the use of the spreading function used in v1.1.0b instead of the revised version currently under development. This gives very slightly different results to v1.1.0b as is to be expected due to the revision of the reference-threshold constants at beta v1.1.1d.

Change log 1.1.2i: 12/02/09
Addition of a -N or --nasty (-q -2.0) and -A or --awful (-q -4.0) to allow extremely low quality levels to be explored.

Change log 1.1.2h: 12/02/09
Addition of a -N or --nasty (-q -2.0) to allow extremely low quality levels to be explored. Removed: Bug to be fixed.

Change log 1.1.2g: 10/02/09
Addition of a -r or --randombits parameter to randomise the zeroed lsbs.

Change log 1.1.2f: 09/02/09
Further modification to the spreading_function.

Change log 1.1.2e: 06/02/09
Further modification to the noise shaping process - first attempt to attenuate noise-shaping where bits_to_remove is zero for a particular codec block.

Change log 1.1.2d: 05/02/09
Further modification to the noise shaping process - audio data now no longer scaled prior to noiseshaping.

Change log 1.1.2c: 04/02/09
Further modification to the noise shaping process - noise shaping performed even when no bits removed.

Change log 1.1.2b: 03/02/09
Repair of the noise shaping process - now continuous for each channel rather than treating each codec-block totally separately;

Change log 1.1.2: 28/01/09
Code optimisations and data optimisations;
Revisions to the spreading function;

Change log 1.1.1e: 30/09/08
Interim beta, with source as reversion to Delphi complete (with conditional define to re-enable all IA-32/x87 code).

Change log 1.1.1d: 10/09/08
Further revision to the simplified spreading function - slightly higher bitrates than 1.1.1c but I'm happier with the method;
Reference-threshold constants re-calculated using more iterations (2^(32-fft-bit-length) iterations, i.e. 512K iterations for 8192 sample FFT and 128M iterations for 32 sample FFT) and for the first time taking into account FFT-result values less than 1. This only really affects bits-to-remove values between 1 and 7, which is in line with my expectation when I made the change to the noise-calculation method;

Change log 1.1.1c: 02/09/08
Further revision to the simplified spreading function;
Dither removed;

Change log 1.1.1b: 26/08/08
Revision to the simplified spreading function. All bin "averages" now calculated taking into account a variable proportion of bins to either side, i.e. "average" = (fft_result[i]+(fft_result[i-1]+fft_result[i+1])*factor)/(1+2*factor), where factor = 0.0 at 20Hz and 1.0 at 16kHz, with linear interpolation for intermediate values.

Change log 1.1.1a: 25/08/08
Fundamental simplification of spreading function methodology put forward for comment. All bin "averages" now calculated taking into account a fixed proportion of bins to either side, i.e. "average" = (fft_result[i]+(fft_result[i-1]+fft_result[i+1])*factor)/(1+2*factor), where factor = 0.26 in this case;
FFT result overall averaging now carried out prior to the spreading function rather than at the same time;
Reference_threshold constants revised slightly.

Change log 1.1.0b: 03/08/08
FFT lengths will now increase for higher bitrate audio, i.e. 88.2/96kHz, 176.4/192kHz and 352.8/384kHz;
improved logfile output and --detail output;
reference threshold constants for rectangular dither and triangular dither have been calculated so added noise should be the same for dither off and any dither level between 0 and 1 - the number of bits-to-remove will however reduce with "increasing" dither.

Reply #1 – 2008-08-25 13:36:37

Some questions:

Do we need dither?
Do we need 32-bit integer processing?
Do we need the capability to create correction files?

I ask as these all add to the time taken to process files (even if the options themselves are not selected).

Comments / criticisms / brickbats welcomed as before.

I will acknowledge the usefulness of the correction file as a quick and automatic way of generating the difference signal between the lossless original and processed output (for scaling=1 only).

Reply #2 – 2008-08-25 15:32:01

Quote from: Nick.C on 2008-08-25 13:36:37

Some questions:
Do we need dither?
Do we need 32-bit integer processing?
Do we need the capability to create correction files?
I ask as these all add to the time taken to process files (even if the options themselves are not selected).

Comments / criticisms / brickbats welcomed as before.

I will acknowledge the usefulness of the correction file as a quick and automatic way of generating the difference signal between the lossless original and processed output (for scaling=1 only).

Nick,
I have never been sure when/if dither should be used and if so, how much. I've tried with and without and can't hear any difference, so from my perspective it isn't needed.

Again personally, I can't see a need for 32 bit integer. As a CoolEdit user I would be more interested in 32 bit float but in any event I'd actually be unlikely to use it.

Similarly correction files. I've never used them and don't suppose I ever will as I get transparent (to me) output from lossyWav.

I actually like lossyWav as it is. It does exactly what I need. I can see that others' needs might be different to mine though.

The only thing I'd like to see is a nice GUI front end - I've mentioned that a couple of times before. I don't mean to go on about it but it would make life a bit easier for me as a non-techie.

Reply #3 – 2008-08-25 15:43:29

Quote from: Nick.C on 2008-08-25 13:36:37

Some questions:
Do we need dither?
Do we need 32-bit integer processing?
Do we need the capability to create correction files?
I ask as these all add to the time taken to process files (even if the options themselves are not selected).

Comments / criticisms / brickbats welcomed as before.

I will acknowledge the usefulness of the correction file as a quick and automatic way of generating the difference signal between the lossless original and processed output (for scaling=1 only).

My opinion on the questions:

a) dithering is not needed
b) I'd like to see continued support for 24-bit input files (I apply ReplayGain in foobar using 24-bit WAV output files). I cannot imagine anybody needing a bit depth of 32 bits, if that is what your question is about.
c) I like to be able to listen to the error signal. I don't need the correction file for reconstructing the original signal.

Great that you're still working so hard to improve lossyWAV.
I'm just a bit sceptical about the new spreading approach. It changes the machinery quite significantly at the low-frequency end, and I think we can be very content with the current machinery. Changing it would mean throwing away the experience we have so far with lossyWAV's quality (the situation isn't optimal, but we do have some experience) and starting again more or less from zero. Doesn't spreading just mean averaging over a certain number of bins? With this in mind I wouldn't care whether or not the virtual centre of the bins involved coincides with one of the real bins. At least this is how I understand the new spreading idea. Sure, there are numerous ways of doing the averaging, but is there any expectation of a real benefit from going the new way?

Reply #4 – 2008-08-25 19:14:03

I was waiting for this thread

I wish I could answer your questions but:
1: I don't know what lossyWAV can gain from dithering (the same as MP3? A supposedly more "natural" background noise? I always thought dithering was meant to soften the effect of destroying frequencies, so I don't see its use with lossyWAV).
2: I don't get what you meant, but my CPU is 32-bit and my audio is 16/24-bit.
3: I already said in the other thread that I don't use correction files.

I disagree with halb27 on the spreading function: if you refrain from experimenting because you fear breaking the machine, lossyWAV will never progress. You just need to be sure it's worth it before making it a full release.
Also, I don't need a GUI personally; even if one existed, I would use F2K. I agree it would be better than F2K for noobs allergic to the command line, but it should be a lossyWAV/FLAC GUI, or the noob will end up with a big WAV file and wonder what the purpose of such a useless codec is. So maybe a fork of Speek's FLAC frontend, but not a lossyWAV GUI alone.

Note: I will most likely convert my whole lossless collection to lossywav after the 1.2.0 release, so I hope it will be VERY good

Reply #5 – 2008-08-25 21:05:21

Quote from: sauvage78 on 2008-08-25 19:14:03

... you just need to be sure it's worth it before making it a full release. ...

That's the problem.

Reply #6 – 2008-08-25 21:18:43

One possible variant is to use 1 as a spreading value where 1.1.0b did and for all the values which exceed 1 use something else (1<value<2), i.e.
((2,2,2,2,2,2,2,2),(2,2,2,2,2,2,2,2),(2,2,2,2,2,2,2,2),(1,2,2,2,2,2,2,3),(1,1,2,2,2,2,2,3),(1,1,2,2,2,2,3,4))
goes to
((SC,SC,SC,SC,SC,SC,SC,SC),(SC,SC,SC,SC,SC,SC,SC,SC),(SC,SC,SC,SC,SC,SC,SC,SC),(1,SC,SC,SC,SC,SC,SC,SC),(1,SC,SC,SC,SC,SC,SC,SC),(1,1,SC,SC,SC,SC,SC,SC))
this should go some way to alleviate any concerns with respect to reducing quality as less averaging = lower minima.

Reply #7 – 2008-08-26 00:28:30

I'll second a request for 32-bit float.

Reply #8 – 2008-08-26 07:37:03

Quote from: Nick.C on 2008-08-25 21:18:43

One possible variant is to use 1 as a spreading value where 1.1.0b did and for all the values which exceed 1 use something else (1<value<2), i.e.
((2,2,2,2,2,2,2,2),(2,2,2,2,2,2,2,2),(2,2,2,2,2,2,2,2),(1,2,2,2,2,2,2,3),(1,1,2,2,2,2,2,3),(1,1,2,2,2,2,3,4))
goes to
((SC,SC,SC,SC,SC,SC,SC,SC),(SC,SC,SC,SC,SC,SC,SC,SC),(SC,SC,SC,SC,SC,SC,SC,SC),(1,SC,SC,SC,SC,SC,SC,SC),(1,SC,SC,SC,SC,SC,SC,SC),(1,1,SC,SC,SC,SC,SC,SC))
this should go some way to alleviate any concerns with respect to reducing quality as less averaging = lower minima.

If I understand it correctly you want to stay conservative when including more bins in the averaging compared to what we have now by applying a considerably smaller weight to the bins which are off-center to the highest degree (so a weight of >0.5 for the center bin of the 3 bin spreading replacing current 2 bin spreading?). Sounds good though I still can't see the potential advantage and why you aren't content with current spreading.

Reply #9 – 2008-08-26 08:26:02

I am re-examining each major processing component in turn - it's the turn of the spreading function....

I've modified the spreading function so that at the bin corresponding to 20Hz the range is 1.0 and at 16kHz it is 2.0, with linear interpolation for intermediate bins.

lossyWAV beta 1.1.1b attached to post #1 in this thread.

[edit]
The consensus (and what David and SebastianG said earlier) seems to be that dither is not required within lossyWAV.

On the processing of 32-bit integer samples, I'll leave it in at the moment, but I don't think that there are many packages that would output them in favour of 32-bit float samples. I don't know if the method would work on 32-bit float samples - I have a feeling it would be difficult to determine how many bits precision to remove from a float value - unless it was a simple "reduce a 32-bit float value (23-bit mantissa) to a 24-bit float value (15-bit mantissa) by brute force...." process.

It seems that some people like the correction file for analysis rather than reversion to lossless - maybe the --merge parameter can go?
[/edit]
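To make the "brute force" idea above concrete, here is a minimal sketch that zeroes the low mantissa bits of an IEEE-754 single so that only 15 of the 23 stored mantissa bits survive. It illustrates the speculation in this post only; it is not something lossyWAV does, the name `truncate_mantissa` is hypothetical, and plain truncation (rounding toward zero) is an assumption chosen for simplicity.

```python
import struct


def truncate_mantissa(value: float, keep_bits: int = 15) -> float:
    """Zero the low (23 - keep_bits) mantissa bits of a 32-bit float."""
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the single-precision value as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    drop = 23 - keep_bits
    mask = 0xFFFFFFFF & ~((1 << drop) - 1)   # sign and exponent are left untouched
    (out,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return out


print(truncate_mantissa(0.1))
```

Because the exponent is untouched, the relative error stays below 2^-15 whatever the signal level. A fixed mantissa cut therefore removes the same relative precision everywhere and ignores how much noise the signal can hide, which is one reason the float case is harder than zeroing lsbs in integer samples.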

Reply #10 – 2008-08-26 09:43:27

Quote from: Nick.C on 2008-08-26 08:26:02

... maybe the --merge parameter can go?

As for my needs: yes.

Reply #11 – 2008-08-26 12:33:20

The noise floor already "floats" in 32-bit float.

What do FLAC and other lossless encoders do wrt floating point data and "wasted bits"?

Depending on that, the appropriate lossyWAV processing could be tricky but useful, or pointless.

I only ever use 32-bit float files as intermediate files. Sometimes I archive them as-is, so I can re-work the project later. lossyWAV might be useful here, though TBH I haven't even bothered FLACing them because it's so rare that I do this. Other people might do this on a daily basis!

I have no experience of 32-bit integer audio files. 48-bit integer is common in audio processing (DSP IIR filtering etc), but never as an output.

IMO dither can go, if having the option available is slowing down processing even when it's not used.

If "Implementation of SG's new noise shaping method" means dynamic noise shaping, then depending on how aggressively you do this, it might be worth changing from rectangular spreading functions to something else entirely. I'm pointing this out because you might spend a long time playing with the current spreading function, only to dump it soon after. What you have is a narrow (fractional) version of something vaguely related to the ERB (equivalent rectangular bandwidth) scale - I reckon one day you'll end up with something which is a narrow (fractional) version of something vaguely related to overlapping critical band filters.

I can't help feeling that there's no more or less reason to have reconstruction with lossyWAV than with wavpack lossy, apart from the currently inevitable clunkiness of it. However, if the concept is there, it's another "tick" in the format comparison table, and someone can always come along and implement a more graceful re-uniting of the lossy and correction files later, if they feel the need. If you drop this ability entirely, this possibility is removed. Whether you support the merging in lossyWAV itself is up to you - having that available can't slow down encoding though, can it?

Cheers,
David.

Reply #12 – 2008-08-26 18:38:18

Suggestion: Cross-platform code, allowing Mac OS X and GNU/Linux users to take part in the fun...
But that may be too huge of a task?

Reply #13 – 2008-08-26 23:42:01

Quote from: 2Bdecided on 2008-08-26 12:33:20

The noise floor already "floats" in 32-bit float. What do FLAC and other lossless encoders do wrt floating point data and "wasted bits"? Depending on that, the appropriate lossyWAV processing could be tricky but useful, or pointless.

FLAC doesn't even support floating point IIRC.

One could go both ways with that, and say that either lossyWAV has no need for floating point support, or that it provides a very nice way to gracefully encode the floating-point data. I'm leaning towards the latter.

The only issue I'd see otherwise is how to handle +0dbFS samples. I have no suggestions on how to handle that except perhaps to optionally right-shift the output by a few bits and scribble the gain down in the tags.

Quote

I only ever use 32-bit float files as intermediate files. Sometimes I archive them as-is, so I can re-work the project later. lossyWAV might be useful here, though TBH I haven't even bothered FLACing them because it's so rare that I do this. Other people might do this on a daily basis!

I would preferably record my vinyl transcriptions in floating point on principle alone.

Quote

I have no experience of 32-bit integer audio files. 48-bit integer is common in audio processing (DSP IIR filtering etc), but never as an output.

Heh, if 32-bit float is going to be supported, why not 64-bit floating point too? It's a negligible code change.

In principle, the binning process used to establish critical band responses could be circumvented through clever frequency mapping. For instance, a frequency shift from 10 kHz down to, say, 100 Hz would mean that quantization noise that originally fit inside one bin could now spread across several. This could kick something into audibility.

Right?

What I'm getting at here is that perhaps more work should be put into tuning lossyWAV so that virtually all DSP effects/manipulations could not possibly cause an audibility difference, rather than merely ensuring that straight listening will not tease out a difference.

Reply #14 – 2008-08-27 07:40:12

It would be possible to read 32-bit float values and write 32-bit integer values (having suitably scaled the output) - this would not change the file-size but only some of the fmt chunk information.

I'll look into it....

To fit in the range -2,147,483,648..2,147,483,647 the 32-bit float value would require to be scaled by a factor of 2^-97.

[edit] I've just been reading about the draft IEEE-754r standard and there will be a 16-bit float value in the range +/-1.## x 2^-15 to +/-1.## x 2^14 with a mantissa of 10 bits. This seems to open up the possibility of 11 bit precision in a 2^30 range, or taking what we know about lossyWAV into account effectively stores a 32-bit integer in a 16-bit float (albeit with reduced precision - but reduced precision is not proving to be a problem).

Reply #15 – 2008-08-27 07:49:12

A complete mapping of the floating point domain is unnecessary unless HDR techniques start creeping in from the video realm to the audio realm (which is rather unlikely). All I'd expect anyone to want is a bit shift of 0 to 4 bits, if that.

... Not like I have any kind of valid use for that feature, so feel free to ignore it.

Reply #16 – 2008-08-27 14:06:34

Quote from: Nick.C on 2008-08-25 12:16:05

1.2.0: Implementation of SG's new noise shaping method

Yay!
If you'd like a Matlab version of the code I sent you, click here.

Quote from: Nick.C on 2008-08-25 12:16:05

1.2.0: Revisit the spreading function

Can you shed some more light on what's currently happening in this regard? What exactly is fft_result[k], why is there an averaging and what happens after the averaging?

Quote from: Nick.C on 2008-08-27 07:40:12

To fit in the range -2,147,483,648..2,147,483,647 the 32-bit float value would require to be scaled by a factor of 2^-97.

IIRC digital full scale is usually +/- 1.0 in float formats. So, in case you want to convert it to 24 bit ints, you should scale the floats by 2^23.
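A minimal sketch of that scaling, assuming float samples with digital full scale at +/-1.0. The name `float_to_int24` is hypothetical, and clipping anything outside the 24-bit range is an assumption of this sketch (one crude answer to the over-full-scale question raised earlier in the thread), not a description of lossyWAV.

```python
INT24_MAX = (1 << 23) - 1    #  8388607
INT24_MIN = -(1 << 23)       # -8388608


def float_to_int24(sample: float) -> int:
    """Scale a +/-1.0 float sample by 2^23 and clip it to the 24-bit integer range."""
    scaled = round(sample * (1 << 23))   # Python rounds exact halves to the even integer
    return max(INT24_MIN, min(INT24_MAX, scaled))


for s in (0.0, 0.5, 1.0, -1.0, 1.5):
    print(s, float_to_int24(s))
```

Note that +1.0 itself does not fit: it scales to 2^23, one step above the largest 24-bit value, so it is clipped here. The 2^-97 figure quoted above answers a different question, namely mapping the whole single-precision range (up to roughly 2^128) into 32-bit integers (2^31).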

Cheers,
SG

Reply #17 – 2008-08-27 14:49:01

I fail to see why anyone who goes to the lengths of saving audio in floating-point format would want to use lossyWAV on it.
The first step would be to convert to something like 24-bit integer IMO.

Reply #18 – 2008-08-27 22:18:22

Quote from: SebastianG on 2008-08-27 14:06:34

Can you shed some more light on what's currently happening in this regard? What exactly is fft_result[k], why is there an averaging and what happens after the averaging?

The FFT_result array is created by taking the magnitude of the raw results of the complex fft analysis and multiplying by the corresponding skewing value.

These results have always been averaged over a number of bins to remove zero or very low single bins. The most recent method now only takes into account a proportion of the bins on either side of the target bin rather than bins one or two bins away from the target bin. I feel that this will still remove single low bins but will possibly be better than the former method.
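The averaging described here, with the formula given in the 1.1.1a and 1.1.1b change log entries, can be sketched as follows. This is a minimal illustration rather than lossyWAV's implementation: the names `factor_for` and `spread` are hypothetical, and both the handling of edge bins and the clamping of the factor outside 20Hz to 16kHz are assumptions.

```python
def factor_for(freq_hz, lo_hz=20.0, hi_hz=16000.0):
    """Linear ramp from 0.0 at lo_hz to 1.0 at hi_hz, clamped outside that range (assumption)."""
    t = (freq_hz - lo_hz) / (hi_hz - lo_hz)
    return min(1.0, max(0.0, t))


def spread(fft_result, factors):
    """Return the weighted three-bin "average" for every bin.

    Edge bins have only one neighbour; this sketch reuses the edge bin itself
    for the missing one, which is an assumption.
    """
    last = len(fft_result) - 1
    out = []
    for i, (centre, factor) in enumerate(zip(fft_result, factors)):
        left = fft_result[i - 1] if i > 0 else centre
        right = fft_result[i + 1] if i < last else centre
        out.append((centre + (left + right) * factor) / (1 + 2 * factor))
    return out


# Fixed factor as in beta 1.1.1a: a single empty bin is pulled up by its neighbours.
print(spread([4.0, 0.0, 4.0], [0.26] * 3))

# Variable factor as in beta 1.1.1b, here for a 64-sample FFT at 44.1kHz (33 bins up to Nyquist).
bin_hz = 44100 / 64
factors = [factor_for(i * bin_hz) for i in range(33)]
```

With a factor of 0.0 a bin is left as it is, and with 1.0 the result is a plain three-bin mean, so the low-frequency end gets almost no averaging while the high-frequency end gets the most.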

btw, thanks very much for the Matlab method - I can read matlab, not C!

I see what you mean about 32-bit floats. However the easiest way would be to convert to 32-bit integers (in the first instance) - maybe 24-bit integers later.

Reply #19 – 2008-08-29 14:18:22

Quote from: GeSomeone on 2008-08-27 14:49:01

I fail to see why anyone who goes to the lengths of saving audio in floating-point format would want to use lossyWAV on it.
The first step would be to convert to something like 24-bit integer IMO.

Not at all - often when I use 32-bit floats (in CEP), it's so I don't have to worry about clipping. It's not always about the quality at all - sometimes it's simple convenience.

Cheers,
David.

Reply #20 – 2008-09-02 12:53:53

lossyWAV beta 1.1.1c attached to post #1 in this thread.

Further minor changes to the spreading-function, resulting in the following bitrates for my 10 album test set:

Code:

|===============|==========|==========|==========|==========|==========|==========|
|    Version    |   FLAC   | --insane |--extreme |--standard|--portable|  --zero  |
|===============|==========|==========|==========|==========|==========|==========|
|lossyWAV 1.1.0b| 854kbit/s| 632kbit/s| 548kbit/s| 463kbit/s| 376kbit/s| 285kbit/s|
|---------------|----------|----------|----------|----------|----------|----------|
|lossyWAV 1.1.1c| 854kbit/s| 627kbit/s| 542kbit/s| 457kbit/s| 373kbit/s| 281kbit/s|
|===============|==========|==========|==========|==========|==========|==========|

Reply #21 – 2008-09-02 14:22:13

Guess nobody liked my suggestion.
Oh well! ...at least it was worth a try...

Reply #22 – 2008-09-02 14:27:38

I love the idea, I just don't have the development platforms or the experience to carry out the conversion.

Reply #23 – 2008-09-03 17:32:32

I just spent the last 15 minutes testing lossyWAV 1.1.0b vs. lossyWAV 1.1.1c beta at -q 1 (Ginnungagap) to see whether there was any regression before the big noise-shaping jump. Personally I couldn't hear any major regression or improvement, so I guess the serious work for 1.2 can start now.

foo_abx 1.3.3 report
foobar2000 v0.9.5.5
2008/09/03 17:58:44

File A: C:\Documents and Settings\Sauvage.S\Bureau\Nouveau dossier\02- Ginnungagap Test Sample (Lossywav)b.lossy.tak
File B: C:\Documents and Settings\Sauvage.S\Bureau\Nouveau dossier\02- Ginnungagap Test Sample (Lossywav)o.lossy.tak

17:58:44 : Test started.
18:00:35 : 01/01 50.0%
18:01:43 : 02/02 25.0%
18:03:09 : 02/03 50.0%
18:03:54 : 03/04 31.3%
18:05:52 : 04/05 18.8%
18:07:42 : 05/06 10.9%
18:10:19 : 05/07 22.7%
18:11:52 : 06/08 14.5%
18:22:29 : 06/09 25.4%
18:22:57 : Test finished.

----------
Total: 6/9 (25.4%)

Edit1: Even though I reached 5/6 at the beginning, I couldn't really tell what I was listening for (I mean within the area of the usual artefact), so I think it was lucky guessing.

Edit2: To be 100% sure it was lucky guessing, I spent 10 minutes this morning redoing the test at -q 0 and failed even more clearly, so for me there is no regression or improvement except the very small but welcome kbps gain.

foo_abx 1.3.3 report
foobar2000 v0.9.5.5
2008/09/04 16:27:22

File A: C:\Documents and Settings\Sauvage.S\Bureau\Nouveau dossier\02- Ginnungagap Test Sample (Lossywav)b.lossy.tak
File B: C:\Documents and Settings\Sauvage.S\Bureau\Nouveau dossier\02- Ginnungagap Test Sample (Lossywav)o.lossy.tak

16:27:22 : Test started.
16:29:22 : 01/01 50.0%
16:30:29 : 02/02 25.0%
16:34:43 : 03/03 12.5%
16:35:57 : 03/04 31.3%
16:36:35 : 03/05 50.0%
16:38:30 : 03/06 65.6%
16:38:42 : Test finished.

----------
Total: 3/6 (65.6%)
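The percentages in these ABX logs match the probability of scoring at least that many correct answers by pure guessing, i.e. a one-sided binomial tail with p = 0.5. A minimal sketch that reproduces them follows; the function name `abx_p_value` is hypothetical, and this is not taken from foo_abx itself.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right out of `trials` by coin-flip guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


for correct, trials in ((5, 6), (6, 9), (3, 6)):
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.1%}")
```

For 6/9 this gives 130/512, about 25.4%, and for 3/6 it gives 42/64, about 65.6%, so neither run is evidence of an audible difference. Even the 5/6 reached part-way through the first test (7/64, about 10.9%) is not below the usual 5% threshold.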

Reply #24 – 2008-09-03 22:35:01

Quote from: Nick.C on 2008-08-25 12:16:05

Suggested foobar2000 converter setup:

lossyFLAC:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.flac
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\flac - -b 512 -5 -f -o%d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyTAK:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.tak
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\takc -e -p2m -fsl512 -ihs - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

lossyWV:

Code:

Encoder: c:\windows\system32\cmd.exe
Extension: lossy.wv
Parameters: /d /c c:\"program files"\bin\lossywav - --standard --silent --stdout|c:\"program files"\bin\wavpack -hm --blocksize=512 --merge-blocks -i - %d
Format is: lossless or hybrid
Highest BPS mode supported: 24

Just out of curiosity, why is there no example for lossyAPE (Monkey's Audio)?
Sorry if this has already been mentioned somewhere; the lossyWAV threads are a bit too long.
