## Topic: Near-lossless / lossy FLAC

• Nick.C
• Developer
##### Reply #175 – 18 July, 2007, 05:15:15 AM
Playing with more samples now and when no bits are "forced" to be removed the size of the sample set compressed in FLAC is almost identical to that of OGG at -q 10.

I managed to successfully implement a variable and conditional reduction to avoid clipping when rounding. However, it always produces a maximum possible amplitude of 2^(bs-1)-2^(bs-minimum_bits_to_keep)-1, i.e. 1023, 2047, 4095 from the absolute peak - for possibly one or two samples in the whole set. What would happen if I were to allow the occasional 32768 and then reduce it to 32767 (by force)?
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848| FLAC -5 -e -p -b 512 -P=4096 -S-

• 2Bdecided
• Developer
##### Reply #176 – 18 July, 2007, 05:41:44 AM
Quote
Playing with more samples now and when no bits are "forced" to be removed the size of the sample set compressed in FLAC is almost identical to that of OGG at -q 10.
That's interesting. It would be interesting to compare the added noise - I have no idea what OGG does at q10.
Quote
I managed to successfully implement a variable and conditional reduction to avoid clipping when rounding. However, it always produces a maximum possible amplitude of 2^(bs-1)-2^(bs-minimum_bits_to_keep)-1, i.e. 1023, 2047, 4095 from the absolute peak - for possibly one or two samples in the whole set. What would happen if I were to allow the occasional 32768 and then reduce it to 32767 (by force)?
Nothing bad. It would just mean that, in that block only, FLAC couldn't take advantage of the wasted bits, and so would encode all 16. At least, that's my understanding. David Bryant has suggested that Wavpack might be able to handle the situation more intelligently.
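To make the wasted-bits point concrete, here is a minimal Python sketch. It is not taken from FLAC or from the script in this thread, and the function name `wasted_bits` is made up for illustration. It counts how many low-order bits are zero in every sample of a block, which is the redundancy a lossless encoder can skip. A single sample forced to 32767 brings the count for that block down to zero.

```python
def wasted_bits(block):
    """Count the low-order bits that are zero in every sample of the block."""
    combined = 0
    for sample in block:
        combined |= sample  # a bit survives if it is set in at least one sample
    if combined == 0:
        return 0  # all-zero block: nothing to count (silence is a separate case)
    count = 0
    while combined & 1 == 0:
        combined >>= 1
        count += 1
    return count


# A block rounded to multiples of 16 has four wasted bits...
rounded_block = [-32768, -16, 0, 48, 32752]
print(wasted_bits(rounded_block))

# ...but one sample at 32767 (all low bits set) leaves none.
print(wasted_bits(rounded_block + [32767]))
```

The encoder then has to store all 16 bits for that one block, while neighbouring blocks are unaffected.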

What you suggest is a useful strategy (can you share it please?). I think I'd want to implement it across albums rather than tracks - both to avoid very slight but possibly audible loudness changes between tracks (4095 from peak is -1.16dB down, which is the point where it might just become audible) and any possible issues on gapless albums. Still, if you limited it to 2047 it would usually be OK on a per-track basis, and any issues on larger changes will still be relatively small.
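The level figure quoted above can be checked with a few lines of Python. This is only a sketch: it assumes 16-bit audio with a positive full scale of 32767, and `headroom_db` is a name invented here.

```python
import math

FULL_SCALE = 32767  # assumption: largest positive 16-bit sample value


def headroom_db(reduction, full_scale=FULL_SCALE):
    """Level change in dB when the peak is lowered by `reduction` sample steps."""
    return 20.0 * math.log10((full_scale - reduction) / full_scale)


for reduction in (1023, 2047, 4095):
    print(reduction, round(headroom_db(reduction), 2))
```

For a reduction of 4095 this gives roughly -1.16 dB, matching the figure in the post; the smaller reductions come out correspondingly closer to 0 dB.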

For my use, I'd apply album ReplayGain before encoding (as long as it was negative - i.e. lower volume) so wouldn't expect to see much clipping.

Cheers,
David.

• Nick.C
• Developer
##### Reply #177 – 18 July, 2007, 07:43:53 AM
*&*&^%$^&^&$! Editor / user compatibility issues...

Revised code, changing implementation of fix_clipped and hard limiting to (2^(bs-1)-2^(bits_to_remove(codec_block_number))) for each block.

Edit: Codebox removed.  Forum no like.  Code attached instead.

Synthetic Soul, you're a gentleman and a scholar - cheers!
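For readers without the attachment, the hard-limiting rule described above can be sketched in Python as follows. This is an illustration under assumptions, not the attached script: the function name `round_block`, the round-half-up choice, and the lower clamp at -2^(bs-1) are all guesses made for the example. Only the upper limit 2^(bs-1)-2^(bits_to_remove) comes from the post.

```python
def round_block(samples, bits_to_remove, bits_per_sample=16):
    """Round each sample to a multiple of 2^bits_to_remove, then hard limit."""
    step = 1 << bits_to_remove
    upper = (1 << (bits_per_sample - 1)) - step  # 2^(bs-1) - 2^bits_to_remove
    lower = -(1 << (bits_per_sample - 1))        # assumed lower clamp
    rounded = []
    for s in samples:
        # Add half a step, then clear the low bits (round half up).
        q = ((s + step // 2) >> bits_to_remove) << bits_to_remove
        rounded.append(max(lower, min(upper, q)))
    return rounded


# 32767 would round up to 32768; the limit pulls it back to 32752,
# which is still a multiple of 16, so the block keeps its wasted bits.
print(round_block([32767, 100, 104, -32768], bits_to_remove=4))
```

Because the limit is itself a multiple of the step, clamping never reintroduces low-order bits in the block.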

• Nick.C
• Developer
##### Reply #178 – 18 July, 2007, 02:35:25 PM
I've figured out a way to work out the optimum fix_clipped ratio based on the actual bits to remove per codec block rather than the maximum_bits_to_remove for the whole WAV file. Will implement and revert.

Script updated - functional blocks of code moved about a bit...... text file is in uploads, Lossy FLAC thread post.

Optimum fix_clipped ratio refined and existing methods removed. Codec_block_size changed to 576 samples as this reduced file size for sample set (lossy flac'ed) from 33.6MiB to 32.3MiB. Code tidied up a fair bit and partially commented.

• Nick.C
• Developer
##### Reply #179 – 20 July, 2007, 04:04:33 PM

• stel
##### Reply #180 – 20 July, 2007, 04:45:03 PM
Keep up the good work gents, I'm certainly keeping an eye on this topic.
I've tried using Octave but it doesn't want to install on my PC and I'm too poor to purchase Matlab so I can't try the matlab files. I might have a go at installing Octave on my Linux box.

I have in the past done a bit of C/C++ coding so I'm trying to put something together in C++. I'm not promising anything, but so far I've managed to piece together a working WAV read/write and an FFT routine in one app; I just need to join them together. Never tried anything like this before, but it's good fun trying.

Steve

• kjoonlee
##### Reply #181 – 02 August, 2007, 03:38:02 PM
Oh wow. I've only discovered this thread today. Near-lossless / lossy FLAC needs a catchy name so that you can avoid questions like "is it lossy or lossless?"

I propose "Flossy".

(Hat tip to Garf for Floggy.)

• Nick.C
• Developer
##### Reply #182 – 03 August, 2007, 05:25:39 AM
Earlier in the thread it was discussed and, basically, the only bit being modified is the WAV file input to the FLAC file. As such, any FLAC file created from the processed WAV file is still a perfectly compatible FLAC file - not any kind of new format.

I totally agree that these processed files should be in some way differentiated from FLAC files created from lossless sources. In the script, 2Bdecided renames the WAV file from ".wav" to ".lossy.wav" to clearly mark which is which.

ps. Calling the processor SoundSimplifier was put forward and in my variations on 2Bdecided's script, I name the processed ".wav" files ".ss.wav".

• kjoonlee
##### Reply #183 – 03 August, 2007, 11:12:30 AM
Very well, but what I had meant was a quick shorthand for "lossless FLAC files made by compressing the lossy output of SoundSimplifier": Flossy files.

• Nick.C
• Developer
##### Reply #184 – 03 August, 2007, 12:28:44 PM
As a colloquial reference, flossy is suitably amusing and rolls off the tongue - but for file naming, it still needs to end in ".flac"

• Porcus
##### Reply #185 – 13 August, 2007, 04:15:36 AM
Quote
The idea is simple: lossless codecs use a lot of bits coding the difference between their prediction and the actual signal. The more complex (hence, unpredictable) the signal, the more bits this takes up. However, the more complex the signal, the more "noise like" it often is. It seems silly spending all these bits carefully coding noise / randomness.

So, why not find the noise floor, and dump everything below it?

This isn't about psychoacoustics. What you can or can't hear doesn't come into it.

I beg to differ. Well, I understand what you want to do, but:
- the "lossy" part of "lossy compression" is really about optimizing "what to remove". There are bits that have higher "listening value" than others, and for each post-compression file size S, you want to keep the "most valuable subset of size S".
- of course it is not as simple as "find the most valuable subset of the raw PCM and compress it", it is "find the most valuable compressed subset" -- cropping and compression are (in principle) interacting.
- your idea assumes -- possibly implicitly -- that everything buried in the noise floor is of lesser value and can be cropped.

So, the question is:
For a B-bit bitstream with the noise floor occupying the lower b bits, you have one possible lossy compression by cropping at b and applying FLAC. This gives you a file of size S -- is there no lossy compression of file size at most S, sounding at least as good to the human ear? And if there is one better, is this crop+flac procedure reasonably close?

That question is all about psychoacoustics, imho.

Of course, if a lossy encoder sets out not to discriminate too much between musical styles -- including Japanese noise music -- it might (for all that I know) be that they cannot work by detecting and removing the noise floor, and so it might be possible to improve by hand-picking recordings where you don't care about the audible noise (you might even think that filtering it off is a subjective improvement). But in principle a lossy encoder could then add a -toleratenoisefloorremoval flag (with a parameter setting the aggressiveness!) and improve on both your and their software?

• 2Bdecided
• Developer
##### Reply #186 – 13 August, 2007, 05:27:16 AM
Quote
So, the question is:
For a B-bit bitstream with the noise floor occupying the lower b bits, you have one possible lossy compression by cropping at b and applying FLAC. This gives you a file of size S -- is there no lossy compression of file size at most S, sounding at least as good to the human ear?
Of course. Any of the well known psychoacoustic codecs are going to give you smaller file size, more noise, and (usually) a perceptually transparent result.

Quote
That question is all about psychoacoustics, imho.
Only in this respect: if you quantise at (or below) the noise floor (actually the lowest FFT bin), is it audible?

Quote
Of course, if a lossy encoder sets out not to discriminate too much between musical styles -- including Japanese noise music -- it might (for all that I know) be that they cannot work by detecting and removing the noise floor, and so it might be possible to improve by hand-picking recordings where you don't care about the audible noise (you might even think that filtering it off is a subjective improvement). But in principle a lossy encoder could then add a -toleratenoisefloorremoval flag (with a parameter setting the aggressiveness!) and improve on both your and their software?
Quantising at the noise floor doesn't remove noise - by definition it adds noise since the signal is now even less like the original. There's already an option to decide how far below the noise floor to quantise.
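A small Python sketch of that point: rounding away low-order bits never removes anything from the signal, it adds an error whose RMS level grows with the step size (close to step/sqrt(12) for a signal that exercises all the low bits). The helper names and the ramp test signal are assumptions made for this illustration, not part of the script.

```python
import math


def quantise(sample, bits_to_remove):
    """Round a sample to the nearest multiple of 2^bits_to_remove (half up)."""
    step = 1 << bits_to_remove
    return ((sample + step // 2) >> bits_to_remove) << bits_to_remove


def rms_error(samples, bits_to_remove):
    """RMS of the difference between the quantised and original samples."""
    errors = [quantise(s, bits_to_remove) - s for s in samples]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# A simple ramp stands in for a signal that uses every low-order bit.
ramp = list(range(-1600, 1600))
for bits in (0, 2, 4, 6):
    step = 1 << bits
    print(bits, rms_error(ramp, bits), step / math.sqrt(12))
```

Each extra bit removed doubles the added error, i.e. raises it by about 6 dB, which is why the choice of how far below the noise floor to quantise matters.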

lossyFLAC can cope perfectly well with pure noise (it keeps about 5-7 bits for pure white noise), but I'd be interested to hear some Japanese noise music - do you have a sample you could post?

Cheers,
David.

• Porcus
##### Reply #187 – 15 August, 2007, 05:01:57 AM

• 2Bdecided
• Developer
##### Reply #188 – 15 August, 2007, 05:28:11 AM
Thanks. I'm guessing lossyFLAC would be fine, though I bet there would be mp3 problem samples in there somewhere.

Cheers,
David.

• 2Bdecided
• Developer
##### Reply #189 – 23 August, 2007, 12:10:48 PM
Nick,

Can you list the exact test set you were using when assessing file size please?

I have something new working.

Cheers,
David.

• Nick.C
• Developer
##### Reply #190 – 23 August, 2007, 02:22:56 PM
Files as follows:

Code:
`13/07/2007  07:46         1,763,156 06_florida_seq.wav
28/06/2007  14:25         4,207,772 10 - Dungeon - The Birth- The Trauma Begins.wav
13/07/2007  07:46         2,116,844 14_Track03beginning.wav
13/07/2007  07:46         2,249,144 16_Track03entreaty.wav
13/07/2007  07:46         4,233,644 18_Track04cakewithtea.wav
12/07/2007  13:54         2,727,980 34_Gabriela_Robin___Cats_on_Mars.wav
29/06/2007  09:24         5,292,048 41_30sec.wav
28/06/2007  14:25         1,464,340 A02_metamorphose.wav
12/07/2007  13:54         1,058,444 A03_emese.wav
08/08/2007  12:35         1,344,704 Angelic.wav
27/06/2007  10:29         2,822,444 annoyingloudsong.wav
28/06/2007  14:25           886,144 aps_Killer_sample.wav
09/07/2007  16:29         2,145,060 Atem_lied.wav
09/07/2007  16:29         3,377,108 ATrain.wav
09/07/2007  16:29         4,410,076 Bachpsichord.wav
13/07/2007  07:55         4,669,484 badvilbel.wav
09/07/2007  16:29         4,320,784 BigYellow.wav
09/07/2007  16:29           717,072 birds.wav
08/08/2007  18:45         2,428,708 bruhns.wav
18/07/2007  07:49         1,487,684 cricket__insect___edit_.wav
27/06/2007  11:24         1,522,796 E50_PERIOD_ORCHESTRAL_E_trombone_strings.wav
09/07/2007  16:29         2,646,180 eig.wav
08/08/2007  18:45           797,372 Furious.wav
09/07/2007  16:29           562,952 glass_short.wav
13/07/2007  07:55         2,891,112 harp40_1.wav
13/07/2007  07:55         1,986,864 herding_calls.wav
09/07/2007  16:29         1,319,320 jump_long.wav
08/08/2007  12:07           168,104 keys_1644ds.wav
12/07/2007  13:54         1,766,396 ladidada_10s.wav
12/07/2007  13:55         1,845,132 Liebe_so_gut_es_ging.wav
28/06/2007  14:25           663,492 Moon_short.wav
12/07/2007  13:54         1,416,420 Poets_of_the_fall___Shallow.wav
09/07/2007  16:29         5,292,044 rach_original.wav
09/07/2007  16:29         3,130,908 rawhide.wav
12/07/2007  13:54         1,785,436 Rush___Hold_Your_Fire___Turn_the_Page.wav
09/07/2007  16:29         1,697,724 S13_KEYBOARD_Harpsichord_C.wav
27/06/2007  11:24           882,048 S30_OTHERS_Accordion_A.wav
09/07/2007  16:29         3,357,548 S34_OTHERS_GlassHarmonica_A.wav
27/06/2007  11:24         1,170,784 S35_OTHERS_Maracas_A.wav
27/06/2007  11:24         2,292,528 S53_WIND_Saxophone_A.wav
08/08/2007  12:35           486,196 SeriousTrouble.wav
18/07/2007  07:49         3,514,148 swarm_of_wasps__edit_.wav
09/07/2007  16:30         6,218,144 thewayitis.wav
08/08/2007  12:35         7,605,028 the_product.wav
08/08/2007  12:09           317,656 triangle.wav
08/08/2007  19:15           777,516 triangle_2_1644ds.wav
13/07/2007  07:55         1,769,512 trumpet.wav
28/06/2007  14:25         2,095,424 VELVET.wav
11/07/2007  11:33         3,707,200 wait.wav
              49 File(s)    117,408,624 bytes`

• 2Bdecided
• Developer
##### Reply #191 – 24 August, 2007, 06:13:44 AM
Thanks.

Code:
`28/06/2007  14:25         4,207,772 10 - Dungeon - The Birth- The Trauma Begins.wav
12/07/2007  13:54         2,727,980 34_Gabriela_Robin___Cats_on_Mars.wav
18/07/2007  07:49         1,487,684 cricket__insect___edit_.wav
12/07/2007  13:54         1,416,420 Poets_of_the_fall___Shallow.wav
12/07/2007  13:54         1,785,436 Rush___Hold_Your_Fire___Turn_the_Page.wav
18/07/2007  07:49         3,514,148 swarm_of_wasps__edit_.wav
08/08/2007  12:35         7,605,028 the_product.wav
08/08/2007  12:09           317,656 triangle.wav
11/07/2007  11:33         3,707,200 wait.wav`

Cheers,
David.

• Nick.C
• Developer
##### Reply #192 – 24 August, 2007, 06:45:33 AM
Quote
Thanks.

Code:
`28/06/2007  14:25         4,207,772 10 - Dungeon - The Birth- The Trauma Begins.wav
12/07/2007  13:54         2,727,980 34_Gabriela_Robin___Cats_on_Mars.wav
18/07/2007  07:49         1,487,684 cricket__insect___edit_.wav
12/07/2007  13:54         1,416,420 Poets_of_the_fall___Shallow.wav
12/07/2007  13:54         1,785,436 Rush___Hold_Your_Fire___Turn_the_Page.wav
18/07/2007  07:49         3,514,148 swarm_of_wasps__edit_.wav
08/08/2007  12:35         7,605,028 the_product.wav
08/08/2007  12:09           317,656 triangle.wav
11/07/2007  11:33         3,707,200 wait.wav`

Cheers,
David.

David - at work, so no time to find links, however, check your e-mail!

Best regards,

Nick.

As an aside, to allow the (timely) calculation of extreme fft_lengths I employed the following (as the longer fft_bit_length analyses seemed to converge more quickly):

noise_averages_bits=25;

noise_averages=ceil(2^(max(0,(noise_averages_bits-fft_bit_length(analysis_number)))^0.9));
So, for fft_bit_length=17: 91 iterations; 14: 404; 11: 1726; 8: 7160; and 5: 28979.
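For anyone following along without MATLAB or Octave, the iteration-count formula above translates to Python roughly as below. This is a sketch of just that one line; the function wrapper and its name are assumptions, and nothing else from the script is reproduced.

```python
import math

NOISE_AVERAGES_BITS = 25  # value given in the post


def noise_averages(fft_bit_length, noise_averages_bits=NOISE_AVERAGES_BITS):
    """Iterations for one analysis: ceil(2^(max(0, bits - fft_bits)^0.9))."""
    exponent = max(0, noise_averages_bits - fft_bit_length) ** 0.9
    return math.ceil(2 ** exponent)


for bits in (17, 14, 11, 8, 5):
    print(bits, noise_averages(bits))
```

The 0.9 exponent is what keeps the short-FFT cases affordable: with a plain 2^(25-5) the 32-point analysis would need over a million iterations instead of 28979.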

• 2Bdecided
• Developer
##### Reply #193 – 24 August, 2007, 12:06:31 PM
Thanks Nick.

I've implemented noise shaping, something like...

http://telecom.vub.ac.be/Research/DSSP/Pub.../AES-2002-B.pdf
http://telecom.vub.ac.be/Research/DSSP/Pub...ICASSP-2003.pdf

All due credit to SebG - this is almost exactly what he suggested on page 1 of this thread.

I make no guarantee that it's transparent (though I've tried!) - it's been a challenge to stop it going unstable (50dB more noise than you expected isn't great for the audio quality!) but it seems to have settled down now.

I'm getting these bitrates...

wav: 111 MB (117,408,624 bytes) = 1411kbps
lossless FLAC: 64.1 MB (67,304,026 bytes) = 809kbps
lossyFLAC6: 34.0 MB (35,754,414 bytes) = 429kbps
lossyFLAC10: 27.0 MB (28,378,924 bytes) = 341kbps

32kRG:
lossyFLAC6: 32.6 MB (34,209,716 bytes) = 411kbps
lossyFLAC10: 21.6 MB (22,696,905 bytes) = 273kbps
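The bitrates above follow from the byte counts if the whole test set is treated as CD-format audio. A minimal Python sketch, assuming 44.1 kHz, 16-bit stereo (1411.2 kbps) and ignoring WAV header overhead; the function name `kbps` is invented for the example:

```python
WAV_BYTES = 117_408_624  # total size of the WAV test set, from the post
WAV_KBPS = 1411.2        # assumption: 44.1 kHz * 16 bits * 2 channels


def kbps(encoded_bytes, wav_bytes=WAV_BYTES, wav_kbps=WAV_KBPS):
    """Average bitrate of an encoding, scaled from the size ratio to the WAVs."""
    return wav_kbps * encoded_bytes / wav_bytes


sizes = {
    "lossless FLAC": 67_304_026,
    "lossyFLAC6": 35_754_414,
    "lossyFLAC10": 28_378_924,
    "lossyFLAC6 (32kRG)": 34_209_716,
    "lossyFLAC10 (32kRG)": 22_696_905,
}
for name, size in sizes.items():
    print(f"{name}: {kbps(size):.1f} kbps")
```

The results land within 1 kbps of the quoted figures; the small differences presumably come from rounding or from how the original tool accounted for headers.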

blocksize=576 throughout (may not be optimal for v10 and 32k)

32k = PPHS in foobar2k
RG=ReplayGain-by-track+clipping-prevention-by-peak in foobar2k
- which increases the volume of lots of clips in this test set (i.e. makes it less efficient)

No clipping prevention enabled in lossyFLAC, no dither.

The code is a mess for now, and about 10 times slower than the previous version. It uses lpc.m from the MATLAB sig proc toolbox, which means implementing it without this would be some work.

There's a big problem though. I contacted Prof Werner Verhelst to ask if the approach was patented. He said it wasn't - it had been a contract for an audio equipment manufacturer, so he couldn't share the code, but there was nothing to stop me implementing it myself and he'd be interested to hear how I got on.

So far so good. But if you actually read that paper, they make it quite clear that what they're suggesting is very close to (a generalised version of, if you like) Sony's Super Bit Mapping technique. I assume this is patented - US 5,204,677 may be the correct patent.

Does this cover what I'm doing, and what's in that paper? I don't know.

Under UK patent law, you can play with it all you like privately, and also perform research (including commercial or public research) on the algorithm, but not with the algorithm. That's my understanding - it may be wrong. Other countries have other patent exemptions for R&D.

So I don't know what to do. I have no desire to fall foul of any patents.

FWIW the Sony patent can't have that long left - it claims priority from a Japanese patent from 1990. That means you can probably have lossyFLAC10 in 2010!

I welcome suggestions and legal opinions.

Cheers,
David.

• 2Bdecided
• Developer
##### Reply #194 – 25 August, 2007, 02:30:43 PM
No lawyers on HA then?

• Dynamic
##### Reply #195 – 25 August, 2007, 03:36:52 PM
Quote
No lawyers on HA then?

I was sitting this out to see whether you got any bites.

I've only got experience as a patent coordinator within my R&D group at my previous employer, and mostly on physical embodiment types of patents rather than method patents (essentially algorithms). I read the situation pretty much the same as you.

If it's not invalidated by prior art (I didn't look back at the "A" application and look for "X" and "Y" citations in the Search Report) then it would appear to cover the method you're trying to implement. I can't recall where I'd found that on previous occasions, and whether it's a publicly accessible source like uspto.gov or a subscription service.

You might have scope to carry out research alone and publish source code for academic purposes (like the LAME project, which does not distribute the encoder, but publishes the source code). Anyone who then compiled that code or used it for non-research purposes might then be committing a breach of the patent in certain countries, while others in some jurisdictions might be free to share compiled code or even use it commercially. I'm not really au fait with the legality of this, but LAME seems to have been OK, and I didn't think one would be prevented from conducting private/commercial research on the algorithm or with the algorithm. The exception for research "with the algorithm" might be where you use a method not to research the method itself but as a tool for producing a commercial product as part of the production process. Proving that a company made an item using a particular method in court is rather difficult, which is why method claims weren't favoured by my previous employer. I understand that in certain fields, certain novel methods would thus remain secret (only documented internally in case they need to defend against an infringement lawsuit) rather than being publicly disclosed.

If the patent could be considered to be invalid at least for those claims applicable to your techniques, you might be able to prove it, but if you actually infringed it yourself and were sued, could you afford the lawyers to go to court and to pay Sony's lawyers if you lost? That's the kind of decision that companies have to make from time to time. OTOH, if you have meagre financial resources and don't actually impinge on Sony's business, would Sony actually sue you in the first place?

It's a tough one.

I've even heard of companies (possibly in desperate financial states or having farmed out their patent portfolio management to revenue generation firms) attempting to threaten their competitors or "offering the opportunity to license our inventions" with ungranted patent applications for which no claims were without "X" or "Y" citations indicating prior art found in the search report, and which at least some of the competitors hadn't implemented anyway and didn't look like they were about to either.

In summary, the LAME source-code and documentation only approach is worth consideration and investigation. This might also let you publish fair-use excerpts processed by the method for public listening tests in the interest of research and not personal gain, but not let you directly provide anyone with the tools that implement the methods, which would appear to be much more of a grey area.
Dynamic – the artist formerly known as DickD

• 2Bdecided
• Developer
##### Reply #196 – 28 August, 2007, 06:51:09 AM
Thanks dynamic.

Yes, I was wondering about posting samples. If it can't be made transparent at a reasonable bitrate (i.e. significantly less than the non noise shaped approach), then it's not much use anyway - so samples are essential, and clearly research on the algorithm itself.

However, there's not much incentive for anyone to test if they can't then use it themselves.

I understand what the Lame project has done to get around the IP issues, and that they have "got away with it" so far. Maybe some people using lame commercially are actually paying mp3 license fees so it makes money for the patent holders.

However, I'm not comfortable with taking that route myself.

Cheers,
David.

• Nick.C
• Developer
##### Reply #197 – 14 September, 2007, 11:25:40 AM

It is giving a close approximation to the output from the Matlab script.

• Nick.C
• Developer
##### Reply #198 – 02 October, 2007, 08:19:00 AM
David,

I have been trying to find a (relatively simple) weighting which is a bit more scientific than the primitive skew I have used in lossyWAV.

I found the formulae for A, B, C & D weighting and also found tabulated values for ITU-R.468 (BBC Research Department noise weighting). Of course, I also have the tabulated values of the equal-loudness curve you use in Replay Gain.

I am wondering as to the applicability of D-Weighting (principally because I have the formula) or ITU-R.468 (as it may be simple to implement) as a substitute for the 20Hz to 3.7kHz skew currently available as an option in lossyWAV.
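As a reference point for what such a weighting formula looks like in code, here is the standard A-weighting curve in Python. This is a sketch for illustration only: it is A-weighting rather than the D-weighting or ITU-R 468 curves under consideration, the constants are the commonly published ones for the A curve, and none of it comes from lossyWAV.

```python
import math


def a_weighting_db(f):
    """A-weighting gain in dB at frequency f (Hz), normalised to 0 dB at 1 kHz."""
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00


for freq in (20.0, 100.0, 1000.0, 3700.0, 10000.0):
    print(freq, round(a_weighting_db(freq), 1))
```

A curve like this describes how loud a sound seems on its own at each frequency, so applying it to per-bin FFT results would amount to a fixed frequency-dependent offset in dB.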

Best regards,

Nick.

• 2Bdecided
• Developer
##### Reply #199 – 02 October, 2007, 10:34:20 AM
Sorry Nick, I don't think they're appropriate for this. SebG's suggestion of testing white noise in vorbis was a good one (see page 1!).

EDIT: Those curves are the audibility, or perceived loudness, of something on its own. Whereas the noise we're adding here is added below something else. So you need masking curves, not absolute threshold / equal loudness curves. Still, I've learnt something - I hadn't heard of D-weighting before - sounds interesting for its intended application.

Cheers,
David.