...@Halb27: I think there might be some benefit in reducing the C at the end of the 1024 fft spf to, say, 9, to reduce the number of bins being averaged.
Quote from: Nick.C on 09 December, 2007, 08:10:55 AM
...@Halb27: I think there might be some benefit in reducing the C at the end of the 1024 fft spf to, say, 9, to reduce the number of bins being averaged.
IMO that's the right direction. I did some first trials, though not with the 1024 sample FFT but with the 64 sample FFT, whose resolution is fine IMO for judging the highest frequency zones and which has a good time resolution, which may be essential for samples like eig. So far I've seen that the second highest frequency zone is the most important one for eig. 22225 yields quite a good though not perfectly transparent result. I'm pretty busy now but I'll try whether 22224 (as of -2) improves things. But I guess we'll also have to come down from -nts +6 a bit. We'll see.
I've tried -3 -spf 22234-22235-22346-22357-22468 and it raises the bitrate to 412.3kbps for my sample set. It takes about 0.1 bits off the number of bits removed from eig.
Quote from: Nick.C on 10 December, 2007, 04:34:34 AM
I've tried -3 -spf 22234-22235-22346-22357-22468 and it raises the bitrate to 412.3kbps for my sample set. It takes about 0.1 bits off the number of bits removed from eig.
My trial yesterday was with -3 -spf 22225-22235-22346-22357-224FF, and bits to remove for eig went down significantly (~1 bit in the critical first seconds). So I think the 2nd highest frequency zone is essential here, maybe the highest zone as well. Average bitrate of regular music did not go up significantly, btw.
I tried eig again using -3 -spf 22224-22236-22347-22358-2246C, thus being more demanding with the two highest frequency zones at the 64 sample FFT. Now I can't abx eig any more.
But this is pretty much on the cutting edge for my listening experience, and I'm sure there are a lot of people out there with a better sensitivity towards temporal resolution problems. So I suggest we reduce the positive nts values and use -nts +3 for -3, and -nts 0 for -2.
-3 -spf 22224-22236-22347-22358-2246C -nts 3 yields 375 kbps with my regular set, -2 -nts 0 yields 422 kbps.
To me this is still a very good result, not far away from the results of the current setting, and it brings us a considerable way back towards the solid basis where in theory -nts should be 0.
As you can't abx any of the problem samples using -3 -spf 22224-22236-22247-22358-2246C -nts +6.0, I feel that a reduction of 3dB (0.5 bits) for the -nts parameter is a bit too much.
-3 -spf 22224-22236-22347-22358-2246C -nts +3.0 results in 433.5kbps for my sample set and changing the +3.0 to +4.5 results in 422.7kbps for my sample set.
So, I suggest we use -spf 22224-22236-22347-22358-2246C -nts +4.5 for quality preset -3.
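The "3dB (0.5 bits)" equivalence above can be checked with a small sketch. It assumes the usual rule of roughly 6.02 dB per bit of resolution (20·log10(2)) and that an -nts shift maps linearly onto bits; the function name `db_to_bits` is made up for illustration and is not part of lossyWAV.

```python
import math

# Assumption: one bit of resolution corresponds to 20*log10(2) ~ 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)


def db_to_bits(db):
    """Convert a noise threshold shift in dB to an equivalent number of bits."""
    return db / DB_PER_BIT


# The -nts values discussed in this thread for the -3 preset.
for nts in (6.0, 4.5, 3.0, 0.0):
    print(f"-nts {nts:+.1f} dB ~ {db_to_bits(nts):.2f} bits")
```

Under this assumption, going from +6 to +4.5 is about a quarter of a bit and going from +6 to +3 is about half a bit.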
Quote from: Nick.C on 10 December, 2007, 04:23:27 PM
As you can't abx any of the problem samples using -3 -spf 22224-22236-22247-22358-2246C -nts +6.0, I feel that a reduction of 3dB (0.5 bits) for the -nts parameter is a bit too much.
-3 -spf 22224-22236-22347-22358-2246C -nts +3.0 results in 433.5kbps for my sample set and changing the +3.0 to +4.5 results in 422.7kbps for my sample set.
So, I suggest we use -spf 22224-22236-22347-22358-2246C -nts +4.5 for quality preset -3.
For a real-life impression, why don't you take a restricted selection of full length tracks? If bitrate goes up for problematic tracks like those in your sample set, this is welcome. It's not so welcome of course with regular music.
My regular set consists of just 12 full length tracks of various musical genres, so I can get an impression very quickly. I know from posted experience that my 375 kbps result is a bit low compared to other musical mixtures reported, but the difference isn't a big one. From this experience I think it's safe to say that when the result of my regular set is 375 kbps, the average bitrate is ~380 kbps.
Sure, ~380 kbps is a bit more than the ~350 kbps of the current -3 setting, but it's not by much IMO.
When it comes to deciding between -nts +3 and -nts +4.5, the difference is even smaller.
The reason why I dislike a rather small lowering from +6 to +4.5 is that I do think that small -nts steps don't have a significant effect. This is partly due to my listening experience when using insane positive nts values.
So I think in order to have a significant quality effect we shouldn't consider a delta lower than 3 for nts. Not for the sake of saving ~15 kbps.
Of course this is because, if in doubt, I want to play it safe. Just my attitude towards it.
I'll process the 10 albums previously used for bitrate comparison using your proposal for the -3 quality preset. ........
lossyWAV beta v0.5.8 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>
Example : lossyWAV musicfile.wav

Quality Options:
-1            extreme settings [4xFFT] (-cbs 512 -nts -3.0 -skew 36 -snr 21 -spf 22224-22225-11235-11246-12358 -fft 11011)
-2            default settings [3xFFT] (-cbs 512 -nts 0.0 -skew 36 -snr 21 -spf 22224-22235-22346-12347-12358 -fft 10101)
-3            compact settings [2xFFT] (-cbs 512 -nts +3.0 -skew 36 -snr 21 -spf 22224-22235-22347-22358-2246C -fft 10001)

Standard Options:
-o <folder>   destination folder for the output file
-nts <n>      set noise_threshold_shift to n dB (-48.0dB<=n<=+48.0dB) (-ve values reduce bits to remove, +ve values increase)
-force        forcibly over-write output file if it exists; default=off

Codec Specific Options:
-wmalsl       optimise internal settings for WMA Lossless codec; default=off

Advanced / System Options:
-shaping      enable fixed shaping using bit_removal difference of previous samples [value = brd(-1)/4]; default=off
-snr <n>      set minimum average signal to added noise ratio to n dB; (-215.0dB<=n<=48.0dB) Increasing value reduces bits to remove.
-skew <n>     skew fft analysis results by n dB (0.0db<=n<=48.0db) in the frequency range 20Hz to 3.45kHz
-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters; These correspond to FFTs of 64, 128, 256, 512 & 1024 samples; e.g. 22235-22236-22347-22358-2246C (Characters must be one of 1 to 9 and A to F (zero excluded).
-fft <5xbin>  select fft lengths to use in analysis, using binary switching, from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)
-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode
-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:
David Robinson for the method itself and motivation to implement it in Delphi.
Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
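The -fft and -spf formats described in the help text can be decoded with a short sketch. This is an illustration only, not the Delphi code: the function names `parse_fft` and `parse_spf` are hypothetical, and reading each -spf character as a hex digit giving the number of bins averaged in one frequency zone is an interpretation based on the discussion earlier in the thread.

```python
# FFT lengths in the order the help text lists them.
FFT_LENGTHS = (64, 128, 256, 512, 1024)


def parse_fft(mask):
    """Decode a 5-character binary switch string into the enabled FFT lengths."""
    if len(mask) != 5 or set(mask) - {"0", "1"}:
        raise ValueError("expected 5 characters, each 0 or 1")
    return [n for n, bit in zip(FFT_LENGTHS, mask) if bit == "1"]


def parse_spf(spf):
    """Decode a 5x5 hex string into {fft_length: [five per-zone values]}."""
    groups = spf.upper().split("-")
    if len(groups) != 5 or any(len(g) != 5 for g in groups):
        raise ValueError("expected 5 groups of 5 characters")
    valid = set("123456789ABCDEF")  # zero is excluded per the help text
    if any(ch not in valid for g in groups for ch in g):
        raise ValueError("characters must be 1 to 9 or A to F")
    return {n: [int(ch, 16) for ch in g] for n, g in zip(FFT_LENGTHS, groups)}


print(parse_fft("01001"))
print(parse_spf("22224-22235-22347-22358-2246C")[1024])
```

With the -3 preset string, the trailing C of the 1024 sample group decodes to 12, which is the value proposed for reduction to 9 earlier in the thread.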
Conversion using lossyWAV beta v0.5.8, FLAC -8
|=======================================|=========|=========|=========|=========|
|Album                                  | FLAC -8 |  lW -1  |  lW -2  |  lW -3  |
|=======================================|=========|=========|=========|=========|
|AC/DC - Dirty Deeds Done Dirt Cheap    | 781kbps | 468kbps | 417kbps | 366kbps |
|B52's - Good Stuff                     | 993kbps | 476kbps | 421kbps | 376kbps |
|David Byrne - Uh-Oh                    | 937kbps | 464kbps | 413kbps | 363kbps |
|Fish - Songs From The Mirror           | 854kbps | 451kbps | 399kbps | 357kbps |
|Gerry Rafferty - City To City          | 802kbps | 468kbps | 416kbps | 366kbps |
|Iron Maiden - Can I Play With Madness  | 784kbps | 486kbps | 437kbps | 387kbps |
|Jean Michel Jarre - Oxygene            | 773kbps | 538kbps | 475kbps | 422kbps |
|Marillion - The Thieving Magpie        | 790kbps | 473kbps | 421kbps | 373kbps |
|Mike Oldfield - Tr3s Lunas             | 848kbps | 491kbps | 436kbps | 389kbps |
|Scorpions - Best Of Rockers N' Ballads | 922kbps | 492kbps | 437kbps | 378kbps |
|=======================================|=========|=========|=========|=========|
|Average                                | 850kbps | 480kbps | 426kbps | 376kbps |
|=======================================|=========|=========|=========|=========|
|53 sample "problem" set                | 784kbps | 543kbps | 491kbps | 434kbps |
|=======================================|=========|=========|=========|=========|
.... but it might be advantageous to somehow note in a RIFF chunk the fact that the file is a LossyWAV file, and perhaps include a note of the settings used to create the file. ....
.... spotted a discussion about the possibility of including checksums ....
I've come across what I think is a problem sample. It's a bit strange in that the sample seems to play OK on one of my DAPs, but I can hear sound distortion / sound breaking up in places when played on the other DAP. What puzzles me more is that the original FLAC plays without a problem on both DAPs, and this is what leads me to believe it's an encoder problem. I first spotted the issue on beta 5.4 but I've just tried it on 5.8 and get the same results.
The DAPs are a Sansa E280 and an iAudio M5, both rockboxed, and the earphones are Shure SE530. I can only hear the problem sample on the M5. The sample is 'Groove Armada - Soundboy Rock - Lightsonic', and although I hear issues throughout the track I've noted that it definitely happens at 4.30sec. The average bitrate for this track is 528kbps when encoded using the standard -3 lossyWAV settings.
Even more annoying is that I cannot hear the distortion on my PC using an amp & Sennheiser HD650s either.
I've also come across the problem on a different album, but I need to dig this out again because I forgot to take a note when it happened.
Are you interested in investigating this further? What would you need?
I've also come across two samples from the 'Shakespears Sister\Long Live The Queens!' album where the average bitrate for -3 encoding is 696kbps & 711kbps. Could these prove useful to you?
Sorry for not posting these sooner. I've been out all day.
I've attached 10 second samples. The issue happens 5 seconds in when encoding the sample.flac file.
Sorry, the M5_issue.flac isn't great quality, but you can clearly hear the issue I'm experiencing. If you need any more info, give me a shout.
I'm going to try and find the other track I've got problems with.
Edit: Oh no, look at the post number... I'm not the devil, honest.
Thanks
Steve
Nick, I guess that when -detail reports -1 as the number of bits removed, this means no bit is removed due to clipping prevention (or does it say that, due to clipping prevention, not all the bits have been removed that could have been removed if we didn't use clipping prevention)?
Can you imagine there being a problem when no bit is removed due to clipping prevention in one block, and ~10 bits are removed in the next block?
It *shouldn't* report -1 bits removed *ever*. That means there's a bug in that bit of the code.
On your point about 0 btr in one codec_block then 10 btr in the next, I don't really know.....
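To make the clipping prevention question concrete, here is a minimal sketch. It is an assumption about how such a check could work, not Nick.C's Delphi code: it supposes that removing n bits means rounding each sample to the nearest multiple of 2^n, and that the bit count is lowered until no rounded sample leaves the 16-bit range. The names `round_to_bits` and `bits_after_clip_check` are hypothetical.

```python
def round_to_bits(sample, bits):
    """Round a sample to the nearest multiple of 2**bits."""
    step = 1 << bits
    return ((sample + (step >> 1)) // step) * step


def bits_after_clip_check(block, bits_to_remove, lo=-32768, hi=32767):
    """Lower bits_to_remove until rounding no longer pushes a sample out of range."""
    bits = bits_to_remove
    while bits > 0 and any(not lo <= round_to_bits(s, bits) <= hi for s in block):
        bits -= 1
    return bits  # floor is 0, so a value of -1 cannot come out of this loop


# A block with a full-scale peak keeps all of its bits ...
print(bits_after_clip_check([32767, 100, -5], 10))
# ... while a neighbouring quieter block may still lose 10.
print(bits_after_clip_check([1000, -2000], 10))
```

In a scheme like this the result is bounded below by 0, which fits the statement that a reported -1 points to a bug, and it reproduces the situation asked about: 0 bits removed in one block and 10 in the next.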
Thanks for your replies, gents.
Nick, I'm using flac settings -3 -m -e -r 2 -b 512 at the moment, but I had the same problem using -5 & -8.
One thing I haven't tried yet is to play the actual lossyWAV file. I will try this tonight along with your suggestions, halb27.
Also, I have seen the -1 bits removed on several of my encodings in the past due to clipping prevention. Are you saying this could have a large impact on the encoding?
Also, regarding the clipping, should I be using replaygain on this type of track? I've never used replaygain before because I've always been under the impression that it's processing the sound so that it no longer sounds like the original.
b) The replaygain procedure doesn't process the sound. It just computes a volume correction value and stores it in the file so that the playback machinery is able to adjust volume according to this value (in case the playback machinery has a replaygain feature).
[snip]
The only form of 'sound processing' because of replaygain can occur on playback when there is something like a 'soft clipping prevention' feature with replaygain, which usually can be switched off if not wanted.
Quote from: halb27 on 17 December, 2007, 02:56:25 AM
b) The replaygain procedure doesn't process the sound. ...
Well, that's not entirely true. In fact, if the replaygain code is handled in 16 bits, it can easily cause an audible effect. Given it's a portable, and probably only supports 16bit (and FIR), I would bet money that the output of the replaygain multiplier is a 16bit number, thereby truncating bits. ....
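The truncation concern can be illustrated with a small sketch. It assumes, purely for illustration, a player that multiplies each 16-bit sample by the replaygain scale factor and truncates the product straight back to a 16-bit integer without dither; whether Rockbox or any particular player actually does this is not established here. The function name `apply_gain_16bit` is hypothetical.

```python
def apply_gain_16bit(samples, gain_db):
    """Scale 16-bit samples by gain_db and truncate back to 16-bit integers."""
    scale = 10 ** (gain_db / 20)
    out = []
    for s in samples:
        v = int(s * scale)  # truncation toward zero discards the fractional part
        out.append(max(-32768, min(32767, v)))
    return out


samples = [1000, 1001, 1002, 1003]
quiet = apply_gain_16bit(samples, -6.0)
scale = 10 ** (-6.0 / 20)
errors = [s * scale - q for s, q in zip(samples, quiet)]
print(quiet)
print(errors)
```

Distinct input samples collapse onto the same output value, and each output carries a truncation error of up to one 16-bit step that is correlated with the signal.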