lossyWAV Development

Reply #650
well, i got bored again...



[edit]
click here to see it on different colored backgrounds
all of those are actually the same exact PNG file as the one i put in this post. 
[/edit]

[edit2]
here's the logo, "naked", if you wanna see it alone.



[/edit2]

Reply #651
...@Halb27: I think there might be some benefit in reducing the C at the end of the 1024 fft spf to, say, 9, to reduce the number of bins being averaged.

IMO that's the right direction. I did some first trials, though not with the 1024 sample FFT but with the 64 sample FFT, whose resolution is fine IMO for judging the highest frequency zones and whose good time resolution may be essential for samples like eig. So far I've seen that the second highest frequency zone is the most important one for eig. 22225 yields quite a good though not perfectly transparent result. I'm pretty busy now but I'll try whether 22224 (as in -2) will improve things. But I guess we'll also have to come down from -nts +6 a bit. We'll see.
lame3995o -Q1.7 --lowpass 17

Reply #652
...@Halb27: I think there might be some benefit in reducing the C at the end of the 1024 fft spf to, say, 9, to reduce the number of bins being averaged.
IMO that's the right direction. I did some first trials, though not with the 1024 sample FFT but with the 64 sample FFT, whose resolution is fine IMO for judging the highest frequency zones and whose good time resolution may be essential for samples like eig. So far I've seen that the second highest frequency zone is the most important one for eig. 22225 yields quite a good though not perfectly transparent result. I'm pretty busy now but I'll try whether 22224 (as in -2) will improve things. But I guess we'll also have to come down from -nts +6 a bit. We'll see.
I've tried -3 -spf 22234-22235-22346-22357-22468 and it raises the bitrate to 412.3kbps for my sample set. It takes about 0.1bits off the number removed from eig.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #653
I've tried -3 -spf 22234-22235-22346-22357-22468 and it raises the bitrate to 412.3kbps for my sample set. It takes about 0.1bits off the number removed from eig.

My trial yesterday was with -3 -spf 22225-22235-22346-22357-224FF and bits to remove for eig went down significantly (~ 1 bit in the critical first seconds). So I think the 2nd highest frequency zone is essential here, maybe the highest zone as well. Average bitrate of regular music did not go up significantly btw.
lame3995o -Q1.7 --lowpass 17

Reply #654
I've tried -3 -spf 22234-22235-22346-22357-22468 and it raises the bitrate to 412.3kbps for my sample set. It takes about 0.1bits off the number removed from eig.

My trial yesterday was with -3 -spf 22225-22235-22346-22357-224FF and bits to remove for eig went down significantly (~ 1 bit in the critical first seconds). So I think the 2nd highest frequency zone is essential here, maybe the highest zone as well. Average bitrate of regular music did not go up significantly btw.
You're right - it does indeed bring down eig quite a lot. How about a combination: 22225-22235-22346-22357-22468? This yields 414.0kbps for my 53 sample set.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #655
I tried eig again using -3 -spf 22224-22236-22347-22358-2246C thus being more demanding with the two highest frequency zones at the 64 sample FFT. Now I can't abx eig any more.
But this is pretty much on the cutting edge for my listening experience, and I'm sure there are a lot of people out there with a better sensitivity towards temporal resolution problems. So I suggest we reduce the positive nts values and use -nts +3 for -3, and -nts 0 for -2.

-3 -spf 22224-22236-22347-22358-2246C -nts 3 yields 375 kbps with my regular set, -2 -nts 0 yields 422 kbps.

To me this is still a very good result, not far from the results of the current setting, and it brings us a considerable way back to the solid basis where in theory -nts should be 0.
lame3995o -Q1.7 --lowpass 17

Reply #656
I tried eig again using -3 -spf 22224-22236-22347-22358-2246C thus being more demanding with the two highest frequency zones at the 64 sample FFT. Now I can't abx eig any more.
But this is pretty much on the cutting edge for my listening experience, and I'm sure there are a lot of people out there with a better sensitivity towards temporal resolution problems. So I suggest we reduce the positive nts values and use -nts +3 for -3, and -nts 0 for -2.

-3 -spf 22224-22236-22347-22358-2246C -nts 3 yields 375 kbps with my regular set, -2 -nts 0 yields 422 kbps.

To me this is still a very good result, not far from the results of the current setting, and it brings us a considerable way back to the solid basis where in theory -nts should be 0.
I will happily agree the -spf parameter for -3.

As you can't abx any of the problem samples using -3 -spf 22224-22236-22247-22358-2246C -nts +6.0, I feel that a reduction of 3dB (0.5 bits potentially) for the -nts parameter is a bit too much.

-3 -spf 22224-22236-22347-22358-2246C -nts +3.0 results in 433.5kbps for my sample set and changing the +3.0 to  +4.5 results in 422.7kbps for my sample set.

So, I suggest we use -spf 22224-22236-22347-22358-2246C -nts +4.5 for quality preset -3.
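
The "3dB (0.5 bits)" equivalence used above follows from one bit of resolution corresponding to 20·log10(2), about 6.02 dB. A minimal sketch of that conversion; the helper name `db_to_bits` is made up for illustration and is not part of lossyWAV:

```python
import math

# One bit of amplitude resolution is a factor of 2, i.e. about 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)


def db_to_bits(db: float) -> float:
    """Convert a level shift in dB into the equivalent number of bits."""
    return db / DB_PER_BIT


if __name__ == "__main__":
    # The -nts steps discussed in this exchange.
    for shift in (1.5, 3.0, 4.5, 6.0):
        print(f"{shift:+.1f} dB is about {db_to_bits(shift):.2f} bits")
```

So going from -nts +6.0 to +3.0 is worth roughly half a bit of removal, and going from +6.0 to +4.5 roughly a quarter of a bit.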
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #657
As you can't abx any of the problem samples using -3 -spf 22224-22236-22247-22358-2246C -nts +6.0, I feel that a reduction of 3dB (0.5 bits) for the -nts parameter is a bit too much.

-3 -spf 22224-22236-22347-22358-2246C -nts +3.0 results in 433.5kbps for my sample set and changing the +3.0 to  +4.5 results in 422.7kbps for my sample set.

So, I suggest we use -spf 22224-22236-22347-22358-2246C -nts +4.5 for quality preset -3.

For a real-life impression, why don't you take a restricted selection of full length tracks? If bitrate goes up for problematic tracks like those in your sample set, this is welcome. It's not so welcome, of course, with regular music.
My regular set consists of just 12 full length tracks of various musical styles, so I can get an impression very fast. I know from posted experience that my 375 kbps result is a bit low compared to other musical mixtures reported, but the difference isn't a big one. From this experience I think it's safe to say that when the result of my regular set is 375 kbps, the average bitrate is ~380 kbps.
Sure, ~380 kbps is a bit more than the ~350 kbps of the current -3 setting, but it's not by much IMO.
If it comes down to deciding between -nts +3 and -nts +4.5, the difference is even smaller.
The reason why I dislike a rather small lowering from +6 to +4.5 is that I think small -nts steps don't have a significant effect. This is partly due to my listening experience when using insane positive nts values.
So I think in order to have a significant quality effect we shouldn't consider a delta lower than 3 for nts.
Not for the sake of saving ~15 kbps.
Of course this is because if in doubt I want to play it safe. Just my attitude towards it.
lame3995o -Q1.7 --lowpass 17

Reply #658
As you can't abx any of the problem samples using -3 -spf 22224-22236-22247-22358-2246C -nts +6.0, I feel that a reduction of 3dB (0.5 bits) for the -nts parameter is a bit too much.

-3 -spf 22224-22236-22347-22358-2246C -nts +3.0 results in 433.5kbps for my sample set and changing the +3.0 to  +4.5 results in 422.7kbps for my sample set.

So, I suggest we use -spf 22224-22236-22347-22358-2246C -nts +4.5 for quality preset -3.

For a real-life impression, why don't you take a restricted selection of full length tracks? If bitrate goes up for problematic tracks like those in your sample set, this is welcome. It's not so welcome, of course, with regular music.
My regular set consists of just 12 full length tracks of various musical styles, so I can get an impression very fast. I know from posted experience that my 375 kbps result is a bit low compared to other musical mixtures reported, but the difference isn't a big one. From this experience I think it's safe to say that when the result of my regular set is 375 kbps, the average bitrate is ~380 kbps.
Sure, ~380 kbps is a bit more than the ~350 kbps of the current -3 setting, but it's not by much IMO.
If it comes down to deciding between -nts +3 and -nts +4.5, the difference is even smaller.
The reason why I dislike a rather small lowering from +6 to +4.5 is that I think small -nts steps don't have a significant effect. This is partly due to my listening experience when using insane positive nts values.
So I think in order to have a significant quality effect we shouldn't consider a delta lower than 3 for nts.
Not for the sake of saving ~15 kbps.
Of course this is because if in doubt I want to play it safe. Just my attitude towards it.
Tomorrow morning I'll process the 10 albums previously used for bitrate comparison using your proposal for the -3 quality preset.

I had a thought - if we end up with, say, 380kbps then that's still a bit less than OGG q 10 (circa 400kbps) and I'm not worried as it will be only about 60kbps above the upper bitrate limit for standard MP3.

I would be content with that.

Overall I am more concerned with the quality of the processed output than I am with the bitrate.

As an aside, I recently ordered a 16GB compact flash (to go with my 3 x 4GB SD Cards) for my iPAQ - lots more space for my .lossy.FLAC collection  ! Combined with Mortplayer using GSPFLAC.DLL it's working well.

[edit] Corrected OGG max bitrate [/edit]
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

 

Reply #659
I'll process the 10 albums previously used for bitrate comparison using your proposal for the -3 quality preset. ........
lossyWAV beta v0.5.8 attached: Superseded.

Modified -1, -2 & -3 quality presets.
Code:
lossyWAV beta v0.5.8 : WAV file bit depth reduction method by 2Bdecided.
Delphi implementation by Nick.C from a Matlab script, www.hydrogenaudio.org

Usage   : lossyWAV <input wav file> <options>

Example : lossyWAV musicfile.wav

Quality Options:

-1            extreme settings [4xFFT] (-cbs 512 -nts -3.0 -skew 36 -snr 21
              -spf 22224-22225-11235-11246-12358 -fft 11011)
-2            default settings [3xFFT] (-cbs 512 -nts  0.0 -skew 36 -snr 21
              -spf 22224-22235-22346-12347-12358 -fft 10101)
-3            compact settings [2xFFT] (-cbs 512 -nts +3.0 -skew 36 -snr 21
              -spf 22224-22235-22347-22358-2246C -fft 10001)

Standard Options:

-o <folder>   destination folder for the output file
-nts <n>      set noise_threshold_shift to n dB (-48.0dB<=n<=+48.0dB)
              (-ve values reduce bits to remove, +ve values increase)
-force        forcibly over-write output file if it exists; default=off

Codec Specific Options:

-wmalsl       optimise internal settings for WMA Lossless codec; default=off

Advanced / System Options:

-shaping      enable fixed shaping using bit_removal difference of previous
              samples [value = brd(-1)/4]; default=off
-snr <n>      set minimum average signal to added noise ratio to n dB;
              (-215.0dB<=n<=48.0dB) Increasing value reduces bits to remove.
-skew <n>     skew fft analysis results by n dB (0.0db<=n<=48.0db) in the
              frequency range 20Hz to 3.45kHz
-spf <5x5hex> manually input the 5 spreading functions as 5 x 5 characters;
              These correspond to FFTs of 64, 128, 256, 512 & 1024 samples;
              e.g. 22235-22236-22347-22358-2246C (Characters must be one of
              1 to 9 and A to F (zero excluded).
-fft <5xbin>  select fft lengths to use in analysis, using binary switching,
              from 64, 128, 256, 512 & 1024 samples, e.g. 01001 = 128,1024
-cbs <n>      set codec block size to n samples (512<=n<=4608, n mod 32=0)

-quiet        significantly reduce screen output
-nowarn       suppress lossyWAV warnings
-detail       enable detailled output mode

-below        set process priority to below normal.
-low          set process priority to low.

Special thanks:

David Robinson for the method itself and motivation to implement it in Delphi.
Dr. Jean Debord for the use of TPMAT036 uFFT & uTypes units for FFT analysis.
Halb27 @ www.hydrogenaudio.org for donation and maintenance of the wavIO unit.
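
The -spf and -fft strings described in the usage text can be decoded mechanically. The sketch below only follows what the help text states (five groups of five hex characters, 1 to 9 and A to F, mapped to FFT lengths 64 to 1024, and a five-digit binary switch for -fft). The function names are hypothetical, and what each individual character controls (presumably the averaging width for one frequency zone, going by the discussion earlier in the thread) is my reading, not something the help text spells out.

```python
# FFT lengths in the order used by both -spf and -fft (from the usage text).
FFT_LENGTHS = (64, 128, 256, 512, 1024)


def parse_spf(spf: str) -> dict[int, list[int]]:
    """Split an -spf string into one list of 5 values per FFT length."""
    groups = spf.split("-")
    if len(groups) != 5 or any(len(g) != 5 for g in groups):
        raise ValueError("expected 5 groups of 5 characters")
    table = {}
    for length, group in zip(FFT_LENGTHS, groups):
        values = [int(ch, 16) for ch in group]  # raises ValueError on non-hex
        if any(v == 0 for v in values):
            raise ValueError("zero is excluded")
        table[length] = values
    return table


def parse_fft(mask: str) -> list[int]:
    """Return the FFT lengths switched on by a 5-digit binary -fft mask."""
    if len(mask) != 5 or set(mask) - {"0", "1"}:
        raise ValueError("expected 5 binary digits")
    return [n for n, bit in zip(FFT_LENGTHS, mask) if bit == "1"]


if __name__ == "__main__":
    print(parse_fft("01001"))  # the help text's own example: 128 and 1024
    for length, values in parse_spf("22224-22235-22347-22358-2246C").items():
        print(length, values)
```

Read this way, the trailing C in the 1024 sample group is simply the value 12.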
Summary of bitrates for 10 album test set.
Code:
 Conversion using lossyWAV beta v0.5.8, FLAC -8
|=======================================|=========|=========|=========|=========|
|Album                                  | FLAC -8 |  lW -1  |  lW -2  |  lW -3  |
|=======================================|=========|=========|=========|=========|
|AC/DC - Dirty Deeds Done Dirt Cheap    | 781kbps | 468kbps | 417kbps | 366kbps |
|B52's - Good Stuff                     | 993kbps | 476kbps | 421kbps | 376kbps |
|David Byrne - Uh-Oh                    | 937kbps | 464kbps | 413kbps | 363kbps |
|Fish - Songs From The Mirror           | 854kbps | 451kbps | 399kbps | 357kbps |
|Gerry Rafferty - City To City          | 802kbps | 468kbps | 416kbps | 366kbps |
|Iron Maiden - Can I Play With Madness  | 784kbps | 486kbps | 437kbps | 387kbps |
|Jean Michel Jarre - Oxygene            | 773kbps | 538kbps | 475kbps | 422kbps |
|Marillion - The Thieving Magpie        | 790kbps | 473kbps | 421kbps | 373kbps |
|Mike Oldfield - Tr3s Lunas             | 848kbps | 491kbps | 436kbps | 389kbps |
|Scorpions - Best Of Rockers N' Ballads | 922kbps | 492kbps | 437kbps | 378kbps |
|=======================================|=========|=========|=========|=========|
|Average                                | 850kbps | 480kbps | 426kbps | 376kbps |
|=======================================|=========|=========|=========|=========|
|53 sample "problem" set                | 784kbps | 543kbps | 491kbps | 434kbps |
|=======================================|=========|=========|=========|=========|
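
For anyone who wants to re-check the table, the sketch below recomputes plain (unweighted) per-column means from the ten album rows. This is only arithmetic on the posted figures. The plain means come out a kbps or two away from the Average row, which suggests, as an assumption on my part, that the posted averages are weighted (for example by album length) rather than simple means.

```python
# Per-album bitrates in kbps, copied from the table: FLAC -8, lW -1, lW -2, lW -3.
COLUMNS = ("FLAC -8", "lW -1", "lW -2", "lW -3")
ALBUMS = {
    "AC/DC - Dirty Deeds Done Dirt Cheap": (781, 468, 417, 366),
    "B52's - Good Stuff": (993, 476, 421, 376),
    "David Byrne - Uh-Oh": (937, 464, 413, 363),
    "Fish - Songs From The Mirror": (854, 451, 399, 357),
    "Gerry Rafferty - City To City": (802, 468, 416, 366),
    "Iron Maiden - Can I Play With Madness": (784, 486, 437, 387),
    "Jean Michel Jarre - Oxygene": (773, 538, 475, 422),
    "Marillion - The Thieving Magpie": (790, 473, 421, 373),
    "Mike Oldfield - Tr3s Lunas": (848, 491, 436, 389),
    "Scorpions - Best Of Rockers N' Ballads": (922, 492, 437, 378),
}


def column_means(rows: dict[str, tuple[int, ...]]) -> tuple[float, ...]:
    """Unweighted mean of each column across all albums."""
    n = len(rows)
    width = len(COLUMNS)
    return tuple(sum(r[i] for r in rows.values()) / n for i in range(width))


if __name__ == "__main__":
    means = column_means(ALBUMS)
    for name, mean in zip(COLUMNS, means):
        # Ratio to the lossless FLAC -8 column shows the relative saving.
        print(f"{name}: {mean:.1f} kbps ({100 * mean / means[0]:.0f}% of FLAC -8)")
```

With the figures above the plain means are about 848, 481, 427 and 378 kbps.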
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #660
Thank you, Nick.
lame3995o -Q1.7 --lowpass 17

Reply #661
For all of the developers involved with this project, I'd first like to compliment you for coming up with such a wonderful idea. I've been tracking this thread for a while, and am greatly impressed by the progress.

My question relates to the identification of LossyWAV files. I'm sure it's a lot of [unnecessary?] hassle, but it might be advantageous to somehow note in a RIFF chunk the fact that the file is a LossyWAV file, and perhaps include a note of the settings used to create the file. APEv2 tags are also an option, though I hear [I have not confirmed this personally] that some software has trouble with APE tags at the end of RIFF files. RIFF mechanisms would be preferable as this information would be stored by some codecs (WavPack for sure, others?) and could be re-pasted into a file when uncompressed.

Yes, I am aware that lossily compressed audio can be decompressed to WAV files without the WAV file being tagged in any special way, but if some tagging mechanism that would be performed by LossyWAV were in effect, it would be immediately obvious that the file is not losslessly compressed.

Then again, if it's more trouble than it is worth, then forget it, but if it's doable, it would be a nice feature to have. In an ideal world, lossless encoding tools could even read this header/footer/info tag and adjust blocksizes accordingly, allowing the user to get efficient results without personally knowing that the file has been pre-processed.

Regards,

UED77

[Edit: I just browsed back a couple pages to some posts I've previously missed and spotted a discussion about the possibility of including checksums. Unfortunately, I found the response to that a bit complicated to understand, so I don't know if it was ruled feasible or not.]
UED77
wavpack 4.50 -hx3; lame 3.97 -V4 --vbr-new

Reply #662
.... but it might be advantageous to somehow note in a RIFF chunk the fact that the file is a LossyWAV file, and perhaps include a note of the settings used to create the file....

....spotted a discussion about the possibility of including checksums....
Thanks for the input and appreciation. We're having fun with this project !

When I work out how the WAV format works (Halb27 did all the difficult bit by writing the WAV I/O routines) I'll try to add a check for a relevant chunk with lossyWAV data in it and if none exists, create one with something like "lossyWAV <version> <quality setting> <list of other settings that were actually used> <CRC32 of output samples>"
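
A rough sketch of what such a chunk could look like at the byte level, written in Python rather than Delphi. RIFF chunks are a 4-byte ID, a little-endian 32-bit size, the payload, and a pad byte if the payload length is odd. The chunk ID `lsyW`, the function names and the exact text layout are all invented here for illustration; nothing in this thread fixes them.

```python
import io
import struct
import wave
import zlib


def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Build one RIFF chunk: ID, little-endian size, payload, optional pad."""
    if len(chunk_id) != 4:
        raise ValueError("chunk ID must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def append_chunk(riff: bytes, chunk: bytes) -> bytes:
    """Append a chunk to a complete RIFF/WAVE file and fix the RIFF size."""
    if riff[:4] != b"RIFF" or riff[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    body = riff[8:] + chunk
    return b"RIFF" + struct.pack("<I", len(body)) + body


def describe(version: str, quality: str, settings: str, samples: bytes) -> bytes:
    """Text payload in the spirit of the proposal, with a CRC32 of the samples."""
    crc = zlib.crc32(samples) & 0xFFFFFFFF
    return f"lossyWAV {version} {quality} {settings} CRC32={crc:08X}".encode("ascii")


if __name__ == "__main__":
    samples = struct.pack("<4h", 0, 16, -16, 32)  # four 16-bit mono samples
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(44100)
        w.writeframes(samples)
    info = describe("0.5.8", "-3", "-nts +3.0", samples)
    tagged = append_chunk(buf.getvalue(), make_chunk(b"lsyW", info))  # hypothetical ID
    print(len(buf.getvalue()), "->", len(tagged), "bytes")
```

Putting the chunk after the data chunk keeps the sketch simple; a reader that stops at the data chunk never sees it, while a tool that walks all chunks can find it.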

I've just removed from the code any reliance on knowing how large the file is, with a view to piped input. How I pipe in from, say, Foobar2000 and then pipe out to, say, FLAC, I haven't a clue as yet. Could it be as simple as "lossywav - <quality setting> | flac -8 - -o<output_filename>"?

Another thing on the list of things to do is to allow the parallel creation of a correction file, presumably with the same RIFF chunk in it, to allow recreation of the lossless original.

Thanks again,

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #663
bug:

Reply #664
I had a PM from stel with some concerns:

Quote
I've come across what I think is a problem sample. It's a bit strange in that the sample seems to play OK on one of my DAPs, but I can hear sound distortion/sound breaking up in places when played on the other DAP. What puzzles me more is that the original FLAC plays without a problem on both DAPs, and this is what leads me to believe it's an encoder problem. I first spotted the issue on beta 5.4 but I've just tried it on 5.8 and get the same results.

The DAPs are both rockboxed (Sansa E280 & iAudio M5) and the earphones are Shure SE530. I can only hear the problem on the M5. The sample is 'Groove Armada - Soundboy Rock - Lightsonic' and although I hear issues throughout the track I've noted that it definitely happens at 4.30sec. The average bitrate for this track is 528kbps when encoded using the standard -3 lossyWAV settings.

Even more annoying is that I cannot hear the distortion on my PC using an AMP & Sennheiser HD650's either.

I've also come across the problem on a different album but I need to dig this out again because I forgot to take a note when it happened.

Are you interested in investigating this further? What would you need?

I've also come across two samples from the 'Shakespears Sister\Long Live The Queens!' album where the average bitrate for -3 encoding is 696kbps & 711kbps. Could these prove useful to you?
If anyone else has had any experience of this, please add samples of up to 30 seconds of the portion of the track in question to this thread.

Many thanks,

Nick.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #665
Sorry for not posting these sooner. I've been out all day.
I've attached 10 second samples. The issue happens 5 seconds in when encoding the sample.flac file.
Sorry, the M5_issue.flac isn't great quality but you can clearly hear the issue I'm experiencing.
If you need any more info, give me a shout.
I'm going to try and find the other track I've got problems with.

Edit: Oh no, look at the post number... I'm not the devil, honest

Thanks
Steve

Reply #666
Sorry for not posting these sooner. I've been out all day.
I've attached 10 second samples. The issue happens 5 seconds in when encoding the sample.flac file.
Sorry, the M5_issue.flac isn't great quality but you can clearly hear the issue I'm experiencing.
If you need any more info, give me a shout.
I'm going to try and find the other track I've got problems with.

Edit: Oh no, look at the post number... I'm not the devil, honest

Thanks
Steve
Thanks for the samples - I'll have a listen....

Just a thought, but which FLAC setting are you using? It appears to me that -8 will require more CPU than, say, -3. Halb27 and Mitch 1 2 found that -3 -e -m -r 2 -b 512 works really well (only about 1kbps difference from -8), and probably takes less effort to decode.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #667
Thanks for your sample, stel.

I can hear the distortion with your M5 sample at ~ sec.5, but not when encoding sample.flac myself using lossyWAV -3 - the same experience you have on your computer.

How can we figure out whether it's an encoder problem (the fact that lossless FLAC works fine sounds like that) or a problem specific to the iAudio M5 (the fact that on a computer and on your other DAP there is no problem sounds like that)?

I suggest you do some other encodings using -2, -1, -1 -nts 6, -1 -nts 9, -1 -nts 12, ... and report what happens.
There's a lot of clipping in this sample. Can you reduce the volume a bit by using for instance a wav editor, and have a look whether the problem remains?

Nick, I guess when using -detail if you report a -1 as the number of bits removed this means no bit is removed due to clipping prevention (or does it say: due to clipping prevention, not all the bits have been removed that could have been removed if we didn't use clipping prevention)?
Can you imagine there is a problem when no bit is removed due to clipping prevention in one block, and ~10 bits are removed in the next block?
lame3995o -Q1.7 --lowpass 17

Reply #668
Nick, I guess when using -detail if you report a -1 as the number of bits removed this means no bit is removed due to clipping prevention (or does it say: due to clipping prevention, not all the bits have been removed that could have been removed if we didn't use clipping prevention)?
Can you imagine there is a problem when no bit is removed due to clipping prevention in one block, and ~10 bits are removed in the next block?
It *shouldn't* report -1 bits removed *ever*. That means there's a bug in that bit of the code.

On your point about 0 btr in one codec_block then 10 btr in the next, I don't really know.....
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #669
It *shouldn't* report -1 bits removed *ever*. That means there's a bug in that bit of the code.

There are a lot of -1's when using -3 -detail.
On your point about 0 btr in one codec_block then 10 btr in the next, I don't really know.....

Just my thought when looking at the -detail report interpreting '-1' as '0 btr'.
lame3995o -Q1.7 --lowpass 17

Reply #670
Thanks for your replies gents,
Nick, I'm using flac settings -3 -m -e -r 2 -b 512 at the moment, but I had the same problem using -5 & -8

One thing I haven't tried yet is to play the actual lossyWAV file, I will try this tonight along with your suggestions halb27.

Also, I have seen the -1 bits removed on several of my encodings in the past due to clipping prevention, are you saying this could have a large impact on the encoding?

Also, regarding the clipping, should I be using replaygain on this type of track? I've never used replaygain before because I've always been under the impression that it's processing the sound so that it no longer sounds like the original.

Reply #671
Thanks for your replies gents,
Nick, I'm using flac settings -3 -m -e -r 2 -b 512 at the moment, but I had the same problem using -5 & -8

One thing I haven't tried yet is to play the actual lossyWAV file, I will try this tonight along with your suggestions halb27.

Also, I have seen the -1 bits removed on several of my encodings in the past due to clipping prevention, are you saying this could have a large impact on the encoding?

Also, regarding the clipping, should I be using replaygain on this type of track? I've never used replaygain before because I've always been under the impression that it's processing the sound so that it no longer sounds like the original.
Replaygain (applied to the file, rather than appended) will only ever change amplitude and should only affect volume - not sound.

I'll have a look at the -1 btr issue, although, thinking about it, it should have no impact at all as when btr falls to zero then the block is merely stored and is not processed at all.
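
To make the "btr falls to zero, block is merely stored" case concrete, here is a deliberately simplified sketch. It rounds every sample in a block to a multiple of 2^btr and, if any rounded value would leave the 16-bit range, retries with one bit fewer. This is an assumption-laden illustration of the idea being discussed, not lossyWAV's actual code; the function name `remove_bits` and the retry strategy are made up.

```python
def remove_bits(block: list[int], btr: int, bits: int = 16) -> tuple[list[int], int]:
    """Round samples to multiples of 2**btr, backing off if that would clip.

    Returns the processed block and the number of bits actually removed.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    while btr > 0:
        step = 1 << btr
        # Add half a step, then drop the low bits: round to nearest multiple.
        rounded = [((s + step // 2) >> btr) << btr for s in block]
        if all(lo <= r <= hi for r in rounded):
            return rounded, btr
        btr -= 1  # rounding would clip somewhere, so try removing fewer bits
    return list(block), 0  # nothing removed: the block is stored unchanged


if __name__ == "__main__":
    print(remove_bits([100, -100, 37], 3))  # ordinary block, 3 bits removed
    print(remove_bits([32767, 0], 4))       # full-scale sample, falls back to 0
```

In this toy model a block containing a full-scale positive sample always ends up with 0 bits removed, which is consistent with the observation that heavily clipped tracks come out at high bitrates.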

Playing the WAV should be a good test to see whether the problem lies with playback or processing.
lossyWAV -q X -a 4 -s h -A --feedback 2 --limit 15848 --scale 0.5 | FLAC -5 -e -p -b 512 -P=4096 -S- (having set foobar to output 24-bit PCM; scaling by 0.5 gives the ANS headroom to work)

Reply #672
a) Nick said a '-1' shouldn't be seen with -detail, so probably there's something wrong and he'll certainly find out. Maybe only the display is affected. We'll see.
The current clipping prevention is meant to maintain quality; the downside is that bitrate with strongly clipping tracks can get pretty high, as with your reported samples. But maybe the clipping prevention strategy has other side effects, like no bit removed in one block due to clipping prevention and 10 bits removed in the next block due to the normal lossyWAV mechanism, leading to distortion. I can't imagine it's like that, because that would be audible outside of your iAudio M5 environment as well. But until things are clear we should keep it in mind. For clarification it would be good if you could encode a volume-reduced variant of your sample. I can do the loudness-reduced encoding in case you're not used to wave editing.

b) The replaygain procedure doesn't process the sound. It just computes a volume correction value and stores it in the file so that the playback machinery is able to adjust volume according to this value (in case the playback machinery has a replaygain feature).
The target is to have each track in a series of tracks originating from different albums at its adequate loudness.
Sound impression varies with volume due to the volume-dependent frequency characteristics of our audio perception.
The only form of 'sound processing' because of replaygain can occur on playback when there is something like a 'soft clipping prevention' feature with replaygain which usually can be switched off if not wanted.
Anyway whether or not you want to use replaygain has nothing to do with our problem here in case it should be an encoding problem.
lame3995o -Q1.7 --lowpass 17

Reply #673
b) The replaygain procedure doesn't process the sound. It just computes a volume correction value and stores it in the file so that the playback machinery is able to adjust volume according to this value (in case the playback machinery has a replaygain feature).
[snip]
The only form of 'sound processing' because of replaygain can occur on playback when there is something like a 'soft clipping prevention' feature with replaygain which usually can be switched off if not wanted.


Well that's not entirely true.  In fact, if the replaygain code is handled in 16bits, it can easily cause an audible effect to be heard.  Given it's a portable, and probably only supports 16bit (and FIR), I would bet money that the output of the replaygain multiplier is a 16bit number, thereby truncating bits.

The first thing to do would be to find out if your player's DAC even supports 24bit output.  If not, then it's at least running dithering on the end of the replaygain function (or somewhere before it's sent to the DAC), if not just truncating anyways.
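
The point about a 16-bit gain stage can be illustrated with a small sketch: scale integer samples by a gain in dB and force the result back into 16 bits, either by truncating or by rounding. This is generic fixed-width arithmetic, not the code of any particular player or of Rockbox; the function name and the `truncate` switch are made up for the example.

```python
def apply_gain_16bit(samples: list[int], gain_db: float, truncate: bool = True) -> list[int]:
    """Scale samples by gain_db and return them as clamped 16-bit integers."""
    g = 10 ** (gain_db / 20)  # dB to linear amplitude factor
    out = []
    for s in samples:
        scaled = s * g
        # int() truncates toward zero; round() goes to the nearest integer.
        v = int(scaled) if truncate else round(scaled)
        out.append(max(-32768, min(32767, v)))  # hard clip to the 16-bit range
    return out


if __name__ == "__main__":
    block = [1000, 1001, -1000, 30000]
    print(apply_gain_16bit(block, -6.0))                  # truncated
    print(apply_gain_16bit(block, -6.0, truncate=False))  # rounded
    print(apply_gain_16bit(block, +6.0))                  # positive gain can clip
```

Either way each output sample differs from the ideal scaled value by a fraction of an LSB, so the playback stage adds its own small quantisation error on top of whatever the encoder did.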

Reply #674
b) The replaygain procedure doesn't process the sound. ...


Well that's not entirely true.  In fact, if the replaygain code is handled in 16bits, it can easily cause an audible effect to be heard.  Given it's a portable, and probably only supports 16bit (and FIR), I would bet money that the output of the replaygain multiplier is a 16bit number, thereby truncating bits. ....

OK, but that's not exactly what I would call sound processing.
lame3995o -Q1.7 --lowpass 17