Title: Musepack encoder
Post by: S_O on 2010-12-05 20:54:34

Hello,
several years ago it was said that changing the Musepack encoder to produce MP2 output wouldn't take very long. But AFAIK nobody has released a Musepack-based MP2 encoder so far.
Because I need a high-quality MP2 encoder (for authoring DVDs), I tried it myself:

I was able to modify the encoder so that it outputs a 448 kbit/s MP1 file. If the bitrate Musepack wants to use is higher, the last subbands are simply cut off; otherwise the frame is padded with zeros. The problem now is that after just one Musepack frame (= three Layer I frames) the bitrate drops dramatically: from band 5 upward only 2 bits are assigned to each subband, and the lower ones get 8 bits at most. It doesn't matter which quality setting I try; it's the same from thumb to insane.
Otherwise the output file plays fine (with very noticeable artifacts because of the low effective bitrate, about 200 kbps Layer I, stereo, 44.1 kHz, but otherwise it seems to work).
Is anybody here familiar with the Musepack encoder and able to tell why this bitrate drop happens? I haven't done anything special: I disabled M/S coding, changed the scalefactors, wrote a function that writes the MPEG Layer I bitstream, and modified the allocate function so that it does not use unsupported quantizers (the resolution is increased in that case) and so that allocation is limited by the maximum bitrate (that's not causing the problem). That's it. Unfortunately the source code is not very structured or readable, so I don't see what I'm missing.
Any ideas?
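
For reference, the size of the fixed Layer I frames described above follows from the bitrate and the sample rate. The sketch below assumes the usual MPEG-1 Layer I rule (384 samples per frame, counted in 4-byte slots); the name `layer1_frame_bytes` is made up for illustration and does not come from the Musepack source.

```c
/* Samples per frame: one Musepack frame (1152) spans three Layer I frames (384 each). */
enum { LAYER1_SAMPLES = 384, MUSEPACK_SAMPLES = 1152 };

/*
 * Size in bytes of one MPEG-1 Layer I frame.
 * Layer I counts in 4-byte slots: slots = 12 * bitrate / sample_rate,
 * rounded down, plus one extra slot when the padding bit is set.
 */
static int layer1_frame_bytes(int bitrate_bps, int sample_rate_hz, int padding)
{
    int slots = (12 * bitrate_bps) / sample_rate_hz;
    return (slots + (padding ? 1 : 0)) * 4;
}
```

At 448 kbit/s and 44.1 kHz this gives 484 bytes per frame, or 488 bytes when the padding bit is set. Whatever the allocation produces has to be cut or zero-padded to that size.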

Title: Musepack encoder
Post by: alexeysp on 2010-12-06 10:21:11

Although I cannot answer your question, may I ask why you wouldn't just use TooLAME (http://en.wikipedia.org/wiki/TooLAME) instead?

Title: Musepack encoder
Post by: S_O on 2010-12-06 14:52:46

TooLAME hasn't been developed for a long time, and development of its successor TwoLAME also seems to have stopped some years ago. Neither encoder provides the quality of Musepack. Musepack has a more advanced encoder, allowing much higher quality. Of course the MP2 bitstream format does not allow all Musepack features (M/S stereo, Huffman coding, PNS, true VBR, ...), but you should be able to encode MP2 files of about the same quality as Musepack with a major bitrate increase (for example 256-320 kbps being comparable to Musepack standard).
I need the encoder to author DVDs containing music. Unfortunately LPCM causes problems with several players, and there is no free, high-quality AC-3 encoder around. Musepack-based MP2 seems to be a good choice to me.

I found out that the problem is the SMRs returned by "Psychoakustisches_Modell": from the second call of that function onward they are much too low. Unfortunately I haven't found the reason. Any ideas?

Title: Musepack encoder
Post by: S_O on 2010-12-06 22:43:05

I was able to trace the problem inside "Psychoakustisches_Modell" to the function "PreechoControl", which modifies a global array. After commenting out the array-modifying code the bitrate drop is gone, but the bit allocation is awful. Most likely this is not the only problem. I can encode a decent-sounding 448 kbps MP1 file using a fixed allocation table (not using psychoacoustics at all).
What am I missing that makes "Psychoakustisches_Modell" return no usable values?

Title: Musepack encoder
Post by: alexeysp on 2010-12-07 14:47:09

No offence, but I think you're wasting your time. I seriously doubt that there could be any audible difference between TooLAME at 384 kbps (maybe even lower) and the thing you're trying to make.

In any case, claims like

Quote from: S_O on 2010-12-06 14:52:46

Neither encoder provides the quality of Musepack. Musepack has a more advanced encoder, allowing much higher quality.

certainly demand proof (especially considering we're talking about bitrates around 400 kbps).

Title: Musepack encoder
Post by: S_O on 2010-12-07 17:53:10

Quote from: alexeysp on 2010-12-07 14:47:09

No offence, but I think you're wasting your time. I seriously doubt that there could be any audible difference between TooLAME at 384 kbps (maybe even lower) and the thing you're trying to make.

In any case, claims like

Quote from: S_O on 2010-12-06 14:52:46
Neither encoder provides the quality of Musepack. Musepack has a more advanced encoder, allowing much higher quality.

certainly demand proof (especially considering we're talking about bitrates around 400 kbps).

I understand what you mean. I remember having quite bad results with TooLAME. I just checked TwoLAME, and the first sample I tried was easily ABXable (8/8) at 256 kbps (with noticeable artifacts at 192 kbps), and my ears are not "tuned" to hear encoding artifacts. By the way, I'm not talking about ~400 kbps bitrates: 384 kbps is the maximum for MP2, and it would be great if it could already sound transparent at 256 kbps. High bitrate doesn't imply high quality. Try Blade at 320 kbps for MP3; you will find a lot of killer samples that won't sound transparent. You also cannot compare the bitrates of an old subband coder with no entropy coding to those of modern MDCT-based codecs (AAC, Vorbis or even MP3).
Musepack achieves this high quality because of a highly tuned encoder. Because MP2 is basically Musepack with several features missing, it is possible to create an MP2 bitstream of similar quality at higher bitrates using the Musepack encoder.

On the Musepack source:
I take it all back! I was completely wrong. It had nothing to do with "Psychoakustisches_Modell". M/S needed to be disabled in two places, and I noticed I was coding one quantizer wrong, which made the sound quite awful (and I didn't use that one in my fixed allocation table). So I basically got it; now I need to change the code to a more sophisticated CBR allocation, add all the MP2 allocation tables, etc.

At the moment there is just one question left: Combine Penalty. It's about how many scalefactors are coded for each band in a frame (1, 2 or 3). For that it uses this magic table:

Code:

static const unsigned char  Penalty [256] = {
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
      0,  2,  5,  9, 15, 23, 36, 54, 79,116,169,246,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
    255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
};

#define P(new,old)  Penalty [128 + (old) - (new)]

P is called with the new and old scalefactor index, and the value is compared to the combine penalty value of the profile (6 by default for all profiles). MP2 uses different scalefactors, so this table probably needs to be updated with new values. Unfortunately I have no idea what these values represent and how they were calculated in the first place.
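
To make the lookup easier to follow, here is a compact sketch of what the macro computes for the index differences the table covers. The function names and the "at or below the threshold" comparison are assumptions made for illustration; the post only says that the value is compared to the profile's combine penalty.

```c
/* The twelve finite entries of the table above: Penalty[128] .. Penalty[139]. */
static const unsigned char finite_penalty[12] = {
    0, 2, 5, 9, 15, 23, 36, 54, 79, 116, 169, 246
};

/* Same value as P(new, old), without storing the 256-entry table. */
static int combine_penalty(int new_index, int old_index)
{
    int diff = old_index - new_index;
    if (diff < 0 || diff > 11)
        return 255;              /* every other table entry is 255 */
    return finite_penalty[diff];
}

/* ASSUMPTION: a scalefactor is reused when its penalty does not exceed the threshold. */
static int can_combine(int new_index, int old_index, int threshold)
{
    return combine_penalty(new_index, old_index) <= threshold;
}
```

Under that assumption and the default threshold of 6, an index difference of up to 2 (penalty 5) would still be combined, while a difference of 3 (penalty 9) would not.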

Title: Musepack encoder
Post by: alexeysp on 2010-12-07 23:28:57

Apparently this table provides the penalty for replacing "old" with "new" depending on their difference. The penalty is at its maximum if old < new, or if the difference exceeds 11. Since "old" and "new" are actually indices into the scalefactor table, and, if I get it right, the scalefactors are defined as scf = 10**(-0.1*index/1.26), the difference of indices maps to a certain ratio of scalefactors. If we plot the penalty values Penalty[128] through Penalty[139] vs. the corresponding scalefactor ratio values, we get the following graph:

(http://www.i.com.ua/~alexeysp/p.png)

Now, don't quote me on this, but it seems that in these coordinates the penalty values are remarkably well described by the following function:

p(x) = 4.5*((1/x**2)-1)

where x is the scalefactor ratio (or by simple quadratic function for reverse ratios). The proportionality constant probably comes from some power-of-two to power-of-ten conversion, but it's just a guess.

For what it's worth, hope this helps.
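
Written out in terms of the index difference d = old - new, and assuming the scalefactor definition guessed above is right, the fitted curve becomes the following (the symbol d is introduced here only for illustration):

```latex
\[
x = \frac{\mathrm{scf}(\mathrm{old})}{\mathrm{scf}(\mathrm{new})} = 10^{-0.1\,d/1.26},
\qquad
p(d) = 4.5\left(\frac{1}{x^{2}} - 1\right) = 4.5\left(10^{\,0.2\,d/1.26} - 1\right)
\]
\[
p(0) = 0,\quad p(1) \approx 1.99,\quad p(2) \approx 4.85,\quad p(3) \approx 8.97,\quad p(11) \approx 246.25
\]
```

These values round to the table entries 0, 2, 5, 9 and 246, which is consistent with the fit. Recomputing the table for another scalefactor spacing would then only require a different exponent step.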

Title: Musepack encoder
Post by: S_O on 2010-12-08 13:12:41

Thank you very much! That makes sense and helped a lot. I calculated new values for the MP2 scalefactors.

I hope I'm able to finish the encoder so it can be released at the beginning of next year.

Title: Musepack encoder
Post by: S_O on 2010-12-15 12:28:38

I'm making some progress creating the encoder (I'm already able to write MP2 files with dynamic allocation, I removed all global variables and separated the frontend from the encoder), but I found something I don't understand:

Code:

#define MPPENC_DENORMAL_FIX_BASE ( 32. * 1024. /* normalized sample value range */ / ( (float) (1 << 24 /* first bit below 32-bit PCM range */ ) ) )
#define MPPENC_DENORMAL_FIX_LEFT ( MPPENC_DENORMAL_FIX_BASE )
#define MPPENC_DENORMAL_FIX_RIGHT ( MPPENC_DENORMAL_FIX_BASE * 0.5f )

These values are added to the input. Any idea why, and why only half as much on the right channel?

Title: Musepack encoder
Post by: benski on 2010-12-15 19:00:05

Quote from: S_O on 2010-12-15 12:28:38

I'm making some progress creating the encoder (I'm already able to write MP2 files with dynamic allocation, I removed all global variables and separated the frontend from the encoder), but I found something I don't understand:

Code:
#define MPPENC_DENORMAL_FIX_BASE ( 32. * 1024. /* normalized sample value range */ / ( (float) (1 << 24 /* first bit below 32-bit PCM range */ ) ) )
#define MPPENC_DENORMAL_FIX_LEFT ( MPPENC_DENORMAL_FIX_BASE )
#define MPPENC_DENORMAL_FIX_RIGHT ( MPPENC_DENORMAL_FIX_BASE * 0.5f )
These values are added to the input. Any idea why, and why only half as much on the right channel?

No idea why the right channel only gets half, but "denormal" refers to an issue with the FPU on some Intel processors where values very close to zero are calculated using a more precise but much slower mode. There's no perfect way around it, but a common approach is to add a small inaudible value so that numbers close to zero get pushed back over the threshold. It has nothing to do with Musepack or MP2; it's simply an optimization that's commonly done in DSP code that will run on x86.
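
As a concrete illustration of the "small inaudible value" approach, the sketch below evaluates the quoted macros and adds them to a stereo buffer. The helper `add_denormal_offset` is hypothetical and not taken from the Musepack source. One possible reason for the unequal offsets, which nobody in the thread confirms, is that they keep the left-minus-right difference away from zero as well.

```c
#include <stddef.h>

/* The three macros as quoted in the thread, with the inline comments dropped. */
#define MPPENC_DENORMAL_FIX_BASE ( 32. * 1024. / ( (float) (1 << 24) ) )
#define MPPENC_DENORMAL_FIX_LEFT ( MPPENC_DENORMAL_FIX_BASE )
#define MPPENC_DENORMAL_FIX_RIGHT ( MPPENC_DENORMAL_FIX_BASE * 0.5f )

/* Hypothetical helper: add the per-channel offsets to n stereo samples. */
static void add_denormal_offset(float *left, float *right, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        left[i]  += (float)MPPENC_DENORMAL_FIX_LEFT;   /* 32768 / 2^24 = 1/512  */
        right[i] += (float)MPPENC_DENORMAL_FIX_RIGHT;  /* half of that = 1/1024 */
    }
}
```

With digital silence as input, the left channel becomes a constant 1/512 and the right channel a constant 1/1024, so their difference is 1/1024 rather than zero.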

Title: Musepack encoder
Post by: S_O on 2010-12-16 21:06:19

Interesting. So you're saying this is done because very small (denormal) numbers slow down the processor?

Given that the Musepack encoder operates on floating-point numbers in the range -32767 ... +32767, the smallest non-zero magnitude you get with 16-bit input is 1. With 24-bit input it is 1/256. But the value added is 1/512, which doesn't make much sense, because -1/256 is also possible and then results in -1/512, an even smaller number (in absolute value).
I read that the smallest positive number in 32-bit float (a denormal) is about 1.401*10^(-45). That is a lot smaller than 1/512. In fact, to reach such a small number you would need a 165-bit integer audio source.
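
The bit-depth arithmetic can be written out as follows, assuming b-bit signed integer input scaled to a full-scale value of 2^15 (the symbols b and s_min are introduced here only for illustration):

```latex
\[
s_{\min}(b) = \frac{2^{15}}{2^{\,b-1}} = 2^{\,16-b},
\qquad
s_{\min}(16) = 1,\quad s_{\min}(24) = 2^{-8} = \tfrac{1}{256}
\]
\[
2^{\,16-b} = 2^{-149} \approx 1.401\times10^{-45} \;\Rightarrow\; b = 165,
\qquad
2^{\,16-b} = 2^{-126} \approx 1.175\times10^{-38} \;\Rightarrow\; b = 142
\]
```

Here 2^(-149) is the smallest positive single-precision denormal and 2^(-126) is the smallest normal number, so an integer source would need more than 142 bits before its smallest non-zero sample is itself denormal. Either way, the input samples alone cannot be the source of denormals.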

Are you sure that's the reason? I haven't implemented this in my encoder yet and it is very fast (faster than Musepack).

I also read that SSE2 provides a flag in a control register telling the processor to treat all denormal numbers as 0. Wouldn't that make more sense than adding a constant?
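
A minimal sketch of that control-register approach is shown below. It assumes an x86-64 target and sets the flush-to-zero (FTZ) and denormals-are-zero (DAZ) bits of the MXCSR register through the `_mm_getcsr`/`_mm_setcsr` intrinsics; on any other target the sketch deliberately does nothing. The function names are made up for illustration.

```c
#include <float.h>
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_getcsr, _mm_setcsr */
#endif

/*
 * Try to turn on flush-to-zero (MXCSR bit 15) and denormals-are-zero
 * (MXCSR bit 6) for the calling thread. Returns 1 if the bits were set,
 * 0 on targets where this sketch does nothing.
 */
static int enable_ftz_daz(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    _mm_setcsr(_mm_getcsr() | 0x8000u | 0x0040u);
    return 1;
#else
    return 0;
#endif
}

/* Halve the smallest normal float; the exact result would be a denormal. */
static float half_of_flt_min(void)
{
    volatile float tiny = FLT_MIN;      /* volatile prevents constant folding */
    volatile float result = tiny * 0.5f;
    return result;
}
```

After a successful `enable_ftz_daz()`, `half_of_flt_min()` returns exactly 0 instead of a denormal. The setting is per thread and changes numerical results slightly, which is a trade-off to weigh against adding a constant offset.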

Title: Musepack encoder
Post by: benski on 2010-12-17 01:40:57

I can't remember all the details of Layer II, but any IIR filters in the signal path will eventually create denormals, regardless of value range, when there is digital silence at the input.

Yes, if you use the SSE2 extension you can disable special denormal processing. But that only applies to SSE2 instructions (e.g. addpd) and not to x87 instructions (e.g. fadd).

Yes, denormals are super slow.
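
The IIR effect can be reproduced with a toy recursion. The sketch below is not taken from any encoder: it uses a made-up one-pole filter y[n] = 0.5 * y[n-1] + x and reports when the state first becomes a denormal, assuming default floating-point settings (no flush-to-zero).

```c
#include <float.h>

/*
 * Run y[n] = 0.5 * y[n-1] + x with constant input x, starting from y = 1.
 * Returns the first step at which y is a non-zero value below FLT_MIN
 * (a denormal), or -1 if that never happens within max_steps.
 */
static int first_denormal_step(float x, int max_steps)
{
    volatile float y = 1.0f;   /* volatile forces a real float store each step */
    for (int n = 1; n <= max_steps; n++) {
        y = 0.5f * y + x;
        if (y > 0.0f && y < FLT_MIN)
            return n;
    }
    return -1;
}
```

With silence as input (x = 0) the state halves every step and drops below FLT_MIN = 2^(-126) at step 127. With a constant offset such as 1/512 the state settles near 1/256 and never gets there, which is the effect the added constant relies on.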

Title: Musepack encoder
Post by: S_O on 2010-12-18 15:13:59

I've tested different files, including digital silence, and the encoder never slows down (since denormals are about 700 times slower than normal numbers, I think that would be noticeable). Of course there may be an audio file that causes denormals somewhere in the processing, but I don't see why adding 1/512 helps (except for preventing 0 at the input, and that only for 16/24-bit integer sources). I also could not find anything like this in the TwoLAME code.

I think I have already finished most of the work; the encoder should support all features of MP2 except intensity stereo (and dual channel is just like stereo, only a flag in the header differs).
Now I have to test whether my modified allocation code causes any problems. That's the biggest difference between Musepack and MP2: Musepack can allocate just as many bits to every subband as the encoder thinks best. In MP2 there are not only fixed frame sizes, but also allocation tables that don't allow all quantizers for all subbands.
The way I did it: Musepack allocation increases the resolution of each subband until the Mask-to-Noise Ratio (MNR) is smaller than 1.
For MP2 I added:
-Increase all resolutions until they are codeable by MP2 (new MNRs for changed subbands)
-If VBR: find the minimum frame size that allows coding these resolutions
-Decrease the resolution of all subbands by one (codeable) step until the audio fits into the frame (for VBR that only happens if the maximum bitrate is reached)
-Calculate new MNRs
-In a loop, increase the resolution of the subband with the highest MNR and calculate its new MNR, until the frame is completely filled.
Any better ideas?
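
The CBR part of the steps above can be sketched as follows. This is a toy model, not the encoder's code: the list of codeable resolutions, the noise model (each extra bit divides the ratio by 4) and all names are assumptions, and header and scalefactor bits are ignored.

```c
#define NUM_BANDS        4    /* toy size; MP2 uses up to 32 subbands */
#define SAMPLES_PER_BAND 36   /* 1152 samples / 32 subbands per Layer II frame */
#define NUM_LEVELS       9

/* HYPOTHETICAL codeable resolutions in bits per sample (not the real MP2 tables). */
static const int BITS_AT[NUM_LEVELS] = { 0, 2, 3, 4, 5, 6, 8, 12, 16 };

/* HYPOTHETICAL noise model: each extra bit divides the ratio by 4 (about 6 dB). */
static double mnr(double smr, int bits)
{
    double r = smr;
    for (int i = 0; i < bits; i++)
        r /= 4.0;
    return r;
}

/* Smallest codeable level whose MNR drops below 1. */
static int initial_level(double smr)
{
    int level = 0;
    while (level < NUM_LEVELS - 1 && mnr(smr, BITS_AT[level]) >= 1.0)
        level++;
    return level;
}

/* Sample bits used by the current allocation. */
static long frame_bits(const int level[NUM_BANDS])
{
    long total = 0;
    for (int b = 0; b < NUM_BANDS; b++)
        total += (long)BITS_AT[level[b]] * SAMPLES_PER_BAND;
    return total;
}

static void allocate(const double smr[NUM_BANDS], long budget, int level[NUM_BANDS])
{
    /* Raise every band to a codeable resolution with MNR < 1. */
    for (int b = 0; b < NUM_BANDS; b++)
        level[b] = initial_level(smr[b]);

    /* Lower all bands by one codeable step until the frame fits. */
    while (frame_bits(level) > budget) {
        int changed = 0;
        for (int b = 0; b < NUM_BANDS; b++)
            if (level[b] > 0) { level[b]--; changed = 1; }
        if (!changed)
            break;
    }

    /* Refill: repeatedly raise the band with the highest MNR that still fits. */
    for (;;) {
        long used = frame_bits(level);
        int best = -1;
        double worst = -1.0;
        for (int b = 0; b < NUM_BANDS; b++) {
            if (level[b] + 1 >= NUM_LEVELS)
                continue;
            long extra = (long)(BITS_AT[level[b] + 1] - BITS_AT[level[b]]) * SAMPLES_PER_BAND;
            if (used + extra > budget)
                continue;
            double m = mnr(smr[b], BITS_AT[level[b]]);
            if (m > worst) { worst = m; best = b; }
        }
        if (best < 0)
            break;
        level[best]++;
    }
}
```

For SMRs of 1000, 100, 10 and 0.5 the unconstrained allocation needs 396 bits. With a budget of 340 bits every band is lowered one step, and the refill loop then gives the freed bits back to the band with the highest MNR. One design choice in this sketch is that the refill skips bands whose next step no longer fits, so the frame is filled as far as the step sizes allow.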

Title: Musepack encoder
Post by: benski on 2010-12-18 18:33:28

When lowering resolution, can you lower critical band(s) last or not at all?

Title: Musepack encoder
Post by: S_O on 2010-12-19 00:46:41

The Musepack PsyModel decides which bands are critical.
Instead of reducing one single band I reduce all bands. The idea is this: band X might have the lowest MNR of all, so I would decrease it, but after decreasing it its MNR might increase dramatically, while decreasing the resolution of band Y with a higher MNR would have increased the MNR of that band only a little.

Therefore I decrease all bands and then always increase the one with the highest MNR until the frame is filled. The idea is that the highest MNR of all bands should be as low as possible, rather than the average MNR. Of course, if the frame is not overfull in the first place, the decreasing step is skipped. To get the highest quality, the PsyModel should be set to a value that comes closest to the MP2 bitrate. I need to do some further testing, but Musepack standard (5.0) should go with about 256 kbps. That means most of the frames will only be increased, but still about 1/3 have to be decreased (this depends on the sample, of course, but it is the average for some music I tested). Of course the encoder will offer the same parameters to control the PsyModel as Musepack.

The allocation algorithm treats all bands the same, but the PsyModel calculates the values used to get the MNR. The MNR is also calculated differently for transient and non-transient bands (the PsyModel also determines which bands are transient).

Something completely non-technical: any idea what I should name the encoder? mp2enc or similar doesn't sound like a completely new MP2 encoder. I asked the Musepack team; they do not want the encoder named in a way that resembles Musepack.

Title: Musepack encoder
Post by: bryant on 2010-12-19 07:08:30

NAME = Name Ain't a Musepack Encoder

Title: Musepack encoder
Post by: pbelkner on 2010-12-19 11:12:25

Quote from: benski on 2010-12-17 01:40:57

Yes denormals are super slow.

By pure accident I just came across this:
[blockquote]Denormal numbers in floating point signal processing applications
Laurent de Soras
2005.04.19
http://ldesoras.free.fr/doc/articles/denormal-en.pdf[/blockquote]

HydrogenAudio

Lossy Audio Compression => MPC => Topic started by: S_O on 2010-12-05 20:54:34