Resurrecting/Preserving the Helix MP3 encoder

Topic: Resurrecting/Preserving the Helix MP3 encoder (Read 77534 times) previous topic - next topic

0 Members and 1 Guest are viewing this topic.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #525 – 2024-05-18 14:17:59

I'm glad to help, but clearly Case by far put in most of the work!

The downsampling issues were introduced with https://github.com/maikmerten/hmp3/commit/6d57f3ff9affbe4b5a1b4135dd4b0b1a8ca35ba5 - so at least it's known where to start looking into things.

(The encoder has two resamplers: srccf.cpp took 16 bit input and produced resampled 16 bit output, srccfb.cpp took 8 bit input and produced 16 bit output. Now one takes 32 bit floats, the other still takes 8 bit chars, I guess at one point one could drop the 8 bit thing)

---

edit: The resampling-code has various special cases, between which is chosen here: https://github.com/maikmerten/hmp3/blob/28a129a96b7bbc7a2cb68e76ff2381e1c070cc06/hmp3/src/srcc.cpp#L832

On the current dev branch:

44100 -> 48000: case 7, works
44100 -> 44100: case 5, works
44100 -> 32000: case 9, no audio
44100 -> 24000: case 9, no audio
44100 -> 22050: case 9, no audio
44100 -> 16000: segfault

32000 -> 48000: case 7, works
32000 -> 44100: case 7, works
32000 -> 32000: case 5, works
32000 -> 24000: case 8, no audio
32000 -> 22050: case 9, corrupted, stuttery audio
32000 -> 16000: case 8, corrupted, stuttery audio

---

The segfault for the 44100 -> 16000 case happens in https://github.com/maikmerten/hmp3/blob/28a129a96b7bbc7a2cb68e76ff2381e1c070cc06/hmp3/src/cnts.c#L285 because the indices ix0 and ix1 are getting pretty big/pretty negative (ix0: -2147483648 ix1: -2147483648)

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #526 – 2024-05-18 16:06:05

Some more resampling data points:

22050 -> 48000: case 7, works
22050 -> 44100: case 6, works
22050 -> 32000: case 7, works
22050 -> 24000: case 7, works
22050 -> 22050: case 5, works
22050 -> 16000: case 9, ear-damaging, mangled, screaming and screeching sound, ffmpeg decoder unhappy

16000 -> 48000: case 7, works
16000 -> 44100: case 7, works
16000 -> 32000: case 6, works
16000 -> 24000: case 7, works
16000 -> 16000: case 5, works

In short, indeed appears to be *down*sampling that is problematic. 22050 to 16000 is especially nasty, don't try with headphones!

edit: For sake of completeness:

48000 -> 48000: case 5, works
48000 -> 44100: case 9, segfault in hmp3/src/l3math.c:688
48000 -> 32000: case 8, no audio
48000 -> 24000: case 8, ear-damaging, mangled, screaming and screeching sound
48000 -> 22050: case 9, ear-damaging, mangled, screaming and screeching sound
48000 -> 16000: case 8, segfault in hmp3/src/cnts.c:285

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #527 – 2024-05-19 08:27:05

Thanks for this overview.
I lack the relevant knowledge to look for the error in the right place, but the presence of those segfaults as defined probably also means that the error will not be in the resampling process itself, but "to the left or right of it" (as you described in the edit), e.g. also because the upsampling is working and transparent, maybe..? ..and i understand "corrupted stuttery audio" as something more fatal than just distortion of a sonically non-transparent resampler.

Can code analysis in Lame be helpful here, how are these code "bindings" resolved there in case of downsampling activation?
Maintaining downsampling support seems to make sense only if this problem is fixed. If that won't happen, wouldn't changing the logic under which downsampling has to be used or by disabling downsampling support, be a workaround? But of course this could also have a negative consequence(s).

The article in the HA wikipedia on the Lame encoder, section "Resampling".. in the sense of the last sentence in this section I have a question.. If, after all, there was a bug in the code of the internal resampler, is it theoretically possible to perform a total replacement of internal resampler with e.g. a libretro resampler?

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #528 – 2024-05-19 10:16:28

My best guess right now is that there's some kind of memory corruption going on. Float is twice as big as short, so the encoder falling apart in various places (mostly when accessing lookup tables) might indicate that resampling is trampling over memory somewhere.

C be like that.

Lame is a completely different (and more readable, IMO) code base and I don't expect to find something there that applies to the Helix MP3 encoder.

It's theoretically possible to replace the resampler, but one needs to be careful about licensing. The Helix MP3 Encoder has a very exotic open-source license, which *is* recognized as Free Software license, but it makes it quite a bit more complicated to figure out what other code can be legally mixed in.

edit:

Some more digging in, when resampling 44100 to 32000, the resampler can produce vastly "out of bounds" values... just ploppin in a 'printf("u: %f v: %f\n", u, v);' into Csrc::src_filter_dual_case4 shows that output values can be completely out of whack...

Code: [Select]

u: -10339643495273183456589268008304640.000000   v: -0.000000
u: 407311519258318227550806361505792.000000   v: 0.000000
u: 9921843064804505112641775729639424.000000   v: 0.000000
u: 1551401312512909985066881267007488.000000   v: 0.000000
u: 35341039057689407428212771715547136.000000   v: 0.000000
u: -85764139298480145538534354483937280.000000   v: -0.000000
u: 682444216638130611362993002619338752.000000   v: 0.000000
u: 358179272327912044724473916228108288.000000   v: 0.000000
u: -165015089812987648412178647898652672.000000   v: -0.000000
u: 75186501211684328081405592394006528.000000   v: 0.000000
u: -16369631476501701137608916066107392.000000   v: -0.000000
u: 42112486892657766126052288080379904.000000   v: 0.000000
u: -6317488413892193681442495566708736.000000   v: -0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: -0.000000   v: -0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: -nan
u: -nan   v: 0.000024
u: -nan   v: -0.000036
u: 0.000000   v: -0.000000
u: -0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000
u: 0.000000   v: 0.000000

(and yeah, my insane C debugging skills mostly are printf-based... ewww...)

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #529 – 2024-05-19 11:13:25

I have been looking at this issue after you mentioned the downsampling problem. The sr_convert function only converts 1152 frames of audio data to float as that was what I assumed the encoder eats. But the resampler can consume a lot more data. Perhaps the best solution would be to convert all data to float during read.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #530 – 2024-05-19 11:42:29

Thanks for an explanations and all the infos.
Hopefully the problem is not complex and can be solved by implementing some new or correcting an existing command or logic. Unfortunately, this does not automatically mean time savings to solve it..
I'm just surprised that this problem hasn't been fixed in the RealNetworks days and has arrived in the year 2024.. unbelievable. Thought i do not know the genesis of this issue, or if there is an encoder version from the past, that downsamples correctly.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #531 – 2024-05-19 13:03:43

@Case thanks for having a look. An idea why the problem would show up recently? Even with the 16/8-bit-only code one would need many more input samples when downsampling (e.g., three times more when going 48000 down to 16000).

Indeed I think it would probably be rather neat if the PCM reading code would always provide float samples, no matter the original input format. This at least would get rid of the special code for 8-bit samples and shrink the code base a bit.

To be honest, I don't know why RealNetworks has the resampling step within the encoder anyways. I would have imagined the proper place is *between* the WAV reading code and the encoder.

@jaro1 I think in this case, RealNetworks isn't to blame, it's something that broke "recently".

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #532 – 2024-05-19 16:18:34

@maikmerten - try the attached patch.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #533 – 2024-05-19 16:55:50

@Case That patch seems to work well (tested 44100 to 32000, 22050, 24000 and 16000). I'm baffled at how consistently you find your way through that codebase, I was still trying to find out how the samples get passed around. Impressive!

edit: Removed compiler warning stuff, seems it needed a "make clean".

edit2: Pushed to dev https://github.com/maikmerten/hmp3/commit/d581e8926afe829f57b0fdbf43ee58f3753eca5f

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #534 – 2024-05-19 17:23:08

To lighten things up a bit.. I'm sitting in the last row of the auditorium watching the two of you on stage. Even if I did a software tester course, I would move forward one row at most and the podium would still be far away. Someone is sort of "pulling the solutions out of their sleeve"..
Anyway, apologies to RealNetworks, such a wrong.. i didn't realize that this is a recent issue, hence i would mention an older version of Helix with functional downsampling instead of taking inspiration from Lame.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #535 – 2024-05-19 17:54:24

Now that the downsampling case appears to work fine (I'll do some more testing), perhaps I can unify the 16-bit/8-bit sample input cases...

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #536 – 2024-05-19 18:14:00

Happy to hear you also see things working correctly. I initially created routines that converted the read PCM data to float directly, but all the gapless length handling is based on sizes and would have needed fixes everywhere. So I just patched the original on-the-fly floatification. I forgot about unifying the 8-bit case in the excitement.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #537 – 2024-05-19 18:25:00

Thanks a lot for your effort! I think that indeed the approach you chose is the best that can be done without turning things upside down and potentially introducing all sorts of nasty surprises!

Locally, I already have a version that omits the 8-bit special case, building on top of your patch :-)

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #538 – 2024-05-20 09:29:00

Okay, I now threw out the 8-bit resampling code - now all sample formats are handled by the same 32-bit float resampling/conversion steps. https://github.com/maikmerten/hmp3/commit/a07d35c3dbe5acb22e1301c27df2dfe92dccb908

I also adjusted the Makefile and VisualStudio project files (by hand). Seems the automated builds on AppVeyor still work ( https://ci.appveyor.com/project/maikmerten/hmp3/builds/49846828/artifacts ), so I assume the VisualStudio project file surgery went okay.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #539 – 2024-05-20 10:27:14

Looks good, but shouldn't the conversion from 8-bit use -128 instead of -127? Signed 8-bit range is -128..+127.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #540 – 2024-05-20 10:59:21

Thanks, yeah, seems you're correct. It appears most sources consider the correct bias to be 128. Fixed!

edit: Actually, thinking about this, I may have managed to confuse myself. If 0x80 (128) is the midpoint ("0.0"), and 0xFF (255) is "1.0", then shouldn't have -127 be correct in the first place? Either way, given that the range isn't perfectly symmetric, shouldn't there be a DC bias either way? So isn't "-127.5" technically even more correct?

Also, perhaps I should multiply with 256 instead of 255 to scale it properly?

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #541 – 2024-05-20 11:53:56

I missed that multiplier. Yeah, it should be 256.
Using 127.5 would be mathematically more correct for making the peaks symmetric, but zero point is supposed to be at signed zero. The negative side is one higher in all integers but it's not a problem.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #542 – 2024-05-20 12:00:25

Thanks again for clarification. I experimented a bit by generating a Sine wave in Audacity with at 0.0 dB, exporting it to pcm_u8, encoding that and importing the MP3 back. These experiments back your reasoning (-128.0, then times 256.0).

Fixed in dev.

Re: Resurrecting/Preserving the Helix MP3 encoder

Reply #543 – 2024-05-20 13:13:34

Quote from: maikmerten on 2024-05-20 09:29:00

Okay, I now threw out the 8-bit resampling code - now all sample formats are handled by the same 32-bit float resampling/conversion steps. https://github.com/maikmerten/hmp3/commit/a07d35c3dbe5acb22e1301c27df2dfe92dccb908

I also adjusted the Makefile and VisualStudio project files (by hand). Seems the automated builds on AppVeyor still work ( https://ci.appveyor.com/project/maikmerten/hmp3/builds/49846828/artifacts ), so I assume the VisualStudio project file surgery went okay.

thanks !!!

Notice