Can someone name at least one such slow device? According to these results FLAC -8 decoding is even faster than mp3 decoding on all tested hardware.Seeing that table I think I remembered wrong. That table was what I remembered but couldn't find, FLAC compression levels being referred to in a benchmark on decoding performance.
Anyway, I'd like the opinion of the people reading this on the following (quote from the mailing list: http://lists.xiph.org/pipermail/flac-dev/2021-June/006470.html)
Code is here: https://github.com/ktmf01/flac/tree/autoc-sse2 Before I send a push request, I'd like to discuss a choice that has to be made.
I see a few options
- Don't switch to autoc as doubles, keep current speed and ignore possible compression gain
- Switch to autoc as doubles, but keep current intrinsics routines. This means some platforms (with only SSE but not SSE2 or with VSX) will get less compression, but won't see a large slowdown.
- Switch to autoc as doubles, but remove current SSE and disable VSX intrinsics for someone to update them later (I don't have any POWER8 or POWER9 hardware to test). This means all platforms will get the same compression, but some (with only SSE but not SSE2 or with VSX) will see a large slowdown.
Thanks in advance for your replies and comments on this.
So, switching from single precision to double precision, I'm faced with a choice: should I keep the fast but single-precision routines for SSE and VSX, so they keep the same speed but no compression benefit? Or should these be removed/disabled so all platforms (ARM, POWER, ia32, AMD64) get the same compression, but at varying costs?