Topic: TAK 2.3.3

TAK 2.3.3

Final release of TAK 2.3.3 ((T)om's lossless (A)udio (K)ompressor)

This release brings a 64-bit decoder library and better Unicode support for the GUI version.

It consists of:

  • TAK Applications 2.3.3
  • TAK Winamp plugin 2.3.3
  • TAK Decoder library 2.3.3 (x86/x64)
  • TAK SDK 2.3.3

Re: TAK 2.3.3

Reply #1
What's new

New features:

  • A 64-bit decoder library for the SDK.
  • I have also created 64-bit versions of the applications. As expected, they are bigger and slower without any advantage. As long as Windows supports 32-bit applications, I see no reason to release them. But I will continuously maintain them, so that they are ready when needed.
  • Unicode support in the GUI version is no longer limited to the open file dialogs. The required switch to a newer version of my development environment is responsible for a 3.5 times bigger program file.

Improvements:

  • Tiny encoding speed improvements of not more than 3 percent for Intel CPUs based upon the Skylake microarchitecture (6th to 10th Generation Core). I could have squeezed out more, but only at the expense of significantly slower processing on older platforms. As a rule of thumb, I am taking into account CPU microarchitectures of the last 10 years.

Results

Here are the results for my primary file set.

Test system: Intel i3-8100 (3.6 GHz / 1 Thread), Windows 10.

Preset  Enco-Speed                Deco-Speed
---------------------------------------------------------
        2.3.2    2.3.3    Win %   2.3.2    2.3.3    Win %
---------------------------------------------------------
-p0     868.72   893.94    2.90   806.84   810.57    0.46
-p0e    698.08   711.16    1.87   812.44   817.09    0.57
-p0m    418.70   425.54    1.63   815.07   819.19    0.51
-p1     726.44   748.04    2.97   787.03   788.37    0.17
-p1e    482.99   491.72    1.81   788.73   791.12    0.30
-p1m    314.46   322.68    2.61   790.61   794.60    0.50
-p2     580.58   591.20    1.83   715.05   718.12    0.43
-p2e    347.07   348.35    0.37   714.52   719.10    0.64
-p2m    203.02   206.26    1.60   715.99   718.71    0.38
-p3     301.25   306.83    1.85   697.20   703.97    0.97
-p3e    241.91   244.79    1.19   697.88   702.06    0.60
-p3m    131.89   133.65    1.33   699.12   701.40    0.33
-p4     183.23   186.54    1.81   650.44   657.64    1.11
-p4e    158.75   160.61    1.17   651.01   658.40    1.14
-p4m     82.49    83.73    1.50   651.21   656.84    0.86
---------------------------------------------------------
Speed as multiple of realtime playback.

And to illustrate the speed disadvantage of the 64-bit version:
Preset  Enco-Speed                Deco-Speed
---------------------------------------------------------
        32 bit   64 bit   Win %   32 bit   64 bit   Win %
---------------------------------------------------------
-p0     893.94   827.49   -7.43   810.57   700.80  -13.54
-p0e    711.16   661.05   -7.05   817.09   709.15  -13.21
-p0m    425.54   398.79   -6.29   819.19   710.48  -13.27
-p1     748.04   698.29   -6.65   788.37   689.93  -12.49
-p1e    491.72   461.79   -6.09   791.12   693.09  -12.39
-p1m    322.68   303.56   -5.93   794.60   695.48  -12.47
-p2     591.20   557.27   -5.74   718.12   634.55  -11.64
-p2e    348.35   336.29   -3.46   719.10   633.90  -11.85
-p2m    206.26   193.64   -6.12   718.71   636.49  -11.44
-p3     306.83   287.58   -6.27   703.97   619.50  -12.00
-p3e    244.79   233.77   -4.50   702.06   620.18  -11.66
-p3m    133.65   123.27   -7.77   701.40   621.06  -11.45
-p4     186.54   172.30   -7.63   657.64   577.94  -12.12
-p4e    160.61   145.73   -9.26   658.40   579.34  -12.01
-p4m     83.73    76.29   -8.89   656.84   578.45  -11.93
---------------------------------------------------------
Speed as multiple of realtime playback.
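The "Win %" columns in the two tables above look like the relative speed change between the two columns to their left. A minimal sketch, assuming Win % = (new / old - 1) × 100 (the function name `win_percent` is made up for illustration), recomputes the -p0 rows from both tables:

```python
def win_percent(old_speed: float, new_speed: float) -> float:
    """Relative speed change in percent (assumed definition of 'Win %')."""
    return (new_speed / old_speed - 1.0) * 100.0

# Preset -p0, values copied from the tables above.
print(f"2.3.2 -> 2.3.3 encode: {win_percent(868.72, 893.94):+.2f} %")
print(f"2.3.2 -> 2.3.3 decode: {win_percent(806.84, 810.57):+.2f} %")
print(f"32 bit -> 64 bit encode: {win_percent(893.94, 827.49):+.2f} %")
print(f"32 bit -> 64 bit decode: {win_percent(810.57, 700.80):+.2f} %")
```

With that definition, the printed values can be compared directly with the 2.90 / 0.46 and -7.43 / -13.54 entries in the tables.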

 

Re: TAK 2.3.3

Reply #2
Future

The next release should add support for the AVX2 instruction set. I achieved encoding speed improvements of about 14 percent for preset -p4m on my primary system (Intel Skylake based CPU), less for other presets. But results of my secondary (Haswell based) system were discouraging: Maximum improvement of 8 percent for presets p4 and p4e and up to 23 percent slower encoding for p2m, p3m and p4m!

Those presets make the most use of AVX2 instructions and should also benefit the most. But they seem to trigger the automatic downclocking mechanism of the CPU. AVX2 base and turbo frequencies are lower than the regular ones. This wouldn't hurt too much if the encoder mostly used AVX2 instructions, but that's not the case. I haven't profiled the code yet, but I would estimate that about 30 percent of the encoding time goes to AVX2 instructions. And this is no continuous block; instead, blocks of x86/SSE2 and AVX2 instructions alternate.

That's bad, because it will cause many transitions between the different clock rates. During such transitions the speed can be much slower than the lower clock rate would suggest. After the last AVX2 instruction the lower clock rate will be maintained for a considerable amount of time, therefore succeeding non-AVX2 instructions will also be executed slower.
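To make the trade-off concrete, here is a toy model, not a measurement. It assumes that a fraction of the baseline run time sits in code that gets vectorised, that AVX2 speeds up that part by some factor at equal clock, and that the whole run then executes at a reduced effective clock (a simplification of "the lower clock rate lingers"). All names and numbers are hypothetical; the 0.3 only loosely borrows the 30 percent estimate above.

```python
def net_speedup(avx_fraction: float, avx_gain: float, clock_ratio: float) -> float:
    """Overall speedup of an AVX2 build relative to the baseline build.

    avx_fraction: share of baseline run time spent in code that gets vectorised
    avx_gain:     how much faster that code runs with AVX2 at equal clock
    clock_ratio:  effective clock with AVX2 active / regular clock (<= 1.0)
    """
    # Amdahl-style run time at equal clock, relative to a baseline of 1.0.
    relative_time = (1.0 - avx_fraction) + avx_fraction / avx_gain
    # A lower effective clock stretches the whole run.
    return clock_ratio / relative_time

# Hypothetical inputs: 30 % of the time vectorised, 2x gain on that part.
for ratio in (1.0, 0.9, 0.8):
    print(f"clock ratio {ratio:.1f}: net speedup {net_speedup(0.3, 2.0, ratio):.2f}x")
```

Under these made-up inputs, the gain shrinks as the effective clock drops, and at a ratio of 0.8 the AVX2 build ends up slower than the baseline. Transition penalties, which the post describes as worse than the lower clock alone, are not modelled.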

Well, my Haswell CPU is a 35 W low-power quad core, quite a challenge for an older desktop microarchitecture. The difference between regular and AVX2 clock is most likely considerably bigger than for the common 65 W+ CPUs.

Nevertheless I am really hesitant to release an AVX2 version which will make encoding on an unknown number of older systems slower. And IMHO the possible advantage isn't big enough to justify an elaborate study and implementation of a CPU-dependent code path.

Currently it's not clear what I will do next. Possibly I will try to improve the encoding speed by algorithmic modifications. Ktf's latest "Lossless codec comparison" also made me think about the (re-)introduction of higher predictor counts.

Features for later versions:

  • Port to Lazarus / Freepascal. Nice for Linux support.
  • Fast integrity check without decoding based upon the checksums only.
  • Transcode mode.
  • Tuning of the encoder for the problem files which have been reported in the past months.
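As an illustration of what a "fast integrity check without decoding" could look like in general, here is a minimal sketch. The frame layout (length, payload, CRC-32) is entirely hypothetical and is not TAK's actual format; `pack_frame` and `check_stream` are made-up names. The point is only that stored per-frame checksums can be verified by reading the compressed bytes, without running the decoder.

```python
import struct
import zlib

def pack_frame(payload: bytes) -> bytes:
    """Hypothetical frame layout: length (u32 LE), payload, CRC-32 (u32 LE)."""
    return (struct.pack("<I", len(payload)) + payload
            + struct.pack("<I", zlib.crc32(payload)))

def check_stream(stream: bytes) -> list[int]:
    """Return indices of frames whose stored CRC does not match; nothing is decoded."""
    bad, pos, index = [], 0, 0
    while pos < len(stream):
        (length,) = struct.unpack_from("<I", stream, pos)
        payload = stream[pos + 4 : pos + 4 + length]
        (stored,) = struct.unpack_from("<I", stream, pos + 4 + length)
        if zlib.crc32(payload) != stored:
            bad.append(index)
        pos += 8 + length  # 4 bytes length + 4 bytes CRC + payload
        index += 1
    return bad

stream = b"".join(pack_frame(p) for p in (b"frame-0", b"frame-1", b"frame-2"))
damaged = bytearray(stream)
damaged[20] ^= 0xFF  # flip one byte inside the second frame's payload
print(check_stream(stream), check_stream(bytes(damaged)))
```

A real format would also have to cope with a damaged length field or a truncated file, which this sketch does not handle.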

Re: TAK 2.3.3

Reply #3
Thnx for the new release. :)

Re: TAK 2.3.3

Reply #4
Currently it's not clear what i will do next.

Your website is slightly out-of-date, listing the following items
Quote
    Support for Unicode character sets.
    A German-language version.
    A little more speed and compression efficiency...
    Applications for platforms other than Windows.
    Support for more than 6 audio channels.

  • For FLAC I think there is still a bit to gain by improving quantization of predictor coefficients, but it seems this is not applicable to TAK, as it doesn't store raw predictor coefficients like FLAC does. I'm sharing the idea just in case it does make sense.
  • You could consider switching from MD5 to something faster. I've been told most SHA checksums are faster simply because they don't have a long dependency chain and can more efficiently use the superscalar properties and out-of-order execution capabilities of modern CPUs. (That would obviously break backward compatibility, but only for checksumming.) For FLAC, checksumming is quite a significant part of decoding CPU load (and of encoding for the fastest presets), so I imagine this is also the case for TAK.
  • You could consider looking into the arithmetic coder used in Daala/AV1. That is an arithmetic coder that was specifically designed to evade any existing patents. I don't know whether that is fast enough for your liking.
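Whether SHA variants actually beat MD5 depends on the CPU and on the hash implementation, so it is easy to check on one's own machine. A minimal sketch using Python's standard `hashlib` (the helper name `throughput_mb_s` and the buffer size are arbitrary choices for illustration):

```python
import hashlib
import time

def throughput_mb_s(name: str, data: bytes, repeats: int = 3) -> float:
    """Best-of-N throughput of a hashlib algorithm in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return len(data) / best / 1e6

data = bytes(8 * 1024 * 1024)  # 8 MiB of zeros as test input
for name in ("md5", "sha1", "sha256"):
    print(f"{name:7s} {throughput_mb_s(name, data):8.0f} MB/s")
```

The numbers say nothing about TAK or FLAC themselves; they only show the relative cost of the hash functions as implemented in the local Python/OpenSSL build.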

Of course I don't know where you would like TAK to go from here. Do you still like to make (big) changes/additions to the format, or do you want to keep things backwards-compatible?

I know you've been talking about open-sourcing, but I can imagine this is a big step. If you'd like to see TAK gain more users, you could consider contributing a bare essentials TAK encoder to ffmpeg for example. I would imagine a TAK encoder without all specific tuning and tweaks, just a simple TAK encoder would already beat FLAC with ease. Or instead of open-sourcing the software, you could open up the format by creating a document describing the structure and sharing the ideas you used. Maybe someone else will do it for you (like the ffmpeg guys did with wavpack)

Please don't feel offended or pressured to do anything, I just wanted to contribute a few ideas.


Re: TAK 2.3.3

Reply #6
Thank you Thomas for the new release! Your details about efforts and insights are always a pleasure to read too.

@Future:
Of course it is welcome that you also keep CPUs of the last decade in view; I guess most of us run those. According to https://en.wikipedia.org/wiki/Advanced_Vector_Extensions the CPU generations of the last 5-7 years seem to have AVX2 support, and newer CPUs are targeting the next generation, AVX-512.
My CPU does not have AVX2; the next one, within the next 12 months, will. Nevertheless, I personally start encoding and do not watch the progress bar. I rarely care about a few percent of encoding speed: the current speed range is already at an awesome level, so this is not the driving factor for me.
For whom might this be an 'issue' or a real downside at all? What are the use cases and scenarios? That group might get smaller and smaller anyway.
IMHO it makes sense to really go for the newer instruction set with the next version, which I guess will then be a minor version 2.4. And in the almost impossible case of a bug fix, release another 2.3.x patch version of the non-AVX2 TAK.

Also like ktf's thoughts.

Re: TAK 2.3.3

Reply #7
You could consider switching from MD5 to something faster. I've been told most SHA checksums are faster simply because they don't have a long dependency chain and can more efficiently use the superscalar properties and out-of-order execution capabilities of modern CPUs. (That would obviously break backward compatibility, but only for checksumming) For FLAC checksumming is quite a significant part of decoding CPU load (and encoding for the fastest presets), so I imagine this is also the case for TAK.

The "Fast integrity check without decoding based upon the checksums only" would resolve most of [note1] that. So much so, I think, that the downsides of replacing MD5 outweigh the benefits.

In TAK and WavPack, MD5 is optional and disabled by default - in WavPack it is viewed more as a fingerprint, and even more so after having implemented the non-decoding integrity check.
From that point of view, where MD5 is an optional fingerprint, I think it is a great advantage to have the same [note2] algorithm across FLAC/TAK/WavPack(/OptimFROG if anyone cares):
 * if you want something quicker, then use the default; integrity verification will be faster than decoding anyway
 * if you want a checksum as a fingerprint, you presumably want the one that everyone uses (say, if you transcode: yep, every MD5 appears precisely twice, that is source and target; and if you want to use say foo_bitcompare in the end to be sure, just sort source file list and target file list by MD5).
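The "sort both file lists by MD5" idea can be sketched as follows. This is an illustration with Python's standard `wave` and `hashlib` modules on in-memory WAV data. It assumes that the fingerprint is the MD5 over the raw sample data only (for 16-bit PCM this should correspond to what the codecs store, but that is an assumption here), and all helper names are made up.

```python
import hashlib
import io
import wave

def pcm_md5(wav_file) -> str:
    """MD5 over the raw sample data of a WAV file, ignoring header and tags."""
    with wave.open(wav_file, "rb") as wav:
        return hashlib.md5(wav.readframes(wav.getnframes())).hexdigest()

def make_wav(samples: bytes) -> io.BytesIO:
    """Build a small 16-bit stereo WAV in memory (stand-in for a decoded file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(44100)
        wav.writeframes(samples)
    buf.seek(0)
    return buf

# Two "source" files and their "transcoded" counterparts under other names.
source = {"a.wav": make_wav(bytes(range(200))), "b.wav": make_wav(bytes(200))}
target = {"x.wav": make_wav(bytes(200)), "y.wav": make_wav(bytes(range(200)))}

by_md5_src = sorted((pcm_md5(f), name) for name, f in source.items())
by_md5_dst = sorted((pcm_md5(f), name) for name, f in target.items())
for (h1, n1), (h2, n2) in zip(by_md5_src, by_md5_dst):
    print(n1, "<->", n2, "match" if h1 == h2 else "MISMATCH")
```

After sorting, every fingerprint should line up with exactly one partner in the other list, regardless of file names.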

[note1]: Here is the exception. If one wants to verify not only integrity, but to check an encoding against the original PCM, then one could make a checksum of the PCM, encode, decode and check against the checksum. Then two checksums are calculated and a slow algorithm is a penalty - that would be "unnecessary" if the MD5 is not written to the file, never again to be used.
Now is it worth it to implement a second algorithm for the cases where MD5 is not stored?
(Even if MD5 has to be calculated once to be stored, a more than twice as fast algorithm would save time. But is it worth it?)

[note2]: except, well, codecs differ on how to calculate MD5 on 8-bit signals, these being unsigned. Who has a big collection of compressed 8-bit .wav files?
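Why the 8-bit case matters can be shown in a few lines. The same samples hash differently depending on whether they are fed to MD5 as stored in the WAV (unsigned) or after conversion to signed values. Which codec uses which convention is not stated here, so the sketch only demonstrates the mismatch itself; the sample values are arbitrary.

```python
import hashlib

unsigned = bytes([128, 130, 126, 255, 0])           # 8-bit WAV stores unsigned samples
signed = bytes((s - 128) & 0xFF for s in unsigned)  # same samples as two's complement

print(hashlib.md5(unsigned).hexdigest())
print(hashlib.md5(signed).hexdigest())
```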

Re: TAK 2.3.3

Reply #8
Quote
You could consider looking into the arithmetic coder used in Daala/AV1. That is an arithmetic coder that was specifically designed to evade any existing patents. I don't know whether that is fast enough for your liking

That hasn't stopped Microsoft from patenting the rANS entropy method, regardless.

Which to me leaves us in the exact same situation as if we use ordinary arithmetic coding. Might as well just use that or range coding. :/

Re: TAK 2.3.3

Reply #9
  • Port to Lazarus / Freepascal. Nice for Linux support.

While I appreciate the general direction of this decision (I will move on to macOS in the next few weeks, so I can't have TAK files), be aware that Delphi probably has better macOS support than Lazarus these days, so if you plan to support Windows and Linux and macOS, watch out for random quirks in the underlying language - it is picky on Macs.

Re: TAK 2.3.3

Reply #10
    Future

    The next release should add support for the AVX2 instruction set. I achieved encoding speed improvements of about 14 percent for preset -p4m on my primary system (Intel Skylake based CPU), less for other presets. But results of my secondary (Haswell based) system were discouraging: Maximum improvement of 8 percent for presets p4 and p4e and up to 23 percent slower encoding for p2m, p3m and p4m!

    Those presets make the most use of AVX2 instructions and should also benefit the most. But they seem to trigger the automatic downclocking mechanism of the CPU. AVX2 base and turbo frequencies are lower than the regular ones. This wouldn't hurt too much if the encoder mostly used AVX2 instructions, but that's not the case. I haven't profiled the code yet, but I would estimate that about 30 percent of the encoding time goes to AVX2 instructions. And this is no continuous block; instead, blocks of x86/SSE2 and AVX2 instructions alternate.

    That's bad, because it will cause many transitions between the different clock rates. During such transitions the speed can be much slower than the lower clock rate would suggest. After the last AVX2 instruction the lower clock rate will be maintained for a considerable amount of time, therefore succeeding non-AVX2 instructions will also be executed slower.

    Well, my Haswell CPU is a 35 W low-power quad core, quite a challenge for an older desktop microarchitecture. The difference between regular and AVX2 clock is most likely considerably bigger than for the common 65 W+ CPUs.

    Nevertheless I am really hesitant to release an AVX2 version which will make encoding on an unknown number of older systems slower. And IMHO the possible advantage isn't big enough to justify an elaborate study and implementation of a CPU-dependent code path.
    ...
    FWIW modern CPUs downclock much less or not at all in the presence of AVX2 or even AVX-512 (I'm thinking desktop; laptop or other heavily power/thermally-constrained hardware may vary). AMD had AVX2 support on paper earlier, but parity support for AVX2 (one instruction per cycle) didn't come until Zen 2 (before that, AMD's throughput was half of Intel's clock for clock, but there was still a minor benefit to the AVX2 path from fewer instructions and thus better performance of the CPU front end/instruction cache). AVX-512 in particular had a rocky start with Intel, but AMD's implementation in Zen 4 was solid out of the gate. https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html

    Haswell was the first architecture with AVX2, so low-power Haswell has to be the worst case; I'm guessing a large majority of AVX2 systems in use will always see a benefit. IMO it would be more than sufficient if you had a --no-avx2 flag that users could set if their hardware supports AVX2 but that path performs worse.
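The proposed opt-out could be sketched roughly like this. It is an illustration in Python, not how TAK (a Windows application) would implement it. `--no-avx2` is the flag name suggested in the post, not an existing TAK option. `cpu_flags` and `choose_path` are made-up helpers, and reading `/proc/cpuinfo` only works on Linux.

```python
import argparse

def cpu_flags() -> set[str]:
    """Read CPU feature flags from /proc/cpuinfo (Linux only; empty set elsewhere)."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def choose_path(flags: set[str], no_avx2: bool) -> str:
    """Pick the code path: AVX2 only if supported and not disabled by the user."""
    return "avx2" if "avx2" in flags and not no_avx2 else "sse2"

parser = argparse.ArgumentParser()
parser.add_argument("--no-avx2", action="store_true", help="force the non-AVX2 path")
args = parser.parse_args([])  # pass sys.argv[1:] in a real program
print(choose_path(cpu_flags(), args.no_avx2))
```

The design choice here is that the default follows the hardware, and the user only intervenes when the AVX2 path turns out slower on their machine, which avoids maintaining a per-CPU table.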

    ...
    You could consider switching from MD5 to something faster. I've been told most SHA checksums are faster simply because they don't have a long dependency chain and can more efficiently use the superscalar properties and out-of-order execution capabilities of modern CPUs. (That would obviously break backward compatibility, but only for checksumming) For FLAC checksumming is quite a significant part of decoding CPU load (and encoding for the fastest presets), so I imagine this is also the case for TAK.
    ...
    Modern CPUs have acceleration for SHA-1 and SHA-256; IMO that's a big reason to use them. For a similar reason CRC32C is interesting: https://www.corsix.org/content/fast-crc32c-4k

    https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=1566,6495&cats=Cryptography

    ...
    In TAK and WavPack, MD5 is optional and disabled by default - in WavPack it is viewed more as a fingerprint, and even more so after having implemented the non-decoding integrity check.
    From that point of view, where MD5 is an optional fingerprint, I think it is a great advantage to have the same [note2] algorithm across FLAC/TAK/WavPack(/OptimFROG if anyone cares):
    ...
    That is MD5's one saving grace IMO, but it also means I'll die before MD5 does.

    Quote
    You could consider looking into the arithmetic coder used in Daala/AV1. That is an arithmetic coder that was specifically designed to evade any existing patents. I don't know whether that is fast enough for your liking

    That hasn't stopped Microsoft from patenting the rANS entropy method, regardless.

    Which to me leaves us in the exact same situation as if we use ordinary arithmetic coding. Might as well just use that or range coding. :/
    I believe ANS is faster and is easier to multithread. I don't think patent trolling is a reason to discount something, if anything it's proof that a novel development is worth a damn. Just give the middle finger to Microsoft and carry on.

    Re: TAK 2.3.3

    Reply #11
    Modern CPUs have acceleration for SHA-1 and SHA-256; IMO that's a big reason to use them. For a similar reason CRC32C is interesting: https://www.corsix.org/content/fast-crc32c-4k

    https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=1566,6495&cats=Cryptography
    I think that even with CPU acceleration there are faster hashes. xxHash, for example: https://github.com/Cyan4973/xxHash

    Re: TAK 2.3.3

    Reply #12
    That's a good call, XXH128 looks extremely good. It's not cryptographic but that's fine?

    Re: TAK 2.3.3

    Reply #13
    Is there any lossless audio format that uses anything "cryptographic", except TTA's encryption feature that nothing supports [note]?
     
    There was an exchange of opinions in which some grumpy third-party developer refused to touch the WavPack 5 format because its block checksums are not tamper-safe, but how many would want to store all of those in a database for subsequent verification anyway? Monkey's Audio could have chosen something like that: the codec uses the MD5 of the entire encoded stream, so that (with exceptions I just learned about) a given signal encoded to a Monkey's "High" file always has the same encoded bit stream (for the audio part, that is). Since that checksum is over the encode, it doesn't agree with anything else, and Ashland could then have selected any checksum algorithm of his choice, cryptographic even. But, nah.

    [note]ffplay can playback encrypted TTA when given the password as a command-line option, but I don't know any player that starts reading the file, finds it is encrypted and prompts the user ...
    Also, back around 2005, TTA went for a model where there are no presets. Imagine a Monkey's Audio format which has just one mode (circa "fast", for CDDA purposes). That makes it "kinda useful" for torrenting out, because no user will retrieve it and then mess up the shareability by a recompression in-place. There is a collection of fan art that uses TTA for that purpose, and also claims that one reason to use it is the lack of metadata ... which is not true, but saying so discourages users from trying to tag and thus change the "CD image" files.
    For file sharing purposes, it would make sense to have a cryptographic hash per block, and share each block as a segment. Would require its own P2P thing that will never happen. Also you can imagine a delta-copy backup that would scrutinize file content that way. Not going to happen either. But at least a "potential use" for cryptographic block hashes.
    As for TTA, their blocks are apparently checksummed (IDK about algorithm, but it is GPL so those who speak code can find it), without being able to make use of it - at least not the reference decoder. https://hydrogenaud.io/index.php/topic,122094.0.html
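The "cryptographic hash per block" idea mentioned above, for sharing or delta-copy backup, can be sketched in a few lines. The block size and helper names are arbitrary choices for illustration, and SHA-256 is simply one cryptographic hash available in the standard library.

```python
import hashlib

def block_hashes(data: bytes, block_size: int = 4096) -> list[str]:
    """SHA-256 of each fixed-size block (block size is an arbitrary choice here)."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> list[int]:
    """Indices of blocks whose hash differs; only these would need re-transfer."""
    a, b = block_hashes(old, block_size), block_hashes(new, block_size)
    return [i for i in range(max(len(a), len(b)))
            if i >= len(a) or i >= len(b) or a[i] != b[i]]

original = bytes(range(256)) * 64   # 16 KiB, i.e. four blocks of 4096 bytes
modified = bytearray(original)
modified[5000] ^= 0x01              # single bit flip inside the second block
print(changed_blocks(original, bytes(modified)))
```

Only the block containing the flipped bit is reported, which is the property a segment-wise sharing or backup scheme would rely on.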

    Re: TAK 2.3.3

    Reply #14
    Is there any lossless audio format that uses anything "cryptographic", except TTA's encryption feature that nothing supports [note]?
    MD5 and SHA1 are cryptographic, they can just be tampered with more easily now that they've been broken. I think SHA1 still takes considerable resources to find a collision, but MD5 is broken to the point that any hardware can find a collision easily. A hash that's non-cryptographic from the start might allow craftable-collisions in the ballpark of MD5 or maybe much easier.

    Also, back around 2005, TTA went for a model where there are no presets. Imagine a Monkey's Audio format which has just one mode (circa "fast", for CDDA purposes). That makes it "kinda useful" for torrenting out, because no user will retrieve it and then mess up the shareability by a recompression in-place. There is a collection of fan art that uses TTA for that purpose, and also claims that one reason to use it is the lack of metadata ... which is not true, but saying so discourages users from trying to tag and thus change the "CD image" files.
    As long as a compressor doesn't use floating point or an unpredictable RNG, either in the format or in heuristics that make decisions about the bitstream, two distinct computers can compress the same file and get the same compressed output. torrentzip and torrent7zip do exactly this, and I believe TAK can do the same?

    For file sharing purposes, it would make sense to have a cryptographic hash per block, and share each block as a segment. Would require its own P2P thing that will never happen.
    I don't think it's necessary. For sending files you rely on the transport layer to verify that it's transporting properly, be it raw TCP/UDP or a protocol built on top like rsync/torrent. No need to reinvent.

    In a perfect world frames wouldn't even need a CRC checksum to validate: everyone would store their audio on a modern filesystem that does block-level checksumming, and filesystem APIs would be mature to the point that a decoder could determine when a block fails and where the next good block is, to be able to skip the corruption cleanly. Unfortunately, even if that advanced API existed, you can't rely on people using BTRFS/ZFS/modern NTFS.

    Re: TAK 2.3.3

    Reply #15
    OK, MD5 was "cryptographic" but so broken that it is used only for non-cryptographic use.
    And even if you can force lossless compressors into making bitwise the same stream, that is not the issue: once you allow for more than a very few different bit streams to encode the same audio, there is no longer much value added in tamper-freeness of the encoded stream. Once the user employs flac 1.4.x to re-encode an older file, the streams will be different. When re-encoding is so common that you have a ton of different bitstreams and thus a ton of different block checksums for valid audio, there is no way to distinguish "untampered audio" from tampered; use the decoded MD5 for that. It is different for TTA, where a given signal has a given encoded bitstream and thus the encode has a unique hash, and likewise for Monkey's, where you are down to "only five" (at least for 16-bit signals).
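The distinction between hashing the encoded stream and hashing the decoded audio can be demonstrated with any lossless compressor. The sketch below uses `zlib` purely as a stand-in (it is not an audio codec, and the data is not real audio): the same settings reproduce the same stream, different settings give different streams, and the decoded data hashes identically in every case.

```python
import hashlib
import zlib

pcm = bytes(range(256)) * 256  # stand-in for decoded audio samples

fast = zlib.compress(pcm, 1)   # "fast preset"
best = zlib.compress(pcm, 9)   # "strong preset"
again = zlib.compress(pcm, 9)  # same encoder, same settings, second run

same_settings_identical = best == again
streams_identical = hashlib.md5(fast).digest() == hashlib.md5(best).digest()
decoded_identical = (hashlib.md5(zlib.decompress(fast)).digest()
                     == hashlib.md5(zlib.decompress(best)).digest())

print("same settings, same stream:      ", same_settings_identical)
print("different settings, same stream: ", streams_identical)
print("different settings, same decode: ", decoded_identical)
```

A checksum over the encoded stream therefore only identifies one particular encode, while a checksum over the decoded samples identifies the audio.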


    I don't think it's necessary, for sending files you rely on the transport layer to verify that it's transporting properly
    Yes, hence "will never happen". But a tailor-made-for-media sharing could distinguish out metadata and make sure that you know precisely when all the audio (/video) blocks are transferred no matter how much your peers change the metadata. And that is apparently what that particular curator was attempting to emulate, in making it "harder" for users to change the files in any way whatsoever: all cuesheets gathered in one archive to be extracted before manipulation, all audio tagless in a format with unique bitstream, and a neat little white lie that this format doesn't support tagging.

    In a perfect world frames wouldn't even need a crc checksum to validate, everyone would store their audio on a modern filesystem that does block-level checksumming and filesystem API's would be mature to the point that a decoder could determine when a block fails and where the next good block is to be able to skip the corruption cleanly.
    That would of course detect changes in tags, and so this perfect world would require the data handling (filesystem API and applications) to keep tags in separate file system blocks. On one hand I am actually mildly surprised that this hasn't been implemented, given how much I/O it takes to rewrite a whole video file. On the other hand, I guess that editing software by and large performs operations that require so much processing that it isn't much of an issue; it can also use temporary metadata files if it wants, and write to the file only when the project is committed.

    Re: TAK 2.3.3

    Reply #16

      Quote
      You could consider looking into the arithmetic coder used in Daala/AV1. That is an arithmetic coder that was specifically designed to evade any existing patents. I don't know whether that is fast enough for your liking

      That hasn't stopped Microsoft from patenting the rANS entropy method, regardless.

      Which to me leaves us in the exact same situation as if we use ordinary arithmetic coding. Might as well just use that or range coding. :/
      I believe ANS is faster and is easier to multithread. I don't think patent trolling is a reason to discount something, if anything it's proof that a novel development is worth a damn. Just give the middle finger to Microsoft and carry on.

      If that's the case, I have seen arithmetic coding being streamed anyway. Even range coding can be done in multiple streams, using SIMD. But yeah, ANS is better. I really hate that MS tried and somewhat succeeded in claiming it. I was at some point pondering it for my executable compressor, in conjunction with LZ77 and a suffix-array-based matcher.

      https://github.com/richgel999/sserangecoding

      Re: TAK 2.3.3

      Reply #17
      Just wanted to say that TAK has become my favourite lossless audio encoder recently.

      Thomas: You've really created something unique here. A big thank you for all the work you've done. :) I hope you'll continue to develop it further. 8)
      Your ideas for v3.0 sound very interesting. :D

      I've found TAK to be very good at compressing hi-res audio files. Maybe you could add support for even higher audio sample rates? 352.8 kHz, 384 kHz, 705.6 kHz, 768 kHz and so on? I know these sample rates are not very common (at least not yet), but since a lot of DACs out there support them these days, it would be nice to see them supported by TAK as well.