Topic: libttaR (TTA rewrite part 2)

Re: libttaR (TTA rewrite part 2)

Reply #1
good news, bad news

good news:
i've been working on a multi-threaded version. the encoder is already done in the 1.1-dev branch

bad news:
the tta wikipedia page was deleted

Re: libttaR (TTA rewrite part 2)

Reply #2
1.1 is now nearly finished. There are just a few minor things left for me to get to. I plan to officially release it on the first of next month.

here is a benchmark against ffmpeg:

system:
   Linux 5.10.0-23-amd64 SMP Debian 5.10.179-1 (2023-05-12) x86_64 GNU/Linux
CPU:
   AMD Ryzen 7 1700 64-bit 8-Core MT MCP 1550/3750 MHz
ffmpeg:
   ffmpeg version 4.3.6-0+deb11u1
   built with gcc 10 (Debian 10.2.1-6)
   libavutil      56. 51.100 / 56. 51.100
   libavcodec     58. 91.100 / 58. 91.100
   libavformat    58. 45.100 / 58. 45.100
   libavdevice    58. 10.100 / 58. 10.100
   libavfilter     7. 85.100 /  7. 85.100
   libavresample   4.  0.  0 /  4.  0.  0
   libswscale      5.  7.100 /  5.  7.100
   libswresample   3.  7.100 /  3.  7.100
   libpostproc    55.  7.100 / 55.  7.100
ttaR:
   Debian clang version 11.0.1-2
   -march=native -mtune=native
file:
   Nirvana - MTV Unplugged in New York
   571084460   mtv.wav
   337981896   mtv.tta

##############################################################################

 Performance counter stats for 'ffmpeg -loglevel quiet -threads 1 -i mtv.wav -f tta /dev/null -y' (20 runs):

          8,925.80 msec task-clock                #    1.000 CPUs utilized            ( +-  0.06% )
                60      context-switches          #    0.007 K/sec                    ( +-  5.24% )
                16      cpu-migrations            #    0.002 K/sec                    ( +-  2.69% )
            86,798      page-faults               #    0.010 M/sec                    ( +-  0.00% )
    33,302,059,704      cycles                    #    3.731 GHz                      ( +-  0.05% )  (83.33%)
       566,286,522      stalled-cycles-frontend   #    1.70% frontend cycles idle     ( +-  0.54% )  (83.33%)
     3,988,341,464      stalled-cycles-backend    #   11.98% backend cycles idle      ( +-  0.18% )  (83.33%)
    65,718,741,261      instructions              #    1.97  insn per cycle
                                                  #    0.06  stalled cycles per insn  ( +-  0.00% )  (83.33%)
     8,702,525,883      branches                  #  974.986 M/sec                    ( +-  0.01% )  (83.34%)
       384,955,890      branch-misses             #    4.42% of all branches          ( +-  0.03% )  (83.33%)

           8.92647 +- 0.00557 seconds time elapsed  ( +-  0.06% )


 Performance counter stats for 'ffmpeg -loglevel quiet -threads 1 -i mtv.tta -f s16le /dev/null -y' (20 runs):

          6,504.99 msec task-clock                #    1.000 CPUs utilized            ( +-  0.13% )
                55      context-switches          #    0.008 K/sec                    ( +-  2.46% )
                16      cpu-migrations            #    0.002 K/sec                    ( +-  2.49% )
             3,692      page-faults               #    0.567 K/sec                    ( +-  0.04% )
    24,289,757,929      cycles                    #    3.734 GHz                      ( +-  0.13% )  (83.31%)
       327,298,120      stalled-cycles-frontend   #    1.35% frontend cycles idle     ( +-  0.15% )  (83.31%)
     3,454,321,337      stalled-cycles-backend    #   14.22% backend cycles idle      ( +-  0.20% )  (83.33%)
    67,039,367,957      instructions              #    2.76  insn per cycle
                                                  #    0.05  stalled cycles per insn  ( +-  0.00% )  (83.35%)
     8,499,223,928      branches                  # 1306.569 M/sec                    ( +-  0.00% )  (83.36%)
       313,478,808      branch-misses             #    3.69% of all branches          ( +-  0.01% )  (83.34%)

           6.50534 +- 0.00866 seconds time elapsed  ( +-  0.13% )


 Performance counter stats for 'ffmpeg -loglevel quiet -threads 16 -i mtv.tta -f s16le /dev/null -y' (320 runs):

          9,947.58 msec task-clock                #   12.309 CPUs utilized            ( +-  0.09% )
             4,450      context-switches          #    0.447 K/sec                    ( +-  0.37% )
             1,041      cpu-migrations            #    0.105 K/sec                    ( +-  1.04% )
             8,380      page-faults               #    0.842 K/sec                    ( +-  0.00% )
    36,296,805,088      cycles                    #    3.649 GHz                      ( +-  0.02% )  (83.37%)
       562,776,161      stalled-cycles-frontend   #    1.55% frontend cycles idle     ( +-  0.15% )  (83.13%)
     5,056,980,289      stalled-cycles-backend    #   13.93% backend cycles idle      ( +-  0.02% )  (83.21%)
    67,127,958,734      instructions              #    1.85  insn per cycle
                                                  #    0.08  stalled cycles per insn  ( +-  0.00% )  (83.30%)
     8,518,827,278      branches                  #  856.372 M/sec                    ( +-  0.00% )  (83.43%)
       317,537,726      branch-misses             #    3.73% of all branches          ( +-  0.00% )  (83.57%)

           0.80817 +- 0.00137 seconds time elapsed  ( +-  0.17% )

##############################################################################

 Performance counter stats for 'ttaR encode -q -S mtv.wav -o /dev/null' (20 runs):

          5,697.67 msec task-clock                #    1.000 CPUs utilized            ( +-  0.10% )
                13      context-switches          #    0.002 K/sec                    ( +- 14.11% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +- 31.26% )
               252      page-faults               #    0.044 K/sec                    ( +-  0.10% )
    21,277,212,273      cycles                    #    3.734 GHz                      ( +-  0.08% )  (83.32%)
       424,132,363      stalled-cycles-frontend   #    1.99% frontend cycles idle     ( +-  1.19% )  (83.32%)
     8,164,992,255      stalled-cycles-backend    #   38.37% backend cycles idle      ( +-  0.17% )  (83.33%)
    50,036,808,248      instructions              #    2.35  insn per cycle        
                                                  #    0.16  stalled cycles per insn  ( +-  0.00% )  (83.34%)
     3,929,444,278      branches                  #  689.658 M/sec                    ( +-  0.01% )  (83.36%)
       371,064,057      branch-misses             #    9.44% of all branches          ( +-  0.01% )  (83.34%)

           5.69822 +- 0.00548 seconds time elapsed  ( +-  0.10% )


 Performance counter stats for 'ttaR encode -q -t16 mtv.wav -o /dev/null' (320 runs):

          8,177.73 msec task-clock                #   15.379 CPUs utilized            ( +-  0.02% )
             1,300      context-switches          #    0.159 K/sec                    ( +-  2.07% )
               203      cpu-migrations            #    0.025 K/sec                    ( +-  2.01% )
             5,834      page-faults               #    0.713 K/sec                    ( +-  0.04% )
    30,484,625,770      cycles                    #    3.728 GHz                      ( +-  0.02% )  (83.09%)
       439,889,964      stalled-cycles-frontend   #    1.44% frontend cycles idle     ( +-  0.38% )  (83.18%)
     4,980,516,390      stalled-cycles-backend    #   16.34% backend cycles idle      ( +-  0.04% )  (83.31%)
    50,068,984,657      instructions              #    1.64  insn per cycle        
                                                  #    0.10  stalled cycles per insn  ( +-  0.00% )  (83.46%)
     3,937,619,679      branches                  #  481.505 M/sec                    ( +-  0.00% )  (83.60%)
       380,192,849      branch-misses             #    9.66% of all branches          ( +-  0.00% )  (83.36%)

          0.531753 +- 0.000344 seconds time elapsed  ( +-  0.06% )


 Performance counter stats for 'ttaR decode -q -S mtv.tta -f raw -o /dev/null' (20 runs):

          5,807.10 msec task-clock                #    1.000 CPUs utilized            ( +-  0.12% )
                15      context-switches          #    0.003 K/sec                    ( +-  9.58% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 25.36% )
               207      page-faults               #    0.036 K/sec                    ( +-  0.10% )
    21,697,968,813      cycles                    #    3.736 GHz                      ( +-  0.12% )  (83.29%)
       409,444,598      stalled-cycles-frontend   #    1.89% frontend cycles idle     ( +-  0.06% )  (83.33%)
     8,348,408,571      stalled-cycles-backend    #   38.48% backend cycles idle      ( +-  0.30% )  (83.35%)
    53,311,974,255      instructions              #    2.46  insn per cycle        
                                                  #    0.16  stalled cycles per insn  ( +-  0.00% )  (83.35%)
     3,884,628,982      branches                  #  668.945 M/sec                    ( +-  0.00% )  (83.36%)
       370,081,414      branch-misses             #    9.53% of all branches          ( +-  0.01% )  (83.32%)

           5.80763 +- 0.00684 seconds time elapsed  ( +-  0.12% )


 Performance counter stats for 'ttaR decode -q -t16 mtv.tta -f raw -o /dev/null' (320 runs):

          8,396.63 msec task-clock                #   15.660 CPUs utilized            ( +-  0.02% )
             2,705      context-switches          #    0.322 K/sec                    ( +-  0.61% )
                70      cpu-migrations            #    0.008 K/sec                    ( +-  1.97% )
             4,820      page-faults               #    0.574 K/sec                    ( +-  0.00% )
    31,341,341,315      cycles                    #    3.733 GHz                      ( +-  0.02% )  (83.16%)
       404,497,658      stalled-cycles-frontend   #    1.29% frontend cycles idle     ( +-  1.11% )  (83.19%)
     4,983,844,918      stalled-cycles-backend    #   15.90% backend cycles idle      ( +-  0.03% )  (83.27%)
    53,368,851,871      instructions              #    1.70  insn per cycle        
                                                  #    0.09  stalled cycles per insn  ( +-  0.00% )  (83.37%)
     3,896,895,024      branches                  #  464.102 M/sec                    ( +-  0.00% )  (83.55%)
       378,556,224      branch-misses             #    9.71% of all branches          ( +-  0.00% )  (83.45%)

          0.536170 +- 0.000363 seconds time elapsed  ( +-  0.07% )
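
For a rough summary of the wall-clock figures above, the multi-threaded speedup and the speed relative to realtime can be worked out directly. This is only a sketch: the elapsed times are copied from the perf output, but the 44-byte WAV header and the CD-quality format (44.1 kHz, stereo, 16-bit) are assumptions about mtv.wav that the post does not state.

```python
# Elapsed seconds copied from the perf output above.
elapsed = {
    "ffmpeg encode, 1 thread":   8.92647,
    "ffmpeg decode, 1 thread":   6.50534,
    "ffmpeg decode, 16 threads": 0.80817,
    "ttaR encode, 1 thread":     5.69822,
    "ttaR encode, 16 threads":   0.531753,
    "ttaR decode, 1 thread":     5.80763,
    "ttaR decode, 16 threads":   0.536170,
}

WAV_BYTES = 571_084_460             # size of mtv.wav as listed in the post
HEADER_BYTES = 44                   # assumption: plain canonical WAV header
BYTES_PER_SECOND = 44_100 * 2 * 2   # assumption: 44.1 kHz, stereo, 16-bit

# Approximate playing time of the test file in seconds.
duration = (WAV_BYTES - HEADER_BYTES) / BYTES_PER_SECOND


def realtime_factor(seconds):
    """How many times faster than playback speed a run was."""
    return duration / seconds


def speedup(single, multi):
    """Wall-clock speedup of the multi-threaded run over the single-threaded one."""
    return elapsed[single] / elapsed[multi]


print(f"audio duration: {duration / 60:.1f} min")
for name, seconds in elapsed.items():
    print(f"{name:27s} {seconds:8.3f} s  {realtime_factor(seconds):7.0f}x realtime")

print(f"ffmpeg decode speedup: "
      f"{speedup('ffmpeg decode, 1 thread', 'ffmpeg decode, 16 threads'):.2f}x")
print(f"ttaR encode speedup:   "
      f"{speedup('ttaR encode, 1 thread', 'ttaR encode, 16 threads'):.2f}x")
print(f"ttaR decode speedup:   "
      f"{speedup('ttaR decode, 1 thread', 'ttaR decode, 16 threads'):.2f}x")
```

Note that the speedup is against 16 hardware threads on 8 physical cores, so it says nothing about per-core scaling on its own.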


 

Re: libttaR (TTA rewrite part 2)

Reply #3
Using the ffmpeg cli tool (an old version, too) for a performance comparison is flawed and extremely biased.
Also, the current ffmpeg cli tool has bad performance with smaller packets due to the clumsy MT work of an ex-developer.
Also, just running a generic build of ffmpeg for the first time takes extra time... Those are just a few points I wanted to emphasize to show how unfair and biased such a comparison is.
Please remove my account from this forum.

Re: libttaR (TTA rewrite part 2)

Reply #4
Also, this repo does not have any SIMD x86 assembly, so the only way performance can go up is if the compiler is a modern clang.

Re: libttaR (TTA rewrite part 2)

Reply #5
Using the ffmpeg cli tool (an old version, too) for a performance comparison is flawed and extremely biased.
The tta codec should be the same in the version I used and the current one, or at least the differences should be negligible. You are always welcome to post your own benchmarks.

Also, the current ffmpeg cli tool has bad performance with smaller packets due to the clumsy MT work of an ex-developer.
ok

Also, just running a generic build of ffmpeg for the first time takes extra time
That cannot be more than a millisecond.

Also, this repo does not have any SIMD x86 assembly, so the only way performance can go up is if the compiler is a modern clang.
If anything, that is a virtue.
FYI, most of the performance increases came from optimizing the rice coder, which is completely unSIMDable
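
To illustrate why a Rice coder resists SIMD, here is a minimal sketch of generic adaptive Rice coding. It is not libttaR's code and does not match the TTA bitstream; the names `adapt`, `rice_encode` and `rice_decode` and the adaptation rule are made up for the example. The thing to look at is the dependency chain: every code has a variable length, so the bit position of the next value is only known once the current one is done, and the parameter k for the next value depends on the current value too.

```python
def adapt(k, value):
    # Hypothetical adaptation rule (an assumption, not TTA's):
    # grow k after a large value, shrink it after a small one.
    if value >= (1 << (k + 1)):
        return k + 1
    if k > 0 and value < (1 << (k - 1)):
        return k - 1
    return k


def rice_encode(values, k=4):
    """Encode non-negative ints as a list of bits (unary quotient + k-bit remainder)."""
    bits = []
    for v in values:
        q, r = v >> k, v & ((1 << k) - 1)
        bits.extend([1] * q)                                   # quotient in unary
        bits.append(0)                                         # unary terminator
        bits.extend((r >> i) & 1 for i in reversed(range(k)))  # remainder, MSB first
        k = adapt(k, v)              # the next code depends on this value
    return bits


def rice_decode(bits, count, k=4):
    """Inverse of rice_encode; 'pos' can only advance one code at a time."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == 1:        # length of this code is unknown until it is read
            q += 1
            pos += 1
        pos += 1                     # skip the terminating 0
        r = 0
        for _ in range(k):
            r = (r << 1) | bits[pos]
            pos += 1
        v = (q << k) | r
        values.append(v)
        k = adapt(k, v)              # same update as the encoder, in the same order
    return values


samples = [0, 3, 17, 250, 4, 0, 1, 1023]
encoded = rice_encode(samples)
print(len(encoded), "bits for", len(samples), "values")
print(rice_decode(encoded, len(samples)) == samples)
```

Real codecs first map signed prediction residuals to non-negative integers and pack the bits into machine words, but the serial dependency stays the same.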

Re: libttaR (TTA rewrite part 2)

Reply #6
smaller packets, wouldn't TTA be fixed at about a second - that's not small?

Anyway, ffmpeg 5 isn't the newest and hottest either ... though, compared to this article:
the tta wikipedia page was deleted
Been flagged as looking like an ad for twelve years ... and not too wrongfully either.

Re: libttaR (TTA rewrite part 2)

Reply #7
smaller packets, wouldn't TTA be fixed at about a second - that's not small?
yes, with CD quality audio, the framesize is 180KiB.
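
The 180KiB figure can be checked with a little arithmetic. A quick sketch, assuming the usual TTA frame duration of 256/245 of a second (about 1.045 s), which is not stated in this thread, and taking "CD quality" to mean 44.1 kHz, stereo, 16-bit:

```python
SAMPLE_RATE = 44_100      # Hz, CD quality
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 2      # 16-bit

# Assumption: a TTA frame holds 256/245 seconds of audio.
samples_per_frame = 256 * SAMPLE_RATE // 245

# Size of one frame once decoded to raw PCM.
frame_bytes = samples_per_frame * CHANNELS * BYTES_PER_SAMPLE

print(samples_per_frame, "samples per channel per frame")
print(frame_bytes, "bytes =", frame_bytes / 1024, "KiB decoded")
```

This is the decoded (PCM) size of a frame; the encoded size varies with the content.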

Re: libttaR (TTA rewrite part 2)

Reply #8
1.1 officially released

lib:
    api changes
    performance improvements

cli:
    multi-threading
    faster single-threading
    some bug fixes

Re: libttaR (TTA rewrite part 2)

Reply #9
Still extremely misleading and biased. Proper benchmarks are done by decoding very long audio files and looking at decoding speed versus realtime. Also, TTA is more irrelevant and niche than TAK, and TAK actually did have some cool, innovative stuff to offer; TTA has nothing new to offer.

Re: libttaR (TTA rewrite part 2)

Reply #10
smaller packets, wouldn't TTA be fixed at about a second - that's not small?
yes, with CD quality audio, the framesize is 180KiB.

Not the decoded frame size, but the encoded one as stored in the final output. If the audio is mostly silence, it will slow things down, unless TTA encodes silent frames extremely inefficiently, which may be true after all considering its poor design decisions.

Re: libttaR (TTA rewrite part 2)

Reply #11
Not the decoded frame size, but the encoded one as stored in the final output. If the audio is mostly silence, it will slow things down, unless TTA encodes silent frames extremely inefficiently, which may be true after all considering its poor design decisions.
so ffmpeg gets slower when it has to read less? what a well designed piece of software
and yes, tta is not that efficient with silence compared to most other codecs

Re: libttaR (TTA rewrite part 2)

Reply #12
Indeed, 60 seconds of silence (44100Hz S16) encoded with ffmpeg: tta is 654K and flac is 17K.

Re: libttaR (TTA rewrite part 2)

Reply #13
I tested the compression of silence: https://hydrogenaud.io/index.php/topic,122413.0.html .  (Post says "5.1" but it was five channels.)
TTA isn't that bad for a codec that doesn't handle wasted bits - compare to Monkey's and ALAC.
I could get smaller files by manually setting block sizes. 29k with ALS, 70k with FLAC, and 220k with WavPack --pair-unassigned-chans (edit: fixed numbers)

Anyway, don't whine over people who make a little code-improvement project. The TTA reference implementation isn't good.
As for the format itself, there is one thing to it, that sometimes is useful in some applications: the encoded bit stream is unique. (Bar the encryption feature, which I haven't seen anything being able to play except command-line ffplay.) Would make it ideal for p2p distribution.

Re: libttaR (TTA rewrite part 2)

Reply #14
I do not whine over people, I whine over their claims.

Re: libttaR (TTA rewrite part 2)

Reply #15
just pushed a new version
here are some benchmarks with the same system as last time

 Performance counter stats for 'ttaR encode mtv.wav -o /dev/null -Sq' (100 runs):

          5,515.51 msec task-clock                #    1.000 CPUs utilized            ( +-  0.04% )
                17      context-switches          #    0.003 K/sec                    ( +-  2.25% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +- 23.53% )
               249      page-faults               #    0.045 K/sec                    ( +-  0.04% )
    20,614,727,454      cycles                    #    3.738 GHz                      ( +-  0.03% )  (83.30%)
       414,162,884      stalled-cycles-frontend   #    2.01% frontend cycles idle     ( +-  0.27% )  (83.31%)
     7,251,118,011      stalled-cycles-backend    #   35.17% backend cycles idle      ( +-  0.07% )  (83.33%)
    49,521,631,830      instructions              #    2.40  insn per cycle
                                                  #    0.15  stalled cycles per insn  ( +-  0.00% )  (83.35%)
     3,926,932,639      branches                  #  711.980 M/sec                    ( +-  0.00% )  (83.37%)
       367,105,365      branch-misses             #    9.35% of all branches          ( +-  0.01% )  (83.33%)

           5.51610 +- 0.00197 seconds time elapsed  ( +-  0.04% )


 Performance counter stats for 'ttaR encode mtv.wav -o /dev/null -Mq' (800 runs):

          7,968.96 msec task-clock                #   15.448 CPUs utilized            ( +-  0.02% )
             1,640      context-switches          #    0.206 K/sec                    ( +-  1.04% )
               173      cpu-migrations            #    0.022 K/sec                    ( +-  1.67% )
             6,016      page-faults               #    0.755 K/sec                    ( +-  0.03% )
    29,712,858,608      cycles                    #    3.729 GHz                      ( +-  0.02% )  (83.12%)
       434,875,989      stalled-cycles-frontend   #    1.46% frontend cycles idle     ( +-  0.97% )  (83.18%)
     4,780,472,493      stalled-cycles-backend    #   16.09% backend cycles idle      ( +-  0.05% )  (83.30%)
    49,558,892,226      instructions              #    1.67  insn per cycle
                                                  #    0.10  stalled cycles per insn  ( +-  0.00% )  (83.43%)
     3,937,732,253      branches                  #  494.134 M/sec                    ( +-  0.00% )  (83.57%)
       377,188,832      branch-misses             #    9.58% of all branches          ( +-  0.00% )  (83.39%)

          0.515850 +- 0.000214 seconds time elapsed  ( +-  0.04% )


 Performance counter stats for 'ttaR decode mtv.tta -o /dev/null -Sqfraw' (100 runs):

          5,537.05 msec task-clock                #    1.000 CPUs utilized            ( +-  0.04% )
                32      context-switches          #    0.006 K/sec                    ( +-  8.36% )
                 0      cpu-migrations            #    0.000 K/sec                    ( +- 17.28% )
               249      page-faults               #    0.045 K/sec                    ( +-  0.04% )
    20,687,537,120      cycles                    #    3.736 GHz                      ( +-  0.03% )  (83.30%)
       414,159,293      stalled-cycles-frontend   #    2.00% frontend cycles idle     ( +-  1.13% )  (83.32%)
     7,695,566,474      stalled-cycles-backend    #   37.20% backend cycles idle      ( +-  0.08% )  (83.34%)
    52,204,014,192      instructions              #    2.52  insn per cycle
                                                  #    0.15  stalled cycles per insn  ( +-  0.00% )  (83.35%)
     3,882,320,752      branches                  #  701.154 M/sec                    ( +-  0.00% )  (83.36%)
       369,304,146      branch-misses             #    9.51% of all branches          ( +-  0.01% )  (83.33%)

           5.53776 +- 0.00199 seconds time elapsed  ( +-  0.04% )


 Performance counter stats for 'ttaR decode mtv.tta -o /dev/null -Mqfraw' (800 runs):

          8,260.12 msec task-clock                #   15.772 CPUs utilized            ( +-  0.05% )
             2,543      context-switches          #    0.308 K/sec                    ( +-  0.42% )
                49      cpu-migrations            #    0.006 K/sec                    ( +-  1.78% )
             6,196      page-faults               #    0.750 K/sec                    ( +-  0.00% )
    30,810,333,718      cycles                    #    3.730 GHz                      ( +-  0.05% )  (83.07%)
       432,967,316      stalled-cycles-frontend   #    1.41% frontend cycles idle     ( +-  1.10% )  (83.15%)
     4,828,446,714      stalled-cycles-backend    #   15.67% backend cycles idle      ( +-  0.06% )  (83.32%)
    52,263,391,831      instructions              #    1.70  insn per cycle
                                                  #    0.09  stalled cycles per insn  ( +-  0.00% )  (83.47%)
     3,895,307,363      branches                  #  471.580 M/sec                    ( +-  0.00% )  (83.60%)
       377,341,389      branch-misses             #    9.69% of all branches          ( +-  0.03% )  (83.39%)

          0.523718 +- 0.000282 seconds time elapsed  ( +-  0.05% )

Re: libttaR (TTA rewrite part 2)

Reply #16
progress report

 Performance counter stats for 'ttaR encode mtv.wav -o /dev/null -Sq' (100 runs):

          4,821.56 msec task-clock                #    1.000 CPUs utilized            ( +-  0.03% )
                12      context-switches          #    0.003 K/sec                    ( +- 12.90% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 11.55% )
               250      page-faults               #    0.052 K/sec                    ( +-  0.05% )
    18,020,919,189      cycles                    #    3.738 GHz                      ( +-  0.02% )  (83.31%)
       241,875,546      stalled-cycles-frontend   #    1.34% frontend cycles idle     ( +-  0.02% )  (83.31%)
     9,735,938,921      stalled-cycles-backend    #   54.03% backend cycles idle      ( +-  0.03% )  (83.31%)
    41,858,787,347      instructions              #    2.32  insn per cycle
                                                  #    0.23  stalled cycles per insn  ( +-  0.00% )  (83.34%)
     2,721,401,687      branches                  #  564.423 M/sec                    ( +-  0.00% )  (83.37%)
       202,385,457      branch-misses             #    7.44% of all branches          ( +-  0.01% )  (83.35%)

           4.82211 +- 0.00138 seconds time elapsed  ( +-  0.03% )


 Performance counter stats for 'ttaR encode mtv.wav -o /dev/null -Mq' (800 runs):

          6,641.20 msec task-clock                #   15.177 CPUs utilized            ( +-  0.03% )
             1,625      context-switches          #    0.245 K/sec                    ( +-  1.25% )
               198      cpu-migrations            #    0.030 K/sec                    ( +-  1.54% )
             4,562      page-faults               #    0.687 K/sec                    ( +-  0.04% )
    24,782,070,565      cycles                    #    3.732 GHz                      ( +-  0.03% )  (83.03%)
       301,274,442      stalled-cycles-frontend   #    1.22% frontend cycles idle     ( +-  1.59% )  (83.11%)
     5,459,187,296      stalled-cycles-backend    #   22.03% backend cycles idle      ( +-  0.05% )  (83.32%)
    41,895,374,871      instructions              #    1.69  insn per cycle
                                                  #    0.13  stalled cycles per insn  ( +-  0.00% )  (83.52%)
     2,731,176,529      branches                  #  411.247 M/sec                    ( +-  0.00% )  (83.66%)
       209,633,049      branch-misses             #    7.68% of all branches          ( +-  0.00% )  (83.35%)

          0.437586 +- 0.000284 seconds time elapsed  ( +-  0.06% )


 Performance counter stats for 'ttaR decode mtv.tta -o /dev/null -Sqfraw' (100 runs):

          4,949.68 msec task-clock                #    1.000 CPUs utilized            ( +-  0.03% )
                12      context-switches          #    0.002 K/sec                    ( +-  8.32% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 17.73% )
               249      page-faults               #    0.050 K/sec                    ( +-  0.04% )
    18,498,709,968      cycles                    #    3.737 GHz                      ( +-  0.03% )  (83.31%)
       243,117,545      stalled-cycles-frontend   #    1.31% frontend cycles idle     ( +-  0.03% )  (83.32%)
     9,996,864,891      stalled-cycles-backend    #   54.04% backend cycles idle      ( +-  0.07% )  (83.33%)
    49,377,194,009      instructions              #    2.67  insn per cycle
                                                  #    0.20  stalled cycles per insn  ( +-  0.00% )  (83.34%)
     2,883,743,681      branches                  #  582.612 M/sec                    ( +-  0.00% )  (83.36%)
       225,043,953      branch-misses             #    7.80% of all branches          ( +-  0.02% )  (83.34%)

           4.95021 +- 0.00155 seconds time elapsed  ( +-  0.03% )


 Performance counter stats for 'ttaR decode mtv.tta -o /dev/null -Mqfraw' (800 runs):

          7,594.67 msec task-clock                #   15.713 CPUs utilized            ( +-  0.02% )
             2,339      context-switches          #    0.308 K/sec                    ( +-  1.25% )
               107      cpu-migrations            #    0.014 K/sec                    ( +-  1.76% )
             4,760      page-faults               #    0.627 K/sec                    ( +-  0.01% )
    28,358,801,073      cycles                    #    3.734 GHz                      ( +-  0.02% )  (83.12%)
       317,697,227      stalled-cycles-frontend   #    1.12% frontend cycles idle     ( +-  0.78% )  (83.16%)
     6,392,451,696      stalled-cycles-backend    #   22.54% backend cycles idle      ( +-  0.04% )  (83.27%)
    49,420,586,080      instructions              #    1.74  insn per cycle
                                                  #    0.13  stalled cycles per insn  ( +-  0.00% )  (83.40%)
     2,894,376,414      branches                  #  381.106 M/sec                    ( +-  0.00% )  (83.60%)
       231,457,120      branch-misses             #    8.00% of all branches          ( +-  0.01% )  (83.44%)

          0.483339 +- 0.000121 seconds time elapsed  ( +-  0.02% )
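
To put the three sets of numbers posted so far side by side, the change in elapsed time from the first benchmark to this progress report can be computed as below. A small sketch: the times are copied from the perf output in this thread, and it assumes the `-t16` runs in the first benchmark and the `-M` runs in the later ones are comparable multi-threaded runs on the same machine.

```python
# Elapsed seconds: (first 1.1 benchmark, "new version" post, progress report).
runs = {
    "encode, single-threaded": (5.69822, 5.51610, 4.82211),
    "encode, multi-threaded":  (0.531753, 0.515850, 0.437586),
    "decode, single-threaded": (5.80763, 5.53776, 4.95021),
    "decode, multi-threaded":  (0.536170, 0.523718, 0.483339),
}


def reduction(name):
    """Percent reduction in elapsed time from the first to the latest run."""
    first, *_, latest = runs[name]
    return 100.0 * (first - latest) / first


for name, times in runs.items():
    history = " -> ".join(f"{t:.3f}" for t in times)
    print(f"{name:24s} {history} s  ({reduction(name):.1f}% less time)")
```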

Re: libttaR (TTA rewrite part 2)

Reply #17
Any Windows builds?
yep!

Re: libttaR (TTA rewrite part 2)

Reply #18
recently updated; performance is a little better now
i think i'll stop posting benchmarks

Any Windows builds?

i've been looking into this.
after commenting out some code and adding a little, i can get it to compile with llvm-mingw and run with wine, but it produces corrupted output. there is probably something else nonportable that i missed. the good news is that single & multi threaded produce the same result.
i'll continue looking into it.


Re: libttaR (TTA rewrite part 2)

Reply #20
just pushed some x86-64-v3 (AVX2 capable) SIMD intrinsics
decoding is now much faster (encoding a little too)
especially multi-threaded decoding

Re: libttaR (TTA rewrite part 2)

Reply #21
i "downgraded" to x86-64-v2, because the one AVX2 that was used was slower than several SSE2
risc ftw?