https://github.com/stseelig/libttaR
good news, bad news
good news:
i've been working on a multi-threaded version. the encoder is already done in the 1.1-dev branch
bad news:
the tta wikipedia page was deleted
1.1 is now nearly finished. There is just some minor stuff left for me to get to. I plan to officially release it on the first of next month.
here is a benchmark against ffmpeg:
system:
Linux 5.10.0-23-amd64 SMP Debian 5.10.179-1 (2023-05-12) x86_64 GNU/Linux
CPU:
AMD Ryzen 7 1700 64-bit 8-Core MT MCP 1550/3750 MHz
ffmpeg:
ffmpeg version 4.3.6-0+deb11u1
built with gcc 10 (Debian 10.2.1-6)
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
ttaR:
Debian clang version 11.0.1-2
-march=native -mtune=native
file:
Nirvana - MTV Unplugged in New York
571084460 mtv.wav
337981896 mtv.tta
##############################################################################
Performance counter stats for 'ffmpeg -loglevel quiet -threads 1 -i mtv.wav -f tta /dev/null -y' (20 runs):
8,925.80 msec task-clock # 1.000 CPUs utilized ( +- 0.06% )
60 context-switches # 0.007 K/sec ( +- 5.24% )
16 cpu-migrations # 0.002 K/sec ( +- 2.69% )
86,798 page-faults # 0.010 M/sec ( +- 0.00% )
33,302,059,704 cycles # 3.731 GHz ( +- 0.05% ) (83.33%)
566,286,522 stalled-cycles-frontend # 1.70% frontend cycles idle ( +- 0.54% ) (83.33%)
3,988,341,464 stalled-cycles-backend # 11.98% backend cycles idle ( +- 0.18% ) (83.33%)
65,718,741,261 instructions # 1.97 insn per cycle
# 0.06 stalled cycles per insn ( +- 0.00% ) (83.33%)
8,702,525,883 branches # 974.986 M/sec ( +- 0.01% ) (83.34%)
384,955,890 branch-misses # 4.42% of all branches ( +- 0.03% ) (83.33%)
8.92647 +- 0.00557 seconds time elapsed ( +- 0.06% )
Performance counter stats for 'ffmpeg -loglevel quiet -threads 1 -i mtv.tta -f s16le /dev/null -y' (20 runs):
6,504.99 msec task-clock # 1.000 CPUs utilized ( +- 0.13% )
55 context-switches # 0.008 K/sec ( +- 2.46% )
16 cpu-migrations # 0.002 K/sec ( +- 2.49% )
3,692 page-faults # 0.567 K/sec ( +- 0.04% )
24,289,757,929 cycles # 3.734 GHz ( +- 0.13% ) (83.31%)
327,298,120 stalled-cycles-frontend # 1.35% frontend cycles idle ( +- 0.15% ) (83.31%)
3,454,321,337 stalled-cycles-backend # 14.22% backend cycles idle ( +- 0.20% ) (83.33%)
67,039,367,957 instructions # 2.76 insn per cycle
# 0.05 stalled cycles per insn ( +- 0.00% ) (83.35%)
8,499,223,928 branches # 1306.569 M/sec ( +- 0.00% ) (83.36%)
313,478,808 branch-misses # 3.69% of all branches ( +- 0.01% ) (83.34%)
6.50534 +- 0.00866 seconds time elapsed ( +- 0.13% )
Performance counter stats for 'ffmpeg -loglevel quiet -threads 16 -i mtv.tta -f s16le /dev/null -y' (320 runs):
9,947.58 msec task-clock # 12.309 CPUs utilized ( +- 0.09% )
4,450 context-switches # 0.447 K/sec ( +- 0.37% )
1,041 cpu-migrations # 0.105 K/sec ( +- 1.04% )
8,380 page-faults # 0.842 K/sec ( +- 0.00% )
36,296,805,088 cycles # 3.649 GHz ( +- 0.02% ) (83.37%)
562,776,161 stalled-cycles-frontend # 1.55% frontend cycles idle ( +- 0.15% ) (83.13%)
5,056,980,289 stalled-cycles-backend # 13.93% backend cycles idle ( +- 0.02% ) (83.21%)
67,127,958,734 instructions # 1.85 insn per cycle
# 0.08 stalled cycles per insn ( +- 0.00% ) (83.30%)
8,518,827,278 branches # 856.372 M/sec ( +- 0.00% ) (83.43%)
317,537,726 branch-misses # 3.73% of all branches ( +- 0.00% ) (83.57%)
0.80817 +- 0.00137 seconds time elapsed ( +- 0.17% )
##############################################################################
Performance counter stats for 'ttaR encode -q -S mtv.wav -o /dev/null' (20 runs):
5,697.67 msec task-clock # 1.000 CPUs utilized ( +- 0.10% )
13 context-switches # 0.002 K/sec ( +- 14.11% )
0 cpu-migrations # 0.000 K/sec ( +- 31.26% )
252 page-faults # 0.044 K/sec ( +- 0.10% )
21,277,212,273 cycles # 3.734 GHz ( +- 0.08% ) (83.32%)
424,132,363 stalled-cycles-frontend # 1.99% frontend cycles idle ( +- 1.19% ) (83.32%)
8,164,992,255 stalled-cycles-backend # 38.37% backend cycles idle ( +- 0.17% ) (83.33%)
50,036,808,248 instructions # 2.35 insn per cycle
# 0.16 stalled cycles per insn ( +- 0.00% ) (83.34%)
3,929,444,278 branches # 689.658 M/sec ( +- 0.01% ) (83.36%)
371,064,057 branch-misses # 9.44% of all branches ( +- 0.01% ) (83.34%)
5.69822 +- 0.00548 seconds time elapsed ( +- 0.10% )
Performance counter stats for 'ttaR encode -q -t16 mtv.wav -o /dev/null' (320 runs):
8,177.73 msec task-clock # 15.379 CPUs utilized ( +- 0.02% )
1,300 context-switches # 0.159 K/sec ( +- 2.07% )
203 cpu-migrations # 0.025 K/sec ( +- 2.01% )
5,834 page-faults # 0.713 K/sec ( +- 0.04% )
30,484,625,770 cycles # 3.728 GHz ( +- 0.02% ) (83.09%)
439,889,964 stalled-cycles-frontend # 1.44% frontend cycles idle ( +- 0.38% ) (83.18%)
4,980,516,390 stalled-cycles-backend # 16.34% backend cycles idle ( +- 0.04% ) (83.31%)
50,068,984,657 instructions # 1.64 insn per cycle
# 0.10 stalled cycles per insn ( +- 0.00% ) (83.46%)
3,937,619,679 branches # 481.505 M/sec ( +- 0.00% ) (83.60%)
380,192,849 branch-misses # 9.66% of all branches ( +- 0.00% ) (83.36%)
0.531753 +- 0.000344 seconds time elapsed ( +- 0.06% )
Performance counter stats for 'ttaR decode -q -S mtv.tta -f raw -o /dev/null' (20 runs):
5,807.10 msec task-clock # 1.000 CPUs utilized ( +- 0.12% )
15 context-switches # 0.003 K/sec ( +- 9.58% )
1 cpu-migrations # 0.000 K/sec ( +- 25.36% )
207 page-faults # 0.036 K/sec ( +- 0.10% )
21,697,968,813 cycles # 3.736 GHz ( +- 0.12% ) (83.29%)
409,444,598 stalled-cycles-frontend # 1.89% frontend cycles idle ( +- 0.06% ) (83.33%)
8,348,408,571 stalled-cycles-backend # 38.48% backend cycles idle ( +- 0.30% ) (83.35%)
53,311,974,255 instructions # 2.46 insn per cycle
# 0.16 stalled cycles per insn ( +- 0.00% ) (83.35%)
3,884,628,982 branches # 668.945 M/sec ( +- 0.00% ) (83.36%)
370,081,414 branch-misses # 9.53% of all branches ( +- 0.01% ) (83.32%)
5.80763 +- 0.00684 seconds time elapsed ( +- 0.12% )
Performance counter stats for 'ttaR decode -q -t16 mtv.tta -f raw -o /dev/null' (320 runs):
8,396.63 msec task-clock # 15.660 CPUs utilized ( +- 0.02% )
2,705 context-switches # 0.322 K/sec ( +- 0.61% )
70 cpu-migrations # 0.008 K/sec ( +- 1.97% )
4,820 page-faults # 0.574 K/sec ( +- 0.00% )
31,341,341,315 cycles # 3.733 GHz ( +- 0.02% ) (83.16%)
404,497,658 stalled-cycles-frontend # 1.29% frontend cycles idle ( +- 1.11% ) (83.19%)
4,983,844,918 stalled-cycles-backend # 15.90% backend cycles idle ( +- 0.03% ) (83.27%)
53,368,851,871 instructions # 1.70 insn per cycle
# 0.09 stalled cycles per insn ( +- 0.00% ) (83.37%)
3,896,895,024 branches # 464.102 M/sec ( +- 0.00% ) (83.55%)
378,556,224 branch-misses # 9.71% of all branches ( +- 0.00% ) (83.45%)
0.536170 +- 0.000363 seconds time elapsed ( +- 0.07% )
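To make the comparison easier to read, here is a small sketch that turns the "seconds time elapsed" lines above into speedup ratios. The times are copied from the perf output; the dictionary labels and the `speedup` helper are just names made up for this illustration.

```python
# Elapsed wall-clock times in seconds, copied from the perf output above.
elapsed = {
    "ffmpeg encode, 1 thread": 8.92647,
    "ffmpeg decode, 1 thread": 6.50534,
    "ffmpeg decode, 16 threads": 0.80817,
    "ttaR encode, 1 thread": 5.69822,
    "ttaR encode, 16 threads": 0.531753,
    "ttaR decode, 1 thread": 5.80763,
    "ttaR decode, 16 threads": 0.536170,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` finished than `baseline`."""
    return elapsed[baseline] / elapsed[candidate]


comparisons = [
    # ffmpeg versus ttaR at the same thread count
    ("ffmpeg encode, 1 thread", "ttaR encode, 1 thread"),
    ("ffmpeg decode, 1 thread", "ttaR decode, 1 thread"),
    ("ffmpeg decode, 16 threads", "ttaR decode, 16 threads"),
    # scaling of each tool from 1 to 16 threads
    ("ffmpeg decode, 1 thread", "ffmpeg decode, 16 threads"),
    ("ttaR encode, 1 thread", "ttaR encode, 16 threads"),
    ("ttaR decode, 1 thread", "ttaR decode, 16 threads"),
]

for baseline, candidate in comparisons:
    print(f"{candidate} vs {baseline}: {speedup(baseline, candidate):.2f}x")
```

These ratios only describe this one file on this one machine (8 cores, 16 hardware threads), so they should not be read as general figures.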
Using the ffmpeg CLI tool (and an old version of it) for a performance comparison is flawed and extremely biased.
Also, the current ffmpeg CLI tool has bad performance with smaller packets due to the clumsy MT work of an ex-developer.
Also, just running a generic build of ffmpeg for the first time takes extra time... Those are just a few points I wanted to emphasize to show how such a comparison is unfair and biased.
Also, this repo does not have any SIMD x86 assembly, and the only way performance can go up is if the compiler is a modern clang.
Using the ffmpeg CLI tool (and an old version of it) for a performance comparison is flawed and extremely biased.
The tta codec version should be the same between the version I used and the current version, or at least the differences are negligible. You are always welcome to post your own benchmarks.
Also, the current ffmpeg CLI tool has bad performance with smaller packets due to the clumsy MT work of an ex-developer.
ok
Also, just running a generic build of ffmpeg for the first time takes extra time
That cannot be more than a millisecond.
Also, this repo does not have any SIMD x86 assembly, and the only way performance can go up is if the compiler is a modern clang.
If anything, that is a virtue.
FYI, most of the performance increases came from optimizing the Rice coder, which is completely unSIMDable.
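For readers unfamiliar with Rice coding, here is a toy sketch of why it resists SIMD. It is a generic adaptive Rice coder written for illustration only: it is not TTA's actual scheme and not libttaR's code, and the `adapt` rule is a made-up assumption. The point is the serial dependency: where each code starts in the bit stream depends on the length of every code before it, and the parameter `k` depends on the previous value.

```python
def adapt(k, value):
    """Hypothetical adaptation rule: nudge k towards the size of the last value."""
    if value >= (1 << (k + 1)) and k < 24:
        return k + 1
    if value < ((1 << k) >> 1) and k > 0:
        return k - 1
    return k


def rice_encode(values, k=4):
    """Encode non-negative integers as a string of '0'/'1' characters."""
    parts = []
    for value in values:
        quotient = value >> k
        remainder = value & ((1 << k) - 1)
        parts.append("1" * quotient + "0")      # unary quotient, 0-terminated
        if k:
            parts.append(format(remainder, f"0{k}b"))  # k-bit remainder
        k = adapt(k, value)                      # next code depends on this value
    return "".join(parts)


def rice_decode(bits, count, k=4):
    """Decode `count` integers; each code's start depends on all previous codes."""
    values = []
    pos = 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":                  # length only known after reading
            quotient += 1
            pos += 1
        pos += 1                                 # skip the terminating '0'
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        value = (quotient << k) | remainder
        values.append(value)
        k = adapt(k, value)
    return values


samples = [0, 1, 5, 100, 3, 0, 0, 17, 1000]
encoded = rice_encode(samples)
print(len(encoded), rice_decode(encoded, len(samples)) == samples)
```

Because neither `pos` nor `k` for sample n is known until sample n-1 has been fully processed, the loop cannot simply be split across vector lanes. Frame-level multi-threading sidesteps this, since each frame is coded independently.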
smaller packets, wouldn't TTA be fixed at about a second - that's not small?
Anyway, ffmpeg 5 isn't the newest and hottest either ... though, compared to this article:
the tta wikipedia page was deleted
Been flagged as looking like an ad for twelve years ... and not too wrongfully either.
smaller packets, wouldn't TTA be fixed at about a second - that's not small?
yes, with CD quality audio, the framesize is 180KiB.
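A quick check of the 180KiB figure, as a sketch. It assumes the TTA frame length is 256/245 of a second's worth of samples (about 1.045 s), which is how I understand the format; the function names are made up for this example.

```python
def tta_frame_samples(sample_rate):
    """Samples per channel in one frame, assuming a frame lasts 256/245 s."""
    return 256 * sample_rate // 245


def tta_frame_bytes(sample_rate, channels, bytes_per_sample):
    """Size of one frame of decoded PCM in bytes."""
    return tta_frame_samples(sample_rate) * channels * bytes_per_sample


# CD quality: 44100 Hz, 2 channels, 16-bit (2-byte) samples.
frame_bytes = tta_frame_bytes(44100, 2, 2)
print(tta_frame_samples(44100), frame_bytes, frame_bytes / 1024)
# prints: 46080 184320 180.0
```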
1.1 officially released
lib:
api changes
performance improvements
cli:
multi-threading
faster single-threading
some bug fixes
Still extremely misleading and biased. Proper benchmarks are done by decoding very long audio files and looking at the speed of decoding versus realtime. Also, TTA is more irrelevant and niche than TAK, and TAK actually did have some cool, innovative new stuff to offer; TTA has nothing new to offer.
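For what it's worth, a "speed versus realtime" figure can be estimated from the numbers already posted in this thread. This sketch assumes mtv.wav is CD quality (44100 Hz, stereo, 16-bit) with a plain 44-byte WAV header; neither is stated in the benchmark post, so treat the results as approximate. The elapsed times are the single- and 16-thread decode times from the perf output.

```python
WAV_SIZE_BYTES = 571084460       # size of mtv.wav as posted
ASSUMED_HEADER_BYTES = 44        # assumption: minimal canonical WAV header
BYTES_PER_SECOND = 44100 * 2 * 2  # assumption: 44100 Hz, stereo, 16-bit

duration_s = (WAV_SIZE_BYTES - ASSUMED_HEADER_BYTES) / BYTES_PER_SECOND

# Decode times in seconds, copied from the perf output earlier in the thread.
decode_elapsed = {
    "ffmpeg -threads 1": 6.50534,
    "ffmpeg -threads 16": 0.80817,
    "ttaR -S": 5.80763,
    "ttaR -t16": 0.536170,
}


def realtime_factor(elapsed_s):
    """Seconds of audio decoded per second of wall-clock time."""
    return duration_s / elapsed_s


print(f"audio duration: {duration_s:.1f} s")
for name, seconds in decode_elapsed.items():
    print(f"{name}: {realtime_factor(seconds):.0f}x realtime")
```

Under those assumptions the file is roughly 54 minutes long, so the same ranking falls out either way; the realtime multiple is just the elapsed time rescaled by a constant.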
smaller packets, wouldn't TTA be fixed at about a second - that's not small?
yes, with CD quality audio, the framesize is 180KiB.
Not the decoded frame size, but the encoded one as stored in the final output. If the audio is mostly silence, it will slow things down, unless TTA encodes silence frames extremely inefficiently, which may be true after all considering its poor design decisions.
Not the decoded frame size, but the encoded one as stored in the final output. If the audio is mostly silence, it will slow things down, unless TTA encodes silence frames extremely inefficiently, which may be true after all considering its poor design decisions.
so ffmpeg gets slower when it has to read less? what a well designed piece of software
and yes, tta is not that efficient with silence compared to most other codecs
Indeed, 60 seconds of silence (44100Hz S16) encoded with ffmpeg: tta is 654K and flac is 17K.
I tested the compression of silence: https://hydrogenaud.io/index.php/topic,122413.0.html . (Post says "5.1" but it was five channels.)
TTA isn't that bad for a codec that doesn't handle wasted bits - compare to Monkey's and ALAC.
I could get smaller files by manually setting block sizes. 29k with ALS, 70k with FLAC, and 220k with WavPack --pair-unassigned-chans (edit: fixed numbers)
Anyway, don't whine over people who make a little code-improvement project. The TTA reference implementation isn't good.
As for the format itself, there is one thing to it, that sometimes is useful in some applications: the encoded bit stream is unique. (Bar the encryption feature, which I haven't seen anything being able to play except command-line ffplay.) Would make it ideal for p2p distribution.
I do not whine over people, I whine over their claims.