HydrogenAudio

Lossless Audio Compression => Lossless / Other Codecs => Topic started by: TBeck on 2010-12-11 20:45:04

Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-11 20:45:04
Beta release 3 of TAK 2.1.0 ((T)om's lossless (A)udio (K)ompressor)

It consists of:

- TAK Applications 2.1.0 Beta 3 b
- TAK Winamp plugin 2.1.0 Beta 3
- TAK Decoding library 2.1.0 Beta 3

The final release will additionally contain the SDK.

Download:

Download link removed.

The final version has been released: TAK 2.1.0 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=86037&view=findpost&p=738553)

What's new in Beta 3 b

This release brings only some minor fixes and modifications.

Bug fixes:

- Some part of the encoder comes in two versions: one for single core, one for multi core encoding. If you specify -tn1 on the command line, the single core version is used, otherwise the multi core version. But if you didn't specify -tn# at all, the previous version of Takc was using the multi core version with only one thread. On systems with more than 1 core this setting will often be faster than the true single core version. Therefore speed comparisons of 'takc -tn1' vs. 'takc -tn2 (or more)' were valid, but 'takc' vs. 'takc -tn2 (or more)' were not. On my system the difference is about 3 percent. This is also relevant for comparisons of V2.0 and V2.1 Beta 3. If you have more than one core and didn't specify -tn1, the speed advantage of, for instance, the new SSSE3 optimizations has been overestimated.

Modifications:

- Added the -cpu# switch to the command line version, which lets you control some cpu optimizations.
- Some additions to the application's ReadMe.

What's new in Beta 3

This release brings speed optimizations (new in Beta 2) and multi core support (new in Beta 3) for the encoder and adds a new user selectable codec, which significantly improves the compression efficiency of LossyWav-processed files. Files compressed with this codec cannot be decoded by earlier versions of Tak, Takc, in_tak and tak_deco_lib! The default codec remains unchanged and is therefore backwards compatible with TAK V2.0.0.

Improvements:

- Encoding speed improvements of about 10 to 20 percent (depends on preset and cpu) for cpus with the SSSE3 instruction set. Since SSSE3 (note the three 'S') isn't supported by AMD, only intel users will benefit from those optimizations.
- The encoder now creates up to four threads to utilize multiple cpu cores. Specify the thread number in the General Options dialog of the GUI-version or with the -tn option of the command line version. By default only one thread is created. You will only notice a speed up, if the encoding speed isn't already limited by the performance of your drives.
- New additional codec that improves the compression efficiency of LossyWav-processed files by up to about 2 percent (relative to the original file size) for the quality setting -q5.0 (less or more for other settings). It supports any block size that is an integer multiple of 256 samples. Please don't specify the -fsl512 option at the command line. While this was required for the standard codec, it will severely hurt the performance of the new dedicated LossyWav-codec. Another advantage of the new codec: You will not lose much compression if LossyWav decides to remove no bits, as can happen, for instance, with some low amplitude files with little signal complexity. Simply specify -cLW at the command line to activate the new codec. Earlier it wasn't advantageous to use presets higher than -p2m when encoding LossyWav-Files. That's no longer true; you may even benefit from -p4m.

Modifications:

- The file info function now also shows the name of the codec used to compress the file. The new codec is called "3 LossyWav (TAK 2.1)".
- Moved the verify-option from the details-dialog to the general compression options dialog.
- All dialogs with an Add-files-option locked the source folder until the dialog was closed. Hopefully this is no longer the case (new in Beta 3).

Known issues:

- If you use pipe decoding and the application reading the pipe is terminated before the whole file has been read, TAKC may get into an endless loop and has to be killed manually with the task manager. I don't think this is a big issue but i will try to fix it in one of the next versions. BTW: Big thanks to shnutils for testing the pipe decoding!
- There seem to be some compatibility issues with pipe decoding to some other applications ("crc1632.exe" has been reported). I will try to fix it in the next release.

Results

Below i will present some test results illustrating the compression efficiency and speed improvements of V2.1.

All tests have been conducted with a Pentium Dual Core E5200, 2.5 GHz, 45 nm. Disk IO was either disabled or minimized by a cache. Speed values are expressed as multiples of real time, compression values as the ratio of compressed to uncompressed file size in percent. The test corpus consisted of 46 files.

Speed results for the encoder optimizations (SSSE3)

Code: [Select]
        V 2.0     V 2.1     Improvement

-p0     360,19    388,27      7,80%
-p1     276,29    319,08     15,49%
-p2     192,64    229,08     18,92%
-p3      90,60    107,72     18,90%
-p4      50,30     57,32     13,96%
-p4m     22,28     25,58     14,81%
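
The Improvement column above appears to be the relative gain of the V 2.1 speed over the V 2.0 speed. A minimal Python sketch, assuming that definition (the function name is made up for illustration and is not part of TAK), recomputes the column from the two speed columns:

```python
def improvement_percent(old_speed: float, new_speed: float) -> float:
    """Relative speed gain of new_speed over old_speed, in percent."""
    return (new_speed / old_speed - 1.0) * 100.0


# Speeds as multiples of real time, copied from the table above: (V 2.0, V 2.1).
ssse3_results = {
    "-p0": (360.19, 388.27),
    "-p1": (276.29, 319.08),
    "-p2": (192.64, 229.08),
    "-p3": (90.60, 107.72),
    "-p4": (50.30, 57.32),
    "-p4m": (22.28, 25.58),
}

for preset, (v20, v21) in ssse3_results.items():
    print(f"{preset:5s} {improvement_percent(v20, v21):6.2f}%")
```

The same function can be applied to the 1 thread / 2 threads columns of the multi core table.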

You can find some user reports in the Beta 2 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85390&view=findpost&p=734097) thread.

Speed results for the multi core encoder
         
Code: [Select]
      1 Thread  2 Threads   Improvement

-p0     388,27    736,27     89,63%
-p1     319,08    613,12     92,15%
-p2     229,08    445,95     94,67%
-p3     107,72    212,32     97,10%
-p4      57,32    113,73     98,41%
-p4m     25,58     50,43     97,15%


Results for the LossyWav-codec

First a comparison of different codecs and LossyWav quality settings.

Compression efficiency for various LossyWav quality settings

Code: [Select]
         FLAC 1.2.1 TAK 2.0    TAK 2.1    Advantage over
         -8         -p2m       -p4m       FLAC       TAK 2.0

-q0.0    20,61      19,07      17,25       3,36       1,82
-q2.5    27,43      25,95      23,93       3,50       2,02
-q5.0    33,26      31,78      29,62       3,64       2,16
-q7.5    38,79      37,28      35,03       3,76       2,25

Sometimes LossyWav decides not to remove any or only very few bits from a file. Then it can happen that the LossyWav mode of the codec is less efficient than the standard mode. To test this i used a worst case scenario.
I compressed my test corpus with TAK's standard (-cStd) and LossyWav (-cLW) codec, but without prior processing with LossyWav.

Compression efficiency for unprocessed files

Code: [Select]
         TAK 2.1    TAK 2.1
         -cStd      -cLW       Loss

-p0      58,74      58,99      -0,25
-p1      57,84      57,73       0,11
-p2      56,90      57,00      -0,10
-p3      56,36      56,44      -0,08
-p4      56,02      56,06      -0,04
-p4m     55,88      55,97      -0,09

Since the presets of the 2 codecs are constructed slightly differently, they are not directly comparable. But i think it is safe to say that the average loss is usually not bigger than about 0.1 percent.
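
To check the "about 0.1 percent" estimate against the table above, the per-preset differences can simply be averaged. A small Python sketch using only the numbers from the table (the variable names are illustrative):

```python
# Compression in percent, copied from the table above: (-cStd, -cLW).
results = {
    "-p0": (58.74, 58.99),
    "-p1": (57.84, 57.73),
    "-p2": (56.90, 57.00),
    "-p3": (56.36, 56.44),
    "-p4": (56.02, 56.06),
    "-p4m": (55.88, 55.97),
}

# Positive penalty: the -cLW output is larger than the -cStd output.
penalties = {preset: lw - std for preset, (std, lw) in results.items()}
mean_penalty = sum(penalties.values()) / len(penalties)
worst_preset = max(penalties, key=penalties.get)

print(f"mean penalty: {mean_penalty:.3f} percentage points")
print(f"worst preset: {worst_preset} ({penalties[worst_preset]:.2f})")
```

With these six rows the mean stays a little below 0.1 percentage points, and -p0 is the worst case.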

Encoding and decoding speed are close to the standard codec, therefore i conducted no tests.

Beta testing

The beta version has already gone through extensive testing performed by my automatic scripts. But i had no opportunity to test encoding with more than 2 cores. Bugs are possible! If you want to help, please make sure to first compress, then decompress and finally compare the decompressed files with the original files. It may not be sufficient to use -v (Verify) and -md5 (MD5-creation and validation) to reveal multi core encoder errors!
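
For the final comparison step, any byte-exact file compare will do. As one possible approach, here is a minimal Python sketch that compares every original .wav with its decoded counterpart. It assumes the originals and the decoded files live in two separate folders under the same file names; the function names and the folder layout are hypothetical, and the encoding and decoding themselves are not shown.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(original_dir: Path, decoded_dir: Path) -> List[str]:
    """Names of .wav files whose decoded copy is missing or not byte-identical."""
    mismatches = []
    for original in sorted(original_dir.glob("*.wav")):
        decoded = decoded_dir / original.name
        if not decoded.is_file() or file_digest(original) != file_digest(decoded):
            mismatches.append(original.name)
    return mismatches
```

An empty list from `find_mismatches(Path("originals"), Path("decoded"))` would mean every round trip was byte-identical. Note that this compares whole files, so a decoder that rewrites the WAV header differently would be reported as a mismatch even if the audio data is unchanged.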

Please try the beta release and report any bugs in this thread.

I would also be happy about tests of compression efficiency and speed.

This will be the last beta release unless bug fixes are necessary.

Thanks for testing and have fun

Thomas
Title: TAK 2.1.0 - Beta release 3
Post by: Bugs.Bunny on 2010-12-11 23:34:28
Tested the Beta 3 with my Intel Core i7 920 (4 cores Hyperthreading -> 8 CPUs in Task-Manager)

4 threads p4m
Compression: 51.83 %
Duration: 43.87 sec
Speed: 77.66 * real time
The Task-Manager showed a CPU usage of ~52% while encoding.

Decompress:
Duration: 15.51 sec
Speed: 219.71 * real time

Bit compared the wav files (original vs encoded->decoded) and they were all 100% identical.
Title: TAK 2.1.0 - Beta release 3
Post by: alvaro84 on 2010-12-12 09:26:39
I've just tested the new multithreaded encoder with the 2 (65nm Conroe 4M) cores I have. I stored the source WAVs on a HDD and read them several times during the test so I'm pretty sure they were cached (it's a ~450MB test corpus, March over sky from dBu music, a Touhou remix album), the target was an SSD. My results are quite interesting:

1*1THD: Total encoding time: 1:04.771, 40.62x realtime (Single thread, single instance)
1*2THD: Total encoding time: 0:41.215, 63.83x realtime (Multithreaded, single instance - used the built-in multithreaded encoder)
2*2THD: Total encoding time: 0:46.160, 56.99x realtime (Multithreaded, two instances - used both the built-in and the foobar threading)
2*1THD: Total encoding time: 0:37.846, 69.52x realtime (Single thread, two instances - used -tn1 and two concurrent encoders at once)

It seems that in this particular case the old, external threading worked better than the new, built-in support. Note that it was not an I/O bound case, nothing depended on HDD head movement.
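
The four runs above are easier to compare as speed-up factors relative to the single thread, single instance run. A short Python sketch that derives them from the posted encoding times (the helper name is made up for illustration):

```python
def parse_time(text: str) -> float:
    """Convert a time printed as 'm:ss.mmm' into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


# Total encoding times copied from the list above.
runs = {
    "1*1THD": "1:04.771",
    "1*2THD": "0:41.215",
    "2*2THD": "0:46.160",
    "2*1THD": "0:37.846",
}

baseline = parse_time(runs["1*1THD"])
speedups = {name: baseline / parse_time(t) for name, t in runs.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x relative to 1*1THD")
```

The ordering matches the realtime figures in the post: two single-threaded instances come out ahead of one instance with two threads on this system.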
Title: TAK 2.1.0 - Beta release 3
Post by: Nowings69 on 2010-12-12 15:32:35
E7200 (2 threads SIMD SSE4.1)

TEST.wav uses P2 test mode with tak.exe

TAK 2.0.0(MMX) 
C: 56.94%           
D: 8.81sec           
S: 241.81x           


TAK 2.1.0(NONE)
C: 56.94%   
D: 19.98sec       
S: 107.17x       


TAK 2.1.0(MMX) 
C: 56.94%
D: 8.77sec 
S: 242.79x 


TAK 2.1.0(SSSE3)
C: 56.94% 
D: 7.58sec   
S: 281.13x 
 
TAK 2.1.0(SSSE3 -threads2)
C: 56.94%
D: 3.94sec
S: 540.70x


If takc.exe with fb2k
TAK 2.1.0(SSSE3) is same as tak.exe
But SSSE3 with 2 threads is 440x-480x

Anyway TAK's logo is nothing
I was going to design it like this
(http://img280.imagevenue.com/loc163/th_69187_01_122_163lo.jpg) (http://img280.imagevenue.com/img.php?image=th_69187_01_122_163lo.jpg)

but this is too cheap!
(http://img129.imagevenue.com/loc1058/th_69263_d_122_1058lo.jpg) (http://img129.imagevenue.com/img.php?image=th_69263_d_122_1058lo.jpg)
   
Title: TAK 2.1.0 - Beta release 3
Post by: Steve Forte Rio on 2010-12-12 18:04:13
Great update!

And now, can you add a switch for choosing which CPUs the encoder must use?
It can be useful for dual core processors with hyperthreading technology (like Core i) - to prevent using of two threads of one physical CPU


Here are some strange results for my Core i3 530 (2.93 GHz, 2 cores, 4 threads):

Code: [Select]
TAK 2.1.0 beta 3 -p4m -ihs

HT-on, -tn1  - 31.04x realtime

HT-on, -tn4  - 61.63x realtime

HT-on, -tn2  - 42.42x realtime

HT-off, -tn2  - 57.35x realtime


Look how big the difference is between the results for enabled and disabled Hyper-Threading...
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-13 04:47:56
i3 350M 2.26G Hyper-Threading running TAK 2.1.0 beta 3 with SSSE3

======================================================================================

Thread 4, Preset P2

CYNTHIA HARRELL - I AM THE WIND.wav  61.69%  466*

Compression:    61.69 %
Duration:        0.60 sec
Speed:          463.38 * real time

------------------------------------------------------------------------------------------------------

Thread 4, Preset P4M + MD5

CYNTHIA HARRELL - I AM THE WIND.wav  60.81%  56*

Compression:    60.81 %
Duration:        4.94 sec
Speed:          56.28 * real time

===================================================================================

Thread 1 , Preset P2

CYNTHIA HARRELL - I AM THE WIND.wav  61.69%  191*

Compression:    61.69 %
Duration:        1.45 sec
Speed:          191.06 * real time

------------------------------------------------------------------------------------------------------

Thread 1 , Preset P4M + MD5

CYNTHIA HARRELL - I AM THE WIND.wav  60.81%  23*

Compression:    60.81 %
Duration:        11.89 sec
Speed:          23.36 * real time

===================================================================================

This is amazing. Thanks for your hard working Thomas, this one is really awesome.

BUT, I wonder if the final release can detect multi-threading and SSSE3 automatically, and is able to save a user preset profile? It is very annoying because I have to change the OPTION + PRESET every time I run TAK. Please make these possible.

Sorry for my bad English.
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-13 16:01:18
Thank you all for testing!

4 threads p4m
The Task-Manager showed a CPU usage of ~52% while encoding.

Seems as if -p4m here is already too fast to take advantage of more than 2 threads. Maybe it's time to make -p4m a bit stronger (and slower)...

1*2THD: Total encoding time: 0:41.215, 63.83x realtime (Multithreaded, single instance - used the built-in multithreaded encoder)
2*1THD: Total encoding time: 0:37.846, 69.52x realtime (Single thread, two instances - used -tn1 and two concurrent encoders at once)

It seems that in this particular case the old, external threading worked better than the new, built-in support. Note that it was not an I/O bound case, nothing depended on HDD head movement.

If takc.exe with fb2k
TAK 2.1.0(SSSE3) is same as tak.exe
But SSSE3 with 2 threads is 440x-480x

Well, that's not optimal. But i am unsure, if i can improve it. There can be so much interaction if 2 programs (TAK and FOOBAR) are running simultaneously. I may try one or two modifications, but not for this release.

TAK 2.1.0(NONE)
S: 107.17x       

TAK 2.1.0(MMX) 
S: 242.79x

Interesting. It's been quite some time since i evaluated the speed of NONE. NONE is using plain pascal code, which is not even really optimized. It mainly serves as a reference to check the assembler optimizations against.

Anyway TAK's logo is nothing.

Which logo? TAK still lacks one...

And now, can you add a switch for choosing which CPUs the encoder must use?
It can be useful for dual core processors with hyperthreading technology (like Core i) - to prevent using of two threads of one physical CPU
...
Look how big the difference is between the results for enabled and disabled Hyper-Threading...

I am just working on it, but i think you will have to wait until V2.1.1 for a solution. That because a lot of testing on different systems may be required.

BUT, I wonder if the final release can detect multi-threading and SSSE3 automatically, and is able to save a user preset profile? It is very annoying because I have to change the OPTION + PRESET every time I run TAK. Please make these possible.

If available, SSSE3 is selected automatically. I am unsure if the same should be done with multi-threading. Whether and how many threads are advantageous may depend on several system and setup specific factors, which TAK can't determine.

Currently i don't intend to implement a configuration file. Maybe later. Can you possibly use foobar for encoding?
Title: TAK 2.1.0 - Beta release 3
Post by: Steve Forte Rio on 2010-12-13 19:01:28
Sorry, can't find the answer here.

is it possible to disable SSSE3 optimization for encoding?
Title: TAK 2.1.0 - Beta release 3
Post by: alvaro84 on 2010-12-13 20:23:40
Well, that's not optimal. But i am unsure, if i can improve it. There can be so much interaction if 2 programs (TAK and FOOBAR) are running simultaneously. I may try one or two modifications, but not for this release.


Now that you say - it makes perfect sense. Maybe I should test it outside foobar? But this is the 'real life' way I use TAK
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-13 20:32:03
Sorry, can't find the answer here.

is it possible to disable SSSE3 optimization for encoding?

If you are referring to the command line version: Good question! And i have to apologize... There is no switch to control cpu optimizations. It must have vanished sometime. If it ever existed, i am not really sure. But surely i could add such a switch.

Well, i have to take stock of myself (a new item from my dictionary), to decide if i want to release another beta, which addresses some of the more recent user demands. Unfortunately too many betas may create a bad impression for irregular readers. I have to think about it.
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-13 20:45:18
Well, that's not optimal. But i am unsure, if i can improve it. There can be so much interaction if 2 programs (TAK and FOOBAR) are running simultaneously. I may try one or two modifications, but not for this release.

Now that you say - it makes perfect sense. Maybe I should test it outside foobar? But this is the 'real life' way I use TAK

I am not sure, but possibly your setting already provides optimum performance. TAK itself most likely will never process multiple files simultaneously, but maybe exactly this leads to maximum performance on many systems. Probably a bit disappointing for you. Possibly TAK's multi-core-encoding is only advantageous for less sophisticated users (i am sure, there are many, i myself can count me among them when using some other applications), who don't know how to set up foobar for optimum performance.
Title: TAK 2.1.0 - Beta release 3
Post by: Bugs.Bunny on 2010-12-13 20:47:14
Seems as if -p4m here is already too fast to take advantage of more than 2 threads. Maybe it's time to make -p4m a bit stronger (and slower)...

A bit stronger -p4m would be interesting. But also it would be interesting to have an option for 8 threads. Since my CPU has got 4 hyperthreading cores = 8 threads it would be interesting to see how much 8 threads would raise the CPU usage on my system.
I've seen some multithreading applications that nearly fill my system to 100%.

Btw. I've got two SSDs installed in my PC: an Intel SLC drive (System) and a Vertex MLC (Data). For my encoding/decoding test I did read the files from one drive and write them to the other. The drives should not be a limiting factor...

Tried P2 and P0 with 4 threads now also:
P2
Compression:    52.82 %
Duration:        5.45 sec
Speed:          625.49 * real time

P0
Compression:    54.15 %
Duration:        3.67 sec
Speed:          929.00 * real time
Title: TAK 2.1.0 - Beta release 3
Post by: Nowings69 on 2010-12-13 23:27:54
Which logo? TAK still lacks one...

This one?
(http://img229.imagevenue.com/loc548/th_82399_01_122_548lo.jpg) (http://img229.imagevenue.com/img.php?image=th_82399_01_122_548lo.jpg)

Thanks always nice one to all

Title: TAK 2.1.0 - Beta release 3
Post by: alvaro84 on 2010-12-14 00:48:27
I am not sure, but possibly your setting already provides optimum performance. TAK itself most likely will never process multiple files simultaneously, but maybe exactly this leads to maximum performance on many systems. Probably a bit disappointing for you. Possibly TAK's multi-core-encoding is only advantageous for less sophisticated users (i am sure, there are many, i myself can count me among them when using some other applications), who don't know how to set up foobar for optimum performance.


I think that processing one single file via multiple threads could be better in the aforementioned 'real life' circumstances. The way I measured was optimized to show the speed differences in the runs, but real media is rarely cached, especially when there's more of it. Processing two (or more!) files on HDDs simultaneously would lead to more head movement and therefore could be much slower. I experienced that merely 2 encoders reading and writing concurrently on a single hard drive can be a serious bottleneck, and it can be effectively alleviated by processing the same file in parallel so the encoder can do linear reads/writes. I'll do a test on uncached data if someone doesn't beat me to it (which can be quite easy as I probably won't turn on my home PC until Friday evening, and even then I may not run any tests given the bad sleep deprivation I'll have by then.)

It seems there's room to improve my test methods too
We need to eliminate the drives as limiting factors to measure what the encoder is capable of - yet we need to have that bottleneck to see the usefulness of the changes you just made. Two different things.

(Yup, I'd really like to go all SSD, they're cool and silent, but even if terabyte SSDs exist (do they?), they cost an arm and a leg  So HDDs are still a reality.)
Title: TAK 2.1.0 - Beta release 3
Post by: hellokeith on 2010-12-14 07:46:18
Thomas,

Anything specifically you would like to see tested on 4 cores (no HT)? I have no idea about tak switches, but I'm willing to carry out any test you seek, with your instruction.  I've many albums in lossless format I can convert to wav and then test with tak.
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-14 11:38:02
Thanks for the answer, but I'd prefer the application rather than foobar/winamp because I don't know whether the parameters are correct, and I don't know where to find a complete manual about TAKC parameters. (sorry)

EDIT: The x264 encoder is able to determine the thread count automatically if you use the parameter "--threads auto", and the number of threads can be set with "--threads X" (X = number of threads).

EDIT2: Maybe you can add two command line parameters: "-mmx" to enable MMX support, and "-ssse3" to enable SSSE3 support.

Moderation: removed unnecessary large quote.
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-14 12:24:01
Phenom II x4 905e 2.2GHz, running application.

=========================================================

Thread 4 Preset P2

Priere -プリエール-.wav              59.80%  290*

Compression:    59.80 %
Duration:        1.31 sec
Speed:          288.66 * real time

--------------------------------------------------------------------------

Thread 1 Preset P2

Priere -プリエール-.wav              59.80%  52*

Compression:    59.80 %
Duration:        7.23 sec
Speed:          52.16 * real time

=========================================================

Thread 4 P4M

Priere -プリエール-.wav              58.97%  66*

Compression:    58.97 %
Duration:        5.69 sec
Speed:          66.27 * real time

--------------------------------------------------------------------------

Thread 1 P4M

Priere -プリエール-.wav              58.97%    8*

Compression:    58.97 %
Duration:        47.75 sec
Speed:            7.89 * real time

=========================================================

My friend's AMD quad-core processor gives a speed-up of more than 400% at preset 2 and more than 800% at preset 4m. Is there anything wrong??
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-14 16:18:19
And now, can you add a switch for choosing which CPUs the encoder must use?
It can be useful for dual core processors with hyperthreading technology (like Core i) - to prevent using of two threads of one physical CPU
...
Look how big the difference is between the results for enabled and disabled Hyper-Threading...

I am just working on it, but i think you will have to wait until V2.1.1 for a solution. That because a lot of testing on different systems may be required.

After finishing the code i thought a bit more about the issue (wrong order: code first then ask...). And i decided not to add an option to let the user select cores, or to let the codec automatically select only physical cores.

Let me explain why:

First: It's no surprise that Hyperthreading doesn't work well for the codec. Hyperthreading can only be advantageous if the threads are doing different things, or more specifically using different execution units of the physical core (for instance integer, floating point or MMX calculations). But the codec threads will most of the time require the same execution ports. So you only get a penalty for the overhead of the hyperthreading management.

You could use Windows' SetProcessAffinityMask() function to restrict TAK to only physical cores (that's what my new code would do). But this is generally considered bad practice. One reasonable example: Two different applications (for instance TAK and foobar) are using the same method. Both would then be bound to the same physical cpus although they possibly are using different execution units and would benefit from hyperthreading.
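
As a rough illustration of what restricting a process to physical cores involves: the affinity mask is a bit field with one bit per logical CPU. The Python sketch below only builds such a mask and is not TAK's code. It assumes that the hyperthreads of each core are numbered consecutively (0/1, 2/3, ...), which is not guaranteed on every system, and the function name is hypothetical.

```python
def physical_core_mask(logical_cpus: int, threads_per_core: int = 2) -> int:
    """Bit mask selecting the first logical CPU of every physical core.

    Assumption: sibling hyperthreads have consecutive logical CPU indices.
    """
    if logical_cpus <= 0 or threads_per_core <= 0:
        raise ValueError("both arguments must be positive")
    mask = 0
    for cpu in range(0, logical_cpus, threads_per_core):
        mask |= 1 << cpu  # set the bit of the first thread of this core
    return mask


# Example: 4 cores with Hyperthreading = 8 logical CPUs.
print(bin(physical_core_mask(8)))
```

On Windows, a value like this is what would be passed as the mask argument of SetProcessAffinityMask(). The drawback described above still applies: two applications that both pin themselves this way end up competing for the same logical CPUs.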

Ok, a quite sophisticated user who knows what he is doing would choose a proper configuration to optimize his system for this situation. But what about the average user? I am sure that as soon as i add a '-UseOnlyPhysicalCpus'-switch you will find recommendations to tune TAK for maximum performance by applying this switch. And sometimes this may be bad advice.

Therefore i am currently quite reluctant to implement such a switch.

Sorry, can't find the answer here.

is it possible to disable SSSE3 optimization for encoding?

If you are referring to the command line version: Good question! And i have to apologize... There is no switch to control cpu optimizations. It must have vanished sometime. If it ever existed, i am not really sure. But surely i could add such a switch.

Well, i have to take stock of myself (a new item from my dictionary), to decide if i want to release another beta, which addresses some of the more recent user demands. Unfortunately too many betas may create a bad impression for irregular readers. I have to think about it.

I have added a switch to control the cpu optimizations.

By doing so i found a (non-fatal) bug in the command line version:

Some part of the encoder comes in two versions: one for single core, one for multi core encoding. If you specify -tn1, the single core version is used, otherwise the multi core version. But if you don't specify -tn# at all, Takc will use the multi core version with only one thread.

On systems with more than 1 core this setting will often be faster than the true single core version. Therefore speed comparisons of 'takc -tn1' vs. 'takc -tn2 (or more)' are valid, but 'takc' vs. 'takc -tn2 (or more)' are not. On my system the difference is about 3 percent.

This is also relevant for comparisons of V2.0 and V2.1. If you have more than one core and don't specify -tn1, the speed advantage of for instance the new SSSE3 optimizations may be overestimated.

Probably i will soon release a Beta 3b.

A bit stronger -p4m would be interesting. But also it would be interesting to have an option for 8 threads. Since my CPU has got 4 hyperthreading cores = 8 threads it would be interesting to see how much 8 threads would raise the CPU usage on my system.
I've seen some multithreading applications that nearly fill my system to 100%.

As i explained above, hyperthreading isn't advantageous for the encoder. And you can only achieve such a high cpu usage if neither disk io nor memory access are limiting factors.

I'll do a test on uncached data if someone doesn't beat me to it (which can be quite easy as I probably won't turn on my home PC until Friday evening, and even then I may not run any tests given the bad sleep deprivation I'll have by then.)

Please take care of yourself 

Anything specifically you would like to see tested on 4 cores (no HT)? I have no idea about tak switches, but I'm willing to carry out any test you seek, with your instruction.  I've many albums in lossless format I can convert to wav and then test with tak.

Thank you for the offer. Please wait for the Beta 3b release.
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-15 00:03:45
The beta 3 b has been released. Please see the first post for details.
Title: TAK 2.1.0 - Beta release 3
Post by: Nowings69 on 2010-12-15 14:24:51
E7200 (2 threads SIMD SSE4.1)

TEST.wav uses takc.exe -p2 -ihs with fb2k

TAK2.0.0

Compression:    64.30 %
Duration:        18.46 sec
Speed:          211.88 * real time


TAK2.1.0 beta 3 b(NONE)

Compression:    64.30 %
Duration:        38.09 sec
Speed:          102.66 * real time


TAK2.1.0 beta 3 b(MMX)

Compression:    64.30 %
Duration:        18.25 sec
Speed:          214.34 * real time


TAK2.1.0 beta 3 b(SSSE3)

Compression:    64.30 %
Duration:        15.76 sec
Speed:          248.10 * real time


TAK2.1.0 beta 3 b(SSSE3 -threads 2)

Compression:    64.30 %
Duration:        10.96 sec
Speed:          356.71 * real time


TAK2.1.0 beta 3 b(default)

Compression:    64.30 %
Duration:        16.14 sec
Speed:          242.32 * real time


TAK2.1.0 beta 3 b(default -threads 2)

Compression:    64.30 %
Duration:        9.76 sec
Speed:          400.70 * real time
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-16 00:02:55
Thanks again for testing the beta.

I would like to release TAK 2.1 final by Sunday. Then my mind will be free to deal with something else. Well, not necessarily with something totally different, but with another aspect.

Therefore i am asking again for bug reports.

BTW: Has anyone tried the new LossyWav-codec? No need to try it, if you don't use LossyWav at all. I am just curious, if this codec is of any value for somebody.
Title: TAK 2.1.0 - Beta release 3
Post by: hellokeith on 2010-12-16 07:35:47
Anything specifically you would like to see tested on 4 cores (no HT)? I have no idea about tak switches, but I'm willing to carry out any test you seek, with your instruction.  I've many albums in lossless format I can convert to wav and then test with tak.

Thank you for the offer. Please wait for the Beta 3b release.


Quote from: TBeck
The beta 3 b has been released. Please see the first post for details.


Thomas,

Do you have something specific you wanted tested?
Title: TAK 2.1.0 - Beta release 3
Post by: Nowings69 on 2010-12-16 09:54:07
BTW: Has anyone tried the new LossyWav-codec? No need to try it, if you don't use LossyWav at all. I am just curious, if this codec is of any value for somebody.


I usually choose LossyFLAC for LPCM with Video in Matroska
and it is better for portable(e.g.ROCKBOX) too.
Because I dont know how to play it with LossyTAK right now.
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-16 15:12:57
Thanks again for testing the beta.

I would like to release TAK 2.1 final by Sunday. Then my mind will be free to deal with something else. Well, not necessarily with something totally different, but with another aspect.

Therefore i am asking again for bug reports.

BTW: Has anyone tried the new LossyWav-codec? No need to try it, if you don't use LossyWav at all. I am just curious, if this codec is of any value for somebody.

Well, I'd prefer MP3 rather than other LossyCodec, because my portable devices don't support any Lossless Codecs or other LossyCodec besides LossyWav. I don't think it's necessary to have a lossyTAK, because no handheld devices support it. This is only my personal opinion.
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-16 22:32:02
Do you have something specific you wanted tested?

Thank you! Regarding the speed optimizations, everything sensible has already been evaluated. As i wrote in the initial post, verification of the proper codec function is always useful: "If you want to help, please make sure to first compress, then decompress and finally compare the decompressed files with the original files. It may not be sufficient to use -v (Verify) and -md5 (MD5-creation and validation) to reveal multi core encoder errors!" Maybe you could do this with your favourite preset.

There are usually more and more interesting testing opportunities when i introduce new codec features which have to be tuned for optimum performance.

I usually choose LossyFLAC for LPCM with Video in Matroska
and it is better for portable(e.g.ROCKBOX) too.
Because I dont know how to play it with LossyTAK right now.

Well, I'd prefer MP3 rather than other LossyCodec, because my portable devices don't support any Lossless Codecs or other LossyCodec besides LossyWav. I don't think it's necessary to have a lossyTAK, because no handheld devices support it. This is only my personal opinion.

I was aware of those restrictions, but thought, there would be some users using LossyWav only for archiving purposes, where the lack of hardware support wouldn't matter.

I think i will remove the dedicated LossyWav codec from TAK 2.1. It can easily be added again when there is some demand. But currently i don't want to add such a quite complex feature that has not been tested by anyone else but me. Nevertheless the development of the codec was a quite interesting task for me.
Title: TAK 2.1.0 - Beta release 3
Post by: hellokeith on 2010-12-17 03:27:20
Dream Theater - Octavarium
8 tracks encoded with tak 2.1.0 beta 3
Intel Core i5 (quad, no ht) 750 @ 2.67GHz

-pMax

wav        %tak    -tn1  -tn2  -tn3  -tn4
89183180   66.48%  35*   59*   85*   94*
58760060   59.97%  35*   61*   83*   90*
80443148   66.60%  34*   59*   86*   86*
47510444   70.03%  33*   58*   82*   88*
86969948   70.66%  35*   62*   86*   95*
71707820   71.01%  34*   59*   86*   91*
113418188  67.09%  34*   60*   84*   90*
254018348  63.70%  35*   60*   87*   92*

wav        %tak    -tn1   -tn2   -tn3   -tn4
x          66.29   34.52  59.90  85.37  91.22

Amazing speed improvements with the multi-threading!
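To put a number on the scaling, here is a small Python sketch that turns the summary row above into speed-up factors relative to -tn1. The function name `scaling` and the "efficiency" figure (speed-up divided by thread count) are just illustrative choices made for this sketch, not something TAK reports.

```python
# Average encoding speeds (times realtime) from the summary row above,
# keyed by the number of threads given to -tn.
avg_speed = {1: 34.52, 2: 59.90, 3: 85.37, 4: 91.22}

def scaling(speeds):
    """Return {threads: (speedup_vs_1_thread, speedup_per_thread)}."""
    base = speeds[1]
    result = {}
    for threads, speed in sorted(speeds.items()):
        speedup = speed / base
        result[threads] = (speedup, speedup / threads)
    return result

for threads, (speedup, efficiency) in scaling(avg_speed).items():
    print(f"-tn{threads}: {speedup:.2f}x speed-up, {efficiency:.0%} per-thread efficiency")
```

With these figures the speed-up is roughly 1.7x, 2.5x and 2.6x for 2, 3 and 4 threads, so the fourth thread adds comparatively little on this quad core.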
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-17 04:32:50
Well, I'd prefer MP3 over other lossy codecs, because my portable devices don't support any lossless codecs or other lossy codecs besides LossyWav. I don't think it's necessary to have a lossy TAK, because no handheld devices support it. This is only my personal opinion.

I was aware of those restrictions, but thought there would be some users using LossyWav only for archiving purposes, where the lack of hardware support wouldn't matter.

I think I will remove the dedicated LossyWav codec from TAK 2.1. It can easily be added again when there is some demand. But currently I don't want to add such a complex feature that has not been tested by anyone but me. Nevertheless, the development of the codec was quite an interesting task for me.

Maybe you can release a separate build intended only for LossyWav archiving. If users want a LossyTAK compressor, they can join the beta test to help you with debugging or something. Since terabyte-capacity hard disks are really cheap now, most TAK users want a lossless TAK compressor. I think this would be better than completely removing the LossyWav codec and then adding it again when required. And sorry that I can't help testing TAK 2.1.0 Beta 3 b, because I'm not at home this week.

Good luck, I can't wait to try the TAK 2.1.0 final release.
Title: TAK 2.1.0 - Beta release 3
Post by: Alexxander on 2010-12-17 13:35:33
Using the same fileset as mentioned in my earlier post in beta 2 thread (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=85390&view=findpost&p=734556) (10.3GB wav format, encoded by foobar2000, reading from SSD, writing to modern hard drive and CPU Intel duo core E7600), I got these results:
   
Code: [Select]
    
TAKv210b3    TAKv210b3    TAKv210b3  TAKv210b3    TAKv210b3
-p4m         -p4m         -p4m       -p4m         -p4m
1 process    2 processes  1 process  2 processes  4 processes
-tn1         -tn1         -tn2       -tn2         -tn1
34:06.467    18:31.429    23:51.730  22:41.264    18:16.874
30.66x       56.46x       43.83x     46.10x       57.21x

Number of processes is controlled by foobar2000 Preferences>Tools>Converter>Thread count
Any available instruction set is used.
Published data is a result of one run (but seems reasonable).

I have done several other smaller tests with -p4m and always come to the same conclusion: encoding with 2 processes + a single thread per process is much faster than one process + 2 threads per process.

I suspect that with faster encoding settings the conclusion will be different. If I find time during the next days, I might try -p0.
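For anyone who wants to compare the wall-clock times from the table directly, a minimal Python sketch follows. The helper name `to_seconds` is made up for this example; the time stamps are the ones posted above.

```python
def to_seconds(stamp):
    """Convert 'm:ss.mmm' or 'h:mm:ss.mmm' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        # Each colon-separated field is worth 60 times the next one.
        seconds = seconds * 60 + float(part)
    return seconds

# -p4m runs on the dual core, taken from the table above.
two_processes_one_thread = to_seconds("18:31.429")
one_process_two_threads = to_seconds("23:51.730")

ratio = one_process_two_threads / two_processes_one_thread
print(f"1 process x 2 threads took {ratio:.2f} times as long as 2 processes x 1 thread")
```

On these numbers the single process with two threads needs a bit under 1.3 times as long as two single-threaded processes.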
Title: TAK 2.1.0 - Beta release 3
Post by: Alexxander on 2010-12-17 13:59:15
Quick test of -p0:

Code: [Select]
TAKv210b3
1 process  1 process  2 processes
-p0 -tn1   -p0 -tn2   -p0 -tn1
3:10.072   3:13.691   3:39.758
330.20x    324.03x    285.59x

As opposed to -p4m, the -p0 compression setting doesn't use the full CPU power; the load goes up and down. The bottleneck for -p4m is the CPU, and for -p0 it's the drive speed. This is why one process with 2 threads runs faster than 2 processes with one thread (as opposed to the -p4m setting).
Title: TAK 2.1.0 - Beta release 3
Post by: alvaro84 on 2010-12-18 11:13:40
I tried to run an unusual test for you. I'll do it soon, but before I do, here is a problem that stopped me from doing the test.
Your encoder doesn't seem to be able to handle some paths (I've experienced this before, but forgot to report it).
The ones I am trying to convert now are in a folder tree, downloaded to my pendrive (on my brother's computer); they are single-file TTA albums with paths such as these (try them):

Quote
L:/Touhou lossless music collection/[dBu music]/2005.05.04 [DBCD-0001] 弾奏結界 紅魔狂詩曲 Scarlet Rapsodia [例大祭2]/CDImage.cue
L:/Touhou lossless music collection/[dBu music]/2005.05.04 [DBCD-0002] 弾奏結界 幻葬旋律曲 Necromanza [例大祭2]/CDImage.cue
L:/Touhou lossless music collection/[dBu music]/2005.05.04 [DBCD-0004] 弾奏結界 追憶鎮魂曲 Nostalgic Requiem [例大祭2]/CDImage.cue
L:/Touhou lossless music collection/[dBu music]/2005.12.30 [DBCD-0005] 深弾奏結界 散華嬉遊曲 Flower Divertimento [C69]/CDImage.cue
L:/Touhou lossless music collection/[dBu music]/2007.05.20 [DBCD-0009] 絶弾奏結界 兎角宴舞曲 courante impromptu [例大祭4]/dbu Music - 絶弾奏結界 兎角宴舞曲 courante impromptu.cue
L:/Touhou lossless music collection/[dBu music]/2007.12.31 [DBCD-0010] 風弾奏結界 神交風雅曲 Oratario del Vento [C73]/Audio CD.cue


And their .tta counterparts, respectively.
(These will be my test subjects in the next test, weighing almost 2GB, definitely too much to entirely cache on my rig w/ 2GB RAM).

Edit: it seems the problem is the file creation part, because I renamed the folders to lack kanji, yet when I tried to convert them to subfolders by album titles, TAK encoder exited again.
Title: TAK 2.1.0 - Beta release 3
Post by: alvaro84 on 2010-12-18 12:06:35
OK, so here's the test itself. I hoped to incorporate I/O times in the test by exceeding the volume of a single album. I even copied the files to the HDD after I got path errors converting from pendrive. This way I was really satisfied and hoped to have created a favorable track for the built-in threading... but it all proved to be an own goal...
The reason is: I left the source files as they were (single file per album TTAs), and this format poses another bottleneck. This made 2-thread reading from a (quite fresh and new, formatted a few weeks ago...) hard drive a non-issue. The issue it brought up was the slow TTA decoding (~168x realtime on this machine), which is done on a single thread when I use 1 instance of the TAK encoder... so the results totally contradict my expectations.

2 instances, 1 thread each - Total encoding time: 4:19.445, 58.66x realtime
1 instance, 2 threads - Total encoding time: 5:55.745, 42.78x realtime

Anyway, doing 2 reads on a single HDD when the target drive is not the same (hardware) doesn't limit anything compared to TTA decoding. It seems that the only scenario where I could make the built-in threading win would be converting uncompressed data within one single HDD (I seldom convert from a drive to itself, exactly because it imposes a serious I/O limit on converting, and especially on muxing video), and preferably with more threads than the core count of my current Core 2 Duo. And I still say that built-in threading is a good thing. I may be hopeless. Or I may be thinking of the future, with massively multicore CPUs (like 8-core Bulldozers).

I still owe you an I/O-independent test between TAK 2.0 and 2.1, on an AMD (2-core, low-power 2.5GHz Brisbane K8) based computer - it's right here in the next room. But if there is no difference apart from the SSSE3 optimization, there's no point in doing this one.
Title: TAK 2.1.0 - Beta release 3
Post by: sPeziFisH on 2010-12-19 13:00:32
wow, damn brutal speed optimizations, Thomas   

Intel Core i3 350M  - cores:2 threads:4 HT SSSE3 (http://img696.imageshack.us/i/intelcorei3350m.png/)

using ramdisk and -p2 (sufficient for me; the space benefit of -p3 and -p4 is too small considering the higher compression times)

Code: [Select]
thread:1 ssse3
Duration:        23.43 sec
Speed:          169.29 * real time

Code: [Select]
thread:4 ssse3
Duration:        10.08 sec
Speed:          393.27 * real time



Code: [Select]
thread:4 none
Duration:        25.09 sec
Speed:          158.08 * real time
Code: [Select]
thread:4 mmx
Duration:        11.99 sec
Speed:          330.86 * real time


Bugs are possible! If you want to help, please make sure to first compress, then decompress and finally compare the decompressed files with the original files. It may not be sufficient to use -v (Verify) and -md5 (MD5-creation and validation) to reveal multi core encoder errors!

I don't get the point: shouldn't MD5 creation with an external application (like nirsoft.net - HashMyFiles (http://www.nirsoft.net/utils/hash_my_files.html)) be sufficient?

greetings to the good ol wueste 
Title: TAK 2.1.0 - Beta release 3
Post by: krafty on 2010-12-22 19:47:48
Thanks TBeck, for this brilliant codec.
Just want to express my desire to see this released for Linux based systems!
Remember us!
Title: TAK 2.1.0 - Beta release 3
Post by: Steve Forte Rio on 2010-12-26 19:07:46
Hello!
just finished testing TAK 2.0.3 b. Here are my results

PC configuration:
Core i3 530 2.93 GHz
2x2 GB DDR3-1333
SATA II HDD 500 GB Hitachi (source), SATA II HDD 1 TB Hitachi (destination)
Windows 7 32-bit

Audio:
File size : 804MB (843 702 428 bytes)
Duration : 1:19:42.893 (210925596 samples)
Sample rate : 44100 Hz
Channels : 2
Bits per sample : 16
Bitrate : 1411 kbps
Codec : PCM
Encoding : lossless
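The figures in this file description are consistent with each other, which can be checked with a bit of arithmetic. The Python sketch below is only a sanity check of the posted numbers; the variable names are made up, and the remark about the header assumes a plain canonical PCM WAV file.

```python
# Values copied from the file description above.
file_size_bytes = 843_702_428
samples = 210_925_596          # sample frames (one per channel pair)
sample_rate = 44_100           # Hz
channels = 2
bits_per_sample = 16

# Raw PCM payload: every sample frame holds one value per channel.
data_bytes = samples * channels * bits_per_sample // 8
header_bytes = file_size_bytes - data_bytes   # what is left for the WAV header

# Uncompressed bitrate in bits per second.
bitrate_bps = sample_rate * channels * bits_per_sample

# Duration, done in integer milliseconds to avoid rounding surprises.
total_ms = samples * 1000 // sample_rate
hours, rest = divmod(total_ms, 3_600_000)
minutes, rest = divmod(rest, 60_000)
seconds, millis = divmod(rest, 1000)
duration = f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"

print(data_bytes, header_bytes, bitrate_bps, duration)
```

The 44 bytes left over match the size of a minimal canonical WAV header, and the bitrate of 1,411,200 bit/s is the "1411 kbps" shown above.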

Encoding: foobar2000 1.1.1, tak -e -<x> -tn1 -ihs - %d
Decoding: foobar2000 1.1.1 with foo_benchmark (high priority, buffer entire file into memory, 5 passes)

(http://audiophilesoft.ucoz.ua/misc/tak203b.png)
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2010-12-28 04:58:13
Before TAK compression.

Hash my file==> CYNTHIA HARRELL - I AM THE WIND.wav   MD5:1c27a7dab07a51a276acf5dc644bc0ad   SHA1:c17030bbae306f974c9fe7c4c74048c33e70db22   CRC32:ce752a4a

--------------------------------------------------------------------------------------------

After TAK compress (4 Threads + SSSE3 + MD5 + Verify + P4M)

File Info==>CYNTHIA HARRELL - I AM THE WIND.tak    MD5:641729292fb0ffb5b0cd95b3f243c9b0

Hash my file==>CYNTHIA HARRELL - I AM THE WIND.tak   MD5:e4df01999fb3b67a67c9bb850c480a34   SHA1:545e11cd6991c3fcb70cb460edd4302739f890b1   CRC32:6595eb49

--------------------------------------------------------------------------------------------

After Tak decompress

Hash my file==>CYNTHIA HARRELL - I AM THE WIND.wav   MD5:1c27a7dab07a51a276acf5dc644bc0ad   SHA1:c17030bbae306f974c9fe7c4c74048c33e70db22   CRC32:ce752a4a

--------------------------------------------------------------------------------------------

After TAK compress 2nd time (1 Thread + None + P2)


Hash my file==>CYNTHIA HARRELL - I AM THE WIND.tak   MD5:9248dfa820c5c800c4a4531fd1627a7c   SHA1:253a2f613fc0272e87cc57edfa70ce55383bb089   CRC32:22ea9bca

--------------------------------------------------------------------------------------------

After TAK decompress 2nd time

Hash my file==>CYNTHIA HARRELL - I AM THE WIND.wav   MD5:1c27a7dab07a51a276acf5dc644bc0ad   SHA1:c17030bbae306f974c9fe7c4c74048c33e70db22   CRC32:ce752a4a

--------------------------------------------------------------------------------------------

It looks like there is no error when compressing files with SSSE3 + multi-threading. Hope this helps.
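The same whole-file checksums can also be produced without a GUI tool. The following Python sketch uses only the standard library; the function name `file_digests` is made up for illustration. Like the hash tool used above, it hashes the complete file, so it is meant for comparing the original WAV against the decompressed WAV, not for reproducing the MD5 that TAK stores in its own file info.

```python
import hashlib
import zlib

def file_digests(path, chunk_size=1 << 20):
    """Return MD5, SHA-1 and CRC32 of a whole file as lowercase hex strings."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large WAV files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": f"{crc & 0xFFFFFFFF:08x}",
    }
```

If `file_digests("original.wav") == file_digests("decoded.wav")` (both names are placeholders), the round trip reproduced the file exactly.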

EDIT: Added another test.
Title: TAK 2.1.0 - Beta release 3
Post by: TBeck on 2010-12-29 20:57:04
Thank you all for testing!

...
Amazing speed improvements with the multi-threading!

Cool!


I think I will remove the dedicated LossyWav codec from TAK 2.1. It can easily be added again when there is some demand. But currently I don't want to add such a complex feature that has not been tested by anyone but me. Nevertheless, the development of the codec was quite an interesting task for me.

Maybe you can release a separate build intended only for LossyWav archiving. If users want a LossyTAK compressor, they can join the beta test to help you with debugging or something. Since terabyte-capacity hard disks are really cheap now, most TAK users want a lossless TAK compressor. I think this would be better than completely removing the LossyWav codec and then adding it again when required.

Well, I have removed the new dedicated LossyWav codec for now.

I don't intend to release a separate build including this codec for several reasons (only the most important ones listed):

- Although I trust my comprehensive automatic test scripts, I know that external tests are required to feel really safe. If nobody is testing the new codec, my personal quality standards have not been met.
- Maybe it was generally a bad idea to add such a codec to TAK. Especially if it had become widely used! Maybe TAK would then get the reputation that it cannot be trusted, because too much lossy content is being published with it. Since you can never be sure whether the source of a lossless encode isn't lossy, this is logically and practically bullshit, but it is easy to get a bad reputation.

I have done several other smaller tests with -p4m and always come to the same conclusion: encoding with 2 processes + a single thread per process is much faster than one process + 2 threads per process.

Maybe I have to tune the I/O part of the multi-threaded encoder to make it competitive with foobar's simultaneous multiple-file encoding...

I tried to run an unusual test for you. I'll do it soon, but before I do, here is a problem that stopped me from doing the test.
Your encoder doesn't seem to be able to handle some paths (I've experienced this before, but forgot to report it).
....
Edit: it seems the problem is the file creation part, because I renamed the folders to lack kanji, yet when I tried to convert them to subfolders by album titles, TAK encoder exited again.

Maybe you just noticed that TAK currently doesn't support Unicode character sets? Sorry...

Anyway, doing 2 reads on a single HDD when the target drive is not the same (hardware) doesn't limit anything compared to TTA decoding. It seems that the only scenario where I could make the built-in threading win would be converting uncompressed data within one single HDD (I seldom convert from a drive to itself, exactly because it imposes a serious I/O limit on converting, and especially on muxing video), and preferably with more threads than the core count of my current Core 2 Duo. And I still say that built-in threading is a good thing. I may be hopeless. Or I may be thinking of the future, with massively multicore CPUs (like 8-core Bulldozers).

More evidence that TAK's multi core encoder might need some tuning... Thank you!

wow, damn brutal speed optimizations, Thomas 

Thank you!

Thanks TBeck, for this brilliant codec.
Just want to express my desire to see this released for Linux based systems!
Remember us!

No promises, but requests can always affect the priorities of the items on my todo list.

Hello!
just finished testing TAK 2.0.3 b. Here are my results
(http://audiophilesoft.ucoz.ua/misc/tak203b.png)

Great! The diagram illustrates exactly the way TAK's preset system and general design are intended to work:

- Always fast decoding, affected only slightly by the preset choice.
- Decoding is about equally fast for one preset's evaluation levels.
- A really good default preset -p2 regarding compression ratio, encoding and decoding speed. Again thanks to the users who helped to create it!
- Maybe it's another hint, that it's now time to make -p4m a bit stronger...

It looks like there is no error when compressing files with SSSE3 + multi-threading. Hope this helps.

Definitely! Thank you very much!

I will now prepare the final release. It will still be called 2.1, although I usually only change the second place of the version number if something has been added which cannot be decoded by prior versions. Now that the LossyWav-codec has been removed, this is no longer true. But I think it would be irritating if I now released a 2.0.1 and later a 2.1 with totally different functionality than the current betas (named 2.1).

BTW: I wasn't totally lazy in the meantime. The next release (2.1.1) will -among other things- again be a bit faster.

  Thomas
Title: TAK 2.1.0 - Beta release 3
Post by: Destroid on 2010-12-30 20:18:08
Thank you, Thomas, for sharing your work. Always a pleasure and always exciting to participate. Also, I forgot to thank you for conjuring up multi-core abilities as a reaction to my question of TAK's potential parallelism. Could it be that multi-core was already in-the-works? :shrug:

- Maybe it was generally a bad idea to add such a codec to TAK. Especially if it had become widely used! Maybe TAK would then get the reputation that it cannot be trusted, because too much lossy content is being published with it.
I have said before that I, too, don't believe in tying lossy techniques to a lossless program. I think it is my hard-nosed conviction about the fundamental philosophy behind lossless audio encoding. Perhaps it's because I work primarily with original waveforms in studio projects and take the lossless thing quite seriously, arguably more than a person who can simply re-rip their CDs. All I'm trying to say is that I agree: as long as the person who packaged the source material is trusted, TAK will deliver lossless as specified.

In regards to Steve Forte Rio's diagram:
Quote
Maybe it's another hint, that it's now time to make -p4m a bit stronger...
I didn't run my own tests yet, but the diagram shows minor compression gains from migrating from -p0e -> p0m. Perhaps it was the material being tested that gave this result, but it does raise the inquiry of whether evaluation modes -pXe and -pXm share qualities independent of the numerical preset. If so, can evaluation modes be strengthened?

Happy new year!
Title: TAK 2.1.0 - Beta release 3
Post by: Steve Forte Rio on 2010-12-30 22:23:34
Quote
- Maybe it's another hint, that it's now time to make -p4m a bit stronger...


looking forward to it 
Title: TAK 2.1.0 - Beta release 3
Post by: jc3213 on 2011-01-08 10:59:49
Great! The diagram illustrates exactly the way TAK's preset system and general design are intended to work:

- Always fast decoding, affected only slightly by the preset choice.
- Decoding is about equally fast for one preset's evaluation levels.
- A really good default preset -p2 regarding compression ratio, encoding and decoding speed. Again thanks to the users who helped to create it!
- Maybe it's another hint, that it's now time to make -p4m a bit stronger...

  Thomas


Well, I think with the speed-up provided by SSSE3 and multi-threaded compression, you could simplify the presets, for example by rescaling P0~P5 as follows: P0 (new) -> P1 (current), P1 (new) -> P2 (current), P2 (new) -> P3 (current), P3 (new) -> P4 (current), P4 (new) -> P4M (current), P5 -> (stronger than current P4M), with the default set to P2 (new).

There are too many finely graded presets, and some of them are seldom used or even useless.