Topic: TAK 2.0 Development

TAK 2.0 Development

Reply #25
And no, that's not the results of the dedicated LossyWav codec i intend to develop later.
Pardon the double-post, but I somehow overlooked this part of the OP. Thomas, can you elaborate about this?
"Something bothering you, Mister Spock?"

TAK 2.0 Development

Reply #26
Yes, the "focus has shifted from the higher to the lower presets". This has been happening since the first presentation of YALAC (as TAK was named before its first stable release). I have always been affected by user comments at hydrogen. Most users (who posted) wanted it to be as fast as possible. A significant amount of posters (not necessarily users...) kept emphasizing TAK's slower decoding speed than FLAC (especially when using foobar). Maybe i have taken this too seriously... Possibly a lot of users indeed would welcome a bit stronger compression in exchange for a bit slower decoding. Possibly i should create a poll?


I would say, the strongest side of TAK is efficiency - size/speed rate. It's great to have a codec that has presets with encoding/decoding speed comparable to FLAC and presets with compression levels comparable to APE High.

So I think you shouldn't have a hard time choosing "speed or compression". Lower presets are for better speed, higher ones - for better compression.

TAK 2.0 Development

Reply #27
Quote
Possibly a lot of users indeed would welcome a bit stronger compression in exchange for a bit slower decoding. Possibly i should create a poll?
Why not? I predict the people who use -p3 and up would be in favor of better compression, whereas -p2 users want the fastest encode/decode performance. I am not sure how much die-hard users who want the best compression would care about slower decompression speed. As for -p3 and higher, I think the poll might as well ask whether decode speed is a priority at all (e.g. DAWs and HTPCs don't need to save battery).

I would say, the strongest side of TAK is efficiency - size/speed rate. It's great to have a codec that has presets with encoding/decoding speed comparable to FLAC and presets with compression levels comparable to APE High.

So I think you shouldn't have a hard time choosing "speed or compression". Lower presets are for better speed, higher ones - for better compression.

I think you both are right: Let's regard -p0 to -p2 as the speed settings focussing on maximum decoding speed and -p3/-p4 as the power settings which sacrifice some more decoding speed for a bit better compression, even if the proportion may get somewhat insane efficiencywise.

I have already raised the maximum predictor count to 256 and will post some results soon.

And no, that's not the results of the dedicated LossyWav codec i intend to develop later.
Pardon the double-post, but I somehow overlooked this part of the OP. Thomas, can you elaborate about this?

I am quite confident that i can modify the codec to achieve at least 1 percent better compression for LossyWav-files. Furthermore it may perform significantly better when compressing files with sections where LossyWav hasn't removed any bits and the efficiency falls behind that of a pure lossless encode.

Earlier i wanted to integrate those modifications into the TAK 2.0 codec, but now i plan to create a dedicated codec (the file format supports up to 64 different codecs). That's a bit easier to do and doesn't stall the 2.0 release.

  Thomas



TAK 2.0 Development

Reply #28
Thomas,

I'm curious about your views on GPGPU and how it could be beneficial for TAK. Do you have any plans to look at this? Mind you, I only have a general understanding of how this stuff works, but it seems to be getting more interesting now that things are going to get more standardized.

The following thread also handles this and even has a proof of concept FLAC implementation based on CUDA.

TAK 2.0 Development

Reply #29
I have already raised the maximum predictor count to 256 and will post some results soon.

Here they are:

Code:
AMD Sempron 2.2 GHz                                                                     
                                                                        
Preset  Compression            Enco-Speed                 Deco-Speed              
        1.1.2   2.0     Win    1.1.2    2.0       Win     1.1.2    2.0       Win
                                                                        
-p3     56.52   56.32   0.20    55.97    55.53   -0.79%   190.88   210.42   10.24%
-p4     56.16   55.96   0.20    32.07    25.10  -21.73%   166.89   148.77  -10.86%
-p4m    56.07   55.80   0.27    17.81     9.34  -47.56%


Intel Pentium Dual Core 2 GHz                                                                  
                                                                        
Preset  Compression            Enco-Speed                 Deco-Speed              
        1.1.2   2.0     Win    1.1.2    2.0       Win     1.1.2    2.0       Win
                                                                        
-p3     56.52   56.32   0.20    64.64    63.45   -1.84%   214.36   234.40    9.35%
-p4     56.16   55.96   0.20    36.90    28.64  -22.38%   193.97   178.50   -7.98%
-p4m    56.07   55.80   0.27    21.07    10.38  -50.74%

Compression is relative to the original file size, Enco- and Deco-Speed expressed as multiple of real time.
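
For anyone who wants to check the "Win" columns: the sketch below assumes (my reading of the table, not something stated in the post) that the compression "Win" is the plain difference in percentage points and that the speed "Win" is the relative change in percent. The function names are made up for illustration.

```python
def compression_win(old_pct, new_pct):
    """Percentage points saved; positive means the new version compresses better."""
    return round(old_pct - new_pct, 2)

def speed_win(old_speed, new_speed):
    """Relative speed change in percent; negative means the new version is slower."""
    return round((new_speed / old_speed - 1.0) * 100.0, 2)

# -p4 row, AMD Sempron 2.2 GHz (values taken from the table above)
print(compression_win(56.16, 55.96))  # 0.2
print(speed_win(32.07, 25.10))        # -21.73
print(speed_win(166.89, 148.77))      # -10.86
```

Under these assumptions the three printed values match the -p4 row of the Sempron table.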

-p3 is now using 96 instead of 80, -p4 256 instead of 160 predictors.

Some results for other sample rates and bit depths:       

Code:
                   Preset  Compression         
                           1.1.2     2.0 Eval  Win
                    
24 bit /  44 khz    -p4m   56.78     56.67     0.11
24 bit /  96 khz    -p4m   50.69     50.61     0.08
24 bit / 192 khz    -p4m   44.86     43.76     1.10
  8 bit             -p4m   38.23     37.08     1.15


Decoding of -p4 is still roughly 150 times faster than realtime (148.77 on the Sempron), which should be acceptable.

TAK 2.0 Development

Reply #30
Code:
Intel Pentium Dual Core 2 GHz                                                                   
                                                                        
Preset  Compression            Enco-Speed
        1.1.2   2.0     Win    1.1.2    2.0       Win
                                                                        
-p4     56.16   55.96   0.20    36.90    28.64  -22.38%
-p4m    56.07   55.80   0.27    21.07    10.38  -50.74%

That's actually quite an impressive improvement when you compare the old -p4m to the new -p4.
The compression of both presets is about the same (0.11 difference), but encoding speed improves by 35.93%.
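
Both figures can be reproduced from the Intel rows. A small sketch, where the one hour album is a hypothetical example added only to turn the speed multiples into wall-clock time:

```python
old_p4m = {"compression": 56.07, "enc_speed": 21.07}  # TAK 1.1.2 -p4m
new_p4 = {"compression": 55.96, "enc_speed": 28.64}   # TAK 2.0 -p4

def speedup_percent(old_speed, new_speed):
    """Relative change in encoding speed, in percent."""
    return round((new_speed / old_speed - 1.0) * 100.0, 2)

def encode_seconds(audio_seconds, speed_multiple):
    """Speed is a multiple of real time, so encoding time is duration / speed."""
    return audio_seconds / speed_multiple

print(round(old_p4m["compression"] - new_p4["compression"], 2))    # 0.11
print(speedup_percent(old_p4m["enc_speed"], new_p4["enc_speed"]))  # 35.93

album = 60 * 60  # hypothetical one hour album, in seconds
print(round(encode_seconds(album, old_p4m["enc_speed"])))  # 171
print(round(encode_seconds(album, new_p4["enc_speed"])))   # 126
```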

TAK 2.0 Development

Reply #31
The compression of both presets is about the same (0.11 difference), but encoding speed improves by 35.93%.
One of us is getting it wrong here. If I understand correctly, the -p4m encoding speed drops from about 20 times real time to about 10 times real time, which means -p4m compression will take twice as long with TAK 2.0...

Edit: Now I get it, you compared old p4m with the new p4. Sorry, my bad...

TAK 2.0 Development

Reply #32
I'm curious about your views on GPGPU and how it could be beneficial for TAK. Do you have any plans to look at this? Mind you, I only have a general understanding of how this stuff works, but it seems to be getting more interesting now that things are going to get more standardized.

The following thread also handles this and even has a proof of concept FLAC implementation based on CUDA.

I am always interested in new opportunities to optimize TAK and GPGPU is no exception. It's definitely possible to utilize GPGPU for encoding, but i have no practical experience yet. And it will take some time until i try it. The most important reason:

I am very concerned about the reliability of the encoder. I am worried about failures of the GPU memory resulting in unrecognized encoder errors. Unless i find a trustworthy study which shows that GPU memory isn't failing more often than system memory, or unless more GPUs come with ECC for their memory, i really don't want to take the risk.

Currently i would prefer to add multicore support to the encoder. It's easier to do, will probably result in a larger speed gain (especially on systems with slow GPUs) and i feel safer regarding the reliability.

TAK 2.0 Development

Reply #33
...
I am very concerned about the reliability of the encoder. I am worried about failures of the GPU memory resulting in unrecognized encoder errors. Unless i find a trustworthy study which shows that GPU memory isn't failing more often than system memory, or unless more GPUs come with ECC for their memory, i really don't want to take the risk.
...

Sounds perfectly reasonable to me, and it might be the safest approach for new tech, although I think nVidia is actually going to include ECC in their new Fermi architecture. I don't know if it will also be made available in normal consumer products, as market segmentation sometimes dictates otherwise. Thanks for the answer, and nice to hear that you're looking into multicore CPU support.

TAK 2.0 Development

Reply #34
Currently i would prefer to add multicore support to the encoder. It's easier to do, will probably result in a larger speed gain (especially on systems with slow GPUs) and i feel safer regarding the reliability.

As I've had to deal with bit-flip errors with RAM myself, I understand your concern. I also recall reading an article recently about how GPU RAM errors weren't considered critical since one wrong pixel in a video game doesn't matter much.
Here's the thing with lossless codecs though: you can always compare the encode to the source. So, I'm not too worried.

As for the multi-core CPU vs. GPU: think of laptops, HTPCs and generally lower-end computers. They're more likely to have only one or two CPU cores, and a GPU. My new laptop is something of a special case, but a GPU-enabled encoder would run much faster on it: it has an Intel Atom N270 CPU (one core, two threads) and an NVIDIA Ion GPU (GeForce 9400M). The latter's benefit becomes clear when playing high-definition videos, where CPU usage stays below 20% (most of the time, below 10%).

Now, the question is: what machines would benefit most from encoding speed improvements on TAK, which is already quite fast? In my case, my Atom/Ion netbook would certainly benefit more from a GPU-enabled encoder than my quad-core AMD Phenom PC would benefit from a multi-threaded implementation, especially since I can already encode multiple files in parallel.
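
To illustrate the "encode multiple files in parallel" point: file-level parallelism needs no support inside the codec at all. A minimal sketch, with `zlib` standing in for the real encoder (it is not TAK) and with the track names and helper functions invented for the example:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def encode_one(item):
    """Stand-in for running a lossless audio encoder on one file."""
    name, data = item
    return name, zlib.compress(data, 9)

def encode_many(files, workers=4):
    """Encode independent files concurrently, one file per worker at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(encode_one, files.items()))

# Hypothetical in-memory "tracks"; a real script would read files from disk.
tracks = {f"track{n:02d}.wav": bytes([n]) * 100_000 for n in range(1, 9)}
encoded = encode_many(tracks)
print(len(encoded))  # 8
```

This only helps when there are at least as many files as cores; a single long file would still need a multi-threaded encoder.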

TAK 2.0 Development

Reply #35
I am very concerned about the reliability of the encoder. I am worried about failures of the GPU memory resulting in unrecognized encoder errors. Unless i find a trustworthy study which shows that GPU memory isn't failing more often than system memory, or unless more GPUs come with ECC for their memory, i really don't want to take the risk.

A quick search with Google revealed this interesting page: MemtestG80: A Memory and Logic Tester for NVIDIA CUDA-enabled GPUs. It contains a link to a PDF of the study "Hard Data on Soft Errors: A Large-Scale Assessment of Real-World Error Rates in GPGPU". I had no time to really read or even critically rate the article, but here is an excerpt from the conclusions:

Quote
"We have presented the first large-scale study of error rates in GPGPU hardware, conducted over more than 20,000 GPUs on the Folding@home distributed computing network. Our control experiments on consumer-grade and dedicated-GPGPU hardware in a controlled environment found no errors. However, our large-scale experimental results show that approximately two-thirds of tested cards exhibited a pattern-sensitive susceptibility to soft errors in GPU memory or logic, confirming concerns about the reliability of the installed base of GPUs for GPGPU computation. We have further demonstrated that this nonzero error rate cannot be adequately explained by overclocking or time of day of execution (a proxy for ambient temperature). However, it appears to correlate strongly with GPU architecture, with boards based on the newer GT200 GPU having much lower error rates than those based on the older G80/G92 design. While we cannot rule out user error, misconfiguration on the part of Folding@home donors, or environmental effects as the cause behind nonzero error rates, our results strongly suggest that GPGPU is susceptible to soft errors under normal conditions on non-negligible timescales."


As I've had to deal with bit-flip errors with RAM myself, I understand your concern. I also recall reading an article recently about how GPU RAM errors weren't considered critical since one wrong pixel in a video game doesn't matter much.
Here's the thing with lossless codecs though: you can always compare the encode to the source. So, I'm not too worried.

Yes, this seems to be the way to go. I forgot about TAK's verify option, which will decode each frame after encoding and compare the output with the original. Since decoding is so fast, this could be performed by the CPU without sacrificing too much encoding speed.

Now, the question is: what machines would benefit most from encoding speed improvements on TAK, which is already quite fast? In my case, my Atom/Ion netbook would certainly benefit more from a GPU-enabled encoder than my quad-core AMD Phenom PC would benefit from a multi-threaded implementation, especially since I can already encode multiple files in parallel.

I agree, this is the question... The answer may require a lot of testing of real implementations. Currently i don't want to put too much effort into it, but i will keep an eye on evaluations of GPU implementations of algorithms similar to those TAK is using.

TAK 2.0 Development

Reply #36
I'm curious about your views on GPGPU and how it could be beneficial for TAK. Do you have any plans to look at this? Mind you, I only have a general understanding of how this stuff works, but it seems to be getting more interesting now that things are going to get more standardized.

The following thread also handles this and even has a proof of concept FLAC implementation based on CUDA.

I am always interested in new opportunities to optimize TAK and GPGPU is no exception. It's definitely possible to utilize GPGPU for encoding, but i have no practical experience yet. And it will take some time until i try it. The most important reason:

I am very concerned about the reliability of the encoder. I am worried about failures of the GPU memory resulting in unrecognized encoder errors. Unless i find a trustworthy study which shows that GPU memory isn't failing more often than system memory, or unless more GPUs come with ECC for their memory, i really don't want to take the risk.

Currently i would prefer to add multicore support to the encoder. It's easier to do, will probably result in a larger speed gain (especially on systems with slow GPUs) and i feel safer regarding the reliability.

I'm glad that there are more people like me who are concerned about GPU encoding correctness, and it's great that you're one of them.
I've seen somewhere a study done in collaboration with NVIDIA that showed that GPU memory errors were indeed an issue... as well as some silicon issues: IIRC, GPUs with a few damaged processing units were passing validation.

But there's a simple solution: Encode on GPU, move to the main memory and then use CPU to verify the results. Retry (on CPU?) in case of a failure.
It should guarantee correctness with overhead low enough to still provide a great speed boost.
Verifying that the parameters are optimally chosen would be infeasible, which would hurt the compression ratio somewhat, but I think that the speed improvement is well worth it... especially since there are reliable GPUs from NVIDIA on the way.
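
The "encode fast, verify on CPU, retry on failure" idea could look roughly like this. It is only a sketch: `zlib` stands in for the real codec, and `flaky_encode` is a hypothetical accelerator that suffers a single bit flip in its input buffer. Nothing here is TAK or GPU code.

```python
import zlib

def cpu_encode(frame):
    """Trusted reference encoder (zlib stands in for the real codec)."""
    return zlib.compress(frame)

def cpu_decode(blob):
    return zlib.decompress(blob)

def encode_verified(frame, fast_encode):
    """Encode on the fast path, decode on the CPU, fall back if the round trip fails."""
    try:
        blob = fast_encode(frame)
        if cpu_decode(blob) == frame:
            return blob, "fast"
    except zlib.error:
        pass  # output so damaged that it does not even decode
    blob = cpu_encode(frame)
    if cpu_decode(blob) != frame:
        raise RuntimeError("CPU fallback failed verification")
    return blob, "cpu-fallback"

def flaky_encode(frame):
    """Hypothetical faulty accelerator: one bit of the input flips before encoding."""
    damaged = bytearray(frame)
    damaged[0] ^= 0x01
    return zlib.compress(bytes(damaged))

frame = bytes(range(256)) * 16
print(encode_verified(frame, cpu_encode)[1])    # fast
print(encode_verified(frame, flaky_encode)[1])  # cpu-fallback
```

As the post says, this check only proves the output is lossless. It cannot tell whether a memory error made the fast path pick worse parameters than it should have.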

TAK 2.0 Development

Reply #37
Now i am preparing a first release... Probably i will call it a beta.

I am looking through code i wrote weeks or months ago to find errors. Today i caught one. Fortunately so: even my quite exhaustive script-based test set possibly wouldn't have caught it, because it depended on very rare conditions. Mathematically possible but extremely rare in practice.

Additionally i am feeding the encoder with random data to test the decoder regarding features of the encoder which will be implemented later. I want to be sure that the TAK 2.0 decoder will decode files created by later, more sophisticated encoders without any problems.
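
A generic sketch of this kind of test, for illustration only: random input is encoded with randomly chosen encoder settings, and the decoder must reproduce the input every time. `zlib` is used as a stand-in codec, and its level, window size, memory level and strategy play the role of "encoder features". How TAK's own test harness works is not described here.

```python
import random
import zlib

STRATEGIES = [zlib.Z_DEFAULT_STRATEGY, zlib.Z_FILTERED, zlib.Z_HUFFMAN_ONLY]

def random_encode(data, rng):
    """Encode with randomly chosen settings to exercise many stream variants."""
    enc = zlib.compressobj(
        rng.randint(0, 9),       # compression level
        zlib.DEFLATED,
        rng.randint(9, 15),      # window size (log2)
        rng.randint(1, 9),       # memory level
        rng.choice(STRATEGIES),
    )
    return enc.compress(data) + enc.flush()

def stress_decoder(rounds=200, seed=2009):
    """Return the number of round trips that did not reproduce the input."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(rounds):
        data = bytes(rng.randrange(256) for _ in range(rng.randint(0, 2048)))
        if zlib.decompress(random_encode(data, rng)) != data:
            failures += 1
    return failures

print(stress_decoder())  # 0 means every randomly configured stream decoded correctly
```

The fixed seed makes any failure reproducible, which matters for bugs that depend on very rare conditions.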

Some bad news for some power hungry users: For now i have again reduced the maximum predictor count from 256 to 160. But the decoder will be laid out to support even up to 320 predictors; this way i am able to add a really insane preset -p5 later (if i want) and the files will be decodable by the V 2.0 decoder.

I hope this wasn't boring. I will release a first version as soon as i am feeling confident enough about the reliability of the new codec.

  Thomas

TAK 2.0 Development

Reply #38
Some bad news for some power hungry users: For now i have again reduced the maximum predictor count from 256 to 160.
  Thomas

Not boring at all, about the max predictors, would a value like 192 be a reasonable compromise?
In theory, there is no difference between theory and practice. In practice there is.

TAK 2.0 Development

Reply #39
Not boring at all, about the max predictors, would a value like 192 be a reasonable compromise?

Well, the problem is, how you define reasonable. For users looking for optimum efficiency (speed/compression), the compression advantage would be too small to justify slower encoding and decoding. And users looking for maximum compression would still ask for more...

For V2.0 i will limit the predictor count to 160.

This is also because higher predictor counts will require a tuning of the algorithm which estimates the optimum predictor order. Currently it sometimes fails when using more than 160 predictors. This tuning can be a lot of work and i prefer to first tune some other encoder options, which hardly affect the decoding speed and nevertheless may provide some nice improvements even for the fast presets. This will be my task for 2.0.1.
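
How TAK estimates the predictor order is not described in this thread. Purely as a generic illustration of what such an estimator does in linear-prediction codecs, the sketch below runs a textbook Levinson-Durbin recursion and picks the order with the lowest estimated cost. The cost model (half a bit per sample per doubling of residual energy, plus an assumed 16 bits per coefficient) and all names are assumptions made for the example.

```python
import math
import random

def autocorrelation(x, max_lag):
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def residual_energies(r, max_order):
    """Levinson-Durbin recursion: residual energy for predictor orders 0..max_order."""
    coeffs = [0.0] * (max_order + 1)
    err = r[0]
    energies = [err]
    for m in range(1, max_order + 1):
        if err <= 0.0:
            energies.append(err)
            continue
        k = (r[m] - sum(coeffs[j] * r[m - j] for j in range(1, m))) / err
        updated = coeffs[:]
        updated[m] = k
        for j in range(1, m):
            updated[j] = coeffs[j] - k * coeffs[m - j]
        coeffs = updated
        err *= 1.0 - k * k
        energies.append(err)
    return energies

def estimate_order(x, max_order, bits_per_coeff=16):
    """Pick the order minimising estimated residual bits plus coefficient overhead."""
    n = len(x)
    energies = residual_energies(autocorrelation(x, max_order), max_order)

    def cost(order):
        e = max(energies[order] / n, 1e-12)
        return 0.5 * n * math.log2(e) + order * bits_per_coeff

    return min(range(max_order + 1), key=cost), energies

# Synthetic second-order signal: x[n] = 1.8*x[n-1] - 0.9*x[n-2] + noise
rng = random.Random(42)
x = [0.0, 0.0]
for _ in range(2000):
    x.append(1.8 * x[-1] - 0.9 * x[-2] + rng.gauss(0.0, 1.0))

order, energies = estimate_order(x, max_order=8)
print(order)
```

The residual energy never increases with the order, so without the per-coefficient overhead the estimator would always choose the maximum. With very high orders the individual gains become tiny and noisy, which is one plausible reason such an estimate gets harder to tune.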

Later (2.0.2 ?) i may release an evaluation version with up to 320 predictors and let users try for themselves whether it is really advantageous for their files.

I need good reasons to increase the predictor count, because doing so will -besides the initial tuning- also mean more work regarding future modifications of the encoder, because there tends to be a lot of interaction between high predictor orders and other parts of the codec.

BTW: In the past days i have performed a lot of tests to verify the proper function of the new codec and so far everything was ok. I hope to release a first version within one week. 

TAK 2.0 Development

Reply #40
Big thanks for developing and improving TAK!
Was there any progress for album cover support in the Winamp plug-in?


TAK 2.0 Development

Reply #41
Was there any progress for album cover support in the Winamp plug-in?

I suppose this would require a modified plugin which uses a new interface to communicate with WinAmp. I haven't done this yet, but it should be possible to write such a plugin based upon the "TAK_deco_lib.dll". Maybe someone more experienced with WinAmp than me could do this quite easily... Otherwise i will have to do it sooner or later, but probably not before releasing TAK 2.0.1 or TAK 2.0.2.

Actually i wanted to release a beta of V2.0 within the next days, but then i remembered an encoder option i was using years ago (with little success) and later removed. Now i tried it again and achieved some tiny improvements of up to 0.05 percent for some presets. Not really much, but it comes with nearly no speed penalty and it's about as much as (or sometimes more than) could be gained by increasing the predictor count from 160 to 192. Therefore i think it's worth implementing. This may delay the release by a couple of days.

TAK 2.0 Development

Reply #42
Quote
A significant amount of posters (not necessarily users...) kept emphasizing TAK's slower decoding speed than FLAC (especially when using foobar).
Decoding speed is definitely a priority, although in this age of computing I think current TAK decoding speeds will be fine for current portables (FLAC decoding speed may be a bit overkill nowadays as compared to 5 years ago).

A DAP with a voltage scaling CPU will gain power efficiency and thus battery life with an easier-to-decode format, even if it has enough raw horsepower to decode APE realtime.
DAPs, however, are shifting quickly away from HDDs, and therefore the battery cost of larger files has diminished significantly.
Creature of habit.

TAK 2.0 Development

Reply #43
Preparing the release

Currently my secondary PC is performing a lot of automated tests to validate the proper function of the new codec. This may take one or two days.

In the meantime i have to decide if i want to add another optimization which improves the compression of 192 kHz files by about 0.25 percent but unfortunately has no significant effect on files with lower sample rates. There are good reasons against this optimization:

- It will either make the encoder slower (for any sampling rate!) or require a higher code complexity to avoid this speed penalty.
- TAK's compression efficiency for 192 kHz files is already on top, only beaten by OptimFROG in my tests, therefore it's not really necessary to add a bit more, especially because 192 kHz files are a bit exotic.

So it's very likely that we will see a first release within the next days.

For now some (final) results for my primary sample set:

Code:
AMD Sempron 2.2 GHz                                                                     
                                                                        
Preset  Compression            Enco-Speed                 Deco-Speed              
        1.1.2   2.0     Win    1.1.2    2.0       Win     1.1.2    2.0       Win
                                                                        
-p0     58.83   58.74   0.09   264.10   289.24    9.52%   283.52   293.55    3.54%
-p1     57.98   57.84   0.14   193.18   205.66    6.46%   275.80   288.77    4.70%
-p2     57.07   56.90   0.17   131.39   135.32    2.99%   250.36   257.09    2.69%
-p3     56.52   56.36   0.16    55.97    60.78    8.59%   190.88   218.79   14.62%
-p4     56.16   56.02   0.14    32.07    34.35    7.11%   166.89   174.72    4.69%
-p4m    56.07   55.89   0.18    17.81    14.88  -16.45%

Intel Pentium Dual Core 2 GHz                                                                  
                                                                        
Preset  Compression            Enco-Speed                 Deco-Speed              
        1.1.2   2.0     Win    1.1.2    2.0       Win     1.1.2    2.0       Win
                                                                        
-p0     58.83   58.74   0.09   261.83   283.94    8.44%   298.67   311.12    4.17%
-p1     57.98   57.84   0.14   201.75   215.76    6.94%   291.72   305.56    4.74%
-p2     57.07   56.90   0.17   146.14   149.26    2.13%   262.58   271.21    3.29%
-p3     56.52   56.36   0.16    64.64    68.66    6.22%   214.36   239.69   11.82%
-p4     56.16   56.02   0.14    36.90    38.09    3.22%   193.97   203.54    4.93%
-p4m    56.07   55.89   0.18    21.07    16.98  -19.41%
                   
Compression is relative to the original file size, Enco- and Deco-Speed expressed as multiple of real time.

I like those results: Better compression and higher encoding speed and higher decoding speed for any basic preset (without an additional evaluation level like -p4m). For me this combination justifies the introduction of a new codec.

Some results for other file types and for LossyWav:

Code:
                   Preset  Compression         
                           1.1.2     2.0 Eval  Win
                    
24 bit /  44 khz    -p4m   56.78     56.73     0.05
24 bit /  96 khz    -p4m   50.69     50.70    -0.01
24 bit / 192 khz    -p4m   44.86     43.71     1.15
8 bit               -p4m   38.23     37.19     1.04
LossyWav -q00       -p2m   19.37     18.92     0.45
LossyWav -q25       -p2m   26.38     26.00     0.38
LossyWav -q50       -p2m   32.34     31.97     0.37


  Thomas

TAK 2.0 Development

Reply #44
In the meantime i have to decide if i want to add another optimization which improves the compression of 192 kHz files by about 0.25 percent but unfortunately has no significant effect on files with lower sample rates. There are good reasons against this optimization:

- It will either make the encoder slower (for any sampling rate!) or require a higher code complexity to avoid this speed penalty.
- TAK's compression efficiency for 192 kHz files is already on top, only beaten by OptimFROG in my tests, therefore it's not really necessary to add a bit more, especially because 192 kHz files are a bit exotic.

If it doesn't reduce decompression speed, I'm for it.

TAK 2.0 Development

Reply #45
In the meantime i have to decide if i want to add another optimization which improves the compression of 192 kHz files by about 0.25 percent but unfortunately has no significant effect on files with lower sample rates. There are good reasons against this optimization:

- It will either make the encoder slower (for any sampling rate!) or require a higher code complexity to avoid this speed penalty.
- TAK's compression efficiency for 192 kHz files is already on top, only beaten by OptimFROG in my tests, therefore it's not really necessary to add a bit more, especially because 192 kHz files are a bit exotic.


Then you may delay it until 2.0.1 (or 2.0.2), it won't hurt. 192 kHz files are really a bit exotic for now.

Looking forward to first TAK 2.0 beta



TAK 2.0 Development

Reply #46
If it doesn't reduce decompression speed, I'm for it.

Well, after some fine tuning the possible advantage is down to 0.1 percent...

Then you may delay it until 2.0.1 (or 2.0.2), it won't hurt. 192 kHz files are really a bit exotic for now.

Unfortunately the modification would make the format incompatible, therefore it has to be done before the 2.0 release.

TAK 2.0 Development

Reply #47
Unfortunately the modification would make the format incompatible, therefore it has to be done before the 2.0 release.


Well... the best thing from 'end-user point of view' would be:

- no additional speed penalty for any (especially most used lower) sample rates
- additional 0.25% for 192 kHz files (though for the many users who don't use such files it's a rather virtual advantage)

The price in this case is a higher code complexity. As a developer you decide whether you'd like to implement it or not. It's up to you.

TAK 2.0 Development

Reply #48
192 kHz is a strange one: it is poised to predict the future of digital audio.

Today, this sample rate (likely at 24-bit) is mostly used in commercial pro studios and on some DVD-A (not very many that I can recall, but I'm not a fan of DVD-A).

The more popular sampling rate out there is 96 kHz, used in most consumer pro-audio and (most?) DVD-A.

Of course, someday, everybody will listen to everything in 192 kHz, but will leaving out the slight 192 kHz advantage today hamper TAK later?

IMO, 192 kHz is exotic and will likely be used very little in comparison to 96 kHz. How long will this remain the case? 5 years? Maybe 10? I'm not exactly sure. :shrug:
"Something bothering you, Mister Spock?"

TAK 2.0 Development

Reply #49
Of course, someday, everybody will listen to everything in 192 kHz

I'm not so sure about that. The industry seems to be pushing for MP3s instead of CD/DVD-A/lossless.

 