TAK 1.1.1 Development

2009-01-27 11:31:28

Purpose of this thread

- I will post news about the development of the next TAK version.
- I may ask you questions regarding new features.
- You may comment on my work, ask for features, and help me get it right.

What I did until 2009-01-27

I made some decisions:

1) I was often tempted to make the codec more complex to achieve even better compression. But this doesn't make sense. TAK always has to be fast. Furthermore, forward prediction (as used in TAK) is probably not the way to achieve the highest compression ratios. If at some point in the future I cannot resist writing an even stronger codec, I will go for backward prediction or a combination of both.

But all this doesn't mean that i will not improve the compression ratio in future versions! As long as improvements slow down only the encoder but not the decoder, anything is allowed.

2) I will no longer put so much effort into tiny speed optimizations. Thanks to Synthetic Soul for changing his test setup! His latest comparison has shown how large an effect a change of test platform can have on speed, and how easily tiny speed improvements can be eaten up by it. Since I am a bit of a junkie regarding speed optimizations, I will later write this down at least a hundred times:

I will no longer put so much effort into tiny speed optimizations!
I will no longer put so much effort into tiny speed optimizations!!
I will no longer put so much effort into tiny speed optimizations!!!

A junkie would probably throw his drugs into the toilet at this point. I have removed a lot of existing tiny optimizations from TAK... Nice side effect: simpler code and about 9 KB saved. And the speed will still be on the level of TAK 1.0.4.

I am still allowed to work on significant speed optimizations. Probably one of the next versions will support SSE2 instructions in the encoder. I am hoping for speed improvements of up to 50 percent for the more demanding presets like -p3m to -p4m.

Well, what to do with all the time gained by staying clean?

New features in TAK 1.1.1

1) In very rare cases the presets -p3 and -p4 will compress much worse than the lower presets. Thanks to DOS386 for reporting one of these cases. A new filter in the encoder will nearly eliminate this annoying effect.

2) MD5 calculation and verification of the raw audio data.

3) Option to lower the process priority.

and possibly something more.

Thomas

Reply #1 – 2009-01-27 12:14:34

I totally agree with your decisions 1+2.

Reply #2 – 2009-01-27 14:07:54

Hey Thomas, there's a new feature in WavPack that I never thought of before, but now that it's there I have found it useful and I hope to see it in TAK:
-Directly encode/decode raw pcm data

Also I'm glad you're adding MD5 to TAK

Reply #3 – 2009-01-27 14:40:40

Is it possible to give a hint about when we can expect SSE2? In a few months, in 6 months, end of the year?

Reply #4 – 2009-01-28 13:02:54

Quote from: Squeller on 2009-01-27 12:14:34

I totally agree with your decisions 1+2.

Fine!

Quote from: ssjkakaroto on 2009-01-27 14:07:54

Hey Thomas, there's a new feature in WavPack that I never thought of before, but now that it's there I have found it useful and I hope to see it in TAK:
-Directly encode/decode raw pcm data

Thanks for the suggestion! I will put it on my to-do list for one of the next versions. I don't think you will have to wait very long.

Quote from: me7 on 2009-01-27 14:40:40

Is it possible to give a hint about when we can expect SSE2? In a few months, in 6 months, end of the year?

Not really. When I am thinking about a new release, I always try to find a good mixture of the features on my to-do list. This time MD5 had a higher priority.

But I don't think it will take more than half a year.

To take advantage of SSE2 support, you will need a fairly new CPU. Older CPUs with SSE2 support do not implement true 128-bit processing of SSE2 registers; instead they split the calculation into two 64-bit operations, which may even be a bit slower than TAK's current MMX processing!

Some CPUs with full SSE2 implementations are Intel's Core-based ones (including Celeron Core and Pentium Dual Core) and AMD Phenom. I hope I am right...

Just out of curiosity: Why is this so important to you?

Thomas

Reply #5 – 2009-01-28 18:49:56

Maybe I am coming back to an old question, but what about foobar's decoder?
Internally, TAK's decoder is the fastest one http://www.synthetic-soul.co.uk/comparison...Rate&Desc=1
But the foobar library at -p0m -p1m is actually slower than FLAC -8. As I need the fastest decoding speed for lossy transcoding, this is the main reason I still prefer FLAC.
Is there any plan to increase performance for foobar?

Reply #6 – 2009-01-29 12:58:55

Quote from: IgorC on 2009-01-28 18:49:56

Internally, TAK's decoder is the fastest one http://www.synthetic-soul.co.uk/comparison...Rate&Desc=1
But the foobar library at -p0m -p1m is actually slower than FLAC -8. As I need the fastest decoding speed for lossy transcoding, this is the main reason I still prefer FLAC.

Yes, that's a bit strange, since the foobar library is using exactly the same code for decoding as the TAK applications.

I see 2 possible reasons for the faster decoding of the FLAC foobar plugin:

1) Bad interaction of TAK's decoding library and the foobar plugin.
2) Possibly the FLAC plugin is using asynchronous file io for reading the source file.

I don't think 1) could result in such a huge difference, therefore I bet on 2).

Quote from: IgorC on 2009-01-28 18:49:56

Is there any plan to increase performance for foobar?

I will have a look at it.

Since I wouldn't like to touch the foobar plugin, I would have to modify the decoding library. I have to decide if I want to have platform-specific I/O functions in it, but I am not dogmatic regarding this issue.

Thomas

Reply #7 – 2009-01-29 23:10:14

Quote from: TBeck on 2009-01-29 12:58:55

I see 2 possible reasons for the faster decoding of the FLAC foobar plugin:

1) Bad interaction of TAK's decoding library and the foobar plugin.
2) Possibly the FLAC plugin is using asynchronous file io for reading the source file.

I don't think 1) could result in such a huge difference, therefore I bet on 2).

I just read this post. Obviously the speed difference also exists if the data is being read from memory. Then it can't be 2).

But from this thread I have also learnt that the FLAC decoder has been integrated into foobar.

This can make a huge difference!

For instance, an integrated decoder may output the audio in the format foobar is using internally (I don't know which). TAK's native output format is one array of Longs per channel. Before this data is sent to foosion's foobar input plugin, it is converted to channel-wise interleaved Ints. If this isn't the format foobar is using internally, another conversion is required.

Such operations can require a considerable amount of cpu time if a codec is decoding as fast as TAK!
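
The planar-to-interleaved conversion described above can be sketched in a few lines of Python. This is only an illustration of the data layout, not TAK's or the foobar plugin's actual code; the function name `interleave` and the use of plain lists instead of typed sample buffers are assumptions made for the example.

```python
def interleave(channels):
    """Convert planar audio (one sample list per channel) to interleaved order.

    Hypothetical helper for illustration only; sample widths are ignored.
    """
    if not channels:
        return []
    frame_count = len(channels[0])
    if any(len(channel) != frame_count for channel in channels):
        raise ValueError("all channels must hold the same number of samples")

    channel_count = len(channels)
    interleaved = [0] * (frame_count * channel_count)
    for index, channel in enumerate(channels):
        # Channel `index` fills every channel_count-th slot, starting at `index`.
        interleaved[index::channel_count] = channel
    return interleaved


left = [1, 2, 3]
right = [10, 20, 30]
print(interleave([left, right]))  # [1, 10, 2, 20, 3, 30]
```

Every sample is copied once per conversion, so a second conversion into the host's internal format repeats that work for the whole stream.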

Thomas

Reply #8 – 2009-01-30 00:52:16

Quote from: TBeck on 2009-01-28 13:02:54

Quote from: me7 on 2009-01-27 14:40:40
Is it possible to give a hint about when we can expect SSE2? In a few months, in 6 months, end of the year?

Not really. When I am thinking about a new release, I always try to find a good mixture of the features on my to-do list. This time MD5 had a higher priority.

But I don't think it will take more than half a year.

To take advantage of SSE2 support, you will need a fairly new CPU. Older CPUs with SSE2 support do not implement true 128-bit processing of SSE2 registers; instead they split the calculation into two 64-bit operations, which may even be a bit slower than TAK's current MMX processing!

Some CPUs with full SSE2 implementations are Intel's Core-based ones (including Celeron Core and Pentium Dual Core) and AMD Phenom. I hope I am right...

Just out of curiosity: Why is this so important to you?

Thomas

Thanks for the info.
The reason is that I want to upgrade my library from FLAC to TAK in the near future. If an SSE2 version that might bring significant speed improvements were almost done (I'm talking about transcoding a whole library), I'd wait for it. I don't mean to rush you; I just want to know if now is a good time to transcode.

Reply #9 – 2009-01-30 01:04:40

Yes, I was aware of the integrated FLAC decoder, and that's what makes it the fastest one.
I don't want to get into a discussion about OSS, but I think it is thanks to OSS that foobar's developers integrated the FLAC decoder internally.
It's your code and I respect it.
Thank you for the answers.

Reply #10 – 2009-01-30 13:21:49

Quote from: me7 on 2009-01-30 00:52:16

The reason is that I want to upgrade my library from FLAC to TAK in the near future. If an SSE2 version that might bring significant speed improvements were almost done (I'm talking about transcoding a whole library), I'd wait for it. I don't mean to rush you; I just want to know if now is a good time to transcode.

I didn't feel rushed. I simply was curious. I am always happy about user requests (at least as long as I am able to accomplish them...).

Maybe I will play a bit with SSE2 in the next weeks. But no promises! It always depends on my spare time.

Quote from: IgorC on 2009-01-30 01:04:40

Yes, I was aware of the integrated FLAC decoder, and that's what makes it the fastest one.
I don't want to get into a discussion about OSS, but I think it is thanks to OSS that foobar's developers integrated the FLAC decoder internally.
It's your code and I respect it.

Thank you!

Thomas

Reply #11 – 2009-01-30 16:04:52

Thomas, can't you make a decoder plugin for foobar that doesn't rely on the external library? Maybe that could speed things up.

Reply #12 – 2009-01-31 23:07:15

Quote from: TBeck on 2009-01-30 13:21:49

Quote from: me7 on 2009-01-30 00:52:16
The reason is that I want to upgrade my library from FLAC to TAK in the near future. If an SSE2 version that might bring significant speed improvements were almost done (I'm talking about transcoding a whole library), I'd wait for it. I don't mean to rush you; I just want to know if now is a good time to transcode.

I didn't feel rushed. I simply was curious. I am always happy about user requests (at least as long as I am able to accomplish them...).

Maybe I will play a bit with SSE2 in the next weeks. But no promises! It always depends on my spare time.

I had some time to try it. The first results are a bit disillusioning...

I guess I will have a hard time achieving a speed-up of about 15 percent. That is probably not enough to advise you to wait.

And possibly not even sufficient to integrate it into the codec. But it's always nice to learn something new.

Quote from: ssjkakaroto on 2009-01-30 16:04:52

Thomas, can't you make a decoder plugin for foobar that doesn't rely on the external library? Maybe that could speed things up.

I don't think this would help a lot.

Thomas

Reply #13 – 2009-02-01 15:43:00

Quote from: TBeck on 2009-01-31 23:07:15

I guess I will have a hard time achieving a speed-up of about 15 percent.

On what architecture did you try? You've just mentioned that it'll only be useful on CPUs with full-width (128-bit) SSE units.
(But if it's really hard to get 15% more speed even on full-fledged SIMD128 architectures... well, it may really not be worth it...)

Reply #14 – 2009-02-08 20:35:33

Quote from: alvaro84 on 2009-02-01 15:43:00

Quote from: TBeck on 2009-01-31 23:07:15
I guess I will have a hard time achieving a speed-up of about 15 percent.

On what architecture did you try? You've just mentioned that it'll only be useful on CPUs with full-width (128-bit) SSE units.
(But if it's really hard to get 15% more speed even on full-fledged SIMD128 architectures... well, it may really not be worth it...)

I tried it on a CPU with 128-bit SSE integer units. Indeed, a possible speed improvement of not more than 15 % isn't sufficient to justify the effort, especially because I like to make the code simpler.

I have just removed most of the assembler routines and then optimized the remaining ones.

The binaries are now up to 9 KB smaller and the speed is still a bit better than in 1.0.4. I like it simple.

I have spent this Sunday writing a test suite to verify the function of the modified assembler routines. Everything is fine.

MD5 is also working very well.

It will probably not take very long until a beta release of V1.1.1.

Thomas

Reply #15 – 2009-02-11 21:26:15

Preparing the beta release

All the features listed in the first post have been implemented and tested.

Many thanks to shnutils for his great shntool! It was extremely helpful for verifying TAK's MD5 checksums of the raw audio data.
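
For readers who want to reproduce such a check by hand, here is a minimal Python sketch that hashes only the sample data of a WAV file and skips the header. It assumes the checksum is taken over the PCM bytes exactly as they appear in the WAV data chunk; the function names `raw_audio_md5` and `make_test_wav` are made up for this example and are not part of TAK or shntool.

```python
import hashlib
import io
import wave


def raw_audio_md5(wav_file, chunk_frames=65536):
    """Return the MD5 hex digest of the PCM sample data only (header excluded)."""
    digest = hashlib.md5()
    with wave.open(wav_file, "rb") as wav:
        while True:
            frames = wav.readframes(chunk_frames)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()


def make_test_wav(pcm_bytes, channels=2, sample_width=2, rate=44100):
    """Build an in-memory WAV file around the given PCM bytes (demo helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(pcm_bytes)
    buffer.seek(0)
    return buffer


pcm = bytes(range(256)) * 4  # 256 stereo 16-bit frames of dummy data
# The digest of the file's audio data equals the digest of the bare PCM bytes.
print(raw_audio_md5(make_test_wav(pcm)) == hashlib.md5(pcm).hexdigest())
```

Because the header is excluded, two files with identical audio but different metadata chunks produce the same digest, which is what makes this kind of checksum useful for comparing decoder output.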

One addition to the list of changes: I have removed the support for seek tables.

Now I will have to update the documentation and perform my comprehensive encoding/decoding validation test. Then I will release V1.1.1 Beta.

Thomas

Reply #16 – 2009-02-12 08:35:02

Some questions for developers using the SDK:

1) I would like to remove the German documentation, because it's a bit annoying to always have to update two versions. Ok?

2) I intend to add the source code of the module decoding the container objects to the SDK. While it depends on some other modules which will not yet be published, it nevertheless provides a quite good definition of the stream format. Interested?

Thomas

Reply #17 – 2009-02-12 14:46:41

And another question from me: you mentioned that 1.1.1 drops the support for seek tables. Is the decoder still going to use the seek information stored in files encoded with older versions, or will it be dead weight from now on (not much weight though, I have to admit)?

Reply #18 – 2009-02-12 15:11:01

Quote from: TBeck on 2009-02-12 08:35:02

Some questions for developers using the SDK:

You are not asking me but I will still answer

Quote

1) I would like to remove the German documentation, because it's a bit annoying to always have to update two versions. Ok?

You may want to keep the German docs, with a notice that they are not up to date and which version of TAK they correspond to.

Quote

2) I intend to add the source code of the module decoding the container objects to the SDK. While it depends on some other modules which will not yet be published, it nevertheless provides a quite good definition of the stream format. Interested?

Thomas

That looks like a first step towards open sourcing TAK, I think it is great news! Even if that is all we have it would make the format more "open".

Reply #19 – 2009-02-13 03:57:29

Quote from: TBeck on 2009-02-11 21:26:15

All the features listed in the first post have been implemented and tested.

Nice Thomas!

Reply #20 – 2009-02-14 01:04:07

Quote from: alvaro84 on 2009-02-12 14:46:41

And another question from me: you mentioned that 1.1.1 drops the support for seek tables. Is the decoder still going to use the seek information stored in files encoded with older versions, or will it be dead weight from now on (not much weight though, I have to admit)?

No, the decoder will not use the seek tables created by older versions. And yes, you are losing about 0.002 percent of compression efficiency if you keep the seek table.
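
To put that figure into perspective, here is a small back-of-the-envelope calculation in Python. It assumes the 0.002 percent is relative to the compressed file size; the example sizes and the function name `seek_table_overhead_bytes` are made up for illustration.

```python
def seek_table_overhead_bytes(compressed_size_bytes, overhead_percent=0.002):
    """Estimate the bytes taken by an unused seek table.

    Assumption: the percentage is relative to the compressed size.
    """
    return compressed_size_bytes * overhead_percent / 100.0


album_bytes = 300 * 1000 ** 2    # hypothetical 300 MB album
library_bytes = 500 * 1000 ** 3  # hypothetical 500 GB library

print(f"album:   about {seek_table_overhead_bytes(album_bytes) / 1000:.0f} kB")
print(f"library: about {seek_table_overhead_bytes(library_bytes) / 1000 ** 2:.0f} MB")
```

Under these assumptions the dead weight is a few kilobytes per album, so re-encoding old files just to drop the seek table would hardly pay off.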

Quote from: jido on 2009-02-12 15:11:01

That looks like a first step towards open sourcing TAK, I think it is great news! Even if that is all we have it would make the format more "open".

Fine!

Quote from: ssjkakaroto on 2009-02-13 03:57:29

Quote from: TBeck on 2009-02-11 21:26:15
All the features listed in the first post have been implemented and tested.

Nice Thomas!

Thank you

I hope to release the beta this weekend. The release has been delayed because I am currently upgrading my secondary PC, which I use to validate my releases.

Thomas

Reply #21 – 2009-02-25 20:05:24

Quote from: jido on 2009-02-12 15:11:01

Quote from: TBeck on 2009-02-12 08:35:02
1) I would like to remove the German documentation, because it's a bit annoying to always have to update two versions. Ok?

You may want to keep the German docs, with a notice that they are not up to date and which version of TAK they correspond to.

Nothing is worse than outdated information. If you don't want to update the German documentation, then remove it. That would be fine by me. If you want to keep the old documentation around, then provide a complete version of the old SDK.

Quote from: TBeck on 2009-02-12 08:35:02

2) I intend to add the source code of the module decoding the container objects to the SDK. While it depends on some other modules which will not yet be published, it nevertheless provides a quite good definition of the stream format. Interested?

This could be a step towards luring someone else to port your code to C.

Reply #22 – 2009-02-25 20:37:53

Quote from: TBeck on 2009-01-29 12:58:55

I see 2 possible reasons for the faster decoding of the FLAC foobar plugin:

1) Bad interaction of TAK's decoding library and the foobar plugin.
2) Possibly the FLAC plugin is using asynchronous file io for reading the source file.

I don't think 1) could result in such a huge difference, therefore I bet on 2).

Ad 2) No, it doesn't.
Ad 1) This has been discussed before. See these measurements. Whatever the exact reason may be, the fact that it is in a DLL has a significant and strongly processor architecture dependent effect on the TAK code.

Quote from: TBeck on 2009-01-29 12:58:55

Since I wouldn't like to touch the foobar plugin, I would have to modify the decoding library. I have to decide if I want to have platform-specific I/O functions in it, but I am not dogmatic regarding this issue.

It would be a very unfortunate choice to remove the IO abstraction. I also think this time would be better invested in porting your library to C, so you can use a modern compiler.

Quote from: TBeck on 2009-01-29 23:10:14

But from this thread I have also learnt that the FLAC decoder has been integrated into foobar.

If you could produce a static library with the TAK decoder code, that would be worth a try.

Quote from: TBeck on 2009-01-29 23:10:14

For instance, an integrated decoder may output the audio in the format foobar is using internally (I don't know which). TAK's native output format is one array of Longs per channel. Before this data is sent to foosion's foobar input plugin, it is converted to channel-wise interleaved Ints. If this isn't the format foobar is using internally, another conversion is required.

Such operations can require a considerable amount of cpu time if a codec is decoding as fast as TAK!

The foo_input_std DLL includes the FLAC decoder as a library which outputs the same format as the TAK library (for 16 bit source audio files at least). Since it is available as C source, it can be compiled and linked by the Microsoft C/C++ compiler that generates the whole DLL. This means it can benefit from additional optimizations, for example the "whole program optimization" which works across object file boundaries (object file = intermediate binary output).

Reply #23 – 2009-03-28 17:40:24

Quote from: foosion on 2009-02-25 20:37:53

The foo_input_std DLL includes the FLAC decoder as a library which outputs the same format as the TAK library (for 16 bit source audio files at least). Since it is available as C source, it can be compiled and linked by the Microsoft C/C++ compiler that generates the whole DLL. This means it can benefit from additional optimizations, for example the "whole program optimization" which works across object file boundaries (object file = intermediate binary output).

One of the reasons I still stay with FLAC is that FLAC is well integrated into foobar2000, and I depend on it a lot, much more than on iTunes.

A lot of PC software and some Mac software also support it. TAK needs a "BASS (XMPlay)"-like library so that other programs can use it easily. I much appreciate Thomas' hard work to make TAK fast with good compression results, but the PC still lacks an ACM codec and a DirectShow filter, and no work seems to have been done for the Mac.

Reply #24 – 2009-04-06 19:18:03

In the YALAC thread I found information about ECC support. Will it be available in TAK?
