
Video Codecs - their CPU usage

Hi,
    I'm looking at building a series of videos to play back on a machine with a 206 MHz Intel SA-1110 microprocessor (a StrongARM SA-1110 in all but name). The screen itself is only 240 by 320 but, annoyingly, it uses 16 bits per pixel (5:6:5), and the CPU doesn't have the benefit of SIMD operations. I presume I have to decode to 24-bit (a byte each of red, green & blue). Happily, the CPU does have an excellent barrel shifter, so the conversion should be pretty quick if some good, hand-coded assembly language is used. I can organize it so that the data I am working on is already in the L1 cache for reading, and store multiple registers in one go so I can fill an entire cache line in a single instruction while I've already moved on to the next part of the image.
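
For what it's worth, here is a rough C sketch of the 8:8:8 to 5:6:5 packing I have in mind - the function names are just placeholders, and the real version would be unrolled, hand-scheduled ARM assembly using the barrel shifter and STM - but it shows the shift-and-mask arithmetic:

#include <stdint.h>

/* Pack one 8:8:8 pixel into 5:6:5 by dropping the low bits of each
 * channel: red and blue keep their top 5 bits, green its top 6.      */
static inline uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8u) << 8) |   /* red   -> bits 15..11 */
                      ((g & 0xFCu) << 3) |   /* green -> bits 10..5  */
                      ( b           >> 3));  /* blue  -> bits  4..0  */
}

/* Convert one row of 24-bit pixels to 16-bit 5:6:5.  On the SA-1110
 * the inner loop would pack two pixels per 32-bit register and write
 * eight registers with a single STM (32 bytes - assuming a 32-byte
 * cache line).                                                       */
void row_rgb888_to_rgb565(const uint8_t *src, uint16_t *dst, int width)
{
    for (int x = 0; x < width; x++) {
        dst[x] = pack_rgb565(src[0], src[1], src[2]);
        src += 3;
    }
}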

I have tried REALLY hard to get the Dhrystone MIPS or CoreMark figures, or ANY indication of how much CPU time the decoder takes. With H.264 I know it's best, if you can, to set the encoder to maximum complexity (which uses analysis-by-synthesis) so that it takes your mid-range (2 GHz) PC overnight to compress it - the compression time doesn't matter since it isn't live. In fact, the footage is taken from a GY-HM650 camcorder in AVCHD format set to highest quality (you get 10 minutes of footage). It will transcode the output into H.264, HD MPEG-2 (35/25/19 Mbps) and other formats.

The problem is, I cannot find an example where H.264, Ogg Vorbis, Opus and all of the others are tested for CPU usage on a test sequence. Since so many use 8:8:8 final output, it may even be possible to reduce the accuracy of the decoder (to save time), since the bottom 2 or 3 bits of each channel are lost anyway. I have e-mailed everyone connected to these formats and NOBODY wants to answer the question. I think they just assume it will only be used on GHz+ CPUs - note the plural, since 4-core & 8-core machines are out there.
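
Failing published numbers, my fallback plan is to measure it myself with something like the rough C harness below - decode_frame() is purely a hypothetical stand-in for whichever decoder is under test - which turns seconds per frame into an estimated cycle budget at 206 MHz:

#include <stdio.h>
#include <time.h>

/* Hypothetical decoder entry point - substitute a call into the
 * reference decoder of whichever codec is being measured.          */
extern int decode_frame(void *ctx, void *out_buf);

/* Time 'frames' decodes with clock() and report the average CPU
 * time per frame plus an estimated cycle count at 206 MHz.         */
void benchmark_decoder(void *ctx, void *out_buf, int frames)
{
    clock_t start = clock();
    for (int i = 0; i < frames; i++)
        decode_frame(ctx, out_buf);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    double per_frame = secs / frames;
    printf("%.4f s/frame, ~%.0f cycles/frame at 206 MHz\n",
           per_frame, per_frame * 206e6);
}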

Audio is simply a stereo 16-bit DAC which deals with double-buffering itself (although with a few lines of code, triple buffering is simple).
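
For reference, the "few lines of code" for triple buffering would look roughly like the sketch below - the buffer size and the dac_queue() hook are made up, since the real hardware interface will differ:

#include <stdint.h>

#define NUM_BUFS  3           /* triple buffering                     */
#define FRAMES    1024        /* stereo frames per buffer (arbitrary) */

static int16_t bufs[NUM_BUFS][FRAMES * 2];   /* interleaved L/R       */
static volatile int play_idx = 0;            /* buffer the DAC reads  */
static int fill_idx = 0;                     /* buffer being filled   */

/* Hypothetical hardware hook: queue a full buffer for the DAC.       */
extern void dac_queue(const int16_t *buf, int frames);

/* DAC-finished interrupt: move the DAC on to the next buffer.        */
void dac_irq_handler(void)
{
    play_idx = (play_idx + 1) % NUM_BUFS;
    dac_queue(bufs[play_idx], FRAMES);
}

/* Decoder side: return the next buffer to fill, or NULL if we have
 * wrapped around onto the buffer the DAC is still playing.           */
int16_t *next_fill_buffer(void)
{
    int next = (fill_idx + 1) % NUM_BUFS;
    if (next == play_idx)
        return 0;
    fill_idx = next;
    return bufs[fill_idx];
}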

Can someone PLEASE help? I promise I'm not one of those people who would ask for an expert's time without having looked very, very hard myself. A test sequence at different sizes would be ideal, but just one size will give me a good enough guesstimate. I've read that some of these CPUs have an extra 2K instruction (or maybe mixed) cache which I believe can also be switched into being a 2K scratchpad. That's what the original PlayStation had: no cache, just a 32K scratchpad for code & data. Never underestimate the power that can bring. The Atari Jaguar had about 12 wait states for a cache miss, so everyone coded their games in 16K lumps that were transferred in as needed. They even managed a reasonable version of Doom - John Carmack is about the best programmer I've ever met. Consider his historic use of a dummy read to fill the destination cache line when he was drawing walls: take that one MOV CL,[EDI] out of the loop and the game runs at half the original speed!
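
Just to illustrate that last trick, here is my own rough C sketch of the dummy-read idea (not Carmack's actual code, and the 32-byte line size is an assumption): touch each destination cache line once before streaming writes into it, so the fetch overlaps the work instead of stalling the first store.

#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 32   /* assumed cache line size */

/* Copy a span of pixels.  The dummy read pulls each destination
 * line into the cache before we write to it - the same idea as the
 * MOV CL,[EDI] mentioned above.  'volatile' stops the compiler
 * from optimising the read away.                                   */
void copy_span(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i += CACHE_LINE) {
        volatile uint8_t touch = dst[i];     /* dummy read = prefetch */
        (void)touch;

        size_t chunk = (len - i < CACHE_LINE) ? len - i : CACHE_LINE;
        for (size_t j = 0; j < chunk; j++)
            dst[i + j] = src[i + j];
    }
}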

Many thanks in advance,
CC