Discwriter/converter processing two files simultaneously on HT CPU

With FB2K 0.9 or 0.9.1 beta 1, the converter/discwriter always seems to process two files simultaneously when converting to FLAC, WAV etc. The title bar of the processing window reads "Converting - X,Y/N", where X and Y are the numbers of the two files being converted and N is the total number of files to process. Presumably, the hyperthreading CPU in my PC is making FB2K think it should run two conversion threads for two processors (you see two flac.exe processes in the Task Manager, for example). Of course, on an HT CPU the "second processor" does not actually exist, and I'd rather FB2K didn't make every effort to soak up 100% of the CPU time either way, because I usually want conversion to trundle on in the background while I continue to use the machine for other things.

I haven't compared the speed of conversion against v0.8.3, but speed isn't the problem. Hideous amounts of NTFS disc fragmentation are a much greater concern. According to the Windows defragmenter, FLAC files converted from 3-4 minute WAVs at CD quality end up in many fragments. Going from FLAC to WAV is predictably worse: a quick test a moment ago on a freshly defragmented (twice) drive resulted in two 5 minute FLAC files giving WAV files in 22 and 21 fragments, respectively. Either defragmenting the disc or moving converted files to another HDD and back again is time consuming, and adds to the already heavily stressed disc operations caused by all the seeking occurring during conversion in the first place.

I can't find any existing references to this behaviour on the forum or after a brief scan of the docs, and I see nothing in the Preferences that might turn off dual processing. Is there any way to disable this well-intentioned but poorly implemented feature, or perhaps trick FB2K into only processing one file at a time?

Perhaps in future FB2K could estimate the size of the output file and pre-allocate a certain size file on disc up front to reduce or eliminate fragmentation problems. For uncompressed output formats it should be easy to calculate a relatively precise size; for compressed formats, perhaps a file as big as the uncompressed equivalent of the source should be used, or as much space as is left on the disc, whichever is larger. The over-allocation would ensure reduced fragmentation and the files would of course shrink to their correct extent when conversion completed. Some fragmentation would still occur as new files got written into the gaps left by the shrinkages, but it'd still be a lot better than the present behaviour.
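
A rough sketch of the pre-allocation idea, in Python rather than anything FB2K actually uses. The function names and the 44-byte header size are assumptions made for illustration (real WAV files can carry extra chunks), and extending a file with truncate() only reserves the length: whether the filesystem hands out contiguous clusters for it is up to the OS.

```python
WAV_HEADER_BYTES = 44  # assumed: canonical PCM WAV header with no extra chunks


def estimate_wav_bytes(seconds, sample_rate=44100, channels=2, bits_per_sample=16):
    """Estimate the size of an uncompressed PCM WAV file from its duration."""
    bytes_per_second = sample_rate * channels * (bits_per_sample // 8)
    return WAV_HEADER_BYTES + round(seconds * bytes_per_second)


def write_preallocated(path, chunks, reserved_bytes):
    """Reserve reserved_bytes up front, write the data, then trim to the real size."""
    with open(path, "wb") as f:
        f.truncate(reserved_bytes)  # extend the empty file before any data is written
        written = 0
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
        f.truncate(written)  # shrink (or leave as is) to the bytes actually written
    return written
```

For a 5 minute CD-quality track, estimate_wav_bytes(5 * 60) gives 52,920,044 bytes, i.e. roughly 50MB to reserve. For a compressed target such as FLAC, the same figure could serve as the upper bound suggested above, with the final truncate() shrinking the file once encoding finishes.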

Thanks...

Reply #1
Why do you even care whether the files are fragmented?

Reply #2
Quote
Why do you even care whether the files are fragmented?
Personally, my main three reasons are:
  • I noticed this in the first place because of the amount of noise my HDD started making during conversion with 0.9; this was because of excessive seeking. That activity will undoubtedly reduce the life of the component. I don't want to have to replace an HDD early because of a strange implementation choice in an application which is based on a false assumption (the machine does not have two CPUs).

  • Any NTFS drive suffers increasing fragmentation over time, resulting in reduced filesystem and operating system performance. System files (in particular cache related and logging items) tend to be prime candidates for fragmentation. As any competent Windows user of any vintage should be aware, it is prudent to run the disk defragmenter tool from time to time. On Windows XP, the tool looks at the amount of fragmentation and based on some internal criteria says whether or not defragmentation is recommended. Having converted several hundred WAV files to FLAC with v0.9, I ran this tool to check system level fragmentation, with some suspicion of the converter introducing problems too. The disc was very badly fragmented. The defragmentation process took an extraordinarily long time and stressed the disc even further (seeking, read/write cycles, elevated operating temperature). On those occasions when I want to run the defragmentation tool just to reduce or remove general "wear and tear" fragmentation, it is ridiculous to have to wait for hundreds of FLAC (or other format) files to be defragmented just because a sub-optimal algorithm in a conversion application quite deliberately introduced excessive fragmentation in the first place.

  • The data rate of most audio files is very low compared to the ability of the disc to deliver the data even in the face of enormous fragmentation, but people use their machines for other things than just playing back a single stream. I do multitrack audio and video editing on my PC for example. Most files are saved on a different HDD from the one containing the FLAC files in question but some must live on the primary disc just because of making the best use of space across the HDDs present in the machine. If I have heavily fragmented FLAC files that I later delete, the next file saved to that part of the disc will itself be fragmented as NTFS soaks up the fragmented free space. For high bandwidth applications such as those that I commonly run, the resultant effective reduction in disc throughput can be catastrophic (projects that previously ran fine in editors start to stutter and lose sync).
All in all, fragmentation is just "a bad idea", because it loads the software, loads the hardware, and reduces the performance of any process reading or writing to fragmented areas. No application which is knowingly and avoidably causing excess fragmentation to a disc should be considered as anything other than buggy IMHO. Assuming a disagreement on that count, at least a "turn this off" control would, surely, be a reasonable work-around, if it is not possible to modify the converters to write and then fill empty files of an anticipated maximum size as I suggested previously?

Reply #3
Quote
I don't want to have to replace an HDD early because of a strange implementation choice in an application which is based on a false assumption (the machine does not have two CPUs).

If you're concerned about it that much why not just turn off Hyper-Threading in the BIOS? The whole point of Hyper-Threading is to allow applications to run multi-threaded like this.

Reply #4
By keeping hyper-threading on, he can use part of his CPU for browsing operations, word processing, etc.

A solution to this problem which I read about on another thread (search?) was to restrict foobar2000's processor affinity to a single core in the Task Manager.

On top of this, if you're running such high performance, surely you can afford a better defragmenter? Diskeeper will improve the way you defragment by a huge amount. On my system, a full defrag takes 15 minutes on a 50 gig partition with 15% free space and 35% fragmentation.

Good luck (and welcome to HA),
Shade.

Reply #5
Quote
If you're concerned about it that much why not just turn off Hyper-Threading in the BIOS? The whole point of Hyper-Threading is to allow applications to run multi-threaded like this.
That would indeed stop FB2K from running two converter processes, but surely the point of hyperthreading is to improve overall system performance by using otherwise idle CPU cycles for independent instruction sequences. Since an operating system such as Windows is already running plenty of independent processes, it is quite capable of scheduling things optimally itself, within the usual limits of Microsoft's ability to write an effective scheduler.

Win2K performs poorly on HT processors with HT enabled but WinXP performs better, because WinXP is "aware" that there are not really two identical processors and you shouldn't try to entirely schedule a thread on one of them, because one is, in effect, much, much slower than the other. If you try running Win2K in such a mode you will from time to time see strange effects where an application appears to slow down significantly despite overall low CPU usage because of scheduling problems. WinXP doesn't do this (bugs aside). It's one of the reasons I originally upgraded to WinXP a few years ago (my P4 base system is actually pretty old now).

Running two converter threads second-guesses the operating system and makes the Win2K mistake of assuming that there are two symmetric CPUs when this is not the case. The OS will probably just end up context switching more often than would otherwise have been the case. Turning off hyperthreading will make all operations on my machine run more slowly, for the sake of sidestepping a feature in one single application. However, I've just seen a notification for another reply that gives a solution, so on to that now...

Reply #6
Quote
All in all, fragmentation is just "a bad idea", because it loads the software, loads the hardware, and reduces the performance of any process reading or writing to fragmented areas. No application which is knowingly and avoidably causing excess fragmentation to a disc should be considered as anything other than buggy IMHO. Assuming a disagreement on that count, at least a "turn this off" control would, surely, be a reasonable work-around, if it is not possible to modify the converters to write and then fill empty files of an anticipated maximum size as I suggested previously?

Multithreading is a good idea, as parallelising tasks is a very good idea. You can work around your particular problem by downloading the freeware PsExec and setting up a CMD file (e.g. foobar2000.cmd):
Code:
psexec -l -d -a 1 "%programFiles%\foobar2000\foobar2000.exe"

Put this file into the same directory as the one where you extracted psexec.exe. A CMD file is an ordinary text (TXT) file renamed with a .cmd extension.
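
For anyone wondering how the "-a 1" above relates to the CPU0/CPU1 checkboxes in the Task Manager, here is a small Python sketch. It assumes PsExec numbers processors from 1 (so 1 is the lowest numbered CPU, shown as CPU0 in the Task Manager) and that the underlying Windows affinity setting is a bitmask with bit 0 standing for CPU0. The function name is made up for illustration; it only does the arithmetic and does not change any process.

```python
def psexec_cpus_to_mask(cpus):
    """Turn a PsExec-style list of 1-based CPU numbers into an affinity bitmask."""
    mask = 0
    for cpu in cpus:
        if cpu < 1:
            raise ValueError("CPU numbers are assumed to start at 1")
        mask |= 1 << (cpu - 1)  # CPU number 1 maps to bit 0 (Task Manager's CPU0)
    return mask
```

Under those assumptions, psexec_cpus_to_mask([1]) is 1 (CPU0 only, one conversion at a time) and psexec_cpus_to_mask([1, 2]) is 3 (both logical CPUs of an HT processor).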

Reply #7
Quote
Why do you even care whether the files are fragmented?
Personally, my main three reasons are:
  • I noticed this in the first place because of the amount of noise my HDD started making during conversion with 0.9; this was because of excessive seeking. That activity will undoubtedly reduce the life of the component. I don't want to have to replace an HDD early because of a strange implementation choice in an application which is based on a false assumption (the machine does not have two CPUs).

This is wrong for several reasons. First, running two threads at once produces the same number of seeks as running one thread twice. This should be obvious if you think about how writing works.

Second, even if this somehow affected seeks, it's irrelevant. It's going to fail regardless, and trying to optimize its failure point is beyond pointless. You don't even know why it'll fail.

Finally, your machine has two logical CPUs (and has set the required registers telling the OS that it has 2 CPUs), so it's a correct assumption on the software's part.

Quote
Any NTFS drive suffers increasing fragmentation over time, resulting in reduced filesystem and operating system performance. System files (in particular cache related and logging items) tend to be prime candidates for fragmentation. As any competent Windows user of any vintage should be aware, it is prudent to run the disk defragmenter tool from time to time. On Windows XP, the tool looks at the amount of fragmentation and based on some internal criteria says whether or not defragmentation is recommended. Having converted several hundred WAV files to FLAC with v0.9, I ran this tool to check system level fragmentation, with some suspicion of the converter introducing problems too. The disc was very badly fragmented. The defragmentation process took an extraordinarily long time and stressed the disc even further (seeking, read/write cycles, elevated operating temperature). On those occasions when I want to run the defragmentation tool just to reduce or remove general "wear and tear" fragmentation, it is ridiculous to have to wait for hundreds of FLAC (or other format) files to be defragmented just because a sub-optimal algorithm in a conversion application quite deliberately introduced excessive fragmentation in the first place.


Why do you care if your FLAC files are fragmented?  Are you disk limited when you play them back?  That seems unlikely.

Quote
The data rate of most audio files is very low compared to the ability of the disc to deliver the data even in the face of enormous fragmentation, but people use their machines for other things than just playing back a single stream. I do multitrack audio and video editing on my PC for example. Most files are saved on a different HDD from the one containing the FLAC files in question but some must live on the primary disc just because of making the best use of space across the HDDs present in the machine. If I have heavily fragmented FLAC files that I later delete, the next file saved to that part of the disc will itself be fragmented as NTFS soaks up the fragmented free space. For high bandwidth applications such as those that I commonly run, the resultant effective reduction in disc throughput can be catastrophic (projects that previously ran fine in editors start to stutter and lose sync).
All in all, fragmentation is just "a bad idea", because it loads the software, loads the hardware, and reduces the performance of any process reading or writing to fragmented areas. No application which is knowingly and avoidably causing excess fragmentation to a disc should be considered as anything other than buggy IMHO. Assuming a disagreement on that count, at least a "turn this off" control would, surely, be a reasonable work-around, if it is not possible to modify the converters to write and then fill empty files of an anticipated maximum size as I suggested previously?


The impact of fragmentation on modern file systems is so small I'm surprised you even care.  Have you actually measured the effect or are you just assuming that since FAT had issues with fragmentation in the 80s that NTFS will today?

The more I think about this, the less this whole idea makes sense. If you're creating flac files, then deleting them, the fragmentation is the same no matter how they're created. The FS has no idea what the files are, it's just writing out blocks. So if you interleave 10000 blocks, or write them one file at a time and then delete the blocks, you still have the exact same 10000 block (in the worst case anyway) hole.

So even if fragmentation did matter, there's nothing you could do about it, aside from never writing files to the disk.

Reply #8
Quote from Shade[ST], Apr 21 2006, 04:44 PM
A solution to this problem which I read about on another thread (search?)
I did! Honest guv, but I couldn't find any relevant threads using the various search criteria I tried.

Quote
restrict foobar2000's processor affinity to a single core in the Task Manager.
Great idea, I hadn't even considered that as a possibility. This does indeed work. Wonder if Windows can be made to launch FB2K in that mode every time... It'd still be nice to have a UI switch to control this, but it's a viable workaround either way.

Incidentally, I did some benchmarking to see whether or not FB2K was faster with two conversions running at once. The results are fairly predictable but worth mentioning - see below.

Quote
On top of this, if you're running such high performance, surely you can afford a better defragmenter?  Diskeeper will improve the way you defragment by a huge amount.
I'll look into it. The Home edition looks pretty cheap but I can't help feeling I'm paying for my impatience.

Back-of-envelope benchmarking

Timed with a hand-held stopwatch on a machine with lots of other stuff running, so take with a pinch of salt, really. 4 FLAC files are converted to WAV format (44.1 kHz/16-bit) at once using "Convert to same directory" from a playlist. The Explorer view for the target directory is open, as is the Task Manager. The WAV files are deleted (completely - not just sent to the recycle bin) between each run.

Using CPU0 only (one conversion at a time):

13.73s / 14.35s / 12.60s / 13.29s / 12.98s: 13.39s

CPU usage around 30%, fairly variable, showing the usual usage pattern for my machine of lots of activity on CPU0 and only minor activity on CPU1 (the Task Manager shows 50% utilisation when the main CPU is fully in use, because it assumes the processors have equal power - but in a single CPU HT system, CPU1 is virtual and of much lower power than CPU0). Clearly the task is disc-bound.

CPU0 and CPU1, same test:

14.93s / 15.11s / 16.55s / 15.67s / 16.20s: 15.69s (about 17% slower)

CPU usage around 50%, quite variable, showing a very similar pattern of spikes for CPU0 and 1 in the Task Manager performance graph. Again, clearly the task is disc bound, though there seems to be more CPU usage on average for an overall slower result.

A more common test, though - at least in my case - is WAV to FLAC, since it's not often I'm decompressing files in a batch rather than compressing them. This ought to be CPU bound and the numbers bear that out. Using just two of the WAVs from before with otherwise the same conditions and on CPU0 only:

44.55s / 44.65s: 44.60s

I only ran it twice since it was taking longer and I got bored. CPU0 was at 50% and CPU1 at 1-2%, so 51-52% overall. CPU0 and CPU1:

37.49s / 37.05s: 37.27s (just over 16% faster)

In this case, the CPU usage was 100% throughout, so there was clearly better resource utilisation and this time the dual conversion went about as much faster as it did slower in the disc bound case.
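
For anyone who wants to check the arithmetic, the averages and percentages above can be reproduced with a few lines of Python. The timings are the stopwatch figures from this post; the variable and function names are just labels chosen for this sketch.

```python
from statistics import mean

# Stopwatch timings in seconds, copied from the runs above.
flac_to_wav_one_cpu = [13.73, 14.35, 12.60, 13.29, 12.98]
flac_to_wav_two_cpus = [14.93, 15.11, 16.55, 15.67, 16.20]
wav_to_flac_one_cpu = [44.55, 44.65]
wav_to_flac_two_cpus = [37.49, 37.05]


def percent_change(baseline, other):
    """Change in mean run time relative to the baseline (positive means slower)."""
    return (mean(other) - mean(baseline)) / mean(baseline) * 100.0


for label, one, two in [
    ("FLAC to WAV", flac_to_wav_one_cpu, flac_to_wav_two_cpus),
    ("WAV to FLAC", wav_to_flac_one_cpu, wav_to_flac_two_cpus),
]:
    print(f"{label}: {mean(one):.2f}s vs {mean(two):.2f}s, "
          f"{percent_change(one, two):+.1f}% run time")
```

This gives means of 13.39s and 15.69s for the disc bound case (about 17% more run time with two conversions at once) and 44.60s and 37.27s for the CPU bound case (about 16% less). With only five and two runs respectively, and hand timing, the differences are suggestive rather than conclusive.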

Quote
Multithreading is a good idea, as parallelising tasks is a very good idea.
It depends on the conditions, I imagine. If you've only one CPU then running two conversions in parallel is surely a bad idea, since all it does is cause more context switches by having one extra busy process running (consider the pathological case where the converter(s) is/are the only process(es) running). I come from an embedded and in particular ARM-based background, where things like that are particularly expensive. Otherwise, why stop at two simultaneous conversions? Because it'd only get slower with more.

Even if you've two CPUs, the rough benchmarks I did in the previous message show that things aren't always quicker. If the two processes were entirely independent then even in the HT case, yes, you'd always expect to see a performance improvement, but since both rely on the same physical disc, it seems that there comes a point where the increased disc activity caused by the two writing processes outweighs the savings elsewhere.

Quote
Code:
psexec -l -d -a 1 "%programFiles%\foobar2000\foobar2000.exe"

Great, thanks, that's a neat way to automate Shade[ST]'s solution.

Reply #9
Quote
This is wrong for several reasons. First, running two threads at once produces the same number of seeks as running one thread twice. This should be obvious if you think about how writing works.
Indeed, but nonetheless the benchmarking showed the disc bound case taking longer and the drive is certainly a lot noisier. If we expect one block to be written after another - just because they happen to correspond to different logical files, doesn't mean the drive has to know or care - then why should this be the case? Perhaps an interaction with updating the MFT/containing folder, or an interaction with either the Windows or drive's own caches, is the cause. Does anyone know enough about the operation of NTFS to comment? I've tried looking around on TechNet a bit but there's nothing too obvious there.

Perhaps NTFS writes the files out in blocks that are significant in size with respect to total track capacity but smaller than the chunks of data written between each conversion thread context switch. If so, as we switch between files the HDD has to change between the different blocks, which in turn means it seeks back and forth between tracks which it would not do if just writing one sequential file.

It might take a bit longer to write the files because during a normal sequential write we may expect the drive to be able to access consecutive sectors as the disc spins underneath the head, but perhaps the context switches between the conversion processes are sufficient to interrupt supply of data from the cache(s) and make the disc have to wait for the relevant sector to come around by rotation.

Quote
Second, even if this somehow affected seeks, it's irrelevant. It's going to fail regardless, and trying to optimize its failure point is beyond pointless. You don't even know why it'll fail.
Consideration of how hardware performs is surely a duty of any software engineer when writing code that is expected to execute on or use resources of said hardware. A disc that is online for a given number of hours but idle will last longer than a disc online for the same period but constantly seeking between tracks.

Quote
The impact of fragmentation on modern file systems is so small I'm surprised you even care.
Well Microsoft care, for a start. Take the concern about MFT fragmentation as an example: http://support.microsoft.com/?kbid=174619

FB2K's approach has always seemed to be one of efficiency and quality and IMHO it would be a terrible shame to ignore fragmentation on some woolly assumption that "it's not that bad these days" (despite big jumps in capacity, have HDD seek times really improved so much that we don't care anymore...?).

Quote
Have you actually measured the effect or are you just assuming that since FAT had issues with fragmentation in the 80s that NTFS will today?
I never owned a Windows PC based on FAT; I didn't buy into the Windows market until 2K arrived. I've had problems in the past with strenuous AV applications stuttering which was solved after defragmentation, though it's rare (Logic Delta was particularly annoying for halting playback or recording on the grounds it had decided the CPU or HDD bandwidth were insufficient). It's rare perhaps because I don't have many applications that actively cause fragmentation as a routine side effect of their operations; it just occurs because of the natural pattern of deleting and adding files to the disc that creates variable sized free space holes. With FB2K 0.9 that is no longer the case.

Quote
The more I think about this, the less this whole idea makes sense.  If you're creating flac files, then deleting them, the fragmentation is the same no matter how they're created.
Only if you delete the files that were written concurrently in pairs. If you only delete one of a pair, you're left with a large number of small holes in the free map.

Quote
The FS has no idea what the files are, it's just writing out blocks. So if you interleave 10000 blocks, or write them one file at a time and then delete the blocks, you still have the exact same 10000 block (in the worst case anyway) hole.
If you delete both files, yes. But if you write two lots of 5000 blocks interleaved, then delete one of them, you're left with 5000 holes (...aren't you? Maybe I'm wrong about how NTFS organises data).
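
To make the counting concrete, here is a toy Python model of the argument. It is only a sketch: it treats the disc as a flat list of blocks, alternates single blocks between two files "A" and "B", and knows nothing about how NTFS really allocates clusters (the fragment sizes measured later in this thread suggest much larger interleaved runs). All names are invented for the example.

```python
from itertools import groupby


def delete_file(block_map, name):
    """Return a copy of the block map with every block of the named file freed."""
    return [None if owner == name else owner for owner in block_map]


def free_runs(block_map):
    """Lengths of each contiguous run of free (None) blocks."""
    return [
        len(list(run))
        for is_free, run in groupby(block_map, key=lambda owner: owner is None)
        if is_free
    ]


interleaved = ["A", "B"] * 5000              # two files written block by block in turn
sequential = ["A"] * 5000 + ["B"] * 5000     # the same two files written one after the other

holes_interleaved = free_runs(delete_file(interleaved, "B"))
holes_sequential = free_runs(delete_file(sequential, "B"))
holes_both_deleted = free_runs(delete_file(delete_file(interleaved, "A"), "B"))
```

In this model, deleting only B leaves 5000 one-block holes in the interleaved layout and a single 5000-block hole in the sequential layout, while deleting both files leaves one 10000-block hole either way, which is the case the quoted text describes.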

Thanks for all the responses so far, BTW.

Reply #10
Quote
13.73s / 14.35s / 12.60s / 13.29s / 12.98s: 13.39s

CPU usage around 30%, fairly variable, showing the usual usage pattern for my machine of lots of activity on CPU0 and only minor activity on CPU1 (the Task Manager shows 50% utilisation when the main CPU is fully in use, because it assumes the processors have equal power - but in a single CPU HT system, CPU1 is virtual and of much lower power than CPU0). Clearly the task is disc-bound.

CPU0 and CPU1, same test:

14.93s / 15.11s / 16.55s / 15.67s / 16.20s: 15.69s (about 17% slower)


I'm confused.  With the second test, what exactly did you do?  Record 1 file at a time with HT enabled?  What does that show exactly?

Quote
Indeed, but nonetheless the benchmarking showed the disc bound case taking longer and the drive is certainly a lot noisier.


First, I said constant number of seeks.  Not constant running time.

Second, I don't understand how you benchmarked, so it's not even clear to me that the run time was different.

Quote
Perhaps NTFS writes the files out in blocks that are significant in size with respect to total track capacity but smaller than the chunks of data written between each conversion thread context switch. If so, as we switch between files the HDD has to change between the different blocks, which in turn means it seeks back and forth between tracks which it would not do if just writing one sequential file.


Blocks are always written out in integer multiples because the file system caches MBs worth of data and then flushes it when the disk is idle (since reads but not writes block).

Quote
It might take a bit longer to write the files because during a normal sequential write we may expect the drive to be able to access consecutive sectors as the disc spins underneath the head, but perhaps the context switches between the conversion processes are sufficient to interrupt supply of data from the cache(s) and make the disc have to wait for the relevant sector to come around by rotation.


I think you're confusing a few issues here. First, if you're using HT, there are no additional context switches at all (which is the whole point of HT). Second, the process itself is just handing off data to the OS and then to the driver. The driver then collects all writes from all processes into a large buffer and then flushes them when the disk is idle. I don't remember how Windows does this, but it's probably similar to Linux, where you can cache for up to 30 seconds and hundreds of MBs worth of data (assuming you have that much RAM).

Quote
Consideration of how hardware performs is surely a duty of any software engineer when writing code that is expected to execute on or use resources of said hardware. A disc that is online for a given number of hours but idle will last longer than a disc online for the same period but constantly seeking between tracks.


Clearly false. The time to failure is a random variable. You can't say when it will fail, only the odds that it will fail. Odds that you don't know anything about.

Quote
Well Microsoft care, for a start. Take the concern about MFT fragmentation as an example: http://support.microsoft.com/?kbid=174619


What does MFT fragmentation have to do with writing FLAC files? 

Quote
I never owned a Windows PC based on FAT; I didn't buy into the Windows market until 2K arrived. I've had problems in the past with strenuous AV applications stuttering which was solved after defragmentation, though it's rare (Logic Delta was particularly annoying for halting playback or recording on the grounds it had decided the CPU or HDD bandwidth were insufficient). It's rare perhaps because I don't have many applications that actively cause fragmentation as a routine side effect of their operations; it just occurs because of the natural pattern of deleting and adding files to the disc that creates variable sized free space holes. With FB2K 0.9 that is no longer the case.


No you don't. 

Quote
Only if you delete the files that were written concurrently in pairs. If you only delete one of a pair, you're left with a large number of small holes in the free map.


Quote
If you delete both files, yes. But if you write two lots of 5000 blocks interleaved, then delete one of them, you're left with 5000 holes (...aren't you? Maybe I'm wrong about how NTFS organises data).


Think about what that would require.  You're telling me that you go through and delete every other song whenever you encode FLAC files.  Do you really do this on a regular basis?  Can I ask why?

Reply #11
Hmmm. The board doesn't allow nested quotes, so apologies if the context is quite heavily snipped.

Quote
I'm confused. With the second test, what exactly did you do? Record 1 file at a time with HT enabled?
I repeated the first test with the process affinity for FB2K set to CPU0 and CPU1, rather than just CPU0. This means FB2K did exactly the same set of conversions as before (four FLAC files to WAV files), but ran two simultaneous conversion processes at once for two pairs of files, rather than one at a time for four files.

Quote
What does that show exactly?
I think it shows that with I/O bound conversions the new behaviour in FB2K results in reduced performance. The later tests showed that in CPU bound operations the new behaviour results in increased performance, by (possibly just by chance) a similar percentage.

Quote
I said constant number of seeks.  Not constant running time.
OK, fair enough; so do you have any thoughts on why the running time is variable if disc activity should be equivalent either way? There must surely be some extra activity to account for the longer running time in the I/O bound case not seen in the CPU bound case (unless it's not I/O bound at all - the significantly less than 100% CPU utilisation during higher bitrate output from the converter could be due to something else I suppose).

Quote
Blocks are always written out in integer multiples because the file system caches MBs worth of data and then flushes it when the disk is idle (since reads but not writes block).
One might expect this; however, the target disc indicator LED shows nothing before the conversion process starts, then lights up immediately and stays illuminated more or less continuously while the conversion process runs. The disc itself is noisy then, but all activity ceases as soon as the conversion window closes; the LED goes out and the disc goes quiet. If there's any write-behind caching happening, it's over such small amounts of data that the delays involved are imperceptible to the user. Perhaps the caching strategy in WinXP is not what you expect it to be, or does not work as effectively as you hope.

I converted two FLAC files to WAVs on a freshly defragmented 250GB/4K cluster NTFS drive a moment ago using FB2K 0.9.1 beta 1 with its default behaviour of two concurrent conversions. For the 43MB and 35MB WAV files, I get 225 and 168 fragments respectively - around 200K per fragment, but this is not constant; converting other files, including WAV back to FLAC, resulted in average fragment sizes of 76 to 250K (with a FLAC file holding both the largest average and smallest average size, so it doesn't seem to be a function of the rate of conversion, which might have indicated some fine grained time-based write cache flushing). On this 1GB machine there is over 470MB presently allocated to the system cache according to the Task Manager (though AIUI this includes VM allocation) and the conversion operation to WAV took less than 10 seconds, so it looks as if there was plenty of opportunity to collate the write operations in the cache and flush them (much) later on. But that didn't happen.
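
The "around 200K per fragment" figure can be checked with a short Python calculation. This is only a sketch of the arithmetic: it assumes the 43MB and 35MB sizes are in units of 1024 x 1024 bytes and that a cluster is 4K as stated above; the function names are invented for the example.

```python
CLUSTER_KIB = 4  # 4K clusters, as on the drive described above


def average_fragment_kib(file_mib, fragments):
    """Average fragment size in KiB for a file of file_mib MiB split into fragments pieces."""
    return file_mib * 1024 / fragments


def average_fragment_clusters(file_mib, fragments):
    """The same average expressed in clusters."""
    return average_fragment_kib(file_mib, fragments) / CLUSTER_KIB


for size_mib, fragments in [(43, 225), (35, 168)]:
    print(f"{size_mib}MB in {fragments} fragments: "
          f"{average_fragment_kib(size_mib, fragments):.1f}K per fragment, "
          f"{average_fragment_clusters(size_mib, fragments):.0f} clusters")
```

That works out at roughly 196K and 213K per fragment, or about 49 and 53 clusters, so each file is being laid down in runs of a few dozen clusters rather than cluster by cluster.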

Quote
I think you're confusing a few issues here. First, if you're using HT, there are no additional context switches at all (which is the whole point of HT).
Well the OS still switches to other processes, such as the FB2K UI thread which updates the progress bar. Assuming XP does not hold exclusive use of the virtual CPU for one of the conversion processes while this goes on, but instead uses a general switching policy, then there should still be two context switches going on when the two conversion threads are switched out (the state of two threads must be stored, even though when those two threads were running concurrently it wasn't necessary to switch between them). In that context for simplicity we might consider them one big thread that's twice as expensive to switch.

You're right that mentioning switching in the original context was misleading since even if there's only one thread running to convert each file, there still need to be 'n' threads in total to convert 'n' files whether run all at once or one at a time. Perhaps the HT case does even better because the two threads do get allocated to CPUs in a less general fashion (e.g. a priority change; in effect the time slice gets longer). In any event I'm way out of my depth when it comes to such specific details of WinXP process scheduling and its interaction with HT virtual CPUs. The fragment size being reported by the Diskeeper trial I'm using to get the fragment data also doesn't seem to be directly related to any obvious factors such as file size, conversion type (WAV to FLAC or FLAC to WAV) or conversion time.

Quote
Second, the process itself is just handing off data to the OS and then to the driver. The driver then collects all writes from all processes into a large buffer and then flushes them when the disk is idle. I don't remember how Windows does this, but it's probably similar to Linux where you can cache for up to 30 seconds and hundreds of MBs worth of data (assuming you have that much RAM).
There is clearly not anywhere near as large a collection of data prior to flushing since the delay before disc activity commences during conversion, and the time until disc activity ceases after conversion, is imperceptible. Or am I misinterpreting the activity indications from the HDD?

Quote
(I wrote: "A disc that is online for a given number of hours but idle will last longer than disc online for the same period but constantly seeking between tracks.")

Clearly false. The time to failure is a random variable. You can't say when it will fail, only the odds that it will fail. Odds that you don't know anything about.
Hmmm, of course failure at a particular time can only be expressed as a probability, but are you saying that the (average) time to hardware failure in a moving parts device is not influenced by how much that hardware is used? I find that hard to accept, but would welcome a pointer to documentation to the contrary to further my understanding of modern HDDs. It could be that in modern designs the MTBF of the head assemblies is equal to or greater than that of the motor so that there is no difference between the drive running idle, and the drive running while continuously reading and writing data from (say) random sectors. Is this the case?

Quote
(I wrote quoting http://support.microsoft.com/?kbid=174619)

What does MFT fragmentation have to do with writing FLAC files?
You asked, generally, why anyone would care about NTFS fragmentation. I picked out that example as I happened to be reading the article at the time.

Now, perhaps I want to write those FLAC files to DVD. They'll be read at high speed. The DVD writing speed may be reduced (assuming buffer underrun protection, otherwise we'll make drinks coasters) if the FLAC file fragmentation slows data transfer sufficiently. I might just want to copy the files to a backup HDD as quickly as possible. Or I might delete various FLAC files some time later; perhaps I don't like the track, or it is a compilation album track and I buy the original album, deciding to keep that version instead of the compilation one. Who's to say what data will end up being written into the mess of free space holes that get left behind by deletions? Who's to say what purpose it will be used for? Who can say whether it will be performance critical? The MFT could end up extending into that space - Microsoft seem to think this would matter.

I'd like to measure the time taken to copy fragmented converted WAV files to a clean drive and compare with copying unfragmented files, but the read cache gets in the way. I need some copying/benchmarking tool which bypasses or invalidates the cache contents for, ideally, both read and write. Do you know of any? Then we could put some real numbers to this debate and see whether or not it really is a problem. I'm happy either way because of the CPU affinity work-around posted earlier, but it'd be nice to know whether or not I'm just completely deluded...
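
The timing half of such a test is easy to sketch; the cache is the hard part. The Python below is a minimal, hypothetical measurement routine (the function name and block size are assumptions). It reads one file from start to finish and reports throughput. It does not bypass or invalidate the operating system's file cache, so on its own it has exactly the limitation described above.

```python
import time


def read_throughput_kb_per_sec(path, block_size=1024 * 1024):
    """Read a file start to finish; return (bytes_read, KB per second).

    buffering=0 only turns off Python's own buffer. The operating
    system's file cache is NOT bypassed, so a repeat run over the same
    file may be served from RAM rather than from the disc.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    rate = total / 1024 / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

For meaningful numbers the files would have to be read cold, for instance straight after a reboot, or with a tool that opens them unbuffered.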

Quote
You're telling me that you go through and delete every other song whenever you encode FLAC files.  Do you really do this on a regular basis?  Can I ask why?
No, indeed I don't do that, but it's not quite what I suggested. Hopefully the examples I gave above are useful to see where I'm coming from. But like I say, at this point I think we need to get some real numbers down.

Reply #12
Invalidating the cache invalidates the real-world results.

You seem to be anally-retentive about this fragmentation issue. If it's such a big deal, convert them to a drive that's not their final resting place and then move them to where you want them after the conversion. Problem solved, at some time cost. foobar2000 makes it a breeze.

As for fragmentation, the only time I've ever had it as an issue was running at around 1% free space on my 160GB disk for several months. And then, when I freed enough space to rip/burn a DVD, despite massive fragmentation, it just was a little slower. No big deal, just a little bit of time cost. And it's not like I have to sit and wait for it either.

Reply #13
Quote
Invalidating the cache invalidates the real-world results.
But if I'm copying files to another HDD for backup, or even writing 4.5GB to a DVD, the disc cache is largely irrelevant because it's not big enough to hold all the data and I won't necessarily have done anything to have that data already stored in the cache. Most reads would have to go straight to the drive.

If I copy a set of files repeatedly as part of a benchmark, re-reading those files increases the chances of them being read from the cache. This is completely different from a real world example of use and would produce varying results with each run as the cache contents settled down. Transferring large amounts of data to try and thrash the cache merely attempts to second-guess its behaviour and is not a deterministic way to ascertain what sort of impact fragmentation might have.

...Ah! At last, I've pulled my finger out enough to find one: http://www.winimage.com/readfile.htm - I'll use this for some numbers and report back later.

Quote
You seem to be anally-retentive about this fragmentation issue.
Well if I'm anally-retentive about anything - in this context  - it's modern software performance.

Quote
(...Convert to another drive and copy back to defragment...) Problem solved, at some time cost. foobar2000 makes it a breeze.
Yes, at some time cost. Whereas FB2K 0.8.3 made it a breeze too, but without generating reasons to copy the files after conversion...

At this point a solution for anyone concerned about the issue has already been presented. However since the multithreaded converter must have been implemented with an eye on improving performance, the reason this thread is still going is mostly (I thought) to discuss exactly what those gains were and better understand the implementation - and its caveats - which might even in due course allow further improvements to be made.

Quote
And then, when I freed enough space to rip/burn a DVD, despite massive fragmentation, it just was a little slower.
So fragmentation did reduce your system performance. In your case you only had massive fragmentation because of your nearly full disc (presumably you kept deleting files to make space and adding new things into the gaps). With FB2K 0.9.x's default conversion behaviour such fragmentation occurs automatically, but we don't know how much of a performance hit this generates. It could indeed be trivial, or it might have a surprisingly harsh impact.

Why is it so wrong to question this change in FB2K which does something that potentially reduces system performance when 0.8.3 didn't have the problem? That's all I'm trying to understand here. What's the gain that's worth the fragmentation penalty? We've seen that the multithreaded converter process can be slower as well as faster - by all means run your own tests to see if you produce different results. What's now needed is a quantitative description of performance degradation for reading the fragmented output files to see whether the converter speed gain, when it happens, is significant or overwhelmed by the fragmentation cost. Without this, quite apart from the fact that multithreaded conversion actually seems to perform worse under some limited circumstances, it has a definite and directly measurable undesirable side effect on the filesystem. On that basis it would surely be foolish to assert that the behaviour was nonetheless beneficial.

If anyone's (still!) reading and has more than one physical processor core it would be interesting to test the speed of the converter on such hardware. I imagine the performance gains would be pretty huge in that case and it would be much easier to decide that this was worth the filesystem issues.

Reply #14
First, I converted two FLACs to WAV with CPU affinity set to CPU0 only (one at a time) on a fairly new disc with not many files (it's being used for backups). The output WAV files still had a few fragments - 2 and 6 for 01.wav and 02.wav respectively. The uncached read speed results were:
    File= 49826 Kb/Sec with  36124412 bytes : 01.wav
    File= 51543 Kb/Sec with  36234956 bytes : 02.wav
    Average =  50671 Kb/Sec with  72359368 bytes (total : 1427 msec)

    File= 48101 Kb/Sec with  36124412 bytes : 01.wav
    File= 50820 Kb/Sec with  36234956 bytes : 02.wav
    Average =  49459 Kb/Sec with  72359368 bytes (total : 1462 msec)

    File= 50103 Kb/Sec with  36124412 bytes : 01.wav
    File= 51397 Kb/Sec with  36234956 bytes : 02.wav
    Average =  50778 Kb/Sec with  72359368 bytes (total : 1424 msec)

    File= 50033 Kb/Sec with  36124412 bytes : 01.wav
    File= 51397 Kb/Sec with  36234956 bytes : 02.wav
    Average =  50742 Kb/Sec with  72359368 bytes (total : 1425 msec)

    Overall average: 50412.5 Kb/Sec
Between each of the four runs I deleted the files completely and regenerated them, since on a more fragmented disc some previous runs had shown much more variable results under such circumstances until I eventually gave up and tried the cleaner alternative drive. So, this represents a fairly solid result with each run giving similar results.

Next, I generated the same two WAVs from the same two FLACs with CPU affinity set back to CPU0 and CPU1, so FB2K generated both at once. I got 134 and 133 fragments for the two files. The read results were:
    File= 27766 Kb/Sec with  36124412 bytes : 01.wav
    File= 28666 Kb/Sec with  36234956 bytes : 02.wav
    Average =  28210 Kb/Sec with  72359368 bytes (total : 2564 msec)

    File= 28332 Kb/Sec with  36124412 bytes : 01.wav
    File= 27980 Kb/Sec with  36234956 bytes : 02.wav
    Average =  28166 Kb/Sec with  72359368 bytes (total : 2568 msec)

    File= 28068 Kb/Sec with  36124412 bytes : 01.wav
    File= 28286 Kb/Sec with  36234956 bytes : 02.wav
    Average =  28188 Kb/Sec with  72359368 bytes (total : 2566 msec)

    File= 28178 Kb/Sec with  36124412 bytes : 01.wav
    File= 27121 Kb/Sec with  36234956 bytes : 02.wav
    Average =  27639 Kb/Sec with  72359368 bytes (total : 2617 msec)

    Overall average: 28013.25 Kb/Sec
Now, I don't know about you, but I reckon a 45% drop in raw read throughput is certainly not some minor and insignificant factor. If anyone else thinks these results look worrying, perhaps you could try running a read speed test of your own to see if you get similar figures? In the meantime, I'll be setting the CPU affinity to just CPU0.
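
For anyone wanting to check the sums, here is a small sketch that recomputes the overall averages and the percentage drop from the per-run "Average" lines listed above (variable and function names are just for illustration). Recomputed this way, the two-at-once figure comes out at 28050.75 Kb/Sec rather than the 28013.25 posted, which suggests a small slip in one of the listed numbers. Either way the drop is about 44%, close to the rounded 45% quoted.

```python
# Per-run "Average" figures in Kb/Sec, copied from the listings above.
one_at_a_time = [50671, 49459, 50778, 50742]
two_at_once = [28210, 28166, 28188, 27639]


def average(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def percent_drop(before, after):
    """Reduction from 'before' to 'after' as a percentage of 'before'."""
    return (before - after) / before * 100


single = average(one_at_a_time)
dual = average(two_at_once)
drop = percent_drop(single, dual)

print(f"one at a time: {single} Kb/Sec")
print(f"two at once:   {dual} Kb/Sec")
print(f"drop:          {drop:.1f}%")
```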

Reply #15
The only thing this even approaches stating is that HyperThreading sucks. You can reap the reward of twice-fast encoding when you have two real CPUs, or two real CPU cores, or two real CPUs each with two real cores. Or maybe Intel will push out something with multiple independent cores faster than AMD.

And then what exactly is the point of having multiple processors and/or cores with equivalent power, if not to perform equivalent tasks if you so want them to? Perhaps so you can encode a file with one, encode and burn a movie with another, fetch your heavily encrypted and compressed mail with a third, all while you browse SlashDot on the fourth?

Did it ever occur to you that the process of defragmenting your drive could be a bigger source of stress than regular use? How many free and/or professional defragmenting packages actually claim to reduce the fragmentation to near zero, or even the "ideal" zero point, in just a single pass? I can only think of PerfectDisk claiming to produce remarkable results in a single pass, with only 5% free space, which is quite a claim.

Then with a 95% filled 150GB or larger drive, and maybe 20% or greater fragmentation, you're probably looking at cutting half the life off the drive defragmenting it. And if you've got that much saturation to begin with, chances are you don't move most of it around so often anyway. And how long would it take for that sort of pattern to lead to that much fragmentation? Months? A year?

Reply #16
The root of what Pond is saying is not that defragmentation is good, it's rather that fragmentation is suboptimal, and having two encoders running at once seems to produce fragmentation. What's more, it reduces overall encoding speed on a Hyperthreading system, and could possibly do the same to a dual-CPU system. That should be enough evidence to merit consideration for a fix, if nothing else.
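
Why two simultaneous writers would fragment each other's output can be illustrated with a deliberately crude model. The sketch below assumes free space is handed out strictly in order, next free cluster first, with each file growing one chunk at a time. This is a toy model for illustration only, not a description of how NTFS actually allocates space, and all names and sizes in it are made up.

```python
def count_fragments(clusters):
    """Count runs of consecutive cluster numbers in one file's cluster list."""
    if not clusters:
        return 0
    breaks = sum(1 for a, b in zip(clusters, clusters[1:]) if b != a + 1)
    return 1 + breaks


def simulate(chunks_per_file, chunk_clusters, interleaved):
    """Write two files chunk by chunk and return fragments per file.

    Toy allocator: every chunk takes the next free clusters in order.
    """
    next_free = 0
    files = {"01.wav": [], "02.wav": []}
    if interleaved:
        # Two writers taking turns: 01, 02, 01, 02, ...
        order = [name for _ in range(chunks_per_file) for name in files]
    else:
        # One writer at a time: all of 01, then all of 02.
        order = [name for name in files for _ in range(chunks_per_file)]
    for name in order:
        files[name].extend(range(next_free, next_free + chunk_clusters))
        next_free += chunk_clusters
    return {name: count_fragments(c) for name, c in files.items()}


print(simulate(100, 16, interleaved=False))
print(simulate(100, 16, interleaved=True))
```

In this model, files written one after the other each end up in a single fragment, while files written in alternation get one fragment per chunk. A real filesystem is smarter than this, but the reported fragment counts point the same way.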

Reply #17
Quote
You can reap the reward of twice-fast encoding when you have two real CPUs, or two real CPU cores, or two real CPUs each with two real cores.

Well consider this: Say I have four cores. I am converting a bunch of files. The decoding speed is fast (fast enough that it's irrelevant). I am encoding to WAV (no DSPs etc.), so the encoding speed is also fast, and in this case the operation would probably be limited by write speed. Is it really such a good idea to try to read four large files and write another four large files simultaneously (which I assume is what happens)? I don't think you would do it if you were just copying files, and it's hardly going to be 4x as fast (maybe slower?). But it's also maybe making assumptions about the source media. What will happen if the source files are located on a (data) CD? Read speed may then be limiting the operation, and I don't recall my CD drive coping well with reading four files simultaneously (but maybe I just have a bad memory). (Forgive me if I'm wrong, since I don't know what actually happens.)

Reply #18
Quote
The only thing this even approaches stating is that HyperThreading sucks. You can reap the reward of twice-fast encoding when you have two real CPUs, or two real CPU cores, or two real CPUs each with two real cores.
The difference between CPU bound and I/O bound operations is pretty fundamental stuff in software engineering. In the CPU bound case (WAV to FLAC) HT multithreading did give a small speed improvement. In the I/O bound case, the CPU was not fully utilised. It wouldn't have mattered if I had a thousand CPUs, because it was the hard disc slowing the FLAC to WAV conversion down, not the CPU. The fact that conversion went slower when two threads were executing is not necessarily the fault of HT - it reflects what happens to Windows when you try to write two files simultaneously to the same physical disc.

Multiprocessing is not about achieving 'n' times the performance for 'n' CPUs, unless the threads that are executing do so entirely within the CPU and do not access any shared resources. The minute you go to something shared - e.g. shared memory areas, single client drivers, physical items like hard discs - you risk resource contention leading to a decline in performance.

Quote
And then what exactly is the point of having multiple processors and/or cores with equivalent power, if not to perform equivalent tasks if you so want them to?
The point is to speed up CPU-intensive tasks where the mathematical processing job can be shared sensibly between the processor cores. Multiple CPU cores do not help your I/O bandwidth unless the software driver implementation is so poor or the CPU so slow that I/O is limited by 100% CPU utilisation rather than hard disc subsystem speed.

You are right to cite examples of having one CPU handle, say, movie encoding while another handles your web browser execution, but any access to shared systems that can only handle one client at a time will result in one or the other having to wait a bit. So you certainly see performance improvements in most cases but you'll hardly ever see twice the performance with a dual core system over a single core system, all other things being equal. And sometimes things can go completely pear shaped, with resource contention issues causing the dual core system to actually go slower than the single core equivalent.

Quote
Did it ever occur to you that the process of defragmenting your drive could be a bigger source of stress than regular use?
Yes, in fact I think I mentioned as much early on in the thread. We want to avoid the need to defragment if possible, if you think that such operations do stress the drive and reduce its MTBF. Other posts in this thread have expressed an opinion that reading or writing to the disc has no impact on its longevity, though. If that's true, defragmentation won't hurt your drive any more than it is already hurt simply by being powered up and spinning.

Quote
And how long would it take for that sort of pattern to lead to that much fragmentation? Months? A year?
About as long as it takes FB2K 0.9 on a multicore or HT system to convert more than one file! That's the point of this thread. Achieving hundreds of fragments for a single FLAC file represents, broadly, massive fragmentation, much worse IMHO than we would expect to see from a normal day to day use of a drive in a home PC... Again, some numbers here would be nice, but I'm not sure how to get them other than a straw poll along the lines of "how many fragments do your files seem to have?"

If someone could confirm my back of the envelope benchmarks on filesystem performance that'd be cool, since they indicate that the fragmented files resulting from multithreaded conversion take almost twice as long to read from the disc, which is a serious penalty. Because of this, a person concerned about how long it might take to read those files when backing up or writing a DVD or whatever might want to either defragment their drive, or copy the files to a second HDD to achieve defragmentation. This is a time consuming process and may stress the drive unduly if read/write operations affect MTBF. Better, perhaps, to stop the fragments from being created in the first place.

Meanwhile:

Quote
What's more, it reduces overall encoding speed on a Hyperthreading system, and could possibly do the same to a dual-CPU system.
Well I agree it certainly might with I/O bound operations. Both HT and true multicore machines are likely to perform better for the CPU-bound operations like FLAC, Vorbis or MP3 encoding though.

Reply #19
Another thought is that an HT processor can be put to quite good use during a single conversion: one (virtual) core can do the encoding while the other does the decoding (usually cheaper, but not always, e.g. APE Extra High to MPC).

My vote goes for a setting in the Advanced page (Converter, use multi cpu cores (yes,no))
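
In code terms, such a setting would amount to little more than choosing the number of worker threads. The sketch below is a hypothetical illustration in Python and not foobar2000's actual implementation; `convert_all`, `convert_one` and `use_multiple_cores` are made-up names.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def convert_all(files, convert_one, use_multiple_cores=True):
    """Run convert_one over every file, one at a time or one per core.

    use_multiple_cores stands in for the proposed yes/no preference.
    Results are returned in the same order as the input files.
    """
    workers = (os.cpu_count() or 1) if use_multiple_cores else 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert_one, files))


# Stand-in "conversion" that just renames the extension.
tracks = ["01.flac", "02.flac", "03.flac"]
print(convert_all(tracks, lambda name: name.replace(".flac", ".wav"),
                  use_multiple_cores=False))
```

With the option set to "no", files are processed strictly one after another, which is the behaviour the CPU affinity work-around currently forces from outside.
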
In theory, there is no difference between theory and practice. In practice there is.

 