Lack of parallelism when processing files

Version: fb2k 2.1 preview 2023-08-28
Platform: Win 11 22H2 (22621.2215) x64

If I drag a folder of ~800 albums of previously-unseen MP3s into fb2k, it takes a long time (minutes) to process the incoming files. Meanwhile task manager shows CPU at 2% and disk at <50% (much of which isn't fb2k anyway). It seems like the incoming file processing is serialized, and could go faster split into parallel threads?


Re: Lack of parallelism when processing files

Reply #1
That sounds a tad slow, but it might just be the speed at which your hardware manages to seek/read those files.

fb2k is much faster to use if you use its Media Library feature. However, that requires you to place your music in one or several configurable folders, say your Music and your Downloads folders.

Navigating, searching and dropping into a playlist from within fb2k then becomes near instantaneous after the initial slow background scan. And yes, the watched directories are refreshed near instantly (local HDDs) or upon restart/button press (network drive).

Re: Lack of parallelism when processing files

Reply #2
Quote
That sounds a tad slow, but it might just be the speed at which your hardware manages to seek/read those files.

I have a RAID array backing the files that is significantly faster than that for raw I/O such as copying (and, as I mentioned, my disk utilization seems too low for this to be I/O-bound).

Quote
fb2k is much faster to use if you use its Media Library feature.

Yes, my normal library is in the media library, and startup/rescan is reasonable for that. In this case, the files are (intentionally) not part of my library.

Re: Lack of parallelism when processing files

Reply #3
I believe the reason is that hammering I/O more would cause more seeking and make things slower.

I don't know what kind of RAID setup you have there, but on both my spinning hard drives and a very fast NVMe SSD, doing what you did already resulted in 100% load on the disks. If there were more threads loading files simultaneously, the spinning hard drives would have slowed down. With the SSD there shouldn't have been a difference.
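
For anyone who wants to check this trade-off on their own storage, here is a minimal Python sketch that times reading the first chunk of every file under a folder, once serially and once with a thread pool. It is only a stand-in for tag scanning and says nothing about how fb2k actually reads files; `list_files`, `read_head`, `timed_scan` and the 64 KiB chunk size are all hypothetical choices made for illustration.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Return the paths of all regular files under root, sorted."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def read_head(path, nbytes=64 * 1024):
    """Read up to nbytes from the start of a file; return the byte count read."""
    with open(path, "rb") as f:
        return len(f.read(nbytes))


def timed_scan(paths, workers):
    """Read the head of every file using `workers` threads.

    Returns (total_bytes_read, elapsed_seconds). With workers <= 1 the
    files are read one after another on the calling thread.
    """
    start = time.perf_counter()
    if workers <= 1:
        total = sum(read_head(p) for p in paths)
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            total = sum(pool.map(read_head, paths))
    return total, time.perf_counter() - start
```

Calling `timed_scan(list_files(folder), 1)` and then `timed_scan(list_files(folder), 8)` on a folder of your choice gives a rough comparison. The second run will usually hit the OS file cache, so use a different folder (or a fresh boot) for each measurement if you want numbers that reflect the disks.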

Re: Lack of parallelism when processing files

Reply #4
I looked into this further, and I no longer believe fb2k has a problem.

When I take a folder of ~6200 MP3s (~35 GB) from the RAID array and:
* Copy it to an NVMe SSD with Windows Explorer
* Drag it into fb2k for processing
Both the disk utilization and total time are similar, at ~80% and ~56 sec, respectively. The lower-than-100% disk utilization is possibly due to the source storage space using 1 parity bit per 4 data bits; I wonder if Windows does a parity check on reads and reports throughput in terms of fraction-of-the-total-physical-limits, which would mean ~80% is the peak steady-state throughput. The <50% utilization I reported initially was, I believe, due to me looking at the sum utilization across all drives, not just the RAID array. (Oops.)
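
As a rough sanity check on those figures, the arithmetic can be sketched in a few lines of Python. This assumes decimal units (1 GB = 1000 MB) and takes the 4-data-to-1-parity layout at face value; the function names are made up for illustration, and whether Windows really reports utilization this way remains a guess.

```python
def throughput_mb_per_s(total_gb, seconds):
    """Average throughput in MB/s, assuming 1 GB = 1000 MB."""
    return total_gb * 1000 / seconds


def data_fraction(data_units, parity_units):
    """Fraction of the physical stripe that carries data rather than parity."""
    return data_units / (data_units + parity_units)


print(throughput_mb_per_s(35, 56))  # 625.0
print(data_fraction(4, 1))          # 0.8
```

So ~35 GB in ~56 sec is about 625 MB/s on average, and a 4+1 layout carries data in 80% of what is physically stored, which lines up with the ~80% utilization observed if the guess above is right.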

This suggests to me that the bottleneck here is either physical, or at a lower level (drive firmware, Windows disk subsystem, etc.). I could use ETW to grab a trace here and look in more detail at what's going on, but it doesn't seem warranted.

Sorry for the noise!