An argument against tag-centric music players
2009-03-24 03:57:59

I'm really tweaked when I do tag work in foobar2000 and, for whatever reason (usually because there isn't enough tag padding in the file, etc.), the entire fraction-of-a-gigabyte file gets rewritten to change a measly ~200 bytes. But I just realized the issue may be bigger than just pulling my chain.

The current unrecoverable error rate for hard drives is in the vicinity of 10^-15 errors per bit read. The current error rate for memory is around 10^-12 flips/bit*hr. Let's say that you have a 1TB (8e12 bit) FLAC music library, and you need to redo the tags on all of it. For whatever reason, your music player decides it needs to rewrite each file in full to get this done. (In my experience, this has to be done at least once, depending on how the creator of the file handled the tag padding, and may need to be done multiple times.) Your hard drive array has a bulk read/write speed of 50MB/s = 400Mb/s for reading the file off disk, slapping on the new tag and writing it back out. Assume for the sake of argument that I/O is done with a 1MB (8e6 bit) buffer (this is either excessively high or excessively low depending on what you're looking at).

The total operation takes 8e12/400e6 = 2e4 seconds (about 5.6 hours) to complete.

The probability of an unrecoverable hard disk error occurring during this timeframe is 8e12*1e-15 = 0.8%.

The probability of an unrecoverable RAM error occurring during this timeframe is 8e6*(2e4/3600)*1e-12 = 0.0044%.

The probability of the library being corrupted by either the hard disk or the RAM is 1-(1-0.8%)(1-0.0044%) = 0.80%, so the disk term dominates.

Those are not odds I enjoy, especially because (with the joys of facets) I wind up retagging reasonably often.

Of course, there are two extremely obvious solutions to this that are commonly proposed: a) verify files after modification/copying, and b) use RAID/ECC.
But the latter is still rather expensive (compared to the baseline of just slapping in a single 1TB drive) and the former doesn't keep this sort of thing from happening in the first place. I'd like a solution that minimizes the error rate on the hardware I already own, thank you very much.

Rather, I am wondering if there is a more direct, cleaner solution to this in the music player itself. iTunes already does it: it doesn't store its tag information in the file. It stores it in a database, typically kept under My Documents. Of course, that causes all sorts of problems for people who want to pull their media away from iTunes or otherwise don't use it as a music manager. But for those of us who use the fb2k media library or another similar scheme, we only really need the tag information to be written into the files when we transcode or back stuff up. Inside the program, we never really need the tags to live in the files, and given what I'm pointing out, I think that the file-based approach has a significantly increased (if theoretical) risk of file corruption.

In short, I think we as a community should revisit the concept of database-oriented tagging as being superior to file-based tagging on reliability grounds. Database tagging obviously doesn't solve the problem entirely, but it is a far more effective use of hard disk activity. Databases can be backed up much more easily and can have more error-recovery information embedded in them than is typically available in music files. I think that retag-triggered file rewrites occur far, far too often, and music libraries are advancing to a size where a rewrite should be avoided as much as possible.
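
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the back-of-the-envelope estimate from earlier in this post. The function name `rewrite_risk` and its parameter names are made up for illustration, and the defaults are just the assumed figures from the post (1TB library, 50MB/s, 1MB buffer, 10^-15 disk errors per bit, 10^-12 RAM flips/bit*hr). It uses the same simple "rate times exposure" approximation as the text, which is only reasonable while the resulting probabilities stay small.

```python
def rewrite_risk(
    library_bits=8e12,            # assumed: 1 TB library
    throughput_bps=400e6,         # assumed: 50 MB/s bulk read/write
    buffer_bits=8e6,              # assumed: 1 MB I/O buffer held in RAM
    disk_errors_per_bit=1e-15,    # assumed unrecoverable disk error rate
    ram_flips_per_bit_hour=1e-12, # assumed RAM soft error rate
):
    """Estimate the risk of one full-library rewrite.

    Returns (seconds, p_disk, p_ram, p_either), with probabilities as
    fractions (0.008 means 0.8%).
    """
    # Time to stream the whole library through the disk once.
    seconds = library_bits / throughput_bps
    hours = seconds / 3600.0

    # Linear approximation: expected error count used as a probability.
    p_disk = library_bits * disk_errors_per_bit
    p_ram = buffer_bits * hours * ram_flips_per_bit_hour

    # Chance that at least one of the two (assumed independent) events occurs.
    p_either = 1.0 - (1.0 - p_disk) * (1.0 - p_ram)
    return seconds, p_disk, p_ram, p_either


if __name__ == "__main__":
    seconds, p_disk, p_ram, p_either = rewrite_risk()
    print(f"rewrite time: {seconds / 3600:.1f} hours")
    print(f"disk: {p_disk:.4%}  RAM: {p_ram:.4%}  either: {p_either:.4%}")
```

With the default inputs the rewrite takes about 5.6 hours and the combined risk lands a little under 1% per full rewrite, almost all of it from the disk term. Halving the number of full rewrites roughly halves the exposure, which is the whole point of keeping tags out of the files.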