Topic: Protecting audio files from bit rot?

Re: Protecting audio files from bit rot?

Reply #25
Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarily music, well... it's music.

If you lose important data because your drive goes bad, your backup method is wrong.
If you lose your important data because your house catches fire, your backup method is wrong.
Against almost anything less than the total annihilation of Earth, you can do a proper backup.
Completely agree.
My ordinary data is on a RAID5: one drive can go wrong and I can recover it. Really important data that I can never retrieve again I also have on a removable USB HDD, in case the house catches fire and I need to grab it quickly, or in case two drives fail and I need a backup to recover from.
I also have it on optical media roughly 7400 km away that I update roughly once a year.
If my house gets nuked and the entire state of Texas is a burning inferno, my important data is still intact.
If I remember rightly, what the books say is that RAID is about availability and not about security. There are probably a number of reasons for that which I have forgotten, as such things have not been my trade for over 13 years, but you can lose an entire RAID setup to an electrical glitch or even a system fault writing bad data over your good files. If your HDD is connected when that lightning strikes the next block, your RAID, internal disks and external disk are probably now as good as dust. Possibly, even if the USB drive is not connected, everything electronic in your house might be dead.

I sometimes reflect that if I had to pass the system audit (and it wasn't exactly tough or bullshit-proof) that I did have to pass every year, with the backup system I have at home, it would fail miserably. I'm afraid yours is worse.

I have two (three, most times) off-machine backup copies of my operating system and all my data. Once a week, an off-site disk comes home and the onsite disk goes away.

--- the once-a-week is subject to human failure, and is sometimes every three weeks. Back in working days, not getting a backup tape off-site every day was a big deal.

--- I know I am not protected against the creeping corruption that is unlikely but not impossible.

My system would not pass muster professionally, even with me! I rate it good enough for my data, and my exposure to data loss of one to three weeks is actually acceptable. Even in the face of my laziness, a substantial acquisition of new data, such as photos from a holiday, will be backed up and sent off site much sooner rather than later.

There is another thing about backups. No backup is actually known to be good until it is tested --- and I have never, not even in work, been able to have a duplicate system just to test backups with. Life as an IT manager: long periods of boredom punctuated by moments of intense fear ;)
The most important audio cables are the ones in the brain

Re: Protecting audio files from bit rot?

Reply #26
For most people "bitrot" is only going to be noticed when the system stops booting or the drive refuses to spin up, at which point all data will be lost.

It wouldn't hurt if folks actually looked at the SMART data once in a while.  ;D
Quis custodiet ipsos custodes?  ;~)

Re: Protecting audio files from bit rot?

Reply #27
I don't think I've ever seen an audio file damaged by bit rot. You'd have to be fairly unlucky to have an audible difference, or have a hard drive that was rapidly failing.  
To me, this is the key issue. Sure, there's the possibility of bit rot mainly due to DRAM memory errors (happens more frequently with non-ECC-memory). A single flipped bit may be a huge issue with program code. But, with audio files, it's pretty unlikely a single flipped bit would be audible.
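
To put a rough number on "audible", here is a minimal sketch of what a single flipped bit does to one uncompressed 16-bit PCM sample. The function names are made up for illustration, and this only covers raw PCM: in a compressed format such as MP3 or FLAC, one damaged bit can affect a whole frame rather than one sample.

```python
import math

FULL_SCALE = 32768  # magnitude of the largest 16-bit PCM sample


def flip_bit(sample: int, bit: int) -> int:
    """Flip one bit of a signed 16-bit sample (two's complement)."""
    raw = (sample & 0xFFFF) ^ (1 << bit)
    # Convert the unsigned 16-bit pattern back to a signed value.
    return raw - 0x10000 if raw >= 0x8000 else raw


def flip_level_dbfs(bit: int) -> float:
    """Size of the error caused by flipping `bit`, in dB relative to full scale."""
    return 20 * math.log10((1 << bit) / FULL_SCALE)


if __name__ == "__main__":
    for bit in (0, 8, 15):
        print(bit, flip_bit(0, bit), round(flip_level_dbfs(bit), 1))
```

A flip in the lowest bit is an error about 90 dB below full scale, while a flip in the top bit is a full-scale click, so in raw PCM it depends a lot on which bit gets hit.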

EDIT: One should also put the frequency of bit rot into perspective. I believe a number that's sometimes tossed around is around one flipped bit per TB written if the RAM is non-ECC. If, for example, you're doing backups of your music collection from your desktop rig to your NAS, it's fairly unlikely that bit rot will ever become an issue. Bit rot is a more serious problem in data centers, cloud storage and such where data may be moved around somewhat frequently.

Re: Protecting audio files from bit rot?

Reply #28
I have it on optical media roughly 7400 km away that I update roughly once a year

Thank you! You have made me think (the wheels grind slow) about my procedures, and realise that I lack, and need, that longer-term-historical aspect. I'll take that on board.

My off-site storage is only about 10km away. I didn't plan it to be safe from the major flood in my city last December, because nobody saw the event coming in advance. As it happens, all of my original electronics escaped the water anyway, but the property where the backup is kept never even got wet.  Physical separation counts. The next-door neighbour is a bad choice for reasons other than lightning. In fact... there is a hell of a lot to think about
The most important audio cables are the ones in the brain

Re: Protecting audio files from bit rot?

Reply #29
There are filesystems specifically designed to protect against bitrot.  I have all my data on a Linux "server" in my basement that has a 4 TB mirrored btrfs drive.  I run a btrfs scrub about once every 2 weeks.  If it detects a checksum mismatch, it will copy the file from the mirror.  There are other cool features such as snapshotting that can also help.
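
The scrub-and-repair idea can be sketched in a few lines. This is only a toy model of the concept, not how btrfs is implemented: the real filesystem checksums individual blocks rather than whole files, and the function names and the use of SHA-256 here are assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum recorded when the data was written (SHA-256 for illustration)."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies: list[bytearray], expected: str) -> int:
    """Repair copies whose checksum no longer matches; return how many were fixed."""
    # Find any copy that still matches the recorded checksum.
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise ValueError("no intact copy left; restore from backup")
    repaired = 0
    for copy in copies:
        if checksum(bytes(copy)) != expected:
            copy[:] = good  # overwrite the damaged copy with the intact one
            repaired += 1
    return repaired
```

The point of the mirror is visible in the error branch: with only one copy, a checksum can tell you the data is bad, but there is nothing to repair it from.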

On the BSD side, ZFS is a similar filesystem.  The FreeNAS OS will let you take a PC and turn it into a NAS using ZFS and help protect against bitrot.  I think some high end NASes also offer bitrot protection. 

Re: Protecting audio files from bit rot?

Reply #30
Btrfs is still considered experimental, but the simple schemes (read: not RAID5) are stable. I'm using it in RAID1 mode at the moment. Even though I had a problem with the filesystem itself once, the data is safe, mainly because a broken FS refuses to mount. For those that want a non-experimental system, ZFS is an alternative.

Non-checksummed RAID1 setups died for me after I read this article. Even without RAID1, a checksummed FS is useful because you immediately notice the file is corrupted, and maybe have the chance to restore an older copy before, say, deleting the older copy to clean up some space.

Re: Protecting audio files from bit rot?

Reply #31
If ZFS (or btrfs) is doing its job you won't notice a file is corrupted. It will silently correct the problem and write the correct data to another location. You would never get the corrupted version to notice it. You will only notice an error counter when you check the pool status.

Re: Protecting audio files from bit rot?

Reply #32
If this story can warn anyone ... I went fifteen years without a single drive failure, but when it rains, it pours; this one was one of the more annoying ones:

I had a Windows XP that enabled write caching on NTFS drives. (Likely, there were no external other-than-FAT back when XP was fresh.) My mobo and HDD were not the best of friends, and USB started to drop and reconnect - I think.
With write caching, that meant I got a bubble message saying that data had been lost. No way (for me) to tell which data. Unfortunately, it had happened while the file table was written to, and as a result, the drive and Windows messed up which file segments belonged to what files. A song would change midway just because the file segment was overwritten with something else.

Back when I decided on FLAC, I did not realize how important a checksummed file format was, but I quickly learned to appreciate it. The lack of a checksum in mp3 was kinda "just another way this file format sucks" (and mp3 decoders are often only accurate to within a roundoff error anyway) - but then came ALAC in MP4. Lossless audio not even worth a 128-bit hash ... WTF?
“It sounded bad to me. Digital. They have digital. What is digital? And it’s very complicated, you have to be Albert Einstein to figure it out.”
- Donald Trump, May 2017

Re: Protecting audio files from bit rot?

Reply #33
Lame stores a checksum of the encoded audio data.
Is 24-bit/192kHz good enough for your lo-fi vinyl, or do you need 32/384?

Re: Protecting audio files from bit rot?

Reply #34
If this story can warn anyone ... I went fifteen years without a single drive failure, but when it rains, it pours; this one was one of the more annoying ones:
 

16 years, I acquired an old Amstrad PC1640 back in 2000, with a 20MB 5 1/2" HDD!

That was replaced by an AMD K6/2 450 MHz running 98se, still working.

Then I got into games again, enter an Athlon 1.4GHz again on 98se, still working.

Then into XP with an Intel Celeron, I squeezed an additional 1TB HDD into this with all my CDs ripped on it, and I'm still using this drive in an i3 W7 box used just to run my Duet.

My gaming rig is an i5 running W10 on a 500GB ssd and a 1TB usb HDD.

Some more info on bit rot from youtube.

https://www.youtube.com/watch?v=Ie9qomn3_3U


Re: Protecting audio files from bit rot?

Reply #35
If ZFS (or btrfs) is doing its job you won't notice a file is corrupted. It will silently correct the problem and write the correct data to another location. You would never get the corrupted version to notice it. You will only notice an error counter when you check the pool status.
That's only assuming you are using one of the redundant schemes. Of course, only those make sense for actual archiving.

However, there are still strange bugs creeping up in btrfs sometimes, which lead to actual filesystem corruption. The first rule in that situation is: don't panic, get help. Either on the mailing list or the irc channel. Most horror stories I heard about btrfs failing resulting in data loss were cases where people just tried to fix the system themselves and made a mistake. Or ignored the fact the feature they used was explicitly marked as experimental.

I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly. It's also less flexible than btrfs (no dynamic switching of redundancy scheme, e. g. from single -> raid1 -> raid 0 -> raid5 -> single).

Can ZFS use a single storage device as a "semi-raid1" (aka -d dup in btrfs)? While useless against hardware failures, you can still use this scheme to protect yourself from bitrot. Just keep in mind it halves the writing speed as well as the capacity.

Re: Protecting audio files from bit rot?

Reply #36
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
“It sounded bad to me. Digital. They have digital. What is digital? And it’s very complicated, you have to be Albert Einstein to figure it out.”
- Donald Trump, May 2017

Re: Protecting audio files from bit rot?

Reply #37
I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly.

This is true of any filesystem. If you foolishly aim a gun (ZFS, btrfs, XFS, ext or otherwise) at your face and pull the trigger, bad things are likely to happen.

Can ZFS use a single storage device as a "semi-raid1" (aka -d dup in btrfs)? While useless against hardware failures, you can still use this scheme to protect yourself from bitrot. Just keep in mind it halves the writing speed as well as the capacity.

ZFS has a copies parameter which will instruct the filesystem to keep up to three copies of the data. If on a single device, it stores them in different locations on that device (I use this feature with the SSD my base OS is on).

Re: Protecting audio files from bit rot?

Reply #38
Sure, any filesystem can have its internal database or other internal dependencies become corrupt and suddenly you can't access your files. It's just that when it happens on ZFS, you will lose access to everything, and recovery is difficult if not impossible. Happened to me within 48 hours of trying ZFS for the first time about 6 months ago! I didn't do anything unusual, just used my OS like normal, and all of a sudden it wouldn't boot due to some kind of internal corruption. I looked and looked for help, only to find others who got the same error messages as me all just gave up and reformatted. So I will revisit ZFS only when easy-to-use recovery tools become available or it is beefed up so that its internal failures are self-correcting or at least made less catastrophic.

Re: Protecting audio files from bit rot?

Reply #39
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
Anyone with enough motivation.  How does the question change the fact that Lame offers a means to verify the encoded audio (not that I didn't already address the general issue in an earlier post)?
Is 24-bit/192kHz good enough for your lo-fi vinyl, or do you need 32/384?

Re: Protecting audio files from bit rot?

Reply #40
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
Anyone with enough motivation.  How does the question change the fact that Lame offers a means to verify the encoded audio (not that I didn't already address the general issue in an earlier post)?

So how do I get lame to report errors? Using the --verbose option is apparently not enough.
Neither foobar2000 nor VUPlayer's audiotester.exe checks this CRC. Nor does the mp3val I just downloaded for the hell of it.
“It sounded bad to me. Digital. They have digital. What is digital? And it’s very complicated, you have to be Albert Einstein to figure it out.”
- Donald Trump, May 2017

Re: Protecting audio files from bit rot?

Reply #41
So how do I get lame to report errors? Using the --verbose option is apparently not enough.
Neither foobar2000 nor VUPlayer's audiotester.exe checks this CRC. Nor does the mp3val I just downloaded for the hell of it.
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

P.S. The fhg encoder in Adobe Audition 1.5 also has an option to write CRC checksum, but even Audition itself cannot detect file corruption. I used a hex editor to deliberately change a byte and there is no warning or error when I reopen the file.
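
For anyone wondering what a tool would have to do to catch that hex-editor change, here is a rough sketch of a CRC-16 check. The variant used (CRC-16/ARC) is picked for illustration; whether it matches the exact parameters and byte range that LAME or the fhg encoder cover is an assumption, so treat this as the principle rather than a validator for real mp3 files.

```python
def crc16_arc(data: bytes) -> int:
    """Bitwise CRC-16/ARC: polynomial 0x8005 (reflected 0xA001), initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; apply the polynomial when the low bit was set.
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def is_intact(data: bytes, stored_crc: int) -> bool:
    """Compare a freshly computed CRC with the one stored at encode time."""
    return crc16_arc(data) == stored_crc
```

A stored checksum is only useful if the player or a separate tool actually recomputes it and compares, which is the step that seems to be missing here.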

Re: Protecting audio files from bit rot?

Reply #42
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

Whoa, this is what I get for seeing only the FLAC part :-)

Next on the wishlist: write checksum to already encoded mp3 files.
“It sounded bad to me. Digital. They have digital. What is digital? And it’s very complicated, you have to be Albert Einstein to figure it out.”
- Donald Trump, May 2017

Re: Protecting audio files from bit rot?

Reply #43
Encspot is another; though the existence of a validation tool doesn't change the fact that Lame writes a checksum.
Is 24-bit/192kHz good enough for your lo-fi vinyl, or do you need 32/384?

Re: Protecting audio files from bit rot?

Reply #44
Forgive me for snipping in the quote :D

Completely agree.
I agree with you agreeing :D
if I remember rightly, what the books say is RAID is about availability and not about security....
You are right, RAID only protects against a drive failure.
It will not protect against software corruption or accidental deletion, because it is not a point-in-time backup.
That's what my USB drive is for: it's a backup in case I hit Ctrl+A and Shift+Del at the wrong time :D
A lightning strike or house fire can still take it all out, which is why I keep my most important data (in this case not audio but the pictures of my daughter) on optical discs at my parents' in Denmark. I go back to Denmark roughly once a year to visit my family (I'm living in Texas). Of course, two house fires or lightning strikes could still erase all my data in both Texas and Denmark, but I can live with those odds. My disc usually contains the original data and a WinRAR copy with a recovery record, and the rest of the space is filled with PAR data. It might be crazy, but I might as well make full use of the space, and since it's a once-a-year thing I can spend the extra time.

I rate it good enough for my data, and my exposure to data loss of one to three weeks is actually acceptable.
And that's exactly the choice people need to make: is this safe enough for me? How valuable is my data, and how much do I want to protect it? How bad is the risk of losing it all?

There is another thing about backups. No backup is actually known to be good until it tested
Well, even if you test it now, it could theoretically be broken later on when you need it. It's all a game of chance and risk.
However, my advice on optical media backups is to run a simple C1/C2 scan on CDs, and whatever the equivalent is called on DVDs.
I never keep any important data on discs that show C2 errors.

I have had 2 data losses that I can remember.
The last was when my laptop drive died. I knew it was bad, so I had moved most of my important files off the drive.
It got bad sectors and would sometimes refuse to read files, but I kept on using it until it finally died and I trashed that laptop (it had many other defects by then).
And I've had optical disc rot once: the reflective layer liquefied.



Anyway, I would generally advise people to check out PAR and/or checksum/hash-file systems.
I have MD5 and SHA1 verification on all my multimedia files (video or audio) to verify that they are OK. Should one get corrupted, it can be identified before I back it up, and I can verify my old backups as well, simply reusing the same hash files for the backup.
It's a very easy process once you get used to it.
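
A minimal sketch of that hash-file workflow, in case it helps anyone get started. The function names and the in-memory manifest are made up for the example; dedicated hashing tools store the same information in .md5 or .sha1 files next to the music.

```python
import hashlib
from pathlib import Path


def file_sha1(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-1 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-1."""
    return {
        p.relative_to(root).as_posix(): file_sha1(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose hash has changed."""
    bad = []
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file() or file_sha1(path) != expected:
            bad.append(rel)
    return bad
```

Because the manifest uses relative paths, the same one can be run against the original library and against any backup copy of it.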
Sven Bent - Denmark

Re: Protecting audio files from bit rot?

Reply #45
Back when I decided on FLAC, I did not realize how important a checksummed file format was, but I quickly learned to appreciate it. The lack of a checksum in mp3 was kinda "just another way this file format sucks" (and mp3 decoders are often only accurate to within a roundoff error anyway) - but then came ALAC in MP4. Lossless audio not even worth a 128-bit hash ... WTF?

You know you can just... hash it yourself. There are tons of tools for hashing/checksumming. To avoid a file format because it doesn't have a checksum seems weird to me when the option to use one on all kinds of files is available.
Now I will say it IS an added bonus to have it checked at every playback by the player.
Sven Bent - Denmark

Re: Protecting audio files from bit rot?

Reply #46
I think it goes along the same quirky lines as people wanting to contain an entire album, cue, log, artwork and any other imaginable piece of supporting material in a single, playable file. I imagine the even more quirky will embed a list of hashes for the supporting material in a vorbis comment.  Maybe someone has already addressed the challenge of replacing the list with a single, all-encompassing hash that's still embedded.
Is 24-bit/192kHz good enough for your lo-fi vinyl, or do you need 32/384?

Re: Protecting audio files from bit rot?

Reply #47
I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly.

This is true of any filesystem. If you foolishly aim a gun (ZFS, btrfs, XFS, ext or otherwise) at your face and pull the trigger, bad things are likely to happen.
Maybe I should clarify: I mean ways that appear totally logical. Like adding a device to a pool, except ZFS doesn't rebalance all your data across devices and just uses the newly added device (because it has the most free space) to write most of the data, thus increasing the chance most of the new parity will be on the new device as well. At least that's what I get from the example in this article. There also seems to be no way to fix this (or the author doesn't mention one) without taking your data off the pool, recreating it with the new number of devices and moving your data back - that's cumbersome.

With btrfs, you add your device, run a rebalance and all the data + parity is spread evenly across all the new devices.

Re: Protecting audio files from bit rot?

Reply #48
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

Whoa, this is what I get for seeing only the FLAC part :-)

Next on the wishlist: write checksum to already encoded mp3 files.
Just tried. It reported that all files in one of my downloaded mp3 albums have CRC errors.

Code:
- Warning: Indicated LAME audio size incorrect or unrecognized tag block
* Error: CRC-16 check failed on audio header.
* Error: CRC-16 check failed on audio data.

It looks like a compatibility issue rather than real rot.

There are also a few individual mp3 files that have CRC errors without that warning. I checked the files' MD5 and googled them, and I can find the exact same files, so it is not my problem.

It would be good if content providers also provided a checksum file alongside the media files (in case the format doesn't support checksums) so we can verify their integrity, just like some software download sites do.

 

Re: Protecting audio files from bit rot?

Reply #49
Greynol already told you the answer. ZFS (or btrfs). I've been bitten by bitrot before when I had a drive silently corrupt hundreds of pictures. I didn't catch it until I randomly looked through the pictures. Once found I went back through my backups and found all the backups except my very oldest (and due to be destroyed) copy had all been quietly copied with errors. I was saved only by the fact I was lazy about destroying my oldest backup that one time. I switched to a file system that offers automatic fault detection and correction. ZFS. Problem solved.
I read on ZFS before posting this topic.  My computer is running Windows which does not support the ZFS file system.  Maybe in the future I could build a FreeNAS system dedicated to music, but I don't have the money for that right now.
Ever heard an old MP3 with a very tiny blip of static? Guess what, that's 1 frame broken screwing up a few milliseconds worth of audio data.

I was always under the impression that such blips were not really silent corruption, but rather someone missing the last segment of a p2p download, possibly because the file was copied just as the p2p client was about to finish but had not yet written the last few bits to the file.
Also I suspect that artifacts at the start are often non-audio written by buggy software, and interpreted as audio by some player.

Anyway, this is not really on-topic to the big question.

So I am curious if anyone here has taken preventive measures to protect their audio files?  Are you storing your music library on a newer, experimental file system on a different computer?  Maybe you zip all your albums and store them on external media so you can checksum them if the album has problems later on.

No reason to use .zip for checksum if the audio is lossless - FLAC and WavPack are checksummed formats. For lossy files, one is stuck with the codec. (Is there BTW any simple utility that calculates md5 from audio and writes to tag, and can verify?)
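
On the "md5 from audio" idea: the trick is to hash only the decoded audio, so that retagging the file does not change the hash. A rough sketch for WAV files using nothing but the standard library is below; the function name is made up, and writing the result into a tag (or decoding lossy formats) would need a tagging or decoding library that is not shown here.

```python
import hashlib
import wave


def audio_md5(path: str) -> str:
    """MD5 of the PCM frames only; the header and any other chunks are ignored."""
    digest = hashlib.md5()
    with wave.open(path, "rb") as wav:
        while True:
            frames = wav.readframes(65536)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()
```

Store the result somewhere outside the audio data, then recompute and compare later to verify.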

I am not so sure if "experimental" is an appropriate term for the file systems in question (zfs/btrfs) ... unless you were thinking of something else?
I read about the chirping or static noises in an article.  I figure static noise or that fuzzy sound could also be from an old vinyl rip.  Also a damaged CD or poor software as you said could cause those noises.  But if one day you listen to one of your albums you have heard a dozen times and hear a CHIRP in the middle, that sounds like a case of bitrot.

I don't have any WavPack files, I primarily use FLAC, MP3, and have experimented with a few other codecs.  So with FLAC I can simply highlight every album in foobar and verify integrity to check if the data has been changed.  With MP3 there is no checksum embedded.  So why not zip your albums and put them on an external hard drive?  Sure having 2 copies of every album will take up space, but then you have a backup that is verified.
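
The zip idea works because every member of a zip archive carries a CRC-32 that can be rechecked later. A small sketch using Python's zipfile module (the wrapper function name is made up; `testzip()` is the standard-library call that does the checking):

```python
import zipfile


def first_corrupt_member(zip_path: str):
    """Return the name of the first member that fails its CRC-32 check, or None."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.testzip()
```

Since mp3 files barely compress, the archive can be created in store mode (`zipfile.ZIP_STORED`) and it still gets the per-file CRC. Note this only detects damage; it cannot repair it.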

Experimental was a poor choice of word.  I also seem to make topics late at night when I am tired.

 