Hydrogenaudio Forums

Hydrogenaudio Forum => General Audio => Topic started by: deathcoreRULES on 2016-05-27 22:06:41

Title: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-05-27 22:06:41
I don't know how many people here are familiar with the term "bit rot", but it's basically silent corruption of data. Something like a malfunctioning hard drive controller or a loose cable can cause bits to get flipped, corrupting your data. If bit rot has occurred, a backup won't save your files because you will just be replacing the old good backup with a new bad one. The two most common file systems these days, HFS+ (Mac OS X) and NTFS (Windows), do not protect against bit rot.

While bit rot is silent corruption, people have discovered it by hearing the damage it has done to their audio files. Ever heard an old MP3 with a very tiny blip of static? Guess what, that's one broken frame screwing up a few milliseconds' worth of audio data. Other annoying noises, like an MP3 that starts with a chirp, a track that has a "click" in it, bad pop noises, or a song that skips ahead a few seconds, are caused by bit rot.
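
For a sense of scale, here is a rough sketch of how much audio one MP3 frame carries. It assumes MPEG-1 Layer III, where a frame holds 1152 samples, and a 44.1 kHz sample rate as is typical for CD rips; the function name is only illustrative. Under those assumptions a single lost frame is closer to 26 ms than to a few milliseconds.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III frame size, in samples


def frame_duration_ms(sample_rate_hz: int = 44100) -> float:
    """Playback time covered by one frame, in milliseconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz * 1000.0


if __name__ == "__main__":
    print(f"{frame_duration_ms():.1f} ms per frame at 44.1 kHz")
```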

So I am curious if anyone here has taken preventive measures to protect their audio files? Are you storing your music library on a newer, experimental file system on a different computer? Maybe you zip all your albums and store them on external media so you can checksum them if the album has problems later on. Or maybe you don't care at all and will re-rip your songs or albums if you find problems in the future.
Title: Re: Protecting audio files from bit rot?
Post by: saratoga on 2016-05-27 22:24:11
Backups are really the only thing you can do to protect against hardware failure.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-05-27 22:46:38
Use audio formats with checksum support, like flac and wavpack. It cannot prevent or avoid data corruption but you don't need to listen to the files one by one to verify file integrity. There are tools like this:

http://www.foobar2000.org/components/view/foo_verifier

That's one of the reasons why I don't archive audio files in wav format.

Title: Re: Protecting audio files from bit rot?
Post by: Chibisteven on 2016-05-27 23:03:56
Multiple backups are your best defense before it ever starts to happen. Always check files before you back up a recent change, and if you notice problems with corruption, find the cause of it. In a lot of cases it may be necessary to either wipe the drive or replace it, depending on the cause of the corruption.

Also check your backups for signs of failure periodically. Discard any backup that's gone bad and replace it with a new backup that's free of problems ASAP.

Offline backup on several external devices, both on site and off site, that are mostly not connected to any computer or network is better, because of the possibility of malware attacks such as ransomware, as well as fire and natural disasters.
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-05-28 00:52:15
What a load of audiofail BS. "Bit rot" *facepalm*. There is nothing specific to your audio files compared with other data; you don't need to safeguard them in any special way.

all the same methods apply

Backup
Data verification checksum/hash
Error recovery

You might want to look into .par files
Title: Re: Protecting audio files from bit rot?
Post by: Chibisteven on 2016-05-28 01:39:07
What a load of audiofail BS.  "Bit rot" *facepalm* .  There is nothing specific between you audio files and other data you dont need to safeguard it in any special way.

all the same methods apply

Backup
Data verification chekcsum/hash
Error recovery

You might want to look into .par files


Bit rot (based on the context of the original poster) (https://en.wikipedia.org/wiki/Data_degradation) is failure of the storage medium. It's another term for when a storage medium begins to lose small bits of information quietly. He was talking about data degradation and was worried about losing his collection to it. That is no different from someone worrying about losing their photo collection, or any other personal or valuable data, to silent data corruption (https://en.wikipedia.org/wiki/Data_corruption#Silent_data_corruption).

Terms get thrown around a lot and lots of times people can get confused with them.  Bit rot actually refers to software rot (https://en.wikipedia.org/wiki/Software_rot) and many confuse it with data rot (https://en.wikipedia.org/wiki/Data_degradation).
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-05-28 03:33:37
The problem is that the OP brings it up as if it's something special about audio. It's simple data corruption like any other data corruption, just as likely or unlikely as with anything else (just like you say). But the OP thinks that backups do not work because somehow the bits are rotted and copying over a backup does not overwrite the "bad bits" with good bits.
My point is that there is nothing special to worry about in regard to audio data, and all the same methods apply as with any other kind of data.
If you blindly copy over your good backup with a new version without knowing if it is good or bad, then your backup method is wrong; it has nothing to do with audio data itself.
Which is also why I advised him to look into .par files, which will be able to restore corrupted parts of files. In terms of storage consumption that is far better than having multiple copies of the same files.
Title: Re: Protecting audio files from bit rot?
Post by: schmidj on 2016-05-28 05:43:02
Sven, actually he has a point if you don't check that your files are good before doing a backup and then write the new backup over the old one, as many people do when backing up to a hard drive or even the cloud. If a file has become corrupted without your knowledge, and you subsequently do a backup and overwrite an old, good backup, you have destroyed the last good copy of the file. This is all too common because of the way most automatic backup software is set up.
Title: Re: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-05-28 05:52:07
What a load of audiofail BS.  "Bit rot" *facepalm* .  There is nothing specific between you audio files and other data you dont need to safeguard it in any special way.

all the same methods apply

Backup
Data verification chekcsum/hash
Error recovery

You might want to look into .par files

I never said there was a difference between audio files and other data like pictures. I mention audio files because this is a forum dedicated to music... In my music library I have found audio files that have been damaged in different ways by bit rot, but I have replaced most of them. You don't know it happens because it's silent corruption. You won't know if you have a bad copy of a song or an album if your computer automatically backs up your music library to a hard drive.

I am curious what your method is.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-05-28 06:13:32
You quoted him suggesting you look into .par files. Did you do that?  While you're at it, try zfs if you haven't already.

There are smart copy programs available. fb2k has a file integrity plugin.

If your automated backup system doesn't do:
Quote
Data verification chekcsum/hash
Error recovery
then you should stop using it.
Title: Re: Protecting audio files from bit rot?
Post by: saratoga on 2016-05-28 07:19:41
What a load of audiofail BS.  "Bit rot" *facepalm* .  There is nothing specific between you audio files and other data you dont need to safeguard it in any special way.

all the same methods apply

Backup
Data verification chekcsum/hash
Error recovery

You might want to look into .par files

I never said there was a difference between audio files or other data like pictures.  I mention audio files because this is a forum dedicated to music...  In my music library I have found audio files that have been damaged in different ways by bit rot, but have replaced most.  You don't know it happens because it's silent corruption.  You won't know if you have a bad copy of a song or an album if your computer automatically backups your music library to a hard drive.

I am curious what your method is.


I don't think I've ever seen an audio file damaged by bit rot. You'd have to be fairly unlucky to have an audible difference, or have a hard drive that was rapidly failing.  
Title: Re: Protecting audio files from bit rot?
Post by: kode54 on 2016-05-28 07:54:49
Hard drives, as well as optical media, are protected by cyclic redundancy checks (CRC), as well as some error correction codes. When a sector has been demolished by hardware failure, it won't simply result in flipped bits; it will result in attempts to recover the data, followed by outright read errors. This is more likely to happen, if at all, due to either defective hardware or aging hardware.
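
To illustrate why a checksummed sector does not come back silently wrong, here is a minimal sketch using Python's `zlib.crc32`. This is not what drive firmware runs (drives use their own sector-level codes); the 512-byte buffer and the helper names are assumptions made for the example. It only shows that a single flipped bit changes the CRC, so the mismatch is detectable.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def is_intact(data: bytes, stored_crc: int) -> bool:
    """Compare the CRC-32 of data with the value stored when it was written."""
    return zlib.crc32(data) == stored_crc


if __name__ == "__main__":
    sector = bytes(range(256)) * 2      # 512 bytes standing in for one sector
    stored = zlib.crc32(sector)         # checksum recorded at write time
    damaged = flip_bit(sector, 1234)    # simulate one flipped bit
    print("original intact:", is_intact(sector, stored))
    print("damaged intact: ", is_intact(damaged, stored))
```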

"Flipped bits" may occur if the file is being read, then a bit flips on the way to memory, or in memory, but only if it's then rewritten to the file that way. I guess maybe you can try to guard against this happening by having a board capable of using ECC memory, and install some ECC memory, but then you're looking at crashes or other memory errors instead of silently losing your files. I still think this sort of thing is incredibly rare, though.

Now, flash memory, like in USB drives, memory cards, and solid state drives, can suffer from bit rot if it is not powered on occasionally. I'm not sure if the devices need to be explicitly refreshed, or simply powered on regularly. And I think I more recently heard that the bit rot factor doesn't really creep into the game until you're looking at near end-of-life devices, approaching unusable due to most of their capacity already hitting its maximum write cycle count, and reallocations eating into the usable capacity. I could be wrong, though. I don't think anyone has done extensive testing of this; feel free to research it on your own.
Title: Re: Protecting audio files from bit rot?
Post by: Chibisteven on 2016-05-28 09:14:56
Now, flash memory, like in USB drives, memory cards, and solid state drives, can suffer from bit rot, if they are not kept powered on occasion. I'm not sure if they need to be explicitly refreshed, or simply powered on regularly. And I think I more recently heard that the bit rot factor doesn't really creep into the game until you're looking at near end of life devices, approaching unusable due most of their capacity already hitting its maximum write cycle count, and reallocations eating into the usable capacity. I could be wrong, though. I don't think anyone has done extensive testing of this, feel free to research this on your own.

Flash memory is just starting to come into play with larger and larger capacities. Time will tell us a lot about newer emerging technology such as solid state drives compared with traditional mechanical ones. I'd be very interested in what can happen with this type of technology in the long term, without finding out the hard way on my own, that is. I'm not very trusting of it for backup purposes because of some of the stuff that has been reported. I certainly use flash drives and SD cards a lot, though, but those get plugged in a lot or used as storage in media players and phones.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-05-28 10:11:47
I remember that when I was still using Windows XP, some audio and video files on my hard drive really did get noticeably corrupted, but I suspect I could have ignored some errors, like chkdsk reports or lost clusters due to power failure or unexpected reboots. I can't remember if the disk/partition was using FAT32 or NTFS, though. It never happened again after I upgraded to Windows 7.

In case of CDR data corruption the drive will simply retry and spin like crazy and throw a CRC error.

A good thing about audio file formats with internal checksum support is that the checksum is not affected by metadata, so updating tags will not change the checksum. However, it seems that checksums are not popular among lossy formats, but at least wavpack supports them.
Title: Re: Protecting audio files from bit rot?
Post by: Nongorilla on 2016-05-28 13:04:29
Many file formats have checksums built in that can be used to detect bit rot with the right tools. If I may shamelessly plug my own freeware, I have written a couple of bit rot detectors available here:

https://Mediags.codeplex.com

The Mediags console program will sweep your directories and report bit rot in .flac, .mp3 (LAME only), and other file types by verifying all available checksums. This is possible because these formats store internal checksums of their data. Formats such as ALAC cannot be checked this way because they have no internal checksums.

The site above also features the UberFLAC product (console & WPF), which will do the same checks as Mediags on your .flac files and will also do deep analysis of an EAC rip, which includes verifying .flac files against the CRC-32 values in the EAC log.

These tools are intended for archivists who care about data integrity down to the bit.  These apps require .NET 4 and are purely portable.
Title: Re: Protecting audio files from bit rot?
Post by: Kees de Visser on 2016-05-28 14:16:26
FWIW, I've never been disappointed (so far) by using separate audio drives and/or partitions, thus keeping audio data apart from the system drive.
Title: Re: Protecting audio files from bit rot?
Post by: Thad E Ginathom on 2016-05-28 16:19:58
What a load of audiofail BS.  "Bit rot" *facepalm* .  There is nothing specific between you audio files and other data you dont need to safeguard it in any special way.

all the same methods apply

Backup
Data verification chekcsum/hash
Error recovery

You might want to look into .par files

I never said there was a difference between audio files or other data like pictures.  I mention audio files because this is a forum dedicated to music...  In my music library I have found audio files that have been damaged in different ways by bit rot, but have replaced most.  You don't know it happens because it's silent corruption.  You won't know if you have a bad copy of a song or an album if your computer automatically backups your music library to a hard drive.

I am curious what your method is.


I don't think I've ever seen an audio file damaged by bit rot. You'd have to be fairly unlucky to have an audible difference, or have a hard drive that was rapidly failing.  
I remembered when I was still using Windows XP, there were really some audio and video files in my harddrive getting noticeably corrupted, but I suspect I could have ignored those error like some of the chkdsk reports or lost clusters due to power failure or unexpected reboots. Couldn't remember if the disk/partition was using FAT32 or NTFS though. Never happened again since I upgraded to Windows 7.

In case of CDR data corruption the drive will simply retry and spin like crazy and throw a CRC error.

A good thing about audio file format with internal checksum support is the checksum will not be affected by metadata so updating tags will not change the checksum. However it seems that checksum is not popular among lossy formats, but at least wavpack supports it.
Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarily music, well... it's music. That is not saying that there is anything special about data files that happen to contain music. We, the end users, are not necessarily expected to know and use the precisely correct technical terms.

FWIW, I've never been disappointed (so far) by using separate audio drives and/or partitions, thus keeping audio data apart from the system drive.

What, in case spreadsheet data starts leaking numbers into the audio? ;)

It might make sense to keep separate those filesystems that regularly have data added to them, but relatively rarely have it deleted or moved around. I don't have any justification for that: it is intuitive, which means I could easily be BSing!

I have had one audio file that went bad, in that it was playable from end to end, but the sound was heavily distorted. I never found out how that happened, or if (quite probable) it was something that I did.

Fast-failing hard disks are all too common. What I suspect is that it starts as slow failure, nothing that the system cannot handle, and thus is not noticed. Then it escalates and data becomes unreadable. Any hard disk that starts to make odd noises or give errors should be replaced urgently. 
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-05-28 16:34:31
I remembered when I was still using Windows XP, there were really some audio and video files in my harddrive getting noticeably corrupted, but I suspect I could have ignored those error like some of the chkdsk reports or lost clusters due to power failure or unexpected reboots. Couldn't remember if the disk/partition was using FAT32 or NTFS though. Never happened again since I upgraded to Windows 7.

In case of CDR data corruption the drive will simply retry and spin like crazy and throw a CRC error.

A good thing about audio file format with internal checksum support is the checksum will not be affected by metadata so updating tags will not change the checksum. However it seems that checksum is not popular among lossy formats, but at least wavpack supports it.
Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarilly music, well... it's music. That is not saying that there is anything  special about data files that happen to contain music.  We, the end users, are not necessarily expected to know and use the precisely correct technical terms.


It seems you quoted something that I didn't write in my original post. Maybe they are actually part of your reply?
Title: Re: Protecting audio files from bit rot?
Post by: saratoga on 2016-05-28 17:35:18
The key thing to remember is that on mechanical hard drives (and likely flash memory), read and ECC errors are often followed by more errors and then soon after by hardware failure:

https://www.backblaze.com/blog/hard-drive-smart-stats/

If you look at that data, as soon as a hard drive has even one ECC error, it becomes likely to fail completely, and if it has more than one, it's likely to fail even sooner.  

For most people "bitrot" is only going to be noticed when the system stops booting or the drive refuses to spin up, at which point all data will be lost. For this reason you really need a backup. If you're worried about your backup's integrity, use a format with CRC or compute an MD5 hash of the backed-up files.
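
One way to act on the MD5 suggestion is a manifest of hashes taken when the backup is known to be good and re-checked later. The sketch below is only an illustration using Python's standard `hashlib`; the function names and the manifest layout (relative path mapped to hex digest) are assumptions of this example, not an existing tool.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its MD5 hex digest."""
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are missing or no longer match their hash."""
    bad = []
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file() or md5_of_file(path) != expected:
            bad.append(name)
    return bad
```

Build the manifest right after a backup you trust, store it next to the backup, and run `changed_files` before relying on that backup or replacing it.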

Title: Re: Protecting audio files from bit rot?
Post by: yourlord on 2016-05-29 00:09:23
Greynol already told you the answer. ZFS (or btrfs). I've been bitten by bitrot before when I had a drive silently corrupt hundreds of pictures. I didn't catch it until I randomly looked through the pictures. Once found I went back through my backups and found all the backups except my very oldest (and due to be destroyed) copy had all been quietly copied with errors. I was saved only by the fact I was lazy about destroying my oldest backup that one time. I switched to a file system that offers automatic fault detection and correction. ZFS. Problem solved.
Title: Re: Protecting audio files from bit rot?
Post by: 4season on 2016-05-29 00:20:22
Lots of reasons a person might experience audio anomalies such as the OP describes, but I'd think if the files themselves contained such faulty data, it would be plainly visible using tools like Audacity.

Thinking there are a couple of likely possibilities:

1. Data was always faulty, but maybe previous playback systems were more fault-tolerant, else listener simply didn't notice before.

2. Something in the digital delivery and/or playback chain isn't right.

When I experience such things, it's generally when I'm streaming tunes over the internet, and the provider seems to be experiencing network problems. Sometimes switching to another station and back in order to clear out the buffer is all it needs, else I tune in later in the day.

Experimental filesystems: Anyone who knows what that means knows not to store important data there.

Malware: Get protected! This includes Mac OS X: I encountered my first OS X-specific malware last year, it was a pain to clean out, and this was only one of the more "benign" sorts of adware, not something really insidious, like ransomware.
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-05-29 00:51:26
Sven, actually he has a point, if you don't check that your files are good before doing a backup, and write the new backup over the old one, as many people do when backing up to a hard drive or even the cloud.  If a file has become corrupted without your knowledge, and you subsequently do a backup and overwrite an old, good backup, you have destroyed the last good copy of the file.  All too common because of the way most automatic backup software is set up.

We must be reading different stuff. Deathcore says backups do not protect against it.
I am the one bringing up the fact that this is only true if you are doing backups the wrong way:
"If you blindly copy over your good backup with a new version without knowing if it is good or bad, then your backup method is wrong,"
right above your post. So I don't know why you want to point my own point out to me...
If you keep destroying your backup with a new version, you are logically only able to restore from your last backup. Basically the entire backup method is done badly if this is how you do it.

Most storage media have ECC built in at the hardware level, invisible to the user, and will report when it comes under heavy use and backups might be needed, AHEAD of the time when bits actually get lost, flipped or wrong. Error checking is also built into your internet data transfer protocols.
And if you make sure your computer is working properly, bit rot is not a problem.

Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarilly music, well... it's music.

If you lose important data because your drive goes bad, your backup method is wrong.
If you lose your important data because your house catches fire, your backup method is wrong.
You can do a proper backup against almost anything short of the total annihilation of Earth.

My ordinary data is on a RAID 5: one drive can fail and I can recover it. Really important data that I can never retrieve again I also have on a removable USB HDD, in case the house catches fire and I need to grab it quickly, or in case two drives fail and I need a backup to recover from.
I also have it on optical media roughly 7400 km away, which I update roughly once a year.
If my house gets nuked and the entire state of Texas is a burning inferno, my important data is still intact.
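
The "one drive can fail and I can recover it" property comes from parity. The sketch below shows only the XOR parity idea on three small in-memory blocks; a real RAID 5 array stripes data and rotates parity across the drives, and the function names here are made up for the example. Note that parity rebuilds a block that is known to be missing; on its own it does not tell you which block is silently wrong.

```python
from __future__ import annotations

from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing block from the survivors plus parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    drives = [b"abcd", b"efgh", b"ijkl"]   # data blocks on three drives
    parity = xor_blocks(drives)            # stored on a fourth drive
    recovered = rebuild_missing([drives[0], drives[2]], parity)
    print(recovered == drives[1])          # the lost block is restored
```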

There is nothing new about this; it has always been the case that something may go bad, and the common sense of a proper backup will save you.
To bring up such an old, solved and unchanged topic, and in a way where only audio data is mentioned: yes, it looks like a huge pile of audiofail. When DeathCore goes into deep detail about one bit in a frame, blah blah, it seems like he thinks it is only about audio data. It's hard to see that he is talking about general data when he only talks about specific data. And again, it's nothing new...

The chance of this bit rot is very small (outside of actual disc rot). For it to cost you data while you are doing proper backups is almost impossible.
I never said there was a difference between audio files or other data like pictures.  I mention audio files because this is a forum dedicated to music...  In my music library I have found audio files that have been damaged in different ways by bit rot, but have replaced most.  You don't know it happens because it's silent corruption.  You won't know if you have a bad copy of a song or an album if your computer automatically backups your music library to a hard drive.

I am curious what your method is.

When you bring up a topic only in a very specific sub-context, it's really hard to see that you mean the general case and not the specifics you are pointing out. When you say backups do not work, and you are only talking about audio data, that sets it apart from normal data, because in general backups work.
So, given the very old general knowledge that backups work against data corruption, and you saying they do not work for audio data, can you see how easy it is to read it as if you think audio data needs special attention?
The alternative was to think that you didn't even know that backups work against data corruption. But I didn't think you would know so little about backups.

If you regularly have bit rot, you need to find the source of it. It is NOT something that happens in an ordinary system; it's a fault.
It gets treated as normal mostly because most people accept an unstable computer as if it were an ordinary system.
If this is a recurring effect, you need to test your hardware:
- Memory test (memtest86, or press F7 during Windows load and change it to the extended test)
- CPU test (Prime95 or Linpack; anything less is not going to max-stress your CPU)
- HDD test (the HDD manufacturer has test software to use)

To proof yourself against data corruption, you need to apply a proper backup/restore method.
Checksum/hash files will help you identify good/bad data.
If you blindly copy data that has not changed, you are doing it wrong and inviting the mistake of overwriting good data with bad data.
Have more than one line of backups. If you are really so nervous about a single bit flip, which is so rare, you need multiple backups from multiple different times: 1 year old, 1 month old, 1 week old, etc.
Par files will help you restore corrupted data in case something goes wrong with your backups as well. Use them for anything you want to keep in long-term storage.
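
The "don't blindly copy" rule can be sketched as a backup step that refuses to overwrite a backed-up file when the source content has changed but its modification time has not, which is what silent corruption would look like. This is a heuristic invented for illustration (the function names and the mtime rule are assumptions, and it relies on `shutil.copy2` preserving timestamps), not a replacement for real backup software.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def backup_with_check(src_root: Path, dst_root: Path) -> list[str]:
    """Copy new or edited files; return suspects that were NOT overwritten."""
    suspects = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if dst.exists():
            if sha256_of_file(src) == sha256_of_file(dst):
                continue  # unchanged, nothing to copy
            if src.stat().st_mtime_ns == dst.stat().st_mtime_ns:
                # Content differs but the file was never knowingly modified:
                # keep the old backup and report the file instead.
                suspects.append(rel.as_posix())
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 keeps the source's modification time
    return suspects
```

Any path returned by `backup_with_check` should be inspected by hand, since at that point either the source or the backup copy has gone bad.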

Still, it is a very rare thing to happen if you are using a properly working computer.
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-05-29 22:09:08
Ever heard an old MP3 with a very tiny blip of static? Guess what, that's 1 frame broken screwing up a few milliseconds worth of audio data.

I was always under the impression that such blips were not really silent corruption, but rather someone missing the last segment of a p2p download, possibly from copying the file just as the p2p client was about to finish but had not yet written the last few bits to the file.
Also I suspect that artifacts at the start are often non-audio written by buggy software, and interpreted as audio by some player.

Anyway, this is not really on-topic to the big question.

So I am curious if anyone here has taken preventive measures to protect their audio files?  Are you storing your music library on newer, experimental file system on a different computer?  Maybe you zip all your albums and store them on external media so you can checksum them if the album has problems later on.

No reason to use .zip for checksum if the audio is lossless - FLAC and WavPack are checksummed formats. For lossy files, one is stuck with the codec. (Is there BTW any simple utility that calculates md5 from audio and writes to tag, and can verify?)
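
The idea behind such a utility is to hash only the audio and not the tags, which is what FLAC does with the MD5 it stores of the unencoded audio. As a minimal sketch, the function below does this for uncompressed WAV files with Python's standard `wave` module; a lossy format would need a parser or decoder for that codec, which is not shown, and the function name is just an illustration. Writing the digest into a tag is also left out.

```python
import hashlib
import wave


def audio_md5(wav_path: str, frames_per_read: int = 65536) -> str:
    """MD5 of the PCM frames only, ignoring headers and any other chunks."""
    digest = hashlib.md5()
    with wave.open(wav_path, "rb") as wav:
        while True:
            frames = wav.readframes(frames_per_read)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()
```

Because only the sample data is hashed, two files with the same audio but different trailing metadata give the same digest, while a whole-file MD5 would differ.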

I am not so sure if "experimental" is an appropriate term for the file systems in question (zfs/btrfs) ... unless you were thinking of something else?
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-05-29 22:10:30
Lots of reasons a person might experience audio anomalies such as the OP describes, but I'd think if the files themselves contained such faulty data, it would be plainly visible using tools like Audacity.

I absolutely do not want to rely on inspecting every second of every file manually.
Title: Re: Protecting audio files from bit rot?
Post by: Thad E Ginathom on 2016-05-30 17:48:26
I remembered when I was still using Windows XP, there were really some audio and video files in my harddrive getting noticeably corrupted, but I suspect I could have ignored those error like some of the chkdsk reports or lost clusters due to power failure or unexpected reboots. Couldn't remember if the disk/partition was using FAT32 or NTFS though. Never happened again since I upgraded to Windows 7.

In case of CDR data corruption the drive will simply retry and spin like crazy and throw a CRC error.

A good thing about audio file format with internal checksum support is the checksum will not be affected by metadata so updating tags will not change the checksum. However it seems that checksum is not popular among lossy formats, but at least wavpack supports it.
Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarilly music, well... it's music. That is not saying that there is anything  special about data files that happen to contain music.  We, the end users, are not necessarily expected to know and use the precisely correct technical terms.


It seems you quoted something that I didn't write in my original post. Maybe they are actually part of your reply?


"Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarilly music, well... it's music. That is not saying that there is anything  special about data files that happen to contain music.  We, the end users, are not necessarily expected to know and use the precisely correct technical terms." --- mine

Yes, sorry, looks like I screwed up there. In which case I probably made other mistakes too. Apologies to anybody else if I accidentally put words in their mouths or attributed their quotes to another person.
Title: Re: Protecting audio files from bit rot?
Post by: Thad E Ginathom on 2016-05-30 18:17:48
Things go wrong with computers. When things go wrong with hard disks, it can cost us our data. If that data is primarilly music, well... it's music.

If you loose important data because you drive goes bad. your backup methods is wrong.
If you loose you important data because you house goes on fire. you backup method is wrong.
almost anything less than total annihilation of earth, you can do a proper backup against
Completely agree.
My ordinary data is on a raid5. one drive can go wrong and i can recover it.  really important data that i can never retrieve again i also have on a removable  USB HDD, in case the house goes on fire and i need to grab it quickly with me, or if 2 drives should fail i have a backup to recover it
I have it on a  optical media roughly 7400 KM/s away away that i update roughly once a  year.
If my house get nuked and the entire state of Texas is a burning inferno.  my important data is still intact.
If I remember rightly, what the books say is that RAID is about availability and not about security. There are probably a number of reasons for that which I have forgotten, as such things have not been my trade for over 13 years, but you can lose an entire RAID setup to an electrical glitch, or even to a system fault writing bad data over your good files. If your HDD is connected when lightning strikes the next block, your RAID, internal disks and external disk are probably now as good as dust. Possibly, even if the USB drive is not connected, everything electronic in your house might be dead.

I sometimes reflect that if I had to pass the system audit (and it wasn't exactly tough or bullshit-proof) that I did have to pass every year, with the backup system I have at home, it would fail miserably. I'm afraid yours is worse.

I have two (three, most times) off-machine backup copies of my operating system and all my data. Once a week, an off-site disk comes home and the onsite disk goes away.

--- the once-a-week is subject to human failure, and is sometimes every three weeks. Back in working days, not getting a backup tape off-site every day was a big deal.

--- I know I am not protected against the creeping corruption that is unlikely but not impossible.

My system would not pass muster professionally, even with me! I rate it good enough for my data, and my exposure to data loss of one to three weeks is actually acceptable. Even in the face of my laziness, a substantial acquisition of new data, such as photos from a holiday, will be backed up and sent off site much sooner rather than later.

There is another thing about backups. No backup is actually known to be good until it is tested --- and I have never, not even in work, been able to have a duplicate system just to test backups with. Life as an IT manager: long periods of boredom punctuated by moments of intense fear ;)
Title: Re: Protecting audio files from bit rot?
Post by: 2tec on 2016-05-30 18:28:32
For most people "bitrot" is only going to be noticed when the system stops booting or the drive refuses to spin up, at which point all data will be lost.

It wouldn't hurt if folks actually looked at the SMART data once in a while.  ;D
Title: Re: Protecting audio files from bit rot?
Post by: audiophool on 2016-05-30 20:22:56
I don't think I've ever seen an audio file damaged by bit rot. You'd have to be fairly unlucky to have an audible difference, or have a hard drive that was rapidly failing.  
To me, this is the key issue. Sure, there's the possibility of bit rot mainly due to DRAM memory errors (happens more frequently with non-ECC-memory). A single flipped bit may be a huge issue with program code. But, with audio files, it's pretty unlikely a single flipped bit would be audible.

EDIT: One should also put the frequency of bit rot into perspective. I believe a number that's sometimes tossed around is around one flipped bit per TB written if the RAM is non-ECC. If, for example, you're doing backups of your music collection from your desktop rig to your NAS, it's fairly unlikely that bit rot will ever become an issue. Bit rot is a more serious problem in data centers, cloud storage and such where data may be moved around somewhat frequently.
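To put that figure into numbers, here is a rough back-of-the-envelope sketch. It assumes the rate quoted above (one flipped bit per TB written, which is a number "tossed around", not a measured value) and treats flips as independent rare events, so the Poisson formula applies. The function names are made up for illustration.

```python
import math

# Assumption: the rate quoted in the post, one flipped bit per terabyte written.
FLIPS_PER_TB = 1.0


def expected_flips(gigabytes_written: float, flips_per_tb: float = FLIPS_PER_TB) -> float:
    """Expected number of flipped bits for a given amount of data written."""
    return gigabytes_written / 1000.0 * flips_per_tb


def prob_at_least_one_flip(gigabytes_written: float, flips_per_tb: float = FLIPS_PER_TB) -> float:
    """Chance of one or more flips, modelling flips as a Poisson process."""
    return 1.0 - math.exp(-expected_flips(gigabytes_written, flips_per_tb))


if __name__ == "__main__":
    for size_gb in (50, 500, 5000):
        print(size_gb, "GB written:",
              expected_flips(size_gb), "expected flips,",
              round(prob_at_least_one_flip(size_gb), 3), "chance of at least one")
```

Under that assumed rate, copying a 500 GB collection once gives about a 39% chance of at least one flipped bit somewhere in it, and that single bit still has to land somewhere audible.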
Title: Re: Protecting audio files from bit rot?
Post by: Thad E Ginathom on 2016-05-30 22:27:41
I also have it on optical media roughly 7400 km away that I update roughly once a year.

Thank you! You have made me think (the wheels grind slow) about my procedures, and realise that I lack, and need, that longer-term-historical aspect. I'll take that on board.

My off-site storage is only about 10km away. I didn't plan it to be safe from the major flood in my city last December, because nobody saw the event coming in advance. As it happens, all of my original electronics escaped the water anyway, but the property where the backup is kept never even got wet.  Physical separation counts. The next-door neighbour is a bad choice for reasons other than lightning. In fact... there is a hell of a lot to think about.
Title: Re: Protecting audio files from bit rot?
Post by: apastuszak on 2016-05-30 22:48:04
There are filesystems specifically designed to protect against bitrot.  I have all my data on a Linux "server" in my basement that has a 4 TB mirrored btrfs drive.  I run a btrfs scrub about once every 2 weeks.  If it detects a checksum mismatch, it will copy the file from the mirror.  There are other cool features such as snapshotting that can also help.

On the BSD side, ZFS is a similar filesystem.  The FreeNAS OS will let you take a PC and turn it into a NAS using ZFS and help protect against bitrot.  I think some high end NASes also offer bitrot protection. 
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-05-31 17:58:58
Btrfs is still considered experimental, but the simple schemes (read: not RAID5) are stable. I'm using it in RAID1 mode at the moment. Even though I had a problem with the filesystem itself once, the data is safe, mainly because a broken FS refuses to mount. For those that want a non-experimental system, ZFS is an alternative.

Non-checksummed RAID1 setups died for me after I read this article (http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/). Even without RAID1, a checksummed FS is useful because you immediately notice the file is corrupted, and maybe have the chance to restore an older copy before, say, deleting the older copy to clean up some space.
Title: Re: Protecting audio files from bit rot?
Post by: yourlord on 2016-05-31 23:17:09
If ZFS (or btrfs) is doing its job you won't notice a file is corrupted. It will silently correct the problem and write the correct data to another location. You would never get the corrupted version to notice it. You will only notice an error counter when you check the pool status.
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-06-01 12:41:42
If this story can warn anyone ... I went fifteen years without a single drive failure, but when it rains, it pours; this one was one of the more annoying ones:

I had a Windows XP that enabled write caching on NTFS drives. (Likely, there were no external other-than-FAT back when XP was fresh.) My mobo and HDD were not the best of friends, and USB started to drop and reconnect - I think.
With write caching, that meant I got a bubble message saying that data had been lost. No way (for me) to tell which data. Unfortunately, it had happened while the file table was written to, and as a result, the drive and Windows messed up which file segments belonged to what files. A song would change midway just because the file segment was overwritten with something else.

Back when I decided on FLAC, I did not realize how important a checksummed file format was, but I quickly learned to appreciate it. The lack of a checksum in mp3 was kinda "just another way this file format sucks" (and mp3 decoders are often only accurate down to a roundoff error anyway) - but then came ALAC in MP4. Lossless audio not even worth a 128-bit hash ... WTF?
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-01 15:13:05
Lame stores a checksum of the encoded audio data.
Title: Re: Protecting audio files from bit rot?
Post by: StephenPG on 2016-06-01 15:20:15
If this story can warn anyone ... I went fifteen years without a single drive failure, but when it rains, it pours; this one was one of the more annoying ones:
 

16 years, I acquired an old Amstrad PC1640 back in 2000, with a 20MB 5 1/2" HDD!

That was replaced by an AMD K6/2 450 MHz running 98se, still working.

Then I got into games again, enter an Athlon 1.4GHz again on 98se, still working.

Then into XP with an Intel Celeron, I squeezed an additional 1TB HDD into this with all my CDs ripped on it, and I'm still using this drive in a i3 W7 box used just to run my Duet.

My gaming rig is an i5 running W10 on a 500GB ssd and a 1TB usb HDD.

Some more info on bit rot from youtube.

https://www.youtube.com/watch?v=Ie9qomn3_3U

Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-01 21:03:05
If ZFS (or btrfs) is doing its job you won't notice a file is corrupted. It will silently correct the problem and write the correct data to another location. You would never get the corrupted version to notice it. You will only notice an error counter when you check the pool status.
That's only assuming you are using one of the redundant schemes. Of course, only those make sense for actual archiving.

However, there are still strange bugs creeping up in btrfs sometimes, which lead to actual filesystem corruption. The first rule in that situation is: don't panic, get help. Either on the mailing list or the irc channel. Most horror stories I heard about btrfs failing resulting in data loss were cases where people just tried to fix the system themselves and made a mistake. Or ignored the fact the feature they used was explicitly marked as experimental.

I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly. It's also less flexible than btrfs (no dynamic switching of redundancy scheme, e. g. from single -> raid1 -> raid 0 -> raid5 -> single).

Can ZFS use a single storage device as a "semi-raid1" (aka -d=dup in btrfs)? While useless against hardware failures, you can still use this scheme to protect yourself from bitrot. Just keep in mind it halves the writing speed as well as the capacity.
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-06-01 22:02:08
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
Title: Re: Protecting audio files from bit rot?
Post by: yourlord on 2016-06-01 23:03:34
I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly.

This is true of any filesystem. If you foolishly aim a gun (ZFS, btrfs, XFS, ext or otherwise) at your face and pull the trigger, bad things are likely to happen.

Can ZFS use a single storage device as a "semi-raid1" (aka -d=dup in btrfs)? While useless against hardware failures, you can still use this scheme to protect yourself from bitrot. Just keep in mind it halves the writing speed as well as the capacity.

ZFS has a copies parameter which will instruct the filesystem to create as many copies of the data as you like. If on a single device, it stores them in different locations on that device (I use this feature with the SSD my base OS is on).
Title: Re: Protecting audio files from bit rot?
Post by: mjb2006 on 2016-06-02 02:26:06
Sure, any filesystem can have its internal database or other internal dependencies become corrupt and suddenly you can't access your files. It's just that when it happens on ZFS, you will lose access to everything, and recovery is difficult if not impossible (http://mbruning.blogspot.com/2009/12/zfs-data-recovery.html). Happened to me within 48 hours of trying ZFS for the first time about 6 months ago! I didn't do anything unusual, just used my OS like normal, and all of a sudden it wouldn't boot due to some kind of internal corruption. I looked and looked for help, only to find that others who got the same error messages as me all just gave up and reformatted. So I will revisit ZFS only when easy-to-use recovery tools become available or it is beefed up so that its internal failures are self-correcting or at least made less catastrophic.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-02 02:51:50
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
Anyone with enough motivation.  How does the question change the fact that Lame offers a means to verify the encoded audio (not that I didn't already address the general issue in an earlier post)?
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-06-02 08:54:37
Lame stores a checksum of the encoded audio data.

Yes it does, but who reads it?
Anyone with enough motivation.  How does the question change the fact that Lame offers a means to verify the encoded audio (not that I didn't already address the general issue in an earlier post)?

So how do I get lame to report errors? Using the --verbose option is apparently not enough.
Neither foobar2000 nor VUPlayer's audiotester.exe checks this CRC. Nor does the mp3val I just downloaded for the hell of it.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-06-02 10:58:18
So how do I get lame to report errors? Using the --verbose option is apparently not enough.
Neither foobar2000 nor VUPlayer's audiotester.exe checks this CRC. Nor does the mp3val I just downloaded for the hell of it.
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

P.S. The fhg encoder in Adobe Audition 1.5 also has an option to write CRC checksum, but even Audition itself cannot detect file corruption. I used a hex editor to deliberately change a byte and there is no warning or error when I reopen the file.
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-06-02 14:21:46
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

Whoa, this is what I get for seeing only the FLAC part :-)

Next on the wishlist: write checksum to already encoded mp3 files.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-02 16:01:27
Encspot is another; though the existence of a validation tool doesn't change the fact that Lame writes a checksum.
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-06-02 16:34:57
Forgive me for snipping in the quote :D

Completely agree.
I agree with you agreeing :D
if I remember rightly, what the books say is RAID is about availability and not about security....
You are right, RAID only protects against a drive failure.
It will not protect against software corruption or accidental deletion, because it's not a point-in-time backup.
That's what my USB drive is for: it's a backup in case I hit Ctrl+A and Shift+Del at the wrong time :D
Still, a lightning strike or house fire can take it all out, but that is why I have my most important data (in this case not audio but the pictures of my daughter) on optical disc at my parents' in Denmark. I go back to Denmark roughly once a year to visit my family (I'm living in Texas). Of course two house fires or lightning strikes could still erase all my data both in Texas and in Denmark, but I can live with those odds. My disc usually contains the original data and a WinRAR copy with data recovery, and the rest of the space is filled with PAR data. Might be crazy, but I might as well make full use of the space, and since it's a once-a-year thing I can spend the extra time.

I rate it good enough for my data, and my exposure to data loss of one to three weeks is actually acceptable.
And that's exactly the choice people need to make: is this safe enough for me? How valuable is my data and how much do I want to protect it? How bad is the risk of losing it all?

There is another thing about backups. No backup is actually known to be good until it tested
Well, even if you test it now it could theoretically be broken later on when you need it; it's all a game of chance/risk.
However, my advice on optical media backup is to run a simple C1/C2 scan on CDs, and whatever the equivalent is called on DVDs.
I never keep any important data on discs that show C2 errors.

I have had 2 data losses that I can remember.
My last was when my laptop drive died. I knew it was bad, so I had most of my important files away from the drive.
It got bad sectors and sometimes would refuse to read files, but I kept on using it until it finally died and I trashed that laptop (it had many other defects at the time).
And I've had optical disc rot once: the reflective layer liquefied.



Anyway, I will generally advise people to check out PAR and/or checksum/hash file systems.
I have MD5 and SHA-1 verification on all my multimedia files (video or audio) to verify that they are OK. Should one get corrupted, it can be identified before I back it up, and I can verify my old backup as well, simply reusing the same hash files for the backup.
It's a very easy process once you get used to it.
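For anyone who wants to see what such a hash-file workflow boils down to, here is a minimal sketch using only Python's standard library. The manifest layout (digest, two spaces, relative path) and the function names are assumptions chosen for illustration, not the format of any particular tool mentioned in this thread.

```python
import hashlib
import os


def hash_file(path, algo="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large audio/video files need not fit in RAM."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(root, manifest_path, algo="sha1"):
    """Record one '<digest>  <relative path>' line per file found under root."""
    manifest_abs = os.path.abspath(manifest_path)
    lines = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) == manifest_abs:
                continue  # do not hash the manifest itself
            rel = os.path.relpath(full, root)
            lines.append(f"{hash_file(full, algo)}  {rel}")
    with open(manifest_path, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")


def verify_manifest(root, manifest_path, algo="sha1"):
    """Return the relative paths that are missing or whose hash has changed."""
    problems = []
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            digest, rel = line.split("  ", 1)
            full = os.path.join(root, rel)
            if not os.path.isfile(full) or hash_file(full, algo) != digest:
                problems.append(rel)
    return problems
```

The same manifest can be checked against the backup copy by passing the backup folder as `root`, which is the "reuse the hash files for the backup" idea. Note that retagging a file changes its hash, so the manifest has to be regenerated after deliberate edits.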
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-06-02 16:48:03
Back when I decided on FLAC, I did not realize how important a checksummed file format was, but I quickly learned to appreciate it. The lack of a checksum in mp3 was kinda "just another way this file format sucks" (and mp3 decoders are often only accurate down to a roundoff error anyway) - but then came ALAC in MP4. Lossless audio not even worth a 128-bit hash ... WTF?

You know you can just... hash it yourself. There are tons of tools for hashing/checksumming. To avoid a file format because it doesn't have a checksum seems weird to me when the option to hash any kind of file is available.
Now, I will say it IS an added bonus to have it checked at every playback by the player.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-02 17:13:45
I think it goes along the same quirky lines as people wanting to contain an entire album, cue, log, artwork and any other imaginable piece of supporting material in a single, playable file. I imagine the even more quirky will embed a list of hashes for the supporting material in a vorbis comment.  Maybe someone has already addressed the challenge of replacing the list with a single, all-encompassing hash that's still embedded.
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-02 19:03:49
I've read about ZFS a little, and it seems while the filesystem is stable, there are ways to destroy your data when manipulating pools incorrectly.

This is true of any filesystem. If you foolishly aim a gun (ZFS, btrfs, XFS, ext or otherwise) at your face and pull the trigger, bad things are likely to happen.
Maybe I should clarify: I mean ways that appear totally logical. Like adding a device to a pool, except ZFS doesn't rebalance all your data across devices and just uses the newly added device (because it has the most free space) to write most of the data, thus increasing the chance most of the new parity will be on the new device as well. At least that's what I get from the example in this article (http://arstechnica.com/information-technology/2014/02/ars-walkthrough-using-the-zfs-next-gen-filesystem-on-linux/). There also seems to be no way to fix this (or the author doesn't mention one) without taking your data off the pool, recreating it with new number of devices and moving your data back - that's cumbersome.

With btrfs, you add your device, run a rebalance and all the data + parity is spread evenly across all the new devices.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-06-02 19:13:17
This? (I haven't tried)
https://hydrogenaud.io/index.php/topic,111995.msg922998.html#msg922998

Whoa, this is what I get for seeing only the FLAC part :-)

Next on the wishlist: write checksum to already encoded mp3 files.
Just tried. It reported that all files in one of my downloaded mp3 albums have CRC errors.

Code:
- Warning: Indicated LAME audio size incorrect or unrecognized tag block
* Error: CRC-16 check failed on audio header.
* Error: CRC-16 check failed on audio data.

It looks like a compatibility issue rather than real rot.

There are also a few individual mp3 files that have CRC errors without that warning. I checked the files' MD5 and googled them, and I can find the exact same files, so it is not my problem.

It would be good if content providers also provided a checksum file alongside the media files (in case the format doesn't support checksums) so we can verify their integrity, just like some software download sites do.
Title: Re: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-06-03 21:23:22
Greynol already told you the answer. ZFS (or btrfs). I've been bitten by bitrot before when I had a drive silently corrupt hundreds of pictures. I didn't catch it until I randomly looked through the pictures. Once found I went back through my backups and found all the backups except my very oldest (and due to be destroyed) copy had all been quietly copied with errors. I was saved only by the fact I was lazy about destroying my oldest backup that one time. I switched to a file system that offers automatic fault detection and correction. ZFS. Problem solved.
I read on ZFS before posting this topic.  My computer is running Windows which does not support the ZFS file system.  Maybe in the future I could build a FreeNAS system dedicated to music, but I don't have the money for that right now.
Ever heard an old MP3 with a very tiny blip of static? Guess what, that's 1 frame broken screwing up a few milliseconds worth of audio data.

I was always under the impression that such blips were not really silent corruption, but rather someone missing the last segment of a p2p download, possibly due to copying the file just as the p2p client was about to finish but had not yet written the last few bits to the file.
Also I suspect that artifacts at the start are often non-audio written by buggy software, and interpreted as audio by some player.

Anyway, this is not really on-topic to the big question.

So I am curious if anyone here has taken preventive measures to protect their audio files?  Are you storing your music library on newer, experimental file system on a different computer?  Maybe you zip all your albums and store them on external media so you can checksum them if the album has problems later on.

No reason to use .zip for checksum if the audio is lossless - FLAC and WavPack are checksummed formats. For lossy files, one is stuck with the codec. (Is there BTW any simple utility that calculates md5 from audio and writes to tag, and can verify?)

I am not so sure if "experimental" is an appropriate term for the file systems in question (zfs/btrfs) ... unless you were thinking of something else?
I read about the chirping or static noises in an article.  I figure static noise or that fuzzy sound could also be from an old vinyl rip.  Also a damaged CD or poor software, as you said, could cause those noises.  But if one day you listen to one of your albums you have heard a dozen times and hear a CHIRP in the middle, that sounds like a case of bitrot.

I don't have any WavPack files, I primarily use FLAC, MP3, and have experimented with a few other codecs.  So with FLAC I can simply highlight every album in foobar and verify integrity to check if the data has been changed.  With MP3 there is no checksum embedded.  So why not zip your albums and put them on an external hard drive?  Sure, having 2 copies of every album will take up space, but then you have a backup that is verified.

Experimental was a poor choice of word.  I also seem to make topics late at night when I am tired.
Title: Re: Protecting audio files from bit rot?
Post by: saratoga on 2016-06-03 22:11:20
With MP3 there is no checksum embedded. 

FWIW, this thread says foobar can also check MP3 frame CRCs:

https://hydrogenaud.io/index.php/topic,68536.0.html

This is imperfect since you could still have truncations of the file (whole frame deleted) that might be missed, but it would notice minor errors or bit flips, at least assuming you have the CRC option enabled when encoding files.
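Whether a given MP3 frame carries a CRC at all can be read straight from its 4-byte header: the lowest bit of the second byte is the "protection bit", and a value of 0 means a 16-bit CRC follows the header. The sketch below only reports whether the CRC is present; it does not verify it, and the function name is made up for illustration.

```python
def frame_has_crc(header: bytes) -> bool:
    """Return True if an MPEG audio frame header says a 16-bit CRC follows it.

    `header` must start at a frame sync (eleven set bits). The protection
    bit is inverted: 0 means "protected by CRC", 1 means "no CRC".
    """
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        raise ValueError("not an MPEG audio frame header")
    return (header[1] & 0x01) == 0


if __name__ == "__main__":
    print(frame_has_crc(b"\xff\xfb\x90\x00"))  # typical header without CRC
    print(frame_has_crc(b"\xff\xfa\x90\x00"))  # same header with the CRC bit cleared
```

Many MP3s are encoded without frame CRCs, in which case there is nothing for a verifier to check at the frame level.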
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-06-03 22:30:11
So why not zip your albums and put them on external hard drive?  Sure having 2 copies of every album will take up space, but then you have a backup that is verified.
It would be pleasant if you took some of the advice given to you. There is no need to store 2 backups to protect against a sudden data corruption in one copy. PAR files will give you much better protection against data corruption than an extra copy.
Your method is not protection/space efficient.

You asked for advice. PAR is MADE for what you are asking for. It's an ECC system in file format.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-04 02:11:27
Frame based CRC is different than what is in the lame tag...

https://hydrogenaud.io/index.php/topic,317.msg2782.html#msg2782
Title: Re: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-06-05 02:19:57
With MP3 there is no checksum embedded. 

FWIW, this thread says foobar can also check MP3 frame CRCs:

https://hydrogenaud.io/index.php/topic,68536.0.html

This is imperfect since you could still have truncations of the file (whole frame deleted) that might be missed, but it would notice minor errors or bit flips, at least assuming you have the CRC option enabled when encoding files.

foobar2000 can show CRC32 checksum of MP3 by using "verify integrity" but I just checked one of my albums ripped with LAME 3.9 and there is no CRC32 information.
So why not zip your albums and put them on external hard drive?  Sure having 2 copies of every album will take up space, but then you have a backup that is verified.
It would be pleasant if you took some of the advice given to you. There is no need to store 2 backups to protect against a sudden data corruption in one copy. PAR files will give you much better protection against data corruption than an extra copy.
Your method is not protection/space efficient.

You asked for advice. PAR is MADE for what you are asking for. It's an ECC system in file format.
Forgive me but I thought I replied to your post.  Having 2 copies of thousands of albums is not space efficient - I agree - but ZIP still does checksum.

Do you suggest https://multipar.eu/ ?  It seems the popular "QuickPar" has not been updated in 12 years.
Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-06-05 03:21:36
DeathCore, I'm not sure what you are trying to do. You suggested, as I quoted, storing a zipped version as a backup and/or using it for its checksum ability.

A much better tool for that is PAR. You can use PAR to check your files for data corruption, just like running the checksum in zip.
In case you get data corruption, the PAR files can also be used to fix it, the same as if you kept an extra copy, BUT they would take up far less space.

Either I'm missing your point about what you are trying to achieve with zipping an extra copy, or you are still not understanding what PAR can do for you. Which PAR solution you get is up to you; the PAR format should, to the best of my knowledge, be the same.
https://en.wikipedia.org/wiki/Parchive


It's just that your suggestion of storing an extra copy in a zip file is far inferior in every aspect I can think of.
- Checksum strength is lower than just using a hash.
- It takes up more space than PAR files, or the PAR files would be more robust in correcting errors than an extra copy.
PAR can even self-heal, i.e. if the PAR data gets corrupted it can fix itself and still fix the data you are trying to protect as well.


At least, if you still want to go with the zip approach, then use RAR instead.
RAR has a data recovery feature, so in case something gets corrupted it can fix it, not just check whether things are OK.
Also, never use solid compression, as it can make data corruption have a much more severe effect on your data.
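To show why recovery data can repair damage while taking far less space than a full second copy, here is a toy sketch. It is NOT the PAR2 algorithm: PAR2 uses Reed-Solomon codes and can rebuild several damaged blocks, whereas this stand-in uses plain XOR parity and can rebuild exactly one lost block. All names are made up for illustration.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(result):
            raise ValueError("all blocks must have the same length")
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def make_parity(blocks):
    """One parity block protects the whole set, whatever the number of blocks."""
    return xor_blocks(blocks)


def recover_missing(blocks, parity):
    """Rebuild the single entry marked None from the survivors plus parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    restored = list(blocks)
    restored[missing[0]] = xor_blocks(survivors + [parity])
    return restored


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]           # three data blocks
    parity = make_parity(data)                    # costs one block, not three
    damaged = [data[0], None, data[2]]            # block 1 was found to be corrupt
    print(recover_missing(damaged, parity) == data)
```

Here one parity block (33% overhead) repairs any single bad block out of three, where a full extra copy would cost 100%. The scheme needs to know which block is bad, which is why per-block checksums go hand in hand with recovery data.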
Title: Re: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-06-05 05:59:57
DeathCore, I'm not sure what you are trying to do. You suggested, as I quoted, storing a zipped version as a backup and/or using it for its checksum ability.

A much better tool for that is PAR. You can use PAR to check your files for data corruption, just like running the checksum in zip.
In case you get data corruption, the PAR files can also be used to fix it, the same as if you kept an extra copy, BUT they would take up far less space.

Either I'm missing your point about what you are trying to achieve with zipping an extra copy, or you are still not understanding what PAR can do for you. Which PAR solution you get is up to you; the PAR format should, to the best of my knowledge, be the same.
https://en.wikipedia.org/wiki/Parchive


It's just that your suggestion of storing an extra copy in a zip file is far inferior in every aspect I can think of.
- Checksum strength is lower than just using a hash.
- It takes up more space than PAR files, or the PAR files would be more robust in correcting errors than an extra copy.
PAR can even self-heal, i.e. if the PAR data gets corrupted it can fix itself and still fix the data you are trying to protect as well.


At least, if you still want to go with the zip approach, then use RAR instead.
RAR has a data recovery feature, so in case something gets corrupted it can fix it, not just check whether things are OK.
Also, never use solid compression, as it can make data corruption have a much more severe effect on your data.

I am just trying to learn different ways to protect my music from possible corruption.  It sounds like PAR is the second best choice if you don't want to change your computer's whole file system.

Also before making this topic, I have never seen anyone discuss backing up music with PARchive.  I have however, read many posts in the past about zipping music for checksum then storing on external media.  I never said I did this.  Also I don't just mean zip - rar, 7z, lz, dmg, etc.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-05 06:05:08
Maybe spoon feed him with sfv files. I'm afraid he's way in over his head at this point.

Like using zip (et al), looking into fb2k's integrity verifier in conjunction with frame CRCs that likely don't exist in any of his mp3s (not the same as the CRC in the Lame header that I was talking about, but I digress) is, well, let's be nice and just call it a wasted effort. 

For me, mp3s are entirely expendable. If they were going to be corrupted in some way I would just make new ones. If I am unable to detect a problem that may exist, then let ignorance be bliss. If I encounter what I suspect to be an audible problem, then my next course of action would be to consult the source.  This seems far more rational to me than getting sucked into a rabbit hole of paranoia. This is not to dismiss par files, rather it is about the fear that redundancy is being created for files that might already be damaged.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-06-05 11:40:39
foobar2000 can show CRC32 checksum of MP3 by using "verify integrity" but I just checked one of my albums ripped with LAME 3.9 and there is no CRC32 information.
We already discussed that several times in this thread, did you see them? Use EncSpot and Mediags. In case you are unfamiliar with a command-line program like Mediags, it is easy to use a .bat file and make a shell extension yourself.

[1]Download all exe and dll files in Mediags' website and put in the same folder.

[2]Paste the text below in notepad, save as a .bat file and put into the same folder.

Code:
@echo off
mediags %1 > log.txt
notepad log.txt

[3]Open Windows explorer and type "shell:sendto" (without the quotes) in the address bar and create a shortcut of the .bat file in the sendto folder.

[4]Right click any folder with your audio files >> Send to >> your .bat file. Mediags will analyze your files and open a log file after it has finished.

Title: Re: Protecting audio files from bit rot?
Post by: Brand on 2016-06-06 22:48:44
There is nothing specific between your audio files and other data
Well, for me at least, the specific thing about music files is that I changed metadata (tags, adding ReplayGain info...) way more than with other kinds of files, which made relying on external checksums/parchives impossible in the long run.
It's good to have audio stream checksums then, as in FLAC and a few others.
That aside, I agree it's just backups as usual (with both extra copies and par files, or even par files with 100% redundancy instead of extra copy).
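For MP3, which has no audio-stream checksum of its own, one workaround for the retagging problem is to hash only the bytes between the tags. The sketch below is a rough illustration under stated assumptions: it skips an ID3v2 tag at the start and an ID3v1 tag at the end, and it does not handle APEv2 or Lyrics3 tags, nor tools that rewrite the Xing/LAME info frame. The function names are made up.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the file contents without a leading ID3v2 and trailing ID3v1 tag."""
    start, end = 0, len(data)
    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4 "syncsafe" size
    # bytes (7 useful bits each). The size excludes the 10-byte header.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = ((data[6] & 0x7F) << 21 | (data[7] & 0x7F) << 14
                | (data[8] & 0x7F) << 7 | (data[9] & 0x7F))
        start = 10 + size
        if data[5] & 0x10:  # ID3v2.4 footer flag: another 10 bytes follow the tag
            start += 10
    # ID3v1 tag: a fixed 128-byte block at the very end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_md5(data: bytes) -> str:
    """MD5 of the audio portion only, so that retagging does not change it."""
    return hashlib.md5(strip_id3(data)).hexdigest()


if __name__ == "__main__":
    audio = b"\xff\xfb\x90\x00" + b"\x01" * 64
    tagged = b"ID3\x03\x00\x00\x00\x00\x00\x05" + b"AAAAA" + audio
    print(audio_md5(tagged) == audio_md5(audio))
```

The resulting digest could be kept in an external list or written into a tag field, since changing the tag no longer changes the value being checked.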
Title: Re: Protecting audio files from bit rot?
Post by: apastuszak on 2016-06-07 04:41:13
Btrfs is still considered experimental, but the simple schemes (read: not RAID5) are stable. I'm using it in RAID1 mode at the moment. Even though I had a problem with the filesystem itself once, the data is safe, mainly because a broken FS refuses to mount. For those that want a non-experimental system, ZFS is an alternative.

Non-checksummed RAID1 setups died for me after I read this article (http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/). Even without RAID1, a checksummed FS is useful because you immediately notice the file is corrupted, and maybe have the chance to restore an older copy before, say, deleting the older copy to clean up some space.

Last I checked, ZFS, though an awesome filesystem, requires a LOT of RAM.  For the size array I wanted, my motherboard did not support enough RAM.  I believe the rule of thumb is 1 GB of RAM for each TB of data in the array.  Three 4 TB drives in a RAIDZ would have required 12 GB of RAM just for the filesystem.  On a 16 GB max motherboard, that didn't leave a lot of RAM for the OS and all the services I had running on it.

ZFS is still more advanced and stable than BTRFS.  But I think once BTRFS gets "stable," its lower memory footprint may make it more desirable than ZFS.

The other thing that concerns me is that ZFS is owned by Oracle now.  Oracle is the company that started development and backed BTRFS.  I'm hoping that ZFS has a bright future under Oracle, but I am a little worried about it.
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-07 19:23:42
I believe the rule of thumb is 1 GB of RAM for each TB of data in the array.
I thought that was for realtime de-duplication only? (the ARC cache). I know some NAS offer the option to use RAID-Z and they don't have so much RAM.
Title: Re: Protecting audio files from bit rot?
Post by: Nongorilla on 2016-06-07 20:54:54
Quote
Use EncSpot and Mediags.

Woohoo, my project's first recommendation on HA!

On FLAC, I haven't seen mentioned the CRC-32's you get if you rip with EAC.  These EAC log CRCs are against the same data as the FLAC's intrinsic CRC-16.  If you can verify these, there's no big need for a separate hash file.  Not sure if there are any tools that will check these though...  Oh wait, there is at least one

https://mediags.codeplex.com/wikipage?title=UberFLAC%20over%20WPF

*cough, cough*
Title: Re: Protecting audio files from bit rot?
Post by: SweetSpotListener on 2016-06-07 21:50:55
1. When copying your files, use a utility that checks the files after copy operations (I use TeraCopy).
2. For safe, long term storage, use MDISC. I use 100GB blu-ray XL discs.
Title: Re: Protecting audio files from bit rot?
Post by: AliceWonderMiscreations on 2016-06-07 22:20:47
This is simple. You create a hash of any file you need to be kept pristine and routinely validate the hash on both your primary and backup systems.
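For anyone who wants to try this without extra tools, here is a minimal sketch in Python. The function names (`file_sha256`, `build_manifest`, `verify_manifest`) and the choice of SHA-256 are my own assumptions, not something prescribed above; any strong hash would do. Remember that a file whose tags were edited on purpose will also show up as changed.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large audio files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or file_sha256(target) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once, store it next to the backup (for example as JSON), and run `verify_manifest` on both the primary and the backup copy from time to time.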
Title: Re: Protecting audio files from bit rot?
Post by: AliceWonderMiscreations on 2016-06-07 22:22:24
Btrfs is still considered experimental, but the simple schemes (read: not RAID5) are stable. I'm using it in RAID1 mode at the moment. Even though I had a problem with the filesystem itself once, the data is safe, mainly because a broken FS refuses to mount. For those that want a non-experimental system, ZFS is an alternative.

Non-checksummed RAID1 setups died for me after I read this article (http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/). Even without RAID1, a checksummed FS is useful because you immediately notice the file is corrupted, and maybe have the chance to restore an older copy before, say, deleting the older copy to clean up some space.

Last I checked, ZFS, though an awesome filesystem, requires a LOT of RAM.

Yes and it is best if it is ECC RAM, though I am of that opinion anyway regardless of file system.
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-06-08 05:26:18
Quote
Use EncSpot and Mediags.

Woohoo, my project's first recommendation on HA!

On FLAC, I haven't seen mentioned the CRC-32's you get if you rip with EAC.  These EAC log CRCs are against the same data as the FLAC's intrinsic CRC-16.  If you can verify these, there's no big need for a separate hash file.  Not sure if there are any tools that will check these though...  Oh wait, there is at least one

https://mediags.codeplex.com/wikipage?title=UberFLAC%20over%20WPF

*cough, cough*

Thanks. Can you add some switches to ignore some types of errors or only check against a specific type of error? I have many files that always show errors (and very horrible words like "fatal") despite the fact they are perfectly fine. For example:

xxx.flv
* Fatal: File truncated near packet header.

xxx.mkv
* Fatal: No element found with signature [7B][A9][A2]

xxx.wav
* Fatal: Missing 'data' section

For example, is it possible to show only crc errors?
Title: Re: Protecting audio files from bit rot?
Post by: bennetng on 2016-06-08 05:31:18
2. For safe, long term storage, use MDISC. I use 100GB blu-ray XL discs.
I also remembered Kodak said their CDRs have 100+ years of life, but in reality... :))
Title: Re: Protecting audio files from bit rot?
Post by: Arnold B. Krueger on 2016-06-09 19:36:22
As others have pointed out, this thread goes off the rails of reality in the first post.

Data storage formats such as optical media, hard drives, etc. are protected from hardware errors surreptitiously causing errors by adding parity, CRC, and other checksum-type controls to the data as they store them.

The relevance of this problem can be estimated by looking at the number of times a stored program that has always worked well suddenly starts totally failing and crashing doing the identical same things that used to work well, without an accompanying error message pointing out the media errors that caused it.

In general, computer programs are far, far, far more intolerant of random errors than audio signals.

The reliability of computer systems while processing files that lack common protections schemes can be estimated by looking at the reliability of the same system, processing files that do contain protection schemes.


Bottom line, widespread problems with bit rot are usually modern versions of stories about things that go bump in the night that used to be told around camp fires, etc.
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-09 20:55:57
While not widespread, those issues do occur. It happened to me a few times already with archived data, without me moving it - it just suddenly wasn't matching the checksums on several files anymore. That's on at least two HDDs of different age and brand. Maybe all those error correction measures inside the HDD work when you regularly read the files, but it doesn't seem to work as well when the HDD is off most of the time.

In any case, better be safe than sorry afterwards, so adding an additional layer of protection is neither wrong, nor does it "go off the rails of reality". But that's just me.
Title: Re: Protecting audio files from bit rot?
Post by: Arnold B. Krueger on 2016-06-10 00:23:36
While not widespread, those issues do occur. It happened to me a few times already with archived data, without me moving it - it just suddenly wasn't matching the checksums on several files anymore.

Which checksums?  How do you know for sure that these problems weren't due to human failure?

Quote
That's on at least two HDDs of different age and brand. Maybe all those error correction measures inside the HDD work when you regularly read the files, but it doesn't seem to work as well when the HDD is off most of the time.

The drive checksums are created when you record and play data off the disk. They are internal to the drive and generally not visible to people using a proper O/S such as Windows to read and write the files. When the drive is powered down and idle, they are not being calculated.


Quote
In any case, better be safe than sorry afterwards, so adding an additional layer of protection is neither wrong, nor does it "go off the rails of reality". But that's just me.

The odds of failure of the same data on two different disks going bad are fantastically high. They point to a failure of a common component which could be software.  I get the feeling that significant information about these failures is not being reported. If you are storing data with some odd operating system, or system software then of course you shouldn't be using it to store and retrieve important data. But, this is not how the vast majority of data is used.

Title: Re: Protecting audio files from bit rot?
Post by: Thad E Ginathom on 2016-06-10 15:25:44
The relevance of this problem can be estimated by looking at the number of times a stored program that has always worked well suddenly starts totally failing and crashing doing the identical same things that used to work well, without an accompanying error message pointing out the media errors that caused it.

Back in the day when I worked with people who worked with PCs, that was all part of the Windows experience! Microsoft set the computer-using bar very low, and people got used to stuff like having to reboot,  re-install programs and even reload the operating system on a regular basis. All this while the Unix machines in the server room just went on and on ...and on ...and on.

NB... My experience of Windows ended at XP. If it has improved, in the several versions since then, well good. About time too.

Quote
The odds of failure of the same data on two different disks going bad are fantastically high. They point to a failure of a common component which could be software.
Or a disk controller or... anything.

Technically, as users, all we need to know is that our systems are not perfect, and that hard disks are mortal, and that, if we do not have adequate backups, we run the risk of losing our data.  The odds of that happening are not high: it is a lucky person that does not experience one or more hard disk failures in their computing life. Actually, never mind the hardware... it is a lucky person that never gets that ohmygodwhathaveIdone feeling after a delete command.

The rest is academic.

Bit rot in music files? Yes, probably mythical. In about 15 years of regularly listening to music from a computer, I have (as I think I mentioned before) had one file that "went bad." It played, but horribly distorted. The backup was fine, and it was far more likely to have been user error (me, but no idea how) than bit rot, because a data error on disk would surely be more likely to cause dropouts or an unplayable file, not one that had the same fault from beginning to end. I don't even have a count of the hard disks I've lost in that time.



Title: Re: Protecting audio files from bit rot?
Post by: sven_Bent on 2016-06-11 02:33:32
There is nothing specific between your audio files and other data
Well, for me at least, the specific thing about music files is that I changed metadata (tags, adding ReplayGain info...) way more than with other kinds of files, which made relying on external checksums/parchives impossible in the long run.
It's good to have audio stream checksums then, as in FLAC and a few others.
That aside, I agree it's just backups as usual (with both extra copies and par files, or even par files with 100% redundancy instead of extra copy).

You quoted me out of context; I believe the quoted part was about data corruption.
Besides, updating metadata is not unique to audio data. It might be the only data you personally update, but that doesn't make it a valid general statement.

Nevertheless, if you do update your audio files regularly, you simply treat each update like a new version of any other data: verify the old checksum/hash/parity file, then create a new one.
I do agree, however, that internal checksums/hashes make it easier, as they are based solely on the main data stream. But in reality that does not make it impossible to keep an external checksum/hash/parity file at all.

It is, as always, a trade-off between effort and wanted results.
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-11 10:36:47
While not widespread, those issues do occur. It happened to me a few times already with archived data, without me moving it - it just suddenly wasn't matching the checksums on several files anymore.

Which checksums?  How do you know for sure that these problems weren't due to human failure?
CRC32 in the file name of several video files, MD5 within several flac/tak files. Archived, verified checksums matched, no more writes done to the files since. Re-verified one day, some checksums mismatched.

Quote
Quote
That's on at least two HDDs of different age and brand. Maybe all those error correction measures inside the HDD work when you regularly read the files, but it doesn't seem to work as well when the HDD is off most of the time.

The drive checksums are created when you record and play data off the disk. They are internal to the drive and generally not visible to people using a proper O/S such as Windows to read and write the files. When the drive is powered down and idle, they are not being calculated.
That's what I am saying. If the failure happens while the drive is off, or upon power on, it's not much of a security.

Quote
Quote
In any case, better be safe than sorry afterwards, so adding an additional layer of protection is neither wrong, nor does it "go off the rails of reality". But that's just me.

The odds of failure of the same data on two different disks going bad are fantastically high. They point to a failure of a common component which could be software.  I get the feeling that significant information about these failures is not being reported. If you are storing data with some odd operating system, or system software then of course you shouldn't be using it to store and retrieve important data. But, this is not how the vast majority of data is used.
You're trying to find things in my statements that aren't there - the failures happened on a single disk (the backup one), with NTFS, on windows. If you think using self-healing file systems or other means of protecting your data like .par archives is useless, it's your free decision not to.

Like I said, I prefer my data to be safe in case something does happen. You can cry about the odds for a failure being astronomically high all you want afterwards, it won't get your data back.
Title: Re: Protecting audio files from bit rot?
Post by: Arnold B. Krueger on 2016-06-11 16:12:47
That's on at least two HDDs of different age and brand. Maybe all those error correction measures inside the HDD work when you regularly read the files, but it doesn't seem to work as well when the HDD is off most of the time.

That would be speculation on your part apparently based on just one or two occurrences, not verified knowledge obtained by monitoring the operation of a large number of drives.

It is practically impossible for data to change due to a failure of the media in the drive without triggering the drive's error detection and correction features. These features are in a way bypassed if an outside agency changes the data on the drive: the checksums remain correct because there has been no media or logic failure in the drive.

 The error detection and correction features are engaged when data is read or written. If or when a drive fails while powered off, the failure is moot until the next time the relevant data is read or written to the disk.

Quote
Quote
The drive checksums are created when you record and play data off the disk. They are internal to the drive and generally not visible to people using a proper O/S such as Windows to read and write the files. When the drive is powered down and idle, they are not being calculated.
That's what I am saying. If the failure happens while the drive is off, or upon power on, it's not much of a security.

The data is checked the next time the drive  is powered up and read or written, which suffices for all practical circumstances.

Quote
In any case, better be safe than sorry afterwards, so adding an additional layer of protection is neither wrong, nor does it "go off the rails of reality". But that's just me.

Quote
Like I said, I prefer my data to be safe in case something does happen. You can cry about the odds for a failure being astronomically high all you want afterwards, it won't get your data back.

You are talking about someone besides me, because I have multiple backups of critical information, some geographically dispersed.

The real problem is that if the drive fails and data is lost or corrupted, the only practical way to recover it in most cases is to go to your backups.

If there is bit rot, the principles of operation of the device and actual real-world experience shows that it must be the error detection and correction features of the drive that give the first warnings, unless the drive is so badly failed that not even they work.

Since my (pre-retirement) day job used to involve maintaining and building computers, I've seen a ton of media and equipment failures involving data storage. Bit rot definitely exists, but it is most often found in common optical media. I have had whole boxes of CD-ROMs and DVDs fail on the shelf in a dry, dark area, and the same for optical drives, hard drives, and SSDs.

If a storage device is in the process of succumbing to bit rot, there is  usually first a goodly number of errors detected by the drive itself.  Not all of them are explicitly reported, and sometimes they first manifest themselves as equipment slow-downs. 

That all said, I don't back up anything because I fear bit rot. There are so many far more common errors with the same fatal outcome.
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-11 16:48:47
That's on at least two HDDs of different age and brand. Maybe all those error correction measures inside the HDD work when you regularly read the files, but it doesn't seem to work as well when the HDD is off most of the time.

That would be speculation on your part apparently based on just one or two occurrences, not verified knowledge obtained by monitoring the operation of a large number of drives.

It is practically impossible for data to change due to a failure of the media in the drive without triggering the drive's error detection and correction features.
Yes. But can the error correction work in all cases? No. There is a limit to what it can recover, and sometimes it won't be able to recover the correct state.

Quote
The real problem is that if the drive fails and data is lost or corrupted, the only practical way to recover it in most cases is to go to your backups.
Or, you let your file system automatically handle that (notifying you your other drive is failing) and all other cases of weirdness that may or may not crop up.

Quote
If there is bit rot, the principles of operation of the device and actual real-world experience shows that it must be the error detection and correction features of the drive that give the first warnings, unless the drive is so badly failed that not even they work.
The thing is, as you said it yourself
Quote
Not all of them are explicitly reported, and sometimes they first manifest themselves as equipment slow-downs.
And actually, I only had one type of error reported to me by Windows - that is, that the drive has completely failed. So unless you run additional tools that monitor SMART values, those logged errors are rather useless. And even then, as I said above, some errors are not corrected, so those values are mostly informational in nature. They might warn you and save you from even bigger data loss, but they don't prevent it in all cases. Self-healing file systems like btrfs or zfs, on the other hand, are an additional layer of protection that only fails to protect you from a complete RAID failure (all drives die before the array can be rebuilt). But that's not something you can guard against without having another copy anyway.

Quote
That all said, I don't back up anything because I fear bit rot. There are so many far more common errors with the same fatal outcome.
Nor do I, but if I can prevent something from happening, I don't see why I shouldn't - and you haven't mentioned any reasons against it either, even though you sound like you are against it.
Title: Re: Protecting audio files from bit rot?
Post by: Nongorilla on 2016-06-13 03:03:52
Thanks. Can you add some switches to ignore some types of errors or only check against a specific type of error? I have many files that always show errors (and very horrible words like "fatal") despite the fact they are perfectly fine. For example:

xxx.flv
* Fatal: File truncated near packet header.
...
xxx.wav
* Fatal: Missing 'data' section

For example, is it possible to show only crc errors?

Fatal errors indicate that the parser is too confused to do any CRC tests and just gives up.  These formats don't have any CRCs anyway, so I take your request to be an option to look only at files containing intrinsic checks.  Will consider options.

xxx.flv
* Fatal: File truncated near packet header.

This is a bug in Mediags and thanks for reporting it.  Your file is probably fine.  The fix is easy, but replicating the bug is not since I have no media file that uses segment 7BA9.  Not even any of the official Matroska test files do.  MKV format is sooo difficult in so many ways.

The issue has been assumedly fixed in the next release - v0.9.6 (https://mediags.codeplex.com/releases/view/624374)
Title: Re: Protecting audio files from bit rot?
Post by: Arnold B. Krueger on 2016-06-13 12:38:47

Quote from: arny
That all said, I don't back up anything because I fear bit rot. There are so many far more common errors with the same fatal outcome.

Nor do I, but if I can prevent something from happening, I don't see why I shouldn't - and you haven't mentioned any reasons against it either, even though you sound like you are against it.

Separate programs that test data don't prevent anything - they just tell you when something bad has happened. The damage is done.

 The effective ways to prevent data corruption that are up to the user are primarily related to controlling the hardware and software environment.

I fear that paranoia about so-called bit rot distracts people from the larger problem which is that computer data storage is fallible in many ways.

Techniques that back up user data on remote media have become far more accessible and easy to implement due to the growth and development of "The Cloud".
Title: Re: Protecting audio files from bit rot?
Post by: deathcoreRULES on 2016-06-13 19:56:14
Maybe spoon feed him with sfv files. I'm afraid he's way in over his head at this point.

Like using zip (et al), looking into fb2k's integrity verifier in conjunction with frame CRCs that likely don't exist in any of his mp3s (not the same as the CRC in the Lame header that I was talking about, but I digress) is, well, let's be nice and just call it a wasted effort. 

For me, mp3s are entirely expendable. If they were going to be corrupted in some way I would just make new ones. If I am unable to detect a problem that may exist, then let ignorance be bliss. If I encounter what I suspect to be an audible problem, then my next course of action would be to consult the source.  This seems far more rational to me than getting sucked into a rabbit hole of paranoia. This is not to dismiss par files, rather it is about the fear that redundancy is being created for files that might already be damaged.
Forgive me for asking questions all the time greynol.  I am no expert like you - not being sarcastic here.

I get your mentality on mp3 files, it makes sense.  For me though, when I started to rip my music collection decades ago, I never thought about codec quality or even ripping lossless.  It was like... insert CD into iTunes, hit OK - it rips with default AAC settings.  Insert a track in WMP, hit OK - oops, plenty of track information is missing, but oh well...  It's only in the past 5 years or so that I started caring about this stuff as my library grew.  So while I only rip in FLAC now, the majority of my collection is still various MP3/AAC.  I could go into storage, bring out all my CDs and rip them again, but that would be a waste of time.  As you said in my other topic - "your eyes cannot hear".  Why re-rip my 128 AAC "Dark Side of the Moon" when it sounds perfect to my ears?

I think creating a par file to verify my music is not damaged is a better solution than going into storage and re-ripping every CD I own just so it can have that built in checksum.  I really like this multipar program that I downloaded.  It's intuitive and does the job I need it to do.  If I just want a checksum for the MP3 files in question, it takes up maybe 20KB of space - that is nothing.  It's unlikely an entire song or album would be damaged, but if I wanted, I could set the redundancy to maybe 10% so multipar could try and repair the damaged MP3 file.  I think for me though, the checksum is enough.  If a song got damaged by bit rot and had a bad CHIRP or something, then fine, I'll go get my CD and rip it again but in FLAC.
There is nothing specific between your audio files and other data
Well, for me at least, the specific thing about music files is that I changed metadata (tags, adding ReplayGain info...) way more than with other kinds of files, which made relying on external checksums/parchives impossible in the long run.
It's good to have audio stream checksums then, as in FLAC and a few others.
That aside, I agree it's just backups as usual (with both extra copies and par files, or even par files with 100% redundancy instead of extra copy).
I changed a lot of metadata this past year - fixing many tags and adding high quality album art.  Unless you are tweaking ReplayGain all the time, I wouldn't say relying on parchive is impossible.  It only takes a few seconds to make a par file checksum for a large album.

Am I correct that FLAC metadata does not change the checksum calculation?
1. When copying your files, use a utility that checks the files after copy operations (I use TeraCopy).
2. For safe, long term storage, use MDISC. I use 100GB blu-ray XL discs.

Never heard of MDISC until today.  In the past I used CDs as archival storage because they were so cheap compared to hard drives or flash storage.  I still have hundreds of burned discs that are over 15 years old and the ones I have tried in the past 2-3 years seem to work just fine.  There has always been discussion, though, that CDs and DVDs last between 15-30 years and that the cheaper discs could physically rot faster.  Even though I don't use CDs and DVDs any more, it's always important to have multiple backup solutions.
Title: Re: Protecting audio files from bit rot?
Post by: Porcus on 2016-06-13 22:00:56
Right ... comparing audio files with executables. If I could fix a corrupted audio library by just redownloading a couple of megabytes, I wouldn't worry so much about my backups.

As for mp3s, one is stuck with lossy streams unless one is willing to store decoded/transcoded versions.
Title: Re: Protecting audio files from bit rot?
Post by: greynol on 2016-06-14 00:24:19
Maybe spoon feed him with sfv files. I'm afraid he's way in over his head at this point.
Also before making this topic, I have never seen anyone discuss backing up music with PARchive.
It appears as though you did know about par files.  I apologize for underestimating your level of knowledge on the subject.

I am no expert like you - not being sarcastic here.
Thank you, but I do not have the same level of expertise as the others who are contributing to this topic.
Title: Re: Protecting audio files from bit rot?
Post by: Brand on 2016-06-14 09:41:20
Am I correct that FLAC metadata does not change the checksum calculation?
If you mean the "inner" MD5 audio checksum, that is correct. But the checksum of the whole file will of course change any time you change metadata.
A possible solution is separating the audio data and metadata into separate (cue?) files, but AFAIK that has some drawbacks and limitations.
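To make the "inner" checksum concrete: in the FLAC format the MD5 of the unencoded audio sits in the last 16 bytes of the 34-byte STREAMINFO block, which is always the first metadata block after the `fLaC` marker. The sketch below only reads that stored value; it does not decode the audio and recompute it. The function name `flac_audio_md5` is made up for illustration, and a file with an ID3v2 tag prepended would be rejected by this simple check.

```python
def flac_audio_md5(data):
    """Return the hex MD5 stored in the STREAMINFO block of raw FLAC bytes."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing fLaC marker)")
    block_type = data[4] & 0x7F                 # low 7 bits: metadata block type
    length = int.from_bytes(data[5:8], "big")   # 24-bit block length
    if block_type != 0 or length < 34:
        raise ValueError("first metadata block is not a valid STREAMINFO")
    streaminfo = data[8:8 + length]
    if len(streaminfo) < 34:
        raise ValueError("file is truncated inside STREAMINFO")
    return streaminfo[18:34].hex()              # last 16 of the 34 bytes
```

Reading the first few dozen bytes of the file is enough, e.g. `flac_audio_md5(open(path, "rb").read(64))`. Editing tags leaves this value alone while any whole-file hash changes. To actually test the audio against it, the reference encoder's `flac -t` decodes the file and compares.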
Title: Re: Protecting audio files from bit rot?
Post by: ChronoSphere on 2016-06-14 18:31:34

Quote from: arny
That all said, I don't back up anything because I fear bit rot. There are so many far more common errors with the same fatal outcome.

Nor do I, but if I can prevent something from happening, I don't see why I shouldn't - and you haven't mentioned any reasons against it either, even though you sound like you are against it.

Separate programs that test data don't prevent anything - they just tell you when something bad has happened. The damage is done.
See, that's not what zfs and btrfs do. They actually can self heal by detecting the (data block, not file!) checksum does not match on one of the disks and automatically copy over the correct data from the backup. I've linked an article which explains it in detail in this thread before, and how it's different from a regular raid1/5.
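As a toy illustration of that self-healing read (not how zfs or btrfs are actually implemented), assume each block has a known-good checksum stored apart from the data and two or more mirrored copies. The names `checksum` and `read_with_repair` are hypothetical.

```python
import hashlib


def checksum(block):
    """Checksum of one data block; real filesystems use their own algorithms."""
    return hashlib.sha256(block).digest()


def read_with_repair(copies, expected):
    """Return a good block from mirrored copies, overwriting any bad copy."""
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise OSError("every copy fails its checksum; restore from backup")
    for index, block in enumerate(copies):
        if checksum(block) != expected:
            copies[index] = good    # heal the damaged mirror in place
    return good
```

A plain RAID1 without checksums cannot do this, because when the two copies disagree it has no way to tell which one is correct.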

Parity files like the ones created by multiPAR can also be used to detect and repair damage done to the files. They are a trade-off between the amount of damage they can restore (parity file smaller than a whole copy) and file size.
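The idea behind repair from parity can be shown with plain XOR, which is far simpler than the Reed-Solomon coding PAR2 uses and can only rebuild one lost block per parity block. This is a sketch with made-up names (`xor_blocks`, `rebuild`), assuming equal-length blocks and that a checksum has already told you which block is bad.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(blocks, parity, bad_index):
    """Recompute the block at bad_index from the other blocks plus parity."""
    survivors = [b for i, b in enumerate(blocks) if i != bad_index]
    return xor_blocks(survivors + [parity])
```

Usage: compute `parity = xor_blocks(blocks)` while the data is known to be good and store it alongside; later, `rebuild(blocks, parity, i)` returns the original contents of block `i`. The parity is the size of one block, which is the space-versus-repair trade-off described above.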

So yes, there are separate programs that not only let you detect but also repair the damage at the cost of additional space used.

Quote
I fear that paranoia about so-called bit rot distracts people from the larger problem which is that computer data storage is fallible in many ways.
Hence why users should make backups. And self-healing backups are better than "regular" backups.

Quote
Techniques that back up user data on remote media have become far more accessible and easy to implement due to the growth and development of "The Cloud".
Sadly, only a few countries of the world have a good internet infrastructure to actually allow normal users to backup to the cloud (have fun uploading terabytes with an upstream of 40KiB/s) and even if you have enough upstream bandwidth, cloud offerings that offer enough space to back the data up will cost you way more than just buying another HDD and storing it at a friend's house several tens of kilometers away.

In other news, it seems Apple will be including a new filesystem (http://arstechnica.com/apple/2016/06/new-apfs-file-system-spotted-in-new-version-of-macos/) in the macOS sierra release, but from what it seems, it doesn't have any FS level checksumming like btrfs or zfs.
Title: Re: Protecting audio files from bit rot?
Post by: apastuszak on 2016-06-14 21:08:43

Quote from: arny
That all said, I don't back up anything because I fear bit rot. There are so many far more common errors with the same fatal outcome.

Nor do I, but if I can prevent something from happening, I don't see why I shouldn't - and you haven't mentioned any reasons against it either, even though you sound like you are against it.

Separate programs that test data don't prevent anything - they just tell you when something bad has happened. The damage is done.

 The effective ways to prevent data corruption that are up to the user are primarily related to controlling the hardware and software environment.

I fear that paranoia about so-called bit rot distracts people from the larger problem which is that computer data storage is fallible in many ways.

Techniques that back up user data on remote media have become far more accessible and easy to implement due to the growth and development of "The Cloud".


This I have to agree with.  Before I implemented a btrfs mirror, if a file was corrupt, I simply hopped on Crashplan and restored the file over the old one and was up and running 5 minutes later.  Of course having a btrfs mirror cuts that time down to seconds, instead of minutes.

Don't let copy-on-write checksummed filesystems distract you from keeping good backups.
Title: Re: Protecting audio files from bit rot?
Post by: Nongorilla on 2016-06-21 22:41:39
This is a bug in Mediags and thanks for reporting it.  Your file is probably fine.  The fix is easy, but replicating the bug is not since I have no media file that uses segment 7BA9.  Not even any of the official Matroska test files do.  MKV format is sooo difficult in so many ways.

The issue has been assumedly fixed in the next release - v0.9.6 (https://mediags.codeplex.com/releases/view/624374)

Found an example that uses this signature.*  Current RC build handles it okay.  (First untested attempt did not.)

On a related note, I have yet to find an example of a MKV container that actually uses intrinsic CRCs effectively.  Still, the Mediags tool is useful to verify structure and file length.

*Earlier reply was a misquote.  Meant to refer to the xxx.mkv file.
Title: Re: Protecting audio files from bit rot?
Post by: Peter on 2016-08-23 09:26:07
FWIW, this thread says foobar can also check MP3 frame CRCs:

https://hydrogenaud.io/index.php/topic,68536.0.html

This is imperfect since you could still have truncations of the file (whole frame deleted) that might be missed, but it would notice minor errors or bit flips, at least assuming you have the CRC option enabled when encoding files.

I'd like to point out a few facts about MP3 CRC that seem to be largely unknown to the general public.

Source:
http://www.mp3-tech.org/programmer/docs/mp3_theory.pdf
Quote
This field will only exist if the protection bit in the header is set and makes it possible check the most sensitive data for transmission errors. Sensitive data is defined by the standard to be bit 16 to 31 in both the header and the side information. If these values are incorrect they will corrupt the whole frame whereas an error in the main data only distorts a part of the frame. A corrupted frame can either be muted or replaced by the previous frame.
tl;dr You can flip bytes in an MP3 file with a hex editor and most of the time CRC checks will not detect audible defects; they safeguard only a specific small part of each MP3 frame.

The MP3 CRC field is meant for preventing massive audible distortion when streaming over an unreliable medium, not for detecting storage errors.
On top of that, working on a per-frame basis, they do nothing about protecting the file as a whole, against truncation or insertion of unwanted data.
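
For contrast, a whole-file checksum kept outside the MP3 does cover every byte, so it catches bit flips in the main data as well as truncation and inserted junk. Below is a minimal Python sketch of that idea, not any particular tool's method. The function names (file_digest, build_manifest, verify_manifest) and the "*.mp3" pattern are made up for illustration. Like any checker, it only reports damage after the fact; it repairs nothing.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the whole file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root, pattern="*.mp3"):
    """Map each matching file under `root` (relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob(pattern))
    }


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            bad.append(rel)
    return bad
```

Build the manifest once while the files are known to be good, store it somewhere else, and rerun verify_manifest later. Any path it returns should be restored from a backup.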
Title: Re: Protecting audio files from bit rot?
Post by: Nongorilla on 2016-08-23 20:35:28
Quote
most of the time CRC checks will not detect audible defects
This jibes with my testing.  I've never heard any difference from flipping single bits at random.
Quote
The MP3 CRC field is meant for preventing massive audible distortion when streaming over an unreliable medium, not for detecting storage errors.
Makes sense - it's just 16 bits.  The possibility of a false negative is too high.
Quote
they do nothing about protecting the file as a whole, against truncation or insertion of unwanted data
The LAME CRC does - within the limit of a paltry 16 bits.  LAME's CRC is over the *entire* audio segment, not just a frame.  LAME also stores the length of the entire audio segment.  The quote you provided is about audio frames.  It does not address the LAME wrapper.
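
To put a rough number on "paltry": a 16-bit check has only 65536 possible values, so random damage larger than a couple of bytes slips past it about once in 65536 tries. Here is a small Python sketch of that. It uses binascii.crc_hqx (a 16-bit CRC-CCITT from the standard library) purely as a stand-in for any 16-bit CRC; I am not claiming it is the variant LAME uses. The helper names are made up for illustration.

```python
import binascii
import random


def crc16(data):
    """16-bit CRC (CCITT polynomial) from the standard library."""
    return binascii.crc_hqx(data, 0)


def find_undetected_corruption(original, rng, span=8, max_tries=2_000_000):
    """Overwrite a random `span`-byte region with random bytes until the
    damaged copy differs from `original` yet has the same 16-bit CRC.

    Returns (tries, damaged), or (max_tries, None) if none was found.
    """
    target = crc16(original)
    for tries in range(1, max_tries + 1):
        damaged = bytearray(original)
        start = rng.randrange(len(original) - span + 1)
        damaged[start:start + span] = bytes(
            rng.getrandbits(8) for _ in range(span)
        )
        if damaged != original and crc16(damaged) == target:
            return tries, bytes(damaged)
    return max_tries, None


if __name__ == "__main__":
    rng = random.Random(0)
    original = bytes(range(256))
    tries, damaged = find_undetected_corruption(original, rng)
    print("undetected corruption found after", tries, "tries")
```

The search usually succeeds after some tens of thousands of tries, which is the false negative rate in practice. A single flipped bit, or any burst no longer than 16 bits, is always caught by a CRC of this size; it is the larger damage that can get through.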