Topic: Long-term backup of big music collection (Read 5190 times)

Long-term backup of big music collection

I'll be backing up some 500-odd CDs onto DVD+Rs in single FLAC + CUE file mode per CD and wanted to know this:

What should go along with the FLAC & CUE file for each CD? I've heard talk about MD5, PAR, etc., but I don't have any idea what they are, how they work, or how to make them for my files. Can someone clue me in on what the best solution is, why, how to make it work, how to check that everything's OK with the data, etc.? Thanks very much. On a side note, the reason for all of this is that I plan to lock all of my original CDs in a fireproof safe (anyone want to recommend a good safe for ~500 USD?) and leave home for a very long time... The discs will have three copies, each stored in a different location.

Edit: Addition: BTW, about ReplayGain, how can I benefit from it via single-file-per-CD FLAC files? Would I need to use track or album setting? Is it worth using this feature at all when losslessly backing up all music CDs? Is ReplayGain information stored in the FLAC file?

Reply #1
I can only tell you about MD5, because I'm somewhat familiar with it thanks to my eMule activity.

MD5 files usually contain an MD5 hash. That's a mathematically generated 32-character hexadecimal string that is, for practical purposes, unique for every file.
You can use it to check big files (like full CD images) for corruption before wasting a CD-R on them. You just download the MD5 file and compare it to the MD5 hash generated from the file you want to check. If they're the same, the file should be OK. If not, the file has been corrupted, damaged, or changed.
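
The comparison described above could be sketched roughly like this in Python. This is only an illustration using the standard hashlib module; the function names md5_of_file and matches_published_hash are made up for the example and do not come from any particular tool.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large CD images do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_hex.strip().lower()
```

If matches_published_hash returns False, the file on disk is not byte-for-byte the file the hash was made from.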

For your CD backup, this method is only of use if you want to be sure that the file you read back from your DVD is the same one you burned onto it.

The only way I can think of to be sure that you have full-quality rips is to use EAC.

Reply #2
PAR is a checksum-based (like MD5) method of file archiving that adds an adjustable amount of redundant data, spread over the PAR files, so that damaged or missing data can be recovered later.
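
To give a feel for how redundant data allows recovery, here is a deliberately simplified sketch in Python. It uses plain XOR parity, which can rebuild only one lost block; real PAR2 files use Reed-Solomon codes and can repair several. The function names are hypothetical, and all blocks are assumed to have the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Build one parity block from the data blocks (the 'redundant data')."""
    return xor_blocks(blocks)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

For example, with three data blocks and one parity block, any one of the three data blocks can be lost and rebuilt from the other two plus the parity.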

If you search around this community, you will find a lot of info on how to use PAR to be on the safe side of data archiving.

Reply #3
An MD5 hash of the unencoded audio data is stored within the FLAC file. If you use the --verify option during encoding, the encoder decodes its own output in parallel and compares it with the original input. When a FLAC file is decoded or tested, the stored MD5 hash is used to check the output, and an error is reported if there was a problem. (If anyone has ever experienced such errors, I'd like to know.) The bottom line is that MD5-based error checking is built into FLAC, so you shouldn't have to worry.
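
For the curious, the stored hash can be read straight out of the file header. The sketch below assumes a native FLAC file with no ID3 tag prepended: the 4-byte "fLaC" marker is followed by a 4-byte metadata block header and the 34-byte STREAMINFO block, whose last 16 bytes are the MD5 of the unencoded audio. The function name read_flac_audio_md5 is made up for this example.

```python
def read_flac_audio_md5(path):
    """Return the audio MD5 stored in a FLAC file's STREAMINFO block, as hex."""
    with open(path, "rb") as handle:
        if handle.read(4) != b"fLaC":
            raise ValueError("not a native FLAC stream (missing 'fLaC' marker)")
        header = handle.read(4)
        if len(header) != 4:
            raise ValueError("file is too short to hold a metadata block header")
        block_type = header[0] & 0x7F  # the top bit is the 'last block' flag
        length = int.from_bytes(header[1:4], "big")
        if block_type != 0 or length != 34:
            raise ValueError("first metadata block is not a 34-byte STREAMINFO")
        streaminfo = handle.read(34)
        if len(streaminfo) != 34:
            raise ValueError("STREAMINFO block is truncated")
    # Bytes 0-17 hold block sizes, frame sizes, sample rate and so on;
    # bytes 18-33 hold the MD5 signature of the unencoded audio data.
    return streaminfo[18:34].hex()
```

This only reads the stored value. Actually checking it still means decoding the audio, which is what flac --test does.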

A definition of MD5:
http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html

A GUI for generating MD5s in case you are still interested:
http://www.md5summer.org/

As for your replaygain question, the only thing I am sure about is that you can store replaygain information along with lots of other metadata (like CUE sheet data) in the FLAC file. I think you would just use the album setting... and yes, I think it would be worth it. I personally don't rip to single files, partly because I want to add individual song title tags.

Reply #4
Can we retitle this thread "Guide for the uberparanoid"?

Reply #5
OK, since MD5s are automatically stored within the FLAC files (and I verify every file during encoding), I really shouldn't need a separate MD5 file. Now, if I'm sharing the FLAC file on a network, etc., would an MD5 of the FLAC be handy or no?

I won't be storing cue sheet information in the FLAC because it can't hold CD-TEXT fields such as disc & track artist, songwriter, and names.

About Replaygain, since I'm only encoding a single FLAC per CD, would I need to use the "treat input files as one album" option?

Quote
PAR is a method of checksum-based (like MD5) file archiving with the option of having an adjustable size of redundant data being spread over the par-images for possible future data recovery.
- So, is PAR like WinRAR, where files and directories are stored in a single compressed archive file? Or is this something else?

Thanks for any of the information...

Reply #6
Quote
Can we retitle this thread "Guide for the uberparanoid"?

Indeed! LOL! But, it's because I'm going abroad for a while and won't have access to my originals for a long time... 

Reply #7
Quote
OK, since MD5s are automatically stored within the FLAC files (and I verify every encoding during encoding), I really shouldn't need a separate MD5 file. Now, if I'm sharing the FLAC file on a network, etc., would an MD5 of the FLAC be handy or no?

It really depends on what your goal/concern is. You could just run flac --test <filelist> to verify that the audio hasn't changed. But this doesn't tell you whether someone maliciously tampered with the files (for example, swapped your files with different FLACs of the same name). I guess the integrity of the non-essential metadata wouldn’t be “MD5 guaranteed” either. So if you want to make MD5s of the FLACs, you’ve got some reasons… uberparanoid ones, but reasons nonetheless.
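
If you do want MD5s of the FLAC files themselves, one possible approach is a checksum list written next to the files and re-checked later. The sketch below is only an illustration in Python: the manifest name checksums.md5 and the function names are assumptions, and the "hash, two spaces, file name" line layout simply mimics common md5sum-style lists.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the MD5 hex digest of a whole file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(folder, manifest_name="checksums.md5"):
    """Write one 'hash  name' line per .flac file; return the number of files."""
    folder = Path(folder)
    lines = [f"{file_md5(p)}  {p.name}" for p in sorted(folder.glob("*.flac"))]
    (folder / manifest_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def verify_manifest(folder, manifest_name="checksums.md5"):
    """Return the names of files that are missing or no longer match their hash."""
    folder = Path(folder)
    problems = []
    for line in (folder / manifest_name).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split("  ", 1)
        target = folder / name
        if not target.is_file() or file_md5(target) != expected:
            problems.append(name)
    return problems
```

Unlike flac --test, this covers the whole file, tags included, so a swapped or retagged file shows up in the list returned by verify_manifest. It only helps against tampering if the manifest itself is kept somewhere the files are not.
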
Quote
About Replaygain, since I'm only encoding a single FLAC per CD, would I need to use the "treat input files as one album" option?

I assume you are talking about running replaygain using Foobar2000. You would NOT use that setting. That setting is for selecting multiple files and treating those, all together, as one album. Read it as "treat multiple input files as one album".

Reply #8
For me, the problem is not the choice between PAR and MD5 files. The problem is that if you open your fireproof cases ten years after closing them, there's a 90% chance that all three copies are dead.
Check the FAQ about CDR longevity.
What's the use of an error recovery file if the drive says "no disc" when you insert your CDR? That is now the case with my oldest CDRs (5 years old).

IMHO, the only safe way to preserve data is a daily, or regular, backup of a hard drive on an external device (with DDS tapes, for example). I think I'm quite safe with backups on an external hard drive.
One of the safest ways I can imagine (still starting with the entire collection on a hard drive) is to share the music collection with a friend, each of you doing your own backup separately.

Reply #9
Is there such a thing as a professional data storage place? Even online?
I guess you could make backups along with that.
Man, I never had to think about backups over a 10-year period. I guess temperature, moisture, and whatnot would be issues.

Reply #10
Quote
For me, the problem is not the choice between PAR and MD5 files. The problem is that if you open your fireproof cases ten years after closing them, there's a 90% chance that all three copies are dead.
Check the FAQ about CDR longevity.
What's the use of an error recovery file if the drive says "no disc" when you insert your CDR? That is now the case with my oldest CDRs (5 years old).

IMHO, the only safe way to preserve data is a daily, or regular, backup of a hard drive on an external device (with DDS tapes, for example). I think I'm quite safe with backups on an external hard drive.
One of the safest ways I can imagine (still starting with the entire collection on a hard drive) is to share the music collection with a friend, each of you doing your own backup separately.

I'm not using 'generic' discs, which have a lifespan of about 10 years even under the best conditions. I'm using either Taiyo Yuden or Mitsui, which have lifespans of at least 100 and 250 years respectively for CD-Rs kept at room temperature (20 degrees Celsius) and out of sunlight (which they'll be). Now, I want to know the lifespans for DVD+Rs. Are similar numbers to be expected for the same brands?

Reply #11
Quote
Quote
OK, since MD5s are automatically stored within the FLAC files (and I verify every encoding during encoding), I really shouldn't need a separate MD5 file. Now, if I'm sharing the FLAC file on a network, etc., would an MD5 of the FLAC be handy or no?

It really depends on what your goal/concern is. You could just run flac --test <filelist> to verify that the audio hasn't changed. But this doesn't tell you whether someone maliciously tampered with the files (for example, swapped your files with different FLACs of the same name). I guess the integrity of the non-essential metadata wouldn’t be “MD5 guaranteed” either. So if you want to make MD5s of the FLACs, you’ve got some reasons… uberparanoid ones, but reasons nonetheless.
Quote
About Replaygain, since I'm only encoding a single FLAC per CD, would I need to use the "treat input files as one album" option?

I assume you are talking about running replaygain using Foobar2000. You would NOT use that setting. That setting is for selecting multiple files and treating those, all together, as one album. Read it as "treat multiple input files as one album".

OK, so I'd use the regular 'single file' Replaygain, as if the album file was a single song... Thanks...

Reply #12
Quote
I'm not using 'generic' discs, which have a lifespan of about 10 years, even under the best conditions. I'm using either Taiyo Yuden or Mitsui, which have a lifespan of at least 100 & 250 (Taiyo Yuden & Mitsui, respectively) years for CD-Rs kept at room temperature (20 degrees Celsius) out of sunlight (which they'll be).

I did use Mitsui CDRs, and none lasted longer than 3 years. Other people have had similar experiences with Mitsui. In my case, the storage temperature went above 30 °C for one month a year. Check the FAQ about CDR longevity for many other user reports about other brands.

Reply #13
Quote
Quote
I'm not using 'generic' discs, which have a lifespan of about 10 years, even under the best conditions. I'm using either Taiyo Yuden or Mitsui, which have a lifespan of at least 100 & 250 (Taiyo Yuden & Mitsui, respectively) years for CD-Rs kept at room temperature (20 degrees Celsius) out of sunlight (which they'll be).

I did use Mitsui CDRs, and none lasted longer than 3 years. Other people have had similar experiences with Mitsui. In my case, the storage temperature went above 30 °C for one month a year. Check the FAQ about CDR longevity for many other user reports about other brands.

That's why. Those numbers are given only for very specific conditions, probably around 18-20 degrees C.

Reply #14
I have this CD-R from Yamaha which is almost 6 years old and still works perfectly (not a single C2 error and only ~ 300 C1 errors).

Anyway, I also have some Tevion CD-Rs (bought from Aldi) which died after 6 months (the data surface changed to a very dark green).

As some people suggested, you should really take care which CD-R or DVD-/+R discs you buy. Having the MD5 on the disc and finding out that your files can't be read anymore won't really help you.

Edit: About your ReplayGain question... Track Gain makes every track play at the same loudness, while Album Gain applies one adjustment to the whole album and so keeps the volume differences between its tracks. Clipping is avoided by means of the stored peak values. It has nothing to do with the file being lossless or not, nor will it change the quality of the file. In FLAC files the ReplayGain information is stored in Vorbis comment tags (other formats use APE or ID3v2 tags) and can be removed at any time. Players not supporting the ReplayGain tag fields will simply ignore the information and play the file without any volume adjustment.
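
As a rough illustration of what a player does with those tags, here is a small Python sketch. It assumes the gain is given in dB (as in a REPLAYGAIN_TRACK_GAIN or REPLAYGAIN_ALBUM_GAIN tag) and the peak as a fraction of full scale (as in the matching PEAK tag). The function name playback_scale is made up, and real players may handle clipping differently.

```python
def playback_scale(gain_db, peak, prevent_clipping=True):
    """Return the linear factor a player would multiply the samples by.

    gain_db: ReplayGain adjustment in decibels (negative means quieter).
    peak:    highest sample value in the file, where 1.0 is digital full scale.
    """
    scale = 10 ** (gain_db / 20)  # convert decibels to a linear amplitude factor
    if prevent_clipping and peak > 0 and scale * peak > 1.0:
        # Applying the full gain would push the loudest sample past full
        # scale, so fall back to the largest factor that still fits.
        scale = 1.0 / peak
    return scale
```

The audio data in the file is never touched. The factor is applied only during playback, which is why removing the tags restores the original behaviour.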