HydrogenAudio

CD-R and Audio Hardware => CD Hardware/Software => Topic started by: trail on 2012-03-29 19:08:55

Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-29 19:08:55
Hey! I'm a newbie around these forums, but hopefully I'll be around quite a bit. I really like the community here, and hope I can contribute in the future. But enough introduction, here's the interesting stuff:

I work for a college radio station, and we've decided to undertake the rather ambitious project of digitizing the CDs we've acquired over the years. This is a pretty monumental undertaking, so I'm looking to make this as painless and quick as possible. We have a very rough approximation of about 30,000 CDs that we're looking to convert to digital files, and it's my job to work out many of the more technical aspects of the project.

The problem with being a college radio station is that we're on a pretty limited budget. We can't afford any sort of robot or anything like that to help the process along, nor can we afford any sort of service, so we're stuck doing it ourselves. Thankfully, we have a bunch of people willing to put the time and effort in. We also aren't terribly picky about getting every rip totally 100% perfect. But I've done a fair bit of research, and here's the kind of plan I had in mind:

Ideally, we have one pretty decent quad-core desktop that we're planning to outfit with four CD drives. We have software that allows us to rip multiple discs at once to V0 MP3s which are stored on a small RAID 1 array inside the computer. I've done some informal ripping tests, and have narrowed down the two pieces of software that seem to work best to fre:ac (http://www.freac.org/) and dBpoweramp (http://www.dbpoweramp.com/). I have also tried EAC and simply ripping with MediaMonkey, but freac and dBpoweramp seemed the most efficient and easy to use. Now, if I decide to use one of these pieces of software (if anyone has any suggestions, I'm 100% open to them!) how can I configure them to make them as painless as possible? Would using multiple drives be an option? I found very little information about software that provided ripping from multiple drives simultaneously, so I'm assuming this is not a common feature. If not, would using different computers be our best bet? If anyone has any other suggestions about ripping multiple discs at the same time or other ways to improve efficiency then that would probably make my life much easier.

thanks for your time!
Title: Help ripping ~30,000 CDs
Post by: frozenspeed on 2012-03-29 19:11:31
I would use cueripper & foobar2000 but that's just me...
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-29 19:13:45
I would use cueripper & foobar2000 but that's just me...


Would you use cueripper for the actual digitization and foobar2000 for the library management? In terms of actually playing back the library, I kind of had my heart set on MediaMonkey; I feel like it's perfect for this sort of thing. But I'll definitely give both foobar and cueripper a try, thanks dood.
Title: Help ripping ~30,000 CDs
Post by: Dario on 2012-03-29 19:23:45
What about the dBpoweramp batch ripper (http://www.dbpoweramp.com/batch-ripper.htm)? Is there anything wrong with it?
Title: Help ripping ~30,000 CDs
Post by: garym on 2012-03-29 21:50:54
What about the dBpoweramp batch ripper (http://www.dbpoweramp.com/batch-ripper.htm)? Is there anything wrong with it?


I'm about 1/2 way through the process of ripping about 10,000 CDs to FLAC files with dbpoweramp. And batch ripper is just fine...  Highly recommended.
Title: Help ripping ~30,000 CDs
Post by: Roseval on 2012-03-29 22:06:37
You might try running multiple instances of dbPoweramp each connected to a different drive.
Consider ripping to a lossless format.
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-29 22:24:27
You might try running multiple instances of dbPoweramp each connected to a different drive.
Consider ripping to a lossless format.


It was something I gave some serious consideration to, but the only issue with that is hard drive space. After some math I figured out that I should be able to fit all of the CDs on two 2TB hard drives. I feel like using FLAC or other lossless formats would increase the amount of storage space required by quite a bit. Also there is the issue of compatibility with other computer systems in the station. Not all of the computers use software that plays nice with FLAC, but everything will play MP3s.
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-29 22:36:57
What about the dBpoweramp batch ripper (http://www.dbpoweramp.com/batch-ripper.htm)? Is there anything wrong with it?


I'm about 1/2 way through the process of ripping about 10,000 CDs to FLAC files with dbpoweramp. And batch ripper is just fine...  Highly recommended.


That is 100% exactly what I was looking for. I can't believe I didn't come across it even with all the searching I did. I'll try this out right now, thank you so much!
Title: Help ripping ~30,000 CDs
Post by: DVDdoug on 2012-03-29 23:33:48
Quote
Ideally, we have one pretty decent quad-core desktop that we're planning to outfit with four CD drives.
Not a bad idea. But I would also consider allowing your volunteers to take a stack of CDs home. As long as you standardize on software & settings, and as long as everybody saves their logs, that should work. Maybe you can appeal to your listeners for more volunteers? The task is manageable (perhaps in a semester or two) if you get enough people working on it in parallel. You don't want too many people working on it, because you need to maintain standards and keep track of the CDs.

You'll need a plan/procedure for dealing with ripping errors.  Maybe try a different drive/computer, maybe look for another copy of the CD, maybe just have someone "authorized" to listen and approve the file if they can't hear anything wrong...

And, you'll need tagging standards because the online databases are not all standardized or correct.    With 4 drives ripping at once, I'd guess that checking/correcting tags and filenames will take just as much time as the ripping.

Quote
We have software that allows us to rip multiple discs at once to V0 MP3s...

After some math I figured out that I should be able to fit all of the CDs on two 2TB hard drives. I feel like using FLAC or other lossless formats would increase the number of storage space required by quite a bit...

Not all of the computers use software that play nice with FLAC, but everything will play MP3s.
Here are my thoughts...  My biggest concern is that a couple of years from now, or when you get half-way through the project, somebody is going to wish you'd used a lossless format.

FLAC is going to take...  maybe 3 times as much space as V0...  Maybe 4 times as much...    That might be manageable, and I would think about it.

You don't need a format that plays on all computers...  Just the radio station's computers, and you should be able to install a FLAC CODEC on all of the station's machines that are used for audio editing/playback.    If you've mostly got Macs, ALAC may be a better choice than FLAC.    And, any lossless format can be converted to any other lossless or lossy format if necessary or desired.
Title: Help ripping ~30,000 CDs
Post by: Destroid on 2012-03-30 00:08:46
OP: I would concur with others to consider lossless as well, until I remembered you saying you weren't picky about the ripping accuracy.*

In that case, I might suggest a conservative setting like -V2 or even -V3 to save more space (or even -V5 is supposed to be very good under normal circumstances). If I recall correctly there is one online streaming radio that uses 64kbps AAC which has even more space savings than all the previous MP3 settings I suggested.

The worst scenario is that the songs/albums that get played/requested the most can be re-ripped to lossless, which in all likelihood will be a tiny fraction of the overall total number of CD's in the library.

Good luck!

edit: * seems to me that lossless archival of audio CD's rips containing errors does not make much sense
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-30 01:52:15
Quote
Ideally, we have one pretty decent quad-core desktop that we're planning to outfit with four CD drives.
Not a bad idea. But I would also consider allowing your volunteers to take a stack of CDs home. As long as you standardize on software & settings, and as long as everybody saves their logs, that should work. Maybe you can appeal to your listeners for more volunteers? The task is manageable (perhaps in a semester or two) if you get enough people working on it in parallel. You don't want too many people working on it, because you need to maintain standards and keep track of the CDs.


Unfortunately that's not an option for us.  We've had problems with theft in the past and as a result many of us are reluctant to let large chunks of our library out of our sight. Having things go in parallel is a good idea though. We have multiple computers in the station that don't get much use, maybe we can commandeer them.


You'll need a plan/procedure for dealing with ripping errors.  Maybe try a different drive/computer, maybe look for another copy of the CD, maybe just have someone "authorized" to listen and approve the file if they can't hear anything wrong...

And, you'll need tagging standards because the online databases are not all standardized or correct.    With 4 drives ripping at once, I'd guess that checking/correcting tags and filenames will take just as much time as the ripping.


I had the idea of dealing with any CD that didn't have metadata in this way: Say if working on a stack of 50CDs, the 30th one didn't have metadata. The person doing the bulk ripping would simply place a post-it or some other marking label on the jewel case and the CD would be revisited later on another computer where we could find/type in the metadata manually. MediaMonkey has an awesome Discogs tagging add-on which I'm sure will be invaluable.

Unreadable CDs and other serious ripping problems would have another (more angry colored) post-it placed on them for further review in the future. Honestly our current goal is to digitize as much of our library as possible but still rather quickly, so glossing over a CD here or there won't be too much of an issue.


Quote
We have software that allows us to rip multiple discs at once to V0 MP3s...

After some math I figured out that I should be able to fit all of the CDs on two 2TB hard drives. I feel like using FLAC or other lossless formats would increase the number of storage space required by quite a bit...

Not all of the computers use software that play nice with FLAC, but everything will play MP3s.
Here are my thoughts...  My biggest concern is that a couple of years from now, or when you get half-way through the project, somebody is going to wish you'd used a lossless format.

FLAC is going to take...  maybe 3 times as much space as V0...  Maybe 4 times as much...    That might be manageable, and I would think about it.


Hm. I think you're very much right. With 3 and 4TB drives dropping in price so much, I feel like lossless is going to not be so much more of an expense to have in the near future. Some other thread (http://forums.slimdevices.com/showthread.php?t=81590) I found and some pretty basic math makes me feel like we're going to need roughly 10TB to store all of our stuff without ANY data redundancy. But the problem is that 10TB is 5 2TB drives or roughly $600. This is all dependent on budget, I guess, and budget info isn't something I have at the moment. Lossless files though are definitely the best idea for a serious archival project.
But the problem is that I don't know how serious this archival project needs to be, because pristine sound quality isn't something that most people care much about here at the station. Some people actually play songs posted on YouTube on air (which I am fairly sure is a crime against nature).

I just can't figure out if the increase in sound quality when ripping lossless is worth the huge increase in space and expense. But then there's the issue of future-proofing.
Title: Help ripping ~30,000 CDs
Post by: dumdidum on 2012-03-30 08:56:10
i second the suggestion of ripping to a lossless format. first, storage is cheap. second, generation loss could be an issue. after all, many if not most radio stations broadcast their show in a lossy format (internet radio, digital audio broadcasting, etc.).
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-03-30 10:07:35
I ripped about 7000 CDs to FLAC using dBpoweramp, a Sony XL1B2 200-disc mediachanger (well actually two, luckily since one wore out ... 2nd hand ones available for cheap at Amazon: http://www.amazon.com/gp/offer-listing/B000ENU79C/ref=dp_olp_used/175-4122058-5165842?ie=UTF8&condition=used ).

You can probably use dBpoweramp's Batch Ripper. What I did -- this was at a time Batch Ripper was fresh and a bit immature -- was to hack together an AutoIT3 script that automated dBpoweramp. I did once post it at http://www.avsforum.com/avs-vb/showthread.php?p=13939586#post13939586 , but don't hold it against me, it is fairly lame coding. (And forget whatever I wrote there about HDCD. I regret using the HDCD DSP.)

I know that people have modified REACT to work with the mediachanger too.
Title: Help ripping ~30,000 CDs
Post by: LosMintos on 2012-03-30 14:28:14
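Sorry, I didn't read all posts carefully, nevertheless, let me add/emphasize some points.
  • You should insist on accurate, secure and lossless rips! You'll only do it once, do it right!
  • I've no experience with real batch rippers, but ripped a lot of CDs on ordinary PCs equipped with two CD-Rom drives. You will just open two instances of dbpoweramp or EAC and it's just fine. However, this is limited in terms of keeping a clear view on open CD cases on the desk and open program instances on the screen. With four drives you'll not gain much increase in over all speed, IMHO. The computer will wait for you (rather than you waiting for the computer).
  • Metadata is a crucial issue. Databases are not correct anyway and every n-th CD will not be found. Then you'll have to enter the data by yourself, the most time consuming step.
If you can't afford a batch ripper, you probably can distribute the job to many volunteers. In advance you have to agree on a standard:
  • codec
  • folder hierarchy
  • tagging scheme incl. cover images
  • what to do with unknown or erroneous CDs
When ripping manually I'll pre-sort a bunch of CDs: regular albums, sampler, soundtracks etc. This reflects my folder hierarchy and speeds up ripping in my case. You could consider things like this, when distributing to volunteers.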

I see, you're not likely to give CDs away (I fully understand!). That way, a large room with many computers equipped with max. 2 drives each will help more than few computers with a lot of drives each. Just my humble opinion ;-). Such a set up will likely be outperformed by a real batch ripper. In addition a network storage might be interesting for you. And a scanner for missing cover art.

Just my thoughts, hopefully of some help for you :-)
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-03-30 16:38:06
You will just open two instances of dbpoweramp or EAC and it's just fine.


Be careful. My experience with dBpoweramp is that it might from time to time switch to the most-recently-used drive. Probably not without telling me, but I have overlooked it (and gotten a few rips with absolutely wrong content). I don't think it is intended to have concurrent versions open.


pre-sort a bunch of CDs: regular albums, sampler, soundtracks etc.


Also:
- remasters, if you want to have them distinguished.  The metadata sources do not.
- promos. Some of them have beep sounds and talking interfering with the music.
- I keep classical music away from the rest -- or rather: music sorted by composer, apart from music sorted by performer.
Title: Help ripping ~30,000 CDs
Post by: pdq on 2012-03-30 16:40:43
On the cost of storing lossless files - consider the tens if not hundreds of thousands of dollars that those 30,000 CDs cost originally. Storage space for FLAC runs about 5 cents per CD.

On rippers, absolutely use dBpoweramp. I find that it saves a lot of time on metadata. It is also one of the, if not the, fastest ripper around.

On work flow, I would recommend that if a CD does not have metadata, set it aside in a pile to be ripped later. This saves having to match the metadata to the rip at a later time.
Title: Help ripping ~30,000 CDs
Post by: spoon on 2012-03-30 16:41:56
>My experience with dBpoweramp is that it might from time to time switch to the most-recently-used drive

The last R14.2 release should have eliminated this possibility; however, for true multi-drive ripping, Batch Ripper was designed for that operation (and has been in use 24/7 for the last 4 years by the largest commercial ripping companies out there).
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-03-30 17:44:04
First off, I want to say thanks so much to everyone replying to this thread so far. It's been incredibly helpful.



i second the suggestion of ripping to a lossless format. first, storage is cheap. second, generation loss could be an issue. after all, many if not most radio stations broadcast their show in a lossy format (internet radio, digital audio broadcasting, etc.).

This is a good point, we do broadcast online. I think that this is such a huge undertaking and going lossless will make it even huger, but I'm being convinced more and more that going lossless is worth the effort. Then it's just the issue of "how do we back up and make 10TB of data network accessible on the budget of a college radio station?"


I ripped about 7000 CDs to FLAC using dBpoweramp, a Sony XL1B2 200-disc mediachanger (well actually two, luckily since one wore out ... 2nd hand ones available for cheap at Amazon: http://www.amazon.com/gp/offer-listing/B000ENU79C/ref=dp_olp_used/175-4122058-5165842?ie=UTF8&condition=used ).

You can probably use dBpoweramp's Batch Ripper. What I did -- this was at a time Batch Ripper was fresh and a bit immature -- was to hack together an AutoIT3 script that automated dBpoweramp. I did once post it at http://www.avsforum.com/avs-vb/showthread.php?p=13939586#post13939586 , but don't hold it against me, it is fairly lame coding. (And forget whatever I wrote there about HDCD. I regret using the HDCD DSP.)

I know that people have modified REACT to work with the mediachanger too.

I appreciate the links, and dBoink is a pretty awesome name. If at one point we do decide to get a dedicated ripper, the XL1B will be at the top of the list, thank you! Does Sony have any current version of this that they're selling? And how did you go about storing 7000 CDs of FLAC files?

Sorry, I didn't read all posts carefully, nevertheless, let me add/emphasize some points.
  • You should insist on accurate, secure and lossless rips! You'll only do it once, do it right!
  • I've no experience with real batch rippers, but ripped a lot of CDs on ordinary PCs equipped with two CD-Rom drives. You will just open two instances of dbpoweramp or EAC and it's just fine. However, this is limited in terms of keeping a clear view on open CD cases on the desk and open program instances on the screen. With four drives you'll not gain much increase in over all speed, IMHO. The computer will wait for you (rather than you waiting for the computer).

This is something I didn't consider. I did some more tests last night, and it looks like three drives is the sweet spot. I also think I'll stick with dBpoweramp's batch ripper though.

  • Metadata is a crucial issue. Databases are not correct anyway and every n-th CD will not be found. Then you'll have to enter the data by yourself, the most time consuming step.
If you can't afford a batch ripper, you probably can distribute the job to many volunteers. In advance you have to agree on a standard:
  • codec
  • folder hierarchy
  • tagging scheme incl. cover images
  • what to do with unknown or erroneous CDs
When ripping manually I'll pre-sort a bunch of CDs: regular albums, sampler, soundtracks etc. This reflects my folder hierarchy and speeds up ripping in my case. You could consider things like this, when distributing to volunteers.

I think I'll deal with bad or missing metadata by marking CDs that dB couldn't rip automatically and revisiting later to manually type the data in, but the folders for different types of music is an excellent idea. We get lots of promo material and compilation albums so it's probably a good idea to separate the music into broad categories.

Here's the example folder hierarchy I was thinking of:
\Library\Category\Artist\Album (ID number we add when we get it)\Track number. Artist - Title

or for a real-life example:
Library\Full Albums\Modeselektor\Monkeytown (30458)\04. Modeselektor - Evil Twin.flac

I see, you're not likely to give CDs away (I fully understand!). That way, a large room with many computers equipped with max. 2 drives each will help more than few computers with a lot of drives each. Just my humble opinion ;-). Such a set up will likely be outperformed by a real batch ripper. In addition a network storage might be interesting for you. And a scanner for missing cover art.
Just my thoughts, hopefully of some help for you :-)

That's probably a great point. We have some older computers that may allow us to rip while keeping everything standardized, that's something I'll look in to. Coming up with a way to store the files efficiently and as cheaply as possible is another concern, too. But thankfully, cover art isn't terribly important, so we probably won't spend too much time on that.

It would be nice to digitize at a rate of like 100CDs/hour, but that's likely unattainable without multiple people working at once.

You will just open two instances of dbpoweramp or EAC and it's just fine.


Be careful. My experience with dBpoweramp is that it might from time to time switch to the most-recently-used drive. Probably not without telling me, but I have overlooked it (and gotten a few rips with absolutely wrong content). I don't think it is intended to have concurrent versions open.

pre-sort a bunch of CDs: regular albums, sampler, soundtracks etc.


Also:
- remasters, if you want to have them distinguished.  The metadata sources do not.
- promos. Some of them have beep sounds and talking interfering with the music.
- I keep classical music away from the rest -- or rather: music sorted by composer, apart from music sorted by performer.

It would be nice to be able to be incredibly specific about these things (classical music and promo material), but I'm only here three more years! I think that we might overlook some of the more specific subfolder ordering in the interest of time. We'll be putting the entire library into MediaMonkey, too, so it'll be organized in that way.

On the cost of storing lossless files - consider the tens if not hundreds of thousands of dollars that those 30,000 CDs cost originally. Storage space for FLAC runs about 5 cents per CD.


That certainly puts it into perspective, yeah. Hard drives are pretty cheap in the long run.

It just seems like the biggest hurdle now is to store and back up all these terabytes of data we're going to create by going lossless and still make them accessible to the other computers on our local network.
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-03-30 20:23:00
If at one point we do decide to get a dedicated ripper, the XL1B will be at the top of the list, thank you! Does Sony have any current version of this that they're selling? And how did you go about storing 7000 CDs of FLAC files?


They discontinued the XL1B (and dumped the prices gradually down to $84 for the last ones -- compare that to an original price of $799, which was itself half the price of the Powerfile it was based on!), and I think they replaced it with a BluRay changer. Which does not interest me, so I haven't paid attention since.

7000 CDs in FLAC, that fits on a single 3TB hard drive. (Plus backup and offsite backup.) 30 000 CDs should then fit on five. There are even cheap consumer-grade motherboards with 6 SATA connections.
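
For anyone who wants to redo that scaling with their own numbers, here is a rough sketch in Python. The only inputs are the figures from this post (about 7000 CDs per 3TB drive, 30 000 CDs in total); the function name is made up for illustration, and backup copies and filesystem overhead are ignored.

```python
import math

def drives_needed(total_cds, cds_per_drive):
    """Return how many equally sized drives are needed if each holds cds_per_drive discs."""
    return math.ceil(total_cds / cds_per_drive)

# Figures from this thread: roughly 7000 CDs in FLAC fit on one 3TB drive.
CDS_PER_3TB_DRIVE = 7000
DRIVE_SIZE_TB = 3

n = drives_needed(30000, CDS_PER_3TB_DRIVE)
print(n, "drives,", n * DRIVE_SIZE_TB, "TB raw")
```

Double the drive count for one full backup copy, and add more again for an offsite one.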
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-04-02 01:42:42
Bad news, our budget means we probably won't be able to afford the equipment to go full FLAC.  We'll probably stick with V0.

But for our RAID array I picked out this enclosure (http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029) with four of these (http://www.amazon.com/Western-Digital-Caviar-Desktop-WD20EARX/dp/B004VFJ9MK/ref=sr_1_1?ie=UTF8&qid=1333327237&sr=8-1) in RAID 5
Title: Help ripping ~30,000 CDs
Post by: .hx on 2012-04-02 03:06:35
Little side note - CD ripping is not digitizing.
Title: Help ripping ~30,000 CDs
Post by: spoon on 2012-04-02 09:49:58
Word of advice, if you want a stress free life, stay away from port multipliers...
Title: Help ripping ~30,000 CDs
Post by: phofman on 2012-04-02 11:05:50
Bad news, our budget means we probably won't be able to afford the equipment to go full FLAC.  We'll probably stick with V0.

But for our RAID array I picked out this enclosure (http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029) with four of these (http://www.amazon.com/Western-Digital-Caviar-Desktop-WD20EARX/dp/B004VFJ9MK/ref=sr_1_1?ie=UTF8&qid=1333327237&sr=8-1) in RAID 5


Just my 2 cents, coming from low-cost world.

HP Proliant Microserver N40L (280 USD on amazon) + linux from usb flash drive + 5 x SATA 3TB drives in linux soft RAID5 (the 5th drive in the optical drive bay in SATA drive internal enclosure, SATA connector on board) = 12TB of reliable redundant file space. The drives are hot-swappable in linux (tested in internet forums). An extra eSATA connector available - possible to boot from another external SATA drive, or 6th drive for the array.

The only issue is price of harddrives which is going down only slowly now.

I would not rip to MP3s either, considering the amount of work the process will take.

And BTW if you need MP3s from FLACs, perhaps the linux FUSE mp3fs would come in handy, it works very well: http://khenriks.github.com/mp3fs/
Title: Help ripping ~30,000 CDs
Post by: phofman on 2012-04-02 11:22:46
My experience with those SiliconImage SATA controllers: if more than one SATA drive is hooked up, the performance goes down (raw read/write stream from 130MB/s down to 90MB/s in our case for 7,200 rpm SATA II drives). Hooking 4 SATA drives to a single SATA line via the built-in replicator to the SiliconImage card and running RAID5 on top - I would not do that.
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-04-02 14:57:42
Since this is turning into a discussion of storage (on a budget), here's my uneducated two cents, subject to change upon anyone's better arguments:

- RAID is not backup. RAID is a way to reduce the number of times you need to resort to your backup. RAID does not protect against a thief, a lightning strike, or a 'holy s**t, what did I just do?'. RAID5/6/Z/2Z gives you a limited time to replace a broken drive, that's all. That's a big deal if you care about uptime, but on a budget, you don't. You would rather take the array offline until you are sure it is OK again.

- Striping -- i.e., spreading one file over multiple drives -- (basically all RAIDs except RAID1 ... and some nonstandard solutions) is a bit dangerous: even if you have a fault-tolerance of 1 faulty drive of, say, 4 then you need all the other 3 in order to read a single file. You also need the RAID setup. That is, you cannot take a single drive out of the array and get anything out of it -- and if you will take the 3 working drives out, then you need to mount them in a RAID array that can read it.

- There is a proprietary solution called UnRAID which eliminates the issues of striping: it simply dedicates a drive as parity, monitors the other drives, and whenever you write to a drive, it also updates the parity drive. That means you can take drive #2 out of the array, mount it on a different computer, and every file on drive #2 is readable. If drive #2 AND the parity drive are ruined -- then retrieve merely drive #2 from your backup and clone it. There is a performance loss (writing takes twice the time), but if media files are basically write-once-read-many, that is no issue.
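
To make the parity idea a bit more concrete, here is a toy sketch in Python of generic single-drive XOR parity. It is not UnRAID's actual code, just the general principle as I understand it; the drive contents and the function name are invented for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives, each holding one small block.
drive1 = b"AAAA"
drive2 = b"flac"
drive3 = b"1234"

# The parity drive stores the XOR of all data drives.
parity = xor_blocks([drive1, drive2, drive3])

# Drive 2 dies: rebuild it from the surviving drives plus parity.
rebuilt = xor_blocks([drive1, drive3, parity])
assert rebuilt == drive2
```

Each data drive still holds its own files in plain form, which is why a single drive stays readable outside the array. Lose two drives at once and the XOR can no longer recover either, hence the backup.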


If you still want to do striping (like, RAID5):

- Enclosure RAID with port multiplier? The www is full of complaints about data loss, so I dare not even try. Yes port multipliers slow things down (everything has to go through the same channel), and that might be one reason for issues -- the OS might give up because it sees the drive as unresponsive.
(I'm using a port multiplier myself, but with 5 individual drives, no striping, and it is still a bit stressful: I thought it would be no issue as I only read the file I'm playing, no writing -- so I thought: but Windows writes to the NTFS journal all the time, or something like that.)

- Stay away from 'hardware RAID'. Mainly because you won't actually get hardware RAID on a budget, even though some weasels market it as such -- it is done in the drivers, and kind of gives you all issues of hardware RAID and all issues of software RAID. And if you actually go for a hardware RAID card, then you need two identicals, in order to have a backup if the card breaks, further violating the 'on a budget' purpose.

- Linux software RAID? Fewer issues. FreeNAS with ZFS' RAID-Z? Tried it once on a too old box, ZFS does require a bit of resources.
Title: Help ripping ~30,000 CDs
Post by: rick.hughes on 2012-04-02 15:40:05
- RAID is not backup. RAID is a way to reduce the number of times you need to resort to your backup.

- There is a proprietary solution called UnRAID...

I use unRAID but if your budget forces you to choose between a complete backup and some sort of RAID then you should go for a complete backup, possibly kept offsite.
Title: Help ripping ~30,000 CDs
Post by: LosMintos on 2012-04-02 15:48:05
The last R14.2 release should have eliminated this possibility; however, for true multi-drive ripping, Batch Ripper was designed for that operation (and has been in use 24/7 for the last 4 years by the largest commercial ripping companies out there).
I never experienced problems with multiple instances (neither dBpoweramp nor EAC). But, just to point it out, would you recommend Batch Ripper even for ordinary PCs with two or three drives? Or is it exclusively bundled with extra hardware?

Then it's just the issue of "how do we back up and make 10TB of data network accessible on the budget of a college radio station?"
Yeah, it's a matter of cost. But, if you do not insist on quiet and the most energy-saving technology, you'll get 10 TB rather cheap (except the HDDs themselves, maybe). And keep in mind that you don't have to start with 10 TB. You can add HDDs later along with the ripping progress.

but the folders for different types of music is an excellent idea. We get lots of promo material and compilation albums so it's probably a good idea to separate the music into broad categories.

Here's the example folder hierarchy I was thinking of:
\Library\Category\Artist\Album (ID number we add when we get it)\Track number. Artist - Title

or for a real-life example:
Library\Full Albums\Modeselektor\Monkeytown (30458)\04. Modeselektor - Evil Twin.flac
My approach is similar. When it comes to artist's names or titles with exotic characters, you might want to rename/sanitize your folder names. Especially, when you work multiplatform (Linux file server, Windows Client). However, if you don't care so much, you could also use your ID number, probably structured: $left(%ID Number%, 2) / $left(%ID Number%, 3) / %ID Number% / %track% = %artist% = %title%.flac.
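
A rough Python sketch of both ideas, in case it helps. It assumes "sanitizing" only means replacing the characters Windows forbids in names (you may want to go further and transliterate non-ASCII characters), and the function names are made up for illustration.

```python
import re

# Characters Windows does not allow in file or folder names.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def sanitize(name):
    """Replace characters that are illegal on Windows and trim trailing dots/spaces."""
    cleaned = _FORBIDDEN.sub("_", name).rstrip(". ")
    return cleaned or "_"

def id_path(id_number, track, artist, title, ext="flac"):
    """Build 'first 2 digits/first 3 digits/ID/track = artist = title.ext'."""
    id_str = str(id_number)
    filename = f"{track:02d} = {sanitize(artist)} = {sanitize(title)}.{ext}"
    return "/".join([id_str[:2], id_str[:3], id_str, filename])

print(id_path(30458, 4, "Modeselektor", "Evil Twin"))
```

For the example album this gives 30/304/30458/04 = Modeselektor = Evil Twin.flac, so no folder ever holds more than a few hundred entries.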

It just seems like the biggest hurdle now is to store and back up all these terabytes of data we're going to create by going lossless and still make them accessible to the other computers on our local network.
Again, you don't have to provide the full disk space _now_.
Title: Help ripping ~30,000 CDs
Post by: phofman on 2012-04-02 15:57:23
We have been using linux raid in our company for a few years, from 2 to 8 drives configurations. This is just my experience:

Raid is definitely not a backup, I absolutely agree. In fact I was going to ask about the planned backup procedure. However, backing up 10TB of data on low budget is not a trivial task.

I assume the server should run 24/7 since it is the source of music for the radio station. That is why I suggested redundant raid and server hardware (yet inexpensive one).

Linux raid has no problem being used and running synchronization at the same time. Synchronizing a 12TB software raid will take many hours, easily a day or two on a mildly loaded server. I agree that raid5 with only one drive of redundancy is not very safe. Six 3TB drives would allow the much safer raid6 of 12TB.
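
The capacity figures can be checked with a short Python sketch. It assumes equal-size drives and counts only the parity overhead (one drive for raid5, two for raid6), ignoring filesystem overhead; `usable_tb` is a hypothetical helper name.

```python
# Number of drives' worth of capacity consumed by parity per RAID level.
PARITY_DRIVES = {5: 1, 6: 2}


def usable_tb(drives: int, drive_tb: float, level: int) -> float:
    """Usable capacity in TB of a raid5/raid6 array of equal-size drives."""
    parity = PARITY_DRIVES[level]
    if drives < parity + 2:
        raise ValueError(f"raid{level} needs at least {parity + 2} drives")
    return (drives - parity) * drive_tb


print(usable_tb(6, 3, 6))  # six 3TB drives in raid6 -> 12
print(usable_tb(5, 3, 5))  # five 3TB drives in raid5 -> 12
```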

As for the backup, the options are either the rather expensive tape or, IMO a much more flexible solution, a simple desktop PC with a large case and a motherboard with 6 SATA ports, five 3TB drives in RAID5, and one small drive for the system, running Linux (to make life easier) with rsync run every night/week over gigabit ethernet. I do not assume many changes on the main server data array, so the synchronization would take just a few seconds. Preferably the machine should be located in a different building. This solution would have the advantage of being able to take over the file-serving role of the main server quickly in case of hardware failure. It can be booted and shut down automatically, by BIOS/halt command, to minimize electricity costs and hard drive wear.
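
A rough Python sketch of what the nightly rsync job could look like, assuming rsync is installed on the backup machine and can reach the main server over SSH. The host name `mainserver` and both paths are made up for illustration. It uses only rsync's `-a` (archive) and `--dry-run` options and deliberately leaves out `--delete`, so a file removed by mistake on the main server is not also removed from the backup.

```python
import subprocess


def rsync_command(source: str, dest: str, dry_run: bool = False) -> list[str]:
    """Build the rsync argument list; a trailing '/' on source copies its contents."""
    cmd = ["rsync", "-a"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied without copying
    return cmd + [source, dest]


def run_backup(source: str, dest: str) -> None:
    """Run the sync and raise CalledProcessError if rsync reports failure."""
    subprocess.run(rsync_command(source, dest), check=True)


# Hypothetical invocation from a nightly cron job on the backup PC:
# run_backup("mainserver:/srv/music/", "/srv/backup/music/")
```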
Title: Help ripping ~30,000 CDs
Post by: krabapple on 2012-04-02 16:30:17
Nothing to add except again:  go lossless if you can.  You can always make mp3 versions of your FLAC files for everyday use, and keep the lossless versions as an archive.
Title: Help ripping ~30,000 CDs
Post by: pdq on 2012-04-02 16:39:47
For backup I would suggest AudioSAFE (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=89802&view=findpost&p=763689). If you never need to use it then it is completely free. You only get charged if you need to access your backed-up data.

In fact, you might consider backing up the lossless files to AudioSAFE, before you convert them to mp3 and delete the original. If you never want the lossless version then it has cost you nothing to save them, but if your budget at a later time permits then you don't have to rerip.
Title: Help ripping ~30,000 CDs
Post by: rick.hughes on 2012-04-02 19:12:36
The original CDs could also be considered the backup. If they are not CD-R then they should have a good lifetime. Depending on how difficult some might be to replace this might be all that is really needed for a backup. Store them at another location in case of disaster.
Title: Help ripping ~30,000 CDs
Post by: phofman on 2012-04-02 19:25:43
The original CDs could also be considered the backup. If they are not CD-R then they should have a good lifetime. Depending on how difficult some might be to replace this might be all that is really needed for a backup. Store them at another location in case of disaster.


CDs are definitely a very good backup. It all comes down to the time it took to rip them. If that time spent has lower value than the backup solution, then yes. Considering the time it takes to rip a few thousand CDs which fit onto one hard drive with a presumed lifetime of 2 - 3 years (loaded 24/7, regular consumer type HDD), I would tend to prefer building a simple low-cost backup PC. Quality of drives is not stellar, in fact manufacturers are cutting the warranty period - http://www.tomshardware.com/news/seagate-w...land,14322.html (http://www.tomshardware.com/news/seagate-western-digital-HDD-warranty-Thailand,14322.html) .
Title: Help ripping ~30,000 CDs
Post by: garym on 2012-04-02 22:25:43
For backup I would suggest AudioSAFE (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=89802&view=findpost&p=763689). If you never need to use it then it is completely free. You only get charged if you need to access your backed-up data.

In fact, you might consider backing up the lossless files to AudioSAFE, before you convert them to mp3 and delete the original. If you never want the lossless version then it has cost you nothing to save them, but if your budget at a later time permits then you don't have to rerip.


excellent ideas!
Title: Help ripping ~30,000 CDs
Post by: Jan S. on 2012-04-03 10:42:55
For backup I would suggest AudioSAFE (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=89802&view=findpost&p=763689). If you never need to use it then it is completely free. You only get charged if you need to access your backed-up data.

In fact, you might consider backing up the lossless files to AudioSAFE, before you convert them to mp3 and delete the original. If you never want the lossless version then it has cost you nothing to save them, but if your budget at a later time permits then you don't have to rerip.


excellent ideas!

If the files are deleted locally they will be dropped from AudioSAFE after some time. So you have to keep the files locally you want to backup on audioSAFE.
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-04-03 16:29:08
the rather expensive tape


The tape drives are expensive, but tapes are only slightly more than hard drives, so ... are there rental services? Or backup services where you would show up physically, have your drives copied, and if you are so unlucky you need restoration, show up with tapes and new drives? (I suspect that if such exist, they are at industry-grade pricing.)
Title: Help ripping ~30,000 CDs
Post by: Nessuno on 2012-04-03 17:52:00
If the files are deleted locally they will be dropped from AudioSAFE after some time. So you have to keep the files locally you want to backup on audioSAFE.


Actually? That's quite a strange behavior for a backup system. Sounds more like an... asynchronous and delocalized RAID 1!

Well, all in all it depends on how long is "some time"...
Title: Help ripping ~30,000 CDs
Post by: spoon on 2012-04-03 19:33:43
Your deletions are kept on audiosafe (the last change) indefinitely.

All we ask is that your account is active, that is a computer logs into audiosafe once every so often.
Title: Help ripping ~30,000 CDs
Post by: Destroid on 2012-04-04 11:54:21
Bad news, our budget means we probably won't be able to afford the equipment to go full FLAC.  We'll probably stick with V0.

I have to ask again, why insist on this setting? I don't want to derail this thread into the dozens (hundreds?) of other listening threads, but it seems storage costs are relevant. So, I refer to my previous post of -V3...-V5, or, AAC 64/96kbps.

The hardware considerations will add up, plus (as already mentioned) redundancy options abound.
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-04-05 20:41:41
Bad news, our budget means we probably won't be able to afford the equipment to go full FLAC.  We'll probably stick with V0.

I have to ask again, why insist on this setting? I don't want to derail this thread into the dozens (hundreds?) of other listening threads, but it seems storage costs are relevant. So, I refer to my previous post of -V3...-V5, or, AAC 64/96kbps.

The hardware considerations will add up, plus (as already mentioned) redundancy options abound.


It's just that I've always found V0 to be the optimum balance between quality and file size for myself. I don't have an aversion to other formats, (except AAC is probably out because we don't own any Macs or use iTunes frequently) but I'm not terribly well-versed in the trade off between file size and audio quality when using different VBR settings. But I did just do an A-B test between a song that I have both FLAC and V5 versions of and couldn't discern any noticeable difference between them. Other formats are definitely a consideration, I just assumed V0 was the happy medium.
Title: Help ripping ~30,000 CDs
Post by: mixminus1 on 2012-04-05 21:06:03
@Destroid:  Yes, the OP's budget is limited, and no lossy codec can truly be an "archival" format.

However, since this is for a music archive at a radio station, we should assume that some form of additional processing and/or lossy transcoding may take place in the broadcasting/webcasting chain.

As such, to try and hit the "sweet spot" of minimizing storage space while providing maximum resistance to downstream artifacts, I think V0 makes the most sense, if AAC is really not an option.

While this test is almost seven years old now, I think it's still useful for the discussion at hand:

http://www.hydrogenaudio.org/forums/index....st&p=282909 (http://www.hydrogenaudio.org/forums/index.php?showtopic=32440&view=findpost&p=282909)

That was done with LAME 3.97 a8, so not current, but not exactly "ancient", either, and it fared the worst as a source codec when transcoding to MP3 ABR 128 - and that was with -V0!  Going to a lower bitrate certainly wouldn't improve things.
Title: Help ripping ~30,000 CDs
Post by: JJZolx on 2012-04-06 01:43:53
If the budget is so limited that the station can't even afford the hard drive space on which to store the music in a lossless format, then I hate to say it, but I'm afraid the project is doomed from the get go. You're likely to patch together a system where the next guy to come in says "Just look at what they left me" and will want to start all over again, perhaps with a more realistic budget.

There is much better software available than foobar or JRiver that is actually designed for radio station programming. You have to take into account things like queueing advertisements, news briefs, pre-recorded shows, not to mention downloading those shows 24x7 as they become available. Then there's  the whole side of things where you may want to digitize the outgoing on-air feed and make it available at one or more bitrates in an internet radio feed. Also, you may want to generate a web page displaying your playlist of songs as they're played. Managing all of that at the file level with something like foobar would be a nightmare, if it's even doable.
Title: Help ripping ~30,000 CDs
Post by: shadowking on 2012-04-06 15:50:00
Bad news, our budget means we probably won't be able to afford the equipment to go full FLAC.  We'll probably stick with V0.

I have to ask again, why insist on this setting? I don't want to derail this thread into the dozens (hundreds?) of other listening threads, but it seems storage costs are relevant. So, I refer to my previous post of -V3...-V5, or, AAC 64/96kbps.

The hardware considerations will add up, plus (as already mentioned) redundancy options abound.



I agree. I did 300+ albums with V4 years ago and don't regret it, as they mostly sound great. That is 150k vs. the 250..300k (V0 / iTunes) bloat of today. No need for RAID or anything fancy.
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2012-04-07 02:58:17
OK, so OP cannot afford drives enough for lossless.

Let me assume that the CDs average at 40 minutes. (If they are singles -- much less.) 20 000 hours. 72 million seconds. Want to fit on a single hard drive of 3 TB = 24e+12 bits?  Divide out to get one third megabit per second.  That's the 320 rate. Might as well use V0.
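
The same arithmetic as a small Python sketch, using only the assumptions stated above (30,000 discs, 40 minutes each, one 3 TB drive with 1 TB = 10^12 bytes). The second function reuses the playing time to estimate storage at a given bitrate; the 1411.2 kbit/s figure is uncompressed CD audio (44,100 Hz x 16 bit x 2 channels), not FLAC, whose compression ratio varies. Both function names are hypothetical.

```python
def max_bitrate_kbps(discs: int, minutes_per_disc: float, drive_bytes: float) -> float:
    """Highest average bitrate (kbit/s) at which the whole library fits on the drive."""
    seconds = discs * minutes_per_disc * 60
    return drive_bytes * 8 / seconds / 1000


def storage_tb(discs: int, minutes_per_disc: float, kbps: float) -> float:
    """Storage in TB (10**12 bytes) needed at a given average bitrate."""
    seconds = discs * minutes_per_disc * 60
    return seconds * kbps * 1000 / 8 / 1e12


print(round(max_bitrate_kbps(30_000, 40, 3e12)))  # 333
print(round(storage_tb(30_000, 40, 1411.2), 1))   # 12.7
```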
Title: Help ripping ~30,000 CDs
Post by: trail on 2012-04-09 04:20:20
@Destroid:  Yes, the OP's budget is limited, and no lossy codec can truly be an "archival" format.

However, since this is for a music archive at a radio station, we should assume that some form of additional processing and/or lossy transcoding may take place in the broadcasting/webcasting chain.

As such, to try and hit the "sweet spot" of minimizing storage space while providing maximum resistance to downstream artifacts, I think V0 makes the most sense, if AAC is really not an option.

While this test is almost seven years old now, I think it's still useful for the discussion at hand:

http://www.hydrogenaudio.org/forums/index....st&p=282909 (http://www.hydrogenaudio.org/forums/index.php?showtopic=32440&view=findpost&p=282909)

That was done with LAME 3.97 a8, so not current, but not exactly "ancient", either, and it fared the worst as a source codec when transcoding to MP3 ABR 128 - and that was with -V0!  Going to a lower bitrate certainly wouldn't improve things.



Exactly what I was thinking. There is some processing and the like, but none that really takes much away from the music. To put it in perspective, we're a college FM station, and it's safe to assume very few of our listeners are tuning in looking for pristine audio fidelity, you know? I obviously want it to sound as good as I can make it, but V0 sounds like the optimal trade-off. As a college kid I don't quite have the taste, budget, or equipment of an audiophile, but V0 hasn't ever disappointed my ears.
Title: Help ripping ~30,000 CDs
Post by: gorob on 2014-03-11 02:16:59
I second dbpoweramp as well, it works great for ripping CDs to FLAC.
Some at their forum also suggested using dbpoweramp with the nimbie CD loader, the robot can feed 100 CDs automatically.
It surely beats feeding the disc one by one by hand…
Title: Help ripping ~30,000 CDs
Post by: JJZolx on 2014-03-11 03:01:42
I wonder how this project turned out. Or if it even got off the ground.
Title: Help ripping ~30,000 CDs
Post by: eahm on 2014-03-11 04:43:30
30,000 discs...I can't even imagine that many, amazing and insane if it's a common person. It must be a radio station or something right?
Title: Help ripping ~30,000 CDs
Post by: Porcus on 2014-03-11 07:44:19
It must be a radio station or something right?


Yep - read the original posting ;-)
Title: Help ripping ~30,000 CDs
Post by: kennedyb4 on 2014-03-11 12:10:52
Hi. I would like to add my advice to those suggesting you go lossless, particularly flac. I have ripped and re-ripped my CDs more times than I care to count. I used to use mp3 abr 192 then 320. Then I switched to aac at 128 and then opus for my rockbox.
Lots of replicated work for no reason except saving storage which is now crazy cheap.

You will save space in the endrun.  $.02
Title: Help ripping ~30,000 CDs
Post by: yourlord on 2014-03-11 17:23:13
This thread is 2 years old and the original poster appears to be long gone. Adding suggestions now is pointless.
Title: Help ripping ~30,000 CDs
Post by: eahm on 2014-03-11 22:46:50
Yep - read the original posting ;-)

Oops, sorry about that

particularly flac.

Why? I, for example, prefer ALAC and WavPack to FLAC.

This thread is 2 years old and the original poster appears to be long gone. Adding suggestions now is pointless.

Exactly.