Topic: Converting a HUGE collection

Converting a HUGE collection

Reply #25
Quote
I haven't done it yet, but I'll prolly MP3Gain the whole collection when done. I'm hoping by the time I'm done SnelG will have included support for the more widely HW supported ID3V2 tag in MP3Gain vs the technically superior APE tag - but that's another topic  (No, I don't do silly things w/ ID3V2 like include album art - only the fields EAC fills in for me.)

You should consider using replaygain instead of mp3gain.

Reply #26
Quote
You should consider using replaygain instead of mp3gain.

MP3Gain is the MP3 implementation of Replaygain. So basically, it is Replaygain.

Back on topic: I have no idea how you plan on ripping 10,000 CD's. First of all, if I bought 1 CD an hour for 8 hours a day, 5 days a week, it would take me about five years to reach a collection that size. Have you listened to all that music, or are you just a packrat?

If you rip the CD's one at a time, assuming you can get a CD done in 10 minutes, that's another 3/4ths of a year of constantly switching CD's during the 8-hour work day, and probably much longer, unless you plan to sit attentively in front of the drive the whole time. If you're serious about archiving this music, you'll need a better solution, such as a CD tower and a cluster of machines to do the encoding. That already costs $5000, probably.

And after investing all that time in encoding, you'll want a good way to back it up. Since time = money, (let's say you're worth $30 an hour, for argument's sake), that means you've spent $50,000 of your time if you copied the CD's one by one, or $6250 if you used an 8-CD ripping tower. Hell, even if you were worth minimum wage, it would cost you less to buy a ripping tower and encoding cluster than to do the job one-at-a-time. Anyway, the time investment justifies a real storage and backup solution.
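
For anyone who wants to re-run the time-is-money arithmetic above with their own numbers, here is a rough Python sketch. The inputs (10,000 CDs, 10 minutes per disc, $30 an hour, an 8-drive tower) come from the post; the function name and the assumption that a tower divides the attended time evenly across its drives are mine and purely illustrative.

```python
def ripping_cost(num_cds, minutes_per_cd, hourly_rate, drives=1):
    """Return (hours, dollars) of attended time needed to rip a collection.

    Assumption (not from the thread): a tower with several drives divides
    the attended time evenly, i.e. one person feeds all drives in parallel.
    """
    hours = num_cds * minutes_per_cd / 60 / drives
    return hours, hours * hourly_rate


one_hours, one_cost = ripping_cost(10_000, 10, 30)
tower_hours, tower_cost = ripping_cost(10_000, 10, 30, drives=8)

print(f"one drive:  {one_hours:,.0f} h, ${one_cost:,.0f}")
print(f"8-CD tower: {tower_hours:,.0f} h, ${tower_cost:,.0f}")
print(f"40-hour work-weeks with one drive: {one_hours / 40:.1f}")
```

With the post's inputs this works out to roughly 1,667 hours and $50,000 for a single drive, and $6,250 for the tower, which matches the figures quoted above.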

If FLAC gives 350MB per album and MP3 gives 75MB, that means the initial storage costs for hard drives are either 3,500GB or 750GB. With a nice round number of $1/GB, you save about $3000 by going with MP3, but you lose the ability to transcode or do any real signal processing on the audio, you don't get guaranteed transparency, and you don't get gapless playback. However, you can run 750GB off a standard IDE card, whereas a 3.5TB array will need some dedicated hardware, probably a RAID 5 with multiple redundant hot-swappable drives, just because you'll have drives dying on a semi-regular basis. Still, with the amount of money you already spent on the task, it seems silly to save a few grand and give up so much sound quality and flexibility.
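
The storage estimate can be sketched the same way. The per-album sizes (350MB for FLAC, 75MB for MP3) and the $1/GB price are the post's round numbers; the helper name is hypothetical and 1 GB is taken as 1000 MB, as the post implicitly does.

```python
def collection_storage(num_albums, mb_per_album, dollars_per_gb=1.0):
    """Return (gigabytes, dollars), using 1 GB = 1000 MB."""
    gigabytes = num_albums * mb_per_album / 1000
    return gigabytes, gigabytes * dollars_per_gb


flac_gb, flac_cost = collection_storage(10_000, 350)
mp3_gb, mp3_cost = collection_storage(10_000, 75)

print(f"FLAC: {flac_gb:,.0f} GB, ${flac_cost:,.0f}")
print(f"MP3:  {mp3_gb:,.0f} GB, ${mp3_cost:,.0f}")
print(f"saved by choosing MP3: ${flac_cost - mp3_cost:,.0f}")
```

The difference comes to $2,750 before any redundancy or backup, which is where the "about $3000" figure comes from.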

Then, you will also want a good backup system for this. RAID 1 is out of the question, since as DaveSimmons pointed out, you're still vulnerable to natural disasters, power surges, theft, and viruses. You'll need off-site backup. Since your data won't be changing that quickly, you don't need a light-speed backup solution, but you will probably need a whole lot of tapes on a tape drive that costs several thousand dollars. There have been some discussions on 2cpu.com about backup solutions for large arrays of data, but I'm sure there have been better discussions elsewhere as well. Anyway, you'd need a huge pile of tapes, a fast, reliable tape drive, an auto-loader, and a company that will store, test, and protect the tapes. Now, the cost of using FLAC is much greater than using MP3, but I'd still recommend it because locking yourself into a single lossy format limits you so much.

Archiving 10,000 CD's is a huge undertaking that probably isn't worth it for you, especially since you've mentioned your reluctance to spend money (which makes me wonder how you collected so many CD's in the first place). You can't sell the music online because you don't have a license to do so, and 10-to-1 odds say the RIAA will have you in court by the end of the week if you try to share all the albums on a file-trading network. If you're DJ'ing, you don't want MP3 because you can't do any karaoke or extreme equalization effects to it without making it sound like crap. That leaves you with personal listening (and public performance, if you use FLAC) as your only practical uses for your music. In that case, you might as well save the money spent on ripping all the music, buy an excellent amp and speakers, and just put the CD in a CD player. With all the time you saved by not archiving all your CD's, you'll have plenty of opportunities to sit back, relax, and listen.

Reply #27
Quote
I have no idea how you plan on ripping 10,000 CD's. First of all, if I bought 1 CD an hour for 8 hours a day, 5 days a week, it would take me about five years to reach a collection that size. Have you listened to all that music, or are you just a packrat?

I'm sure that instead of buying one CD at a time, he did a MUCH more extreme version of what I do (and what many people do).  I budget myself to about 10 CDs per month, and generally buy them all in one trip.  This guy probably buys 100 a month (or more, especially if he's a DJ or VERY serious music collector).

As for being a packrat, I agree that that's an *exorbitant* number of CDs, but then again, everybody has different priorities and different interests.  Maybe he has 10000 CDs for the same reason I have 8 hand-signed photos of Sarah Michelle Gellar and 5 of Xenia Seeburg.  If you love something, be it memorabilia or music, sometimes you just can't get enough.

Or then again...maybe he just typed one "0" too many in his original post...  B)

And the work estimates make a valid point...you'd never have the time to archive nearly that many CDs without a *lot* of help and a *lot* of equipment.  Hopefully, those are things he's already arranged for...

Reply #28
I don't get why most people insist on sitting still and watching the progress meter while ripping each and every cd?  You stick it in - go away and do some work or whatever, come back and check the logs, and then stick in the next!

It cuts maybe 5 minutes out of every hour. If you have a redundant PC at work with you, which many people in techie jobs do, then you can rip 25 CDs a day or something. It'd take a long time, but it won't cost $50,000 of your time.
< w o g o n e . c o m / l o l >

Reply #29
Quote
Why would Raid5 need UPS? I thought you could recover your data with 1 drive dead as with raid1, never saw anything about power issues?

I think I'm actually confusing centrally cached parity (RAID 7) with typical one-dimensional parity (RAID 5), so you might be right. If the ECC info. isn't cached in volatile memory during a power outage, it should be okay I think.
  RAID 5 should survive a single drive failure very well, but will be a bigger PIA to rebuild than a mirroring (1) array.
  Given that we're talking about RAID for data integrity and not performance, a RAID 1 (or 0+1) array should be the cheapest and easiest to implement, build, and fix if a drive dies.
Quote
However, you can run 750GB off a standard IDE card, whereas a 3.5TB array will need some dedicated hardware, probably a RAID 5 with multiple redundant hot-swappable drives, just because you'll have drives dying on a semi-regular basis. Still, with the amount of money you already spent on the task, it seems silly to save a few grand and give up so much sound quality and flexibility.

It seems strange to me that you'd propose a RAID 5 array with redundant backup swap-outs and then exclude such a possibility with RAID 1, where it would be faster and probably cheaper (RAID 1 controller = cheap) to do.

A 3.5TB array is very cost-prohibitive, and it's an open question whether the poster has the cash necessary to maintain such a system, let alone implement it. It's also questionable whether lossless is called for in this situation.
  It's certainly the ideal case, but the cost differential is enormous here, making the hardware cost rise dramatically (Maxtor 7200 rpm 250GB drives = $265). Possession of many thousands of CD's does not directly imply tremendous current financial means - I acquired my collection over about 18 years and currently drive a used car.

Reply #30
This thread has been helpful for me, as I too am about to embark on a similar project (although on a MUCH smaller scale).  In fact, I'm facing about 1/10 the effort that Audible! is going thru right now- I've got around 250 CDs I'm planning on ripping.  Of course, I expect my collection to expand over time, and therefore am cognizant of starting with a storage solution that gives me room to grow.  Which brings me to my question...

I've got an entry-level hi-fi system that I ultimately plan on playing my music on.  (Hale Design Group Rev3 speakers, Bryston B60 integrated amp).  For quality and obsolescence reasons, I've decided to rip and store all my music as WAV files (no desire to EVER rip my collection again, and plan to rip each new purchase as it is made).  But I also plan to encode a subset of my music as MP3s for use on my iPod.  I noticed Maxtor and Western Digital both have 250 GB hard drives out, at both 5,400 rpm and 7,200 rpm I believe.  These seem to offer the lowest cost/MB.  From my preliminary reading, it seems that higher rpm is good for access times, but bad for noise.  Does anyone have any view on the trade-off here?  Also, I've read that anything higher than 60GB requires another platter and therefore increases noise.  How much should I be concerned about noise if I'm running my WAVs through an external DAC and then through my Bryston (off-topic: any advice on external DACs with either Firewire or USB connectivity for this purpose?).

Finally, is there anything I should watch out for before I start this?  I've used EAC and RazorLAME in the past, and have occasionally had trouble with tagging and batch encoding (but that was around a year ago).  Ideally, I'd like to systematically rip my CDs, then batch encode the ones I want to transfer to my iPod.  Tips on the best way to do this (or pitfalls to avoid) would be much appreciated.

Btw- sorry if this should be the start of a new thread or if some of the q's belong in other categories.  I'm a newbie (first post) and am still trying to figure out proper etiquette, protocol, etc.  Please trust that I HAVE spent a considerable amount of time searching around this forum before posting these questions.  Anyway, feel free to suggest other places I should post this or threads I should check out to get answers.  Thx.

Reply #31
Quote
I've got an entry-level hi-fi system that I ultimately plan on playing my music on. (Hale Design Group Rev3 speakers, Bryston B60 integrated amp). For quality and obsolescence reasons, I've decided to rip and store all my music as WAV files (no desire to EVER rip my collection again, and plan to rip each new purchase as it is made). But I also plan to encode a subset of my music as MP3s for use on my iPod. I noticed Maxtor and Western Digital both have 250 GB hard drives out, at both 5,400 rpm and 7,200 rpm I believe. These seem to offer the lowest cost/MB. From my preliminary reading, it seems that higher rpm is good for access times, but bad for noise. Does anyone have any view on the trade-off here? Also, I've read that anything higher than 60GB requires another platter and therefore increases noise.

If you're storing your CDs losslessly on disk for quality purposes, you might as well use lossless compression like FLAC and save space. Encoding from a FLAC file produces output identical to encoding from the uncompressed WAV, and you can always turn the FLAC back into the original WAV with a minimum of fuss.

If you have an iPod, AAC would be a good choice.

All else being equal, a 5400rpm drive will be quieter than a 7200 rpm drive, but often the noise difference isn't all that substantial, and the price difference is tiny. Most 250GB models these days use three 80GB+ platters, and come in around 240GB when formatted. The main advantage of the 7200 rpm drives is that the warranty period of the drive is longer- the $255 250GB 5400 rpm Maxtor IDE drive has a 1 year warranty while the $265 250GB 7200 rpm version has a three year warranty.

Reply #32
@rsp22 if you want a quiet hard drive (or a quiet anything computer-related), Silent PC Review is the place to do your research. Some of the quieting technology available for 7200RPM drives is making them quieter than their older 5400RPM predecessors. Some models to look for (quietest first, the rest roughly in order): Seagate Barracuda IV, Samsung Spinpoint 1204N/1614N (or some derivative of those model numbers), Barracuda V, Barracuda 7200.7, Hitachi 180GXP, Maxtor DM+9. That's from my second-hand experience, and it's also what the site admin recommends from his in-depth experience. The WD drives are screamers, I've heard, and so are the ball-bearing Maxtor drives, which I can claim from personal experience. I heard a fluid-dynamic-bearing Maxtor drive recently, and it was much, much quieter.

@Audible!: it sounds like you know more about this RAID stuff than I do, so I'll accept your judgement that RAID 1 would be more appropriate than RAID 5 for this situation. My reasoning was that, with 3.5TB (about 20 drives), it's cheaper to tack on a few extra drives in RAID 5 (it is possible to have, say, three redundant drives in a 23-drive RAID 5 array, correct?) than to buy 40 drives and use RAID 1.

@Mac: Okay, I'll re-do my calculations. Let's say 12 minutes per CD, because he won't be attentive at the ripping station, and 5/60ths of $30 an hour. It will take a year of 40-hour work-weeks to finish the job, and still be worth $5000 of one's time.
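
The revised estimate can be checked with a few lines of Python. The numbers (12 minutes per CD, 5 attended minutes per hour, $30 an hour) are the ones given above; the function and parameter names are hypothetical.

```python
def unattended_ripping(num_cds, minutes_per_cd, attended_min_per_hour, hourly_rate):
    """Return (elapsed_hours, dollar_value_of_attended_time).

    Elapsed time is the wall-clock time the ripping station is busy;
    only the attended fraction of each hour is billed at hourly_rate.
    """
    elapsed_hours = num_cds * minutes_per_cd / 60
    attended_hours = elapsed_hours * attended_min_per_hour / 60
    return elapsed_hours, attended_hours * hourly_rate


elapsed, cost = unattended_ripping(10_000, 12, 5, 30)

print(f"elapsed: {elapsed:,.0f} h = {elapsed / 40:.0f} work-weeks of 40 h")
print(f"value of attended time: ${cost:,.0f}")
```

That gives 2,000 elapsed hours (50 work-weeks) and about $5,000 of attended time, in line with the figures in the post.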

Edit: Audible!, I think your comment is a result of my poor use of terminology; I should have said "multiple parity drives" or something along those lines, rather than "redundant" which suggests duplicate drives ("RAID 5+1"?). Does my suggestion now sound reasonable?

Reply #33
I thought of a good solution. Hire a team of high-school or college computer geeks to do the job for you. Tell them that they can keep copies of whatever they want, and they'll do it for free. Pay them a nominal sum to keep copies of everything, and you'll have a backup solution. Let's say they copy a CD every 15 minutes for four hours after school on Monday, Wednesday, and Friday. If you hire 20, you'll have the job done in 2 1/2 months.
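
As a sanity check on the schedule, here is a small Python sketch using the numbers from the post (20 helpers, one CD every 15 minutes, four hours a session, three sessions a week). The function name and the weeks-to-months conversion (52 weeks per 12 months) are my own assumptions.

```python
def weeks_for_team(num_cds, helpers, minutes_per_cd, hours_per_session, sessions_per_week):
    """Return the number of weeks a team needs to rip num_cds."""
    cds_per_session = hours_per_session * 60 // minutes_per_cd
    cds_per_week = cds_per_session * sessions_per_week * helpers
    return num_cds / cds_per_week


weeks = weeks_for_team(10_000, helpers=20, minutes_per_cd=15,
                       hours_per_session=4, sessions_per_week=3)
months = weeks * 12 / 52  # assumed conversion: 52 weeks per 12 months

print(f"{weeks:.1f} weeks, or about {months:.1f} months")
```

The team gets through 960 CDs a week, so the job takes a little over 10 weeks, close to the 2 1/2 months estimated above.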

Reply #34
Quote
@Audible!: it sounds like you know more about this RAID stuff than I do, so I'll accept your judgement that RAID 1 would be more appropriate than RAID 5 for this situation. My reasoning was that, with 3.5TB (about 20 drives), it's cheaper to tack on a few extra drives in RAID 5 (it is possible to have, say, three redundant drives in a 23-drive RAID 5 array, correct?) than to buy 40 drives and use RAID 1.


I believe the ratio in RAID 5 is usually 1:2, meaning one allowable failure per three total drives (and ~2/3 capacity of total drives), but I'm unsure about very large arrays. Edit: Actually, I'm pretty sure this is wrong - the minimum number of drives for a RAID 5 array is three, and this can withstand a single drive failure, but I'm unsure how this scales.
This doesn't make RAID 5 more resistant to surges or failure though, and an array will take longer to rebuild in the event of disk failure.
My concern was primarily with price - 20+ large hard drives just aren't cheap! This is why lossless seems untenable to me for someone with a very large collection. MPC can sound really excellent, and it's small enough to be usable here.
I'd use MPC primarily if my portable and car players supported it. What I was suggesting was a small mirroring array for lossy compression because a lossless array would be both huge and expensive in comparison.
  He could use a RAID 1 or 5 array with 5-6 250GB drives and it shouldn't cost more than ~$1900, controller included, and he wouldn't have to fear much assuming the host computer is on a surge protector. He could even pull the mirrored drives and rebuild, then store a set in a safety deposit box if he felt like it.
 
  For a lossless array, you're talking closer to $6500 on drives alone, and good luck finding a controller that supports more than 16 drives for even that much by itself!
  $1,900 vs. $10,000+ is a big difference, especially for a matter like this.

Reply #35
Here's a cheap solution to the 10K CD dilemma:

Sort through them and sell what you don't listen to.  Or, just rip what you do listen to and when the mood strikes, then rip the next in line.  10,000 CD's is a number that's just not possible to listen to in any reasonable length of time.  I'm inclined to say you'd be wasting time and money in trying to rip/encode all of them.

I'll go hide now.

Reply #36
Quote
  For a lossless array, you're talking closer to $6500 on drives alone, and good luck finding a controller that supports more than 16 drives for even that much by itself!
  $1,900 vs. $10,000+ is a big difference, especially for a matter like this.

Prices have come down a bit.  Here's a real world pricelist of what you'd need:

3WARE Escalade 7500-12          $519.00 x  2  = $1038.00
Hitachi Deskstar 180GXP 180GB   $155.00 x 24  = $3720.00

That's $4758 for two RAID5 arrays of 12 drives each.  Total capacity is 3960GB (3600GB if we use 2 drives as hot spares).  Add $1500 for a proper case/ps/mainboard/etc.

Grand total:  $6258

Not bad for roughly 4TB of storage.
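
The price list and capacity figures above can be reproduced with a short Python sketch. Prices and drive sizes are taken from the post; the constant and function names are hypothetical, and the capacity formula assumes the usual RAID 5 layout in which one drive's worth of space per array goes to parity and a hot spare holds no data until it is needed.

```python
CONTROLLER_PRICE = 519.00   # 3WARE Escalade 7500-12, per the post
DRIVE_PRICE = 155.00        # Hitachi Deskstar 180GXP 180GB, per the post
DRIVE_GB = 180
ARRAYS = 2
DRIVES_PER_ARRAY = 12
CHASSIS_PRICE = 1500.00     # case/ps/mainboard/etc., per the post


def raid5_usable_gb(drives, drive_gb, hot_spares=0):
    """Usable space of one RAID 5 array: all data drives minus one for parity."""
    return (drives - hot_spares - 1) * drive_gb


hardware = ARRAYS * CONTROLLER_PRICE + ARRAYS * DRIVES_PER_ARRAY * DRIVE_PRICE
total = hardware + CHASSIS_PRICE
capacity = ARRAYS * raid5_usable_gb(DRIVES_PER_ARRAY, DRIVE_GB)
capacity_with_spares = ARRAYS * raid5_usable_gb(DRIVES_PER_ARRAY, DRIVE_GB, hot_spares=1)

print(f"controllers + drives: ${hardware:,.0f}")
print(f"grand total:          ${total:,.0f}")
print(f"usable capacity:      {capacity} GB ({capacity_with_spares} GB with one hot spare per array)")
```

This reproduces the $4758 and $6258 totals and the 3960GB / 3600GB capacities quoted in the post.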

Reply #37
It will take years to rip and listen to those 10,000 discs. Why not upgrade hard disks over the next couple of years and rip some 400 at a time?

Reply #38
Quote
It will take years to rip and listen to those 10,000 discs. Why not upgrade hard disks over the next couple of years and rip some 400 at a time?

I think he's more interested in archival than just listening.  I.e., to avoid losing part of his music investment if CDs get lost, scratched, etc.  10,000 CDs is a LOT to keep up with and take care of...I can't find CDs in my collection half the time, and I have less than 400!

But I agree with your point on pacing the project over time mainly because of  storage technology prices dropping in the months and years to come (per my post about ten or twelve back in this thread).

I'd have to have a pretty compelling reason to spend thousands on hard drives, controllers and supporting components all at once, and then in 18 months (or less) find that equivalent storage capacity costs a small fraction of what I paid.

I'm looking forward to 50 GB Memory Sticks, myself...   

------------------------------------------------------------------------------------

2005:  10 GB Memory Sticks.  50 GB flashcards.  Hard drives at 500 GB per platter (after a kid in a garage discovers a hot new magnetic-media indexing algorithm).

2007:  Hard drives at 2 TB per platter (the kid's a billionaire by now).  500 GB flashcards.

2010:  10 TB on a standard home PC.  2 TB flashcards.  Garage-kid from five years ago invents a 100 GB experimental, upgradable cranial implant with integrated neural interface (but anyone who has one will be considered a freak).  Digital media compression still needed to store 20TB-worth of high-definition media on your 10TB PC.

2015:  500 TB on a standard home PC.  No more spinning disks: 100 TB per flashcard, with 5  to 10 flashcard slots on an affordable computer.  ALL PCs are portable now.  10 TB now available for the no-longer-experimental cranial implants (anyone who DOESN'T have one will be considered a freak).  Cranial media storage coupled with thought-controlled satellite-commlink implant device...cell phones are obsolete.  Digital media compression is a thing of the past...now the catch phrase is *throughput optimization*.  Brain surgeons become digital storage media gurus.

2020:  100 Exabytes (100,000,000 TB) on a chip hanging from your keychain (with a built-in headphone port, of course, for those times you don't want to use the 500 TB chip in your head to listen to music from).  "Storage media wars" have given way to "bandwidth wars" as the atmosphere is permeated with radio and microwave communications over every available frequency range, with people fighting over a little more bandwidth for themselves...

...and the march goes on...

Reply #39
Is the 10000 CDs guy still reading this thread even?

Reply #40
Sidebar:  Why am I getting the impression that people like playing with their computers more than listening to their music?   

Dex

Reply #41
Quote
Sidebar:  Why am I getting the impression that people like playing with their computers more than listening to their music? 

Actually, right now I'm:

1.  Ripping a CD.
2.  Encoding four files at a time with LAME.
3.  Running MP3Gain on another album.
4.  Writing a document.
5.  Using the internet (obviously).
6.  "Playing with" some other things on my computer.
7.  AND listening to Milla Jovovich...

...all at the same time.  Ya gotta love multi-tasking... 

Reply #42
Quote
No kidding, eh? As a 19-year old, I think I've done okay in having ~100 CDs. Then, to hear someone come on here and have my collection size *SQUARED*(!!) just makes me feel small.

lol. I'm your age, and my collection size is SQUARE-ROOT yours.

Reply #43
Re cost: if the guy has (had) between $100,000 and $200,000 to spend on the CD's in the first place, then several thousand for hard drives shouldn't break his bank.

OTOH, the people I know with large collections of original (vs pirate copies) music get a lot at garage sales, estate sales, etc.  One guy I rented a room from had at the time around 7,000 hours worth of records (before CD's came out).  Imagine encoding all that with the added labor of splitting out and tagging the tracks.

Reply #44
Given the enormity of the task, the original poster (to the extent he/she is still following this) may want to consider outsourcing the job.  The AudioRequest people offer a Music Loading service, though it's unclear how much it would cost or if it's available if you don't purchase one of their (arguably overpriced) systems.  Also, they sell RipStations for people with similar needs.  Not sure if this is what SometimesWarrior was referring to when he talks about "ripping tower and encoding cluster".  Neither solution is cheap, but more than one post has already demonstrated that cheap (in absolute terms) isn't an option and instead the key is to find a solution that is RELATIVELY cost/time-efficient given the requirements.  Still not sure if AudioRequest qualifies, but if you wanted to learn more: http://www.request.com/us/

Regarding the future, I don't know if I agree with ScorLibran's timeline for cranial implants, but the holographic storage that he/she speaks of IS just around the corner:  http://www.inphase-technologies.com.

Reply #45
Audible!: Lossless would seem like the best answer for me (FLAC seems to be getting the most support- see [overpriced] AudioRequest device).  A question on lossless: Is there any decoding time required in playing lossless files that is not an issue for WAV files?  If not, is there ANY advantage to WAV whatsoever?
Regarding AAC, I've got the "old" iPod for Windows- I bought a 20GB Windows-compatible iPod last September straight from the first run of the assembly-line only to see it made obsolete with the 2003 iPod!!  It's my understanding that mine doesn't support AAC.  Even if a software upgrade could fix this, I'm worried about ubiquity of AAC vs. MP3 and also just don't feel like going through the hassle of educating myself on optimal AAC bitrates, etc. (I've spent a not-insignificant amount of time researching the optimal LAME settings for MP3).  That said, I'd be interested in people's thoughts on this decision point.

Reply #46
Well, if you've got 10,000+ CDs, that's a considerable cost even if you bought blanks and burned copies from friends. So what is hard drive space compared to that? On the other hand, back to your question: I would go with MPC (q5) for the majority of them, and for the CDs you really like, go with q6-7 or lossless, in case you need to convert them to MP3s for a portable MP3 player and such.

Reply #47
Quote
A question on lossless: Is there any decoding time required in playing lossless files that is not an issue for WAV files?  If not, is there ANY advantage to WAV whatsoever?

Of course there's more decoding time for encoded music vs WAV.  That's kind of the point.

e.g. Annie Lennox - Bare - The Hurting Time, timed using foobar2k's speed meter (foo_null.dll):

MP3, 224kbps: decoding took 8632 milliseconds, speed 52.38x
APE, high: decoding took 18246 milliseconds, speed 24.77x
FLAC level 5: decoding took 7030 milliseconds, speed 64.30x
WAV, 16 bit undithered: decoding took 872 milliseconds, speed 518.44x

Depends on whether 10x the CPU is worth ~60% of the space to you.
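
To put the four timings side by side, here is a rough Python sketch that expresses each decode time as a multiple of the WAV time and back-calculates the track length implied by each measurement (decode time multiplied by the reported speed). The numbers are copied from the list above; the dictionary layout and variable names are only illustrative.

```python
# (decode time in ms, speed relative to real time), copied from the post
timings = {
    "MP3 224kbps": (8632, 52.38),
    "APE high": (18246, 24.77),
    "FLAC level 5": (7030, 64.30),
    "WAV 16 bit": (872, 518.44),
}

wav_ms = timings["WAV 16 bit"][0]

for name, (ms, speed) in timings.items():
    relative_cost = ms / wav_ms          # how many times slower than WAV
    track_seconds = ms * speed / 1000    # track length implied by this row
    print(f"{name:13s} {relative_cost:5.1f}x WAV decode time, "
          f"implied track length ~{track_seconds:.0f} s")
```

On these numbers FLAC takes about 8 times as long to decode as WAV, MP3 about 10 times, and APE about 21 times. All four rows imply the same track of roughly 452 seconds, so the measurements are at least consistent with each other.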


Reply #49
The deal with RAID-5 is that only one drive in an array can fail and the array still survive. And the minimum number of drives is three. Yes, you could have a huge 20 drive array (you'd have to use software RAID) and you would only lose one drive to parity. If you just had 7 3-drive arrays, you'd lose 7 drives to parity. Then again, this is a direct trade-off to the fact that it's much more likely that two or more drives will fail in a 20 drive array than it will in a 3-drive array. Most people find 4-8 to be the optimal number, but this is also probably related to limitations with hardware RAID.
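
The trade-off between parity overhead and the risk of a double failure can be illustrated with a toy Python calculation. This is only a sketch: it assumes drives fail independently with the same probability, and the 5% per-drive figure is a made-up placeholder, not a measured failure rate. The function name is hypothetical.

```python
def p_two_or_more_fail(drives, p_fail):
    """Probability that at least two of `drives` fail.

    Assumes independent failures with identical per-drive probability,
    which is a simplification (real drives from one batch often fail together).
    """
    p_none = (1 - p_fail) ** drives
    p_one = drives * p_fail * (1 - p_fail) ** (drives - 1)
    return 1 - p_none - p_one


P_FAIL = 0.05  # hypothetical chance a given drive dies before a rebuild completes

big = p_two_or_more_fail(20, P_FAIL)        # one 20-drive array
small_each = p_two_or_more_fail(3, P_FAIL)  # a single 3-drive array
any_small = 1 - (1 - small_each) ** 7       # at least one of seven 3-drive arrays

print(f"one 20-drive array:   1 parity drive,  P(array lost)     = {big:.3f}")
print(f"seven 3-drive arrays: 7 parity drives, P(any array lost) = {any_small:.3f}")
```

Under these assumptions the single large array is several times more likely to be lost than any of the small ones, and losing one small array costs only a seventh of the data. The price is six extra drives given up to parity.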

I currently have half a terabyte in my fileserver and in the next month or so will be upgrading to a full terabyte. I will be using Samsung 5400RPM 160GB drives managed by Linux software RAID. I don't care how slow it is since I'm mostly concerned with just the amount of storage I can get for my budget.
Everything I've learned about space, I've learned from psytrance.