dBpowerAMPs New AccurateRip

Starting with dBpowerAMP Music Converter Release 10, AccurateRip will be included - this is a clever database intended to cover all CDs in existence. Hopefully the database will populate quickly after the initial launch - it is designed for about 1 million CDs.

Initially AccurateRip is 'Unconfigured', meaning the program does not know the offset value of the drive. As soon as a recognised CD is inserted (recognised means any CD ripped by another user who has AccurateRip configured), AccurateRip will offer to AutoFind the offset. Once that is found AccurateRip is enabled, and each time a disc is ripped it is compared with the results in the database, so it can inform you whether the rip is accurate. Once a month the database can be automatically updated (the discs you have ripped are added for other people's benefit).

So say 100 people have ripped Madonna's Erotica - all results are stored on my computer, and the result with the most matches (the true one) is used to form the database - so when you get around to ripping this disc it will offer a Confidence of 100** (see below).

Say only one person has ripped the disc before: the system still works in your favour. If everything goes OK your rip will match the existing one and it will give a confidence of 1. But what if they do not match? It means either their rip or yours is wrong; to find out which, see below.

Say you rip a disc that is not in the database: at the end it will tell you. The beauty of this system is that you can verify the rip against your own result (the last rip is in your database), preferably on a different CD drive. My research suggests that different drives tend not to return the same bad result (I have not come across one case yet), but the same drive ripping and re-ripping the same disc can - I have seen it!

AccurateRip is going into a closed beta just before Christmas and an open beta just after Christmas, to see how the theory and practice go together.


** Really, if 100 people submitted results for 1 disc, I would expect a number of them to be wrong. These are filtered out (the nice fact is that a scratch on someone else's CD will not be the same as a scratch on your CD), so out of 100 it might have 67 that match - the confidence would then be 67.
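
As a rough sketch of the filtering described above (not the actual AccurateRip code; the function name and the use of plain integers for CRCs are assumptions made for illustration), the confidence can be thought of as the number of submissions that agree on the most common CRC:

```python
from collections import Counter


def best_crc_and_confidence(submitted_crcs):
    """Return (crc, confidence): the most common CRC and how many rips agree on it."""
    if not submitted_crcs:
        return None, 0
    crc, count = Counter(submitted_crcs).most_common(1)[0]
    return crc, count


# Hypothetical data: 67 matching rips plus 33 rips, each damaged differently.
submissions = [0xCAFEBABE] * 67 + [0x1000 + i for i in range(33)]
crc, confidence = best_crc_and_confidence(submissions)
```

Rips spoiled by unrelated scratches each land on a different CRC, so they never build up a count that competes with the correct one.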

Reply #1
Seems quite a complex system to manage, if not to implement. Just out of curiosity, what CD extraction engine do you use? One you have created yourself, or something like cdparanoia?

Cheers,

Kristian

Reply #2
One I have created myself.

I do not believe that any Secure Rip / Paranoia system will give 100% accurate results when a CD drive can return errors (e.g. from a scratch on a CD) that are constant. This system works independently of all the C2 and drive-caches-audio-data problems that stop those previous systems from functioning correctly.

What it cannot do is magically rip a badly scratched CD - I do not think anything can. What it can do is give 100% certainty that a track is either ripped 100% right (if a track had a confidence of 2 or above I would be happy that it was right; even with a confidence of just 1 I would be 1 in 4 billion happy), or that it is not (in which case, go and get that CD restorer liquid, or another CD).
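
The "1 in 4 billion" figure is consistent with each CRC being 4 bytes (32 bits) long, as described later in the thread. A back-of-the-envelope check, assuming a bad rip produces an effectively random CRC value:

```python
CRC_BITS = 32  # each CRC is 4 bytes long

possible_values = 2 ** CRC_BITS
# Chance that one unrelated (wrong) rip lands on the same CRC purely by accident.
chance_of_accidental_match = 1 / possible_values

print(f"1 in {possible_values:,}")  # prints "1 in 4,294,967,296"
```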

Reply #3
Very interesting idea!

I have actually been thinking for quite some time about how to check the quality of a ripped track without listening from start to end, and know that the rip is OK (mainly, without clicks).
Though I have a few questions:
1. What kind of information will be collected from each disc?
2. What if I have the Madonna Erotica CD, but there are a few releases: US, European and Japanese ones, plus a few different sub-releases by different manufacturers (like Columbia House here in Canada and the US, which prints its own CDs)? Which version of the CD will be chosen as the correct one?
3. Will the algorithm and the "reference implementation" be open source?

I would love to participate in the test just for the sake of testing the idea... If the algorithm is not a very advanced one, then it might be possible to run it on MP3/OGG/MPC files and verify that they are correct rips of the original CDs... (usually very useful when you copy "a number" of files from someone else).
My endian is bigger than yours.

Reply #4
>1. What kind of information will be collected from each disk?

Very little - having a large database would be the downfall of such a system.

>2. What if I have Madonna Erotica cd, but there are a few releases: US, European and Japanese ones.

This ties in with the above - For a CD the following information is stored:

Track Count - 1 byte
Unique Identification - this is 3 identifiers, each 4 bytes long: the first is all the track offsets added, the 2nd is the track offsets multiplied, and the 3rd is the freedb identifier (I thought it would be good to tie discs to names that exist, although it will only be used while the database is low on entries, so you can quickly find a matching known CD).

Then for each track there are two CRCs, each 4 bytes long. The first is the CRC of the whole track, so it can be determined whether a rip is correct. The 2nd is the auto-offset-finding CRC, which is calculated 5 seconds into each track for 1 frame only. Using all the tracks on the disc, the offset can be found quickly and reliably.
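
To make the record layout above concrete, here is a minimal sketch of how such a disc entry could be packed. It is not the real on-disk format: the little-endian byte order, the wrap-around to 32 bits, the treatment of a zero offset in the product, and all function names are assumptions.

```python
import struct

MASK32 = 0xFFFFFFFF  # each identifier is 4 bytes, so values are wrapped to 32 bits (assumption)


def disc_identifiers(track_offsets, freedb_id):
    """Return the three 4-byte identifiers: offsets added, offsets multiplied, freedb ID."""
    id_sum = sum(track_offsets) & MASK32
    id_product = 1
    for offset in track_offsets:
        # Assumption: treat an offset of 0 as 1 so it does not zero the whole product.
        id_product = (id_product * max(offset, 1)) & MASK32
    return id_sum, id_product, freedb_id & MASK32


def pack_disc_record(track_offsets, freedb_id, track_crcs):
    """Pack track count (1 byte), 3 identifiers, then 2 CRCs (4 bytes each) per track.

    track_crcs is a list of (whole_track_crc, offset_finding_crc) pairs.
    """
    ids = disc_identifiers(track_offsets, freedb_id)
    record = struct.pack("<B3I", len(track_offsets), *ids)
    for whole_crc, offset_crc in track_crcs:
        record += struct.pack("<2I", whole_crc, offset_crc)
    return record
```

With this layout a disc costs 13 bytes plus 8 bytes per track, which fits the stated goal of collecting very little per disc.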

>3. Will the algorithm and the "reference implementation" be open source?

I would be happy to make it such, after it has been proved to work.

Reply #5
Volunteers wanted

I am after volunteers with large CD collections (over 100 discs) to take part in AccurateRip and help populate the database (others cannot begin to use AccurateRip until it recognises a known disc).

A special Test Diagnostics CD will be sent for AccurateRip to seed on.

Send your email to dbpoweramp@dbpoweramp.com

Reply #6
How do you handle overread? A drive without overreading will never give the same CRC if the start and end of the audio track are not silent, because it will be cut at different places.

Reply #7
Also, how do you deal with drives that give inconsistent offset readings? My ASUS CD-S520 gives differing offsets when comparing against the EAC database, so this is a very real problem that has to be addressed.

Reply #8
I really like this concept. Three suggestions:

A: The database is pretty much empty and the concept is untested. I have many CDs, but I am unwilling to re-rip them all just to fill the database for testing purposes. How about providing a list of 20 popular CDs and asking people to rip these? Getting 100 results for one CD should be a lot easier then.

B: Should this idea catch on, I'd like to suggest the following method of reading from and writing to the database: DNS and e-mail. At some point your database server will experience technical difficulties. Setting up mirrors is very simple with DNS. Accepting submissions via e-mail allows your database server to be down for days without losing submissions.

C: I suggest you send the following information along with the submission results: e-mail address or anonymous user ID, CD-ROM drive used for ripping, settings used on the drive, and other relevant settings. This has two advantages. If a specific user has ripped 100 CDs and they all have a confidence of 99%+, then new submissions from this user might be given greater weight. If a specific piece of hardware statistically gives really bad or really good confidence, then new submissions using that hardware can be weighted accordingly.

Reply #9
Quote
How do you handle overread ?


This is not an issue initially as AccurateRip will not configure on a drive that does not support accurate stream (the accurate positioning of the drive head on an exact sector).

It is possible that, when the system has matured, non-Accurate Stream drives can be made to work (by over- and under-reading), but your drive must support that, so someone with a drive that is not Accurate Stream and is not overread capable has got no chance.

I think all new drives are Accurate Stream; even my trusty 5-year-old Philips burner was.

Quote
My ASUS CD-S520 gives differing offsets when comparing against the EAC database


Going from the above, as long as it is Accurate Stream it will be fine, even if the offset is not what other people with the same drive have. The system does not find an offset based on drive names: it finds the offset on 10 tracks in an album, and if it is not constant it will warn that the drive is not Accurate Stream.
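
A minimal sketch of that per-track offset search and consistency check, using the offset-finding CRC described earlier in the thread (one frame, 5 seconds into each track). This is an illustration only: `zlib.crc32` stands in for whatever CRC AccurateRip really uses, raw 16-bit stereo PCM is assumed, and the function names and search range are made up.

```python
import zlib

BYTES_PER_SAMPLE = 4                      # 16-bit stereo
SAMPLES_PER_FRAME = 588                   # one CD sector of audio
PROBE_START = 5 * 75 * SAMPLES_PER_FRAME  # 5 seconds in, at 75 frames per second


def probe_crc(pcm, shift=0):
    """CRC of one frame starting 5 seconds into the track, moved by `shift` samples."""
    start = (PROBE_START + shift) * BYTES_PER_SAMPLE
    end = start + SAMPLES_PER_FRAME * BYTES_PER_SAMPLE
    if start < 0 or end > len(pcm):
        raise ValueError("track too short for this shift")
    return zlib.crc32(pcm[start:end])


def find_track_offset(pcm, reference_crc, max_shift=2000):
    """Try every shift in range; return the first whose probe CRC matches, else None."""
    for shift in range(-max_shift, max_shift + 1):
        if probe_crc(pcm, shift) == reference_crc:
            return shift
    return None


def find_drive_offset(tracks, reference_crcs, max_shift=2000):
    """Return the one offset all matched tracks agree on, or None if nothing matched.

    Raises ValueError when tracks disagree, the case the post treats as a sign
    that the drive is not Accurate Stream.
    """
    offsets = {find_track_offset(pcm, crc, max_shift)
               for pcm, crc in zip(tracks, reference_crcs)}
    offsets.discard(None)
    if len(offsets) > 1:
        raise ValueError(f"inconsistent offsets {sorted(offsets)}: drive may not be Accurate Stream")
    return offsets.pop() if offsets else None
```

Because the probe sits 5 seconds into the track, there is audio on both sides of it, so shifts in either direction can be tested without reading outside the track.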

Quote
providing a list of 20 popular cds and ask people to rip these. Getting 100 results for one cd should be a lot easier then.


Good idea. Ideally the beta test is just for ironing out the bugs. The main usage (and benefit) will not come until it gets a proper release and I can get those people who rip 200,000 discs per week (according to freedb stats) onto a new version.

Quote
If a specific user has ripped 100 cds and they all have a confidence of 99%+,


I am hoping (at least from brainstorming) that this is not required and that the system will be immune to bad results. Put it this way - the law of averages is firmly on the 100% right side. Let me explain (with a few really bad case scenarios):

1 CD - 100 people rip it. Of those 100 people, 78 do not look after their discs and have bad scratches (meaning the disc can never be ripped correctly). These scratches will be random, so their CRCs will not match each other, but the 22 that do match will outweigh the others - the true result will be used by the system.

1 CD - 100 people rip it, but everyone with an XYZ drive rips it wrong (I don't know why this should be the case, but it might happen). Another possibility is that the disc's unique identifier is not totally unique. Either way there would be two CRCs that feature prominently. I have put code in there to flag such cases so the results can be looked at manually to find out what went wrong. With a bit of luck it will not have to be used.
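
One way such a "two prominent CRCs" check might look, purely as a sketch: the 20% threshold and the function name are assumptions, and the real code may use a different rule entirely.

```python
from collections import Counter


def needs_manual_review(submitted_crcs, min_share=0.2):
    """True when a second CRC is also backed by a sizeable share of the submissions."""
    top_two = Counter(submitted_crcs).most_common(2)
    if len(top_two) < 2:
        return False
    runner_up_count = top_two[1][1]
    return runner_up_count / len(submitted_crcs) >= min_share
```

In the first scenario (22 matching rips and 78 randomly damaged ones) the runner-up CRC has a single vote, so nothing is flagged. In the second, the faulty drive model or the clashing disc identifier produces a second CRC with many votes, and the disc is flagged.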

I don't think email needs to be used; the system will store results on the local computer for many weeks before submitting to the server. Initially it works from a downloadable database; once it grows in size it will use a remote server (with the option of local), just as freedb does. In fact I have just started using a 2nd dedicated server that will primarily be used by AccurateRip.

I will open it up for an open beta shortly after Christmas; there should be about 500 discs in the database by then.

Reply #10
I believe you misunderstand me. I tested the offset mode twice, once with one CD, once with another. I got differing results between the two! Thus, my drive has an inconsistent offset and is unsuitable for exact copying, and nothing your program could do could possibly fix the problem.

Reply #11
Quote
I believe you misunderstand me. I tested the offset mode twice, once with one CD, once with another. I got differing results between the two! Thus, my drive has an inconsistent offset and is unsuitable for exact copying, and nothing your program could do could possibly fix the problem.

An inconsistent offset means that you get a different offset with the same disc when you test again. Your test discs just happened to be from the wrong pressing and so unsuitable for offset detection.

Reply #12
Quote
Quote
How do you handle overread ?


This is not an issue initially as AccurateRip will not configure on a drive that does not support accurate stream (the accurate positioning of the drive head on an exact sector).

Overreading and Accurate Stream are two separate things. Overreading means that the drive can read audio data from the lead-in / lead-out parts of the disc. There are several Accurate Stream drives that don't support this and will cut samples on certain CDs.

Quote
I think all new drives are Accurate Stream, even my 5 year old trusty Philips burner was.

Perhaps, but your drive isn't a good example. I don't know of any burner that doesn't support it, but the generic no-name CD drives that so many cheap computers ship with usually don't.

Reply #13
Good point, overread is also needed for any drive where the offset is not 0...
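
A small sketch of why a non-zero offset implies overreading. The sign convention (a positive offset means reading later on the disc) and the function name are assumptions for illustration:

```python
def corrected_read_range(total_samples, offset):
    """Sample range to request so the returned audio lines up with the true disc.

    Returns (start, end, before_start, after_end), where the last two values are
    how many samples fall before the first track and past the last track.
    """
    start, end = offset, total_samples + offset
    before_start = max(0, -start)
    after_end = max(0, end - total_samples)
    return start, end, before_start, after_end
```

Whatever the sign, a non-zero offset pushes one end of the request outside the normal audio area by that many samples, so a drive that cannot overread comes up short at that end.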

Reply #14
Quote
1 CD 100 people rip it, but everyone with a XYZ drive rip it wrong (I don't know why this should be the case, but it might happen)

It happens with some Toshiba drives. Quote from the EAC FAQ:

Some Toshiba drives have a firmware bug returning wrong data on special positions of every CD. As the error really occurred, you should always listen to these suspicious positions and decide if the error is audible or not.


...and with some Samsung drives (if I remember correctly), which mute a short range of audio near the end of each track.

For the overread thing, you could support drives without overread on the condition that the number of null samples at both ends of each CD is recorded. It shouldn't cause a big problem since, I think, more than one CD out of two has digital silence at the beginning and the end.
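
A sketch of that idea, assuming raw 16-bit stereo PCM for the whole disc; the function names and the rule for deciding when a drive is "safe" are illustrative assumptions, not part of AccurateRip:

```python
def silent_samples(pcm, bytes_per_sample=4):
    """Return (leading, trailing) counts of all-zero samples in the audio."""
    zero = bytes(bytes_per_sample)
    total = len(pcm) // bytes_per_sample

    def is_silent(i):
        return pcm[i * bytes_per_sample:(i + 1) * bytes_per_sample] == zero

    leading = 0
    while leading < total and is_silent(leading):
        leading += 1
    trailing = 0
    while trailing < total - leading and is_silent(total - 1 - trailing):
        trailing += 1
    return leading, trailing


def safe_without_overread(pcm, offset):
    """True if the samples a non-overreading drive would miss are all digital silence."""
    leading, trailing = silent_samples(pcm)
    return abs(offset) <= min(leading, trailing)
```

Taking the smaller of the two counts sidesteps the question of which end of the disc a given offset sign cuts into.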

Reply #15
Quote
What if I have Madonna Erotica cd, but there are a few releases: US, European and Japanese ones. Plus a few different sub releases by different manufacturers (like Columbia House here in Canada and US prints their own cds). So which CD "version" of the cd will be choosen as a correct one?

Maybe I'm slow, but I still don't see how you're going to deal with the problem in the quote above. I have over 500 CDs, but could only find 2 that matched the EAC reference database because of this problem (and their horrible taste in music).

Wouldn't you need a fairly large sample (to develop your reference point - your 22/100 in the example above) of each pressing for this to work?  It seemed to me that the offsets varied by pressing - though if not, this problem goes away.

Can it actually identify the specific release of the disc with the identifier you've described?  And if so, won't I end up having to try a ton of discs to find a few that match (that is, for which there is a large enough sample for the specific pressing I have)?  It wasn't fun going through the written list, but at least I could quickly spot the candidates - it would be unbearable to load 100s or even dozens of discs, wait for it to be analyzed, etc.

I think one of the problems with the EAC reference list is that it was rather random and idiosyncratic.

Reply #16
>Can it actually identify the specific release of the disc with the identifier you've described?

Yes

Early on with the system the limited number of discs is a pain, but as a comparison there are very few CDs these days that freedb does not list, and those have to be manually submitted - AccurateRip runs in the background automatically.

Reply #17
Quote
Early on with the system the limited discs are a pain, but as a comparison there are very few CDs these days that freedb does not list

Just to break apart the concepts:

FINDING DISCS FOR THE USER TO RUN THE ALGORITHM ON

Again, the problem is not the limited discs in the database; in fact, limits are good - better to have xx albums with xxx total pressings for which the answer is determined, than to have xxx albums with xxxx pressings, none of which has definitive data. 

Rather, the problem is the very large number of variations of discs in the world - and the user frustration that results from how many discs they will have to try to randomly match in order to determine their offset. 

However, perhaps there's a suggestion or two in here.  An intentional strategy to identify and target widely sold discs for data submission to use with offset determinations might help a lot.  Some ideas:
- Ask your volunteers to submit the most popular discs they think they own (in whatever genre). 
- THEN, once you have data, have the database produce a list of albums that you have good data for, so people can try those first.  Otherwise, users have to insert discs at random and hope for a match.
- If you are worried about the database filling up, it should be easy to come up with an algorithm to do any or all of (a) discard submissions except for the most frequently referenced xx albums in the freedb database (in each genre); (b) prune pressings that are never going to get a meaningful sample; or (c) once you've got a confidence level that the 22/100 are right, delete the 78.

Presumably freedb has an entirely different approach to finding a match than you do:
- Various pressings of the same CD will still match because  the # of tracks and track lengths can identify a specific album despite the many releases and pressings.  The level of accuracy and detail required for a successful unique match is much less - freedb's problem is quite error-tolerant. 
- AND - it presents the choices to a user who can usually discern the right one - though the bad duplicate records are an irritation.  Users won't be able to help confirm a match unless you collect the identifying #s for each CD.
I'm not sure the analogy is entirely relevant to the problem.

THE ALGORITHM

I'll take your word on how the data to identify a pressing can be precisely determined from data read by a large number of different drive models each with a different, unknown offset - without impractically large samples.  Presumably you've considered whether you have enough accurate data to compute the correct offset or to converge on the answer in a non-random fashion with a tolerable number of additional reads.

BTW, getting the offsets just right, at the very beginning, is hardly the most critical issue in the big picture. You are to be commended on adding secure ripping, and for almost all users it will be a big step up from what they use now (especially since so many people are still doing disc-to-disc on-the-fly copies!). Secure ripping with an easy-to-use interface (EAC is a barrier to too many) is a real contribution.

Reply #18
Thanks for your input,

Quote
Rather, the problem is the very large number of variations of discs in the world - and the user frustration that results from how many discs they will have to try to randomly match in order to determine their offset.


I had naively thought that people would, without looking at lists, just stick in the ten most popular discs from their collection, and once the database is decently populated the offset would be found.

Quote
Ask your volunteers to submit the most popular discs they think they own (in whatever genre).


Really I want people to submit not just their most popular discs, but all of them. dMC has an option called 'Test Conversion (no write)'; it takes only a small effort to whip all your CDs through that - in fact I am doing a bunch now...

Quote
(b) prune pressings that are never going to get a meaningful sample; or (c) once you've got a confidence level that the 22/100 are right, delete the 78


Remember that the database as the end user sees it is not all the submitted jumble, but a 'best worked out' database. There is code so that once a confidence reaches 200, the disc becomes closed and a super record of one entry is created.

Quote
BTW, getting the offsets just right, at the very beginning, is hardly the most critical issue in the big picture


I beg to differ on this one. As I am about to point out in the other Test #1 thread, now that results are coming back one hiccup has been found: one disc entered the database ripped with the wrong offset. I am still unsure why it happened; I suspect a CD drive threw a wobbler and was 1/4 of a CD sector out (it had ripped hundreds of discs correctly). Anyhow, such an entry could pollute the database if others found their offsets from this disc. So the rules of the system have been changed to only recognise discs for offset detection with a Confidence of two or more. You live and learn, but it means the database will be slower to populate.
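
The two thresholds mentioned in this post (a Confidence of two or more for offset detection, and closing a disc at a confidence of 200) could be summarised roughly as below. The function and key names are hypothetical, and treating any confidence of 1 or more as enough to verify a rip is an assumption drawn from the earlier posts.

```python
OFFSET_MIN_CONFIDENCE = 2   # discs need a Confidence of two or more for offset detection
CLOSE_AT_CONFIDENCE = 200   # at this confidence the disc is closed into one super record


def disc_status(confidence):
    """Describe what a database entry with the given confidence can be used for."""
    return {
        "verifies_rips": confidence >= 1,
        "usable_for_offset_detection": confidence >= OFFSET_MIN_CONFIDENCE,
        "closed": confidence >= CLOSE_AT_CONFIDENCE,
    }
```

A disc seen only once can still confirm a matching rip, but it no longer seeds anyone's drive offset, which is why the database will be slower to become useful for configuration.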