Exact Audio Copy database format?

Hi all!

I'm developing a website that recommends CDs to users, and right now I'm hacking away at the EAC database format, specifically the file 'cddb.dat'.

I currently have the following regexes, in a slightly "Pedantically Eclectic Rubbish-Lister" format:

Quote
Hex       := [\x00-\xff]
Delim     := \x00
Padding   := Delim{3,}
CDDB_ID   := Hex{4}
Header    := Hex{2} Delim{2} CDDB_ID Hex{3}
Artist    := Delim Text+
Album     := Delim Text+
Title     := Artist Album
Flag      := Delim{1,2} \x96 Delim{3}
More      := Delim Hex{3} Delim Hex{7} Delim (Unknown | Flag)
Track     := Text+ ((Hex{2} Delim{2}) | (Hex{3} Delim))
LastTrack := Text+
Genre     := Delim{2} Hex Delim Text+

CD_info   := Header Title More Track+ LastTrack Genre


What I hope someone can tell me is the format of the "Unknown" expression. This section appears to be able to hold quite a lot of data, so as a shortcut I'd like to hear from anyone who has an idea of how it is laid out.
Also, I suspect the three bytes following the CDDB_ID are some kind of composite value, in which case I expect I will have trouble figuring them out.

Any takers?

EDIT:
Oh well, I don't really need any more data than:
* CDDB ID
* artist
* album

so this suffices:
Code:
# basic grammar
my $delim     = '\x00';          # field delimiter (NUL byte)
my $special_1 = '[\x00-\x02]';   # byte that heads the artist/album fields
my $hex       = '[\x00-\xff]';   # any single byte
my $text      = '\p{IsPrint}';   # any printable character

# grammar-constructed expressions
my %patterns = (
    'head_of_cd_record'    => qr/$delim{2}($hex{4})$hex{3}/,       # $1 = cddb_id in little endian
    'head_of_artist_album' => qr/$special_1/,
    'artist_album'         => qr/($text+?)$delim($text+?)$delim/,  # $1 = artist, $2 = album
    'end_of_cd_record'     => qr/$delim{4,}/,
);


On a cddb.dat file containing 307 CDs, it extracts the data correctly for all but one entry. That one error may or may not be due to EAC itself, so I'm satisfied with the result.
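For anyone who wants to try this, here is a minimal sketch of how the pieces might be strung together. It is not the script that produced the numbers above: the combined pattern, the `parse_record` helper and the sample record are all assumptions made for illustration. The CDDB ID and the three bytes after it are copied from the hex dump further down; the artist and album are made-up ASCII.

```perl
use strict;
use warnings;

# Header, CDDB ID, three unknown bytes, marker byte, then the two
# NUL-terminated text fields, written out as one pattern.
my $record_re = qr/
    \x00{2}               # two NUL bytes in front of the ID
    ([\x00-\xff]{4})      # CDDB ID, least significant byte first
    [\x00-\xff]{3}        # three bytes of unknown meaning
    [\x00-\x02]           # marker byte in front of the artist
    (\p{IsPrint}+?) \x00  # artist, NUL-terminated
    (\p{IsPrint}+?) \x00  # album, NUL-terminated
/x;

# Hypothetical helper: returns (id, artist, album) for the first record
# found in $buf, or an empty list if nothing matches.
sub parse_record {
    my ($buf) = @_;
    return unless $buf =~ $record_re;
    my ($raw_id, $artist, $album) = ($1, $2, $3);
    # 'V' unpacks a 32-bit little-endian unsigned integer.
    my $id = sprintf '%08x', unpack 'V', $raw_id;
    return ($id, $artist, $album);
}

# Synthetic record (an assumption, not real cddb.dat content).
my $sample = "\x00\x00" . "\x0a\x9c\x0a\x82" . "\x4f\xda\xc8" . "\x00"
           . "Some Artist\x00Some Album\x00" . "\x00" x 4;

printf "%s | %s | %s\n", parse_record($sample);
```

The `unpack 'V'` step is what turns the little-endian bytes `0a 9c 0a 82` into the usual eight-digit hex form of the ID, `820a9c0a`.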

The entry that fails, in case anyone wonders, is:
Code:
0000de0: 0000 0000 0000 0000 0000 0000 0000 f200  ................
0000df0: 0000 0a9c 0a82 4fda c800 b45e acc2 00a5  ......O....^....
0000e00: 7db3 be00 0a0b 0a00 ffff ffff 4d1c 0300  }...........M...
0000e10: 0096 0000 00a5 7db3 be00 0070 7e00 00b3  ......}....p~...
0000e20: 73a6 5ebb f5b3 a3a4 a3b5 b9a7 dab6 dc3f  s.^............?
0000e30: 0000 95c7 0000 b367 a4df 0000 4613 0100  .......g....F...
0000e40: b6fa a7aa 0000 3344 0100 a670 b9da aaec  ......3D...p....
0000e50: bff4 0000 2f8c 0100 b25c bff4 0000 22aa  ..../....\....".
0000e60: 0100 a677 bca2 0000 78fc 0100 a4d1 a8cf  ...w....x.......
0000e70: 0000 b548 0200 b750 c1c2 a741 a5ce a4df  ...H...P...A....
0000e80: b752 a7da 0000 f0e2 0200 bc5a b8a8 0000  .R.........Z....
0000e90: 0200 0000 0000 0000 0000 0000 0000 0000  ................


which gives the following (probably illegible) output:
Code:
´^¬Â    ¥}³¾
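
One possible explanation, offered as a guess rather than a confirmed diagnosis: the artist and album bytes in the dump (`b4 5e ac c2` and `a5 7d b3 be`) have the shape of a double-byte encoding such as Big5, in which case the extraction itself is fine and only the display is wrong. The sketch below tests that assumption with Perl's core `Encode` module; the `big5_to_chars` helper is a hypothetical name, and the choice of `big5-eten` is an assumption.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: interpret raw bytes as Big5 and return a character
# string. FB_CROAK makes decode() die on bytes that are not valid Big5,
# LEAVE_SRC keeps it from modifying its input.
sub big5_to_chars {
    my ($bytes) = @_;
    return decode('big5-eten', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Artist and album bytes copied from the dump lines at 0000df0 and 0000e00.
my $artist = big5_to_chars("\xb4\x5e\xac\xc2");
my $album  = big5_to_chars("\xa5\x7d\xb3\xbe");

# Emit UTF-8 so a modern terminal can show the decoded characters.
binmode STDOUT, ':encoding(UTF-8)';
print "$artist / $album\n";
```

If the guess is wrong, `decode` dies instead of printing, which at least rules Big5 out.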