Topic: Comparing Two Music Collections Via Tcp/ip

Comparing Two Music Collections Via Tcp/ip

Let's say we have two computers on the Internet. Both have a lot of music on them. I want to compare the collections and get a list of the music that is identical on both.

1. Assume that the files are called something different on each computer
2. Assume that the files have been tagged/replaygained differently on each computer

With 1 alone I could just run md5sum on both computers and compare the values. It would be rather fast.
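A minimal sketch of that whole-file comparison, assuming each computer can run Python and send its index to the other side somehow (the transport is left out). The function names `file_md5`, `collection_index` and `common_checksums` are made up for illustration:

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file's full contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collection_index(root):
    """Map each checksum to the list of paths under root that have it."""
    index = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            index.setdefault(file_md5(path), []).append(str(path))
    return index


def common_checksums(index_a, index_b):
    """Checksums present in both indexes, whatever the files are called."""
    return set(index_a) & set(index_b)
```

Only the checksums need to travel over the network, so file names never enter the comparison. This covers assumption 1 only; any difference in tags changes the whole-file checksum.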

But 2 requires that I leave the tags/replaygain info out of the checksum. This could be done by:
- copying the file
- stripping the tags
- setting replaygain info to zero
- calculating the md5sum
- deleting the file

This would indeed be slow. So I would need a lookup table mapping md5sum/file => md5sum/audio_data
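One way such a lookup table could look, as a rough sketch. It assumes the table is kept as a JSON file and that the slow strip-and-sum step is supplied by the caller as `compute_audio_md5`; that parameter and the helper names are hypothetical, not part of any existing tool:

```python
import hashlib
import json
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file's full contents, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_table(table_path):
    """Load the md5sum/file => md5sum/audio_data table, or start empty."""
    if os.path.exists(table_path):
        with open(table_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    return {}


def save_table(table, table_path):
    """Write the table back to disk as JSON."""
    with open(table_path, "w", encoding="utf-8") as handle:
        json.dump(table, handle)


def cached_audio_md5(path, table, compute_audio_md5):
    """Return the audio checksum, running the slow step only on a miss."""
    key = file_md5(path)
    if key not in table:
        # compute_audio_md5 is the caller's slow strip-tags-and-sum routine.
        table[key] = compute_audio_md5(path)
    return table[key]
```

The whole-file checksum still has to be computed on every run, but the copy/strip/sum steps happen once per distinct file. Retagging a file changes its whole-file checksum, so that file simply counts as a miss and gets processed again.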

Any ideas/comments?

Reply #1
Quote
But 2 requires that I leave the tags/replaygain info out of the checksum. This could be done by:
- copying the file
- stripping the tags
- setting replaygain info to zero
- calculating the md5sum
- deleting the file

This would indeed be slow. So I would need a lookup table mapping md5sum/file => md5sum/audio_data

Any ideas/comments?

I guess with a bit of knowledge about the different stream formats you could write an md5sum calculator that ignores everything but the actual audio data. That shouldn't be difficult, as those formats are mostly well documented. And it would be much quicker than first stripping the tags and then summing.
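As a rough illustration of that idea for MP3 only, the sketch below hashes the bytes between a leading ID3v2 tag and a trailing ID3v1 tag. It assumes those are the only tags present: APEv2 or Lyrics3 tags are not handled, and neither is replaygain that was applied by changing the audio frames themselves. The function names are made up for the example:

```python
import hashlib


def id3v2_size(header):
    """Total byte length of a leading ID3v2 tag, or 0 if none is present."""
    if len(header) < 10 or header[:3] != b"ID3":
        return 0
    # The tag size is a "syncsafe" integer: 7 useful bits in each of 4 bytes.
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)
    # Flag bit 0x10 signals an extra 10-byte footer after the tag body.
    footer = 10 if header[5] & 0x10 else 0
    return 10 + size + footer


def mp3_audio_md5(path):
    """Hex MD5 of an MP3 file with ID3v2 (front) and ID3v1 (end) skipped."""
    with open(path, "rb") as handle:
        data = handle.read()
    start = id3v2_size(data[:10])
    end = len(data)
    # An ID3v1 tag is exactly 128 bytes at the end and starts with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return hashlib.md5(data[start:end]).hexdigest()
```

The file is read once and nothing is copied or modified on disk, which is where the speedup over strip-then-sum would come from. Other formats would each need their own routine.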

Reply #2
But what if they were encoded from different rips?