Re: [linux-audio-dev] mp3 mess help


Subject: Re: [linux-audio-dev] mp3 mess help
From: Frank NEUMANN (frank.neumann_AT_st.com)
Date: Wed Oct 27 2004 - 12:36:23 EEST


Hi list,
Jens M Andreasen <jens.andreasen_AT_chello.se> wrote:

[..]
> > I have a full partition with them in it but it is obvious most of them
> > are there multiple times with different non-descriptive names.. ouch :(
> >
> >
> > Is there a way I can search the mp3s to find which are the same/different
> > using the actual mp3 binary data??
> >
>
> Short version:
>
> 1) Sort them (by binary content.)
>
> 2) Delete duplicates.
>
> 3) ...
>
> 4) Profit! :)
>
>
> I would probably only keep the path to the mp3 in the sorted structure,
> and then open (and close) them for comparison as needed.
>
> 'man qsort' is your friend.

My suggestion would be like this (if we are really talking about byte-by-byte
identical files):

find <path_to_mp3_directory> -iname "*.mp3" -exec md5sum {} \; | sort >log.txt

This will give you a logfile, "log.txt", listing all files sorted by md5
checksum, _including_ duplicates. Whenever two consecutive lines carry the
same md5 sum, those files are duplicates of each other.

There are probably geekier ways, using awk etc., to actually print out
the names of the duplicates, but others will have to continue from here.
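
One possible continuation, only as a rough sketch: the function below wraps
the same find/md5sum/sort pipeline and lets awk compare each checksum with
the one on the previous line, printing every group of identical files
followed by a blank line. The name "list_duplicates" is made up for this
example. It assumes the usual md5sum output layout (32 hex digits, two
separator characters, then the path) and file names without newlines or
backslashes.

```shell
# Print groups of byte-identical mp3 files found below directory $1.
# Assumes md5sum lines look like: <32 hex digits><2 chars><path>
list_duplicates() {
    find "$1" -type f -iname "*.mp3" -exec md5sum {} + |
        sort |
        awk '{
            sum  = substr($0, 1, 32)   # the checksum
            name = substr($0, 35)      # the path (may contain spaces)
            if (sum == prev_sum) {
                # Same checksum as the previous line: a duplicate.
                if (!in_group) { print prev_name; in_group = 1 }
                print name
            } else {
                # New checksum: close the previous group, if any.
                if (in_group) print ""
                in_group = 0
            }
            prev_sum = sum
            prev_name = name
        }'
}
```

Called as "list_duplicates <path_to_mp3_directory>", it only prints names
and deletes nothing, so the list can be checked by hand before removing
anything.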

Greetings,
Frank



This archive was generated by hypermail 2b28 : Wed Oct 27 2004 - 12:42:28 EEST