Ripping CDs in Linux (Tested in Fedora 20)
I probably should start a blog or something. I'm just hoping someone finds this useful. Sorry it is so long.
This is long and pedantic, but the script makes it easy for me to be pedantic.
There is an equivalent of EAC for Linux, written in Python, I believe by one of the GStreamer developers.
I haven't used it in several years because it is fracking slow.
I use cdparanoia because it is faster and still usually gets a bit-for-bit good copy.
The only CD I've ever had a problem with using cdparanoia was a Shakira CD, and EAC on Windows didn't do any better. It had intentionally flawed bits as a copy protection measure, so I ended up pirating it (I owned the CD so...)
This script uses the following CLI utilities that should be readily available in any modern Linux distribution:
EXECUTABLE (PACKAGE VERSION)
cd-info (libcdio 0.90)
bc (bc 1.06.95)
cdparanoia (cdparanoia 10.2)
sox (sox 14.4.1)
flac (flac 1.3.0)
shnsplit (shntool 3.0.10)
It rips and flac encodes in /tmp
I do this because I use tmpfs for /tmp and thus it is all done in memory and no disk writing is needed.
But don't use tmpfs for /tmp if you don't have a decent amount of memory, or you'll run out of space. tmpfs usually is configured to only use up to half of available system memory, and there are other things using tmpfs as well.
If you have 4GB or less and are using tmpfs for /tmp then change the line that says
TMP=`mktemp -d --tmpdir=/tmp paranoid.XXXXX`
to use something on a hard drive (like maybe --tmpdir=/home/username)
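Rather than editing that line by hand, the choice could also be made at run time. This is only a minimal sketch: choose_tmp_base is a hypothetical helper name, the 3 GiB threshold is an illustrative guess rather than a measured requirement, and it only looks at the free space df reports for /tmp, not at whether /tmp really is tmpfs.

```shell
# choose_tmp_base MIN_KB FALLBACK   (hypothetical helper, not part of cdrip.sh)
# Prints /tmp if it has at least MIN_KB kilobytes free, otherwise FALLBACK.
choose_tmp_base() (
    min_kb="$1"
    fallback="$2"
    # df -Pk gives POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail=$(df -Pk /tmp 2>/dev/null | awk 'NR==2 {print $4}')
    if [ "${avail:-0}" -ge "${min_kb}" ]; then
        echo "/tmp"
    else
        echo "${fallback}"
    fi
)

# 3 GiB is an assumption; adjust it to what a full disc needs on your system.
BASE=$(choose_tmp_base $((3 * 1024 * 1024)) "${HOME:-/var/tmp}")
echo "would rip under: ${BASE}"
```

The mktemp line in the script would then use --tmpdir="${BASE}" instead of --tmpdir=/tmp.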
The script initially rips to a single file. I then do something controversial and re-sample to 48kHz.
Some here have expressed that this is somewhat silly and perhaps they are right, but I'm trying to keep everything 48kHz. I won't argue my reasons, but I do flac encode the 44.1kHz audio as ripped for archival purposes.
The script splits the re-sampled WAV using the cue information from the CDROM.
Since cdparanoia starts ripping from the beginning of the first track which often is not the beginning of the disc, the cue sheet has to be changed to split it.
I subtract the value of the first entry from all the entries, and the math I use for that works at frame resolution (MM:SS:NN, where NN counts frames of 1/75 second).
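The same subtraction can be sketched by converting each entry to a total frame count first, which avoids the borrow logic. This is an illustration only, not what the script below does (it uses bc and borrows field by field); all helper names are hypothetical.

```shell
# strip_zeros N: drop leading zeros so the shell does not read "08" as octal.
strip_zeros() (
    n="${1#"${1%%[!0]*}"}"
    echo "${n:-0}"
)

# msf_to_frames MM:SS:NN -> total number of 1/75 second frames
msf_to_frames() (
    m=$(strip_zeros "${1%%:*}")
    rest="${1#*:}"
    s=$(strip_zeros "${rest%%:*}")
    f=$(strip_zeros "${1##*:}")
    echo $(( (m * 60 + s) * 75 + f ))
)

# frames_to_msf N -> MM:SS:NN, zero padded (4500 frames per minute)
frames_to_msf() (
    printf '%02d:%02d:%02d\n' $(( $1 / 4500 )) $(( ($1 / 75) % 60 )) $(( $1 % 75 ))
)

# rebase_msf ENTRY FIRST -> ENTRY minus FIRST
rebase_msf() (
    frames_to_msf $(( $(msf_to_frames "$1") - $(msf_to_frames "$2") ))
)

rebase_msf 00:02:33 00:02:33   # prints 00:00:00
rebase_msf 04:18:32 00:02:33   # prints 04:15:74
```

The second call shows the borrow case: 32 frames minus 33 frames has to take a second from the SS field.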
Then since shnsplit won't split a 48 kHz WAV using frames, even though 48000 divides evenly by 75, the frames are converted to thousandths of a second. I do this using bc, dividing the number of frames by 75 with three decimal places. It always rounds down, but we are talking less than 1 millisecond.
I hope shntool is patched to allow MM:SS:NN, but until it is, there will usually be a fraction of a frame difference in where a track is split because we are converting from base 75 to decimal. We are talking less than a thousandth of a second difference, and I doubt it ever makes a difference that can be audibly noticed.
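The conversion itself is small enough to show on its own. A minimal sketch using shell integer arithmetic in place of bc (msf_to_split_point and strip_zeros are hypothetical names); integer division truncates the same way bc does with scale=3.

```shell
# strip_zeros N: drop leading zeros so the shell does not read "08" as octal.
strip_zeros() (
    n="${1#"${1%%[!0]*}"}"
    echo "${n:-0}"
)

# msf_to_split_point MM:SS:NN -> MM:SS.mmm
msf_to_split_point() (
    mm="${1%%:*}"
    rest="${1#*:}"
    ss="${rest%%:*}"
    nn=$(strip_zeros "${1##*:}")
    # One frame is 1/75 s, so NN frames are NN * 1000 / 75 milliseconds.
    # The division truncates, it never rounds up.
    ms=$(( nn * 1000 / 75 ))
    printf '%s:%s.%03d\n' "${mm}" "${ss}" "${ms}"
)

msf_to_split_point 00:00:00   # prints 00:00.000
msf_to_split_point 03:22:37   # prints 03:22.493
msf_to_split_point 04:15:74   # prints 04:15.986
```

Since NN * 1000 / 75 is NN * 40 / 3, the part that gets truncated is always 0, 1/3 or 2/3 of a millisecond, which at 48 kHz is at most 32 samples.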
After it splits the tracks, it attempts to make stubs for tagging.
The script does not do any tagging other than the tags added by flac on encode (replay-gain)
One stub it creates is called metaMaster.txt.
The tags it puts in that file are almost all empty:
ALBUMARTIST=
ALBUM=
GENRE=
DATE=
RECORDED=
REMASTER=
CDUNIVERSE=
I personally use the original release date for DATE because when I play Caress of Steel, I want 1975 displayed, not the 1997 remaster date. REMASTER is where I put that date. I only need it in the event I want to check what version of an album a song is from.
The CDUNIVERSE tag is also a personal tag. I plan to use it with an HTML5 icecast client to make it easy for a listener who wants the album to have a link where they can get it.
It also adds a RIPPED tag including the date the media was ripped. Again I use this for personal information.
If the number of created tracks matches the number of ISRC codes, the script then creates stubs for individual files and adds a TOTALTRACKS tag filled in to the metaMaster.txt file. My understanding is that when ISRC info is not there, cd-info will give a value filled with 0s so in theory the script *should* always create the track specific stubs but I am not positive.
The stubs for individual files contain an empty ARTIST and TITLE tag, and the filled in ISRC and TRACKNUMBER tag.
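For reference, creating one such stub boils down to something like the following. This is a sketch with a hypothetical helper name (write_track_stub); the script below does the same thing inline with echo, and the track number is assumed to come in without a leading zero.

```shell
# write_track_stub NUM ISRC [DIR]   (hypothetical helper)
# Writes DIR/metaNN.txt with empty ARTIST/TITLE and filled TRACKNUMBER/ISRC,
# then prints the name of the file it wrote.
write_track_stub() (
    num="$1"
    isrc="$2"
    dir="${3:-.}"
    # The file name is zero padded, the TRACKNUMBER tag is not.
    file=$(printf '%s/meta%02d.txt' "${dir}" "${num}")
    {
        echo "ARTIST="
        echo "TITLE="
        echo "TRACKNUMBER=${num}"
        echo "ISRC=${isrc}"
    } > "${file}"
    echo "${file}"
)

# Demo in a scratch directory so nothing real is overwritten.
demo_dir=$(mktemp -d)
stub=$(write_track_stub 5 USMR17500048 "${demo_dir}")
cat "${stub}"
```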
Finally, the script creates a shell script that makes it easy to add tags once you complete the stubs and are ready to tag the flac files.
That generated script is not run automatically because initially the tag information isn't there.
It uses metaflac to load the track data first. That way if, say, the GENRE for the album is Rock but one of the songs is Classical Guitar, you can put GENRE=Classical Guitar into the track specific file and that is what media players that display the first GENRE tag will show.
After the generated metadata script adds the track specific metadata, it adds the tags from the metaMaster.txt file that are common tags for all the tracks.
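The effect of that ordering can be checked on the text stubs before touching any flac file. The sketch below does not call metaflac at all; it just reads the stub files in the same order the generated script imports them and reports the first value it finds. first_tag is a hypothetical helper and the tag values are made up for the demo.

```shell
# first_tag NAME FILE...   (hypothetical helper)
# Prints the value of the first NAME= line when the files are read in order.
first_tag() (
    name="$1"
    shift
    cat "$@" | sed -n "s/^${name}=//p" | head -n 1
)

# Demo stubs with invented values, in a scratch directory.
demo_dir=$(mktemp -d)
printf 'ARTIST=Some Artist\nGENRE=Classical Guitar\n' > "${demo_dir}/meta03.txt"
printf 'ALBUM=Some Album\nGENRE=Rock\n' > "${demo_dir}/metaMaster.txt"

# Track stub first, master second, the same order the tags are imported in.
first_tag GENRE "${demo_dir}/meta03.txt" "${demo_dir}/metaMaster.txt"
first_tag ALBUM "${demo_dir}/meta03.txt" "${demo_dir}/metaMaster.txt"
```

The first call prints Classical Guitar because the track stub is read first; the second prints Some Album because only the master file has that tag.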
Here is an example of a track specific stub I filled in:
ARTIST=Rush
TITLE=The Fountain Of Lamneth
TRACKNUMBER=5
ISRC=USMR17500048
CHAPTER001=00:00:00.000
CHAPTER001NAME=I. In The Valley
CHAPTER002=00:04:16.500
CHAPTER002NAME=II. Didacts And Narpets
CHAPTER003=00:05:17.500
CHAPTER003NAME=III. No One At The Bridge
CHAPTER004=00:09:37.000
CHAPTER004NAME=IV. Panacea
CHAPTER005=00:12:53.000
CHAPTER005NAME=V. Plateau
CHAPTER006=00:16:08.000
CHAPTER006NAME=VI. The Fountain
I filled in ARTIST and TITLE; the script had already filled in TRACKNUMBER and ISRC for me. I also added the CHAPTER* tags, which are not currently supported in any player I use but hopefully will be at some point.
When the ripping script is finished ripping and creating the various files, it puts them all in a tarball in the directory the script was called from and removes the temporary directory it ripped in.
I put it in a tarball because when I'm ready to tag the files, if I frack things up too badly I still have the tarball to start over from.
I call the script cdrip.sh, put it in ~/bin/ and make it executable.
To then use it, put a CD in the tray and run
cdrip.sh "Rush - Caress of Steel"
While it can handle spaces (but you must then use quotes as above), I would suggest the argument be sane characters that don't have special meaning in shell. Avoid things like $ and !. Stick to alpha-numeric, -, _ and space. You can put special characters in the tags after ripping, but the argument dictates the file name and you don't want those in file names anyway. Perhaps I should update it to change anything that is not a sane character to an underscore, but I wrote the script for me and I know to use sane characters.
The script will do its thing and, in the above example, create the result as Rush_-_Caress_of_Steel.tar
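If you do want the argument cleaned up automatically, something along these lines would do it. This is a minimal sketch, not part of the script: sanitize_name is a hypothetical helper, and it turns every byte outside A-Z, a-z, 0-9, - and _ into an underscore (spaces included, which matches what the script already does with spaces).

```shell
# sanitize_name STRING   (hypothetical helper)
# Replaces every character that is not alpha-numeric, - or _ with an underscore.
sanitize_name() (
    # LC_ALL=C keeps the ranges limited to plain ASCII letters and digits.
    clean=$(printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9_-' '_')
    printf '%s\n' "${clean}"
)

sanitize_name 'Rush - Caress of Steel'   # prints Rush_-_Caress_of_Steel
sanitize_name 'AC/DC $!'                 # prints AC_DC___
```

In the script, the DISK= line would then become DISK=$(sanitize_name "$1").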
-=-=-=-=-
You will need to make some modifications.
If your CDROM drive is not /dev/sr0 then you need to change the line that says
DEVICE="/dev/sr0"
You also need to set the offset for your drive model.
DRIVE_OFFSET="667"
is where you set that.
You can find out your drive model using the cd-info command, e.g. cd-info /dev/sr0.
Then find your offset at http://www.accuraterip.com/driveoffsets.htm
You may want to make other modifications.
If you prefer the generated tagging script not encode to opus or you want different opus options, obviously modify that part.
If you prefer not to re-sample to 48000 then obviously change the line that says
sox --norm cdda.wav -b 16 ${DISK}.wav rate 48000 dither -s
to
mv cdda.wav ${DISK}.wav
and remove the
flac --best --replay-gain cdda.wav -o ${DISK}-as_ripped.flac
command and finally change the line that reads
cat split.txt |shnsplit ${DISK}.wav
to
cat fixed.NN |shnsplit ${DISK}.wav
That will result in splitting based on the frame information, which shnsplit can do since the WAV is 44.1 kHz.
One final note - when you run the command, as it gets towards the end of the CDROM during the rip you may get a bunch of warnings about invalid SCSI requests. According to the cdparanoia man page, this is normal when using the -O switch.
-=-
Here is the script:
#!/bin/bash
#~/bin/cdrip.sh
#The offset is specific to model of drive
DEVICE="/dev/sr0"
DRIVE_OFFSET="667"
#Good chance the drive will bitch at you about SCSI errors
# this is mentioned in cdparanoia man page and is correct
DISK=`echo "$1" |sed -e s?" "?"_"?g`
CWD="`pwd`"
TMP=`mktemp -d --tmpdir=/tmp paranoid.XXXXX`
pushd ${TMP}
cd-info ${DEVICE} > cdinfo.txt
grep "ISRC:" cdinfo.txt > ISRC.txt
N=`grep -n "^Media Catalog" cdinfo.txt |head -1 |cut -d":" -f1`
N=`echo "${N} -1" |bc`
head -${N} cdinfo.txt > tmp.txt
M=`grep -n "^CD-ROM" tmp.txt |tail -1 |cut -d":" -f1`
T=`echo "${N} - ${M}" |bc`
tail -${T} tmp.txt \
|grep " audio " \
|sed -e s?"^[ \t]*"?""? \
|cut -d" " -f2 > cue.NN
rNN=`head -1 cue.NN |cut -d":" -f3`
rSS=`head -1 cue.NN |cut -d":" -f2`
rMM=`head -1 cue.NN |cut -d":" -f1`
#make them integers
rNN=`echo "${rNN} + 0" |bc`
rSS=`echo "${rSS} + 0" |bc`
rMM=`echo "${rMM} + 0" |bc`
#rebase every cue entry so the first track starts at 00:00:00
cat cue.NN |while read line; do
  NN=`echo "${line}" |cut -d":" -f3`
  SS=`echo "${line}" |cut -d":" -f2`
  MM=`echo "${line}" |cut -d":" -f1`
  #make them integers
  NN=`echo "${NN} + 0" |bc`
  SS=`echo "${SS} + 0" |bc`
  MM=`echo "${MM} + 0" |bc`
  #subtract the frames, borrowing a second (75 frames) if needed
  if [ ${rNN} -le ${NN} ]; then
    NN=`echo "${NN} - ${rNN}" |bc`
  else
    NN=`echo "75 + ${NN}" |bc`
    NN=`echo "${NN} - ${rNN}" |bc`
    if [ ${SS} -gt 0 ]; then
      SS=`echo "${SS} - 1" |bc`
    else
      SS=59
      MM=`echo "${MM} - 1" |bc`
    fi
  fi
  #subtract the seconds, borrowing a minute if needed
  if [ ${rSS} -le ${SS} ]; then
    SS=`echo "${SS} - ${rSS}" |bc`
  else
    SS=`echo "60 + ${SS}" |bc`
    SS=`echo "${SS} - ${rSS}" |bc`
    MM=`echo "${MM} - 1" |bc`
  fi
  MM=`echo "${MM} - ${rMM}" |bc`
  #fix format
  if [ ${NN} -lt 10 ]; then
    if [ ${NN} -eq 0 ]; then
      NN="00"
    else
      NN="0${NN}"
    fi
  fi
  if [ ${SS} -lt 10 ]; then
    if [ ${SS} -eq 0 ]; then
      SS="00"
    else
      SS="0${SS}"
    fi
  fi
  if [ ${MM} -lt 10 ]; then
    if [ ${MM} -eq 0 ]; then
      MM="00"
    else
      MM="0${MM}"
    fi
  fi
  echo "${MM}:${SS}:${NN}" >> fixed.NN
done
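#convert the frames to thousandths of a second for shnsplit
#the first entry is the start of the file, not a split point, so skip it
SKIP=0
cat fixed.NN |while read line; do
  MM=`echo "${line}" |cut -d":" -f1`
  SS=`echo "${line}" |cut -d":" -f2`
  NN=`echo "${line}" |cut -d":" -f3`
  sss=`echo "scale=3;${NN} / 75" |bc`
  #bc prints a bare 0 for zero frames
  if [ "${sss}" == "0" ]; then
    sss=".000"
  fi
  if [ ${SKIP} -gt 0 ]; then
    echo "${MM}:${SS}${sss}" >> split.txt
  else
    SKIP=`echo "${SKIP} + 1" |bc`
  fi
done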
cdparanoia -d ${DEVICE} -O ${DRIVE_OFFSET} -w 1-
if [ $? != 0 ]; then
  #rip failed, do not leave the temporary directory behind
  popd
  rm -rf ${TMP}
  exit 1
fi
sox --norm cdda.wav -b 16 ${DISK}.wav rate 48000 dither -s
flac --best --replay-gain cdda.wav -o ${DISK}-as_ripped.flac
flac --best --replay-gain ${DISK}.wav
cat split.txt |shnsplit ${DISK}.wav
flac --best --replay-gain split-track*
rm -f *.wav
#start the meta-data tracks if possible
cat <<EOF > metaMaster.txt
ALBUMARTIST=
ALBUM=
GENRE=
DATE=
RECORDED=
REMASTER=
CDUNIVERSE=
EOF
RIPPED="`date +%Y-%m-%d`"
echo "RIPPED=${RIPPED}" >> metaMaster.txt
TRACKS=`ls split-track* |wc -l`
ISRCN=`cat ISRC.txt |wc -l`
#only create the track stubs when every track has an ISRC line
if [ ${TRACKS} -eq ${ISRCN} ]; then
  echo "TOTALTRACKS=${TRACKS}" >> metaMaster.txt
  cat ISRC.txt |while read line; do
    ISRC=`echo "${line}" |cut -d ":" -f2 |sed -e s?" "?""?`
    NUM=`echo "${line}" |cut -d":" -f1 |sed -e s?" "?":"? |sed -e s?" "?":"?g |cut -d ":" -f2`
    #zero pad the track number for the file name only
    MNUM=${NUM}
    if [ ${MNUM} -lt 10 ]; then
      MNUM="0${NUM}"
    fi
    echo "ARTIST=" > meta${MNUM}.txt
    echo "TITLE=" >> meta${MNUM}.txt
    echo "TRACKNUMBER=${NUM}" >> meta${MNUM}.txt
    echo "ISRC=${ISRC}" >> meta${MNUM}.txt
  done
fi
cat <<EOF > loadMetadata.sh
#!/bin/bash
# CAUTION - Fill in the track stubs and metaMaster data first
# only count the flac files so a second run ignores the opus files
N=\`ls ${DISK}-track*.flac |wc -l\`
COUNT=0
# track specific tags go in first
while [ \${COUNT} -lt \${N} ]; do
  NN=\`echo "\${COUNT} + 1" |bc\`
  if [ \${NN} -lt 10 ]; then
    NN="0\${NN}"
  fi
  metaflac --import-tags-from=meta\${NN}.txt ${DISK}-track\${NN}.flac
  COUNT=\`echo "\${COUNT} + 1" |bc\`
done
# then the tags common to all the tracks
ls *.flac |while read flac; do
  metaflac --import-tags-from=metaMaster.txt \${flac}
done
#encode to opus
ls ${DISK}-track*.flac |while read flac; do
  opus=\`echo \${flac} |sed -e s?"\.flac$"?".opus"?\`
  opusenc --bitrate 64 \${flac} \${opus}
done
EOF
ls split-track* |while read line; do
  N="`echo ${line} |sed -e s?"^split-track"?""? |sed -e s?"\.flac$"?""?`"
  mv split-track${N}.flac ${DISK}-track${N}.flac
done
mkdir ${DISK}
mv *.flac ${DISK}/
mv ISRC.txt ${DISK}/
mv split.txt ${DISK}/
mv cdinfo.txt ${DISK}/
mv meta* ${DISK}/
mv loadMetadata.sh ${DISK}/
tar -cf ${DISK}.tar ${DISK}
mv ${DISK}.tar ${CWD}/
popd
sync && sync
rm -rf ${TMP}