HydrogenAudio

Lossless Audio Compression => FLAC => Topic started by: leo-bogert on 2012-10-28 17:03:47

Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-10-28 17:03:47
This is an encoder script for EAC "Test & copy image" mode.
It splits the WAV image to WAV singletracks according to the EAC CUE and encodes them to FLACs.
Its goal is to produce perfect FLAC files by verifying checksums in all steps and being totally paranoid.
For guaranteeing this, it does the following checks and aborts if any of them fails:
Additional features are:
Website: https://github.com/leo-bogert/perfect-flac-encode

Sample output:
Code:

perfect-flac-encode.sh Version BETA2 running ...
BETA VERSION - NOT for productive use!


Album: Paul Weller - Wild Wood
Checking EAC LOG for whether AccurateRip reports a perfect rip...
AccurateRip reports a perfect rip.
Splitting WAV image to singletrack WAVs...
shntool [split]: warning: discarding initial zero-valued split point
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/01 - Sunflower.wav] (4:06.62) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/02 - Can you heal us (holy man).wav] (3:41.68) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/03 - Wild wood.wav] (3:22.30) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/04 - Instrumental one (Part 1).wav] (1:37.05) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/05 - All the pictures on the wall.wav] (3:56.47) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/06 - Has my fire really gone out.wav] (3:51.38) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/07 - Country.wav] (3:38.60) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/08 - Instrumental two.wav] (0:49.52) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/09 - 5th season.wav] (4:54.00) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/10 - The weaver.wav] (3:43.08) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/11 - Instrumental one (Part 2).wav] (0:33.42) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/12 - Foot of the mountain.wav] (3:37.60) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/13 - Shadow of the sun.wav] (7:36.43) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/14 - Holy man (Reprise).wav] (1:50.72) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/15 - Moon on your pyjamas.wav] (4:02.00) : .......... OK
Splitting [Paul Weller - Wild Wood.wav] (54:04.55) --> [Stage1_WAV_Singletracks_From_WAV_Image/16 - Hung up.wav] (2:41.68) : .......... OK
Comparing AccurateRip checksums of split WAV singletracks to AccurateRip checksums from EAC LOG...
AccurateRip checksum of track 01: 12A17108, expected 12A17108. OK.
AccurateRip checksum of track 02: B8B12553, expected B8B12553. OK.
AccurateRip checksum of track 03: EE647F8C, expected EE647F8C. OK.
AccurateRip checksum of track 04: 198D1B89, expected 198D1B89. OK.
AccurateRip checksum of track 05: 443F683F, expected 443F683F. OK.
AccurateRip checksum of track 06: D4EA2156, expected D4EA2156. OK.
AccurateRip checksum of track 07: 23A1013A, expected 23A1013A. OK.
AccurateRip checksum of track 08: 59846ADB, expected 59846ADB. OK.
AccurateRip checksum of track 09: 562695D4, expected 562695D4. OK.
AccurateRip checksum of track 10: F8F465AC, expected F8F465AC. OK.
AccurateRip checksum of track 11: 779AA114, expected 779AA114. OK.
AccurateRip checksum of track 12: 826FDC3D, expected 826FDC3D. OK.
AccurateRip checksum of track 13: 08E26E72, expected 08E26E72. OK.
AccurateRip checksum of track 14: BB90680D, expected BB90680D. OK.
AccurateRip checksum of track 15: 37AAC77A, expected 37AAC77A. OK.
AccurateRip checksum of track 16: 1A3CD493, expected 1A3CD493. OK.
Joining singletrack WAV temporarily for comparing their checksum with the original image's checksum...
Joining [Stage1_WAV_Singletracks_From_WAV_Image/01 - Sunflower.wav] (4:06.62) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/02 - Can you heal us (holy man).wav] (3:41.68) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/03 - Wild wood.wav] (3:22.30) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/04 - Instrumental one (Part 1).wav] (1:37.05) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/05 - All the pictures on the wall.wav] (3:56.47) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/06 - Has my fire really gone out.wav] (3:51.38) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/07 - Country.wav] (3:38.60) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/08 - Instrumental two.wav] (0:49.52) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/09 - 5th season.wav] (4:54.00) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/10 - The weaver.wav] (3:43.08) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/11 - Instrumental one (Part 2).wav] (0:33.42) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/12 - Foot of the mountain.wav] (3:37.60) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/13 - Shadow of the sun.wav] (7:36.43) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/14 - Holy man (Reprise).wav] (1:50.72) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/15 - Moon on your pyjamas.wav] (4:02.00) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
Joining [Stage1_WAV_Singletracks_From_WAV_Image/16 - Hung up.wav] (2:41.68) --> [Stage2_WAV_Image_Joined_From_WAV_Singletracks/joined.wav] (54:04.55) : .......... OK
No padding needed.
Computing checksums...
Original checksum: e47804481da6959636de7fbad147e3f2b2131c682aa54f07a2405d2b27aa7bdf
Checksum of joined image: e47804481da6959636de7fbad147e3f2b2131c682aa54f07a2405d2b27aa7bdf
Checksum of joined image OK.
Encoding singletrack WAVs to FLAC ...
NOTE: --keep-foreign-metadata is a new feature; make sure to test the output file before deleting the original.
Running flac --test on singletrack FLACs...
Decoding singletrack FLACs to WAVs to validate checksums ...
NOTE: --keep-foreign-metadata is a new feature; make sure to test the output file before deleting the original.
Generating checksums of original WAV files...
Validating checksums of decoded WAV singletracks ...
01 - Sunflower.wav: OK
02 - Can you heal us (holy man).wav: OK
03 - Wild wood.wav: OK
04 - Instrumental one (Part 1).wav: OK
05 - All the pictures on the wall.wav: OK
06 - Has my fire really gone out.wav: OK
07 - Country.wav: OK
08 - Instrumental two.wav: OK
09 - 5th season.wav: OK
10 - The weaver.wav: OK
11 - Instrumental one (Part 2).wav: OK
12 - Foot of the mountain.wav: OK
13 - Shadow of the sun.wav: OK
14 - Holy man (Reprise).wav: OK
15 - Moon on your pyjamas.wav: OK
16 - Hung up.wav: OK
All checksums OK.
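
The "join and compare" check from the sample output can be sketched roughly as follows. This is only an illustration, not the script's actual code: the function name `files_have_same_sha256` is hypothetical, and the sketch assumes that `sha256sum` from GNU coreutils is installed. The splitting and joining themselves are done by shntool and are not shown.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if both files have the same SHA-256.
# Reading via stdin keeps the file name out of the sha256sum output.
files_have_same_sha256() {
    local sum_a sum_b
    sum_a="$(sha256sum < "$1" | cut -d ' ' -f 1)"
    sum_b="$(sha256sum < "$2" | cut -d ' ' -f 1)"
    # An empty checksum means the file could not be read: treat as failure.
    [ -n "$sum_a" ] && [ "$sum_a" = "$sum_b" ]
}
```

A caller in the paranoid spirit of the script would abort on any mismatch, for example `files_have_same_sha256 "image.wav" "joined.wav" || exit 1` (both file names are placeholders).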
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: mjb2006 on 2012-10-28 18:32:19
How is HTOA handled?

Does "Go fuck yourself" really need to be one of the messages?
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-10-28 18:41:03
Quote
How is HTOA handled?

perfect-flac-encode itself does not parse CUE-sheets. They are interpreted by shntool.
Therefore, the HTOA handling is up to shntool.
The current commandline of shntool which is used is the following:
Code:
shntool split -P dot -d "$outputdir_relative" -f "$1.cue" -n %02d -t "%n - %t" -- "$1.wav"

($1 = filename of cue/wav)
The commandline of shntool is still up for discussion. If you feel like different settings should be chosen for better HTOA handling, feel free to suggest those.

Quote
Does "Go fuck yourself" really need to be one of the messages?

Consider it an easter egg. The tool is meant to be used in a setup which produces absolutely perfect rips; incorrect rips are not welcome.
I've changed the wording though. Thanks for reporting this, I had forgotten about it.
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-10-31 18:10:45
BETA3 available now.
Changelog:

- Fix possible bug in startup temp dir deletion when using directory names with spaces
- Make syntax match the syntax which is described in README.md: BETA2 used to assume that the CUE/WAV/LOG reside in a "<Artist> - <Title>" subdirectory of the directory which is the first parameter. We now assume that the first parameter is the directory in which they reside.
- Move output FLACs to a newly created "<Artist> - <Title>" subdirectory upon success (and include this directory in the "delete existing output?" check upon startup).
- Copy CUE/LOG to output directory ("<Artist> - <Title>") upon success
- Delete temporary output directories upon success
- Print an explicit SUCCESS message with the name of the output dir
- Add TODO which shall be resolved in BETA4: What about the checksum in the EAC-LOG? Is it a plain CRC32 or a magic checksum with some samples excluded? If it is no plain checksum then we should keep the sha256sum as a file in the output dir for the user so he can check the sum in case he ever re-joins the FLACs to a WAV image.
- Internal cleanup: Rename variables / functions to make their purpose more obvious
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-01 02:58:15
BETA4 available now.
Changelog:
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: pablogm123 on 2012-11-01 16:02:21
The CRC output by EAC excludes the RIFF header; it hashes just the raw PCM audio data. If you rip to WAV and delete the first 44 bytes of the WAV file, the CRC of that file will match the one reported by EAC. So you could change the routine so that it ignores the first 44 bytes of the output WAV file and hashes just the audio data. It is a plain CRC32, nothing magic like AccurateRip's checksum.
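
A minimal sketch of this idea, assuming the WAV file has the canonical 44-byte header described above and that the claim holds for your EAC settings. The function names `crc32_of_stdin` and `eac_style_crc` are made up for illustration. The sketch takes the CRC32 from the gzip trailer (which stores it in little-endian byte order) so that only gzip, tail, od and awk are needed:

```shell
#!/bin/bash
# Hypothetical helper: print the CRC32 of stdin as 8 uppercase hex digits.
# The last 8 bytes of a gzip stream are CRC32 and input size, both
# little-endian, so we take the first 4 of them and reverse the byte order.
crc32_of_stdin() {
    gzip -c | tail -c 8 | od -An -tx1 -N4 | awk '{ print toupper($4 $3 $2 $1) }'
}

# Hypothetical helper: CRC32 of everything after the first 44 bytes of a file.
# "tail -c +45" starts output at byte 45, i.e. it skips the 44-byte header.
eac_style_crc() {
    tail -c +45 -- "$1" | crc32_of_stdin
}
```

Called as `eac_style_crc "file.wav"`, the result can be compared by eye with the CRC in the EAC LOG. A WAV file with extra RIFF chunks has a longer header, and this sketch would give a wrong result for it.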
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: skamp on 2012-11-01 16:46:08
To compute an EAC-compatible CRC hash:

Code:
mkfifo 'fifo'
shncat -q -e 'file.wav' > 'fifo' &
crc="$( cksfv -b 'fifo' | fgrep -v ';' )"


There must be a more straightforward way, but I haven't investigated any further.
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-04 16:48:02
Thank you. Yes, I found a more straightforward way which doesn't need a FIFO and uses a pipe instead.
Using that, I've published a tool to compute the EAC CRC on github:
https://github.com/leo-bogert/eac-crc

I hope you are OK with me adding a donation address to it. I decided to do this because my script works quite differently from yours.
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-04 17:01:07
BETA6 available now.
Please refer to the changelog of BETA5: BETA6 was only needed for a simple quick fix and contains no other changes.

BETA5/6 Changelog:
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-06 23:47:48
BETA7 available. Changelog follows. Notice that there are quite a few questions in it on which I would like to get feedback. Thank you!
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-08 01:56:54
BETA8 available now. Changelog:
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-09 04:46:56
BETA9/10 available.
10 is only a bugfix for 9, so here is the changelog of 9:

Sample README.txt:
Code:
About the quality of this CD copy:
----------------------------------
These audio files were produced with perfect-flac-encode version BETA10, using
flac 1.2.1.
They are stored in "FLAC" format, which is lossless. This means that they can
have the same quality as an audio CD.
This is much better than MP3 for example: MP3 leaves out some tones because
some people cannot hear them.

The used perfect-flac-encode is a program which converts good Exact Audio Copy
CD copies to FLAC audio files.
The goal of perfect-flac-encode is to produce CD copies with the best quality
possible.
This is NOT only about the quality of the audio: It also means that the files
can be used as a perfect backup of your CDs. The author of the software even
trusts them for digital preservation for ever.
For allowing this, the set of files which you received were designed to make it
possible to burn a disc which is absolutely identical to the original one in
case your original disc is destroyed.
Therefore, please do not delete any of the contained files!
For instructions on how to restore the original disc, please visit the website
of perfect-flac-decode. You can find the address below.


Explanation of the purpose of the files you got:
------------------------------------------------
"Paul Weller - 1994 - Wild Wood.cue"
This file contains the layout of the original disc. It will be the file which
you load with your CD burning software if you want to burn a backup of the
disc. Please remember: Before burning it you have to decompress the audio with
perfect-flac-decode. If you don't do this, burning the 'cue' will not work.

"Paul Weller - 1994 - Wild Wood.sha256"
This contains a so-called checksum of the original, uncompressed disc image. If
you want to burn the disc to a CD, you will have to decompress the FLAC files
to a WAV image with perfect-flac-decode. The checksum file allows
perfect-flac-decode to validate that the decompression did not produce any
errors.

"Paul Weller - 1994 - Wild Wood.log"
This file is a 'proof of quality'. It contains the output of Exact Audio Copy
and perfect-flac-encode as well as their versions. It allows you to see the
conditions of the copying. You can check it to see that there were no errors.
Further, if someone finds bugs in the future in certain versions of the
involved software you will be able to check whether your audio files are
affected.


Websites:
---------
perfect-flac-encode: http://leo.bogert.de/perfect-flac-encode
perfect-flac-decode: http://leo.bogert.de/perfect-flac-decode
FLAC: http://flac.sourceforge.net/
Exact Audio Copy: http://www.exactaudiocopy.de/
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2012-11-10 04:39:25
BETA11 available. Changelog:
Title: Linux command-line tool for converting WAV-image to FLAC singletracks
Post by: leo-bogert on 2013-05-30 18:43:11
The beta phase is over and the first and second stable versions have been released!
Changelogs:

Version 1:
Quote
This is the changelog of the first stable version of perfect-flac-encode.
It has been tested with over 100 audio CD images and was able to encode them all without any errors.
The unit tests have been run for 10 discs and all behaved as expected.
Since BETA11, there have been 209 commits which is quite a lot.
There has been a HUGE improvement in code quality and a lot of bugfixes. This was possible because I made a lot of effort to learn about common Bash pitfalls (special thanks go to Greg's wiki for the pitfall list: http://mywiki.wooledge.org/BashPitfalls). In other words: BETA11 should not be used anymore, it has a lot of dangerous misuses of Bash.
What follows is the list of externally visible changes. Internal changes which are not interesting to non-developers have been excluded. Bugfixes have also been excluded; just try whether it works now.

Changelog:
  • New feature: Copy a specification of the settings which were used with EAC to the output directory as "Exact Audio Copy Settings.txt". This requires a new input file "(release name).txt". You should obtain it from my settings-exact-audio-copy repository. Further instructions are in README.md.
  • Update README.md - this is the README which github displays on the main page of the repository. It gives a detailed overview of the input and output of the script and you should read it if you want to get an idea of what perfect-flac-encode produces in addition to FLAC files.
  • Make output directory selectable by the user instead of writing the output to the input directory. It now is a mandatory parameter. You don't need any write access to the input directory anymore. So if you don't trust PFE, make it write-protected.
  • Write all non-audio files except README.txt into a subdirectory "Proof of Quality" of the output directory.
  • Implement writing of a log file "Perfect-FLAC-Encode.log". I decided against appending to the EAC LOG. We shouldn't mess with well-known file formats.
  • Improve version output on the terminal and in ENCODEDBY tag: Add version numbers of all used tools which touch the audio data (currently accuraterip-checksum, cuetools, eac-crc, flac and shntool). If bugs are found in the tools, you can check whether your audio files are affected by examining the version numbers. Notice that this requires you to have the latest version of accuraterip-checksum/eac-crc because the previous versions did not support printing the version number. Also document the advantage of storing the version numbers in README.md and in the README.txt which perfect-flac-encode generates in the output directory. Remove the version numbers from README.txt. This file shall be easy to understand by laymen and we already store versions in ENCODEDBY and in the future perfect-flac-encode.log
  • Clean up usage of stdout, stderr and the log file: All critical error messages should go to stderr and the log file now. You can safely ignore stdout now and only watch stderr if you want to. The purpose of the log is also documented in the generated README.txt and in the github README.md
  • Do not put the track title into the filename of the singletracks anymore. EAC will use the local charset for track titles in the CUE instead of UTF-8 and it is difficult to detect which charset it is. This would break bash processing of the filenames under certain conditions. Also, the metadata which EAC puts into the CUE is imprecise: It does not provide a proper way of selecting releases from MusicBrainz when using the FreeDB gateway. So the filenames would have to be changed when tagging with Picard anyway.
  • Use "Exact Audio Copy.cue", "Exact Audio Copy.log" and "Exact Audio Copy.sha256" as filenames instead of "(release name).cue/log/sha256". The name of the release is usually imprecise when the image is ripped because EAC does not allow proper selection of releases from MusicBrainz. Further, using "Exact Audio Copy" as basename will make the user less likely to be confused about what the files are for.
  • Do not abort if there is no CATALOG in the cuesheet: EAC sometimes produces cuesheets without a catalog. Probably there are discs which do not contain a CATALOG.
  • Update required accuraterip-checksum and eac-crc version
  • Resolve TODO of proofreading the FLAC encoding parameters. My soundman is ok with them as well.
  • Resolve TODO of asking my soundman to review the "shntool split" syntax. He is ok with it.
  • Change expected location of accuraterip-checksum/eac-crc to be $PATH instead of the home directory. This clearly is the standard Unix way of doing it and I don't know why I didn't do this right from the beginning.
  • Move all TODOs from the shell script to a file "TODO.txt". I don't want to use the Github bugtracker because it provides no official way of exporting issues and I don't want Github to be a single point of failure.
  • Check whether a unit test was enabled at the very end of the script and abort with "Unit test failure". The program should abort if any errors happen, so reaching the end of it with unit tests enabled shows that they failed. Also document this.
  • I've run the script through ShellCheck and it didn't find any serious issues.
  • Resolve all TODOs with regards to Cuesheets: I mailed the EAC developer and obtained a specification of the EAC CUE format from him.
  • Print full output directory when asking the user to delete it.
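
Two of the items above (looking up accuraterip-checksum/eac-crc via $PATH and sending critical errors to stderr) could be combined in a startup check along these lines. This is a sketch under assumptions, not the code of perfect-flac-encode: the function name `require_tools` and the message text are invented for illustration.

```shell
#!/bin/bash
# Hypothetical helper: check that every named tool can be found via $PATH.
# Missing tools are reported on stderr; the return code is 1 if any is missing.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "Missing required tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script could call `require_tools accuraterip-checksum eac-crc flac shntool || exit 1` before touching any audio data, so that it aborts early instead of in the middle of an encode.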

Version 2:
Quote
  • Fix and improve hidden track one audio (HTOA) handling:
    * Version 0001 accidentally wrote "Hidden track one audio not found" into the log file even when there was HTOA. This did not have any negative side effects, it is merely cosmetic.
    * Unit test fix: If testing on data with HTOA, do not use track 0 as test track for checking whether the AccurateRip checksum validation works. Track 0 does not have an AccurateRip checksum.
    * New feature: Only keep hidden track one audio as a separate track if it is longer than a second. I have tested perfect-flac-encode with over 200 disc images. When tagging the resulting FLAC files, I noticed that VERY MANY discs (over 50) contain HTOA. It is usually shorter than a second. Therefore, it is more likely to be a mastering error than an actual hidden track. Typically, those sub-second tracks do not appear as a separate track in MusicBrainz and would make tagging very difficult. So as a workaround we merge them with track 1 now.
    * New feature: Alert the user of HTOA on stdout, not only in the log file.
    * New feature: Include the length of the HTOA in the message on stdout/in the log file.
  • Paranoia improvement: Generate checksums of original WAV image and split singletracks as early as possible. In the previous version, they were generated AFTER feeding the WAV files as input to other programs (shntool / flac), now we generate them before. This guards against the very unlikely case of shntool/flac damaging their input files.
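
The "longer than a second" rule for HTOA can be illustrated with a little CUE time arithmetic. The sketch below is not taken from perfect-flac-encode; the function names are hypothetical. It assumes that the HTOA length is given as a CUE timestamp in MM:SS:FF form (75 frames per second), for example the INDEX 01 position of track 1 when its INDEX 00 is at 00:00:00.

```shell
#!/bin/bash
# Hypothetical helper: convert a CUE timestamp MM:SS:FF to CD frames (75 per second).
cue_time_to_frames() {
    local mm ss ff
    IFS=: read -r mm ss ff <<< "$1"
    # "10#" forces base 10 so that values like "08" are not parsed as octal.
    echo $(( (10#$mm * 60 + 10#$ss) * 75 + 10#$ff ))
}

# Hypothetical helper: succeed if the HTOA is longer than one second,
# i.e. if it would be kept as a separate track instead of being merged into track 1.
htoa_is_separate_track() {
    (( $(cue_time_to_frames "$1") > 75 ))
}
```

For example, `htoa_is_separate_track "00:00:33"` fails (33 frames, less than a second, so the audio would be merged with track 1), while `htoa_is_separate_track "00:01:01"` succeeds (76 frames).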