HydrogenAudio

Digital Audio/Video => General A/V => Topic started by: Dynamic on 2013-06-04 18:36:46

Title: Synchronizing & replacing audio in MPEG-4 video
Post by: Dynamic on 2013-06-04 18:36:46
I occasionally have need to replace the audio stream in a video file (usually some variant of MPEG-4) with a higher quality audio source and retain sync - particularly lip sync.

I've come up with what seems like a good workflow: synchronise soundboard audio fairly easily against the original video's low-quality audio, encode it to AAC, and remultiplex it into a properly synchronised MPEG-4 file (e.g. .mp4 or .m4v). The result carries the high-quality audio in place of the original and has played fully in sync on every platform I've thrown it at. The longest file I've tried so far stays in sync for its full 59-minute duration.

I've made some notes, primarily for myself and my preferences, but I thought this might be useful for anyone searching the forums or internet for ways to do this in future.

It should also be workable with other devices, like an old Nikon camera that produces MOV files with Motion JPEG video and 8-bit mono sound at something odd like a 7980 Hz sampling rate: use Handbrake to convert both the video and the low-quality sound (using x264 and FAAC, say), then replace the synchronised LQ sound with better sound. I find that many platforms won't play these MOV/MJPEG videos anyway (QuickTime excepted), so I usually end up using Handbrake to convert them to MPEG-4 AVC (H.264 via x264) plus AAC audio, making a compatible video file that looks as good but takes up a fraction of the space.

High quality doesn't always mean hi-fi, like soundboard audio at 48 kHz/24-bit. Sometimes it can be a medium sampling rate for speech, recorded at close quarters rather than with room reverberation or background noise, and thus offering improved intelligibility. A simple smartphone with a PCM recording application like AndRecorder for Android (which works happily at 22050 Hz/16-bit mono on my low-end phone) might be used with a wired headset microphone clipped to a tie or a lapel to produce decent sound in a video interview, for example, while the video and low-quality sound are recorded on a separate digital camera or DV recorder. The more professional approach might be a dedicated audio field recorder like a Zoom H1. In either case, if the separate audio source is better than that captured by the video recording device, it can be synced using this method.

My aim is to preserve quality or quality per bitrate, especially in sound (video isn't all that great anyway), and to use free software, mostly cross-platform.

I also want to produce files that Just Work properly on any platform without taking up undue bitrate for the quality of video present. That's why I prefer Main profile H.264 video at constant frame rate and LC-AAC audio (96k CVBR is the sweet spot for the QAAC encoder, where LC-AAC is a little better than HE-AAC).

I aim to produce two files from each video


Given that my video stream is poor (10.4 fps, 320x240 pixels, 3GP format):
The former version runs at 737 kbps video + 327 kbps audio = 1065 kbps total (~450 MB/hr).
The low-bitrate version at Constant Quality RF=25.0 runs at only 108 kbps video + 84 kbps audio = 193 kbps total (~80 MB/hr).
(CQ RF=23.0 runs at 158 kbps video + 84 kbps audio = 243 kbps total (~100 MB/hr).)
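
For anyone wanting to redo the size estimates for their own bitrates, here is a minimal Python sketch of the arithmetic. It assumes 1 kbps = 1000 bits/s and ignores container overhead. The function name and the labels are made up for illustration.

```python
def megabytes_per_hour(video_kbps, audio_kbps, binary=False):
    """Approximate size of one hour of video at the given stream bitrates.

    Assumes 1 kbps = 1000 bits/s and ignores container overhead.
    With binary=True the result is in MiB (1024 * 1024 bytes) instead of MB.
    """
    total_bits = (video_kbps + audio_kbps) * 1000 * 3600
    total_bytes = total_bits / 8
    divisor = 1024 * 1024 if binary else 1000 * 1000
    return total_bytes / divisor


# Bitrates quoted in the post; the labels are only for display.
versions = [
    ("first version", 737, 327),
    ("CQ RF=25.0", 108, 84),
    ("CQ RF=23.0", 158, 84),
]

for label, video, audio in versions:
    mb = megabytes_per_hour(video, audio)
    mib = megabytes_per_hour(video, audio, binary=True)
    print(f"{label}: about {mb:.0f} MB/hr ({mib:.0f} MiB/hr)")
```

The results land in the same range as the rounded per-hour figures quoted above. The exact number depends on whether a megabyte is counted as 1000 x 1000 or 1024 x 1024 bytes.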

I've noted command-line equivalents that would work outside of Windows-only tools, so this guide could be useful on Mac or Linux platforms also. Of course, I'm not recommending this as the best or only way of doing things, but as a workflow that provides me with what I need without too many arduous steps. I also list some of the problems that led me to this method. For example, I tried to demux the raw video and audio streams but couldn't retain the sync, possibly due to an inconsistent frame rate; keeping the video in its MP4/3GP container until switching the audio over seems to retain proper timing.
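
As one possible command-line illustration of switching the audio while the video stays in its container, the Python sketch below builds an ffmpeg command. ffmpeg is an assumption here, not necessarily one of the tools named in the linked notes, and the file names are hypothetical. The flags used (`-i`, `-map`, `-c copy`) are standard ffmpeg options, but the command has not been tested against the 3GP files described in this post.

```python
import shlex


def build_remux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that swaps the audio track without re-encoding.

    Nothing is executed here; the function only returns the argument list.
    """
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: original MP4/3GP, read straight from its container
        "-i", audio_path,   # input 1: already-synchronised AAC audio (e.g. an .m4a file)
        "-map", "0:v:0",    # keep the first video stream of input 0
        "-map", "1:a:0",    # take the first audio stream of input 1
        "-c", "copy",       # stream copy: neither stream is re-encoded
        output_path,
    ]


# Hypothetical file names, for illustration only.
cmd = build_remux_command("original.mp4", "synced_audio.m4a", "replaced_audio.mp4")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`. Because the video is read directly from the original container rather than demuxed to a raw stream first, its timestamps should carry over, which is the property this workflow relies on.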

My brief summary of the workflow and tools that I've chosen to use is reproduced here as well as on page 2 of the linked files, shared in my MediaFire Audio folder (http://www.mediafire.com/folder/9gjc5nla6c2sm/Audio). Fuller details are provided elsewhere in the OpenDocument Presentation (http://www.mediafire.com/view/uhrcp7j852yda8l/Adding_High_Quality_audio_to_video_files_to_replace_low-quality_audio.odp) (81 KiB; LibreOffice Impress or OpenOffice Impress will open it, and MS PowerPoint should open it also, though I can't test that). A PDF download of the same Presentation (http://www.mediafire.com/view/1pj3ryziqiu2k8x/Adding_High_Quality_audio_to_video_files_to_replace_low-quality_audio.pdf) (176 KiB) is also provided.

One of the key points is how easily this method can provide a synchronised audio stream in Audacity with multiple points of visual confirmation.
The other key point is keeping the video stream in its original MP4 container to retain accurate timing until the moment the audio streams are switched.
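
As a rough numerical cross-check on the visual alignment in Audacity (a different technique from the one described in this post), the offset between two recordings can be estimated by cross-correlation. The Python sketch below is a brute-force version on toy data. The signal values, the function name, and the 8000 Hz sample rate are all made up for illustration. It is only practical on short, downsampled excerpts.

```python
def best_lag(reference, other, max_lag):
    """Return the lag in samples at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`
    by that many samples. This is a brute-force cross-correlation.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += ref_sample * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Toy signals: a short click pattern, and the same pattern delayed by 3 samples.
camera_audio = [0, 0, 1, 4, -3, 2, 0, 0, 0, 0, 0, 0]
recorder_audio = [0, 0, 0] + camera_audio[:-3]

lag = best_lag(camera_audio, recorder_audio, max_lag=5)
sample_rate = 8000  # hypothetical common sample rate of both excerpts, in Hz
print(f"recorder is delayed by {lag} samples ({lag / sample_rate:.6f} s)")
```

Both excerpts need to be at the same sample rate before comparing them. On real recordings the peak is less clean than in this toy case, so the estimate is best treated as a sanity check alongside the visual confirmation rather than a replacement for it.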

Full workflow in brief


If you have any good suggestions (or suggested alternative tools for Mac/Linux) feel free to chime in.