Grabbing VHS video or capturing VHS tapes is essentially like digitising a TV video signal by means of a hardware grabber, a capture card or, as in my case, a USB video capture device. And it's perfectly doable on Linux. I was using an stk1160-based USB stick, a cheap $20 EasyCap / Mumbi labeled device manufactured in China, based on Philips-compatible chips from Syntek/Silan.
To start, just hook up your VCR with the USB stick via Cinch Audio and the yellow Composite or the S-Video In. On recent kernel versions, getting the stk1160 driver to work should be straightforward - just
- get the required release from github matching your kernel version:
- and install the package:
tar xvf 0.9.3_for_v3.2.tar.gz
make && make install
- Don't forget to run depmod -a after the install, and
- Unmute the audio-in as described here
- Make sure you figure out where the audio-in registers after connecting the USB stick, using this command: cat /proc/asound/cards
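The lookup in /proc/asound/cards can also be scripted, so you don't have to read the card number off by hand after every reconnect. This is only a minimal sketch: the function name find_card_index is made up for illustration, and it assumes the grabber's entry contains "stk1160" in its name line, which may differ on your system.

```shell
#!/bin/sh
# find_card_index PATTERN [FILE]
# Prints the ALSA card index of the first card whose name line matches
# PATTERN (case-insensitive). FILE defaults to /proc/asound/cards.
find_card_index() {
    awk -v pat="$1" '
        # Name lines start with the numeric card index; the indented
        # continuation lines below them do not.
        $1 ~ /^[0-9]+$/ && tolower($0) ~ tolower(pat) { print $1; exit }
    ' "${2:-/proc/asound/cards}"
}

# Only query the real file if it exists on this machine.
if [ -r /proc/asound/cards ]; then
    find_card_index stk1160
fi
```

The printed number is what goes into the adevice=hw.N part of the mencoder commands further down.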
A good tool to see if everything is working fine is tvtime. You can also use it to cue your tapes to in-points. But make sure you exit it before you start capturing as tvtime blocks the video4linux in-device. But as soon as tvtime gives you picture and audio, you are good to go and you can finally start capturing that good old family home video on VHS, digitised for posterity.
Before you read on (Update): In Capturing VHS video with vlc via v4l2 I present an alternative (better) capture pipeline, with vlc instead of mencoder as the main front-end.
I recommend mencoder as a solid front end to start and stop the capturing on the command line. Normally, I prefer ffmpeg / avconv and its bare-bones libav backend for anything transcoding, but with VHS I found it unsuitable. Every time one recording stops on tape and another one starts, you get a burst of snow. As it seems, the stk1160 device passes these broken frames through to the v4l2 device, and once they reach ffmpeg, it chokes and reports invalid frames. Although that is part of the "honest" quality of ffmpeg, I consider it a bug here, as it aborts the transcoding process and makes ffmpeg unsuitable for the task at hand. I found that someone filed a bug report for that, so that future versions of ffmpeg would just skip over these frames, but that is in the distant future. Until then, we resort to mencoder, which can act as a front end for ffmpeg's libav and already skips such frames. Also, I want to encode to AVI, and mencoder was primarily written for that.
Getting the codec parameters right
VHS is analogue and exhibits quite a lot of noise - and noise is not handled well by modern compressors/codecs like xvid, h264 etc. Of course, you can store the v4l output uncompressed, raw, but prepare yourself for roughly 60-100GB of data for an hour of video. Thus we need to figure out a way to compress the digitised video while keeping the quality as close to the original as possible. Capturing TV signals in real time is quite a challenge in terms of CPU stress, so all capturing/codec settings are a trade-off between compression quality and CPU speed.
If you have a big hard drive or want to capture only a short segment of video, I would suggest a two-step setup for best results and fewer problems on older systems. First, capture uncompressed to a temp file, then run mencoder with a compression/codec combo that your machine normally couldn't handle in real time. CLI commands for that, to get you started:
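The size of an uncompressed capture can be estimated with a bit of shell arithmetic. This is a rough sketch under assumptions: it takes 2 bytes per pixel (a packed 4:2:2 format, which is what I assume the grabber delivers) and ignores audio and container overhead. The function name raw_gb_per_hour is made up for illustration.

```shell
#!/bin/sh
# raw_gb_per_hour WIDTH HEIGHT BYTES_PER_PIXEL FPS
# Prints the uncompressed video size per hour in GB (10^9 bytes),
# truncated to a whole number.
raw_gb_per_hour() {
    echo $(( $1 * $2 * $3 * $4 * 3600 / 1000000000 ))
}

# Full PAL frame, 2 bytes per pixel (assumed), 25 fps
raw_gb_per_hour 720 576 2 25    # prints 74
```

That lands inside the 60-100GB range mentioned above. Other pixel formats shift the number up or down.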
mencoder tv:// -tv channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.1:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=128 -oac pcm -ovc copy -endpos 00:15:00 -o VHS1raw.avi
This is for the "first pass", a raw copy of the audio and video input. Note the "hw.1" part: it is the device id you get from cat /proc/asound/cards as the identifier of the stk1160's audio device. This may change between boots/USB connects, depending on other sound cards in your system. Next is the buffersize. Normally mencoder should adjust the buffer automatically, but forcing a value here seems to do no harm. Instead of -oac copy I use -oac pcm, which is more or less the same, I think. I once got a strange error about frame sizes with "copy" and never saw it again with "pcm", which I think muxes better than the 16-bit little-endian stuff the stk1160 outputs. -endpos tells mencoder to stop after 15 minutes, as that is what we want to grab here. The command should give you a low one-digit percentage of CPU usage and quite a lot of work for your hard drive.
After that, it's time for a "second pass". This is not a second pass in the sense the term is usually used in video compression; it is rather the real "transcode pass":
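Since the audio card number can change between reconnects, it can help to build the long -tv option string from variables instead of editing it by hand. A minimal sketch: the helper name tv_opts is made up, and the option values are simply the ones from the capture command above.

```shell
#!/bin/sh
# tv_opts CARD_INDEX [BUFFERSIZE]
# Prints the -tv option string used in this article, with the ALSA card
# index and the buffer size filled in. BUFFERSIZE defaults to 128.
tv_opts() {
    printf 'channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.%s:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=%s\n' \
        "$1" "${2:-128}"
}

# Print the string for card 1 with the default buffer size
tv_opts 1
```

The raw capture would then read: mencoder tv:// -tv "$(tv_opts 1)" -oac pcm -ovc copy -endpos 00:15:00 -o VHS1raw.avi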
mencoder -ovc xvid -xvidencopts fixed_quant=4 -oac mp3lame -lameopts cbr:br=128 -ofps 25 -o VHS1.avi VHS1raw.avi
So this second command is the real one. We encode to Xvid here, with standard settings and a "quality target" of 4 - which means a variable bitrate to reach a certain quality. In my visual comparisons, 4 came out as a very good setting, with video bitrates around 2500-3500 kbit/s. Later on, we'll see that 5 is a bit faster to compress and still ok. For the audio part the command uses mp3 with a constant bitrate of 128 kbit/s. We could add a bit of downsampling from the 48000 Hz of the original raw audio to 44.1 kHz to shave off some more bits, but a variable bitrate might be more useful to achieve that.
On a low-range Intel Core2Duo, I get transcoding framerates between 12-14 fps, so transcoding these 15 minutes would take half an hour.
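The estimate is simple proportion: the number of source frames divided by the encoding speed. A small sketch, with a made-up function name, that works in whole minutes:

```shell
#!/bin/sh
# transcode_minutes CLIP_MINUTES SOURCE_FPS ENCODE_FPS
# Prints roughly how many minutes the transcode takes. Integer division
# truncates, so the result is rounded down.
transcode_minutes() {
    echo $(( $1 * $2 / $3 ))
}

# 15 minutes of PAL video, encoded at about 13 fps
transcode_minutes 15 25 13    # prints 28
```

At 12 fps the same clip comes out at 31 minutes, at 14 fps at 26 minutes, hence "half an hour".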
Note that the grabbed video will have square pixels while the original TV/VHS video has non-square pixels. So adjust your player to present the video with an aspect ratio of 4:3. Otherwise the video played back on a desktop computer will be slightly compressed horizontally. Sadly, embedding this info into the file so that players adjust this automatically doesn't work reliably.
Grabbing and encoding in real-time
Again, for real-time grabbing and encoding you need a fast machine (as of this writing, something a bit beefier than a 2013 office PC), or you have to work out a trade-off between your CPU speed and how compressed the final video will be.
sudo nice -n -19 mencoder tv:// -tv channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.2:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=256 -ovc xvid -xvidencopts fixed_quant=5:threads=2:turbo:nochroma_me:vhq=0 -oac mp3lame -lameopts cbr:br=128 -endpos 04:10:00 -o VHS2.avi
(We prepend the mencoder command with a niceness adjustment - see below for why.) Apart from that, we encode to Xvid in real time here, thus we use the turbo keyword to tell mencoder to use settings from the turbo preset. This mplayer/mencoder manual page is your friend. We've also increased the buffersize a bit. You could make the buffer as big as 1024, but I found that audio and video get seriously out of sync when doing so. It might not be related, but the sync problem started with oversized buffers... Another root of that problem might be the threads=2 option. On the test machine here, I needed ~150% of CPU power to compress audio and video in near real time, so it required two threads with these codec settings. But that might also cause the mentioned audio/video sync problem, so try for yourself.
Using more than one thread hurts motion prediction quality, so only use it when you are forced to by limited CPU horsepower. And again, in combination with a big buffer, for example, it seems to bring the overall system out of sync...
What you will see with slower machines is that mencoder reports "video buffer full - dropping frames" once the CPU couldn't keep up with the incoming video for longer than what your buffer can hold. And even when the buffer can hold the video the CPU couldn't keep up with, the CPU has to catch up afterwards to empty the buffer again, and mostly this won't happen. So "video buffer full" tells you to rethink your strategy, or to lower the burden on the CPU by using a faster codec or faster settings. Using x264 / h264 as the codec was no option on the test machine - too CPU intensive.
Endpos is a bit over four hours here, as most VHS tapes don't run longer than four hours - with this setting, you can leave the capturing process unattended.
Sometimes mencoder will tell you "skipping 1 duplicate frame" or similar. That is normal and doesn't mean the same as "dropping frames". Modern codecs just don't store duplicate frames. The resulting video should look exactly the same as without a "skipped frame": as I understand it, the video stream tells the player to present the previous frame longer to fill the gap, and the duplicate is simply not in the data stream.
Try to get to a framerate of ~25 fps (on PAL) to see that you are reaching real-time encoding speed, and then work your way up to more CPU-intensive presets. Mostly these presets will make the resulting file smaller or bigger without much change in visual quality. I found that constant bitrate settings are less desirable: fixed datarates of ~2000 kbps are the visual tipping point, and the resulting file might even be bigger than a VBR video with higher average bitrates and better visual quality.
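For an unattended capture it is handy to check afterwards whether frames were dropped. The sketch below assumes you redirected mencoder's console output to a log file, and it only matches the "video buffer full" part of the message quoted above. The function name count_dropped is made up for illustration.

```shell
#!/bin/sh
# count_dropped LOGFILE
# Prints how many lines of the log mention a full video buffer.
count_dropped() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when that number is 0, so "|| true" keeps the status clean.
    grep -c 'video buffer full' "$1" || true
}
```

With a capture started as mencoder ... > capture.log 2>&1 (the log name is just an example), count_dropped capture.log should print 0 for a clean run. Lines like "skipping 1 duplicate frame" are not counted.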
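The -endpos value can be derived from the nominal tape length plus a safety margin. A minimal sketch with a made-up function name. The margin of 10 minutes is just the value used in this article.

```shell
#!/bin/sh
# endpos_for TAPE_MINUTES [MARGIN_MINUTES]
# Prints an HH:MM:SS value for mencoder's -endpos switch.
# MARGIN_MINUTES defaults to 10.
endpos_for() {
    total=$(( $1 + ${2:-10} ))
    printf '%02d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}

endpos_for 240    # E240 tape, prints 04:10:00
endpos_for 180    # E180 tape, prints 03:10:00
```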
Deinterlacing: Just don't de-interlace on capture. It will throw away information. Modern codecs are perfectly capable of compressing interlaced video in its original form, and most desktop players can de-interlace at playback time, giving you options later on. Also, de-interlacing introduces a filter step, for example with mencoder's -vf switch, and that means additional stress on the CPU.
Just make sure, whenever you choose a codec, that it supports interlacing, and that you tell it that you are feeding it interlaced material. VHS is interlaced. Either de-interlace and store it progressive, or keep it interlaced and tell your codec to store it this way - just don't store interlaced frames as progressive.
Two-pass encoding: The common mpeg4 codecs we use here all offer two-pass settings. But with a streaming video source like our grabber that doesn't make much sense. Essentially, you would be grabbing to a raw file, which will grow gigantic, and then you would have to run the second pass. One setup I tested captured 2 hours of VHS video to a huge temp file, and then the second pass with, admittedly, high quality settings ran for 9 hours - without much gain in file size or visual quality. Two-pass encoding makes more sense when you transcode a fixed video source like a file or DVD and want to hit certain bitrate or file size targets.
Nice-value: For time-critical tasks like transcoding in real time, it might make sense to lower the nice value of the mencoder process. In case your machine has no other high priority tasks, this might help you to tune the transcoding performance of your setup. For example, setting niceness to -19 with sudo nice -n -19 might help you through performance peaks without dropping frames.
Re-nicing processes is only one half of the story as most systems don't schedule IO time in relation to CPU time. ionice is your friend here. Tuning HD priority for your capturing process might help in your setup. Doing sudo ionice -c 2 -n 0 -p <processid> would set HD access priority to "best-effort" with high priority (low number) for your process.
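Both adjustments can be applied at launch, because ionice can start a command directly instead of re-prioritising a running process id. The wrapper below is a sketch: the name run_prioritised and the DRY_RUN variable are made up for illustration. With DRY_RUN=1 it only prints the command line it would run.

```shell
#!/bin/sh
# run_prioritised COMMAND [ARGS...]
# Starts COMMAND with best-effort IO class at the highest priority (0)
# and a nice value of -19. Negative nice values need root, hence sudo.
run_prioritised() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show what would be executed, without running anything
        echo "sudo ionice -c 2 -n 0 nice -n -19 $*"
    else
        sudo ionice -c 2 -n 0 nice -n -19 "$@"
    fi
}

# Print the resulting command line for a capture run
DRY_RUN=1 run_prioritised mencoder tv:// -o VHS2.avi
```

One thing to keep in mind with this approach: the capture runs as root, so the output file will be owned by root as well.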
Audio copy trick: In real-time setups with a slower machine where you lack some 10% of CPU speed, another trick is to simply copy the audio part. The raw audio datarate is not that high, and simply copying it into the muxed stream saves you some CPU cycles. Afterwards you have to do a second transcode pass for the audio alone, but audio transcoding is normally faster than real time. That means you could capture video in real time to some good video codec, 4 hours of VHS for example, and then add another 15 minutes for the audio transcoding. mencoder pseudo-syntax: mencoder tv:// -ovc xvid -oac copy -o vhs.avi && mencoder vhs.avi -ovc copy -oac mp3lame -o vhs-final.avi
Horizontal compression: VHS analogue video is like analogue audio: you can think of the video as one long line of information, and your horizontal resolution setting tells your 'capturing process' how many 'samples' it has to take along this horizontal 'line', just like the difference between 44.1 and 48 kHz audio. Wikipedia says:
The horizontal resolution is 170 lines per scanline, and the vertical resolution (the number of scanlines) is the same as the respective analog TV standard (576 for PAL or 486 for NTSC; usually, somewhat fewer lines are actually visible due to overscan). In modern-day digital terminology, NTSC VHS is roughly equivalent to 333×480 pixels luma and 40×480 chroma resolutions (333×480 pixels=159,840 pixels or 0.16MP (1/6 of a MegaPixel))., while PAL VHS offers the equivalent of about 335×576 pixels luma and 40×240 chroma (the vertical chroma resolution of PAL is limited by the PAL color delay line mechanism).
So, some guides found online will tell you that you can (or should) capture VHS at half the horizontal resolution the grabber actually outputs. Well, what's true is that you *can* do this. The idea is similar to what anamorphic film does and actually what PAL/NTSC analogue TV already does in a mild form... But I think, you shouldn't. Just try it and capture the VHS video at 360 pixels horizontal, 360x576, or 352x576 (with the sides cropped). People will tell you that this essentially weeds out the noise, but I simply found the resulting video looks soft.
I found that doing so doesn't save you so much in resulting file size - at least on my tests. (edit: File size actually does come down, seemed to be an error in my test back then. Here's a visual comparison.)
Thus, all mencoder examples above use a final video size of 720x576 pixels. Just apply what you now know about horizontal compression: when watched in a player, the pixels will have to be non-square (not a pixel aspect ratio/PAR of 1:1 but roughly 16:15, which gives a display aspect ratio of 4:3, or 1.33:1) - just as in the original video. So tell mplayer to use a 4:3 aspect ratio when you watch the captured videos. Sadly, telling mencoder to embed this into the video file doesn't work - although it should. Ffmpeg/avconv on the other hand, with the mp4 container, does - via the -aspect switch.
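The pixel aspect ratio follows from the stored frame size and the intended display aspect ratio: PAR = DAR / (width / height). A small sketch of that calculation, with made-up function names:

```shell
#!/bin/sh
# gcd A B: greatest common divisor (Euclid's algorithm)
gcd() {
    a=$1
    b=$2
    while [ "$b" -ne 0 ]; do
        t=$(( a % b ))
        a=$b
        b=$t
    done
    echo "$a"
}

# pixel_aspect WIDTH HEIGHT DAR_W DAR_H
# Prints the pixel aspect ratio as a reduced fraction NUM:DEN.
pixel_aspect() {
    num=$(( $3 * $2 ))    # DAR_W * HEIGHT
    den=$(( $4 * $1 ))    # DAR_H * WIDTH
    g=$(gcd "$num" "$den")
    echo "$(( num / g )):$(( den / g ))"
}

pixel_aspect 720 576 4 3    # prints 16:15
```

So each stored pixel has to be shown about 7% wider than it is tall to fill a 4:3 picture.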
More about horizontal video compression can be found in a dedicated article about that including a visual side-by-side comparison example.
Use a time-base corrector, if possible: VHS usually suffers from line jitter coming from the playback mechanism; it is simply inherent in the analogue techniques employed to record the image. Time base correction (TBC) is a system to compensate for that. This video here has an impressive example. Some more expensive VCRs have this built in, usually all S-VHS players. For example the Panasonic HS1000 or JVC HR-S9500E - just search eBay for VHS / VCR and TBC. And sometimes even capture cards have something similar. Usually you simply don't know (like me), but you can tell from the resulting video output of your VHS pipeline that there's probably nothing like this in there, or just a poor mechanism.
After more fiddling with old VHS tapes, I've streamlined the workflow and got a few new insights:
(1) First: XVid is old technology compared to libx264, so forget about it.
(2) Second: Flashes of white noise on the source tape push up the effective bitrate of the source video stream, making it harder for the compressor to compress. In my case, either the CPU became a bottleneck then, or the hard drive was not fast enough to write the stuff we capture. Either way, the fine-tuned process above fell apart. One cure for that might be putting a video filter (-vf switch) into the mix. But that is still on my "to test" list. Until then...
My strategy to counter that is: I always do two pass encoding now. First I capture mostly raw into a temp file, then compress it to libx264. Typical CLI commands are:
sudo nice -n -19 mencoder tv:// -tv channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.1:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=1024 -ovc lavc -lavcopts vcodec=ffvhuff -oac copy -endpos 03:10:00 -o tape1raw.avi
This is the first pass. I lower the nice value to give the process top priority. I encode into the lossless ffvhuff to reduce disk IO; the resulting file is roughly a half to a quarter of the size of the uncompressed raw output. -endpos is 3 hours 10 minutes for an E180 tape, so we can walk away without the capture running endlessly. The buffer is quite big to compensate for flashes of noise on the source material, as even ffvhuff with its low CPU strain occasionally has a hard time keeping up.
You might also notice that I use mencoder as the front-end while the video stream is actually encoded by the libav library via the lavc switch, passing options in, like the vcodec, via the lavcopts switch. The full list of available lavcopts switches mencoder understands is here.
avconv -i ~/tape1raw.avi -acodec libmp3lame -ab 160k -vcodec libx264 -qscale 3 -aspect 4:3 tape1.mp4
On the second pass I now transcode the audio into mp3, using avconv's libmp3lame encoder at 160 kbit/s, which is about what LAME's medium preset targets. This transcode can also be done in the first step, but it ended up here to free some CPU cycles on the initial capture. Also on the second pass is the transcode into H.264 mp4, called libx264 in libav. As noted, I found it superior to Xvid in every regard, trust me. The -aspect switch finally works with x264. I never got it to stick with Xvid, and the resulting files need to be set to 4:3 manually in the player. But the MPEG-4 container in combination with libx264 properly tells any player to squeeze the video into a 4:3 aspect when this switch is given on file creation. (Update: note that the above command lacks a flag for interlaced video handling! Read this post on how to encode VHS as interlaced mp4.)
With this workflow, I come from a 60-120GB raw file and end up with a 4-7GB transcoded mp4 file.
(3) Another command for less important material would be (first pass):
$ mencoder tv:// -tv channel=0:driver=v4l2:device=/dev/video0:normid=5:input=0:width=720:height=576:norm=PAL:fps=25:alsa:adevice=hw.1:forceaudio:brightness=0:contrast=0:hue=0:saturation=0:buffersize=256 -ovc lavc -lavcopts vcodec=ffvhuff -vf scale=540:576 -oac mp3lame -lameopts preset=medium:aq=0 -af resample=44100:0:0 -endpos 04:15:00 -o VHS.avi
...employing a bit of horizontal compression with a video filter (-vf scale) and modest down-sampling of the stk1160's 48000 Hz audio output. Sadly my stock mencoder does not have aac compiled in, thus I rely on mp3 here, with an average bitrate of ~140 kbit/s. -endpos is 4hrs+ for an E240 tape.
I've found that capturing to mjpeg seems to be quite stable, more than to ffvhuff, and seems to be a good trade-off between file-size and quality. I'm still unsure if it properly handles interlaced material.
(4) In order to keep compressed files as optimized as possible, I've experimented quite a bit with masking the overscan area with black. This way I can keep invisible and useless noise down to a minimum without cropping or altering the aspect ratio (means: without having to do the math to get that right).
This here refers to MEncoder svn r34540 (Ubuntu), built with gcc-4.6. and avconv version 0.8.6-4:0.8.6-0ubuntu0.12.04.1
(5) If you got tired of mencoder, try vlc: In Capturing VHS video with vlc via v4l2 I present an alternative (better) capture pipeline, with vlc instead of mencoder as the main front-end