One thing I've overlooked for quite a while is that VHS is interlaced and that the x264 encoder does not handle this correctly automagically when writing mp4. That means you have to tell avconv or ffmpeg to pass a switch to the x264 lib that turns interlaced mode on, so the video is actually stored interlaced.
As it seems, interlaced storage in x264 arrived around 2001. Sadly, the avconv man page mentions interlaced support in mp4 only in relation to the -ilme flag, which is somewhat incomplete. The libav docs elaborate more on this: ilme stands for "interlaced motion estimation", which only makes sense in combination with ildct, "use interlaced DCT". That's why most of the commands you find on the web regarding x264 and interlaced carry a -flags option: "-flags +ilme+ildct". This is how you set these switches with avconv; simply adding -interlaced, -ilme, or -ildct does not work.
Now, in my specific scenario, I was compressing captured VHS video to mp4 in a two-step process. First, I digitized analog interlaced VHS with an EasyCap USB stick to an uncompressed mezzanine format and container, in my case AVI with huffyuv video and pcm16 wav audio streams, using vlc as the capture front-end (formerly I'd used mencoder).
With vlc switched into PAL video mode, and huffyuv as the codec, the material ended up as properly interlaced encoded video. vlc told me so by spitting out:
[huffyuv @ 0x7f18ec0eae40] using huffyuv 2.2.0 or newer interlacing flag
Now, the thing with interlaced video is that it's hard to determine whether it is really interlaced or not. I assumed my whole pipeline would simply work, but now I'm 90% convinced it doesn't. vlc automagically knew it was handling interlaced video because it sensed the PAL flag, but when I used avconv to compress the mezzanine file to mp4, avconv didn't know the video was interlaced.
I don't know if it is possible to store interlaced video as progressive frames, but from looking at the video resulting from encodes without proper interlace flagging, it seems so. The encode seems to weave the two fields (half-frames) together, with the typical "fork" patterns on moving image details, and then store this combined picture as one progressive frame.
If that's the case, it's not very good. With image compression, the "fork" patterns would blur into each other and we would lose proper field separation. That's not very noticeable, especially if you use a de-interlace filter upon playback, but quality is lost.
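Now to the meat of this post: here's an avconv command that successfully passes the interlaced flag to libx264, complete with a "masking" filter to black out noise in the VHS overscan areas (5 px on the left, 4 px on the right, 5 px at the bottom of the 720x576 frame) and what I meant as an "above average/high-quality" setting (-q:v 3) for video. Note that the x264 settings dump further down reports rc=crf with crf=23.0, so that quality switch may not actually have reached libx264: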
$ avconv -i VHS_raw.avi -vf "drawbox=0:0:5:576:black@1, drawbox=716:0:4:576:black@1, drawbox=0:571:720:5:black@1" -vcodec libx264 -q:v 3 -flags +ilme+ildct -acodec aac -b:a 140k -strict -2 -aspect 4:3 VHS.mp4
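The three drawbox entries are easy to get wrong when the border widths change, so here is a minimal sketch of how the filter string could be generated instead of typed by hand. mask_filter is a hypothetical helper of mine, not part of avconv; it only prints the string, using the same 720x576 frame and border widths as in the command above.

```shell
# Hypothetical helper (not part of avconv): build the drawbox mask string.
# Arguments: frame width, frame height, left, right and bottom border in px.
mask_filter() {
    w=$1 h=$2 left=$3 right=$4 bottom=$5
    # left border: starts at x=0, full frame height
    printf 'drawbox=0:0:%d:%d:black@1, ' "$left" "$h"
    # right border: starts "right" pixels before the frame edge
    printf 'drawbox=%d:0:%d:%d:black@1, ' "$((w - right))" "$right" "$h"
    # bottom border: starts "bottom" pixels above the frame edge
    printf 'drawbox=0:%d:%d:%d:black@1\n' "$((h - bottom))" "$w" "$bottom"
}

# Same values as in the command above (PAL frame, 5/4/5 px borders)
mask_filter 720 576 5 4 5
```

The output can then be dropped into the command via -vf "$(mask_filter 720 576 5 4 5)".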
The avconv command was run on a mid-2015 Ubuntu 14.04 install, with avconv and the underlying libs in these versions:
ffmpeg version 1.2.6-7:1.2.6-1~trusty1 Copyright (c) 2000-2014 the FFmpeg developers
built on Apr 26 2014 18:52:58 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
configuration: --arch=amd64 --disable-stripping --enable-avresample --enable-pthreads --enable-runtime-cpudetect --extra-version='7:1.2.6-1~trusty1' --libdir=/usr/lib/x86_64-linux-gnu --prefix=/usr --enable-bzlib --enable-libdc1394 --enable-libfreetype --enable-frei0r --enable-gnutls --enable-libgsm --enable-libmp3lame --enable-librtmp --enable-libopencv --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-vaapi --enable-vdpau --enable-libvorbis --enable-libvpx --enable-zlib --enable-gpl --enable-postproc --enable-libcdio --enable-x11grab --enable-libx264 --shlibdir=/usr/lib/x86_64-linux-gnu --enable-shared --disable-static
libavutil 52. 18.100 / 52. 18.100
libavcodec 54. 92.100 / 54. 92.100
libavformat 54. 63.104 / 54. 63.104
libavdevice 53. 5.103 / 53. 5.103
libavfilter 3. 42.103 / 3. 42.103
libswscale 2. 2.100 / 2. 2.100
libswresample 0. 17.102 / 0. 17.102
libpostproc 52. 2.100 / 52. 2.100
As you can see, I pass the flags +ildct and +ilme here, and if that's working, the encoder should store properly separated fields (ildct) while also using interlace-aware motion estimation (ilme). I've read somewhere that interlaced mode in mp4 is less optimal size-wise ("...inherently less efficient than progressive encoding.") than progressive, but I want to stay true to the video source material. I've also tested whether there's a difference by running a second encoding session, this time without the interlaced switches. Results: 1. speed: Kb/s was roughly the same, though I didn't really clock it; 2. the resulting file size was nearly identical. So I don't see where this "less optimal" notion comes from.
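"Nearly identical" is a bit hand-wavy, so here is a small sketch for putting a number on it. size_diff_pct is a hypothetical helper and the two file names in the usage comment are placeholders for the interlaced and the non-interlaced encode; it simply compares byte counts.

```shell
# Hypothetical helper: by how many percent is file $1 larger than file $2?
# A negative result means $1 is smaller. Assumes $2 is not empty.
size_diff_pct() {
    a=$(wc -c < "$1")
    b=$(wc -c < "$2")
    awk -v a="$a" -v b="$b" 'BEGIN { printf "%.2f\n", (a - b) * 100 / b }'
}

# Usage (placeholder file names):
#   size_diff_pct VHS_interlaced.mp4 VHS_progressive.mp4
```

A value close to 0 would back up the "nearly identical" observation for that particular tape; other material may behave differently.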
Also, I'm not sure about field order. This stackexchange post adds another switch to tell the encoder to work "bottom field first" by adding "-x264opts -bff=1", but I don't know right now whether interlaced video is top field first by standard or the other way round. I also don't know whether I really have to tell avconv anything specific about it. I guess in this case I simply rely on everything working automagically again...
Anyway, when you fire up avconv with the -flags option, it usually spits out the current encoder settings, like this:
[libx264 @ 0xea14e0] interlace + weightp is not implemented
[libx264 @ 0xea14e0] using SAR=16/15
[libx264 @ 0xea14e0] using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64 SlowShuffle
[libx264 @ 0xea14e0] profile High 4:2:2, level 3.0, 4:2:2 8-bit
[libx264 @ 0xea14e0] 264 - core 142 r2389 956c8d8 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=tff bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=0 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Look for the "interlaced=tff" bit: it seems the default is "top field first" (tff). If you don't pass the +ildct+ilme switches, this reads "interlaced=0" instead.
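I don't know if the note "interlace + weightp is not implemented" should worry me. The settings dump shows weightp=0, so it looks like x264 simply turned weighted P-frame prediction off for this encode.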
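Since the settings dump is one very long line, the interlaced= bit is easy to miss. Here is a minimal sketch that pulls just that value out of the log. x264_interlaced is a hypothetical helper name, and the sample line is a shortened excerpt of the dump above.

```shell
# Hypothetical helper: print the value of "interlaced=" from an x264
# settings dump read on stdin (tff, bff or 0). Prints nothing if the
# dump line is not present.
x264_interlaced() {
    sed -n 's/.* interlaced=\([^ ]*\).*/\1/p'
}

# Shortened excerpt of the dump shown above
echo 'nr=0 decimate=1 interlaced=tff bluray_compat=0' | x264_interlaced
```

On a real run the log would have to be piped in from the encoder's stderr, for example by appending "2>&1 | x264_interlaced" to the encode command (I'm assuming here that the dump goes to stderr, as it did on my system).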
The video resulting from the above command came down to 1.5 GB for 30 minutes of video. The "lossless" digitized mezzanine huffyuv+wav file was 20 GB. The interlaced flag seems to have worked its magic, as the x264 report noted "field mbs: intra: 85.4% inter:55.6% skip:43.5%". Although I couldn't really spot any visual difference compared to my formerly "wrongly" encoded non-interlaced x264-mp4 videos, treating interlaced video correctly from now on gives some peace of mind.
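To put those sizes into perspective, here is a rough back-of-the-envelope sketch that turns file size and duration into an average overall bitrate (video plus audio). avg_mbit is a hypothetical helper, and I'm assuming decimal gigabytes (1 GB = 10^9 bytes).

```shell
# Hypothetical helper: average bitrate in Mbit/s for a file of $1 GB
# (decimal, 10^9 bytes) that runs for $2 minutes.
avg_mbit() {
    awk -v gb="$1" -v min="$2" \
        'BEGIN { printf "%.1f\n", gb * 1e9 * 8 / (min * 60) / 1e6 }'
}

avg_mbit 1.5 30   # the mp4
avg_mbit 20 30    # the huffyuv+wav mezzanine file
```

Under that assumption the mp4 averages roughly 6.7 Mbit/s against roughly 88.9 Mbit/s for the mezzanine file, a reduction of about 13x.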
Footnote: the gist from an interesting post: Re: dv => mp4: deinterlace or not, and how?:
"If your display device is either a CRT TV or you trust your display's built-in deinterlacer, and it's able to properly display a video flagged as being interlaced - encode interlaced..." - which is what we do -
"... if you do want to de-interlace, use avconv's/ffmpeg's yadif video filter, "-vf yadif" instead of the "-deinterlace" flag, as the latter would default to something inferior."