Online video streaming best practices have evolved significantly since the introduction of the HTML5 <video> tag in 2008. To overcome the limitations of progressive download, online video industry leaders created proprietary Adaptive Bitrate Streaming formats such as Microsoft Smooth Streaming, Apple HLS and Adobe HDS. In recent years, a new standard has emerged to replace these legacy formats and unify the video delivery workflow: MPEG-DASH.
In this article, we’d like to talk about why Adaptive Bitrate Streaming technology is a must-have for any VOD or live online publisher, and how to encode multi-bitrate MP4 video files with FFmpeg so that they are compatible with MPEG-DASH streaming. In a subsequent post, we will show you how to package videos and generate the MPEG-DASH .mpd manifests that will allow you to deliver high-quality adaptive streaming to your users.
Advantages of adaptive streaming
Adaptive streaming technologies have become so popular because they greatly enhance the end-user experience: the video quality dynamically adjusts to the viewer’s network conditions to deliver the best possible quality the user can receive at any given moment. This reduces buffering and optimizes delivery across a wide range of devices.
But Adaptive Bitrate Streaming is trickier on the back-end side: distributors need to re-encode videos into multiple qualities and create a manifest file that tells the player where to find the segments of each quality, so that it can request the ones it needs. The good news is that this guide will show you how to do it the right way for MPEG-DASH!
As noted above, there are two steps to providing adaptive streaming. First, encode your file into different qualities; second, divide those files into segments and create the manifest that links to them.
Today we’ll take a look at how to properly re-encode your files into different qualities. To do so, we’ll use FFmpeg, a well-known tool often described as the Swiss army knife of video encoding and packaging. You will need to download and install FFmpeg to follow along.
For the second step, you can use GPAC’s MP4Box, or do it on-the-fly with a Media Streaming Server such as Wowza, USP or Mist Server. We’ll provide an overview of the different solutions in the second post on this subject.
Video encoding: the importance of I-Frames
Even though the MPEG-DASH standard is codec-agnostic, we will encode our videos with the H.264 and AAC codecs in fMP4 packaging, which is the most commonly supported format in today’s browsers.
The biggest trick in encoding is to align the I-frames across all the qualities. In encoding terminology, I-frames are frames that can be decoded without reference to any other frame of the video. Because they carry all of the information about the pixels in the image, I-frames take up much more space and are much less frequent than the other frame types. With the default encoder parameters, they also do not necessarily fall at the same positions in different qualities. This is a problem because, after encoding, we need to divide the videos into short segments, and each segment has to start with an I-frame. If the I-frames are not aligned across qualities, the segment boundaries will not match, and the player cannot switch cleanly from one quality to another.
To ensure that users will be able to switch between the different qualities without issues, we need to force I-frames at regular intervals while we encode the video into each quality.
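Once you have a set of encodes, one way to check this requirement is to list the key frame timestamps of each file and compare them. The sketch below assumes that ffprobe (shipped with FFmpeg) is on your PATH. The function names and file names are placeholders of our own, not part of FFmpeg.

```shell
#!/bin/sh
# Print the timestamp (in seconds) of every key frame in the first video
# stream of a file, one per line. Assumes ffprobe is installed.
list_keyframes() {
    ffprobe -v error -select_streams v:0 -skip_frame nokey \
        -show_entries frame=best_effort_timestamp_time \
        -of csv=p=0 "$1"
}

# Succeed (exit status 0) when two timestamp lists are identical.
same_keyframes() {
    cmp -s "$1" "$2"
}

# Example usage (file names are hypothetical):
#   list_keyframes outputfile720.mp4 > kf720.txt
#   list_keyframes outputfile360.mp4 > kf360.txt
#   same_keyframes kf720.txt kf360.txt && echo "aligned" || echo "NOT aligned"
```

If the two lists differ, the encodes were most likely produced with different GOP settings or from sources with different frame rates.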
FFmpeg command line
The cleanest way to force I-frame positions using FFmpeg is to use the -x264opts 'keyint=24:min-keyint=24:no-scenecut' argument.
- -x264opts allows you to pass additional options to the x264 encoding library.
- keyint sets the maximum GOP (Group of Pictures) size, that is, the maximum number of frames between two I-frames.
- min-keyint sets the minimum GOP size.
- no-scenecut disables the extra key frames that x264 would otherwise insert at scene cuts.
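Since keyint and min-keyint are expressed in frames, the right value depends on your frame rate. As a minimal sketch, a small helper can build the option string from a frame rate and the desired interval between I-frames. The function name x264_gop_opts is our own, and it assumes a constant, integer frame rate.

```shell
#!/bin/sh
# Build the x264 option string for a fixed GOP size.
#   $1 = frame rate in frames per second (integer, assumed constant)
#   $2 = desired interval between I-frames, in seconds (integer)
x264_gop_opts() {
    gop=$(( $1 * $2 ))
    printf 'keyint=%d:min-keyint=%d:no-scenecut\n' "$gop" "$gop"
}

x264_gop_opts 24 1   # prints keyint=24:min-keyint=24:no-scenecut
x264_gop_opts 30 2   # prints keyint=60:min-keyint=60:no-scenecut
```

Setting keyint and min-keyint to the same value is what makes the GOP size fixed rather than merely bounded.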
Let’s look at an example of encoding a 720p file with FFmpeg, while forcing I-Frames.
ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 1500k -maxrate 1500k -bufsize 1000k -vf "scale=-2:720" outputfile.mp4
This example assumes a source frame rate of 24 frames per second: with a GOP size of 24 frames, an I-frame falls exactly every second. If your source has a different frame rate, adjust keyint and min-keyint accordingly. Using the same command with different bitrates and resolutions, we can create files in three different qualities with the same I-frame positions:
ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 1500k -maxrate 1500k -bufsize 1000k -vf "scale=-2:720" outputfile720.mp4
ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 800k -maxrate 800k -bufsize 500k -vf "scale=-2:540" outputfile540.mp4
ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 400k -maxrate 400k -bufsize 400k -vf "scale=-2:360" outputfile360.mp4
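Because the three commands differ only in height, bitrate and buffer size, they can also be generated from a small table. The following is a sketch, not a production script: the encode_ladder function and the DRY_RUN variable are names of our own, and the ladder values are copied from the commands in this article. It uses scale=-2 so that the computed width is always even, which libx264 needs for its default pixel format. By default it only prints the commands (the printed form drops the shell quoting); set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Print (or run) one ffmpeg command per rendition.
# Each ladder entry is "height video_bitrate bufsize".
# Set DRY_RUN=0 to actually run ffmpeg; by default commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

encode_ladder() {
    input=$1
    while read -r height bitrate bufsize; do
        set -- ffmpeg -y -i "$input" \
            -c:a libfdk_aac -ac 2 -ab 128k \
            -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' \
            -b:v "$bitrate" -maxrate "$bitrate" -bufsize "$bufsize" \
            -vf "scale=-2:$height" "outputfile$height.mp4"
        if [ "$DRY_RUN" = 1 ]; then
            echo "$*"
        else
            "$@" < /dev/null   # keep ffmpeg from consuming the ladder on stdin
        fi
    done <<EOF
720 1500k 1000k
540 800k 500k
360 400k 400k
EOF
}

encode_ladder inputfile
```

Keeping the -x264opts string identical for every rendition is the point of the loop: only the values that are allowed to differ come from the table.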
To choose the GOP size, you’ll need to take into account the length of the segments you want to generate: segment length * frame rate has to be a multiple of the GOP size.
Example: if the frame rate is 24 and you want 2-second segments, a GOP size of 48 or 24 works. Keep in mind that if your GOP size is too large, seeking might not work properly in some players: as with quality switching, the player has to find an I-frame before it can resume playback.
To learn more about H.264 encoding with FFmpeg, check out the FFmpeg H.264 encoding guide.
Congratulations! You’ve successfully encoded your video into different qualities with aligned I-frames. Now, simply fragment the files into video segments and generate the MPEG-DASH manifest file. We’ll show you how to do this in our next blog post, so stay tuned!
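The rule above is simple divisibility, so the candidates can be enumerated. As an illustration, the helper below (valid_gop_sizes is a name of our own) lists every GOP size that divides one segment evenly, assuming an integer frame rate and segment length.

```shell
#!/bin/sh
# List every GOP size (in frames) that divides one segment evenly.
#   $1 = frame rate in frames per second (integer)
#   $2 = segment length in seconds (integer)
valid_gop_sizes() {
    frames=$(( $1 * $2 ))
    gop=1
    while [ "$gop" -le "$frames" ]; do
        [ $(( frames % gop )) -eq 0 ] && printf '%d\n' "$gop"
        gop=$(( gop + 1 ))
    done
}

valid_gop_sizes 24 2 | tr '\n' ' '   # prints: 1 2 3 4 6 8 12 16 24 48
echo
```

The small values are valid but rarely useful: since I-frames take up much more space than other frames, a very short GOP spends much of the bitrate on I-frames. In practice you would pick one of the larger divisors, such as 24 or 48 here.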