How to encode Multi-bitrate videos in MPEG-DASH for MSE based media players (1/2)

Introduction

Online video streaming best practices have evolved significantly since the introduction of the HTML5 <video> tag in 2008. To overcome the limitations of progressive download streaming, online video industry leaders created proprietary Adaptive Bitrate Streaming formats such as Microsoft Smooth Streaming, Apple HLS and Adobe HDS. In recent years, a new standard has emerged to replace these legacy formats and unify the video delivery workflow: MPEG-DASH.

In this article, we’d like to talk about why Adaptive Bitrate Streaming technology is a must-have for any VOD or Live online publisher, and how to encode multi-bitrate MP4 video files with FFmpeg so that they are compatible with MPEG-DASH streaming. In a subsequent post, we will show you how to package videos and generate the MPEG-DASH .mpd manifests that will allow you to deliver high quality adaptive streaming to your users.

Advantages of adaptive streaming

Adaptive streaming technologies have become so popular because they greatly enhance the end-user experience: the video quality dynamically adjusts to the viewer’s network conditions, to deliver the best possible quality the user can receive at any given moment. It reduces buffering and optimizes delivery across a wide range of devices.

Adaptive streaming schema

But Adaptive Bitrate Streaming is trickier on the back-end side: distributors need to re-encode videos into multiple qualities and create a manifest file that lists the location of the segments for each quality, which the user’s player then uses to fetch the segments it needs. The good news is that this guide will show you how to do it the right way for MPEG-DASH!

Tools

As we saw before, there are two steps to providing adaptive streaming. First, encode your file into different qualities, and second, divide those files into segments and create the manifest that will link to the files.

Today we’ll take a look at how to properly re-encode your files into different qualities. To do so, we’ll use the well-known video tool FFmpeg, the Swiss army knife of video encoding and packaging. FFmpeg can be downloaded from the project’s website.

For the second step, you can use GPAC’s MP4Box, or do it on-the-fly with a Media Streaming Server such as Wowza, USP or Mist Server. We’ll provide an overview of the different solutions in the second post on this subject.

Video encoding: the importance of I-Frames

Although the MPEG-DASH standard is codec agnostic, we will encode our videos with the H.264 and AAC codecs in fMP4 packaging, which is the most commonly supported combination in today’s browsers.

The biggest trick in encoding is to align the I-frames between all the qualities. In encoding terminology, I-frames are the frames that can be decoded without reference to any other frame of the video. Because they carry all of the pixel information for their image, I-frames take up a lot more space and are much less frequent than the other frame types. With the default encoder parameters, they also do not necessarily fall at the same positions across different qualities. This matters because, after encoding, we will need to divide the videos into short segments, and each segment has to start with an I-frame. If the I-frames are not aligned between the different qualities, the segment lengths will not match, which makes quality switching impossible.

To ensure that users will be able to switch between the different qualities without issues, we need to force I-frames at regular intervals while we encode the video into each quality.

FFmpeg command line

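The cleanest way to force I-frame positions using FFmpeg is to use the -x264opts 'keyint=24:min-keyint=24:no-scenecut' argument. If your shell does not strip single quotes (the Windows command prompt, for example), pass the value without the quotes.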

  • -x264opts allows you to pass additional options to the x264 encoding library.
  • keyint sets the maximum GOP (Group of Pictures) size, that is, the maximum number of frames from one I-frame to the next.
  • min-keyint sets the minimum GOP size.
  • no-scenecut disables the extra key-frames that x264 would otherwise insert at scene cuts.
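
As a rough sketch of how these options fit together, the helper below builds the -x264opts value from a framerate and a desired I-frame interval in seconds. The function name x264_gop_opts and its two arguments are illustrative and are not part of FFmpeg or x264, and the arithmetic assumes an integer framerate.

```shell
# x264_gop_opts FPS SECONDS
# Prints an -x264opts value that sets the GOP to FPS * SECONDS frames.
# Hypothetical helper: the name and arguments are not part of FFmpeg.
x264_gop_opts() {
  gop=$(( $1 * $2 ))   # GOP size in frames (integer framerates only)
  # keyint = maximum GOP size, min-keyint = minimum GOP size.
  # With scene-cut detection disabled, an I-frame lands every $gop frames.
  printf 'keyint=%s:min-keyint=%s:no-scenecut\n' "$gop" "$gop"
}
```

For a 24 fps source with one I-frame per second, x264_gop_opts 24 1 prints keyint=24:min-keyint=24:no-scenecut, which is the value used in this article.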

Let’s look at an example of encoding a 720p file with FFmpeg, while forcing I-Frames.
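
The command below is the one readers quote in the comments on this post, reproduced with plain ASCII quotes and wrapped in a small function so that it can be reused. The function name encode_720p is illustrative. Note that libfdk_aac is only available in FFmpeg builds compiled with that library; FFmpeg's built-in aac encoder is a common substitute.

```shell
# encode_720p INPUT OUTPUT
# Encodes INPUT to 720p H.264/AAC with an I-frame forced every 24 frames.
# The function name is illustrative; the ffmpeg options are the ones
# quoted in the comments on this post.
encode_720p() {
  ffmpeg -y -i "$1" \
    -c:a libfdk_aac -ac 2 -ab 128k \
    -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' \
    -b:v 1500k -maxrate 1500k -bufsize 1000k \
    -vf "scale=-1:720" "$2"
}
```

It would be called as encode_720p inputfile outputfile.mp4. If the scaled width comes out odd for your source, scale=-2:720 keeps the aspect ratio while rounding the width to an even number.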

The command does not set the framerate itself, so it assumes a source at 24 frames per second. With a GOP size of 24 frames, our I-frames are therefore located every second. Using the same command with a different bitrate and resolution, we can create files of three different qualities with the same I-frame positions.
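
A minimal sketch of such a set of encodes follows. The function names encode_rendition and encode_ladder, the output file naming, and the 360p and 480p bitrate and buffer values are assumptions made for illustration; only the 720p values come from this article's command.

```shell
# encode_rendition INPUT HEIGHT BITRATE_KBPS BUFSIZE_KBPS
# Uses the same GOP settings for every rendition, so the I-frames fall
# on the same frame numbers in each output file.
encode_rendition() {
  # scale=-2 keeps the aspect ratio and rounds the width to an even number.
  ffmpeg -y -i "$1" \
    -c:a libfdk_aac -ac 2 -ab 128k \
    -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' \
    -b:v "${3}k" -maxrate "${3}k" -bufsize "${4}k" \
    -vf "scale=-2:$2" "outputfile$2.mp4"
}

# encode_ladder INPUT
# Illustrative three-step ladder; only the 720p row comes from the article.
encode_ladder() {
  encode_rendition "$1" 360 400 300 &&
  encode_rendition "$1" 480 800 600 &&
  encode_rendition "$1" 720 1500 1000
}
```

Calling encode_ladder inputfile would produce outputfile360.mp4, outputfile480.mp4 and outputfile720.mp4. Only the bitrate, buffer size and height change between the three calls; the GOP options stay identical.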

To choose the value of the GOP size, you’ll need to take into account the length of the segments you want to generate: segment length * framerate (the number of frames in a segment) has to be a multiple of the GOP size.

Example: if the framerate is 24 and you want 2-second segments, each segment holds 48 frames, so the GOP size can be 48 or 24, for example. Be aware that if your GOP size is too big, seeking might not work properly in some players: as with quality switching, the player has to find an I-frame before it can resume playback.
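
The rule above can be checked with a few lines of shell arithmetic. This is only a sketch: the helper name gop_fits_segment is hypothetical, and it assumes the segment length and the framerate are both integers.

```shell
# gop_fits_segment SEGMENT_SECONDS FPS GOP
# Succeeds (exit status 0) when SEGMENT_SECONDS * FPS is a multiple of GOP.
# Hypothetical helper; assumes integer segment lengths and framerates.
gop_fits_segment() {
  frames=$(( $1 * $2 ))          # number of frames in one segment
  [ $(( frames % $3 )) -eq 0 ]   # true when the GOP divides the segment evenly
}
```

With 2-second segments at 24 fps, gop_fits_segment 2 24 24 and gop_fits_segment 2 24 48 both succeed, while gop_fits_segment 2 24 36 fails because 48 is not a multiple of 36.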

To learn more about H.264 encoding with FFmpeg, check out the FFmpeg H.264 encoding guide.

Conclusion

Congratulations! You’ve successfully encoded your video into different qualities with aligned I-frames. Now, simply fragment the files into video segments and generate the MPEG-DASH manifest file. We’ll show you how to do this in our next blog post, so stay tuned!

19 thoughts on “How to encode Multi-bitrate videos in MPEG-DASH for MSE based media players (1/2)”

  1. Hi, I need to create a DASH stream with demuxed video and audio files. Do I have to encode the file before demuxing it? Thanks.

  2. In your command
    ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 1500k -maxrate 1500k -bufsize 1000k -vf "scale=-1:720" outputfile.mp4
    which option is about “framerate of 24 images per second”?
    Thanks a lot.

  3. Very good article.
    But I have following question.
    In your command
    ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts 'keyint=24:min-keyint=24:no-scenecut' -b:v 1500k -maxrate 1500k -bufsize 1000k -vf "scale=-1:720" outputfile.mp4
    which option is about “framerate of 24 images per second”?

  4. Pingback: Encoding Videos for MPEG-DASH – Bachelor Thesis

  5. I tried this with two versions of ffmpeg: a downloaded one, which didn’t support libfdk_aac, and one I compiled myself, which fell over with:

    No pixel format specified, yuv420p10le for H.264 encoding chosen.
    Use -pix_fmt yuv420p for compatibility with outdated media players.
    [libx264 @ 0000000000408460] bad option ''keyint': '24'

    You could do with adding a lot more about getting an ffmpeg that actually works.

    1. Hello,

      you need to remove the quotes around the x264 options:

      ffmpeg -y -i inputfile -c:a libfdk_aac -ac 2 -ab 128k -c:v libx264 -x264opts keyint=24:min-keyint=24:no-scenecut -b:v 1500k -maxrate 1500k -bufsize 1000k -vf "scale=-1:720" outputfile720.mp4

  6. Pingback: Check key-frame alignment with MP4Box | GPAC

  7. Hi Vineet,

    Thanks for your comment, there was indeed some confusion in the post. The segment length (in frames) has to be a multiple of the GOP size, so for your example, you can have a GOP size of 30, 60, 90 or 180 frames. My suggestion would be a GOP size of 30 frames (an I-frame every second).

  8. Hi,
    these two statements are contradictory

    1) the segment length * framerate has to be lower than the GOP size

    2) if the framerate is 24 and you want 2-seconds segment, the GOP size needs to be < 48

    which one is correct one?
    What I need is the segment length of 6 seconds with framerate of 30000/1001
    so my GOP should be greater than 150 (6*30) or less than 150 ?

    thanks,

  9. I was unable to use the above scripts to encode a valid DASH mp4 in terms of I-frames. The -g value is either deprecated or does not enforce a strict GOP length in all cases, and I ended up with MP4Box reporting:
    [DASH]: Segment duration variation is higher than the +/- 50% allowed by DASH-IF (min 0.083, max 1.088) – please reconsider encoding

    By instead using the x264 options directly in the FFmpeg command, I was able to get a predictable and uniform GOP length (otherwise known as I-frame distance):
    -c:v libx264 -x264opts keyint=24:min-keyint=24:scenecut=-1

    1. Hello,

      Thanks a lot for your comment! Indeed with some versions of FFMPEG, the -g option is not enough. We added the x264 options to the tutorial. Thanks again for the feedback

  10. When using Wowza with DASH, do I need to use a .smil file to ensure that users will be able to switch between the different qualities?
