By uav-jp, 04/11/2022

[4-part series / Part 1] Media services - Latency in live video streaming

Demand for live video distribution, such as sports broadcasts, games, news, and TV programs, keeps growing, yet many teams struggle with delivery delays and with choosing the best service. This four-part series covers common issues and solutions to consider when evaluating media services. The first theme is the definition and measurement of latency (delivery delay).

Part 1: Latency definition and measurement (this article)
Part 2: Recommended optimizations for encoding, packaging, and CDN delivery
Part 3: Recommended optimizations for video players
Part 4: Reference architecture and test results

Part 1: Defining and Measuring Latency

Why is latency an issue in live video streaming? Whether you deliver TV content like sports, games, and news, or pure OTT content like eSports and gambling, when your content is time sensitive you cannot afford to be late. The longer viewers have to wait, the less engaging the experience becomes; waiting relegates you to second-class status in the world of entertainment and real-time information. A good example is watching a soccer match: suppose your neighbor watches the game on TV in the traditional way and yells through the wall whenever his favorite team (often also yours) scores a goal, while your OTT service makes you wait 25 or 30 seconds before showing the same action. This can be very frustrating, akin to having the results of your favorite singing contest spoiled by the Twitter or Facebook feeds you monitor alongside your streaming channel. These social feeds are typically generated by users watching the show on TV, which shrinks the effective gap to 15-20 seconds, but that still leaves you well behind live TV.

In addition to broadcast latency and social network competition, there are other reasons why content providers want to minimize live streaming latency. Old Flash-based applications using RTMP streaming performed well in terms of latency, but Flash usage in web browsers is declining and CDNs are dropping RTMP support on the delivery side, so content providers must switch to HTML5-friendly streaming technologies such as HLS and DASH, or more recently CMAF. Other content providers want to build personal broadcasting services with interactivity, where a 30-second delay in the video signal is unacceptable. And those developing synchronized second-screen, social watching, or gambling applications need fine-grained control over streaming latency.

When it comes to latency, three categories are usually defined, each with upper and lower bounds. These do not map exactly onto broadcast latencies, which range from 3 to 12 seconds depending on the transport medium (cable/IPTV/DTT/satellite) and the specifics of each distribution network topology; a broadcast latency of around 6 seconds is a common average in this field. In other words, the OTT sweet spot sits somewhere between the low range of the "reduced latency" category and the high range of the "low latency" category. Approaching 5 seconds maximizes your chances of competing effectively against broadcast and against spoilers from social network feeds. Additionally, depending on the position of the OTT encoder within the content preparation workflow, the latency-reduction goal becomes more demanding the further downstream the encoder sits in the chain.

Vocabulary           High (seconds)   Low (seconds)
Reduced Latency      18               6
Low Latency          6                2
Ultra Low Latency    2                0.2

In HTTP-based streaming, latency mainly depends on the length of the media segments. If segments are 6 seconds long, the player is already at least 6 seconds behind actual wall-clock time when it requests its first segment, and many players download additional media segments into a buffer before actually starting playback, which pushes the first decoded video frame back further. Of course, other factors contribute to latency as well, such as the duration of the video encoding pipeline, ingest and packaging operations, network propagation delays, and CDN buffers. But most of the time the player accounts for the largest share of the total latency: in practice, most players use a conservative heuristic and buffer three or more segments before starting playback.
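As a rough illustration, the latency floor implied by segment length and buffer depth can be sketched as follows (a simplification assuming the three-segment heuristic mentioned above; real players vary, and encoding, packaging, and network delays come on top):

```python
# Rough floor on player-induced latency in HTTP streaming, assuming the
# common heuristic of buffering 3 segments before starting playback.
# Illustrative sketch only; actual player buffering policies differ.

def player_latency_floor(segment_seconds: float, buffered_segments: int = 3) -> float:
    """Minimum latency contributed by the player's segment buffer."""
    return segment_seconds * buffered_segments

# A 6-second segment with a 3-segment buffer already implies 18 seconds
# of latency before any other stage of the chain is counted.
print(player_latency_floor(6))  # 18
print(player_latency_floor(2))  # 6
print(player_latency_floor(1))  # 3
```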

With Microsoft Smooth Streaming, a typical segment length is 2 seconds, and Silverlight players typically show a latency of around 10 seconds. DASH is much the same: most players support 2-second segments, with variable results in terms of latency. But the situation was quite different with HLS. Until mid-2016, Apple's recommendation was to use 10-second segments, which resulted in around 30 seconds of latency on most HLS players, including Apple's own. In August 2016, Apple's Technical Note TN2224 lowered the recommended target duration from 10 seconds to 6 seconds, while noting that it did not expect everyone to suddenly re-segment all existing content; with players buffering three segments, 4 fewer seconds per segment translates into roughly 12 seconds less latency on screen. Most content producers nevertheless followed Apple's original recommendation, even though shorter segment lengths would work, because they wanted to avoid the risk of rejection when submitting their iOS applications to the App Store. But three recent evolutions in Safari Mobile on iOS 11 changed the game: live HLS streams can now start automatically, support for short segment durations is greatly improved, and FairPlay DRM is supported. This means content producers who do not need a compiled iOS app can deliver studio-approved, DRM-protected streams while using short media segments to reduce live latency.

One might argue that short media segments put a lot of strain on CDNs and players, but Microsoft Smooth Streaming has made use of 2-second segments for many years. The next step in closing the latency gap with broadcast is to move to 1-second segments, and this does not create any real bottleneck. Halving segment length does double the number of requests, with all the associated HTTP overhead in headers and TCP connections, but a CDN (especially one whose edge supports HTTP/2 alongside HTTP/1.1, as Amazon CloudFront does) makes this much more manageable. Modern players also benefit from higher-bandwidth last-mile connectivity through fiber, 4G/LTE, DOCSIS 3.1 Full Duplex, and other recent advances. Experimentation shows that 1- and 2-second segments are already supported by many players, opening up many new options for reducing latency. Finally, for both HLS and DASH, short segments are generally not an issue for the encoders, packagers, and origin services across the chain.

Apart from App Store requirements that still impose a 6-second segment duration, content producers can therefore experiment with 1- or 2-second media segments across a wide variety of players on all platforms and expect latency equal to or better than broadcast.

At a high level, the sections that follow walk through the main operations required to bring a streaming solution into the "low latency" category, combining AWS Elemental video solutions with any open-source or commercial player available today.

How to Measure Latency

The first step in the latency optimization process is knowing which components in the chain contribute what share of the total latency. This guides your optimization priorities, whether in the encoding, packaging, or playback stages of the workflow. Let's start by measuring end-to-end latency.

The easiest way to measure end-to-end latency is to run a clapperboard application on a tablet, film it with a camera connected to your encoder, publish the stream to your origin, and deliver it through a CDN to the player. Place the player next to the clapperboard tablet, photograph the two screens, and subtract the two displayed timecodes to get the number. Run this a few times to make sure it accurately represents your workflow latency.

Alternatively, use an AWS Elemental Live encoder with a looped file source, burn the encoder time into the video as an overlay (for encoders synchronized to an NTP reference), and compare the burned-in timecode with an NTP-backed clock service such as time.is displayed in a browser window. In that case, you should typically add about 400 ms for capture latency.


Capture Latency

You can enable AWS Elemental Live timecode burn-in in the preprocessor section of the video encoding parameters. It must be enabled for each bitrate in the encoding ladder.

You need to make sure you have set your encoder to low latency mode. For AWS Elemental Live, this means selecting the "Low Latency Mode" checkbox in the "Additional Global Configuration" section of the input parameters.

Next, set up a UDP/TS encoded event with a 500 ms buffer in the TS output section, with the laptop IP as the destination.

On your laptop, open a network stream in VLC (rtp://192.168.10.62:5011 in this example) with the :network-caching=200 option to use a 200 ms network buffer. You can then calculate the capture latency from a snapshot of the VLC window by comparing the burned-in timecode with the clapperboard timecode.

Even if your tablet cannot sync to NTP, an application such as Emerald Time on iOS can tell you how much your tablet drifts from NTP. In this example the drift is +0.023 seconds, so the clapperboard time is actually 25:00.86 rather than the displayed 25:00.88. The burned-in timecode is 25:01:06 (the last two digits are the frame number), which, because the video was encoded at 24 fps, converts to 25:01.25 in hundredths of a second. The capture latency is therefore 25:01.25 – 25:00.86 = 0.39 seconds. The formula is: capture latency = burned-in timecode (in seconds) – (clapperboard timecode – NTP drift).
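The capture-latency arithmetic above can be sketched as follows (the values and the 24 fps frame rate are taken from the example; the variable names are illustrative):

```python
# Capture-latency calculation from the worked example. The burned-in
# timecode ends in a frame number, so it is converted to fractional
# seconds using the encode frame rate (24 fps here).

def frames_to_seconds(seconds: int, frames: int, fps: int) -> float:
    """Convert a SS:FF timecode tail to fractional seconds."""
    return seconds + frames / fps

burned = frames_to_seconds(1, 6, fps=24)       # 25:01:06 -> 1.25 s past 25:00
clapperboard = 0.88                            # 25:00.88 displayed on the tablet
ntp_drift = 0.023                              # tablet runs 0.023 s ahead of NTP
true_clapperboard = round(clapperboard - ntp_drift, 2)  # 25:00.86

capture_latency = burned - true_clapperboard
print(round(capture_latency, 2))  # 0.39
```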

Encode Latency

You can also use this UDP/TS encode event to calculate the latency introduced by the video encoding pipeline. In this example, we use encoding parameters that produce broadcast-compliant quality for demanding scenes while keeping the induced latency at an acceptable level:

In this case the tablet time is 13:27:19.32 and the VLC time is 13:27:16.75.

Encoding pipeline latency is calculated with the following formula: encode latency = (tablet time – VLC time) – (capture latency + VLC buffer + RTP buffer), i.e. (19.32 – 16.75) – (0.39 + 0.20 + 0.50) = 1.48 seconds.
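The same formula as a runnable sketch, using the measurements from this example (variable names are illustrative):

```python
# Encode-pipeline latency from the example measurements: the gap between
# the tablet clock and the VLC display, minus the known upstream delays.

tablet_time = 19.32   # seconds within 13:27 on the tablet
vlc_time = 16.75      # seconds within 13:27 shown in VLC
capture_latency = 0.39
vlc_buffer = 0.20     # VLC's :network-caching=200 option
rtp_buffer = 0.50     # 500 ms buffer configured on the UDP/TS output

encode_latency = (tablet_time - vlc_time) - (capture_latency + vlc_buffer + rtp_buffer)
print(round(encode_latency, 2))  # 1.48
```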

Ingest Latency

Now that we know the capture and encode latencies, let's look at ingest latency. Ingest latency covers the time required to package the stream into its ingest format and push it to an origin that applies no repackaging to the ingest stream, such as AWS Elemental Delta with a pass-through filter or AWS Elemental MediaStore. Here we use HLS with 1-second segments pushed to AWS Elemental MediaStore.

Use your shell to monitor changes to your origin's HLS child playlists.

$ while sleep 0.01; do curl https://container.mediastore.eu-west-1.amazonaws.com/livehls/index_730.m3u8 && date +"%Y-%m-%d %H:%M:%S,%3N"; done

The shell output shows when the segment “index_73020180223T154954_02190.ts” is referenced for the first time:

#EXTM3U
[…]
index_73020180223T154954_02190.ts
2018-02-23 15:49:55,515

Then download the segment “index_73020180223T154954_02190.ts” and check which timecode it carries: 16:49:53:37 (UTC+1). The difference between the current date and the segment timecode is 55.51 – 53.37 = 2.14 seconds. Subtracting the capture and encode latencies isolates the time required to package the HLS segments and push them to the origin. The formula is: ingest latency = (current date – segment timecode) – (capture latency + encode latency). For AWS Elemental MediaStore, this gives 0.27 seconds. For AWS Elemental Delta, the same calculation yields 0.55 seconds.
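The ingest-latency formula, sketched with the numbers from this example (variable names are illustrative):

```python
# Ingest latency: wall-clock time at which the segment appears in the
# playlist, minus the timecode the segment carries, minus the capture
# and encode latencies measured earlier.

playlist_time = 55.51     # 15:49:55,515 when the segment is first referenced
segment_timecode = 53.37  # 16:49:53:37 (UTC+1), i.e. 15:49:53.37 UTC
capture_latency = 0.39
encode_latency = 1.48

ingest_latency = (playlist_time - segment_timecode) - (capture_latency + encode_latency)
print(round(ingest_latency, 2))  # 0.27 for AWS Elemental MediaStore
```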

Repackaging Latency

By applying the same approach to AWS Elemental Delta and AWS Elemental MediaPackage, and accounting for the previously calculated ingest latency, we can calculate the time required to repackage the ingested stream. The formula is: repackage latency = (current date – segment timecode) – (capture latency + encode latency + ingest latency). When outputting HLS 1-second segments from an HLS 1-second ingest on AWS Elemental MediaPackage, the repackaging latency is (57.34 – 54.58) – (0.39 + 1.48 + 0.55) = 0.34 seconds. For AWS Elemental Delta, the same calculation on its own measurements yields 0.41 seconds.

Delivery Latency

The same approach applies to delivery latency, i.e. the transport time from the origin to the CDN edge. If the origin repackages the stream: delivery latency = (current date – segment timecode) – (capture latency + encode latency + ingest latency + repackage latency). If the origin passes the stream through: delivery latency = (current date – segment timecode) – (capture latency + encode latency + ingest latency). You can measure it by adding an Amazon CloudFront distribution on top of your origin and using the same kind of command line as for the ingest latency calculation. For AWS Elemental MediaStore, it is (52.71 – 50.40) – (0.39 + 1.48 + 0.27) = 0.17 seconds. This delivery latency is essentially the same for all origin types in the same region.
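The pass-through variant of the delivery formula, sketched with the MediaStore numbers from this example (variable names are illustrative):

```python
# Delivery latency for a pass-through origin (AWS Elemental MediaStore
# behind Amazon CloudFront): time observed at the edge minus the segment
# timecode, minus all upstream latencies measured so far.

edge_time = 52.71         # seconds within the minute, observed at the CDN edge
segment_timecode = 50.40  # timecode carried by the segment
upstream = 0.39 + 1.48 + 0.27  # capture + encode + ingest (no repackaging)

delivery_latency = (edge_time - segment_timecode) - upstream
print(round(delivery_latency, 2))  # 0.17
```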

Client Latency

Two client-dependent latency factors fall into this category: last-mile latency (related to network bandwidth) and player latency (related to content buffers). Last-mile latency ranges from a few milliseconds on fiber connections to several seconds on the slowest mobile connections. Content download duration directly impacts latency: a timecode T can only enter the client-side buffer and be played once its segment has been downloaded, at T+x seconds. If the download duration is too long compared to the segment duration, the player cannot build enough buffer and switches down the encoding ladder to a lower bitrate until it finds an acceptable trade-off between bitrate, network conditions, and its ability to build that content buffer. If even the lowest bitrate cannot sustain the buffer, the player cannot download content fast enough and falls into a constant play, stall, and rebuffer cycle. As soon as the content download duration reaches 50% of the segment duration, the player enters the danger zone in terms of buffer; ideally it should stay below 25%. Player latency results from the player's buffering policy, i.e. buffering a fixed number of segments or requiring a minimum duration of buffered content, and from its playhead positioning strategy.

Client latency is measured as: client latency = end-to-end latency – (capture latency + encode latency + ingest latency + repackage latency + delivery latency). Player latency can then be isolated by subtracting the average media segment transfer time (the last-mile latency) from the overall client latency. The last-mile latency value should be averaged over at least 20 segment requests, as it includes both the actual data transfer time and latency generated on the client side, for example when all of the sockets allowed for a given subdomain are already in use at the moment the segment request is made.
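Putting the stages together, the per-stage measurements from this example workflow can be assembled into a breakdown of end-to-end latency; the shares come out close to the table below, with small rounding differences (a sketch, with illustrative names; the repackaging stage is omitted because MediaStore does not repackage):

```python
# Assemble the example's per-stage measurements and compute each stage's
# share of end-to-end latency. No repackaging stage: the MediaStore
# origin passes the HLS ingest straight through.

stages = {
    "Capture":   0.39,
    "Encoding":  1.48,
    "Ingest":    0.27,
    "Delivery":  0.17,
    "Last Mile": 0.28,
    "Player":    2.55,
}

end_to_end = sum(stages.values())
print(f"End-to-end: {end_to_end:.2f} s")  # End-to-end: 5.14 s
for name, seconds in stages.items():
    print(f"{name:10s} {seconds:5.2f} s  {100 * seconds / end_to_end:5.2f}%")
```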

Here is an example breakdown for HLS 1-second segments produced with AWS Elemental Live and AWS Elemental MediaStore, delivered through Amazon CloudFront to a standard hls.js 0.8.9 player.

Latency Type     Seconds   Share
Capture          0.39      7.58%
Encoding         1.48      28.79%
Ingest           0.27      5.25%
Repackaging      N/A       N/A
Delivery         0.17      3.33%
Last Mile        0.28      5.44%
Player           2.55      49.61%
End-to-End       5.14      100%

As you can see, the encoding and playback steps generate most of the latency, and that is where most of the improvement margin lies. The other steps can be optimized too, but with minimal impact. Longer output media segments typically increase player and last-mile latency, while the other steps remain stable.

Part 2 of the series discusses optimization options that can be applied to each step of the workflow.

Nicolas Weil AWS Elemental Senior Solutions Architect