Developing Complex Media Workflows in the Cloud with Amazon Web Services
To reduce costs, media workflows are increasingly being moved from on-premise installations to the cloud, since this eliminates the investment in, installation of and maintenance of classic IT infrastructure and incurs only on-demand costs. The Amazon Web Services (AWS) Cloud offers, among other things, ready-made media services for this purpose. However, these services are not trivial to configure, let alone to combine into complex workflows. The PORTAL product offers solutions for the configuration and operation of complex media workflows in the AWS Cloud. Selected workflows are presented with their infrastructure, configuration and day-to-day operation.
1 Introduction
LOGIC is a German company for system architecture in the field of professional moving-image processing. Besides selling hardware for TV infrastructure and production, LOGIC is mainly active in two business areas: the migration of public and private broadcasters from classic SDI infrastructures to IP-based infrastructures, and the development of media workflows for media productions in the Amazon Web Services (AWS) Cloud. LOGIC is certified throughout Europe for AWS consulting and sales. In addition, LOGIC is certified by AWS for the public sector in Germany and therefore serves public broadcasters, among other customers. Since 2018, LOGIC has been offering its own framework, PORTAL. Among other things, it provides methods for agile project management and offers a user interface for straightforward remote control of sometimes complex media workflows operated in the AWS Cloud.
2 The PORTAL User Interface for Remote Control of Media Workflows in the AWS Cloud
PORTAL is a JavaScript-based user interface for web browsers that allows users to control complex media workflows with a few clearly arranged controls. This is done via Amazon API Gateway, which communicates with the APIs of the individual AWS services over RESTful or WebSocket APIs; through it, commands to create, publish, maintain, monitor or secure functions of the corresponding services can be sent and information received. The integrated user management uses the central AWS Identity and Access Management (IAM) service and enables fine-grained access control and permissions. Via different roles, users are granted exactly the access they need for the application, or are restricted accordingly. With the help of AWS Cognito, these functions can be implemented natively in web applications. In addition, AWS Cognito enables the integration of Single Sign-On (SSO) with OpenID Connect, so that users can log in with credentials from other providers such as Microsoft, GitHub or Facebook. The individual media workflows are integrated in PORTAL as modules and can be booked as required.
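As an illustration of this control path (and not PORTAL's actual implementation), the following sketch shows a Lambda handler behind an API Gateway REST endpoint with a Cognito user pool authorizer. The group name "operators", the request fields and the use of Python with boto3 are assumptions made purely for this example.

```python
# Hypothetical Lambda handler behind API Gateway with a Cognito user pool
# authorizer; group name, request fields and channel handling are illustrative.
import json
import boto3

medialive = boto3.client("medialive")

def handler(event, context):
    # API Gateway passes the verified Cognito token claims via the authorizer context
    claims = event["requestContext"]["authorizer"]["claims"]
    groups = claims.get("cognito:groups", "")

    if "operators" not in groups:  # assumed group that is allowed to control channels
        return {"statusCode": 403, "body": json.dumps({"error": "forbidden"})}

    body = json.loads(event["body"])
    channel_id = body["channelId"]  # assumed request field

    if body.get("action") == "start":
        medialive.start_channel(ChannelId=channel_id)
    else:
        medialive.stop_channel(ChannelId=channel_id)

    return {"statusCode": 200, "body": json.dumps({"channelId": channel_id})}
```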
3 AWS Cloud-based Media Workflows
The AWS Cloud offers many services that help in developing complex media workflows without having to build every aspect from scratch, so development can focus on the workflow itself. The following presents a selection of services (by no means all) that are essential or helpful for developing media workflows. These services are provided redundantly, and transparently to the application, across multiple AWS data centers. AWS refers to these data centers as Availability Zones (AZs), of which there are always at least two per AWS region.
3.1 Data Storage
One of the most important resources in any kind of IT application is the storage of data, and there are various requirements for it. A database is used, for example, when various data points are to be combined into a data set that can be processed by an arbitrary number of instances, whether human or machine. In this context a database also counts as persistent storage, but databases are usually not intended for large amounts of data such as those that occur with video files. For such large amounts of data, block data stores are used instead; these offer greater data throughput but are not particularly well suited to handling many parallel input and output operations.

In conventional IT, hard disks that are operated in servers and hold the operating system and its configuration are regarded as persistent. This changes when cloud infrastructures are considered. Cloud infrastructures follow a throwaway concept, i.e. server systems should be designed so that terminating and re-provisioning them does not result in data loss. This approach is necessary to achieve great scalability.

With the DynamoDB service, AWS provides a serverless NoSQL key-value database. The service offers, among other features, high availability, built-in security, continuous backups and automatic replication across multiple regions. For bulk storage, AWS offers the Simple Storage Service (S3). This is not network storage in the traditional, file-system-based sense, but object storage. At first glance the difference is not obvious, but it becomes apparent when developing watchfolder services: unlike file systems, S3 does not know folder structures but uses the concept of prefixes and suffixes, which requires a rethink during development. For storing operating systems and their configuration, the AWS Cloud provides Elastic Block Store (EBS) volumes, which correspond to partitions on SSDs and are intended for use with virtual servers.
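A minimal sketch of how a watchfolder can be emulated on S3 with prefixes instead of folders; the bucket name "media-ingest" and the prefix "incoming/" are assumptions for illustration only.

```python
# Sketch: polling an S3 "watchfolder" by prefix; bucket and prefix are placeholders.
import boto3

s3 = boto3.client("s3")

def list_incoming_objects(bucket="media-ingest", prefix="incoming/"):
    """S3 has no real folders; the prefix merely filters the flat key namespace."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys

print(list_incoming_objects())
```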
3.2 Network Services
In most cases, modern IT applications require a network infrastructure that enables communication between different subsystems. The AWS Virtual Private Cloud (VPC) service offers the possibility to create complete network environments that can communicate with each other across different AZs or AWS regions. Familiar classic network elements such as routers, NAT gateways or load balancers are provided. Services can thus be provided both in private subnets and on the public Internet. Such services can be made reachable with Elastic IP addresses, which are public IP addresses from the AWS pool, and registered in DNS using AWS Route 53. A registered DNS record is also a prerequisite for providing encrypted communication over SSL/TLS, which can be enabled with the AWS Certificate Manager (ACM) service. Delivery to many endpoints can be handled by Amazon CloudFront, the content delivery network (CDN) service from AWS.
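As a sketch of the DNS and certificate steps mentioned above: the domain, hosted zone ID and CDN hostname below are placeholders, and requesting the certificate in us-east-1 is only required when it is to be attached to CloudFront.

```python
# Sketch: request a TLS certificate with DNS validation and publish a DNS record.
import boto3

acm = boto3.client("acm", region_name="us-east-1")  # CloudFront expects certificates in us-east-1
route53 = boto3.client("route53")

cert = acm.request_certificate(
    DomainName="stream.example.com",        # assumed domain
    ValidationMethod="DNS",
)
print(cert["CertificateArn"])

route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",         # assumed hosted zone
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "stream.example.com",
                "Type": "CNAME",
                "TTL": 300,
                "ResourceRecords": [{"Value": "d111111abcdef8.cloudfront.net"}],  # assumed CDN hostname
            },
        }]
    },
)
```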
3.3 Virtual Servers with AWS Elastic Compute Cloud
AWS Elastic Compute Cloud (EC2) offers the possibility to provide virtual servers; the type of server application is irrelevant here. EC2 instances have at least one EBS volume and are placed in VPCs. ARM, AMD and Intel processors with different numbers of CPU cores and matching memory sizes are offered, which can run Linux as well as Windows operating systems. Even desktop operating systems with graphics card support can be used. Network connectivity of up to 400 GBit/s is possible. These server instances are booked either on demand or reserved for continuous operation.
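A hedged sketch of launching such an instance with an EBS root volume inside a VPC subnet; the AMI ID, instance type, subnet and key pair are placeholders.

```python
# Sketch: launch a virtual server with a gp3 EBS root volume; all IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",        # assumed Linux AMI
    InstanceType="c6i.4xlarge",             # CPU/memory sizing depends on the workload
    MinCount=1,
    MaxCount=1,
    SubnetId="subnet-0123456789abcdef0",    # subnet inside the VPC
    KeyName="workflow-key",                 # assumed key pair for remote access
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvda",
        "Ebs": {"VolumeSize": 100, "VolumeType": "gp3"},
    }],
)
print(response["Instances"][0]["InstanceId"])
```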
3.4 Serverless Applications
Another approach to increasing the scalability of web services is the development of serverless infrastructures. That is, server instances are not explicitly created; instead, services are executed as containers, and logic within the cloud selects the appropriate hardware resources as needed. Containers, such as Docker containers, always start in an initial state; persistent data and configurations must be obtained from storage locations outside the container. Among many other advantages, containers eliminate the need for an operating system tuned to the hardware and can run on any server as long as the CPU architecture matches. Web services offered in this way can also be deployed as Infrastructure as Code (IaC), which means that complete infrastructures can be set up from ready-made recipes. In principle, this is also possible with server applications, but it is more complex due to the required hardware tuning. For short-lived executions, AWS offers its Lambda service, which can be used to process a wide variety of tasks. Lambda executions can be started by triggers from other services and can then, for example, carry out a file transfer. This service is provisioned only on demand, with deliberately limited resources, and is deprovisioned again after completion. With AWS CloudFormation, infrastructure within the AWS Cloud can be described as code and then provisioned, executed, rolled back on errors during creation and terminated again as required.
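The following sketch shows the trigger pattern described above: a Lambda function reacting to an S3 "ObjectCreated" event and carrying out a file transfer. The bucket names are assumptions.

```python
# Sketch: Lambda triggered by an S3 event copies the new object to a downstream bucket.
import urllib.parse
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        s3.copy_object(
            Bucket="processing-bucket",                 # assumed target bucket
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},  # the object that triggered the event
        )
```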
3.5 Helpful Tools
In addition to the actual services, tools are also needed to connect or monitor services within the AWS Cloud. For machine-to-machine communication, AWS offers the Simple Queue Service (SQS); for push notifications to users, there is the Simple Notification Service (SNS). For monitoring, there is CloudTrail, which records API activity for auditing, and CloudWatch, which collects metrics, logs and alarms from the individual services.
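Two short, hedged examples of these messaging patterns; the queue URL and topic ARN are placeholders.

```python
# Sketch: SQS for machine-to-machine messages, SNS for push notifications to users.
import json
import boto3

sqs = boto3.client("sqs")
sns = boto3.client("sns")

# Hand a job description to the next processing stage
sqs.send_message(
    QueueUrl="https://sqs.eu-central-1.amazonaws.com/123456789012/transcode-jobs",
    MessageBody=json.dumps({"input": "s3://media-ingest/incoming/clip.mxf"}),
)

# Notify subscribed users (e-mail, SMS, mobile push) that the work is done
sns.publish(
    TopicArn="arn:aws:sns:eu-central-1:123456789012:workflow-alerts",
    Subject="Transcode finished",
    Message="The clip has been processed and is ready for review.",
)
```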
3.6 Media Services
The main Media Services briefly discussed below include:
AWS Elemental MediaConnect - live stream receiver and signal distributor
AWS Elemental MediaLive - real-time encoder for live streams
AWS Elemental MediaPackage - origin server and just-in-time packager for live streams and VOD
AWS Elemental MediaConvert - file-to-file transcoder
The first three services can be used both for 24x7 live streaming and for temporary live streams of individual events.
3.6.1 AWS Elemental MediaConnect
AWS Elemental MediaConnect (EMX) is a type of signal distributor that understands various live streaming protocols and can translate them into other protocols. Live streams can be transmitted compressed or uncompressed with low latency. Sources and sinks can be located both inside and outside the AWS Cloud. For endpoints outside the AWS Cloud, the available bandwidth depends on the connection to AWS, either via the public Internet or via dedicated connections from managed networks. EMX can provide secure connections across the AWS backbone worldwide.
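A minimal sketch of creating such a flow with an SRT listener source; the flow name, port and allowed CIDR range are assumptions.

```python
# Sketch: EMX flow with an SRT listener source; only whitelisted encoders may connect.
import boto3

emx = boto3.client("mediaconnect")

flow = emx.create_flow(
    Name="easystream-flow-a",
    AvailabilityZone="eu-central-1a",
    Source={
        "Name": "srt-contribution",
        "Protocol": "srt-listener",
        "IngestPort": 5000,
        "WhitelistCidr": "203.0.113.0/24",  # assumed IP range of the on-premise encoder
    },
)
print(flow["Flow"]["FlowArn"])
```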
3.6.2 AWS Elemental MediaLive
AWS Elemental MediaLive (EML) can generate many different output signals from the supplied input signal in the desired resolutions, frame rates and bit rates and with the desired codec. This includes outputting adaptive bit rate (ABR) sets intended for delivery to viewers. ABR sets contain video and audio in different resolutions and bit rates, which makes it possible for (mobile) devices to receive and play back video without buffering and judder, depending on their current reception conditions. MediaLive can, among other things, feed CDNs (push) or use other services as so-called origin servers. The input signal can be sent to EML directly from a contribution encoder using various protocols, or translated beforehand by EMX into a supported protocol.
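As a sketch of how EMX flows are attached to EML, the following registers two flows as one redundant MediaConnect input for a standard channel; the ARNs and the IAM role are placeholders.

```python
# Sketch: register two EMX flows as a redundant MediaConnect input for an EML channel.
import boto3

eml = boto3.client("medialive")

medialive_input = eml.create_input(
    Name="easystream-input",
    Type="MEDIACONNECT",
    MediaConnectFlows=[
        {"FlowArn": "arn:aws:mediaconnect:eu-central-1:123456789012:flow:flow-a-id:flow-a"},
        {"FlowArn": "arn:aws:mediaconnect:eu-central-1:123456789012:flow:flow-b-id:flow-b"},
    ],
    RoleArn="arn:aws:iam::123456789012:role/MediaLiveAccessRole",  # assumed role
)
print(medialive_input["Input"]["Id"])
```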
3.6.3 AWS Elemental MediaPackage
AWS Elemental MediaPackage (EMP) is an origin server and just-in-time packager for live streams and VOD with additional features. EMP can receive an ABR set from EML or other encoders as an HLS ingest and, acting as origin server, optionally filter it, delay it, apply DRM to it and make it available in HLS or MPEG-DASH format for CDNs to retrieve. Only the ingested format is stored; other formats are generated just in time on demand. It is also possible to export only excerpts from a live stream and use them for VOD.
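A sketch of an EMP channel with one HLS endpoint that is packaged just in time; the IDs, segment settings and the startover window (which keeps content available for later clip exports) are assumptions.

```python
# Sketch: EMP channel and just-in-time packaged HLS endpoint; IDs are placeholders.
import boto3

emp = boto3.client("mediapackage")

channel = emp.create_channel(Id="easystream-channel")

endpoint = emp.create_origin_endpoint(
    ChannelId=channel["Id"],
    Id="easystream-hls",
    HlsPackage={
        "SegmentDurationSeconds": 6,
        "PlaylistWindowSeconds": 60,
    },
    StartoverWindowSeconds=3600,  # keeps the last hour retrievable, e.g. for harvest jobs
)
print(endpoint["Url"])  # the URL a CDN such as CloudFront pulls from
```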
3.6.4 AWS Elemental MediaConvert
AWS Elemental MediaConvert (EMC) is intended for file-based transcoding and can encode and decode a variety of codecs with a wide range of parameterizations. The service works in jobs that process specific input files.
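The following sketch submits such a job; the IAM role, the S3 paths and the system preset name are assumptions, and real jobs usually carry a far more detailed parameterization.

```python
# Sketch: file-to-file transcode job; role, paths and preset are placeholders.
import boto3

# MediaConvert uses an account-specific endpoint that can be discovered first
endpoints = boto3.client("mediaconvert").describe_endpoints()
emc = boto3.client("mediaconvert", endpoint_url=endpoints["Endpoints"][0]["Url"])

emc.create_job(
    Role="arn:aws:iam::123456789012:role/MediaConvertRole",
    Settings={
        "Inputs": [{
            "FileInput": "s3://media-ingest/incoming/master.mxf",
            "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
        }],
        "OutputGroups": [{
            "OutputGroupSettings": {
                "Type": "FILE_GROUP_SETTINGS",
                "FileGroupSettings": {"Destination": "s3://media-output/mp4/"},
            },
            "Outputs": [{
                "Preset": "System-Generic_Hd_Mp4_Avc_Aac_16x9_1920x1080p_24Hz_6Mbps",
            }],
        }],
    },
)
```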
4 Description of the Workflows
In the following sections, we present two workflows that can be used to implement redundant delivery and generate highlight clips from live streams.
4.1 Easystream
The block diagram in Figure 1 shows the setup of the cloud infrastructure for a fully redundant live streaming workflow without a single point of failure. Two on-premise encoders are needed, whose input signals should be synchronous. Each sends to its own EMX flow in a different AZ; the SRT protocol is one of several possible protocols. The EML channel consists of two redundant pipelines running in different AZs, each fed by one of the EMX flows, so processing in the cloud takes place in different data centers. The outputs of the two EML pipelines feed an EMP channel, which serves as the origin server and holds the content ready for retrieval. The content can be delivered as long as at least one of the inputs is feeding. This covers the following failure scenarios:
Failure of an on-premise encoder or its Internet connection
Failure of an EMX flow
Failure of an EML pipeline
If encoder A fails, the content from encoder B is delivered at the end of the chain, and the problem with encoder A can be fixed in the meantime. Once encoder A is repaired and delivering again, the content from encoder B continues to be delivered; if encoder B fails first, the reverse applies.
Figure 2 shows a media pipeline in which the resources in the AWS Cloud are not redundant and thus incur lower costs; redundancy is, however, provided for the on-premise encoders. In this example, the RIST (Reliable Internet Stream Transport) protocol is used. With this protocol, it is possible to specify a second source in the EMX flow as a backup of the first. Additional input redundancy options for EML using an input failover pair are described in the User Guide (https://docs.aws.amazon.com/medialive/latest/ug/automatic-input-failover.html).

The media pipeline becomes even simpler when an AWS Elemental LINK HD or UHD is used on-premise. Running an Elemental LINK on-premise requires only power, an Internet connection with a data rate that does not overly limit the quality of the video transmission, and an SDI or HDMI signal source. For redundancy purposes, two Elemental LINK devices can also be logically combined into one input signal for a standard EML channel.

PORTAL's UI is deliberately kept simple to make complex workflows easy to handle. PORTAL.easystream gives the user a simple UI to start and stop the media pipeline for live streaming. When the pipeline is started or stopped from the UI, both EMX flows and the EML channel are started or stopped, and retrieval of the content is enabled or disabled in EMP. Starting and stopping is very important because the bulk of the cost is incurred while the media pipeline is running. Once the media pipeline is running, the output signal can be viewed.

In the current version, PORTAL.easystream offers the possibility to build the pipeline and generate the output using either an Elemental LINK or an SRT encoder as the source. The handling differs slightly, since the SRT encoder still has to be given the entry point and, if necessary, the video and audio PIDs of the MPEG transport stream. To ensure that only the desired encoders connect to EMX, it is advisable to whitelist the IP address range of the encoders in the UI. The Elemental LINK, on the other hand, is linked directly to the EML channel and does not require any further settings. Once the settings are made, any operator can start and stop the media pipeline from the UI.
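A hedged sketch of what a start or stop from the UI could translate to on the AWS side, with both EMX flows and the EML channel handled together; the ARNs and channel ID are placeholders, and the actual PORTAL implementation may differ.

```python
# Sketch: start/stop the easystream media pipeline (two EMX flows plus one EML channel).
import boto3

emx = boto3.client("mediaconnect")
eml = boto3.client("medialive")

FLOW_ARNS = [
    "arn:aws:mediaconnect:eu-central-1:123456789012:flow:flow-a-id:flow-a",
    "arn:aws:mediaconnect:eu-central-1:123456789012:flow:flow-b-id:flow-b",
]
CHANNEL_ID = "1234567"  # assumed EML channel ID

def start_pipeline():
    for arn in FLOW_ARNS:
        emx.start_flow(FlowArn=arn)        # start ingest in both AZs
    eml.start_channel(ChannelId=CHANNEL_ID)

def stop_pipeline():
    eml.stop_channel(ChannelId=CHANNEL_ID)
    for arn in FLOW_ARNS:
        emx.stop_flow(FlowArn=arn)         # stop ingest so no further costs accrue
```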
Once the pipeline is running, the signal can be retrieved in PORTAL and checked for sound and picture quality.
4.2 Clip and Multiclip
The block diagram in Figure 4 shows the structure of the cloud infrastructure for a workflow that can create highlight clips from multiple live streams and make them available to NLE editing systems or VOD platforms, for example.
In the UI in Figure 5, a user uses the time display to mark the IN and OUT points of a clip, names it and exports it, together with any further desired clips. These assets are then automatically made available to Adobe Premiere workstations in the AWS Cloud, where editors are already working on a summary.
For the PORTAL.clip workflow shown, an AWS Elemental LINK or an SRT encoder is used to encode an SDI signal and stream it to the AWS Cloud. When an SRT encoder is used, an EMX flow configured as SRT listener receives the signal in the AWS Cloud from the encoder, which accordingly operates in SRT caller mode. The flow is then added as input to an EML channel. An Elemental LINK, by contrast, can be used directly as the input signal for an EML channel without an intermediate flow. EML then encodes with the configured parameters.

The workflow presented here uses an EMP output to generate video-on-demand clips from the segmented HLS stream as an MPEG transport stream with an associated m3u8 manifest file. To do this, the operator only has to jump to the corresponding location within the stream in the UI and can specify the length of the clip there using the mark-in and mark-out buttons or a UTC timecode input. In a so-called harvest job, EMP exports the corresponding segment files from the stream and stores them with the manifest files in an S3 bucket. A Lambda function merges the individual segments into a clip and stores it in another S3 bucket.

The NLE editing systems are installed on EC2 instances, which users access remotely from their local machines with low latency via NICE DCV. The third-party Tiger Bridge software copies the previously stitched clips from the S3 bucket to the EBS storage of the EC2 editing workstations, and the editors can then use the clips as assets in their NLE projects.
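As a sketch of the clip export, the following harvest job asks EMP to write the segments between mark-in and mark-out to S3; the clip ID, endpoint ID, bucket, IAM role and timestamps are placeholders. A Lambda function, triggered when the manifest arrives in the bucket, would then concatenate the segments into a single clip.

```python
# Sketch: EMP harvest job exporting the segments between mark-in and mark-out to S3.
import boto3

emp = boto3.client("mediapackage")

emp.create_harvest_job(
    Id="highlight-clip-42",                        # assumed clip name from the UI
    OriginEndpointId="easystream-hls",             # assumed EMP endpoint
    StartTime="2024-05-01T18:03:10+00:00",         # mark-in (UTC), placeholder
    EndTime="2024-05-01T18:03:55+00:00",           # mark-out (UTC), placeholder
    S3Destination={
        "BucketName": "clip-harvest",
        "ManifestKey": "highlight-clip-42/index.m3u8",
        "RoleArn": "arn:aws:iam::123456789012:role/MediaPackageHarvestRole",
    },
)
```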
When multiple camera angles are available, as is common at sporting events, the PORTAL.multiclip workflow shown in Figure 6 can be used. With it, EMP exports the clips from mark-in to mark-out from up to six signals.
5 Conclusion
The range of services offered by AWS, especially the helper services, makes it possible to set up very versatile workflows. The degree of complexity and thus the development effort increase considerably, but this also brings the advantage of being able to roll out the workflows as IaC. The use of AWS services incurs costs, and especially in 24x7 operation these can be considerable. The PORTAL user interface simplifies starting and stopping the workflows developed by LOGIC, so that resources do not have to keep running unnecessarily. So far, users have not considered automated scheduling necessary, especially since the end of the respective event is usually variable: it is simply live.