Nothing is simple. In the digital age, that’s more or less a hard-and-fast rule—everything we interact with in our day-to-day, from cars to phones to the toaster on your kitchen counter, has layers of abstraction and interconnectivity in order to interface with other digital devices. What’s more, since the digital revolution wasn’t done in one fell swoop, there are layers and layers of history behind the design and development of things we take for granted.
Take, for example, serving ads on network television. At first glance, this would seem like a simple prospect: just cut the feed from the main broadcast and turn on a recording of a commercial. But the more you think about it, the more questions arise. How do you “just cut the feed” reliably and switch to commercials without introducing a chance for operator error? How do you account for remote broadcasts where the studio providing the main feed doesn’t have every single commercial on file? How do you serve local ads when a program is broadcast to multiple networks? Nothing is simple…
Fortunately, there is a system in place to handle all of these issues seamlessly: the SCTE-35 standard, which injects “SCTE markers” into a broadcast feed. Simply put, these cue ad insertion opportunities and other events at specific points in time on a broadcast, adding flexibility and functionality to an otherwise “bare” camera feed. But how did this come about? What can it do to make a broadcast better? To answer that, let’s go over the genesis of SCTE-35 and its predecessors—it’s a more interesting story than you might imagine. Let’s go back to the beginning…
How Do You Dial a Phone?
Yes, this is where the story starts. Think about it—before the advent of Internet protocols with packets that could contain any information you need, how did the act of dialing a number cause your line to be connected properly? It’s not as though there was a separate line to transfer that data to a switching center—a telephone connection is just a single pair of wires.
The answer is in-line signaling. Through an encoder built into every landline phone, dialing a number (either literally, in the case of rotary phones, or by pressing buttons on a keypad) causes a series of signals to be sent along the same phone line that carries voice. These signals reach a telephone exchange, where they are automatically processed into commands to establish a connection from the sender to the receiver. In some cases, this even involved physical connections being established by motorized shuttles, though most systems relied on simple electrical or electromechanical relays.
At first, the signal standard used for this system was pulse dialing, which generated a simple series of on-off pulses by interrupting the connection to the telephone exchange directly. (This is how rotary phones work, and the specificity of the pulse rate is why all rotary phone dials turn at about the same speed.) However, this system was significantly limited in the speed at which you could dial, and the requirement to physically “cut” the connection intermittently meant the distance a phone could be from the switching station was limited.
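To see why pulse dialing was slow, here is a minimal sketch of how a dialed number turns into line interruptions. It assumes the common scheme in which each digit is sent as that many pulses and 0 is sent as ten, at a nominal ten pulses per second. The pause between digits and the function names are illustrative assumptions, not values from any standard.

```python
# Sketch of loop-disconnect (pulse) dialing timing.
# Assumptions: 10 pulses per second (nominal) and a 0.7 s pause between
# digits (an illustrative value, not taken from a specification).
PULSES_PER_SECOND = 10
INTERDIGIT_PAUSE_S = 0.7


def pulses_for_digit(digit: str) -> int:
    """Return how many times the line is interrupted for one digit."""
    if len(digit) != 1 or digit not in "0123456789":
        raise ValueError(f"not a dialable digit: {digit!r}")
    # Zero cannot be sent as "no pulses", so it is sent as ten.
    return 10 if digit == "0" else int(digit)


def dial_time_seconds(number: str) -> float:
    """Estimate how long a rotary phone needs to dial a number."""
    digits = [c for c in number if c.isdigit()]  # ignore dashes and spaces
    pulse_time = sum(pulses_for_digit(d) for d in digits) / PULSES_PER_SECOND
    pause_time = INTERDIGIT_PAUSE_S * max(len(digits) - 1, 0)
    return pulse_time + pause_time


if __name__ == "__main__":
    print(f"{dial_time_seconds('555-0100'):.1f} seconds to dial")
```

Under these assumptions a seven-digit number with several zeros takes many seconds to send, which is the speed limit described above.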
To resolve this, Bell developed an alternative in the early 1960s called dual-tone multi-frequency signaling, or DTMF. Instead of interrupting the connection to the telephone switching system directly, DTMF sent a pair of tones (hence the name) whenever a button was pressed on a phone’s keypad. This system, better known as Touch-Tone, introduced push-button phones to the general public and allowed for increased performance of telephone systems. Signals from a phone could be sent faster, to an exchange any distance away, with a much lower chance for distortion, and the dual-tone system allowed more characters than just numbers to be encoded. (This is when the star (*) and pound (#) keys were introduced!) This system is still the groundwork for how conventional phones operate today.
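The "pair of tones" idea is easy to sketch in code. Each key sits at the intersection of one row frequency and one column frequency on the standard DTMF grid, and pressing it plays both at once. The function names, the 0.1 second duration, and the 8 kHz sample rate below are illustrative assumptions.

```python
import math

# Standard DTMF grid: one low (row) and one high (column) frequency per key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple[int, int]:
    """Return the (row, column) frequency pair in Hz for one key."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(key)
            if col != -1:
                return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration_s: float = 0.1, sample_rate: int = 8000) -> list[float]:
    """Generate audio samples in [-1, 1] for one key press."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(count)
    ]


if __name__ == "__main__":
    print(dtmf_frequencies("5"))   # row and column frequencies for the 5 key
    print(len(dtmf_samples("#")))  # number of samples in one tone burst
```

Four rows and four columns give sixteen combinations, which is where the extra keys beyond the ten digits come from.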
DTMF and Electronic Control
What does this have to do with ad insertion? We’re getting there. Recall that telephone exchanges process the audio signal a phone sends them and can convert that into an action like flipping a relay or starting a motor. (You may already see where this is going.) In the late 20th century, it became clear that this system could be utilized elsewhere to embed commands into an audio signal that was designed to support voice frequencies. To send credit card information, pay phones embedded strings of DTMF tones into their signal. VHS tapes sometimes use DTMF to encode information about the program duration and type. And, in 2001, the Society of Cable Telecommunications Engineers released the first version of the SCTE-35 standard, which embedded DTMF tones into television audio channels for signaling.
This initial implementation was fairly simple. Essentially, an audible DTMF tone was assigned to a small number of events, namely the start and end of approved ad blocks. A studio would play these tones directly into the audio feed of their broadcast “on its way out,” as it were, to be processed by networks downstream. These DTMF signals would start and stop videotape players built into the broadcast architecture to insert advertisements—either wholesale to supplant the source studio’s advertisement injection capability, or partially, to include local ads in a larger commercial block. Being based on a proven standard and utilizing very little new hardware (in most cases, a DTMF decoder and interface was all that was needed, as studios would already have equipment to play taped commercials) meant that adoption was fairly effortless and soon became widespread.
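A downstream decoder of this kind can be sketched in a few lines. The example below uses the Goertzel algorithm, a common way to measure the energy at a handful of known frequencies, to recognize a key in a block of audio, then feeds the key to a tiny state machine. The choice of `*` to start a break and `#` to end it, the power threshold, and all names here are assumptions for illustration. Real cue-tone sequences varied by network.

```python
import math

ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq_hz, sample_rate):
    """Measure signal power near freq_hz with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000, min_power=1000.0):
    """Return the DTMF key in a block of audio, or None if no tone is found.

    min_power is a hypothetical threshold tuned for the test tone below.
    """
    row_power = [goertzel_power(samples, f, sample_rate) for f in ROW_HZ]
    col_power = [goertzel_power(samples, f, sample_rate) for f in COL_HZ]
    if max(row_power) < min_power or max(col_power) < min_power:
        return None
    return KEYPAD[row_power.index(max(row_power))][col_power.index(max(col_power))]


class AdSwitcher:
    """Toggle between network feed and local ads on hypothetical cue keys."""

    def __init__(self, start_key="*", end_key="#"):
        self.start_key, self.end_key = start_key, end_key
        self.in_break = False

    def on_key(self, key):
        if key == self.start_key and not self.in_break:
            self.in_break = True
            return "roll local ad"
        if key == self.end_key and self.in_break:
            self.in_break = False
            return "return to network"
        return None  # any other key is ignored


def make_tone(key, duration_s=0.1, sample_rate=8000):
    """Test helper: synthesize the two-tone burst for one key."""
    row = next(i for i, keys in enumerate(KEYPAD) if key in keys)
    low, high = ROW_HZ[row], COL_HZ[KEYPAD[row].index(key)]
    return [
        0.5 * math.sin(2 * math.pi * low * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * i / sample_rate)
        for i in range(round(duration_s * sample_rate))
    ]


if __name__ == "__main__":
    switcher = AdSwitcher()
    for key in "*#":
        print(key, "->", switcher.on_key(detect_key(make_tone(key))))
```

A production decoder would also check tone duration and reject speech that happens to contain similar frequencies. This sketch leaves both out.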
Of course, the injection of tones into the audio channel being fed to consumers wasn’t a perfect solution, though it added a great deal of flexibility. If the timing of the broadcast systems wasn’t completely correct, the signaling tones would play in full before the tape of the injected advertisement began to play. As you can imagine, this wasn’t an ideal experience for the end user. (You may recall this happening when watching TV in the early 2000s, often accompanied by a black or flashing screen before a commercial began.) Additionally, though the 16 basic DTMF tone pairs were more than enough for controlling a telephone exchange, they limited the potential for SCTE-35 to expand its capabilities with more information and signaling. A change was needed to improve performance and pave the way for the future…
SCTE Markers in Digital Television
The very first broadcast TV station started service in 1928, setting a global standard for analog television systems. Analog TV is, to put it lightly, dead simple. It consists of two separate signals (usually) pushed over radio, one each for video and audio. As the standards for analog TV in the United States were established in 1941, you can imagine they’re fairly limited in capability. For decades, TV audiences had to make do with 480i resolution for the majority of broadcasts—and as cameras, computers, and home video media grew more capable of handling higher resolutions, this 480i standard became more and more restrictive. In order to support higher (and multiple) resolutions, markets worldwide transitioned to digital TV starting in the early 2000s, with the digital “flip” in the United States taking place in mid-2009.
A digital TV signal can carry dramatically more information than its analog predecessor as a result of improved bandwidth, and the SCTE-35 standards were updated to take advantage of this. Though the standard retained support for DTMF signals for compatibility with older hardware (and indeed still does today), the main functionality of the standard was overhauled with an all-digital system. Now, advertisements are indicated in a video stream using “SCTE-35 markers,” which are brief code segments added to a video data stream as metadata. No longer are commands relayed down the line using audible signals: SCTE markers are “silent” additions to a broadcast, only “noticed” and interpreted by equipment designed to process the relevant data. Moreover, the new SCTE-35 standard is compatible with the modern Internet streaming protocols HLS and MPEG-DASH (see an earlier blog post for a breakdown of those!), which adds compatibility with online streaming and broadcasting services, greatly improving reach.
Despite these improvements, however, the basic operation is quite similar from a broad perspective. Broadcasting hardware injects SCTE markers into a transport stream to indicate the start and end of advertising opportunities—though instead of playing a tone at the proper times, each marker is simply a timestamp to denote when a downstream ad provider can insert their own commercials. Depending on the stream format, this can occur as a pair of commands reading essentially “start at this timestamp and end at this timestamp” or “start at this timestamp and play for (value) seconds and frames”. Having support for both opens up compatibility for new hardware and software and improves accessibility for more studios.
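The two ways of describing a break can be sketched as a small data model. This is a simplified illustration, not the SCTE-35 wire format: the class and field names are hypothetical, matching a cue-in to a cue-out by event ID is an assumption, and the 30 fps default is only an example. The one detail taken from the standard is that splice times are counted in ticks of the 90 kHz MPEG clock.

```python
from dataclasses import dataclass
from typing import Optional

TICKS_PER_SECOND = 90_000  # SCTE-35 times use the 90 kHz MPEG clock


@dataclass
class CueOut:
    """Start of an ad opportunity (hypothetical, simplified fields)."""
    event_id: int
    start_ticks: int
    duration_ticks: Optional[int] = None  # None means "wait for a cue-in"


@dataclass
class CueIn:
    """Return to the main program (hypothetical, simplified fields)."""
    event_id: int
    end_ticks: int


def duration_to_ticks(seconds: int, frames: int = 0, fps: float = 30.0) -> int:
    """Convert a "seconds and frames" duration to 90 kHz ticks."""
    return round((seconds + frames / fps) * TICKS_PER_SECOND)


def break_window_seconds(cue_out: CueOut, cue_in: Optional[CueIn] = None):
    """Return (start, end) of the break in seconds for either signaling style."""
    if cue_out.duration_ticks is not None:
        # Style 2: "start at this timestamp and play for this long"
        end_ticks = cue_out.start_ticks + cue_out.duration_ticks
    elif cue_in is not None and cue_in.event_id == cue_out.event_id:
        # Style 1: "start at this timestamp and end at this timestamp"
        end_ticks = cue_in.end_ticks
    else:
        raise ValueError("break has no duration and no matching cue-in")
    return cue_out.start_ticks / TICKS_PER_SECOND, end_ticks / TICKS_PER_SECOND


if __name__ == "__main__":
    timed = CueOut(event_id=1, start_ticks=900_000,
                   duration_ticks=duration_to_ticks(30))
    print(break_window_seconds(timed))

    paired = CueOut(event_id=2, start_ticks=900_000)
    print(break_window_seconds(paired, CueIn(event_id=2, end_ticks=3_600_000)))
```

Both calls describe the same thirty-second break starting ten seconds into the stream. The sketch ignores timestamp wraparound, which real equipment has to handle.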
As the system is fairly open-ended, modern SCTE-35 markers can also be used for more purposes, including denoting chapter or segment breaks in broadcast shows and movies to allow for greater content control all the way down to local broadcasters. For example, if a particular segment in a broadcast movie was considered objectionable in one region, local broadcasters could use SCTE-35 markers to automatically black out that segment and replace it with local programming. In fact, SCTE markers could be used for signaling nearly anything, limited only by what equipment downstream can interpret. Studios large and small can utilize SCTE-35 to control broadcasts with terrific precision and ease, including our studio here in Boulder.
Utilizing SCTE-35 in a Production Pipeline
BCC Live is a relatively small company, but we leverage our mastery of broadcast technology to put on world-class shows to a global audience. A major part of that is our usage of SCTE markers to interface with our downstream partners. For most shows, we have a single staff member running the broadcast—coordinating cameras, switching signals, communicating with our onsite hosts, and setting up commercial breaks. Though this sounds daunting, it’s perfectly manageable with the right hardware!
When it comes time to insert an ad break, our team member signals the hosts with a countdown to cut. While the hosts wrap up their comments, we prepare for the cut—though it doesn’t take much! All it takes is cutting to black and pressing a key on our control deck to insert a SCTE marker. In our case, this is actually a SCTE-104 marker, which requests the insertion of a SCTE-35 marker in the final broadcast stream. From there, the video feed with embedded markers is pushed to the cloud services of Amagi Media Labs, which converts the SCTE-104 markers to SCTE-35 and pushes our stream out to broadcast TV and online platforms, depending on what our clients require. The rest is self-evident based on what’s been relayed above—advertisements can be inserted with ease wherever they’re needed, by local TV networks using digital recordings or even old analog tape decks, by streaming services pushing content over the Internet, or indeed anywhere they’re needed!
While it seems almost anticlimactic to describe our production pipeline so concisely after an extensive history, it’s the weight of that history that makes our workflow so easy. The genesis of remote hardware control in telephone exchanges has caused the basic operation to be, quite literally, as easy as dialing a phone number! The whole system is straightforward to grasp from both a theoretical and operational perspective, thanks in no small part to the way it’s connected to other systems that we easily understand. SCTE-35 markers are a massive multiplier to a studio’s capability to operate effectively in the digital age, and we’re lucky to be able to utilize them in every broadcast we run.
And yes, thanks to the simplicity and backward-compatibility of the system, you could insert ad breaks by dialing an old Touch-Tone phone connected to your broadcast hardware. There’s no reason to, when a normal control deck works so well… but how much fun would that be?