From Dark Archives to Personalized Streaming - How AI Is Transforming Content Discovery and Monetization
Guest: Adam Sutherland, WW Business Development Lead - Data Science and Analytics for Media and Entertainment at Amazon Web Services
Hosts: Seth Earley, CEO at Earley Information Science
Chris Featherstone, Sr. Director of AI/Data Product/Program Management at Salesforce
Published on: November 23, 2021
In this episode, Seth Earley and Chris Featherstone speak with Adam Sutherland, Worldwide Business Development Lead for Data Science and Analytics for Media and Entertainment at Amazon Web Services. Adam's path to AI leadership is anything but conventional - he studied Japanese literature in college, lived in Japan for five years, ran a school there, worked at Encyclopedia Britannica during its CD-ROM and first-website era, served as Head of Discovery Online, held the Chief Strategy Officer role at National Geographic for four and a half years, and then made a deliberate pivot to a small speech recognition and machine translation startup before joining AWS. In this conversation he maps the real AI use cases transforming the media industry today - from unlocking the value buried in untagged content archives, to confidence-scored content moderation, to the emerging frontier of AI-generated scripts, upscaled archival footage, voice cloning for dubbing, and dynamic in-game product placement - and explains exactly why the data layer, not the algorithm, is where competitive advantage in media is actually won or lost.
Key Takeaways:
- Media companies have historically maintained meticulous metadata only where mistakes would cost or lose money - music rights, stock footage licensing - while leaving the actual content of their archives nearly invisible, and AI is now the primary tool for closing that gap through transcription, image recognition, video recognition, and NLP.
- The first and most foundational step for any media AI project is simply tagging - even if the organization is not ready to act on search and discovery or content moderation today, running assets through managed AI services to generate tags creates the foundation that every downstream use case depends on.
- The build-versus-buy decision for AI models comes down to two questions: how specific is the use case, and is this outcome a strategic differentiator for the business? If a generalized model gives the same results to every competitor, investing in a bespoke model is the only path to competitive advantage.
- Personalization in media has two distinct problems that require different data and different models - content for people (recommending what to watch based on past behavior) and people for content (finding the right audience and marketing message for new releases) - and the best organizations run multiple recommendation engines simultaneously, optimizing the mix continuously.
- Marrying content metadata with behavioral engagement data - where viewers drop off, what scenes precede abandonment by which demographics - enables editorial decisions about future content creation, not just recommendations, allowing media companies to dial specific elements up or down for specific audience segments.
- AI is not yet perfect and for workflows like captioning and content moderation a human in the loop remains essential, but AI changes the nature of that human role dramatically - from trained specialists who do the work from scratch to editors who review and correct AI output, expanding the available labor pool and reducing cost.
- The new battleground in media is data ownership and customer experience, not content volume - good content at scale is now table stakes, and the companies winning are those with the best data governance, the most consistent data collection, and the ability to use that data to deliver a superior personalized experience across the full value chain.
Insightful Quotes:
"What we see a lot is: before, the conversation was 'I want AI.' Now the conversation is 'I want to use your services to help me monetize my content.' And you very quickly figure out that the people you're talking to don't control the value chain. You either have conversations with people who are way down in the weeds of the technology, or you have conversations with people who are trying to solve business outcomes - and either one, you realize they're not talking to each other." - Adam Sutherland
"If you're a translation house, a generalized translation model is not going to give you any advantage over the translation house down the street if you're both using the same generalized model. So if it's a strategic differentiator for you, then it's worth the investment in the data scientists. That's the other conversation we have all the time." - Adam Sutherland
"Good content and a lot of content - that's table stakes. Everybody expects that if they open up a service they're going to have good content. Where everybody is competing now is on customer experience. And the real driver for a good customer experience is data." - Adam Sutherland
Tune in to hear Adam Sutherland explain why National Geographic would sometimes spend more money buying third-party footage than it would cost to find a clip it already owned, how Amazon's "this is a gift" checkbox was one of the best things the company ever did for personalization, why media companies deliberately inject random recommendations to avoid creating filter bubbles, how AI is being used to superimpose updated brand placements into archival footage years after original broadcast, and why the voice-cloning technology that can dub an Arnold Schwarzenegger film in his actual voice is raising urgent unresolved questions about rights and compensation.
Contact Adam:
https://www.linkedin.com/in/adamrsutherland/
Thanks to our sponsors:
Earley Information Science
CMSWire
Marketing AI Institute
Podcast Transcript: AI and ML in Media and Entertainment - Unlocking Archives, Personalizing Experiences, and the Emerging Frontier
Transcript introduction
This transcript captures a conversation between Seth Earley, Chris Featherstone, and Adam Sutherland about how AI and machine learning are transforming media and entertainment - from solving the dark archive problem that costs studios millions in unnecessary stock footage purchases, to the confidence-score decisions that determine when AI output needs a human editor, to blue-sky predictions about AI-generated scripts, voice-cloned dubbing, and dynamic in-content brand placement.
Transcript
Seth Earley: Welcome to today's podcast. I'm Seth Earley.
Chris Featherstone: And Chris Featherstone.
Seth Earley: Excited about our guest today. He's a multidimensional Renaissance man with a broad array of interests. He's currently leveraging his extensive background in media and publishing as the Worldwide Business Development Lead for Data Science and Analytics for Media and Entertainment at Amazon Web Services. Please welcome Adam Sutherland.
Adam Sutherland: Hello. Thank you. Nice to be here, and thank you for that very glowing introduction.
Chris Featherstone: Adam is one of my favorite humans - not only at Amazon but just in general. It's rare when you meet a genuinely great person who is good across all these aspects. Adam, our history in terms of working together was one where I learned an amazing amount of stuff from you, and I think you have an unbelievable perspective that the world needs to know.
Adam Sutherland: I really appreciate it. I've been very lucky to work with some really great people over the years, including yourself Chris.
Seth Earley: So a little background - I understand you have an undergraduate degree in Asian studies?
Adam Sutherland: Yeah. So my undergrad was actually in Japanese literature. I did have a minor in business, which was the only way my parents agreed to pay for college. With the typical hubris of an 18-year-old, I wanted to study something that would never get boring. I thought another culture, right - because there's a lot to another culture. And eventually landed on literature as a window into that culture. Now, with the benefit of not being an 18-year-old, I realize that you can study anything and if you study it well enough it'll never get boring. But that was the thinking at the time.
Seth Earley: Did you ever do any martial arts, and did you spend much time in Japan?
Adam Sutherland: Yeah. I actually lived in Japan for five years total. I did martial arts - Muay Thai boxing, starting when I lived in Chicago. And then it's funny - you study Japanese in college, start martial arts in the US, move to Japan and try to speak Japanese with actual Japanese people and also join a dojo with a few world champions. Humble pie across the board.
Seth Earley: That is brutal - elbows, knees, anything goes. I did Shotokan and then aikido because that's much easier for old people.
Adam Sutherland: Aikido is awesome. But personality-wise it just didn't fit for me. If somebody was going to cause some grief, I felt that maybe I should be the first one to make sure we stopped the trouble. Whereas I had a friend who was very much on the other end of the spectrum and took to aikido like a fish to water and got his dan and then was teaching in Chicago.
Chris Featherstone: Adam, I'd love to get a take - for those listening, we're trying to do something that's not just another AI podcast but more along the lines of the human experience of how one goes from where they are in their career to embracing and practicing this new innovation. How did your career path lead you to what you're working on today?
Adam Sutherland: The career has been a combination of tech and content pretty much from the beginning. After my first stint in Japan - where I was teaching English and then ended up starting a school - I then worked, believe it or not, for Encyclopedia Britannica. We learned an important lesson at Britannica, which is very much like the Betamax versus VHS story. Britannica actually had the first CD-ROM, the first website - we were very cutting edge from a technology perspective. The business model did not allow us to take advantage of it. We had a sales force making money on big commissions from expensive physical sets, and we just couldn't get rid of that. So the technology evolved and the business model didn't.
From there I moved to Discovery Channel as Head of Discovery Online. Long story short, I moved around in that tech-content space, moved into M&A where I would work with Discovery and Travel Channel and others to figure out - we have content, we have a brand, now how do we take advantage of that in new platforms - build, buy, or borrow. I was the build-buy-borrow person to figure out who we partnered with, what platforms we might acquire, what content we would acquire.
Before I got really steeped in AI and ML, I was at National Geographic as Chief Strategy Officer for four and a half years. Then I did a complete pivot to a small startup that did automatic speech recognition and machine translation. Small operation, mostly a government customer set, and they brought me on as CEO to help figure out what we could do in media and in telco as well.
The long-winded answer to your question is: I got into AI from the content and industry side, not from the technology side. It was always about how does the technology solve a problem. When it came to AI and ML specifically it was: we've got this cool technology, there are early indications that media could use it for captioning and asset metadata and call center - how do I turn that technology into a business?
What I tell high school kids when we talk to them about getting into high tech: you don't necessarily have to be an engineer. There's room for engineers and there's room for people who work backwards from a business perspective. But at the end of the day, it's just remaining curious - not ever saying I'm afraid to learn something new, or I already know exactly how to do that so that's got to be the way. That move from National Geographic to the small startup was fantastic because I brought the industry experience and I knew nothing about AI and ML - zero. Next thing I know I've got a team of 40 scientists sitting at universities around the world, and for the first six to nine months we just learned from each other.
Seth Earley: What does a day in the life for Adam look like? What kinds of problems are your customers trying to solve?
Adam Sutherland: I look at everything as a portfolio - a certain amount of consumption and a certain amount of production. On the production side, I get pulled into conversations with customers to help with initial triage. When I first started in this role, the number of times I would get into a conversation with somebody who said "I need AI" - that conversation is thankfully becoming less and less frequent. Now we have conversations about: I have this specific use case - is there a managed service that can do it, or do I need my own data scientists?
In media specifically, the biggest use case we deal with most frequently is understanding X - and the X is generally content or customer. When it comes to content, you have companies with huge libraries and archives. Because of the way the industry has historically worked, the metadata on most of those assets is very buttoned-up only where you could get sued or miss out on money - music rights, stock footage licensing. What is less buttoned-up is what's actually in the content itself.
I'll give you an example. At National Geographic we spent a lot of money every year buying third-party footage that we knew we had somewhere in the archive. But finding it - the difference between knowing there's an elephant in a clip and actually finding the clip that would work for the show - that effort would cost us more in manpower than just buying a clip from a third party. And that's not infrequent. So one of the big use cases is using AI - transcription, image recognition, video recognition, maybe some NLP on top of that - to really get a better idea of what's inside your content.
There's a really interesting use case with a particular customer where they've spent a lot of time tagging movie clips for emotions. The first pass gets you car, house, animal, face - the basic object tags. The next pass is more interesting: if I see a cafe and the Arc de Triomphe, we're in Paris. That next layer of meta-assumption on top of the tags you've already got is where it gets really rich. A company we're working with tagged a lot of clips themselves with concepts like "triumphant" and "teamwork" - really amorphous concepts. Now that they've got all that tag data, we're building a model where you can do that tagging automatically off the metadata created by managed services. When you see smiling and more than three people, you can assume that's teamwork. There's a lot of cool stuff you can build on top of basic tagging once you have it.
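The second-pass inference Adam describes - deriving a concept like "teamwork" from basic object tags - can be sketched as a small rule engine over per-clip tags. All tag names, rules, and thresholds below are illustrative, not drawn from any real service's output:

```python
# Illustrative rule engine: derive higher-level concept tags from the
# basic object/attribute tags a managed vision service might emit.
# Concept names and rules are hypothetical examples from the episode.

CONCEPT_RULES = [
    # (concept, required basic tags, minimum detected-face count)
    ("teamwork", {"smiling"}, 4),                 # smiling + more than three people
    ("paris",    {"cafe", "arc de triomphe"}, 0), # cafe + Arc de Triomphe => Paris
]

def infer_concepts(tags, face_count):
    """Return concept tags implied by a clip's basic tags."""
    tagset = {t.lower() for t in tags}
    concepts = []
    for concept, required, min_faces in CONCEPT_RULES:
        if required <= tagset and face_count >= min_faces:
            concepts.append(concept)
    return concepts
```

In practice, the hand-labeled clips Adam mentions are what let you replace a rule layer like this with a trained model that learns the mapping from basic tags to concepts.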
Seth Earley: How come I can't find anything on my streaming apps? The search seems like the Wild West.
Adam Sutherland: There's definitely streaming fatigue. This goes back to knowing your customer. There used to be a clear delineation: I make content, I hand it to somebody who distributes it, they work with an advertising agency to monetize it. Those swim lanes are anything but clear now. Companies born in the cloud like Netflix had to own that whole value chain from the beginning and really understand what customers were doing. So personalization becomes critical - recommending content based on what somebody has watched before.
Lots of complications there: if people share accounts it doesn't always work well. And most folks I talk to are very cognizant about not going too far down the personalization route because you create an echo chamber. So there's this idea of manufactured serendipity - 80% of recommendations are algorithm-driven and 20% are deliberately random, just to make sure something unexpected gets interjected.
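The 80/20 "manufactured serendipity" mix can be sketched in a few lines. The function name and the exact split are illustrative; real services tune the ratio and sample per session:

```python
import random

def manufactured_serendipity(personalized, catalog, n=10, random_share=0.2, seed=None):
    """Mix algorithmic recommendations with deliberately random picks.

    By default ~80% of the slate comes from the personalization engine
    and ~20% is sampled from the wider catalog, so something unexpected
    always gets interjected and the echo chamber is broken.
    """
    rng = random.Random(seed)
    n_random = max(1, round(n * random_share))
    n_personal = n - n_random
    picks = list(personalized[:n_personal])
    # Sample the random slots from titles not already on the slate.
    pool = [c for c in catalog if c not in picks]
    picks += rng.sample(pool, min(n_random, len(pool)))
    return picks
```

The seed only matters for reproducibility in testing; in production the random slots would differ on every visit, which is the point.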
One of the best things Amazon.com ever did, in my opinion, was the little checkbox that says "this is a gift." The power of Amazon was collaborative filtering - I'd go in and see stuff that really gets me, things I actually want. But when I started using it to buy presents for my kids it messed everything up. Once they introduced that "this is a gift" checkbox and I could say don't put this in my profile, it went back to being really efficient and targeted. That simple signal completely changed the quality of what I was getting recommended for myself.
Chris Featherstone: What's the biggest barrier to entry for media companies trying to implement AI solutions?
Adam Sutherland: What we see a lot is people who don't control the value chain. You either have conversations with people who are way down in the weeds of technology, or you have conversations with people trying to solve business outcomes - and either one, you realize they're not talking to each other. The real value would be if they integrated AI through their whole stack. What we try to do is tell folks: even if you're not ready for all the use cases - search and discovery, captioning, content moderation - do the tagging. Because once you've got that foundation, you can do all the other stuff. The biggest hurdle is finding a customer who connects the dots across the different inputs to get where you want to be.
Seth Earley: Someone once said to me: just tag everything with everything. What level of granularity should organizations be thinking about, and when does something need to be bespoke?
Adam Sutherland: Most services let you set a confidence threshold - a minimum confidence score - and the trade-off is: would you rather have false positives, or would you rather miss a few things? It depends on the use case. A customer comfortable with 60% confidence for transcription used in captioning is fine, because there's going to be a human in the loop downstream. On the other end of the spectrum, someone wanting 80 to 90% assurance that something needs to be moderated is a different threshold entirely.
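Those two thresholds suggest a simple triage pattern - trust high-confidence output automatically, route the middle band to a human editor, drop the rest. The cutoffs below are the illustrative numbers from the conversation, not recommendations:

```python
def triage(labels, accept_at=0.85, review_at=0.60):
    """Route model output by confidence score.

    Thresholds are illustrative: at or above `accept_at` a label is
    trusted automatically; between `review_at` and `accept_at` it goes
    to a human editor; below `review_at` it is dropped. `labels` is a
    list of (name, confidence) pairs as a tagging service might emit.
    """
    auto, review, dropped = [], [], []
    for name, conf in labels:
        if conf >= accept_at:
            auto.append(name)
        elif conf >= review_at:
            review.append(name)
        else:
            dropped.append(name)
    return auto, review, dropped
```

For captioning you would widen the review band; for automated moderation you would raise `accept_at` and act only on the top tier.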
When it comes to bespoke models: generalized models allow some customization - custom vocabulary, face libraries, things like that. But the more specific the use case, the more you'll want a bespoke model. I recently spoke with someone who wanted to identify not only which sport an athlete was doing, but whether they were doing it well. That's exactly the kind of scenario where off-the-shelf generalized models won't work for you.
The other dimension is: is this a strategic differentiator for your business? If you're a translation house and you're both using the same generalized translation model as your competitor down the street, you have no advantage. If it's a strategic differentiator, it's worth the investment in data scientists.
Seth Earley: How do recommendation engines differ across media organizations, and how important are they becoming?
Adam Sutherland: There are two big categories: content for people, and people for content. Content for people is the core - you have a lot of content, someone logs in, you show them something similar to what they've watched before that you know will make them happy. People for content is: how do you grow your audience, how do you have the right marketing message for the right cohort? The algorithms can do both but the underlying training data is different.
From my perspective, the algorithms and methods are out there - there's not a secret algorithm that somebody built in isolation. It really comes down to data governance: what are you collecting, how consistently do you manage that data? Some customers we work with run three or four different personalization engines simultaneously. They constantly optimize which engine's results appear where - they might always keep the human-curated editorial rail at the top, but everything else is up for grabs based on what you interact with.
Where knowing your customer and knowing your content really intersect is personalization. And what I find really interesting is the data lake angle. You collect behavioral data - where customers fall off in a video, how long they consume it, whether they watch the whole thing. Once you start marrying that content metadata with that engagement data, you can say by demographic: my audience drops off after a really violent scene. You can then start making editorial content decisions - not just recommendations, but what you create in the future. You can dial down the violence in action films across the board, or produce a specific version for a specific cohort. That's a really powerful level of personalization that's starting to be possible.
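A minimal sketch of marrying the two data sets, assuming drop-off events for one title have already been extracted and scene spans carry tags from the content metadata. All field shapes and names here are hypothetical:

```python
from collections import defaultdict

def dropoff_by_scene_tag(events, scene_tags):
    """Count abandonment events per scene tag, split by demographic.

    `events`: (demographic, dropoff_timestamp_seconds) pairs for one title.
    `scene_tags`: (start, end, tag) spans from the content metadata.
    Returns {demographic: {tag: count}} - e.g. which cohorts quit
    during violent scenes, which is the editorial signal Adam describes.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for demo, t in events:
        for start, end, tag in scene_tags:
            if start <= t < end:
                counts[demo][tag] += 1
    return {d: dict(tags) for d, tags in counts.items()}
```

With this join in hand, "dial down the violence for cohort X" becomes a query over the counts rather than an editorial hunch.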
Chris Featherstone: What emerging trends should people be keeping an eye on?
Adam Sutherland: One is upscaling - you take an old film shot at 24 frames per second or fewer, and the algorithm creates seven or eight interpolated frames between each existing frame to raise the frame rate, while also upscaling the picture to a full HD file. You're essentially creating new visual information that wasn't there, and the results are impressive.
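Production pipelines use learned, motion-compensated models for this; a naive linear blend is enough to show what "creating frames that were never shot" means. Frames here are just flat lists of pixel values:

```python
def interpolate_frames(frames, n_between):
    """Insert `n_between` synthesized frames between each existing pair.

    Real upscaling systems use learned motion-compensated interpolation;
    the linear blend below only illustrates that the in-between frames
    are computed, not recorded. Each frame is a flat list of pixel values.
    """
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        for k in range(1, n_between + 1):
            w = k / (n_between + 1)  # blend weight moves from a toward b
            out.append([(1 - w) * pa + w * pb for pa, pb in zip(a, b)])
    out.append(frames[-1])
    return out
```

With `n_between=7`, a 24 fps source becomes 8x as many frames per original interval, which is the kind of multiplication Adam describes.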
There's also a burgeoning area of AI-generated content creation. There's a short film where the entire script was AI-generated - they fed a bunch of sci-fi scripts into a model, the model produced a script, they hired real actors and made a five-minute movie. It's not good right now. But one day it will be better. And some networks are starting to build models where they feed in screenplays and the model tells them directionally whether this is going to be a hit - not a binary yes or no, but directional signal, similar to how AI is being used for resume review in hiring.
On the commercial side: if you're in a stadium you see those rotating interactive banners on the pitch, and those banners switch based on how much screen time each brand has had on the telecast. Right now there's a human watching with a stopwatch - you can automate that. And the ultimate expression of that is product placement tracking and replacement. Maybe you made a show five years ago where the lead sponsor was Car Company A. Now you want to repurpose that archive content. If the rights allow, you can use AI to superimpose Car Company B over every appearance of Car Company A in the original footage. Very cutting edge, not at scale yet, but it's happening.
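Automating that stopwatch is straightforward once a logo-detection model emits per-frame brand hits. This sketch assumes detections arrive as one set of brand names per broadcast frame, which is a hypothetical interface, not any particular product's output:

```python
def brand_screen_time(detections, fps):
    """Total on-screen seconds per brand from per-frame logo detections.

    `detections` is a list with one entry per broadcast frame, each a
    set of brand names detected in that frame. Replaces the human with
    a stopwatch that Adam describes watching the telecast.
    """
    seconds = {}
    for frame_brands in detections:
        for brand in frame_brands:
            seconds[brand] = seconds.get(brand, 0.0) + 1.0 / fps
    return seconds
```

The running totals from this tally are what would drive the pitch-side banner rotation toward whichever brand is owed screen time.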
Seth Earley: Any blue-sky predictions? Synthesized actors?
Adam Sutherland: What's going to be really interesting is the rights questions around derivatives. If you're building a synthetic person from scratch, that's work for hire - you own it. Where it gets interesting is taking an existing actor's voice, training a model to sound exactly like them, and then using that for voiceovers and dubbing. When I lived in Japan, I watched a Schwarzenegger film that was dubbed and it didn't sound anything like him. Now you can actually train a voice model that can speak another language in that actor's voice. But the question becomes: what are the rights around that, who gets paid? There's going to be a lot of these derivative situations that raise unresolved questions.
The new battleground overall is data. Not only who owns the data, but who owns the interface with the customer that takes advantage of that data. Good content at scale is table stakes - everybody expects that. Where everybody is competing now is on customer experience, and the real driver for a good customer experience is data.
Seth Earley: Thank you so much, Adam. And before we close, I also want to thank our sponsors - the Marketing AI Institute and CMSWire, wonderful resources for digging deeply into many of these topics. Thank you to Sharon, our producer, for everything behind the scenes.
Adam Sutherland: My pleasure. Thank you guys.
Chris Featherstone: Adam, it's always a pleasure. Love hearing your insights. Thanks for the time.
