The Viewer Takes the Switchboard
From One-to-Many Broadcasts to AI-Composed, Prompt-Driven Live Experiences
What if you could direct your own live Super Bowl broadcast, switching cameras, angles, stats, and commentary in real time, tailored just for you? Since the early days of a few TV channels captivating millions, viewer attention has fragmented across thousands of shows and now billions of videos. Recommender algorithms at Netflix and YouTube brought personalized curation, sifting through the “title noise” to surface a tailored experience.
Zoom into a piece of content, toss high-performance hardware + advanced AI models into the mix, and new types of tailored viewer experiences (VX) become possible.
Cameras are everywhere. There are far more live streams than people can watch. Production and composition used to be a one-and-done endeavor. Now they become unique per viewer, like a hologram: every observer sees a similar but different version, shaped by perspective and traits.
What was once a spectator experience will become an interactive experience, driven by the viewer’s choices and powered by an abundance of streams, meta-data, and AI vision models pulling the levers.
Composition metamorphoses from an editorial artifact into a runtime process.
Natural Language Everything
When we study the progression of Human-Computer Interaction, there are key milestones: the teletypewriter, aka the terminal/TTY (like the computer screens in The Matrix); the window manager (keyboard + mouse); the web (window manager + networked content); and mobile (all of that, but on the go). Today we have emerging agentic systems (voice/text driven).
Blade Runner [1982] depicts AI interaction quite succinctly. Deckard (Harrison Ford) sits on the couch, whiskey and bottle in hand. He inserts a photo into a Polaroid-type scanner and starts talking to his TV to drive image analysis. As Drunken Deckard interacts with the computer by voice, he directs it to scroll, pan, zoom, and print a copy, all while fumbling his words, making mistakes, and telling it “no, no, no … go back.”
This was prescient. At the time, desktops and window managers were just conceptual prototypes in tech labs. The scene hits on a number of levels. It leapfrogs window-and-mouse interaction. It demonstrates a future where even an inebriated user can interact with intelligent systems to get help with real-world tasks. The system adjusts to the state of mind of the user.
Each change arrives as a generational shift, but this one is an order-of-magnitude improvement. From TTY, Desktop, Web, Mobile, to AI + Natural Language ← you are here
Operators Become Builders
Sidequesting into another realm, we can learn from Agentic Engineering. In just a few short years, the efficacy of frontier models at writing code has so drastically reshaped engineering that any engineer who does not embrace it will soon be obsolete. Couple that with top reasoning models like Opus 4.6 and GPT-5.3 Codex, which exhibit longer unassisted run-times on the order of hours. The machines are getting smarter.
The game changer is Planning. You submit a specification. The system interacts with you, asking questions. It seeks disambiguation. By asking the questions upfront, it can then run unassisted. You come back many hours later, and it’s wrapping up what would normally have been weeks’ worth of work.
Despite the fact that the system is rapidly becoming more tactically adept than the Engineer, it often lacks high-level strategy. It takes two to tango. The value of the user is experience and “taste”... leading to a better strategy of what to create and why. Domain expertise is becoming as important as software expertise.
The impact is not another bejeweled buggywhip. Operators are moving upstream in their decision authority, defining the system. It’s a reallocation of their leverage within the organization.
So what becomes possible with these systems and tailoring VX?
Natural Language Composition
Natural language is the new remote control for viewer-directed live composition.
We can compare consumer live video VXs to commercial solutions. Commercial live video “Systems of Action” (SOAs) have much in common with consumer use cases for live content: they ingest large quantities of live streams, employ real-time AI inference to extract meta-data, couple it with telemetry from other systems and sensors, and deliver a vastly reduced set of audio/video/event streams.
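As a rough sketch of that pipeline (all types and names here are hypothetical, not any particular SOA’s API):

```typescript
// Hypothetical sketch of an SOA-style pipeline: fuse per-stream AI
// inference with external telemetry, then keep only the streams that
// currently carry something the downstream mix cares about.

interface FrameMeta {
  streamId: string;
  ts: number;              // presentation timestamp (ms)
  labels: string[];        // e.g. ["athlete:42", "goal", "crowd"]
  confidence: number;      // 0..1 from the vision model
}

interface TelemetrySample {
  sourceId: string;        // e.g. a drone or wearable sensor
  ts: number;
  speedKph: number;
}

function selectStreams(
  inference: FrameMeta[],
  telemetry: TelemetrySample[],
  wanted: Set<string>,
): FrameMeta[] {
  const bySource = new Map(
    telemetry.map((t): [string, TelemetrySample] => [t.sourceId, t]),
  );
  return inference.filter(
    (f) =>
      f.confidence > 0.8 &&                 // drop low-confidence detections
      f.labels.some((l) => wanted.has(l)) &&
      bySource.has(f.streamId),             // require accompanying telemetry
  );
}
```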
The consumer side has additional parameters such as monetization, measurement, and rights management. SOAs lack these requirements and often operate on private networks, so they encounter less friction to build and deploy. That’s why they are further along in adoption.
This brings to mind the recent 2026 Winter Olympics, where intermixing telemetry and meta-data with live feeds from high-speed drones produced a novel VX: a drone follows downhill skiers at 120 mph while overlaying location, speed, and milestones along the race course in real time. Now imagine many targets followed by many drones and cameras, and you want to follow the snowboard cross racer you are rooting for. That’s a prompt.
Extend the examples to other live events, like the Super Bowl, the PGA Masters, or Coachella. Each viewer chooses their own adventure. The VX is fluid, changing with the observer’s profile, preferences, and instructions:
Focus on this team. Follow these athletes. Show these stats every 3 minutes, or when a goal happens, or when a course is completed.
Make a playlist of these bands. Focus on the rhythm section, sometimes showing the top view of the drummer. Occasionally use b-roll from the side stage and the audience. Show the song that’s playing, and if it’s a cover, who originally wrote it, and other popular bands that also did….
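As a minimal sketch, one way a prompt compiler could turn instructions like these into a declarative composition spec (the types and field names are invented for illustration):

```typescript
// "Focus on this team. Show these stats every 3 minutes, or when a goal happens."
interface CompositionSpec {
  follow: string[];                 // subjects to keep in frame
  overlays: OverlayRule[];
  bRoll?: { sources: string[]; frequency: "occasional" | "frequent" };
}

interface OverlayRule {
  content: string;                  // e.g. "team-stats"
  trigger:
    | { every: number }             // seconds
    | { onEvent: string };          // e.g. "goal", "course-complete"
}

const spec: CompositionSpec = {
  follow: ["team:home"],
  overlays: [
    { content: "team-stats", trigger: { every: 180 } },
    { content: "team-stats", trigger: { onEvent: "goal" } },
  ],
};
```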
The prompts are limited only by the meta-data that’s present and the viewer’s imagination. With personal AI systems emerging and gaining long-tail memory, they too will write and dispatch prompts on behalf of their User.
Owning the Fan mentions an entirely new category of sports-fan-specific meta-data that comes to the party, coupled with meta-data extracted in real time from each stream. These event streams become the backbone on which user choice can be overlaid.
Similar to SOAs, live content contains a sea of possible streams. The observer only cares about certain things. The rest is thrown away. The salient pieces are composed with intent. Attention is captured and retained.
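A tiny sketch of that selection step, assuming events arrive synchronized to the stream (the event and rule shapes are hypothetical):

```typescript
// Events arrive as records synchronized to the stream; a viewer rule
// either matches an event (and drives the mix) or the event is discarded.

interface StreamEvent {
  ts: number;          // stream-synchronized timestamp
  type: string;        // e.g. "pass", "crash", "goal"
  streamId: string;
}

type ViewerRule = (e: StreamEvent) => boolean;

function mixCues(events: StreamEvent[], rules: ViewerRule[]): StreamEvent[] {
  // Keep only the salient events; the rest is thrown away.
  return events.filter((e) => rules.some((matches) => matches(e)));
}

// e.g. "follow the snowboard cross racer I'm rooting for" keeps only drone-3
const cues = mixCues(
  [
    { ts: 10, type: "pass", streamId: "drone-3" },
    { ts: 12, type: "crash", streamId: "cam-7" },
  ],
  [(e) => e.streamId === "drone-3"],
);
```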
Shifting Control Inwards, Socializing Attention Outwards
Live television becomes a rendering engine, not a finished product. Whether rendering happens at the Content Production Network (CPN) or at the device level, dynamic and interactive video composition is becoming a tenable VX. When done at the CPN level, a given Viewer’s mix, the collection of prompts that produce a stream, can be shared socially: watch X Viewers’ mixes of Y events.
For this content series, show me the “Mystery Science Theatre 3000” commentary from anyone in my FB friend group.
On-screen overlays of comments and/or audio intersect with and augment the video streaming experience. User-specific meta-data + content.
If done at the device level, there may be more flexibility because the experience is rendered locally rather than generating a new distributable stream, but rights and policy constraints still apply.
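A shareable CPN mix might be represented by something as small as this (a sketch; none of these fields come from an existing standard):

```typescript
// Enough to let another Viewer replay the same composition of the same event.
interface SharedMix {
  mixId: string;
  eventId: string;               // e.g. "superbowl-2026"
  prompts: string[];             // the natural-language instructions
  renderedAt: "cpn" | "device";  // CPN mixes yield a distributable stream
  rightsToken?: string;          // policy checks still apply on replay
}

const mix: SharedMix = {
  mixId: "mix-8841",
  eventId: "worldcup-final",
  prompts: ["Follow the keeper. Show save stats after every save."],
  renderedAt: "cpn",
};
```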
Is that a real Owl?
There is an emerging universe of AI-generated video and user-driven content generation unfolding. Amid this sea of unbounded fictional content, however, actual events representing “real live people and experiences” will be a genre considered special because of the authenticity of human experience.
In another Blade Runner scene, Rachael asks Tyrell if an owl perched in the room is real. The scene takes place at a point so far in the future that an actual animal is genuinely rare; by then, all living creatures are synthetically generated. Video content is heading there rapidly. Authentic, real, live events will become valued in this world of questionable reality.
AI-generated video has lately been attracting lots of attention. Another big wave is coming, in which real events are AI-mixed.
What Breaks?
With traditional, one-and-done artifacts, the Producer has always controlled the eyeballs. The choose-your-own-adventure style of content yields that control. It gives agency to the Viewer. Without robust, accurate, and real-time meta-data, none of this works. The meta-data must be delivered in tandem and synchronized with the stream.
Taking a page from Governance and Standards for CPNs, we see emerging meta-data interchange standards that target static content; with the right modifications, they would unlock and encourage content Originators to share and monetize their live streams with downstream Producers. Without this data in tow, the live streams are far less valuable.
But who gets to see what?
In a world where the meta-data flows, there needs to be governance. The Viewer combines and composites live streams in containers operated by an Intermediary entity. These containers require contractual rights to the streams + associated meta-data.
In addition to the syndication of the meta-data, digital rights must be enumerated and enforced. The treatment of stream composition is NOT a binary choice. The rules of governance dictate how and when a stream is used: whether it requires the full frame; what adjacent content may precede or follow it; which adjacent content is required or excluded as sidebars, titling, closed captions, and/or overlays. Prompt templates are delivered alongside, acting quasi-contractually to guide the Producer to remain in-bounds.
These rules need to be backed contractually between the Originator and the Producer, using profile data to determine the Household and/or Viewer and where they are geolocated. All of this augments the decisioning framework.
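As one hypothetical shape for such a rights descriptor, syndicated by the Originator alongside the stream:

```typescript
// The rules are not binary: they describe how and when a stream may be used.
interface StreamRights {
  streamId: string;
  fullFrameRequired: boolean;          // may it be cropped or tiled?
  allowedOverlays: string[];           // e.g. ["closed-captions", "titling"]
  excludedAdjacentContent: string[];   // content that may not precede/follow
  geoAllowList: string[];              // ISO country codes
  promptTemplates: string[];           // quasi-contractual guardrails
}

// Before composing, the Intermediary checks Viewer context against the rights.
function mayCompose(
  rights: StreamRights,
  viewerGeo: string,
  overlay: string,
): boolean {
  return (
    rights.geoAllowList.includes(viewerGeo) &&
    rights.allowedOverlays.includes(overlay)
  );
}
```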
And… who saw what?
Just as beacons abound for measuring video ad-units, a robust measurement standard linked to the above-mentioned meta-data needs to close the loop. Content is no longer a singleton with predictable elements in a predictable time series. The measurement of eyeballs needs to convey what was seen and when.
Consider VXs like the “sports bar” or Vegas “sports book,” where a single screen shows many sporting events simultaneously, with audio only from the center-stage game. How do you factor concurrently viewed streams in multi-views, deal with mixed attention, and account for the absence of audio on some of them? This is all technically feasible right now, but measuring reach and effect requires an agreeable framework, and modeling attention as fluid rather than binary.
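A sketch of what a beacon for composed, multi-view VXs could carry, treating attention as fluid (these fields are illustrative, not a proposed standard):

```typescript
// Instead of "stream X was watched", record what was on screen, how big it
// was, and with what modeled share of attention, per tick.
interface ViewBeacon {
  sessionId: string;
  ts: number;
  tiles: Array<{
    streamId: string;
    screenShare: number;      // fraction of screen area, 0..1
    audioActive: boolean;     // e.g. only the center-stage game has audio
    attentionWeight: number;  // modeled as fluid, not binary, 0..1
  }>;
}
```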
Surely we can observe beyond the perimeter of commercial cameras. If we expand the range of inputs, what is required to ensure brand safety and to validate source video that comes from devices like cell phones?
When mixing at the device level, what constitutes fair use can often become blurry.
Wrapping
With the right guardrails and meta-data syndication as a service, large-scale, wholesale stream sharing can be done safely and monetized with lower overhead than traditional “one-and-done” production.
Industry leaders convened at the 2026 Montevideo Summer Camp Ventures Innovation Track and spent the weekend exploring these concepts, centered on meta-data, ad-monetization, and digital rights. Montevideo Tech Ventures produced a thought-provoking manifesto proposing solutions to some of these problems. It takes activation energy to reach a higher state, and the presence of solutions will bring stakeholders together.
Putting the Viewer in control, and factoring meta-data that is explicit (from prompts) and implicit (from adjacent Viewer data), means a better-curated experience and longer attention cycles. Content that is not just done for you. It’s done by you. Maximally engaging. Sticky “eyeballs.” Therefore, ideal monetization.
There are crosswinds and headwinds, and tailwinds emerge with standards adoption. Bundling meta-data + rights + measurement in tandem with delivering live streams will unleash new viewer experiences and unlock new forms of viewer attention.




