Introducing VCS WebFrame to Daily’s recording and compositing framework

We’re very happy to share WebFrame, a new component now available as part of Daily's Video Component System (VCS) beta, our powerful compositing framework. WebFrame enables developers to embed live web content into a custom Daily-built application that uses cloud recording or live streaming.

Recording/streaming web content typically requires using screen share functionality to capture a browser window. (In a live call a participant screen shares so the content gets recorded or streamed.) Due to the way screen media is processed, this often results in a low quality capture, which is also subject to a user's manual actions. There’s a constant risk that the wrong content gets displayed, or that the capture is affected by other user errors or local technical problems. Wouldn’t it be better to leave this job to a server that renders exactly what’s needed, right when it’s needed?

Our new WebFrame release offers this advantage. It gives developers the tools to create polished, compelling content, optimized for the recording use case. WebFrame supports:

  • Server-side rendering with programmatic API control over the embedded browser. Your application — not the user’s manual screen share — decides what web content is rendered and when.
  • Extensive control over how the WebFrame overlay is positioned in the layout.
  • Higher visual fidelity than a screen share because WebFrame is rendered directly into the final encoding.
  • The ability to specify the embedded browser’s exact size and scaling to ensure text content will look just right in your video stream.

In this post I'll introduce you to WebFrame, its advantages over other forms of web page recording, and its many use cases. We'll wrap up with a hands-on tutorial on using the component.

💡
WebFrame is built for custom applications developed using Daily's client SDKs. For Daily Prebuilt, our full-featured video embed, check out our recently announced Prebuilt Integrations API.

What are web page recording and WebFrame?

Web page recording is a common use case in which developers seek to capture the contents of a browser window, such as a whiteboard. For a practical example, consider a dynamic whiteboard shared by the participants on a video call. The whiteboard content should be displayed to all participants as part of the UI, but also included in the recording of the call.

Other use cases include:

  • Collaboration tools in addition to whiteboards
  • Shared cloud documents (e.g., Google Docs)
  • Info widgets (weather, news sidebars, stock tickers, etc.)

WebFrame solves this problem by rendering the whiteboard on Daily’s cloud directly into the recording. WebFrame works on the server, so you don’t have to modify your client application other than sending the right properties to enable the rendering. It lets you embed dynamic web content into the streams or recordings that you’re producing on Daily’s cloud.

How it compares to headless browser instances

Historically, this has been achieved using headless browser instances that capture the UI. However, headless browsers are an expensive and difficult-to-scale solution because of the graphical horsepower required to render arbitrary web content. Headless browsers are also very difficult to debug compared to a local web app.

In contrast, WebFrame uses a built-for-purpose video compositing pipeline, making it far less expensive and more robust and scalable than browser-only cloud compute instances. What’s more, WebFrame doesn’t require additional pre-configuration or concurrent session management — it’s seamlessly available as a feature of VCS.

WebFrame and VCS

The new WebFrame component expands the capabilities of VCS with support for web-based content. Accessing it can be as simple as toggling the showWebFrameOverlay property when starting or updating a recording or live stream.

💡
Video Component System (VCS) is Daily’s cloud-native toolkit for implementing custom layouts and dynamic elements in interactive video applications. VCS is a core part of our recording and streaming infrastructure. If you want to learn how it all works together, our CEO Kwindla Kramer recently wrote this blog post about VCS and media pipelines – highly recommended!

WebFrame is also interactive! You can send simulated key press events to WebFrame’s server-side browser. This enables remote control of most existing web apps. For example, to display a Google Slides presentation, you can simply send arrow keys to advance between slides.

Here’s a 24-second video (no sound) showcasing some of these features in a Daily room:

In the tutorial part of this post you’ll find code examples for how to load a web page into WebFrame, position the component, and send key press events. But first, let’s take a deeper look at how WebFrame fits into the VCS framework and Daily’s platform.

How WebFrame works

Until now you had two supported content types in our VCS compositing framework:

  • Video layers. These are the participant video streams available within the Daily room. VCS offers many options that let you control the layout of the video frames and choose which participants to include.
  • Graphics. Built-in VCS components including Image, Text, and Box can be combined to create a variety of motion graphics. These VCS components are designed for real-time rendering, so you can animate the graphics elements with confidence that it won’t impact the performance of your overall video composition on the server.

WebFrame is a third, brand new content type. It lets you handle dynamic content with the same ease as overlay images. When you load a web page URL into the WebFrame component, it’s actually rendered by a Chrome-compatible web browser instance immediately available on Daily’s media server.

WebFrame offers the highest visual fidelity. It renders web content in the same pass as VCS overlay graphics, just before the video stream is encoded to your destination endpoints (e.g., MP4 recording, RTMP streaming, or HLS video distribution). This ensures that text remains legible, whiteboard graphics don’t suffer from blurry encoding artifacts, and so on. That’s a substantial quality advantage over a regular screen share, which goes through multiple encoding stages and is rendered at the original display resolution of the participant sharing their screen (usually not a good match for the final video stream’s resolution).

When WebFrame is a good fit

WebFrame is particularly great for shared online documents. It updates in real time, so if your participants are working together on a whiteboard or collaborative timesheet or some other heart-rending masterpiece, the recording captured by WebFrame will exactly reflect what the participants saw on their own devices viewing that same document URL.

WebFrame also works very well for small widget-style info displays embedded in a corner or on the edge of your video. It doesn’t go through any extra encoding steps, so text and graphics will be as crisp as possible in the output video.

However, WebFrame is not meant for media content such as video playback or animated graphics. The embedded browser isn’t guaranteed to update more than a few times per second, which is fine for documents but not enough for high-motion content. WebFrame doesn’t play audio either.

💡
For the media use case, Daily offers the Remote Media Player API, currently in closed beta. You can combine WebFrame and Remote Media Player in the same app to leverage both. (For more information on Remote Media Player, contact our support team.)

A hands-on WebFrame tutorial

Let’s embed a Google Slides presentation into a recording and send it simulated arrow keys to change slides.

I’ve prepared a demo using one of Google’s templates and lots of “lorem ipsum” placeholder text. You can use that link or your own. (If using your own URL, remember that you should use “Publish to web” to make the presentation accessible so that Daily’s media server will be able to render your link.)

I’m assuming you already have a web app that connects to a Daily room using the daily-js front-end library. If not, you can use our recording example as a starting point.

As usual, you’d start a recording using the startRecording() call object instance method.
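
As a minimal sketch of that first step, here’s how starting a recording in custom layout mode might look. It assumes `call` is your existing daily-js call object; the helper name `startCustomRecording` is hypothetical and only exists to keep the example self-contained.

```javascript
// Hypothetical helper (not part of daily-js): starts a recording with the
// "custom" layout preset so later updateRecording() calls can send VCS
// composition params such as the WebFrame settings.
function startCustomRecording(call) {
  const options = {
    layout: {
      preset: "custom",
      composition_params: {
        mode: "grid",
      },
    },
  };
  call.startRecording(options);
  return options;
}

// Usage, assuming `call` is a joined daily-js call object:
// startCustomRecording(call);
```

Returning the options object is just a convenience for logging or debugging; the important part is that the recording starts with the custom preset.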

Then, we’ll update the recording to send WebFrame parameters:

call.updateRecording({
  layout: {
    preset: "custom",
    composition_params: {
      mode: 'grid',
      showWebFrameOverlay: true,
      "webFrame.url": "https://docs.google.com/presentation/d/e/2PACX-1vTLKgKalkVn2Ol-vmMr1CWo3f5uFEKmS4vH_zKXVLeHmMPgWbij542H3UbxBqNhcxcF7spzQxcwvGx-/pub?start=false&loop=false&delayms=60000&slide=id.p",
      "webFrame.viewportWidth_px": 1080,
      "webFrame.viewportHeight_px": 1080,
      "webFrame.position": "top-right",
      "webFrame.margin_gu": 0,
      "webFrame.height_gu": 36,
      "videoSettings.margin.right_gu": 36,
    },
  },
});

I’m specifying the embedded browser’s viewport size as 1080×1080 pixels. The aspect ratio here is square so we can place it side-by-side with the video grid:

Video call participant on the left and a WebFrame embed on the right.

In this screenshot, I’m the only person on the call. If more people joined, the grid would be rendered on the left-hand side in the available space.

To accomplish this side-by-side layout, I’m first setting the WebFrame’s position to “top-right” and then adjusting two sets of position parameters. On the WebFrame overlay, we’re setting its margin to zero and its height to 36 grid units (full height of a 16:9 viewport). We’re also setting the right margin on our video content to 36 grid units. This will place the videos on the left side, and they won’t overlap the WebFrame content.
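
If you’d rather generate these parameters than hard-code them, the reasoning above can be sketched as a small builder. The function names `buildSideBySideParams` and `guToPx` are hypothetical, and the grid-unit conversion relies only on the statement above that 36 grid units span the full height of a 16:9 viewport.

```javascript
// From the text above: 36 grid units span the full height of a 16:9 viewport.
const FULL_HEIGHT_GU = 36;

// Hypothetical helper: builds composition params for a square WebFrame docked
// to the top-right corner, with the video grid kept out of its way.
function buildSideBySideParams(url, viewportSizePx) {
  return {
    mode: "grid",
    showWebFrameOverlay: true,
    "webFrame.url": url,
    "webFrame.viewportWidth_px": viewportSizePx,
    "webFrame.viewportHeight_px": viewportSizePx,
    "webFrame.position": "top-right",
    "webFrame.margin_gu": 0,
    "webFrame.height_gu": FULL_HEIGHT_GU,
    // A square WebFrame at full height is also 36 gu wide, so reserve the
    // same amount of space on the right side of the video area.
    "videoSettings.margin.right_gu": FULL_HEIGHT_GU,
  };
}

// Hypothetical helper: converts grid units to pixels for a given output height.
function guToPx(gu, outputHeightPx) {
  return (gu * outputHeightPx) / FULL_HEIGHT_GU;
}

const sideBySide = buildSideBySideParams("https://example.com/slides", 1080);
console.log(sideBySide["videoSettings.margin.right_gu"]); // 36
console.log(guToPx(36, 720)); // 720

// Usage, assuming `call` is a daily-js call object with a recording running:
// call.updateRecording({
//   layout: { preset: "custom", composition_params: sideBySide },
// });
```

The URL here is a placeholder; swap in your own published presentation link.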

💡
You don’t have to work out these layout settings manually as code. I didn’t! I used the VCS Simulator. It’s a graphical interface that lets you tweak every available parameter. You can use the “Record” functionality to create a JSON object ready to drop into your code.

Simulated key presses

The previous updateRecording() call loaded our Google Slides presentation into WebFrame. Now let’s send another update to advance to the next slide:

let kpkey = 0;

call.updateRecording({
  layout: {
    preset: "custom",
    composition_params: {
      "webFrame.keyPress.keyName": "ArrowRight",
      "webFrame.keyPress.modifiers": "",
      "webFrame.keyPress.key": ++kpkey
    }
  }
});

What’s the kpkey variable doing here? We need some way to tell WebFrame that this is a new key press. If we just send the key name and modifier, VCS has no way of knowing that this is an update: after all, we might send several instances of the same simulated key one after another, so there needs to be a way to distinguish them. We must send a new value for webFrame.keyPress.key to trigger the action. An integer that gets incremented for every key press works great as the key.

(As a side note, I realize that it’s potentially confusing that this value is called key when we’re talking about key presses… But VCS is based on React, and it’s standard React convention to call this type of unique identifier a key, so we don’t want to deviate from the standard.)

To move to the next slide now, you could just send an update with:

 "webFrame.keyPress.key": ++kpkey

Incrementing kpkey will make this register as a new key press with the same keyName as before.

Or, to go backwards, you could change the keyName as part of the call:

"webFrame.keyPress.keyName": "ArrowLeft",
"webFrame.keyPress.key": ++kpkey

It’s also possible to send modifier keys like Ctrl and Alt, which can be useful to access a web app’s shortcuts.

You can find the list of supported key names and modifiers in the VCS WebFrame parameter docs.
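
To avoid forgetting the increment, the counter can be wrapped in a small helper. This is only a sketch: `createKeyPressBuilder` is a hypothetical name, and the modifiers string is passed through untouched, so check the parameter docs for the exact key names and modifier format your app needs.

```javascript
// Hypothetical helper: returns a function that builds an updateRecording()
// payload for a simulated key press. Each call gets a fresh value for
// "webFrame.keyPress.key", so repeated presses of the same key still register.
function createKeyPressBuilder() {
  let kpkey = 0;
  return function buildKeyPress(keyName, modifiers = "") {
    kpkey += 1;
    return {
      layout: {
        preset: "custom",
        composition_params: {
          "webFrame.keyPress.keyName": keyName,
          "webFrame.keyPress.modifiers": modifiers,
          "webFrame.keyPress.key": kpkey,
        },
      },
    };
  };
}

const buildKeyPress = createKeyPressBuilder();
const forward = buildKeyPress("ArrowRight");
const forwardAgain = buildKeyPress("ArrowRight");
const backward = buildKeyPress("ArrowLeft");

console.log(forward.layout.composition_params["webFrame.keyPress.key"]); // 1
console.log(forwardAgain.layout.composition_params["webFrame.keyPress.key"]); // 2
console.log(backward.layout.composition_params["webFrame.keyPress.key"]); // 3

// Usage, assuming `call` is a daily-js call object with a recording running:
// call.updateRecording(buildKeyPress("ArrowRight"));
```

Keeping the counter inside a closure means every part of your app that sends key presses shares one sequence, as long as they share the same builder.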

Changing the layout

While WebFrame is on the screen, you’re free to use the various parameters available in the VCS baseline composition to modify the layout.

For example, here’s a split layout (showing myself talking to myself):

Two video call participant tiles on the left and WebFrame on the right

You can make this layout by specifying the split mode and selecting the horizontal split direction, as follows:

call.updateRecording({
  layout: {
    preset: "custom",
    composition_params: {
      mode: 'split',
      "videoSettings.split.direction": "horizontal",
    },
  },
});

The video layout will fit into the available space on the left because we previously specified a margin that leaves room for the WebFrame display. (This was done by passing "videoSettings.margin.right_gu": 36 in our first update call above.)

Fully custom layouts

There are a number of parameters available in the VCS baseline composition to control the layout of the WebFrame component. Previously in this tutorial we used webFrame.position, webFrame.height_gu, and webFrame.margin_gu to place it on screen.

Yet for some applications you might need further control over the WebFrame component’s placement and rendering. For example, you might want to render a background border around the component so its content has a bit of padding (like a framed picture).

This kind of custom rendering is possible using the CustomOverlay component, which you can read more about in our earlier tutorial on adding a dynamic moving watermark to any live stream or recording. Long story short, a custom overlay is effectively a React component that runs on Daily’s rendering server, and which you can upload through the session assets API. You can use the WebFrame VCS component directly and combine it with other components like Box and Text to create all kinds of composite rendering.

Conclusion

In this post, we’ve covered our new VCS WebFrame component. There are many options, but for the simple case it’s quite straightforward to get started:

  • Set the showWebFrameOverlay property when starting or updating a Daily recording or live stream in custom layout mode;
  • Pass in a URL to be rendered.
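
Those two steps can be sketched as follows. The helper names are hypothetical, `call` is assumed to be a daily-js call object with a recording or live stream already running in custom layout mode, and the `kind` switch is just one possible way to choose between the recording and live streaming update methods.

```javascript
// Hypothetical helper: the smallest layout that turns WebFrame on for a URL.
function buildMinimalWebFrameLayout(url) {
  return {
    preset: "custom",
    composition_params: {
      showWebFrameOverlay: true,
      "webFrame.url": url,
    },
  };
}

// Hypothetical wrapper: sends the layout to a recording (default) or to a
// live stream, depending on `kind`.
function showWebPage(call, url, kind = "recording") {
  const layout = buildMinimalWebFrameLayout(url);
  if (kind === "live-stream") {
    call.updateLiveStreaming({ layout });
  } else {
    call.updateRecording({ layout });
  }
  return layout;
}

// Usage, assuming `call` is a daily-js call object:
// showWebPage(call, "https://example.com/whiteboard");
// showWebPage(call, "https://example.com/whiteboard", "live-stream");
```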

We went through some possible use cases for the component, its benefits and limitations, and walked through a WebFrame usage example. If you have any questions about WebFrame or VCS, reach out to our support team or get in touch through our WebRTC developer community.
