Sending data to video call participants: “app-message” or dedicated WebSocket?

This is part three of our social gaming series, in which we walk through key features of our Daily-powered social game: Code of Daily: Modern Wordfare. This is a standalone post. You do not need to read the rest of the series to follow along. But if you’re curious about the greater context of our application, check out part one or the rest of the series.

Introduction

It's common for video applications to incorporate features which involve sending data to users from a remote source. This data can range from chat messages to presence information, central application state, and more.

In this post, we'll go through a couple of ways to manage application state and messages to (or between) participants in your Daily-powered video call application. Specifically, we will focus on Daily's "app-message" events and WebSockets.

We will cover examples along with pros and cons of each approach. We'll also discuss why we chose WebSockets for our social game — Code of Daily: Modern Wordfare — and how we implemented this approach.

Following along

If you'd like to see our usage of WebSockets in action, check out the instructions to run the game locally in our introduction to Code of Daily (CoD). You can also follow along in the code repository.

Keeping it simple: Daily's "app-message" events

Daily facilitates the sending and receiving of messages between video call participants via "app-message" events up to 4096 bytes in size. Users can send app messages by using the sendAppMessage() call object instance method. Users can receive others' messages by setting up the call object to listen for incoming "app-message" events.

When in Peer-to-Peer mode, sendAppMessage() uses an RTCDataChannel. When in SFU mode, it uses Daily's own WebSocket.

For many applications, this is more than sufficient. Users should take care not to spam more messages than they need to and to keep data sizes as small as is reasonable, but that's really the case with a server as well: you'd still need to be mindful of what you're sending and how often.
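As a minimal sketch of both halves, the snippet below wraps sendAppMessage() with a size check and registers an "app-message" listener. The AppMessageCall interface is a pared-down stand-in for the Daily call object, not its full type, and PositionUpdate, MAX_APP_MESSAGE_BYTES, serializedSize(), sendIfSmallEnough(), and onPositionUpdate() are hypothetical names used for illustration. Measuring the payload as UTF-8 encoded JSON is also an assumption about how the 4096-byte limit is counted.

```typescript
// Pared-down stand-in for the parts of a Daily call object used here.
// (Assumption: only the two members this sketch needs are modeled.)
interface AppMessageEvent {
  data: unknown;
  fromId: string;
}

interface AppMessageCall {
  sendAppMessage(data: unknown, to?: string): unknown;
  on(event: "app-message", handler: (ev: AppMessageEvent) => void): unknown;
}

// Hypothetical payload for illustration.
interface PositionUpdate {
  kind: "position";
  x: number;
  y: number;
}

// Size limit for a single app message, as stated above.
const MAX_APP_MESSAGE_BYTES = 4096;

// Approximates the message size as UTF-8 encoded JSON (an assumption).
function serializedSize(data: unknown): number {
  return new TextEncoder().encode(JSON.stringify(data)).length;
}

// Sends the payload to all participants ("*") unless it is too large.
// Returns true if the message was handed to the call object.
function sendIfSmallEnough(call: AppMessageCall, data: unknown): boolean {
  if (serializedSize(data) > MAX_APP_MESSAGE_BYTES) {
    return false;
  }
  call.sendAppMessage(data, "*");
  return true;
}

// Registers a listener that forwards position updates to a callback
// and ignores every other kind of app message.
function onPositionUpdate(
  call: AppMessageCall,
  handle: (update: PositionUpdate, fromId: string) => void
): void {
  call.on("app-message", (ev) => {
    const data = ev.data as Partial<PositionUpdate> | null;
    if (data && data.kind === "position") {
      handle(data as PositionUpdate, ev.fromId);
    }
  });
}
```

Tagging each payload with a field such as kind is one way to let several features share the same "app-message" channel without stepping on each other.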

One example of a fairly "beefy" use of Daily's "app-message" events is our spatialization demo. In this example, "app-message" events are being used to send position and zone presence data between participants in the world space.

When would you want a server?

The decision to incorporate a server component in your video call application should be a holistic one, taking into consideration all parts of your application in addition to the video implementation itself.

For example, for Code of Daily: Modern Wordfare, we ended up choosing a server approach based on the following considerations:

Knowing we'd need to synchronize central game state, not just participant state.
Potential performance impact once we add features like cursor sharing, which would involve even more data being sent to participants.
The need to safely create rooms and issue meeting tokens with Daily's REST API without exposing the API key to clients.
The desire to do some server-side validations.
The desire to have a "source of truth" for game data that was not reliant on the integrity of any one client.

Each of these could likely be solved individually with a client-only approach, but taken together they made a server the more sensible choice. So we opted for a server with our own WebSocket.
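To make the API key consideration concrete, here is a hedged sketch of how a server might prepare a room-creation request for Daily's REST API. It only builds a description of the request; sending it and handling the response are left out. The helper name buildCreateRoomRequest(), the HttpRequestSpec type, and the one-hour default expiry are illustrative assumptions, not code from the game.

```typescript
// Plain description of an HTTP request, so this sketch needs no HTTP library.
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a request asking Daily's REST API to create a room that
// expires `ttlSeconds` after `nowSeconds`. The API key is only ever
// read on the server and never reaches the browser.
function buildCreateRoomRequest(
  apiKey: string,
  nowSeconds: number,
  ttlSeconds: number = 60 * 60
): HttpRequestSpec {
  return {
    url: "https://api.daily.co/v1/rooms",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // `exp` is a Unix timestamp (in seconds) after which the room expires.
    body: JSON.stringify({ properties: { exp: nowSeconds + ttlSeconds } }),
  };
}
```

With this split, a client asks the game server for a room and receives only the resulting room details, while the key stays in the server's environment.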

WebSockets: what are they even?

WebSocket is a communication protocol. It facilitates bidirectional communication between clients and a remote host which has chosen to accept them: in this case, our game server. Because the connection stays open, the server can push data to clients as it becomes available, and clients no longer need to make repeated HTTP requests to ask for it.

Let's go through an example of a CoD: Modern Wordfare feature and talk through how we can implement it using WebSockets, Daily's "app-message", and good old HTTP.

An example: taking a turn in Code of Daily: Modern Wordfare

In CoD: Modern Wordfare, we're using Socket.IO as our WebSockets implementation. A user takes a turn by selecting a word from the board.

Our word-selection flow with WebSockets looks something like this:

User clicks on a word
Client emits word selection event to the host
Remote host processes event and sends turn result event to everyone in the game
Clients handle turn result event

Let's go through the flow in a little more detail.

User clicks on a word

When it's their turn, the user is able to click on a word to select it:

Figure: Modern Wordfare word selection (game screen showing a user clicking on a word)

Client emits word selection event to the host

A "word-selected" event is emitted to the server via the connected socket:

game.ts client-side

  // onClickWord() is invoked when a player in an active team
  // clicks on a word on the board.
  private onClickWord(wordVal: string) {
    const data = <SelectedWordData>{
      gameID: this.data.gameID,
      wordValue: wordVal,
      playerID: this.localPlayerID,
    };
    this.socket.emit(wordSelectedEventName, data);
  }

As we can see above, the event contains data about the selected word.
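For readers following along without the repository open, the event name and payload might be declared roughly as below. The field names and the "word-selected" event name come from this post; the string types are an assumption based on how the fields are used, and isSelectedWordData() is a hypothetical helper showing the kind of shape check a server could run before trusting a payload.

```typescript
// Event name shared by client and server.
const wordSelectedEventName = "word-selected";

// Payload sent when a player picks a word.
// (Assumption: all three fields are strings.)
interface SelectedWordData {
  gameID: string;
  wordValue: string;
  playerID: string;
}

// Hypothetical runtime check: TypeScript types disappear at runtime,
// so data arriving over a socket still needs to be validated.
function isSelectedWordData(value: unknown): value is SelectedWordData {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v.gameID === "string" &&
    typeof v.wordValue === "string" &&
    typeof v.playerID === "string"
  );
}
```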

Remote host (game server) processes the event

Our game server was previously set up to listen for the "word-selected" socket event from any connected clients:

index.ts server-side

// Handle player trying to select a word
socket.on(wordSelectedEventName, (data: SelectedWordData) => {
  orchestrator
    .selectGameWord(data.gameID, data.wordValue, data.playerID)
    .then((turnRes) => {
      // Send turn result back to everyone in the game
      io.to(data.gameID).emit(turnResultEventName, turnRes);
    })
    .catch((e) => {
      // Report the error to the requesting socket only
      io.to(socket.id).emit(errorEventName, e);
    });
});

Above, the server asks its instance of the game "Orchestrator" (which manages all running games) to select a word based on the game ID and word data provided in the event. This does a few things:

Validates whether it's even the player's turn to select a word — are they allowed to do this?
Checks which team the word belongs to
Updates the relevant team's score based on the selected word
Toggles the turn to the next team if needed
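The four steps listed above might look roughly like this in code. This is a simplified, synchronous sketch and not the game's actual Orchestrator: the Game and Word shapes, the one-point-per-word scoring, and the rule that the turn passes when a team reveals a word that is not its own are all assumptions made for illustration. The team and lastRevealedWord fields mirror the ones used by the client later in this post.

```typescript
// Hypothetical, simplified model of one running game.
enum Team {
  None = "none",
  Team1 = "team1",
  Team2 = "team2",
}

type PlayingTeam = Team.Team1 | Team.Team2;

interface Word {
  value: string;
  team: Team; // Team.None marks a neutral word
  revealed: boolean;
}

interface Game {
  currentTurn: PlayingTeam;
  playerTeams: Record<string, Team>; // playerID -> team
  words: Word[];
  scores: Record<PlayingTeam, number>;
}

interface TurnResult {
  team: Team; // team the revealed word belongs to
  lastRevealedWord: string;
  nextTurn: Team;
}

function otherTeam(team: PlayingTeam): PlayingTeam {
  return team === Team.Team1 ? Team.Team2 : Team.Team1;
}

function selectGameWord(game: Game, wordValue: string, playerID: string): TurnResult {
  // 1. Validate that it is this player's turn.
  if (game.playerTeams[playerID] !== game.currentTurn) {
    throw new Error("It is not this player's turn");
  }

  // 2. Look up the word and the team it belongs to.
  const word = game.words.find((w) => w.value === wordValue);
  if (!word || word.revealed) {
    throw new Error("Word is not available for selection");
  }
  word.revealed = true;

  // 3. Update the owning team's score (assumed: one point per word).
  if (word.team !== Team.None) {
    game.scores[word.team] += 1;
  }

  // 4. Pass the turn if the active team revealed a word that is not theirs.
  if (word.team !== game.currentTurn) {
    game.currentTurn = otherTeam(game.currentTurn);
  }

  return { team: word.team, lastRevealedWord: word.value, nextTurn: game.currentTurn };
}
```

Because every check runs on the server against the server's copy of the game, a client that sends a word out of turn simply gets an error back.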

Once that's done, the result of the selection is sent to all participants in the game via the io.to() call. Players don't need to poll the server for data. Instead, data is pushed to them as needed.

Note that we specify the game ID (data.gameID) as the recipient of the event. This is because all players in the given game are subscribed to a channel based on the ID of the game they're in (in Socket.IO these are called "rooms"). This subscription happens via a socket.join() call when a user joins a game (which is also handled as a WebSocket event):

index.ts server-side

// Handle socket asking to join a game
socket.on(joinGameEventName, (data: JoinGameData) => {
  socket.join(data.gameID);
  // The rest of the function...
});

The recipient specified in io.to() can also be an individual socket ID. Note that in the word selection handling above, if there is an error it is sent back to the requesting socket only, not the entire game: io.to(socket.id).emit(errorEventName, e);
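To illustrate why both a game ID and a socket ID work as recipients, here is a toy in-memory model of room-based delivery. It is not Socket.IO's implementation, and RoomRegistry and its methods are hypothetical names. It does mirror one documented Socket.IO behavior: each socket automatically joins a room identified by its own ID, which is why targeting socket.id reaches exactly one client.

```typescript
// Callback standing in for "deliver this event to one connected client".
type Deliver = (event: string, payload: unknown) => void;

class RoomRegistry {
  // room name -> IDs of the sockets subscribed to it
  private rooms = new Map<string, Set<string>>();
  // socket ID -> delivery callback
  private sockets = new Map<string, Deliver>();

  // Registers a socket and subscribes it to a room named after its own ID.
  connect(socketID: string, deliver: Deliver): void {
    this.sockets.set(socketID, deliver);
    this.join(socketID, socketID);
  }

  // Subscribes a socket to a room, creating the room if needed.
  join(socketID: string, room: string): void {
    let members = this.rooms.get(room);
    if (!members) {
      members = new Set<string>();
      this.rooms.set(room, members);
    }
    members.add(socketID);
  }

  // Delivers an event to every socket in the room.
  // Returns how many sockets received it.
  emitTo(room: string, event: string, payload: unknown): number {
    const members = this.rooms.get(room);
    if (!members) {
      return 0;
    }
    let delivered = 0;
    members.forEach((socketID) => {
      const deliver = this.sockets.get(socketID);
      if (deliver) {
        deliver(event, payload);
        delivered++;
      }
    });
    return delivered;
  }
}
```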

Clients handle turn result event

When the server emits a turn result event in response to a player's word selection, every client listening for that event handles it:

game.ts client-side

socket.on(turnResultEventName, (data: TurnResultData) => {
  const winningTeam = this.board.processTurnResult(
    data.team,
    data.lastRevealedWord
  );
  // If there is a winning team, display
  // the game over UI
  if (winningTeam !== Team.None) {
    this.showGameOver(winningTeam);
  }
});

How might this work with "app-message"?

This same flow could be implemented with Daily's "app-message". It might look something like this:

User clicks on a word
Client itself processes the event and calculates the turn result based on the data the client currently has
Client emits turn result to all other Daily room participants via sendAppMessage()
All other participant clients handle the turn result event

In this case, no server is involved. The responsibility falls on the client to not just communicate central game data to all other clients, but also:

Validate the word selection
Decide what each team's new scores are
Confirm that all other participants received the event as expected

Aside from this requiring more logic and processing on the client, there's another consideration: client data can easily be tampered with, and risks of client state falling out of sync are non-trivial. But if you're aware of the potential “gotchas”, it is possible to implement this entirely on the client side with Daily.

For example, all the clients receiving data from the user who selected a word could perform their own validation of the result ("does this turn result make sense in the context of the data I currently have?"). If some significant number of clients decide the turn result is suspicious, the turn could be rejected (which could involve some more "app-message" events to build consensus).
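A rough sketch of that consensus step follows. It assumes every receiving client holds enough board data to check a proposed turn, that each client casts one vote, and that a simple majority of voters accepts the turn. The message shapes, the majority threshold, and the function names are all hypothetical; in practice the proposals and votes would travel between clients as additional "app-message" events.

```typescript
// Turn result proposed by the client whose player clicked a word.
interface ProposedTurn {
  turnID: number;
  playerID: string;
  wordValue: string;
  claimedTeam: string; // team the proposer says the word belongs to
}

// What a receiving client currently believes about the game.
interface LocalBoard {
  currentPlayerIDs: string[]; // players allowed to act this turn
  wordTeams: Record<string, string>; // word -> owning team
  revealedWords: string[];
}

// "Does this turn result make sense given the data I currently have?"
function looksValid(board: LocalBoard, turn: ProposedTurn): boolean {
  const isPlayersTurn = board.currentPlayerIDs.indexOf(turn.playerID) !== -1;
  const notYetRevealed = board.revealedWords.indexOf(turn.wordValue) === -1;
  const teamMatches = board.wordTeams[turn.wordValue] === turn.claimedTeam;
  return isPlayersTurn && notYetRevealed && teamMatches;
}

interface Vote {
  turnID: number;
  voterID: string;
  accept: boolean;
}

// Accepts the turn only if more than half of the voters approved it.
// Votes for other turns and repeat votes from the same voter are ignored.
function isTurnAccepted(turnID: number, votes: Vote[], voterCount: number): boolean {
  const seen: Record<string, boolean> = {};
  let accepts = 0;
  for (const vote of votes) {
    if (vote.turnID !== turnID || seen[vote.voterID]) {
      continue;
    }
    seen[vote.voterID] = true;
    if (vote.accept) {
      accepts++;
    }
  }
  return accepts * 2 > voterCount;
}
```

Even this small version shows the extra moving parts a client-only design takes on: every client needs the validation logic, and someone has to decide what happens when votes never arrive.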

How might this work with HTTP?

There's another alternative to WebSockets and "app-message": HTTP.

With a traditional HTTP/1.1 implementation, clients would request data from the server. For example, a client might be set up to request updated game data every n seconds by making a GET request to an endpoint associated with their particular game.
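A minimal polling sketch, under stated assumptions, could look like this. The GameSnapshot shape with its version counter, the GamePoller class, and the injected fetchSnapshot function are hypothetical; in a browser, fetchSnapshot would typically wrap a GET request to whatever endpoint the server exposes for a game.

```typescript
// Hypothetical response body for a "current game state" endpoint.
interface GameSnapshot {
  version: number; // increases whenever the game state changes
  revealedWords: string[];
}

// Injected so the sketch does not depend on a particular HTTP client.
type FetchSnapshot = (gameID: string) => Promise<GameSnapshot>;

class GamePoller {
  private lastVersion = -1;

  constructor(
    private gameID: string,
    private fetchSnapshot: FetchSnapshot,
    private onChange: (snapshot: GameSnapshot) => void
  ) {}

  // One polling round: fetch the state and report it only if it is newer
  // than the last state we saw.
  async tick(): Promise<void> {
    const snapshot = await this.fetchSnapshot(this.gameID);
    if (snapshot.version > this.lastVersion) {
      this.lastVersion = snapshot.version;
      this.onChange(snapshot);
    }
  }

  // Polls every `intervalMs` milliseconds until the returned function is called.
  start(intervalMs: number): () => void {
    const timer = setInterval(() => {
      // A failed request is skipped; the next interval tries again.
      this.tick().catch(() => undefined);
    }, intervalMs);
    return () => clearInterval(timer);
  }
}
```

The tradeoff is visible in the interval: a short one means many requests that return nothing new, while a long one means players see each other's moves late.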

There are some HTTP features that can serve as closer alternatives to WebSockets' bidirectionality depending on the use case (long polling, streaming, or HTTP/2 server push), each with their own benefits and tradeoffs. Going into these in detail is out of scope of this post, but it might be worth checking them out if you're choosing to go with a server approach.

In our case, WebSockets covered what we needed and were the most intuitive and efficient solution.

Conclusion

In this post, we've covered a few different alternatives for sharing data or central state across participants in your Daily-powered video call application. We've discussed considerations when choosing between using a server with a dedicated WebSocket and utilizing a client-based approach with Daily's "app-message" events. We also covered an example of how both might be implemented for one of our social game features.

At Daily, we realize choosing between all the options might be a bit daunting. We're happy to help! Get in touch if you'd like any help deciding whether Daily's "app-message" is the right tool for your application.
