How to build a billion dollar audio app in a weekend
Updated (22/10/2021): Since the time of writing this, Apple has addressed the issue that required the iOS bridge workaround. The Party Line source code has been updated to reflect these changes.

At Daily, we're seeing a lot of new interest in audio chat and collaboration apps of all kinds. To support people building these apps, we whipped up sample code for a Clubhouse-like (or, Clubhouse-lite) audio room experience on four platforms: web (React), iOS, Android, and React Native. Feel free to try out the live demo on the web, or browse the sample code, then come back and read this post ...

If you clicked on this article because of the title, you’re (probably) one of a few possible types of person. You’re a thirsty entrepreneur, ready to see how you can unlock the future. Maybe you’re the curious type and the short timeframe caught your attention. Or in the most likely case you, dear reader, are like your humble author. You’re a cynical developer who loves SEO and being hyperbolic. Hopefully, you’re all three of these people! 🦄

Unless you’ve been living under a rock (or perhaps an Android phone), you have undoubtedly heard about some of these audio communities in the news lately. If you haven’t, may I suggest you enter the Clubhouse, grab a Cappuccino, and make a Quilt? Or you might prefer to take a Roadtrip to the Rodeo. To put it bluntly, audio is hot right now, Musk vs. Putin hot.

If you’ll humor me, I’d like to start by talking about how we approached building Party Line to begin with. So many "weekend" projects go astray because they don’t fully define goals and expectations. So that’s where we’ll begin.

Then we’ll talk about the overall flow of our applications. Creating a consistent experience across four platforms was important to us, so we’ve built four demos that share a small backend (two serverless functions). Since they all share similar logic, we’ll walk through that before diving into platform specifics. In each section we’ll link out to the platform-specific code if you want to dig deeper.

Party on!

Screenshot of audio app with speaker, moderator, and listener squares
Be excellent to each other.

Party Line allows you to create audio chat rooms and invite your friends. They can join you as a speaker or just listen. The best part is that it works in most browsers without installing anything. And if you want to ship mobile apps, we’ve got you covered with iOS, Android, and React Native.

Since we wanted to build something in a "weekend", let’s define our goals and call out assumptions to keep the scope manageable.

Our application should have the following features:

  • A starting screen where you can either create or join a room
  • A prompt to enter your first and last name
  • Three user types in a room: moderator, speaker, listener
  • The room creator is the moderator
  • Moderators can promote listeners to speakers
  • Moderators can make other users moderators
  • Moderators can demote speakers to listeners
  • Listeners can raise (or lower) their hands to speak
  • Speakers and moderators can mute/unmute themselves, but can only mute (never unmute) others
  • Moderators can end the call for everyone
  • All attendees can leave a call and return to the starting screen

Let’s also consider the following constraints:

  • No external account management or authentication
  • No database
  • No backend aside from serverless functions which call the Daily REST API
  • Implicit user roles based on meeting tokens
  • No list of rooms to join

Now that we have a sense for our MVP scope, let’s look at the overall architecture, from a Daily API perspective.

Architecture: methods and events

Since Daily is doing the heavy lifting for us, we can focus on how we’re interacting with Daily and build our interfaces (more on that later).

There are three different ways we’ll interact with Daily:

1) Calling methods from either of the JavaScript libraries (daily-js or react-native-daily-js)

2) Responding to events sent by the libraries

3) Creating rooms and tokens using REST endpoints

It’s worth noting that for iOS and Android, we’ll be interacting with the methods and events via a WebView "bridge". In React and React Native, we can call daily-js and react-native-daily-js directly.
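A minimal sketch of the first two interaction styles, assuming a daily-js call object already exists (for example, one created with `DailyIframe.createCallObject()`). The `wireUpCall` helper and its `onChange` callback are hypothetical names used for illustration; `on()`, `off()`, `participants()`, and the event names come from daily-js.

```javascript
// Hypothetical helper: subscribe to participant events and report the
// current participant list whenever one of them fires.
function wireUpCall(call, onChange) {
  const events = [
    "joined-meeting",
    "participant-joined",
    "participant-updated",
    "participant-left",
  ];
  // participants() returns an object keyed by "local" and remote session ids.
  const handler = () => onChange(Object.values(call.participants()));
  events.forEach((name) => call.on(name, handler));
  // Return a cleanup function that removes the listeners again.
  return () => events.forEach((name) => call.off(name, handler));
}
```

Re-reading `participants()` on every event keeps the UI state derived from a single source instead of being patched event by event.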

Audio for the masses, the more platforms the merrier!

We’ve put each of the clients in one convenient repo for you. You’ll notice four folders, one for each of the clients. You can find the folder for our serverless functions inside /react/server/functions. Each folder contains its own README with platform specific setup instructions. These should be enough to get everything up and running, but let us know if you’d find detailed, platform-specific posts helpful. Open an Issue, create a PR, or just ping us any time we can help.

Screenshot of project directory with folders that read android, ios, react-native, and react, plus gitignore, license, readme, and netlify.toml
daily-demos/party-line repository structure

In general, React and React Native are the most similar, and the most "idiomatic" in terms of how to build with Daily. Android and iOS are included to show what’s possible via "bridging" data to an invisible WebView. The fact that we’re only playing audio tracks makes this a bit more feasible. We’re actively working on native SDKs for mobile, so we welcome any feedback in this area.

Follow along on the platform of your choice, or explore each of them simultaneously to see how to approach a cross platform application.

For the rest of this overview, we’ll use the following legend.

🕸 (React web app)
(React Native mobile app(s))
🍏 (iOS mobile app - Swift)
🤖 (Android mobile app - Kotlin & Java; we'll link to the Kotlin examples)
🥅 (Netlify function - Node)

Whenever you see one of these emoji, it will link to the relevant file or line in the repo. Even the ones above. Try it!

First off, follow the README instructions, and start your dev server. (🕸 🍏 🤖)

If you’re one of those read-the-last-page-first types, you can see a working demo here and follow along that way.

The join/create page for Party Line

The first page you are greeted with is a join/create page. This is the user’s entry point into your application. Here we made a couple of assumptions to keep the scope down. First, there’s no authentication or accounts per se. We take privacy seriously and you should too. So before you take our scrappy demo and deploy it as a production app, please consider how you’ll handle auth and security in general.

Second, if you start a room, you’re the moderator. We accomplish this by creating a meeting-token which is then used when you join the call. Moderators have the ability to “promote” other user types, which is why we need a token to identify them, but more on that later.

See more on how create (🕸 🍏 🤖) and join (🕸 🍏 🤖) work in the clients.

On the server(less) side of things, we’ve created /room (🥅) and /token (🥅) endpoints, which call the Daily REST API’s /rooms and /meeting-tokens endpoints, respectively. You’ll see those endpoints used in each of the clients. We’re enforcing the 10-minute demo limit here by setting the exp property on both rooms and tokens. Feel free to change this in your own application unless you prefer to keep your meetings short and sweet!
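As a rough sketch of what those serverless functions send, the two request builders below target the Daily REST API with a 10-minute `exp`. The helper names (`expiry`, `buildRoomRequest`, `buildTokenRequest`) and the `DAILY_API_KEY` environment variable in the usage comment are assumptions for illustration, not the repo's actual code.

```javascript
const DAILY_API = "https://api.daily.co/v1";
const TEN_MINUTES = 10 * 60; // exp is a Unix timestamp in seconds

// Hypothetical helper: expiry timestamp relative to "now" (in milliseconds).
function expiry(nowMs, ttlSeconds = TEN_MINUTES) {
  return Math.floor(nowMs / 1000) + ttlSeconds;
}

function authHeaders(apiKey) {
  return { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` };
}

// Request for POST /rooms: a room that expires after ten minutes.
function buildRoomRequest(apiKey, nowMs) {
  return {
    url: `${DAILY_API}/rooms`,
    options: {
      method: "POST",
      headers: authHeaders(apiKey),
      body: JSON.stringify({ properties: { exp: expiry(nowMs) } }),
    },
  };
}

// Request for POST /meeting-tokens: an owner (moderator) token for one room.
function buildTokenRequest(apiKey, roomName, nowMs) {
  return {
    url: `${DAILY_API}/meeting-tokens`,
    options: {
      method: "POST",
      headers: authHeaders(apiKey),
      body: JSON.stringify({
        properties: { room_name: roomName, is_owner: true, exp: expiry(nowMs) },
      }),
    },
  };
}

// Usage inside a serverless function (fetch is global in Node 18+):
//   const { url, options } = buildRoomRequest(process.env.DAILY_API_KEY, Date.now());
//   const room = await (await fetch(url, options)).json();
```

Keeping the API key inside the serverless function means clients only ever see the room URL and the token, never the key itself.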

Life of the party

Audio only app with different squares representing listeners, speakers, and moderators
View once you've joined and created a call

Once you create a call, you are presented with the call view (🕸 🍏 🤖). Here, as a moderator, you have the ability to "promote" listeners to speakers, or moderators. We’re using the owner property from the participants object to identify moderators. This gets set when the meeting token you join with has is_owner set to true. To simplify how user roles in general work for the purposes of this demo, we’re storing them by appending to the username (🕸 🍏 🤖). In a production environment, you’ll want a more robust way to enforce the roles, so keep that in mind.
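To make the username trick concrete, here is a small sketch of encoding and decoding a role in the username. The `_MOD` / `_SPK` / `_LST` suffix format and the helper names are assumptions for illustration; check the linked client code for the exact format each demo uses. Only the `owner` flag is backed by the meeting token.

```javascript
// Illustrative role tags; the exact strings are an assumption.
const ROLES = { MOD: "MOD", SPEAKER: "SPK", LISTENER: "LST" };

// Append a role tag to a display name, e.g. "Ada Lovelace_SPK".
function withRole(displayName, role) {
  return `${displayName}_${role}`;
}

// Split a username back into display name and role.
// Names without a known tag are treated as listeners.
function parseRole(userName) {
  const idx = userName.lastIndexOf("_");
  const tag = idx === -1 ? "" : userName.slice(idx + 1);
  const known = Object.values(ROLES).includes(tag);
  return {
    displayName: known ? userName.slice(0, idx) : userName,
    role: known ? tag : ROLES.LISTENER,
  };
}

// Moderator status comes from the token-backed owner flag, not the name.
function isModerator(participant) {
  return participant.owner === true;
}
```

Because any client can set its own username, the suffix is only a display hint; anything that needs real trust should rely on `owner`.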

Look who's talking now

Moderator controls to "promote" listeners to speakers

When you use your moderator privileges to promote a listener to a speaker, under the hood we call sendAppMessage() to let that participant know they have been promoted. The promoted participant’s client then calls setUserName(), so that other members can pick up the new role via participants() and update the UI accordingly in each of the clients.
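The promotion round trip can be sketched as two small functions, one per side. The `make-speaker` message type, the `_SPK` suffix, and the function names are hypothetical; `sendAppMessage()`, `setUserName()`, `participants()`, and the `app-message` event shape (`fromId`, `data`) are from daily-js. The sender check is an extra precaution added in this sketch, not something the post describes.

```javascript
const MSG_MAKE_SPEAKER = "make-speaker"; // hypothetical message type

// Moderator side: tell one participant they are now a speaker.
function promoteToSpeaker(call, sessionId) {
  call.sendAppMessage({ msg: MSG_MAKE_SPEAKER }, sessionId);
}

// Promoted participant side: call this from an "app-message" listener.
// Returns true when the promotion was applied.
function handlePromotion(call, event, displayName) {
  // Remote participants are keyed by session id in participants().
  const sender = call.participants()[event.fromId];
  const fromModerator = Boolean(sender && sender.owner);
  if (!fromModerator || !event.data || event.data.msg !== MSG_MAKE_SPEAKER) {
    return false;
  }
  // Other clients pick up the new role from the updated username.
  call.setUserName(`${displayName}_SPK`);
  return true;
}

// Usage:
//   call.on("app-message", (event) => handlePromotion(call, event, "Ada Lovelace"));
```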


Don't hesitate, moderate!

It's also possible to remove participants from calls

Sometimes conversations can get a little spirited, or maybe someone’s dog has decided they want to participate in the world’s latest social craze. When this happens, it’s good as a moderator to be able to mute someone. We accomplish this by calling updateParticipant() with setAudio: false (🕸 🍏 🤖). For privacy reasons, moderators can only mute remote participants, not unmute them. Muted participants will have to unmute themselves when things quiet down.
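A minimal sketch of the mute action, assuming a daily-js call object. The `muteParticipant` wrapper and its guard checks are hypothetical; `updateParticipant()` with `setAudio: false` is the call named above.

```javascript
// Hypothetical wrapper: mute a remote participant if the local user is a
// moderator. Returns true when the mute request was sent.
function muteParticipant(call, sessionId) {
  const me = call.participants().local;
  if (!me.owner) return false; // only moderators mute others
  if (sessionId === me.session_id) return false; // mute yourself locally instead
  call.updateParticipant(sessionId, { setAudio: false });
  return true;
}
```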

In rare circumstances, muting may not be enough, and you will need to remove someone from the room. We handle this by calling sendAppMessage() to notify the participant’s client, which then exits the room by calling leave() (🕸 🍏 🤖).
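Removal follows the same message pattern. In this sketch the `force-eject` message type and both function names are hypothetical; `sendAppMessage()` and `leave()` are the daily-js calls mentioned above.

```javascript
const MSG_FORCE_EJECT = "force-eject"; // hypothetical message type

// Moderator side: ask one participant's client to leave.
function ejectParticipant(call, sessionId) {
  call.sendAppMessage({ msg: MSG_FORCE_EJECT }, sessionId);
}

// Recipient side: call this from an "app-message" listener.
// Returns true when the client started leaving the call.
function handleEject(call, event) {
  if (event.data && event.data.msg === MSG_FORCE_EJECT) {
    call.leave();
    return true;
  }
  return false;
}
```

Note that this relies on the recipient's client cooperating, which is fine for a demo but worth revisiting before production.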

Gotchas and One More Thing™️

User controls for "promoting" a participant to a moderator
More on making a moderator

Because our trust mechanism to identify moderators is a token, when someone gets "promoted" to a moderator they need to rejoin (🕸 🍏 🤖) with their token so we can give them elevated control of the call. This pattern means they’ll drop out of the call for a second or two before rejoining as a moderator. In a production app you might prefer a smoother transition, which could be accomplished with a different moderator authorization method.
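The rejoin step can be sketched as below, assuming a daily-js call object and a hypothetical `fetchToken` function that asks the /token endpoint for an owner token. The function name `rejoinAsModerator` is also an assumption.

```javascript
// Hypothetical helper: swap a regular session for a moderator session.
async function rejoinAsModerator(call, roomUrl, fetchToken) {
  // Fetch the token first so the participant stays in the call if it fails.
  const token = await fetchToken();
  await call.leave();
  // Joining with an owner token is what grants moderator privileges.
  await call.join({ url: roomUrl, token });
}
```

Fetching the token before leaving keeps the gap in the call as short as possible, though the participant still drops out briefly between `leave()` and `join()`.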

Another thing you may have noticed is that we ask for device permissions when you join, even if you’re a listener. This allows us to immediately turn on your mic when the time comes, but if you prefer a less invasive approach, you can always rework things so permissions are requested only when you’ve been promoted.

The not-so-secret sauce of the iOS and Android clients is that they’re loading daily-js in a "headless" WebView and then interacting with it via a "bridge" (🍏 🤖). This is made possible by additions to platform-specific WebView implementations which allow the audio tracks to be played off-screen. This works for this audio-only use case because there’s no video layout logic to deal with. We wanted to demonstrate the power of a lightweight integration like this, but rest assured that we’re actively working on fully featured mobile SDKs which we’ll release later this year™.

Just one more thing meme

So you’ve now got a flourishing multiplatform audio-only application. But you’re finding that your speakers can get a little carried away and sometimes you want to inject some audience participation. For this reason we’ve added the ability for listeners to raise their hands so they can be promoted to speakers. Since we’ve already "creatively" stored user roles in usernames, why not publicly show when someone raises their hand there as well? To do this, we call setUserName() and prepend ✋ (🕸 🍏 🤖). A moderator can then decide to promote, in which case the ✋ will magically disappear. Or a listener can lower their hand if it was "more of a comment than a question".
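Raising and lowering a hand then comes down to string handling plus one `setUserName()` call. This is a sketch: the helper names and the space after the emoji are assumptions, while `participants()` and `setUserName()` are daily-js calls.

```javascript
const HAND_PREFIX = "✋ "; // emoji plus a space; the exact prefix is an assumption

function hasRaisedHand(userName) {
  return userName.startsWith(HAND_PREFIX);
}

function raiseHand(userName) {
  return hasRaisedHand(userName) ? userName : `${HAND_PREFIX}${userName}`;
}

function lowerHand(userName) {
  return hasRaisedHand(userName) ? userName.slice(HAND_PREFIX.length) : userName;
}

// Toggle the local participant's hand and publish the new username.
function toggleHand(call) {
  const current = call.participants().local.user_name;
  const next = hasRaisedHand(current) ? lowerHand(current) : raiseHand(current);
  call.setUserName(next);
  return next;
}
```

A moderator's promote action can reuse `lowerHand()` before appending the speaker role, which is how the ✋ "magically" disappears.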

So there you have it, a suite of awesome demos that we built in a "weekend" (emphasis on week), but…

What's next?

Here are just a few ideas on what to do next with Party Line:

  • Add proper authentication and user roles
  • Make rooms persistent and add room metadata
  • Add the ability to schedule a "Party"
  • Send notifications when someone joins (via webhook or push)
  • Add user profiles (avatars, custom fields, etc)
  • A Party list / overview page
  • Livestream and scale up!

Now that you’ve got a billion dollar app, don’t forget to create a deck and go after that hard-won VC money. 🤑

In all seriousness, communities are only as good as their members. We want to take the guesswork out of the technical communications part of yours so you can focus on building.

Please reach out if we can help you!


