During a major cricket tournament, Disney+ Hotstar captured 5 billion emojis to distill the mood of the 50 million-strong streaming audience in real-time.
1. Hotstar mobile clients send emojis using a lightning-fast HTTP API written in Golang.
2. The API server doesn't waste time processing emojis.
It batch sends the high volume of incoming emojis to Apache Kafka every 500ms for downstream async processing. Apache Kafka was selected for its high throughput and low latency.
3. Apache Spark is the streaming processor on the other end of Kafka. It aggregates emojis every 2 seconds. This interval is tunable.
4. Apache Spark writes the aggregated emoji data into another Apache Kafka.
5. The consumers on the other end pull the aggregated emoji data from Apache Kafka and push it into the PubSub infra.
6. The PubSub infra broadcasts the aggregated emoji data to all clients in real-time. The PubSub infra is built on MQTT to handle 50M concurrent connections.
Key Design Principles
The system should be horizontally scalable to be able to support the increasing traffic. We achieved horizontal scalability with the help of load balancers and configured auto-scaling to scale the resources up or down.
The system needs to be decomposed into smaller components each being able to carry out the assigned task independent of each other. This also provides us with the ability to scale each component as needed.
Asynchronous processing enables execution without blocking resources and thus supports higher concurrency. We will talk more about this later.
How are client requests handled?
Clients send user’s submitted emojis via HTTP API. To prevent hogging the client connection, heavy processing on the API needs to be done offline. We need to write the data somewhere so processing applications can consume it. Message Queue is a commonly used mechanism for asynchronous communication between applications.
There are a lot of message queues available out there. For Emojis, they need a technology that offered high throughput, availability, low latency and supports consumer groups. Kafka seemed like the best option but managing Kafka on our own takes significant effort. Hotstar has an amazing data platform called Knol built on top of Kafka which is flexible for all our use cases.
How do we write messages to the queue?
Synchronous: Wait for the acknowledgment that the message is written before sending a success response to clients. In case of a failure, they have retries configured at both server and client. If your data is transactional or cannot suffer any loss, this approach is preferable.
Asynchronous: Write the message to a local buffer and respond with success to clients. Messages from the buffer could be written asynchronously to the queue. The downside is that if not handled properly, this could result in data loss.
For Emojis, very low latency and data loss in rare scenarios is not a big concern. So they chose the Asynchronous approach. Golang has great support for concurrency. It ships with an implementation called Goroutines. Goroutines are lightweight threads that can execute functions asynchronously.Its just a command say go do_something(). They use Goroutines and Channels in Golang to write messages to Kafka. Messages to be produced are written to a Channel. A Producer runs in the background as a Goroutine and flushes the data periodically to Kafka. Using client libraries like Confluent or Sarama, they can provide the flush configuration to achieve optimal performance. We configured our flush interval to 500ms and maximum messages sent to Kafka broker in a single request to 20000.
How does the processing happen?
Goal: Consume a stream of data from Kafka and compute aggregates of the data over an interval. Time interval should be small enough to provide a real-time experience to users. After considering different streaming frameworks like Flink, Spark, Storm, Kafka Streams they decided to go with Spark. Spark has support for micro batching and aggregations which are essential for our use case and better community support compared to competitors like Flink.
What about data delivery?
They use PubSub as our delivery mechanism. PubSub is a real-time messaging infrastructure built at Hotstar to deliver messages to users in our Social Feed.
They perform normalization over the data and send top emojis to PubSub. Clients receive messages from PubSub and show Emojis animation to users.
Voting and More
Hotstar is currently the sole voting platform for a few big Indian reality shows like Dance Plusand Bigg Boss (Telugu, Tamil, Malayalam). This infrastructure equipped to power Emojis and Voting for Hotstar.
Next blog will be on how how data alone can do service delivery without a web/mobile medium.