Scaling Social Discovery: Engineering Friend Bubbles for Billions of Reel Users

Overview

At first glance, the Friend Bubbles feature on Facebook Reels looks simple: it shows you which Reels your friends have watched and reacted to. Beneath that straightforward UI lies a hard engineering problem: building a real-time social discovery system that scales to billions of users. This tutorial walks you through the architectural decisions, machine learning model evolution, platform-specific quirks, and the key insight that made the whole feature click, as discussed by Meta engineers Subasree and Joseph on the Meta Tech Podcast.

Source: engineering.fb.com

Whether you're building a similar social feed feature or just want to understand how massive-scale recommendation systems handle personalization, you'll find actionable design patterns and lessons learned.

Prerequisites

To get the most out of this guide, you should be comfortable with:

Machine learning basics: understanding of collaborative filtering, engagement signals, and model training pipelines
Distributed systems concepts: data sharding, caching, and eventual consistency
Mobile client development: differences between iOS and Android architectures (e.g., background execution, push notification handling)
Familiarity with Reels: knowing what Reels are (short-form videos) and how they are surfaced in a feed

No direct access to Meta's internal tools is required; we'll use pseudocode and pattern examples that you can adapt to your own stack.

Step-by-Step Guide to Building Friend Bubbles at Scale

1. Define the Core Signal: What Makes a Friend Activity Worth Showing?

The first step was identifying the right social signals. Friend Bubbles highlight Reels that friends have watched and reacted to. But "watched" is ambiguous: does it mean viewing more than 50% of the Reel? "Reacted" is just as broad, covering likes, comments, shares, and saves. The team settled on an activity score that combines recency, type of reaction, and relationship strength (e.g., close friends vs. acquaintances).

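// Pseudocode for calculating a friend activity score.
// The weights (2, 5, 10) are illustrative. decayFactor and intimacyFactor are
// assumed helpers that return multipliers; they are not defined here.
// Saves are mentioned in the text but carry no weight in this sketch.
function computeFriendActivityScore(friend, reel, currentUser) {
  let score = 0;
  score += friend.reactions.like ? 2 : 0;
  score += friend.reactions.comment ? 5 : 0;
  score += friend.reactions.share ? 10 : 0;
  score *= decayFactor(reel.timestamp); // recency weight
  score *= intimacyFactor(friend, currentUser); // tie strength (currentUser is now an explicit parameter)
  return score;
}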

Select only top-N activities per user per session to limit data volume.
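The recency weighting and the top-N cut can be sketched together. This is a minimal illustration, not Meta's implementation: `recencyDecay`, `selectTopActivities`, the 30-minute half-life, and the shape of the activity objects are all assumptions made for the example.

```javascript
// Hypothetical recency weight: halves every `halfLifeMs` (assumed value).
function recencyDecay(actionTimeMs, nowMs, halfLifeMs = 30 * 60 * 1000) {
  const ageMs = Math.max(0, nowMs - actionTimeMs);
  return Math.pow(0.5, ageMs / halfLifeMs);
}

// Keep the N highest-scoring activities for one user session.
// `activities` is assumed to be an array of
// { friendId, reelId, baseScore, actionTimeMs }.
function selectTopActivities(activities, n, nowMs) {
  return activities
    .map((a) => ({ ...a, score: a.baseScore * recencyDecay(a.actionTimeMs, nowMs) }))
    .sort((a, b) => b.score - a.score) // highest score first
    .slice(0, n);
}

const now = 1700000000000;
const minutes = (m) => m * 60 * 1000;
const sessionActivities = [
  { friendId: "ana", reelId: "r1", baseScore: 10, actionTimeMs: now - minutes(90) }, // old share
  { friendId: "ben", reelId: "r2", baseScore: 5, actionTimeMs: now - minutes(5) },   // fresh comment
  { friendId: "cy", reelId: "r3", baseScore: 2, actionTimeMs: now - minutes(1) },    // fresh like
];

const top2 = selectTopActivities(sessionActivities, 2, now);
console.log(top2.map((a) => a.reelId)); // [ 'r2', 'r3' ]
```

Note how the 90-minute-old share, despite having the highest base score, drops out of the top two once the decay is applied.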

2. Build the Machine Learning Model for Reel Selection

The ML model evolved through three phases. Initially, a simple collaborative filtering algorithm ranked Reels based on global friend activity counts. This failed to personalize—everyone saw the same popular Reels. Phase two introduced a graph neural network that encoded user-friend relationships and engagement patterns. Finally, a multi-task learning approach predicted both click-through rate and session duration, weighted by friend proximity. Subasree noted that the biggest breakthrough came when they added temporal attention to the model, allowing it to prioritize recent friend actions over stale ones.
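To give a feel for what favoring recent friend actions means numerically, here is a toy recency-biased softmax. It is a simplified stand-in for temporal attention, not the model described on the podcast: in a trained model the relevance scores and the decay would be learned, whereas `decayRate`, `relevance`, and `ageMinutes` here are hand-set, hypothetical inputs.

```javascript
// Toy temporal attention: softmax over (relevance - decayRate * age).
// `decayRate` is a hand-set stand-in for what a trained model would learn.
function temporalAttentionWeights(actions, decayRate = 0.05) {
  const logits = actions.map((a) => a.relevance - decayRate * a.ageMinutes);
  const maxLogit = Math.max(...logits); // subtract the max for numerical stability
  const exps = logits.map((l) => Math.exp(l - maxLogit));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total); // weights sum to 1
}

const friendActions = [
  { friendId: "ana", relevance: 1.0, ageMinutes: 5 },   // recent action
  { friendId: "ben", relevance: 1.0, ageMinutes: 120 }, // stale action
];

const weights = temporalAttentionWeights(friendActions);
console.log(weights); // the recent action receives almost all of the weight
```

With equal relevance, the only thing separating the two actions is age, so the weight shifts almost entirely to the five-minute-old one.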

3. Manage iOS vs. Android Behavior Differences

Joseph highlighted a critical platform disparity: iOS restricts background app refresh and network calls aggressively, while Android allows more flexibility. This affected how Friend Bubbles data was fetched and updated:

Aspect	iOS	Android
Update mechanism	Data pre-fetched via push notifications; UI shows cached state	Background sync service pulls updates every 15 minutes
Bubble freshness	Bubbles may lag by up to 30 minutes	Bubbles update within 2 minutes of friend action

To maintain consistency, the team built a polling + push hybrid system: iOS uses silent push to wake the app and fetch new data, while Android uses a periodic job scheduler. Both feed into a unified in-memory cache on the client.
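A unified client cache of this kind might look like the sketch below. The `BubbleCache` class, its method names, and the update shape are hypothetical; the only idea taken from the text is that push-driven and polled updates land in the same in-memory store.

```javascript
// Hypothetical client-side cache fed by both the push path and the polling path.
class BubbleCache {
  constructor() {
    this.entries = new Map(); // reelId -> { reelId, friendIds, updatedAtMs, source }
  }

  // Merge a batch from either source; the newest update per Reel wins,
  // so a late-arriving older batch cannot overwrite fresher data.
  ingest(updates, source) {
    for (const update of updates) {
      const existing = this.entries.get(update.reelId);
      if (!existing || update.updatedAtMs > existing.updatedAtMs) {
        this.entries.set(update.reelId, { ...update, source });
      }
    }
  }

  // Return the entry plus a staleness flag the UI can use.
  read(reelId, nowMs, maxAgeMs) {
    const entry = this.entries.get(reelId);
    if (!entry) return null;
    return { ...entry, stale: nowMs - entry.updatedAtMs > maxAgeMs };
  }
}

const cache = new BubbleCache();
cache.ingest([{ reelId: "r1", friendIds: ["ana"], updatedAtMs: 1000 }], "poll");
cache.ingest([{ reelId: "r1", friendIds: ["ana", "ben"], updatedAtMs: 2000 }], "push");
cache.ingest([{ reelId: "r1", friendIds: ["ana"], updatedAtMs: 1500 }], "poll"); // older, ignored

console.log(cache.read("r1", 2500, 1000));
```

Keeping the newest update per Reel means the UI code does not need to know which platform path delivered the data.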

4. Optimize Storage and Bandwidth

Friend Bubbles generated massive traffic, since every friend action on every Reel could produce an event. The solution was a set of aggregation layers that collapse duplicate activities (e.g., three likes in five minutes become one event). Use probabilistic data structures like Bloom filters to quickly check whether a user has already seen a bubble for a given Reel; they can return false positives but never false negatives, so size them for an error rate you can accept. On the server side, implement a time-windowed counter that stores only the latest hour of friend activities, with a fallback to historical summaries.
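The two client-facing techniques above, event collapsing and a Bloom filter for "already seen" checks, can be sketched as follows. Everything here is illustrative: the grouping key (Reel plus reaction type), the filter size, the number of hash functions, and the FNV-1a-style hash are assumptions, not details from the podcast.

```javascript
// Collapse events on the same Reel and of the same type that arrive within
// `windowMs` of the first event in the group (grouping key is an assumption).
function collapseEvents(events, windowMs) {
  const sorted = [...events].sort((a, b) => a.timeMs - b.timeMs);
  const openGroups = new Map(); // key -> group currently accepting duplicates
  const collapsed = [];
  for (const event of sorted) {
    const key = `${event.reelId}|${event.type}`;
    const group = openGroups.get(key);
    if (group && event.timeMs - group.firstTimeMs <= windowMs) {
      group.friendIds.push(event.friendId);
    } else {
      const fresh = { reelId: event.reelId, type: event.type,
                      firstTimeMs: event.timeMs, friendIds: [event.friendId] };
      openGroups.set(key, fresh);
      collapsed.push(fresh);
    }
  }
  return collapsed;
}

// Minimal Bloom filter: may report false positives, never false negatives.
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 3) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }
  // FNV-1a-style string hash, varied by seed.
  hash(value, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bitCount;
  }
  add(value) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(value, seed);
      this.bits[bit >> 3] |= 1 << (bit & 7);
    }
  }
  mightContain(value) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(value, seed);
      if ((this.bits[bit >> 3] & (1 << (bit & 7))) === 0) return false;
    }
    return true;
  }
}

const minute = 60 * 1000;
const likeEvents = [
  { friendId: "ana", reelId: "r1", type: "like", timeMs: 0 },
  { friendId: "ben", reelId: "r1", type: "like", timeMs: 1 * minute },
  { friendId: "cy", reelId: "r1", type: "like", timeMs: 2 * minute },
  { friendId: "dee", reelId: "r1", type: "like", timeMs: 10 * minute },
];
console.log(collapseEvents(likeEvents, 5 * minute).length); // 2

const seenBubbles = new BloomFilter();
seenBubbles.add("viewer42:r1");
console.log(seenBubbles.mightContain("viewer42:r1")); // true
```

The first three likes fall inside one five-minute window and become a single event; the fourth arrives later and opens a new one.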

5. Handle the Surprising Discovery: Recency Trumps All

The team discovered that the feature only gained user adoption when they reduced the model's reliance on long-term friend activity and instead only surfaced bubbles from the last 60 minutes. Earlier versions that included older friend reactions (even if highly relevant) were ignored by users. This insight changed the entire data pipeline: instead of storing all friend-reel interactions, the team switched to a sliding window cache. Implement a TTL of 3600 seconds on friend activity entries, and evict old ones aggressively.
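A sliding window with a 3600-second TTL can be prototyped in a few lines. The sketch below is one possible shape, with a hypothetical `FriendActivityWindow` class; it assumes callers pass non-decreasing timestamps, so expired entries always sit at the front of the array.

```javascript
// Sliding-window store: entries at or past `ttlMs` are dropped on every write and read.
class FriendActivityWindow {
  constructor(ttlMs = 3600 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = []; // kept in insertion (time) order
  }

  add(activity, nowMs) {
    this.evictExpired(nowMs);
    this.entries.push({ ...activity, recordedAtMs: nowMs });
  }

  // Entries are time-ordered, so expired ones are always at the front.
  evictExpired(nowMs) {
    let firstLive = 0;
    while (
      firstLive < this.entries.length &&
      nowMs - this.entries[firstLive].recordedAtMs >= this.ttlMs
    ) {
      firstLive++;
    }
    if (firstLive > 0) this.entries.splice(0, firstLive);
  }

  recent(nowMs) {
    this.evictExpired(nowMs);
    return [...this.entries];
  }
}

const MINUTE_MS = 60 * 1000;
const activityWindow = new FriendActivityWindow();
activityWindow.add({ friendId: "ana", reelId: "r1" }, 0);
activityWindow.add({ friendId: "ben", reelId: "r2" }, 30 * MINUTE_MS);

console.log(activityWindow.recent(61 * MINUTE_MS).map((a) => a.reelId)); // [ 'r2' ]
```

In a production system the same effect is more commonly achieved with a cache that supports per-key expiry; the in-memory version just makes the eviction rule explicit.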

Common Mistakes

Over-relying on historical data: In the team's experience, users ignored bubbles built on friend actions older than an hour. Always A/B-test time windows, because what works for recommendation may not work for social discovery.
Ignoring platform timing differences: iOS users missed bubbles because data wasn't refreshed. Ensure you have a retry mechanism and clear UI indicators for stale data.
Underestimating model complexity: The simple collaborative filter was easy to build but failed to personalize. Invest in a graph-based or attention-based model from the start if you have the data.
Not decoupling read and write paths: Friend actions are write-heavy; serving bubbles is read-heavy. Use a CQRS pattern to avoid contention. Fail gracefully under high load with cached defaults.
Forgetting cold-start users: New users with no friends or activity see empty bubbles. Provide a fallback of trending Reels or suggested friends to kickstart the experience.
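The cold-start fallback from the last item can be expressed as a simple priority chain. This is a hypothetical sketch: the function name, the `kind` labels, and the input shape are assumptions for illustration only.

```javascript
// Hypothetical fallback chain for users whose friend graph yields no bubbles.
function chooseBubbleContent({ friendBubbles, trendingReels, suggestedFriends }) {
  if (friendBubbles.length > 0) {
    return { kind: "friend_bubbles", items: friendBubbles };
  }
  if (trendingReels.length > 0) {
    return { kind: "trending_reels", items: trendingReels };
  }
  // Last resort: prompt the user to grow their friend graph.
  return { kind: "suggested_friends", items: suggestedFriends };
}

const newUserContent = chooseBubbleContent({
  friendBubbles: [],
  trendingReels: ["r7", "r9"],
  suggestedFriends: ["dee"],
});
console.log(newUserContent.kind); // trending_reels
```

Returning a `kind` alongside the items lets the client label the surface honestly instead of presenting trending Reels as friend activity.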

Summary

Building Friend Bubbles taught Meta’s Reels team that social discovery at scale requires a careful balance of ML personalization, platform-aware client infrastructure, and a relentless focus on recency. The three key takeaways: (1) prioritize real-time signals over historical accuracy, (2) treat iOS and Android as independent systems with synchronized caching, and (3) never assume a “simple” feature has a simple implementation. For your own projects, prototype with a sliding window of friend activities, use a graph model for personalization, and validate each decision with live A/B tests. The full journey is detailed in the Meta Tech Podcast episode referenced below.

Listen to the Original Episode:
Meta Tech Podcast: Reel Friends: Building Social Discovery that Scales to Billions. Available on Spotify, Apple Podcasts, Pocket Casts.
