Twitter System Design

Twitter has a massive amount of people that are using it, 6.1% population, and it has huge amounts of daily traffic, 500 millions daily tweets posted.

But how is such a system engineered?

And what components does it have?

These are also popular interview questions, and we will tackle them in this article.

Plan of Action

1. Requirements and Goals of the System
	Scope the problem
	Functional Requirements
	Non-Functional Requirements
2. High-level Design
	Flow of Posting a Tweet
	Building the Newsfeed
	APIs
3. Detailed Design
	Delivering Tweets - Push vs Pull
	Fan-out Service Deep Dive
	Building the Newsfeed Deep Dive
	Search Service
	Trending Topics
	Data Partitioning
	Ranking Tweets at Scale
4. Conclusions

Functional Requirements

If we want to build a system like Twitter, we would need to provide at least the following Core features:

Post a new tweet. This can be simple text, links, images or videos.
Users should have the possibility to: comment, retweet, like, or share on a tweet.
Follow, or unfollow an account. The resulted relations we’ll be used later, to build the newsfeed page. Here, relations will be in one direction. If I follow Elon Musk, he doesn’t have to follow me back. Although, it would be nice.
Users should be able to access the home page, they should receive tweets from the people they are following, ranking in a relevancy order. This is also known as newsfeed.
A user should be able to search tweets and also see trending topics in his region.
There might be other features like "Creating an Account", "Getting Notifications" but these are common to many applications nad not something specific to Twitter.

The interviewer might pick some specific ones, but don’t worry, we’ll cover most of them in this video.

Non-Functional Requirements

High availability. Although, always-on availability is taken for granted, it takes a lot of effort to achieve it. It would make the news if the whole Twitter app would be down for a couple of minutes.

Scalability. Twitter should be able to scale horizontally and vertically to accommodate an increase in users and message volume.

Fast Responses. To keep the users happy we need fast responses from the application. Posting a tweet should be fast but that wouldn’t be so difficult to achieve.

However, an interesting problem is when an account with millions of followers makes a post.

How should we propagate this tweet to all followers? And how does users immediately receive the most relevant tweets on their newsfeed page? We’re going to find the answer in this article.

Capacity Estimation

Traffic: Twitter has over 330 million monthly active users, and its peak traffic volume can be very high during major events or breaking news. Assuming an average of 1,000 requests per second per million users, Twitter may need to handle 330,000 requests per second during peak periods.
Data volume: The amount of data generated by Twitter users can be substantial, including messages, pictures, videos, and other content. Assuming an average message size of 200 bytes, and 10 messages per user per day, Twitter may need to store around 66 TB of data per day for its 330 million users. But we should further consider a whole year worth of data.

Read heavy

However, we might observe that Twitter is a read-heavy application, as opposed to write heavy. Since read to write ratio for Twitter is very high, we should look for optimizing this kind of pattern.

Posting a Tweet Flow - High Level View

What happens in the backend, when someone adds a tweet?

The tweet would likely be saved to a database;

But probably there would be more than just that.

There should be a process that distributes this tweet to the timelines of all followers.

We’ll call this process Fan-out service.

Then, it would also make sense to have a notification service that informs users that someone they like have published new content.

Since we want to make sure, that all three services are involved when posting a tweet, we’ll need to have a manager service.

As this is related to writing tweets, we’ll call it Write API.

Building the Newsfeed

The newsfeed service job is to first collect the user ids of the accounts a person is following.

After that, it will need to fetch the tweets made by those userIDs.

Then, merge the tweets, and finally sort them, in order to retrieve the most relevant tweets first.

Usually, tweets are ranked based on parameters such as relevance, timestamp, and user’s engagement.

However, executing all of these steps for all users can become expensive.

The queries involved here will be heavy on the database; Especially when millions of users are involved.

And, we can’t let the user wait.

A cache would definitely help here, but it probably won’t be enough.

So, it would be great if the newsfeed cache would be already pre-computed.

We would have a separate process to periodically updates the cache.

We’ll see later in the Newsfeed deep-dive how to build such a push-pull mechanism.

APIs

In order to store or retrieve data, we need to communicate with the backend services.

We do this via the exposed APIs endpoints.

Create a new tweet:

Headers: authentication token (JWT)
Parameters: tweet data (text, image, video, etc.)
Response: tweet object with unique ID, text, author ID, and other data

Get a list of tweets:

URL: GET /api/feed
Headers: authentication token (JWT)
Parameters: query parameters (e.g., author, hashtags, time range)
Response: list of tweet objects that match the query

Follow another user:

URL: POST /api/{userId}/follow
Headers: authentication token (JWT)
Parameters: user ID of the user to follow
Response: confirmation message or error message

Delivering Tweets - Fan-out

The goal of this service is to deliver each particular tweet to the users.

There are 2 common ways to achieve this.

We can push the tweets to the recipients, or we can prepare the tweets nicely form and let them be pulled by users.

Push Model

In the Push Model, we can deliver the tweet immediately after publication to all home pages of all the followers.

This model is also called Fan-out on Write.

Pros:

The advantage here is that a post is pushed immediately to followers, and therefore the news feed is populated in real-time.

Cons:

However, if a user has many followers, fetching the followers list and populating the news feeds page for all of them will be slow and resource intensive.
As you image this model involves a large number of writes on the database.
Moreover, for inactive users this approach would be a waste of resources.

Pull Model

On the other hand, we can let the users pull the most recent tweets when they load their home page.

This is why this model is also called Fan-out on Load.

When using the pull-model the most recent feed is retrieved only when the user makes the request.

Pros:

Therefore, this approach involves only a fraction of write operations on the database. (small number of write operations on the database)
Also, in this approach we won’t update the news feed for inactive users. (resource not wasted on inactive users)

Cons:

However, the obvious downside of this approach is that users won’t be able to view recent feeds, unless they "pull" the data from the server. And, fetching the news feed on demand could be slow. (no real-time news feed)

Push vs Pull

So, what shall we choose?

In general, we want to push the feed as fast as possible.

However, the push model won’t be achievable for all users.

For instance, for famous users that have huge number of followers, and that also tweet a lot it will be a storm of writes to the database for any tweet they post.

So, in this case, we let the followers pull new content on-demand to avoid an overloading the system.

Therefore, we use a kind of a hybrid between the two models, depending on the number of followers a user has.

This way, we combine the benefits of the two models.

Search Service

How can we provide a search functionally while considering the amount tweets that need to be indexed, and the amount of search requests that are being made by all users?

A traditional relational database will probably not be performance enough considering the size of data, and the fact that searching would be made on text rather than keys.

So, a good candidate for such a use case would be a search engine like ElasticSearch or Apache Solr.

These search engines could index, search, and analyze huge amounts of data quickly, possibly in real-time, and give results back within milliseconds.

But what is the secret sauce behind these databases?

They both are using an inverted full-text index.

In simple terms, this means that whenever a tweet is posted the tweet will be passed through a chain of tokenizers and analyzers.

After the text processing is done we’ll add these words in a big distributed index.

This index will basically be a mapping table where each word will have references to the tweets that contain that particular word.

Since the data will be laid-out in the exact form necessary for searching, it will be extremely fast.

Suppose a user searches for word ‘twitter‘, then the engine will go through the table it will immediately find the word ‘twitter‘, then it will get the references of all the tweets in the system.

Additionally, the results that are returned can be ranked based on relevance engagement and freshness.

Furthermore, this index table will be distributed across multiple nodes.

This means that it would be able to scale out as the data continues to expand.

Next, we’ll see how the functionality of Trending Topics can be built on top of the searching functionality.

A topic becomes a trend, when people start searching for it.

Therefore, using the search service we could gather statistics about the most frequently searched queries, hashtags, and topics, in the last day for example.

Furthermore, we can update them every hour using some sort of scheduling mechanism.

Additionally, we can customize the trending results based on location and user’s interests.

Data Partitioning

Because there are so many users and tweets, it’s impossible to keep all the data on a single server.

First, it would not fit.

Second, it would be slow considering for retrieving data.

So, to scale out our databases we will need to split the data into partitions, and distribute them on multiple nodes.

This kind of partitioning is also known as sharding.

The partitioning can be done using different classic strategies such as the following:

Hash-Based Partitioning
List-Based Partitioning
Range Based Partitioning

These strategies, they all offer even data distribution depending on use-case.

However, these methods are disappointing when it comes to adding or removing a node from the cluster or when one of the node breaks down.

In those situations, a full re-distribution of data would be necessary and that would be very expensive.

A better approach would be to use Consistent hashing because it limits the re-sharding area, only to fraction of the whole data.

Conclusions

Every company has its unique requirements and constraints, and we should design a system to fit those requirements.

As always, there is no standard way to design a system.

This is also the case in the interview.

What’s important is to understand the problem, to analyze the issues step by step while discussing the benefits and tradeoffs of each design.