System design interview
March 11, 2023
System design interviews are a critical component of technical interviews, particularly for software engineering roles. In these interviews, candidates are asked to design a system that meets a particular set of requirements and constraints. The goal is to evaluate the candidate's ability to think systematically and design solutions that are scalable, fault-tolerant, and performant. In this article, we'll explore the different aspects of a system design interview.
What qualities companies are looking for in candidates
Problem Exploration
This refers to how well the candidate can analyze the problem and identify the key requirements and constraints. Companies are looking for candidates who can ask the right questions, clarify assumptions, and propose creative solutions.
For example, if the problem is to design a social media platform, the candidate should explore the different features that the platform needs to support, such as user profiles, friend connections, news feeds, and messaging. They should also consider the scalability and performance requirements of the platform, as well as any security or privacy constraints.
Proxy Evaluation:
- Functional requirements
- Non-functional requirements
- Assumptions
Handling Data
This refers to how well the candidate can handle data efficiently at scale. It includes skills such as data modeling, data storage, and data retrieval.
For example, if the problem is to design a search engine, the candidate should consider how to store and retrieve a large index of web pages efficiently. They should also consider how to rank the search results based on relevance, which requires sophisticated data modeling and algorithms.
Proxy Evaluation:
- Data API
- Data Storage
Component Responsibilities
This refers to how well the candidate can identify and define the responsibilities of each component of the system. Companies are looking for candidates who can think systematically and have a strong understanding of the underlying architecture.
For example, if the problem is to design an e-commerce platform, the candidate should consider the different components of the system, such as the user interface, the database, the payment gateway, and the shipping logistics. They should also consider how these components interact with each other and how they can be scaled and optimized for performance.
Proxy Evaluation:
- Service separation of concerns
- Database separation of concerns
- API Gateway/Load Balancer
Completeness of Solution
Provide a complete solution that satisfies all the requirements and constraints of the problem. This includes considerations such as scalability, fault tolerance, and performance.
For example, if the problem is to design a streaming video platform, the candidate should consider how to handle the storage and delivery of video content at scale. They should also consider how to ensure high availability and fault tolerance in case of network or server failures.
Proxy Evaluation:
- Address all functional requirements
- Address all non-functional requirements
- API and system completeness
Tradeoffs
Identify and explain the tradeoffs of different design decisions. This includes tradeoffs such as complexity vs. simplicity, consistency vs. availability, and cost vs. performance.
For example, if the problem is to design a chat application, the candidate should consider the tradeoff between message delivery latency and message consistency. They should also consider the tradeoff between using a centralized or decentralized architecture for the application.
Proxy Evaluation:
- Coming up with at least 2 solutions
- SQL vs NoSQL, depending on the expected data throughput
- REST vs gRPC vs GraphQL
- Pull vs Push
Quantitative Analysis
This refers to how well the candidate can analyze and quantify the performance of the system. Companies are looking for candidates who can use metrics and benchmarks to evaluate the effectiveness of the solution.
For example, if the problem is to design a recommendation engine, the candidate should consider how to measure the accuracy and relevance of the recommendations. They should also consider how to optimize the recommendation algorithms for performance and scalability.
Proxy Evaluation:
- Reads per second, writes per second, read-to-write ratio
- Storage consumption
- Bandwidth consumption
Deep dive
Demonstrate a deep understanding of a particular aspect of the system. This includes being able to explain the underlying technology or algorithms, or being able to optimize a specific component for performance or scalability.
For example, if the problem is to design a content delivery network (CDN), the candidate should be able to explain the different caching and routing strategies that can be used to optimize content delivery. They should also be able to explain how to measure the effectiveness of the CDN and identify areas for optimization.
Proxy Evaluation:
- Database sharding and partitioning
- System scalability
- Authentication
- Security
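As one possible deep-dive illustration for database sharding, the sketch below routes a record to a shard by hashing its key. The function names (`shard_for`, `moved_fraction`) and the choice of SHA-256 modulo the shard count are assumptions made for this example, not a prescribed approach. It also measures the main weakness of plain modulo routing: changing the shard count relocates most keys, which is the usual argument for consistent hashing.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), because hash() for
    strings is randomized per process and would not route consistently.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys: list[str], old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical user IDs, used only to exercise the routing function.
user_ids = [f"user-{i}" for i in range(10_000)]
print("user-42 lives on shard", shard_for("user-42", 4))
print("fraction of keys moved going from 4 to 5 shards:",
      moved_fraction(user_ids, 4, 5))
```

In an interview, the fraction printed on the last line is a natural lead-in to discussing consistent hashing or directory-based partitioning as alternatives.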
Hands On
- Functional requirements
- Non-functional requirements
- Quantitative analysis
- High-level design and data flow
- Data API (REST, GraphQL)
- Data schema and data store
- Optimization
1. Functional requirements
- Understand the problem domain and user needs to identify the key functional requirements
- Prioritize requirements based on their importance to users and the system's goals
- Be able to explain trade-offs between different requirements and how they impact the system's design and implementation
- Be careful with assumptions, because a wrong assumption can lead you to solve a different problem
Example:
- Use cases
- Stakeholders
- “As a user, I can post a comment on a post”
2. Non-functional requirements
- Be familiar with common non-functional requirements such as performance, scalability, security, and usability
- Understand how non-functional requirements affect the system's architecture and design
- Be able to propose solutions for meeting non-functional requirements and explain trade-offs between different options
Example:
- Reliability: 99.99% system availability
- Scalability: can handle traffic going up and down
- Security: only one public endpoint; code is executed safely
- Durability: store data for 10 years
- Latency: p95 of 200 ms
- High availability vs strong consistency
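Targets such as 99.99% availability and a p95 latency of 200 ms are easier to discuss once they are turned into concrete numbers. The sketch below is a minimal illustration: the helper names and the sample latencies are made up for this example, and the percentile uses the nearest-rank method, which is only one of several common definitions.

```python
import math


def allowed_downtime_minutes(availability_pct: float, days: int = 365) -> float:
    """Downtime budget (in minutes) implied by an availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 95, 180, 210, 130, 90, 160, 175, 140, 450]

downtime = allowed_downtime_minutes(99.99)
p95 = percentile_nearest_rank(latencies_ms, 95)

print(f"99.99% availability allows about {downtime:.1f} minutes of downtime per year")
print(f"p95 latency: {p95} ms, meets 200 ms target: {p95 <= 200}")
```

With only ten samples the p95 is simply the slowest request, which shows why percentile targets need a reasonably large sample to be meaningful.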
3. Quantitative analysis
- Be comfortable with basic statistics and data analysis techniques
- Understand how to collect and analyze data to inform system design and optimization
- Be able to explain how quantitative analysis can help identify bottlenecks and areas for improvement in the system
Example:
- Number estimation (how many users, how often each use case occurs, read heavy vs write heavy, read-to-write ratio)
- Reads per second and writes per second; read-to-write ratio
- Storage consumption
- Bandwidth (not always important)
Example with numbers:
Assumptions:
- Active users: 500 million/day
- Only 20% of users write a comment (one comment per writing user per day)
- Read-to-write ratio: 100:1
- Use cases:
  - Post comment
  - Read comment
Numbers:
- 1 day = 86,400 sec ≈ 100K sec (10^5)
- Writes per second: 500 million/day × 20% = 5 × 10^8 × 0.2 / 10^5 = 1 × 10^3 writes/sec
- Reads per second: 100 × writes per second = 10^2 × 10^3 = 10^5 reads/sec
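The same back-of-the-envelope estimate can be written as a few lines of code, which makes it easy to vary the assumptions. This is a sketch only: it uses the 500 million daily users figure from the calculation, assumes each writing user posts one comment per day, and the function name `estimate_load` is invented for this example. It also compares the rounded 100K seconds per day with the exact 86,400.

```python
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000  # interview-friendly approximation


def estimate_load(
    daily_active_users: int,
    writer_percent: int,
    reads_per_write: int,
    seconds_per_day: int = ROUNDED_SECONDS_PER_DAY,
) -> dict[str, float]:
    """Estimate writes/sec and reads/sec.

    Assumes every writing user posts exactly one comment per day.
    """
    writes_per_day = daily_active_users * writer_percent // 100
    writes_per_sec = writes_per_day / seconds_per_day
    reads_per_sec = writes_per_sec * reads_per_write
    return {"writes_per_sec": writes_per_sec, "reads_per_sec": reads_per_sec}


rounded = estimate_load(500_000_000, 20, 100)
exact = estimate_load(500_000_000, 20, 100, SECONDS_PER_DAY)

print("rounded day:", rounded)
print("exact day:  ", {k: round(v) for k, v in exact.items()})
```

The exact figures come out roughly 16% higher than the rounded ones, which is well within the precision an interview estimate needs.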
4. High-level design and data flow
- Be able to create a high-level architecture diagram that shows the key components and their interactions
- Understand how to break down the system into smaller components that can be developed and tested independently
- Be able to explain how data flows through the system and identify potential bottlenecks or areas for optimization
Example:
- Draw: User → Load Balancer → API Gateway → Service → Database
- Draw the arrows
- If data flows through multiple systems, write down the order of the data flow
5. Data API
- Understand the differences between REST and GraphQL and when to use each one
- Be able to design a data API that meets the system's requirements and is easy to use for other developers
- Understand how to handle errors and edge cases in the API and ensure it is secure and scalable
Example:
- REST vs GraphQL
- POST vs GET
- Specify both the request and the response
REST:
POST /:post_id/comment
request = { auth_token: string, post_id: UUID, comment: string }
response = { comment_id: UUID }
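To make the request and response shapes concrete, here is a minimal in-memory sketch of a handler for the comment endpoint. It is not tied to any web framework. The names `post_comment`, `ApiError`, `VALID_TOKENS`, and `COMMENTS`, as well as the choice of 401 and 400 status codes, are assumptions for illustration; only the field names come from the example above.

```python
import uuid
from dataclasses import dataclass


@dataclass
class CommentRequest:
    auth_token: str
    post_id: uuid.UUID
    comment: str


@dataclass
class CommentResponse:
    comment_id: uuid.UUID


class ApiError(Exception):
    """Carries an HTTP-style status code for error responses."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


# Hypothetical stand-ins for an auth service and a database.
VALID_TOKENS = {"demo-token"}
COMMENTS: dict[uuid.UUID, dict] = {}


def post_comment(req: CommentRequest) -> CommentResponse:
    """Handle POST /:post_id/comment."""
    if req.auth_token not in VALID_TOKENS:
        raise ApiError(401, "invalid auth token")
    text = req.comment.strip()
    if not text:
        raise ApiError(400, "comment must not be empty")
    comment_id = uuid.uuid4()
    COMMENTS[comment_id] = {"post_id": req.post_id, "comment": text}
    return CommentResponse(comment_id=comment_id)


response = post_comment(CommentRequest("demo-token", uuid.uuid4(), "Nice post!"))
print("created comment", response.comment_id)
```

Writing out the error cases alongside the happy path is a simple way to show that edge cases were considered.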
6. Data schema and data store
- Understand the trade-offs between different types of data stores such as relational databases, NoSQL databases, and file systems
- Be able to design a data schema that meets the system's requirements and is easy to maintain and extend
- Understand how to ensure data consistency and integrity in the data store
Example:
- SQL vs NoSQL
- Data type
- Object storage
- In-memory storage, caching, message queue
- Database partitioning/sharding
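If a relational store is chosen, a schema for the comment use case might look like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database so it can be run as-is. The table and column names are assumptions for this example, and a production system would likely use a different database engine.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE posts (
    post_id   TEXT PRIMARY KEY,
    author_id TEXT NOT NULL,
    body      TEXT NOT NULL
);
CREATE TABLE comments (
    comment_id TEXT PRIMARY KEY,
    post_id    TEXT NOT NULL REFERENCES posts(post_id),
    author_id  TEXT NOT NULL,
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Reads are the hot path ("all comments for a post"), so index by post.
CREATE INDEX idx_comments_post ON comments(post_id, created_at);
"""


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def add_comment(conn: sqlite3.Connection, post_id: str, author_id: str, body: str) -> str:
    comment_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO comments (comment_id, post_id, author_id, body) "
            "VALUES (?, ?, ?, ?)",
            (comment_id, post_id, author_id, body),
        )
    return comment_id


def comments_for_post(conn: sqlite3.Connection, post_id: str) -> list[str]:
    rows = conn.execute(
        "SELECT body FROM comments WHERE post_id = ? ORDER BY created_at, rowid",
        (post_id,),
    )
    return [row[0] for row in rows]


conn = open_store()
with conn:
    conn.execute("INSERT INTO posts VALUES (?, ?, ?)", ("post-1", "alice", "Hello"))
add_comment(conn, "post-1", "bob", "first")
add_comment(conn, "post-1", "carol", "second")
print(comments_for_post(conn, "post-1"))
```

The foreign key gives integrity at the cost of coupling the two tables to one database, which is exactly the kind of tradeoff that comes up once sharding enters the discussion.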
7. Optimization
- Understand common optimization techniques such as caching, indexing, and load balancing
- Be able to identify bottlenecks in the system and propose solutions for improving performance and scalability
- Understand how to measure the impact of optimizations and balance the cost and benefit of each one
Example:
- Authentication and authorization
- Monitoring
- Mobile-specific knowledge
  - Battery
  - Offline
- Tradeoffs
  - SQL vs NoSQL
  - Read heavy vs write heavy
  - REST vs GraphQL vs gRPC vs Protobuf
  - Pull vs Push
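Caching is one of the optimization techniques mentioned in this section, and it is a common choice for read-heavy workloads. The sketch below shows a small least-recently-used (LRU) cache with a read-through helper. The class and function names and the fake `slow_load` data source are assumptions for illustration; a real system would more likely use a dedicated cache service.

```python
from collections import OrderedDict
from typing import Callable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def read_through(cache: LRUCache, key: str, load: Callable[[str], str]) -> str:
    """Return the cached value, loading and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.put(key, value)
    return value


db_calls: list[str] = []


def slow_load(post_id: str) -> str:
    """Stand-in for a database query; records each call."""
    db_calls.append(post_id)
    return f"comments for {post_id}"


cache = LRUCache(capacity=2)
for post_id in ["p1", "p1", "p2", "p3", "p1"]:
    read_through(cache, post_id, slow_load)

print("database calls:", db_calls)
print("hits:", cache.hits, "misses:", cache.misses)
```

Tracking hits and misses alongside the cache is what makes it possible to measure whether the optimization is paying for itself.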
8. Core puzzle
- Be able to identify the key features or components of the system that are critical to its success
- Understand how to prioritize development efforts based on the importance of each core puzzle component
- Be able to explain how the core puzzle fits into the overall system design and how it contributes to meeting user needs.
Learning Material
- Parking: https://www.youtube.com/watch?v=NtMvNh0WFVM
- Facebook News Feed: https://www.youtube.com/watch?v=5vyKhm2NTfw
- Spotify top K song: https://www.youtube.com/watch?v=CA-ei3mOCf4
- Rate limiting: https://www.youtube.com/watch?v=mhUQe4BKZXs
- Design Online Judge: https://www.youtube.com/watch?v=eg0nlYcbLpo
- [Book] System Design Interview - An Insider's Guide by Alex Xu
- Alex Xu's system design YouTube channel: https://www.youtube.com/c/ByteByteGo
- [Book] Grokking the System Design Interview
- [Book] Designing Data-Intensive Applications