Scaling in System Design


Scaling means increasing the capacity of a system so that it can handle more users, traffic, or functionality while keeping performance acceptable as load grows. In system design, the goal is to improve responsiveness and throughput while maintaining reliability. The two main approaches are vertical scaling and horizontal scaling, each with its own trade-offs.

Vertical Scaling

Vertical scaling increases the capacity of a single machine by upgrading its CPU, RAM, storage, or other resources so that it can handle higher demand. One advantage is that the modules in the system can communicate efficiently: everything runs on the same machine, so no network calls are required. However, a single machine has physical and cost limits, so this approach only works up to a point. Relying solely on vertical scaling becomes expensive and impractical for large-scale distributed systems.
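To make the "no network calls" point concrete, here is a minimal sketch in Python. All names (`price_with_tax`, `handle_pricing_request`, the 20% tax rate) are hypothetical and chosen only for illustration. The first checkout function calls the pricing module directly, as it could on a single machine. The second encodes and decodes the same call the way a remote call would have to; the network hop itself is only simulated by passing a string.

```python
import json


def price_with_tax(amount_cents: int, tax_rate_percent: int) -> int:
    """Hypothetical pricing module: total in cents, tax included."""
    return amount_cents + amount_cents * tax_rate_percent // 100


def checkout_in_process(amount_cents: int) -> int:
    # Same machine, same process: a plain function call, nothing to encode.
    return price_with_tax(amount_cents, 20)


def handle_pricing_request(request: str) -> str:
    # What a separate pricing service would do: decode, compute, encode.
    args = json.loads(request)
    total = price_with_tax(args["amount_cents"], args["tax_rate_percent"])
    return json.dumps({"total_cents": total})


def checkout_over_network(amount_cents: int) -> int:
    # Across machines the arguments must be serialized, sent, and the
    # reply parsed. Here the "send" step is just a local function call.
    request = json.dumps({"amount_cents": amount_cents, "tax_rate_percent": 20})
    response = handle_pricing_request(request)
    return json.loads(response)["total_cents"]
```

Both functions return the same total, but the second does extra encoding and decoding work on every call, and a real network hop would add latency and new failure modes on top of that.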

Horizontal Scaling

Horizontal scaling increases capacity by adding more machines alongside the existing ones so that they share the workload. These machines typically communicate over the network using APIs or RPC (Remote Procedure Call), and a load balancer is needed to distribute traffic across them and keep response times low. Horizontal scaling also improves availability and fault tolerance, because requests can be redirected to the remaining machines if one fails. It is more practical for large-scale environments, since machines can keep being added as demand grows, but it introduces more operational and architectural complexity.
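The load-balancing and failover behaviour described above can be sketched with a simple round-robin balancer. This is an illustrative toy, not how any particular load balancer is implemented: the class name, its methods, and the server names used with it are assumptions, and health status is set by hand rather than detected by real health checks.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)      # rotation order
        self.healthy = set(self.servers)  # servers currently accepting traffic
        self._index = 0                   # next position in the rotation

    def add_server(self, server):
        # Scaling out: a new machine simply joins the rotation.
        self.servers.append(server)
        self.healthy.add(server)

    def mark_down(self, server):
        # Failover: stop routing requests to a failed machine.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call, starting at the cursor.
        for _ in range(len(self.servers)):
            server = self.servers[self._index]
            self._index = (self._index + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

With three hypothetical servers `app-1`, `app-2`, and `app-3`, successive calls to `next_server()` cycle through them in order. After `mark_down("app-2")`, requests alternate between `app-1` and `app-3`, and `add_server("app-4")` adds capacity without touching the existing machines.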