Sharding in DBMS
Shard means a small part of the whole, i.e., sharding is a process of dividing a database into smaller data partitions called shards. These shards are not only smaller, but they are also easier to manage, store, and query. Each shard typically holds a subset of the total data.
Need for Sharding
Consider a university database that stores present and past students, about 10,000 in number. If we want to find only the current students, the system may need to scan a large portion of those entries, which can be heavy on the server and can reduce performance.
If we shard or divide the data across multiple data nodes based on something like admission year, it becomes much more efficient to query student data by year, because only the relevant shard needs to be queried instead of the entire database.
How it Works
The data is distributed across different nodes or database partitions based on some predetermined criteria like userId, region, or year. This allows the system to scale horizontally by adding more shards when needed, instead of relying only on vertical scaling (upgrading hardware), which is usually more expensive.
To query data from a sharded database, the system needs to know which shard contains the required data. This is done using a shard key, which is a field used to decide where the data will be stored. When a query is received, the system uses the shard key to determine the correct shard and sends the query to the appropriate node.

Features of Sharding
Sharding makes each database partition smaller
Sharding improves query performance when queries are shard-key based
Sharding makes data easier to manage at scale
Sharding helps distribute load across multiple servers
Each shard handles reads and writes for its own data
Failure of one shard does not directly stop other shards from working (if replication is used)
What if a Shard Fails
Databases often use replication (commonly called primary-replica or master-slave architecture). If a primary shard goes down, one of the replica shards can be promoted as the new primary until the original one recovers.
The primary shard usually handles write operations, while read operations can be handled by multiple replicas using load balancing. These replicas continuously sync data from the primary to reduce chances of stale data.
