Tech Unpacked – Research & Fundamentals with Nitin Sharma: Database Sharding

Tuesday, May 10, 2022

Database Sharding - Key Concepts

Sharding a database is a common scalability strategy used when designing server side systems. The server side system architecture uses concepts like sharding to make systems more scalable, reliable and performant.

Sharding is horizontal partitioning of data according to a shard key. This shard key determines which database the entry to be persisted is sent to. Some common strategies for this are reverse proxies.

Let's take a common example of pizza and break it into slices and call your friends over. Each of your friend is going to get one slice of pizza. What you have done effectively is partitioned the pizza according to each friend's share. Just like that we can have serves which are going to be taking the load of the requests.

How we can convert our tech requirement to the pizza model.

Basically each server going to handle requests based on partition say 1-100 userIds on 1st server and 101-200 on 2nd server and so on.
This kind of partitioning which uses some sort of a key to break the data into pieces and allocate that to different servers is called horizontal partitioning.
Servers which we are talking about here are database servers.

While we are discussing about database tuning, we always have to make sure we are maintaining key attributes of the database.

Consistency
Availability

What should be shard your data on?

We are using userId in our case but in applications like tinder which use location, you could shard on the location and if a person says find me all the users in city X and X may fall in one specific sharda and all you need to do just read through this shard.

Problems doing Sharding

Joins between shards
Fixed number of shards

With hierarchichal sharding approach we can overcome this problem, we can break one shard into sub-shards and there can one master sort of thing which can decide which mini shard request needs to route.

Best Practices

Create index on shards

This index can be on completely different attribute compared to userId.

One of the good example is like find me all the people on NewYork having age greater than 50.

Use Master Slave architecture on the shards to avoid SPOF.

Read requests can go to slaves.
Write requests always goes to master.
In case master fails, slaves choose one master among themselves.

Conclusion

Conceptually it's easy but in terms of doing practically it's quite tricky because consistency is really tough to do.
If starting with a new system, take other mechanisms into consideration like indexing, noSQL databases which internally uses these these kind of concepts.
Indexing and ready made solutions would be way to go before to think of implementing sharding by our own.

Tech Unpacked – Research & Fundamentals with Nitin Sharma

Popular Posts

Search This Blog

Tuesday, May 10, 2022

Database Sharding - Key Concepts

No comments:

My Profile

Featured Post

🚀 Introducing the Universal API Testing Tool — Built to Catch What Manual Testing Misses

!! IMPORTANT LINKS !!

!! INTERESTING TALKS !!

Contact Form

Labels

Total Pageviews