Partitioning can improve throughput in serialized systems

Last Updated on Mar 04, 2021

To guarantee Serializable Isolation, the database should ensure that the result of a transaction, whether in serial or concurrent conditions, is the same. While applications can use different serialization implementations with certain constraints, they can also opt to execute transactions serially.

One issue with executing transactions serially would be that the throughput will be limited to whatever can be processed on a single CPU. It is possible to use bigger CPUs with more cores, scaling up vertically can be costly.

Instead, applications can choose to partition data across different nodes so that transactions only need to read and write data into a single partition. This would allow the database to use multiple CPUs across machines, with each CPU being responsible for its own partition.

But the ability to partition data has to be considered while developing the application's data model. Certain types of databases, like simple key-value stores, can be partitioned easily. But those with secondary indexes, like in the case of relational databases, are not good candidates for partitioning because they will require a lot of cross-partition coordination.

© 2022 Ambitious Systems. All Rights Reserved.