Document data models have excellent data locality


Last Updated on Jan 07, 2021

Document data models resemble hierarchical databases in that they store nested records inside their parent record. This makes them a good fit for tree-like data whose nested records should be colocated with the parent.

This storage locality brings a performance advantage. If the data is split across multiple tables, as in a relational model, retrieving all of it may take several lookups, which means more disk seeks and more time.
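As a rough illustration, the sketch below models the same profile two ways with plain Python dictionaries: once as a single nested document and once split across tables keyed by user ID. All names here (`documents`, `users`, `positions`, `education`) are hypothetical, and each dictionary access merely stands in for one storage lookup.

```python
# Hypothetical in-memory stand-ins for storage.
# Each dictionary access in the fetch_* functions counts as one lookup.

# Document model: nested records live inside the parent record.
documents = {
    251: {
        "name": "Ada",
        "positions": [{"title": "Engineer"}, {"title": "Analyst"}],
        "education": [{"school": "University of London"}],
    }
}

# Relational model: the same data split across three tables.
users = {251: {"name": "Ada"}}
positions = {251: [{"title": "Engineer"}, {"title": "Analyst"}]}
education = {251: [{"school": "University of London"}]}


def fetch_profile_document(user_id):
    """Return (profile, lookups): one read brings back the whole tree."""
    return documents[user_id], 1


def fetch_profile_relational(user_id):
    """Return (profile, lookups): one read per table, then reassemble."""
    profile = dict(users[user_id])
    profile["positions"] = positions[user_id]
    profile["education"] = education[user_id]
    return profile, 3
```

Both functions return the same profile; the difference is the number of reads needed to assemble it.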

Application data, especially in object-oriented code, tends to be nested, much like an XML or JSON document. Such applications therefore map naturally onto a document database.

Document Size

The locality advantage applies only if the application needs large parts of the document at the same time. The database typically loads the entire document even when only a small part of it is accessed, which is wasteful for large documents.
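The cost of reading a small field from a large document can be sketched as follows. This is a simplified model that assumes the document is stored as one JSON string and must be decoded in full before any field is available; `stored` and `read_title` are hypothetical names.

```python
import json

# Hypothetical stored document: a small field plus a large nested list.
stored = json.dumps({
    "title": "Sensor log",
    "readings": [{"t": i, "value": i * 0.5} for i in range(1000)],
})


def read_title(encoded):
    """Decode the entire document just to return one small field."""
    document = json.loads(encoded)  # the whole record is parsed
    return document["title"]


title = read_title(stored)
bytes_loaded = len(stored.encode("utf-8"))             # size of the full document
bytes_needed = len(json.dumps(title).encode("utf-8"))  # size of the field alone
```

Comparing `bytes_loaded` with `bytes_needed` shows how much of the work is spent on data the caller never uses.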

Updates usually rewrite the entire document as well; only changes that leave the encoded size unchanged can easily be applied in place. It is therefore preferable to design the data model so that documents stay small, and applications should avoid writes that increase the size of a document.
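A minimal sketch of this behavior, assuming a store that keeps each document as a single encoded blob. `TinyDocumentStore` and its methods are hypothetical and do not correspond to any particular database's API.

```python
import json


class TinyDocumentStore:
    """Hypothetical store that keeps each document as one encoded blob."""

    def __init__(self):
        self._blobs = {}
        self.bytes_written = 0

    def put(self, doc_id, document):
        blob = json.dumps(document).encode("utf-8")
        self._blobs[doc_id] = blob
        self.bytes_written += len(blob)  # the full blob is written every time

    def get(self, doc_id):
        return json.loads(self._blobs[doc_id])

    def size(self, doc_id):
        return len(self._blobs[doc_id])

    def append_comment(self, doc_id, comment):
        document = self.get(doc_id)           # read the whole document
        document["comments"].append(comment)  # change a small part
        self.put(doc_id, document)            # rewrite the whole document


store = TinyDocumentStore()
store.put("post-1", {"body": "x" * 500, "comments": []})
size_before = store.size("post-1")
written_before = store.bytes_written

store.append_comment("post-1", "ok")
size_after = store.size("post-1")
rewritten = store.bytes_written - written_before
```

Appending a two-character comment grows the document slightly, yet `rewritten` equals the full size of the new document rather than the size of the change.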

These performance limitations significantly narrow the set of situations in which document databases are useful.

Relationships

Document models are well suited to one-to-many relationships, which they store as nested data within the parent document. They are less efficient at representing many-to-one and many-to-many relationships.

As in relational models, where a foreign key plays this role, document models represent these relationships by referring to the related document by a unique identifier, called a document reference.

Applications that gather data from multiple documents via references face the same problem as with the relational model: they must query the database several times to assemble the result.
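The extra calls can be sketched as an application-side join. The collections and field names below (`people`, `organizations`, `employer_ids`) are hypothetical, and each dictionary access stands in for one database query.

```python
# Hypothetical collections keyed by document ID.
organizations = {
    "org-1": {"name": "Acme"},
    "org-2": {"name": "Globex"},
}
people = {
    # Many-to-many: a person refers to organizations by ID, not by value.
    "p-1": {"name": "Ada", "employer_ids": ["org-1", "org-2"]},
}


def load_person_with_employers(person_id):
    """Resolve document references in application code.

    Returns (result, lookups), where lookups counts simulated queries.
    """
    person = people[person_id]
    lookups = 1
    employers = []
    for org_id in person["employer_ids"]:
        employers.append(organizations[org_id]["name"])  # one query per reference
        lookups += 1
    return {"name": person["name"], "employers": employers}, lookups
```

For a person with two employers, assembling the result takes three lookups: one for the person and one for each referenced organization.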

Systems can work around this problem by storing denormalized copies of the data that are ready to be retrieved and sent over the wire, though keeping those copies consistent adds complexity to the application.
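One way to picture the trade-off is the sketch below, in which each person document embeds a copy of the employer's name. The data and the helpers `read_person` and `rename_organization` are hypothetical; the point is only that reads become a single lookup while a change to the source record must be propagated to every copy.

```python
# Hypothetical collections; the employer name is duplicated into each person.
organizations = {"org-1": {"name": "Acme"}}
people = {
    "p-1": {"name": "Ada", "employer": {"id": "org-1", "name": "Acme"}},
    "p-2": {"name": "Alan", "employer": {"id": "org-1", "name": "Acme"}},
}


def read_person(person_id):
    """One lookup: the employer name is already embedded in the document."""
    return people[person_id]


def rename_organization(org_id, new_name):
    """Update the source record and every embedded copy.

    Returns the number of person documents that had to be rewritten.
    """
    organizations[org_id]["name"] = new_name
    touched = 0
    for person in people.values():
        if person["employer"]["id"] == org_id:
            person["employer"]["name"] = new_name
            touched += 1
    return touched
```

Renaming one organization here rewrites every person document that embeds it, which is the added application complexity the denormalized form brings.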

