Long-running queries need a consistent snapshot of the database


Last Updated on Feb 16, 2021

Long-running processes such as analytics queries or full database backups stay active while the rest of the database keeps changing. For their entire duration, they need a consistent view of the database as of their start time, which means they must continue to see data that has since been overwritten or deleted.

If changes made in the background were visible, such a process could read the partial effects of a transaction and produce inconsistent results. A long-running process must therefore see the data as if it were frozen in time.

The object visibility rule is straightforward: every object value that was committed, and not deleted, as of the moment the process started is visible. Anything committed or deleted after that moment does not affect what the process sees.
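The visibility rule can be written down as a small predicate. This is only a sketch: the `Version` record, its `committed_at` and `deleted_at` fields, and the use of integer logical timestamps are assumptions made for illustration, not the layout of any particular database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One stored value of an object (hypothetical layout)."""
    value: str
    committed_at: Optional[int]       # None: the writing transaction has not committed
    deleted_at: Optional[int] = None  # None: the value has not been deleted


def is_visible(version: Version, snapshot_ts: int) -> bool:
    """Return True if `version` is visible to a process that started at `snapshot_ts`."""
    # Uncommitted values, or values committed after the process started, are hidden.
    if version.committed_at is None or version.committed_at > snapshot_ts:
        return False
    # Values deleted at or before the start time are hidden.
    if version.deleted_at is not None and version.deleted_at <= snapshot_ts:
        return False
    # Committed in time and deleted later (or never): still visible.
    return True
```

Note that a value deleted after the process started stays visible to it, which is exactly the "data that has since been deleted" case described earlier.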

A database can provide a consistent snapshot with only a small overhead by never updating values in place and instead keeping multiple versions of an object as it changes. Each long-running process gets a snapshot of its own, and different processes may need different versions of the same object, which is why maintaining multiple versions matters.
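A minimal sketch of the idea, assuming an in-memory store in which every write commits immediately and receives the next value of a logical clock. The class `VersionedStore` and its methods are hypothetical names chosen for this example.

```python
import itertools
from typing import Dict, List, Optional, Tuple


class VersionedStore:
    """Append-only key-value store: writes add versions, they never overwrite."""

    def __init__(self) -> None:
        self._clock = itertools.count(1)  # logical commit timestamps: 1, 2, 3, ...
        self._last_commit = 0
        # key -> list of (commit_ts, value), oldest first; value None marks a delete
        self._versions: Dict[str, List[Tuple[int, Optional[str]]]] = {}

    def put(self, key: str, value: Optional[str]) -> int:
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        self._last_commit = ts
        return ts

    def delete(self, key: str) -> int:
        # A delete is just another version (a tombstone), so older readers are unaffected.
        return self.put(key, None)

    def snapshot(self) -> int:
        """A snapshot is simply the timestamp of the latest commit at start time."""
        return self._last_commit

    def get(self, key: str, snapshot_ts: int) -> Optional[str]:
        """Return the newest value committed at or before `snapshot_ts`."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot_ts:
                return value  # None if that version is a tombstone
        return None  # the key did not exist yet at `snapshot_ts`


store = VersionedStore()
store.put("balance", "100")
backup_snapshot = store.snapshot()  # a long-running backup starts here
store.put("balance", "250")         # later update
store.delete("balance")             # later delete

old_view = store.get("balance", backup_snapshot)   # the backup still reads "100"
new_view = store.get("balance", store.snapshot())  # a fresh reader sees the delete
```

Because a snapshot here is only a timestamp, giving each process its own snapshot costs almost nothing. The real cost is the storage held by old versions.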

Once the database determines that an old object value is no longer referenced by any process or transaction, that version can be safely discarded.
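One simple way to decide which versions are no longer needed is to look at the oldest snapshot still in use. The sketch below assumes that each object keeps a list of `(commit_ts, value)` pairs sorted oldest first, and that the database tracks the start timestamps of all active processes. The function name `prune` and this data layout are illustrative assumptions, and real systems use more refined policies.

```python
from typing import Iterable, List, Optional, Tuple

VersionList = List[Tuple[int, Optional[str]]]  # (commit_ts, value), oldest first


def prune(versions: VersionList, active_snapshots: Iterable[int]) -> VersionList:
    """Drop versions that no active snapshot can still read."""
    snapshots = list(active_snapshots)
    if not snapshots:
        # Nobody is reading the past: only the latest version is needed.
        return versions[-1:]

    oldest = min(snapshots)
    # Find the newest version committed at or before the oldest snapshot.
    # That version, and everything after it, may still be read.
    keep_from = 0
    for i, (ts, _) in enumerate(versions):
        if ts <= oldest:
            keep_from = i
    return versions[keep_from:]


history = [(1, "a"), (2, "b"), (5, "c"), (9, "d")]
kept_with_readers = prune(history, active_snapshots=[3, 7])
kept_without_readers = prune(history, active_snapshots=[])
```

With readers at timestamps 3 and 7, only the version committed at timestamp 1 is discarded. This policy is deliberately conservative: it may retain a version that falls between two snapshots and is read by neither.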


© 2022 Ambitious Systems. All Rights Reserved.