IDs are meaningless to humans, and that is their precise utility


Last Updated on Jan 22, 2021

Data that is meaningful to humans tend to change over time. If you duplicate something with meaning, you will need to update all redundant copies of the data.

IDs have no meaning to humans - they only help uniquely identify information. IDs only make sense in a context - a table record, an API request, a computing machine. They point to information that can change in the future.

person = Persons()
person.ID = 153479919 # A typical incrementally generated ID
person.ID = "X80Qu2sBMcYBL_rB5lGJ" # A String ID
person.ID = "132d0f9d-01be-408a-b089-fc276d08ed31" # A Universally Unique ID

Identifiers remain the same, while what they point can change over time. A person identified by a unique ID can change everything about themselves, including their name, email address, and social security number, but they would remain the same individual. IDs help us preserve this link forever.

IDs group meaningful data under a single umbrella, and everybody else refers to the data group using the ID. Conversely, IDs provide a mechanism to gather the information stored in separate locations when the data structure is not simple enough to be held in one location.

In the world of relational databases, IDs that uniquely identify a record in a table are called Primary Keys.

CREATE TABLE Person (
ID int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Age int,
PRIMARY KEY (ID)
);

These keys can be sequential or randomly generated. A unique attribute that is guaranteed not to repeat can serve as a primary key, like an Email Address or Social Security Number. But it is preferable to go with randomly generated identifiers.

Even when a unique attribute is used as the primary key, systems can use a surrogate key (an arbitrary system-generated key unrelated to application data) to identify the record uniquely. If we consider User records as an example, each user may have a unique email address, which can serve as a primary key. But if we want to allow users to change their email address in the future, we have to treat it as a piece of data and not use it as a primary key.

We can represent constants as a special case of identifiers called Enumerated Types. The enumerator names are constants - a combination of uniqueness and identity - and not expected to change in the future. You can compare an enumerator's values against each other, and the system (typically the programming language) decides how to store them in memory for optimization purposes. But since they are identifiers, it is possible to discover their usage across the codebase.

from enum import Enum
class Color(Enum):
RED = 1
GREEN = 2
BLUE = 3

A Boolean value is a good example of a enumerated type, with TRUE and FALSE represented as constants.

class Truthy(Enum):
TRUE = 1
FALSE = 0
>>> is_active = Truthy.TRUE
>>> is_active
<Truthy.TRUE: 1>

© 2022 Ambitious Systems. All Rights Reserved.