The Key is “Just Be Natural”

A questioner asked how to select a natural key in a business data model. In this case, the business is modeling client organizations across international boundaries. He wrote:

In a business data model I was wondering about something like National Registration Number (varies by country), Country, and National Registration Number Qualifier (if there could be multiple registering organizations – which raises other considerations).

A natural key (as opposed to, say, a surrogate key) has the two-fold goal of being (1) unique, and (2) “naturally” memorable. A memorable attribute of a person or an organization is their name, but — alas — that is almost never guaranteed to be unique. In a business setting, where the business has control of the organization’s name, the name is a much better candidate. However, it’s not clear from the question whether the business has control of the name.

It is important to know that the “National Registration Number” is a surrogate, and not a natural key. At least this is so in the United States, where this is the “Federal Employer Identification Number” (EIN) or the “Taxpayer Identification Number” (TIN) — a number assigned by the Federal government but not indicative of the business itself. Like any surrogate key, the number is arbitrarily assigned and usually forgotten by the business itself. (Technically, the TIN is not guaranteed to be unique either, though in practicality it is.)

My point, I guess, is that the purpose of having a natural key is to simplify look-up — and typically you are simplifying the look-up of a surrogate key. Within the data model, all relationships are by the surrogate key, a system-defined identifier that is guaranteed to be unique (though not memorable, meaningful or derivable).

So a suitable natural key could be the name of the organization or an organizational identifier, coupled with some geographic qualifier (in the US, the country is not specific enough; many business names are only unique within a state).

But it’s important, I think, to keep in mind the reason why a natural key is desired: it is memorable (naturally) and it is unique. That only matters where the user enters into the data model — it doesn’t matter, and shouldn’t be used, to navigate relationships within the model.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: