When we create a one-to-many relationship between two entities, we copy the primary key from the entity on the one side (the parent entity) over as a foreign key to the entity on the many side (the child entity). We traditionally copy over all of the metadata associated with the primary key such as name, format, and definition. The one exception where a foreign key can have a different name than its primary key is when there is more than one relationship from the same entity, and to avoid having two or more data elements with the same name in the same entity, we “role name”, meaning giving the foreign key a different name than its primary key.
So for example in the data model below, in Employee we have a foreign key back to Employee Type which has the same name as its primary key (Employee Type Code), and we have two foreign keys in Customer that point back to Employee. These two foreign keys are role-named to avoid having Employee ID twice in the Customer entity and to provide additional meaning as to what the foreign key represents. Primary Contact Employee ID points back to the Employee who is the primary contact for this Customer, and Initial Contact Employee ID points back to the Employee who initially made contact with this Customer.
In this data model, the definition for Employee Type Code is:
Employee Type Code is a numeric value assigned to each organization-wide understood category for an Employee. These codes have business significance and are for Human Resources internal use only. Examples:
01 = Full time
02 = Part time
03 = Retired
And the definition for Employee ID is:
Employee ID is the unique, mandatory, and stable business key for each Employee. It is assigned by human resources and used throughout the organization. Example:
Bob Jones is assigned the Employee ID 123-AB-872123
Definitions copy over too from primary key to foreign key, so the Employee Type Code foreign key in Employee has the same definition as Employee Type Code in Employee Type, and Primary Contact Employee ID and Initial Contact Employee ID have the same definition as Employee ID in Employee.
Should we at times modify the foreign key definitions so they are more relevant to the relationship they represent? After all, if I just see the Employee ID definition in Primary Contact Employee ID it is not very descriptive.
What guidelines would you apply in deciding whether a foreign key should have a different definition than its primary key?
Whenever we role name a foreign key, whether it is because of having two or more relationships from the same entity or we role name for another reason such as to better describe the foreign key, we need to revisit the foreign key’s definition. Sometimes we replace the default primary key definition with a completely new definition, and sometimes it requires just adding to the existing primary key definition to make it more precise and context-dependent.
So we should modify the foreign key definition when we modify the foreign key name. In addition, there are times when we should modify the foreign key definition even when the name does not change. As Walt Cook, DW/BI Section Chief, says, we should modify the foreign key definition “…when the definition migrated with the foreign attribute does not clearly describe what the attribute values mean for the child entity type to which the foreign attribute belongs.”
Here are some comments on redefining foreign keys that have been role-named:
- Liz Fanning, Senior Data Architect: The foreign key definition should always be contextually accurate, so we often change the automatically migrated definition so its meaning is defined from the perspective of the table in which it resides.
- Brad Lindsey, Business Intelligence Engineer: …the decision goes back to the same decision to use more than one role for the primary key in the child table. If the business rules call for using roles in the child table it would seem to be consistent to also give a specific definition to the roles.
- Gary Whitney, Information Architect: The two foreign keys from Employee to Customer should each have their definitions in the Customer entity modified to describe the Role that they play in the relationship.
- Rich Kier, Data Architect: …in my opinion, the definition in the child should differ more often than not or any time there is the potential for multiple relationships.
- Craig Mangum, Data Architect: I always change the definition if the foreign key has been role named.
- Kevin Heinsey, Data Architect: At a minimum, it is useful to have the same definition cascade from the parent to the child. But this is not ideal. There is almost always benefit in describing the role that the foreign key plays within the new record. But when there is more than one foreign key to the same parent, it is absolutely necessary to change the definition. Relying on the unique column name is simply not good enough.
- Fred Cohen, Data Governance: If you “need” to change the name in the foreign key to clarify meaning, then the definition should be different from that of the primary key to clarify the definition.
- Mike Harris, retired data modeler: The definition should be different if the interpretation of the elements meaning is different. The “initial contact employee id” is NOT simply an employee ID but that ID that belong[ s | ed ] to the employee who initially made contact with the customer.
- Joyce Norris-Montanari, Consultant: I would change the definitions when I changed the rolename.
- Vicky Li, Data Modeler: When a role name is used, the definition definitely needs to be updated. For example, Primary Contact Employee ID and Initial Contact Employee ID should have different definitions. But in the case of Employee Type Code, I think it’s OK to keep the same definition as in its parent table.
A number of practitioners agree when updating the foreign key definition, you can build it upon the existing primary key definition:
- Ray Doggendorf, Database Architect: Whenever role naming a column it is usually best to do two things. 1. Retain the original column definition wherever it currently exists. 2. Prefix the original column definition with additional information that explains the role wherever that new role-named column will exist.
- Michael Brackett, Consulting Data Architect: The data definition solution is to inherit the existing definitions and further qualify them, but not repeat the existing definitions.
- Jane Chatterley, Senior Consultant, Arch CoE: My guideline is that context should be *added to* foreign key definitions – leave the inherited definition in place and then append “In this context……”
- Kenneth Hansen, Data Architect: The Inherited Definition should always be elaborated to identify the context.
A number of practitioners always modify the foreign key definition, even if role naming is not performed. Let’s hear from some of these folks:
- Monte Montesano, Data Modeler: Very timely challenge. We are just examining this topic for our standards. My personal inclination is to always provide an enhanced definition for foreign key attributes. Rationale: the original definition applies to the attribute as a primary key, describing the attributes use as an identifier. In the child table, the use is always different, identifying a characteristic that applies to the child row. For example, I would enhance even the Employee Type Code definition within the Employee table to be something like “Employee Type Code identifies the organization-wide category for this Employee.”
- Graham Witt, Associate Director: I never use exactly the same definition for a foreign key as for the corresponding primary key as the meanings are quite different when context is taken into account.
- Thijs van der Feltz, Information Architect: Even if there is a single relationship between the same two entities, I believe that a foreign key should always have a definition that describes its relationship to the parent entity e.g. “The ID of the EMPLOYEE that is the primary contact of a CUSTOMER”.
- Dave Hay, Guru: Since the foreign key is, in fact, the implementation of a relationship, the definition should mostly be about the relationship. Thus, the “Primary Contact Employee” should be something like “the unique identifier of the Employee who is the most important person for the customer to call or visit if there is a problem.”
- Madhu Sumkarpalli, Business Intelligence Consultant: One should modify the foreign key definition so that the foreign key has meaningful information within the context of the child table. As one of the biggest consumers of data models, I build logical models in MicroStrategy (BI tool) based on the information provided in the data model (both LDM and PDM) and business requirements, in addition to, many other details available. Many times, we fall into a trap of not fully understanding the meaning of a foreign key if it has the same definition as that of the primary key in the parent table.
- Mary Komorowski, Enterprise Data Modeler: Yes, we should modify foreign key definitions so that they are more relevant to the relationship they represent. In fact, I believe all definitions for attributes within an entity should have some context within them (except perhaps those that are obvious, such as last modified date).
As part of your corporate standards policy, have a best practice on when to change a foreign key definition. At a minimum, update the foreign key’s definition when the foreign key’s name changes (role-naming). Optionally, you may find it valuable to update every foreign key definition, especially when the foreign key means something different as a foreign key than its originating primary key. Until the next challenge!