Create a logical data model?

Create a logical data model?

You are a data modeler creating the design for a non-relational database. “Non-relational” means the underlying database does not follow relational database theory – for instance the database could be hierarchical or object oriented. It is even possible that the underlying structure will be a set of files such as XML files.  You are deciding whether you should create a normalized logical data model or just jump straight into a physical design? What would you do and why? Is there value to you and your organization in creating a logical normalized design even though it will be implemented as a very different set of structures?

Feel free in your response to include any types of assumptions you feel are important around architecture and design. Thanks to Sue Hannigan for providing the idea behind this scenario.


Should we always first create a normalized logical data model, regardless of the planned database platform? The answer to this question from all our respondents was “Yes!” Below are the major reasons why followed by the two main motivating factors against creating the normalized logical data model.

Creating a normalized logical data model regardless of the planned database platform provides us with:

1. Understanding of the business

Anytime we normalize it “forces” us as modelers to understand how the data elements relate to each other. For example, how does Customer Identifier relate to Account Code? Can a Customer have more than one Account? This gives us more knowledge of the business hence we will make better design decisions for this project and future projects. It helps us make sure we’ve completely and correctly captured the data elements and business rules.

Ben Ettlinger contributed:

Even if the shop only uses the non relational database, the logical model would play an important role in finding out and then representing business rules. Aside from the business rules that are inherent to the model, relationships, super types, subtypes etc., during the building of a logical model many important business rules are captured in the description, comments and notes.

Frank Palmeri contributed:

Yes I would certainly create a normalized logical data model, no matter what the physical database will eventually be. That’s the whole point of the analysis; you will be able to make sure by creating the logical model that you have all the necessary and required entities, at the very least.

Bill Lewis contributed:

Logical data models represent business data requirements, and these requirements are independent of the implementation technology (or technologies) that may be chosen at any point in time.  The process of logically modeling business data requirements removes ambiguity and greatly decreases the risk of the implementation not meeting the business requirements.

2. Consistency and validation

Comparing two or more normalized models makes it easy to see the areas that overlap. This can help in creating consistent naming standards and reusing data element names where appropriate. It can also help us spot gaps or errors on our model. For example, if normalizing our design leads us to a Customer Type entity, we might notice a similar entity in another model with an additional data element we might need. Or we might notice the description field in this similar entity is 30 characters while ours is only 20. Normalization creates more consistency in our corporations as well as provides a validation mechanism.

Ben Ettlinger contributed:

…the logical model makes sense in order to determine whether or not the meta data in this non relational database is the same as existing meta data in the shop. Entities could be match to entities, attribute to attribute, and that would lead to consistency of data types.

3. Flexibility

One of our primary goals as data modelers is to build flexibility into our designs. This also includes flexibility in physical database platforms. For example, what if Phase I of a design is in a non-relational database package while a future phase is built on a relational database?

Ben Ettlinger contributed:

…if the de-normalization will be documented in some fashion it will be possible to work backwards from the non relational to the relational model. The logical model will then also play an important role later on, if there is a need to convert the non relational data base into a relational data base. Along those lines, a number of years ago we created a logical model of an existing billing system written in IDMS (A network data base). The logical model has become an important factor in a new effort to convert the 15 year old main frame system to a more up- to – date platform. Even if we would not have the upcoming conversion the model has been more than worth the effort.

4. Easier to support and maintain

If a picture exists (even if it is only a logical and not a physical model), if becomes easier to educate new people on a team and help bring them up to speed faster. A project I am working on now is fairly complicated, and I believe what would help clear up many of our questions and decrease support activities would be a logical data model showing the components of this complex system and how they relate to each other. A model was never originally created for this project because the original database was not relational. One of my first activities on this project is to “reverse engineer” the design. That is, take the physical database and create a logical model from it so that it becomes clear what content and business rules exist in the application.

5. Time and Money – strong motivations for not creating a fully normalized data model

Even with strong reasons for normalizing the design, we must face the facts that normalizing takes more effort and therefore more time and money. A project team with tight deliverables may not be most willing audience to dedicate extra time for benefits that may not be immediate. For example, flexibility may not be something identified as a high priority in meeting next week’s programming deadline. Many of the most passionate discussions I have had are focused on the value of the normalized design. It makes it a slightly more difficult “sell” if the underlying database for at least the first phase of the project is going to be non-relational.

Bill Lewis contributed:

In this scenario, there is danger in using the target implementation technology as an excuse to not do a good job of data design.  And even if a good logical model is created, there may likely be a tendency to effectively throw out the logical design at implementation time in favor of gratuitous denormalization, again using the target technology as an excuse, e.g., “only relational designs need to be normalized”.  Beware!”


I’d like to conclude this design challenge with a short homework assignment. What are some of the techniques or tactics we can use to convince or educate folks that the benefits of normalization outweigh the near term additional Time and Money investment? Wait, don’t answer just yet! Stayed tuned for Design Challenge #3!