Modeling for Intelligence: Conceptual, Logical, Physical

It’s a fact: intelligence in big business, big data, and big analytics begins with data modelers. And understand this: we data modelers are a special breed. We LIKE translating business requirements into conceptual, logical, and physical data models. We enjoy analytical work with tools like Entity Relationship Diagrams (ERDs), a valuable format for describing the data in a business domain. Really? Yes, an ERD is cool; it uses “entities” to define business terminology (or objects), relationships to connect them, and attributes to capture all of the data points. And there are many ways a data modeler can work with the various data types to create models for actionable intelligence. Let’s talk about ERDs.

First, there is the quintessential Conceptual Data Model; it shows the business terminology that business leaders use every day. This model is usually displayed on a single page and can be understood and appreciated very quickly. Because of this, it usually shows only the highest-level terms, omitting lower-level detail. This model does not have any attributes associated with the entities (or nouns), but it does show the relationships between those entities. Such objects as “Sales Order,” “Customer,” and “Payment” may be seen here. This model can be used to outline the business quickly and to vet the overall correctness of the model. This is also a good model to show to people who are not technically oriented but know the business well. When the business changes, the changes captured here should be carried into the Logical Data Model. Key point: This model does not reveal any business rules; it shows only connected high-level business concepts.
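To make that concrete, here is a minimal Python sketch of a conceptual model: entity names only, no attributes, plus the relationships between them. The three entities come from the example above; the relationship verbs (“places,” “settles”) and the helper names are assumptions made for illustration, not part of any particular modeling tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A named link between two high-level business concepts."""
    source: str
    verb: str
    target: str


# Entities are just business nouns at this level: no attributes, no keys.
ENTITIES = {"Customer", "Sales Order", "Payment"}

# Hypothetical relationships; the verbs are illustrative assumptions.
RELATIONSHIPS = [
    Relationship("Customer", "places", "Sales Order"),
    Relationship("Payment", "settles", "Sales Order"),
]


def dangling(entities, relationships):
    """Return relationships that mention an entity missing from the model."""
    return [
        r for r in relationships
        if r.source not in entities or r.target not in entities
    ]


def describe(relationships):
    """Render each relationship as a sentence a business reader can vet."""
    return [f"{r.source} {r.verb} {r.target}" for r in relationships]
```

Reading the output of `describe(RELATIONSHIPS)` aloud to a business expert is roughly what vetting a conceptual model amounts to: if a sentence sounds wrong, the model is wrong.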

Once the conceptual model is validated, the Logical Data Model is the next step. A logical data model is an explosion of the conceptual data model into all of the entities (many, many more than in the conceptual model), attributes, and relationships that completely define a business. This really gets into the detail of the business, where the “Sales Order” entity may be attributed with “Sales Order Identifier,” “Customer Identifier,” “Total Sales Amount,” and many more. This model is the base or “master” model from which all other models are derived. It is important to maintain this model continuously: as the business changes, the logical model should be updated to reflect those changes. These changes are then carried, as needed, into the physical data model. Key point: This model shows all the detailed business rules via relationships, which is one reason it is so valuable to maintain.
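A small Python sketch can show what the logical level adds: attributes, identifiers, and relationships that carry business rules. The “Sales Order” attributes are the ones named above; the `Entity` and `Relationship` classes, the `mandatory` flag, and the wording of the rule are assumptions for illustration, and the attribute lists are deliberately trimmed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    identifier: str      # the attribute that uniquely identifies an instance
    attributes: tuple    # all attributes, including the identifier


@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    via: str             # attribute in the child that refers to the parent
    mandatory: bool      # True: every child instance needs a parent


CUSTOMER = Entity("Customer", "Customer Identifier", ("Customer Identifier",))
SALES_ORDER = Entity(
    "Sales Order",
    "Sales Order Identifier",
    ("Sales Order Identifier", "Customer Identifier", "Total Sales Amount"),
)
MODEL = {e.name: e for e in (CUSTOMER, SALES_ORDER)}

# Hypothetical rule: an order cannot exist without a customer.
PLACES = Relationship("Customer", "Sales Order", "Customer Identifier", True)


def rule_text(rel):
    """Spell out the business rule that the relationship encodes."""
    verb = "must" if rel.mandatory else "may"
    return f"Each {rel.child} {verb} belong to exactly one {rel.parent}"


def is_resolvable(rel, entities):
    """Check that the linking attribute is the parent's identifier
    and is actually present on the child entity."""
    parent, child = entities[rel.parent], entities[rel.child]
    return rel.via == parent.identifier and rel.via in child.attributes
```

Here `rule_text(PLACES)` yields a sentence such as “Each Sales Order must belong to exactly one Customer,” which is the kind of rule the conceptual model leaves out and the logical model records.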

Finally, the Physical Data Model is the logical model made ready for use on a relational database management system (RDBMS). It is altered in ways that may improve access speed, data update times, user access, and data storage efficiency, among other things. This stage requires a very good knowledge of the intended use by business users across the entire enterprise and also requires an extensive knowledge of the RDBMS and its capabilities. The physical data model should also change whenever the logical model changes (when the business terminology changes, or when other important changes need to be made in the business). Key point: This model often loses many business rules, which must then be transferred to external processes surrounding the maintenance of the RDBMS. As the maintenance of the RDBMS changes, the Logical Data Model must be consulted to review and validate that the business rules are being properly applied to the maintenance and utilization of the RDBMS.
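The trade-off in that key point can be sketched in a few lines of Python. This example assumes SQLite (via the standard `sqlite3` module) as a stand-in for whatever RDBMS is actually in use, and the table, column, and index names are hypothetical. The physical design adds an index for access speed and deliberately declares no foreign key, so the logical rule “each Sales Order must belong to a Customer” has to be checked by an external process.

```python
import sqlite3

# Physical design choices (all illustrative assumptions):
# - short physical column names instead of business names
# - an index on customer_id to speed up lookups by customer
# - no FOREIGN KEY on sales_order.customer_id, so the database
#   itself will not enforce the logical rule
DDL = """
CREATE TABLE customer (
    customer_id     INTEGER PRIMARY KEY
);
CREATE TABLE sales_order (
    sales_order_id  INTEGER PRIMARY KEY,
    customer_id     INTEGER NOT NULL,
    total_sales_amt NUMERIC NOT NULL
);
CREATE INDEX idx_sales_order_customer ON sales_order (customer_id);
"""


def orphan_orders(conn):
    """External check for the rule the physical model no longer enforces:
    return the ids of orders whose customer does not exist."""
    rows = conn.execute(
        """
        SELECT o.sales_order_id
        FROM sales_order AS o
        LEFT JOIN customer AS c ON c.customer_id = o.customer_id
        WHERE c.customer_id IS NULL
        ORDER BY o.sales_order_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO customer VALUES (1)")
conn.executemany(
    "INSERT INTO sales_order VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 2, 40.0)],  # order 101 points at a missing customer
)
conn.commit()
```

Calling `orphan_orders(conn)` flags order 101. The database accepted the row without complaint, which is why the logical model has to stay on hand as the reference for what the surrounding processes must verify.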

One of the advantages to using these models is that, because they are validated in sequence, changes are fairly rare and specific to the business needs. This means the models are relatively stable over time.

Stable models are a good thing.

The really good ones could be called supermodels. The most attractive models solve the most complex problems with stability and elegance. I won’t digress on the differences between a supermodel on a runway at Saks versus a supermodel in a Saks Teradata Data Warehouse.

The good ones begin, however, with this three-tiered approach, which provides the “best practice” pathway for creating and managing data models.

Mark Crosby is a Senior Product Manager with Architecture and Modeling Solutions and is responsible for development of the Teradata Transportation and Logistics Data Model. Mark has been with Teradata since 1989 and has over 45 years’ experience in software development, data center management, database administration, and data modeling, having also worked for the U.S. Department of Defense, the American Bankers Association, and consulting firms.
