A data modeler joined a panel….

A data modeler joined a panel…sounds like the beginning of a good joke. But a couple of weeks ago I really did participate in a data management panel at Columbia University.

I arrived early and therefore sat in on the Artificial Intelligence (AI) panel that was scheduled right before mine. Many of the conversations hovered around topics such as deep learning, data mining, and machine learning. Robotics and predictive analytics were also discussed. Many of those in attendance were in the Advanced Analytics program at Columbia and were in, or would soon begin, the job search process.

Towards the end of the AI panel, the moderator asked for questions from the audience, and as can be expected, most of these questions focused on skills needed to be hired in the field of AI. There was a lot of talk in this area, but the theme kept coming back to one main skill: Communication. Experts on the panel emphasized that the easy part is learning tools such as TensorFlow. The hard part is knowing how to communicate with users, executives, and coworkers to get requirements and design a solution.

(As an aside, it’s interesting that “communication” was also the most important skill that came out of this past year’s Data Modeling Hackathon at DMZ.)

After the AI panel, I walked upstairs for my panel. I was the only data modeler on the panel :). Most of the topics focused on data quality or data governance. It was amazing, though, that I was able to address questions on data quality and governance purely using examples from my last few modeling assignments. For example, the moderator asked the panelists what factors lead to data disparity, and I was able to talk about the four factors I believe lead to multiple definitions for the same term: Scope, Time, Motive, and State.

I volunteered to participate in this panel because another theme from this past year’s Data Modeling Hackathon was educating university students on data modeling. I wasn’t sure how many students in this audience knew data modeling. However, I was shocked when the first question posed to the panelists was directed at me! A student asked, “How relevant is data modeling in the world of NoSQL?” I smiled and gave my answer.

How would you answer this question? (Remember, a majority of the audience were students who will be entering the workforce shortly.)

1 Comment

  1. Gordon Everest 4 years ago

    To the question “How relevant is data modeling in the world of NoSQL?” I give the following answer.
    The main purpose of data modeling is to understand the business, some application domain, some user’s world. The model becomes a representation of that world: the “things” in it, the relationships among those things, and any constraints on those things or relationships. A secondary purpose is to build a database to contain information which pertains to and describes that domain. Hence, the business data model should not be concerned with issues of physical stored representation, or the transformations/manipulations/constraints which are imposed to facilitate implementation in some data (storage) management system. That could be a relational DBMS, or a NoSQL tool. Generally we speak of the model coming first, then the implementation, and finally, the data gets collected and stored according to the model. However, increasingly the data already exists in some form. That leaves us with the task of figuring out what it means and what it represents; that is, understanding the data as it represents some user domain. NoSQL tools are often designed to deal with existing data and to process it more efficiently (that may be an oversimplification!). Either way, you must understand the business in order to make sense of the data.
    BOTTOM LINE: it is vitally important to understand the principles, methods, and approaches of data modeling which can help you get to an understanding of the data and the business domain it represents.
