Bimotics views on Cloud Data Warehouse, Big Data, Business Intelligence, Analytics, Google Cloud Platform and Google BigQuery.
Monday, March 7, 2016
So What is This Smart Data?
Over the last few months, I have noticed the term “smart data” popping up in sessions and blogs, mostly related to big data and data science. The content is full of claims about the value of using smart data to answer questions, but what exactly is it, and how does it differ from the regular data we collect every day? I did a little research, and here are some things that I have learned. It turns out that the concepts of data marts and analytics marts that Bimotics discussed earlier this year are very similar.
Cambridge Semantics, a company specializing in data and analysis, describes smart data as a set of data that has been collected in such a way that it can be optimized, discovered, integrated, visualized, and analyzed. Some technology vendors have described smart data as big data analytics, or analytics on top of big data.
Smart data is usually collected from many different sources and applications, and then aggregated into a single location. As it is brought in, the different data elements need to be organized and standardized so that they relate to one another in a way that represents reality. For example, an order has many product line items, and each product has a price. While this kind of data modeling is how all data for business intelligence starts out, there are nuances that eventually make this data “smart”.
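To make the order example concrete, here is a minimal sketch in Python of how records from two sources might be standardized and related to one another. Everything in it is my own illustration: the class names, the SKU format, the `standardize_sku` helper, and the sample rows are assumptions, not something taken from a particular product or vendor.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Product:
    sku: str
    name: str
    price_cents: int  # integer cents avoid floating-point rounding on money


@dataclass
class LineItem:
    product: Product
    quantity: int

    def subtotal_cents(self) -> int:
        return self.product.price_cents * self.quantity


@dataclass
class Order:
    order_id: str
    items: List[LineItem] = field(default_factory=list)

    def total_cents(self) -> int:
        return sum(item.subtotal_cents() for item in self.items)


def standardize_sku(raw: str) -> str:
    """Hypothetical rule: trim whitespace and upper-case so sources agree."""
    return raw.strip().upper()


def build_order(order_id: str, raw_rows: List[dict], catalog: Dict[str, Product]) -> Order:
    """Attach each raw row to its product after standardizing the key."""
    order = Order(order_id)
    for row in raw_rows:
        sku = standardize_sku(row["sku"])
        order.items.append(LineItem(catalog[sku], int(row["qty"])))
    return order


# Made-up sample data: two sources that format the same fields differently.
catalog = {
    "A-100": Product("A-100", "Widget", 250),
    "B-200": Product("B-200", "Gadget", 1000),
}
web_rows = [{"sku": " a-100 ", "qty": "2"}]
store_rows = [{"sku": "B-200", "qty": 1}]

order = build_order("order-1", web_rows + store_rows, catalog)
print(order.total_cents())
```

The point of the sketch is the standardization step: until the two sources agree on how a product is identified, the line items cannot be related to a price at all.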
To start, smart data is not just a collection of all the data related to a particular subject. Instead, the information and data elements are evaluated for significance, and only the ones that are highly related to the subject are kept in the set. This means that the smart data model needs to be flexible, so that data can be included and excluded as needed. I have seen this concept of smart data in articles related to customer insights and predictive analytics. Identifying and keeping only the most meaningful data points is at the heart of these kinds of analyses.
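One simple way to picture "evaluated for significance" is to score each data element against the subject and keep only the ones above a cutoff. The sketch below uses Pearson correlation as a stand-in for significance. That choice, the 0.5 threshold, the field names, and the sample numbers are all my assumptions for illustration; the articles I read do not prescribe a specific method.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_fields(
    columns: Dict[str, List[float]],
    target: List[float],
    threshold: float = 0.5,  # assumed cutoff, tune per subject
) -> Tuple[List[str], Dict[str, float]]:
    """Keep fields whose absolute correlation with the target meets the cutoff."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return kept, scores


# Made-up customer fields, scored against made-up spend figures.
columns = {
    "visits_last_30d": [1, 3, 5, 7, 9],
    "shoe_size": [9, 7, 10, 8, 9],
    "support_tickets": [5, 4, 3, 2, 1],
}
target_spend = [10, 30, 50, 70, 90]

kept, scores = select_fields(columns, target_spend)
print(kept)
```

Because the selection is just a function of the data and a threshold, fields can be included or excluded again whenever the subject or the data changes, which is the flexibility the model needs.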
The relationships between elements are defined and form a common set of terminology, so that these sets become more understandable. Maintaining the meaning of the data, and making sure that it has not changed from the original, is key and often complex. The effort pays off when the number of false positives declines, and when the data becomes more purposeful and can be used in many different ways to answer many different sets of questions.
While I am just beginning to understand the concept of smart data, I am eager to learn more as it grows into something as popular and common as big data.