The 8 States of Big Data
September 14, 2012 by Troy Blackman
What is the state of your data? Does it bring value?
The buzz in the industry is “Big Data”. What do you do with it? How do you manage it? What is its value? Whether you plan to use the data in a batch environment, in real time, or both, decide that ahead of time to make the most of your data management process. To understand where the value in your data lies and how far along you are in deriving it, use the grid below to assess your situation and decide what to do next.
8 States of Data
| State of Data | Value ($ – $$$$$) | Definition | What to do next? |
| --- | --- | --- | --- |
| Raw | | Virtually unusable: too much to look at in too many places. This is unprocessed, unfiltered data, and it is the condition of data at most companies. | Prepare the data in a logical format to allow matching and analysis. This includes standardization, hygiene and verification of name, address, phone and email. Data fields are also reviewed for validity, population rates and logical divisions. |
| Cleansed | $ | Standardization, hygiene, verification and validation have been performed, so the data has logical divisions and meaningful matching elements. | The data likely comes from several systems or sources at this point, so key matching elements need to be determined to bring the data together into a single view. |
| Consolidated | $$ | Your internal data is merged from silos into one view in a database. | Your consolidated data gives you a great view of your customers’ interactions with you. Now you need to add “outside” or third-party data to gain perspective. |
| Matched | $$$ | Third-party data is matched to the file to bring additional meaning and insight. | Develop meaningful buckets and categories to put the data into understandable groups. |
| Summarized | $$$ | The data is prepared for reporting. | Now you need to develop and disseminate the reports. |
| Distributed | $$$$ | Performance metrics are disseminated to stakeholders. | Time for analysis and prediction of future behavior. |
| Analyzed | $$$$$ | Meaningful insights into what has happened within your file. | Make marketing decisions and plans! |
| Prediction | $$$$$ | The data is used to predict the outcome of events such as response, attrition or upsell, or to predict results such as profit or revenue. | Go do better marketing! |
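To make the first few states concrete, here is a minimal Python sketch of moving a handful of records from Raw to Summarized. Everything in it is an assumption made for illustration: the sample records, the field names, the choice of email as the matching key, the simple email pattern and the 100.0 spend cutoff are all hypothetical, and a real pipeline would use far more thorough hygiene and matching rules.

```python
import re
from collections import defaultdict

# Hypothetical raw records from two internal silos (illustrative data only).
CRM = [
    {"name": "  jane DOE ", "email": "Jane.Doe@Example.com ", "phone": "(555) 010-1234"},
    {"name": "Sam Lee", "email": "not-an-email", "phone": "555.010.9999"},
]
ORDERS = [
    {"email": "jane.doe@example.com", "amount": 120.0},
    {"email": "jane.doe@example.com", "amount": 80.0},
]
# Hypothetical third-party data, keyed by email.
THIRD_PARTY = {"jane.doe@example.com": {"region": "West"}}

# Deliberately simple pattern; an assumption, not a full email validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def cleanse(record):
    """Raw -> Cleansed: standardize fields and blank out invalid emails."""
    email = record["email"].strip().lower()
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": email if EMAIL_RE.match(email) else None,
        "phone": re.sub(r"\D", "", record["phone"]),  # keep digits only
    }


def consolidate(customers, orders):
    """Cleansed -> Consolidated: merge silos into one view, keyed on email."""
    view = {c["email"]: {**c, "total_spend": 0.0} for c in customers if c["email"]}
    for order in orders:
        key = order["email"].strip().lower()
        if key in view:
            view[key]["total_spend"] += order["amount"]
    return view


def match_third_party(view, third_party):
    """Consolidated -> Matched: append outside attributes where the key matches."""
    return {k: {**v, **third_party.get(k, {})} for k, v in view.items()}


def summarize(view):
    """Matched -> Summarized: bucket customers into reportable groups."""
    buckets = defaultdict(int)
    for row in view.values():
        # The 100.0 cutoff is an arbitrary illustrative threshold.
        buckets["high" if row["total_spend"] >= 100.0 else "low"] += 1
    return dict(buckets)


cleansed = [cleanse(r) for r in CRM]
matched = match_third_party(consolidate(cleansed, ORDERS), THIRD_PARTY)
summary = summarize(matched)
```

Note that the record with the invalid email drops out at the consolidation step because it has no usable matching key, which is one reason the cleansing state has to come first.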
The cost of getting to the ‘Distributed’, ‘Analyzed’ and ‘Prediction’ states is relative to the scope, complexity and control of the data, with control being the key piece of the cost equation. If you outsource or use existing structures, your cost of control goes down; if you want to own the process and the data, the cost goes up with the hardware, software and resources needed to pull it off. Generally, all but the largest companies rely on external support.


