Using Dark Data



What is Dark Data?

Information technology has changed how businesses operate. Data-centric solutions and predictive analytics are two significant trends in how businesses implement processes and make decisions.

The critical element of any data-centric or data-driven solution is data: the data that accumulates in data lakes, collected by the many processes a business runs, and the existing data that the business has not tapped for one reason or another. We call this untapped data dark data.

Gartner has defined dark data in its IT Glossary as follows:

“Dark data is the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships, and direct monetizing). Similar to dark matter in physics, dark data often comprises most organizations’ universe of information assets. Thus, organizations often retain dark data for compliance purposes only. Storing and securing data, typically, incurs more expense (and sometimes greater risk) than value.” [Gartner no date]

Many other definitions of dark data are in circulation, but I prefer to define it by the features of the data, which gives a more precise sense of it:

1 – Dark data exists in the business operation but is not necessarily collected, and it is not in a format the business can easily use.

2 – The business might collect dark data, but it is not classified or structured in a way that lets the business run queries on it. It just sits in the data lake and keeps growing.

3 – Dark data can be a large collection of smaller datasets that the business has collected but cannot exploit, because no relations have been established between them.

4 – Dark data is a pile of text, tables, images, and videos whose worth is difficult to assess, and it is usually poorly protected in operation.

I could list more technical features of dark data, but I think you get the picture and the definition. Based on the features above, the data in our ERP or any legacy business operation software is not considered dark data, unless we encounter old archived data when migrating from one system to another.
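To make the third feature more concrete, here is a minimal Python sketch of establishing a relation between two small datasets through a shared key. The dataset names, fields, and the `link_by_key` helper are hypothetical and chosen only for illustration; real dark data rarely shares such a clean key.

```python
# Hypothetical datasets: names and fields are assumptions for illustration.
sensor_readings = [
    {"machine_id": "m1", "temperature": 71.5},
    {"machine_id": "m2", "temperature": 64.0},
    {"machine_id": "m1", "temperature": 73.0},
    {"machine_id": "m9", "temperature": 50.0},  # no matching registry entry
]
equipment_registry = [
    {"machine_id": "m1", "line": "packaging"},
    {"machine_id": "m2", "line": "bottling"},
]


def link_by_key(left, right, key):
    """Attach the fields of the matching `right` row to each `left` row.

    Rows in `left` without a match in `right` are skipped.
    """
    index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            linked.append({**row, **match})
    return linked


linked = link_by_key(sensor_readings, equipment_registry, "machine_id")
for row in linked:
    print(row)
```

Once the two datasets are linked, a question such as "which production line runs hottest" becomes an ordinary query instead of a manual search through separate files.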

 

 

The usage of Dark Data

Currently, a business may pay the cost of collecting and maintaining dark data without using it to its benefit, and it is certainly not monetizing it.

What a business needs from the data is predictive analytics reports whose outcomes feed into the decision process. To achieve its analytics goals, the business needs to shape the data so that analytical tools can use it and run the necessary queries.

Extracting, transforming, and loading dark data is not an easy task, and it becomes harder in a distributed system with large data collections in each business node. There are technological considerations in validating the data, storing it, and making sure it complies with the data model needed for the analytical tasks. Once the data is trustworthy, we can use it and monetize it in operation.
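As a rough illustration of the validation step, the following Python sketch checks raw records against a target data model before they are loaded. The model, the field names, and the helper functions are assumptions made for this example, not part of any particular product.

```python
from datetime import datetime


def non_empty_text(value):
    """Accept only non-empty strings (hypothetical rule for this sketch)."""
    if not isinstance(value, str) or not value.strip():
        raise ValueError("expected non-empty text")
    return value.strip()


# Hypothetical target data model: field name -> converter that raises on bad input.
DATA_MODEL = {
    "device_id": non_empty_text,
    "timestamp": datetime.fromisoformat,
    "value": float,
}


def validate(record):
    """Return a cleaned copy of the record, or None if it does not fit the model."""
    cleaned = {}
    for field, convert in DATA_MODEL.items():
        if field not in record:
            return None
        try:
            cleaned[field] = convert(record[field])
        except (TypeError, ValueError):
            return None
    return cleaned


def split_records(raw_records):
    """Separate records that comply with the model from those that do not."""
    accepted, rejected = [], []
    for record in raw_records:
        cleaned = validate(record)
        if cleaned is None:
            rejected.append(record)
        else:
            accepted.append(cleaned)
    return accepted, rejected


raw = [
    {"device_id": "pump-1", "timestamp": "2020-03-01T10:00:00", "value": "3.5"},
    {"device_id": "pump-2", "timestamp": "not a date", "value": "1.0"},
    {"device_id": "pump-3", "value": "2.0"},  # missing timestamp
]
accepted, rejected = split_records(raw)
print(len(accepted), "accepted,", len(rejected), "rejected")
```

Keeping the rejected records, rather than silently dropping them, is a deliberate choice in this sketch: it lets the business see how much of its dark data is not yet trustworthy.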

Many use cases exploit dark data in operation, based on the information that business processes generate and gather.


In hospitality, the information gathered on the web through digital marketing activities can be used for predictive analytics. On hotel premises, the data collected through WiFi can be analyzed for guest behavior and facility usage, helping the operation improve the guest experience in the hotel.

In logistics and warehousing, one use case of dark data analytics is optimizing equipment usage and improving resource allocation in operation. Improved load rates and optimized equipment maintenance are direct benefits of dark data monetization.
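As a small sketch of what equipment usage analytics could look like, the Python snippet below computes a utilization rate per piece of equipment from a usage log. The log format, the equipment names, and the eight-hour shift length are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical usage log: one (equipment_id, hours_in_use) entry per shift.
usage_log = [
    ("forklift-A", 6.0),
    ("forklift-A", 2.0),
    ("forklift-B", 1.0),
    ("forklift-B", 3.0),
]
SHIFT_HOURS = 8.0  # assumed shift length


def utilization(log, shift_hours):
    """Return hours in use divided by hours available, per equipment id."""
    hours_used = defaultdict(float)
    shifts_seen = defaultdict(int)
    for equipment_id, used in log:
        hours_used[equipment_id] += used
        shifts_seen[equipment_id] += 1
    return {
        equipment_id: hours_used[equipment_id] / (shifts_seen[equipment_id] * shift_hours)
        for equipment_id in hours_used
    }


rates = utilization(usage_log, SHIFT_HOURS)
for equipment_id, rate in sorted(rates.items()):
    print(f"{equipment_id}: {rate:.0%}")
```

A persistently low rate for one machine next to a high rate for another is the kind of signal that can guide resource allocation and maintenance planning.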

In industry, collecting information from different processes and machinery helps to build a better picture of the whole production process. Many industrial practices can benefit from dark data, and a data overhaul can improve the entire business.

Retail operations have processes similar to those in hospitality, dealing with marketing and customer experience, as well as logistics processes that optimize the supply chain and distribution. The two are closely related, and improving one helps to enhance the other.

The cost of Dark Data

Technology costs have so far been a barrier for businesses that want to benefit from dark data. With the growth of cloud computing, the Internet of Things (IoT), and telecommunications, especially 5G, the cost of dealing with dark data has dropped.

Solutions such as Datumize focus on dark data exploitation in specific business verticals and provide data analytics based on the requirements of each business model.

Another crucial aspect of dealing with dark data is that the business usually does not need to invest in new hardware. By definition, the information has already been generated and exists, so the job of a dark data solution is to extract, transform, and load the data into the analytics solution and to provide analytical reports and visualizations to the business operator.

The goal of dark data analytics is to give operators more information, so that they have better analytics and make better data-driven decisions.

Roozbeh Salehi

Experienced manager with a demonstrated history of working in the information technology and services industry. Skilled in IT solutions for Retail and Hospitality businesses, with 30 years of professional experience in different aspects of the IT industry. A strong business development professional with a Bachelor of Engineering (BEng) major in Computer Engineering - Hardware from Shiraz University. Enthusiastic about AI and Machine Learning, with a vast knowledge of Data Science.
