Dark data

From Infogalactic: the planetary knowledge core

Dark data is data that is acquired through various operational sources, some of them nefarious, but is not used in any manner to derive insights or for decision making.[1][2] In an industrial context, dark data can include information gathered by sensors and telematics.[3] An organisation's ability to collect data can exceed the rate at which it can analyse that data, and in some cases the organisation may not even be aware that the data is being collected.[4] According to IBM, "about 90 percent of data generated by most sensors and A-to-D conversions on the market never gets utilised".[5]

The term "dark data" appears to have been first used and defined by the consulting company Gartner.[6]

Reasons for retaining

Organisations retain dark data for many reasons. Often it is stored for regulatory compliance and record keeping.[1] Some organisations believe that dark data could be useful to them in the future, once they have acquired better analytic and business intelligence (BI) technology to process the information.[4][6][7] Because storage is inexpensive, retaining data is easy. However, storing and securing the data usually entails greater expense (or even risk) than the potential return.[1] Data-informed attributes the retention of data to being "culturally accustomed to saving everything, in the belief that all data has some value".[7]

Analysis

[Image: IBM Watson, which could be used for future dark data analysis.]

Much dark data is unstructured, meaning that the information is in formats that are difficult to categorise, to read by computer, and thus to analyse. Businesses often do not analyse their dark data because of the resources it would require and the difficulty of the analysis. According to ComputerWeekly.com, 60% of organisations believe that their "BI (business intelligence) reporting capability" is "inadequate" and 65% say that they have "somewhat disorganised content management approaches".[8]

Many companies in the IT sector are looking at creating "cognitive computer systems" that are able to analyse unstructured dark data. IBM Watson is considered a future system that could analyse this unstructured data and produce meaningful results from large amounts of dark data that are at present practically impossible or very difficult to process.[9] Among current systems, IBM have advertised the IBM Spark as a system that "can extract insight from that information almost immediately. This enables businesses to build data rich products and services that use that information to transform the customer experience." They also give an even broader definition of dark data, one that includes data that is not currently processed by computing systems but could be, for example in law.[10]

Relevance

Useful data may become dark data when it is not processed fast enough and so loses its relevance. Such data is said to carry "perishable insights" in "live flowing data".[7] For example, if a business knows a customer's geolocation, it can make an offer based on that location; if the data is not processed immediately, however, it may be irrelevant in the future.[7] According to IBM, about 60 percent of data loses its value immediately.[5] Failing to analyse data immediately and letting it go "dark" can lead to significant losses for an organisation, for example when fraud is not identified fast enough and the issue is addressed only when it is too late.[11]
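The geolocation example can be illustrated with a minimal Python sketch that separates location events still fresh enough to act on from those that have already "perished". The `LocationEvent` type, the function names and the 15-minute window are all hypothetical choices made for illustration; none of them comes from the cited sources.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical freshness window: how long a location reading stays useful.
FRESHNESS_WINDOW = timedelta(minutes=15)


@dataclass
class LocationEvent:
    """A single (illustrative) geolocation reading for one customer."""
    customer_id: str
    latitude: float
    longitude: float
    recorded_at: datetime


def is_actionable(event, now, window=FRESHNESS_WINDOW):
    """Return True if the event is recent enough to base an offer on."""
    age = now - event.recorded_at
    # Events stamped in the future are treated as not actionable.
    return timedelta(0) <= age <= window


def split_by_freshness(events, now, window=FRESHNESS_WINDOW):
    """Partition events into (fresh, stale) lists, preserving input order."""
    fresh = [e for e in events if is_actionable(e, now, window)]
    stale = [e for e in events if not is_actionable(e, now, window)]
    return fresh, stale
```

In this sketch the stale list is what would go "dark" if nothing else were done with it. A real system would choose the window per use case.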

Storage

According to the New York Times, 90% of energy used by data centres is wasted.[12] If data were not stored, energy costs could be saved. There are also costs associated with the underutilisation of information, and thus with missed opportunities, as well as the cost of keeping the data secure and other related IT costs.[7] According to Datamation, "the storage environments of EMEA organizations consist of 54 percent dark data, 32 percent Redundant, Obsolete and Trivial data and 14 percent business-critical data. By 2020, this can add up to $891 billion in storage and management costs that can otherwise be avoided."[13]
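To make the Datamation proportions concrete, the following Python sketch applies the quoted 54/32/14 percent split to a storage estate. The total capacity and the cost per terabyte are hypothetical inputs supplied by the caller, and the function name is illustrative. The sketch does not attempt to reproduce the $891 billion figure.

```python
# Shares quoted by Datamation for the storage environments of EMEA organisations.
SHARES = {
    "dark": 0.54,
    "redundant_obsolete_trivial": 0.32,
    "business_critical": 0.14,
}


def storage_breakdown(total_tb, cost_per_tb):
    """Split a storage estate by category.

    total_tb and cost_per_tb are assumptions supplied by the caller;
    only the percentage shares come from the quoted source.
    Returns {category: (terabytes, cost)}.
    """
    breakdown = {}
    for category, share in SHARES.items():
        terabytes = total_tb * share
        breakdown[category] = (terabytes, terabytes * cost_per_tb)
    return breakdown


if __name__ == "__main__":
    # Example with made-up inputs: 1,000 TB at 25 currency units per TB.
    for category, (tb, cost) in storage_breakdown(1000, 25).items():
        print(f"{category}: {tb:.0f} TB, cost {cost:.0f}")
```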

The continued storage of dark data can put an organisation at risk, especially if the data is sensitive. A breach can have serious financial and legal repercussions and can seriously hurt an organisation's reputation. For example, a breach of customers' private records could expose sensitive information and lead to identity theft. Another example is a breach of the company's own sensitive information, such as material relating to research and development. These risks can be mitigated by auditing whether the data is useful to the organisation, by employing strong encryption and security, and finally, if the data is to be discarded, by discarding it in such a way that it cannot be retrieved.[14]
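The auditing step could begin with something as simple as listing files that have not been modified for a long time. The Python sketch below does this with the standard library. Using modification time as a stand-in for "no longer useful" is an assumption made here for illustration, as are the function name and the age threshold. The sketch only reports candidates. It does not encrypt or delete anything.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days, now=None):
    """Return a list of (path, age_in_days) for files under root whose
    last modification is older than max_age_days, oldest first.

    Modification time is only a rough proxy for whether data is still
    in use; a real audit would also consider ownership and content.
    """
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            age_days = (now - mtime) / SECONDS_PER_DAY
            if age_days > max_age_days:
                stale.append((path, age_days))
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

A report such as `find_stale_files("/data/archive", 365)` (the path is hypothetical) would give reviewers a starting list to classify as worth keeping, worth securing, or safe to discard.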

Future

It is generally considered that the value of dark data will rise as more advanced computing systems for data analysis are built. It has been noted that "data and analytics will be the foundation of the modern industrial revolution".[3] This includes data that is currently considered "dark data" because there are not enough resources to process it. The data being collected could be used in the future to maximise productivity and to help organisations meet consumer demand. Furthermore, many organisations do not yet realise the value of dark data. Healthcare and education organisations, for example, deal with large amounts of data that could create a significant "potential to service students and patients in the manner in which the consumer and financial services pursue their target population".[15]
