Hollywood loves explosions. Few scenes offer more bang for the buck, literally, than a battle scene where a soldier screams “Incoming!” just as bombs start falling from every direction.
CIOs must feel like that soldier when they look at not just the volume of data, but the type and source of data raining down on their organizations. They are under siege, and most of them lack the technology to fight back.
We all know the amount of information enterprises are collecting. It’s huge. In one of the more recent studies published on big data in large organizations, the Aberdeen Group found the median amount of active business data companies used was 150 terabytes, with 17% of firms working with more than a petabyte. The research showed that average data growth was 42% year-over-year, but one-fifth of organizations reported growth rates of more than 75%.
This study revealed something even more interesting: the number of distinct data sources. According to Aberdeen, companies average 28 sources of incoming data: 14 from internal operations, nine from partners and five from outside their business ecosystem.
All those disparate sources confound IT’s ability to help the company make use of – or make sense of – the information pouring in. That’s why the report revealed that 45% of companies are still dealing with data formats that make data analysis a major hurdle and that 39% lament that data remains “siloed” and inaccessible for analysis.
Ultimately, the varying data sources, formats and silos lead to the most astonishing number of all: 23%. That’s the share of the data enterprises control that is actually available for analysis.
Think about it. Less than one-fourth of the information inside a company is available for analytical scrutiny. Undoubtedly there are business reasons unique to each organization for this analysis gap. However, the report does unintentionally reveal one reason: IT is slow to adopt technology that addresses the problem. That’s true even for well-established technologies such as columnar or in-memory databases.
According to Aberdeen’s research, only 29% of companies have installed columnar databases and only 14% have adopted in-memory databases, such as SAP HANA, that can handle large volumes of data and process it more rapidly than traditional database systems. And a mere 11% have begun working with MapReduce/Hadoop in their analytics infrastructure, which would help companies analyze their unstructured data sources.
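To make the MapReduce idea concrete, here is a minimal single-process sketch of its two phases applied to word counting, a stand-in for the kind of unstructured-text analysis the report describes. The sample documents are invented for illustration; a real Hadoop job distributes these same phases across a cluster.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs):
    # Reduce: sum the counts for each distinct word.
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

# Hypothetical unstructured inputs, e.g. lines pulled from server logs.
docs = ["log entry error timeout", "log entry ok", "error retry error"]
result = reduce_phase(map_phase(docs))
# "error" appears three times across the documents.
```

The point of the pattern is that both phases operate on independent records, which is what lets a cluster fan the work out over data too large or too unstructured for a traditional database.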
It’s not surprising, then, to learn that enterprises are capable of analyzing only a fraction of their data today. When organizations rely on legacy systems, they’re not likely to solve age-old compatibility and access problems. And they are not likely to achieve the insights that big data analytics deliver.