Although the term “Big Data” has been in existence since the early 1990s, its recent popularity can be attributed to the open source community, which has released tools and frameworks to store, process, and analyze data that either does not fit into traditional transactional databases or is unsuitable for them.
In meetings with customers, I often hear how Big Data sources such as the Internet of Things and social media have challenged their entire approach to data management. Centralized, traditionally on-premises data systems no longer meet their needs when it comes to handling large volumes of distributed data arriving at high velocity.
New technologies, new challenges
As the amount of data grows, so does the complexity of organizations’ IT landscapes as they try to manage the influx. Dealing with structured and unstructured data from multiple applications, files, databases, data warehouses, and data lakes can seem to hinder the business rather than enable it. Managing the data with existing tools and technologies is complex and time-consuming, and gaining the insights that Big Data promises can be challenging.
At an organizational level, too, the Big Data challenge is often not dealt with holistically. Numerous roles (developers, data scientists, business warehouse administrators, and business analysts, to name just a few) all have their own requirements, approaches, and tools. This results in silos at the level of both the technology and the company.
The power and potential of Big Data
Despite the engineering complexity, it’s important not to lose sight of the fact that Big Data represents a great opportunity. Analyzed effectively, the data reveals patterns and trends that would otherwise go unnoticed. These insights can lead to more informed decision making and to new or improved business processes and models, and, of course, they contribute to the ultimate goal of greater customer satisfaction. Beyond efficiency gains and increased profits, Big Data also has a major impact on society as a whole. From enabling smarter cities to saving lives by anticipating natural disasters and driving medical research, the advantages of Big Data ingestion, storage, analysis, and sharing are both numerous and far reaching.
But while there are countless open source Big Data tools, they frequently lack the essential life-cycle management, governance, and security capabilities that are required in the enterprise world. Gaining insights from multiple data formats and silos to get the complete picture remains a huge challenge. Going one step further, the ability to connect an organization’s own master and transactional data with Big Data would provide enormous and immediate business value. Until now, complex and expensive workarounds to integrate the many different technologies and systems have been the only way to do this—but adding yet another layer of complexity to an already complex IT landscape is obviously not the answer.
From “information” to “managed enterprise asset”
So how do we bridge the gap between the Big Data and enterprise data worlds? Whenever I talk to our customers about their Big Data challenges, they tell me they are looking for a unified and open approach that helps them accelerate and expand the flow of data across their data landscapes for all users, from data scientists to business analysts. So this is exactly what we came up with.
With extensive data integration, data orchestration, and data governance capabilities, we want to give customers the same high standard of data access, quality, and enrichment for their Big Data as they have for their enterprise data. Our focus is on enabling these customers to create agile, data-driven applications and processes that can respond to changes and anomalies in the data in real time. Best-in-class container schedulers such as Kubernetes help us orchestrate fleets of operations and processes into scalable pipelines, making it possible to scale out to thousands of processing nodes to cope with high volumes of data. Whether the data sits in the cloud, on premises, in a data lake, in a data warehouse, or in an application, business and IT users alike must be able to get the information they need wherever the data lives.
Data as the foundation
As I explained earlier, the digital age requires companies to find the right balance between stability and agility—in other words, implementing new use cases while at the same time ensuring existing processes are not disturbed. It is about running your processes with the greatest efficiency and winning with new technologies and business models. Data is the foundation for both these approaches: A common, harmonized data source across your entire digital landscape is the key to unlocking enormous business value. New technologies allow us to push the boundaries of what our data can do, so let’s embrace these possibilities!