Come Together: Putting Live And Historical Data In Memory

Tom Traubitz

Hybrid transaction/analytical processing (HTAP) is a term Gartner coined to describe in-memory database systems that are capable of holding live transactional data and historical data together in memory for real-time analytics. Forrester and others use the term “translytical.” Whatever you call it, it’s a must-have for many of the use cases of the digital economy.

Internet of Things (IoT)?

Without an HTAP or translytical approach, predictive maintenance (one of the most common use cases for IoT) would be next to impossible. The idea itself is predicated on the ability to analyze streaming data from IoT sensors on deployed assets and past data at the same time to detect patterns that help you predict maintenance needs.
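To make the idea concrete, here is a minimal Python sketch of reading live sensor data in light of historical data. It is an illustration only, not how any particular HTAP product works: the function names, the vibration readings, and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize past readings for one asset as (mean, standard deviation)."""
    return mean(history), stdev(history)


def flag_anomalies(stream, baseline, threshold=3.0):
    """Yield (index, reading) for live readings far from the historical mean.

    The threshold is a hypothetical cutoff, measured in standard deviations.
    """
    mu, sigma = baseline
    for i, reading in enumerate(stream):
        if sigma > 0 and abs(reading - mu) / sigma > threshold:
            yield i, reading


# Hypothetical vibration readings (mm/s) for one deployed asset.
history = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1]  # past data
live = [2.0, 2.1, 3.5, 2.0]                          # streaming data

baseline = build_baseline(history)
alerts = list(flag_anomalies(live, baseline))
print(alerts)  # the 3.5 reading stands out against the history
```

The point of the sketch is that the check needs both data sets at once. If the history sat in a separate warehouse refreshed overnight, the baseline would lag behind the readings it is meant to judge.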

Many real-time monitoring use cases, in fact, depend on this approach. From counterterrorism and patient health to fraud detection and weather prediction, the ability to understand new incoming data in light of historical data is key to running an intelligent enterprise.

Why keep it separated?

One obvious question is: Why are OLAP (online analytical processing) and OLTP (online transaction processing) separate to begin with? Aren’t the advantages of having them together – such as delivering insight and responding faster – obvious?

The reason for the split was performance. Traditionally, organizations have sectioned off analytics to a separate business warehouse environment. In this environment, experts can integrate transaction data with historical data to perform analysis and generate reports without interfering with the high performance needed for collecting new transactions. This approach has the advantage of preserving the live production landscape for real-time business transactions where the highest level of performance is a must-have. The price is that the analytic and reporting data is no longer real time but delayed by hours or days.

The problem is that in the digital economy, business performance is no longer measured simply in transactions processed. Increasingly, it’s measured according to speed-of-insight and the ability to respond fast and effectively. From this perspective, the practice of separating OLAP from OLTP is a strategic disadvantage.

The advantage of keeping transactions and analytics together

An in-memory database approach that keeps live and historical data together delivers at least three solid advantages. First on the list is less work for IT pros. Without the need to constantly move operational data to data warehouses – typically with midnight batch loads that sometimes go wrong and need correction – you can free up your IT people to focus on higher-level activities.

Second is speed. Transactional data is always available on demand, instantaneously. And it’s available for analytics just as quickly. Latency between when data is created and when it can be analyzed is almost nonexistent. This can accelerate decision-making dramatically.

Third on the list is simplicity. From a data perspective, one of the most significant sources of complexity is data duplication. And every time you copy data over to the data warehouse, you’re duplicating that data, thus creating further complexity. With everything held in memory, you can create a single version of the truth. This simplifies your data landscape tremendously.

An example using data marts

Let’s say you handle product distribution for a consumer products manufacturer. To minimize empty shelves and reduce overstocks, traditionally you’ve analyzed historical data using your enterprise data warehouse.

But today, things are different. Today, there’s new data that you want to tap – social media data. Live and constantly streaming, this data holds tremendous value when it comes to reading customer trends, managing demand, and building positive customer experiences.

Your idea is to mix streaming social data with data on inventory levels to better understand demand and put the right products on the right shelves at the right time. Based on this insight, you can then generate relevant real-time delivery manifests for drivers to ensure the kind of hassle-free and efficient delivery that produces positive customer outcomes.

With traditional approaches, this kind of project would be cost-prohibitive. But with an in-memory data infrastructure in place, your idea is only one of many potential projects that can help your company run more competitively, efficiently, and profitably.

The way it’s done is with a data mart. Data marts, which focus on a specific use case such as sales analysis, have been used for years with data warehouses. In the context of relational databases stored on hard disk, the goal has always been to provide users with the most relevant data in the shortest amount of time using small slices of historical data.

With an in-memory approach, data processing is far faster. Instead of small slices, you get the whole pizza. No aggregates needed. Data marts on hard disk require traditional extract, transform, load (ETL) processes – and invariably, they exacerbate the data duplication problem. Data marts in the in-memory context, on the other hand, don’t suffer a performance penalty for data access. Essentially, they simply point to where the live data lives – and they use this data for analysis and decision-making in real time.
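One way to picture a data mart that "points to where the live data lives" is a database view over the transactional table. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in. SQLite is not an HTAP system, and the table, view, and product names are hypothetical. What it shows is the shape of the idea: no copy of the data and no ETL step, so a new transaction is visible to the mart right away.

```python
import sqlite3

# In-memory database standing in for the live transactional store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (product TEXT, store TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?, ?)",
    [("soap", "north", 40), ("soap", "south", 25), ("shampoo", "north", 10)],
)

# The "data mart" is only a view: it stores no data of its own and is
# evaluated against the live table each time it is queried.
conn.execute(
    """
    CREATE VIEW inventory_mart AS
    SELECT product, SUM(units) AS total_units
    FROM shipments
    GROUP BY product
    """
)


def mart_totals(connection):
    """Return {product: total_units} as currently seen through the mart."""
    rows = connection.execute("SELECT product, total_units FROM inventory_mart")
    return dict(rows)


before = mart_totals(conn)

# A new transaction arrives; there is no batch load to wait for.
conn.execute("INSERT INTO shipments VALUES (?, ?, ?)", ("shampoo", "south", 30))
after = mart_totals(conn)

print(before)
print(after)
```

Because the view is defined once and evaluated on demand, adding another special-purpose mart means writing another query rather than building another pipeline.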

The data mart still retains the sense of a special-purpose tool. In our example here, the purpose is responsive inventory management. But gone are many of the associated data management duties and performance considerations. This means that making new data marts for other ways of differentiating your business can be faster and easier.

For more information on HTAP and translytical approaches to database management, read the Forrester report here.

About Tom Traubitz

Tom Traubitz is a senior director of Product Strategy with the Product and Innovations Group at SAP. He specializes in enterprise-class data warehousing and analytics. Tom has spent the past 25 years designing, engineering, testing, and marketing large-scale, networked information management systems for a wealth of clients throughout the United States and the world.