The Vicissitudes Of Knowledge Discovery: From Prediction To Explanation And Back Again

Doug Freud

As we bid adieu to 2016, it is clearer than ever that the “Templosion” caused by rapid changes in information technology has exceeded our capacity to comprehend the impending disruption. Disruption is in fact the new normal, and as a society, as organizations, and as individuals, it is time to recognize that what worked in the past is unlikely to work in the future. In order to evolve, we need next-generation approaches to innovation and knowledge discovery. Part of this evolution requires the self-awareness to understand when we do not need to understand.

Prediction vs. causation

While there is still significant debate about the mechanisms behind human learning and knowledge discovery, it’s clear that it is fueled by experience and by an incredible number of predictions and feedback loops we aren’t even aware of. As children, we learn words, numbers, and shapes, and eventually recognize them instantly. In this learning process, it is rare to ask, or to be able to answer, why we know that a picture shows a dog and not a cat.

Heuristic-based predictions are hard-wired into our everyday existence—it’s only when they fail us too often that we seek out causal-based explanations.

As humans evolved, we eventually developed first-principles scientific laws (like gravity), which are not only useful for prediction but also explain how the world works. One of the reasons we have survived as a species is our inherent need to understand and explain. The need for causation permeates our interactions and decision-making, even though most of our mundane everyday behavior is driven by predictions that we are not aware of and do not understand. We view the world through a lens that focuses on causation, but we rely on heuristic-based prediction because it usually works.

Next-generation knowledge discovery

As we enter this new age—where everything is “connected”—there is a need for the adoption of a new framework for knowledge discovery, because traditional approaches don’t scale.


As the industrial and consumer Internet connect everything in real time, the velocity of data will increase exponentially. Our ability to analyze it and explain it will be quickly overwhelmed. Organizations that can evolve toward a holistic balance between explanation and prediction in a knowledge discovery framework will be more likely to cause and/or survive the impending disruption.

In our current economic environment, we will see healthcare readily adopt machine learning as the analytics-powered approach to diagnosis, but it will encounter significant resistance when applied to analytics around treatment. As the cost of healthcare continues to escalate, my expectation is that time-to-market pressure will require new approaches and that machine learning will even power drug discovery for treatment.

It is inevitable that we will start accepting predictions and recommendations powered by machine learning, even if we can’t understand how or why they work, because they will provide a strategic advantage. Ignoring the data is an option, but we do so at our own risk. Instead, the more plausible option is, first, to learn via automated approaches to analytics and, second, to understand that the consumers of the analytics are our applications and operational systems, not individuals. Predictions and correlations will have to be good enough.

We must accept that many systems are simply too complex, or not important enough, to warrant the effort of understanding causality.

With the additional data sources and velocity in a highly connected world, it’s likely that even the next generation of scientific discovery will be a function of smart humans using machine learning to test combinations of factors we would never have thought of in a time frame we can’t compete with.

In May 2016, a group of physicists including researchers from the University of Adelaide demonstrated that a machine-learning optimization method required fewer experiments than competing optimization methods. The physicists were surprised by the clever methods the system came up with, like changing one laser’s power up and down and compensating with another laser. Paul Wigley stated:

I didn’t expect the machine could learn to do the experiment itself, from scratch, in under an hour. A simple computer program would have taken longer than the age of the universe to run through all the combinations and work this out.

What do you think about the next stage of knowledge discovery?

Join us here on the SAP BusinessObjects Analytics blog every Thursday for new posts about all things predictive (and read the previous posts in the series).


About Doug Freud

Doug Freud is a global Vice President of Data Science for the SAP Customer Innovation and Engagement Platform team. His academic background is in industrial-organizational psychology, and he has worked in both go-to-market (GTM) and professional services roles. He is a proven leader with the ability to manage cross-functional teams that implement innovative solutions. His passion is using data and machine learning to change business processes and create new systems of innovation.