Part 2 in a 2-part series about model risk management in the areas of artificial intelligence (AI) and decision intelligence (DI) for enterprises. Read Part 1.
We are moving artificial intelligence (AI) systems from the lab, where we can control for a limited set of variables, to potentially massive-scale implementations, where variables will propagate and multiply. We must have an AI engineering discipline that can help predict and adjust for those variables.
From single- to multilink AI/DI decisions
Early enterprise AI implementations are still quite limited and tend to be focused on single-link predictions such as:
- “What’s the chance that this customer will churn?”
- “What is the predicted lifetime revenue from this customer?”
- “Which clause in this regulation has been changed since we last reviewed it?”
- “Where are the logos appearing in this video?”
In a typical enterprise use case, these single-link AI models provide insight used by human decision-makers to determine the best next step based on this information. The decision space – the inputs and the outcomes that result from them – is well-defined and tightly limited.
For instance, an AI system may display information such as “churn risk,” or “high lifetime value” – based on a small set of criteria – on a customer service representative’s terminal when they are talking with a customer.
Or a natural language processing (NLP) AI system might be trained to identify and distribute updates to a specific set of regulations. A human analyst can then adjust contract terms or modify a limited universe of product or service features to comply with those changed legal requirements. Even here, a poorly designed or poorly trained NLP system can cause significant business damage.
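A single-link model of this kind can be sketched in a few lines. The sketch below is a hypothetical churn scorer: the feature names and weights are illustrative assumptions (a real model would learn them from data), and it exists only to show how narrow the decision space is – a handful of inputs, one score, one flag shown to a human.

```python
import math

def churn_risk(tenure_months, support_calls, monthly_spend):
    """Score a customer's churn risk in [0, 1] via a logistic function.

    The weights here are illustrative placeholders, not learned values.
    """
    z = 1.5 - 0.05 * tenure_months + 0.4 * support_calls - 0.01 * monthly_spend
    return 1 / (1 + math.exp(-z))

def flag_for_rep(customer):
    """Turn the raw score into the simple label a service rep would see."""
    risk = churn_risk(**customer)
    return "HIGH churn risk" if risk > 0.5 else "low churn risk"

# A new customer with many support calls scores riskier than a
# long-tenured, high-spend customer with none.
new_unhappy = {"tenure_months": 1, "support_calls": 6, "monthly_spend": 20}
loyal = {"tenure_months": 60, "support_calls": 0, "monthly_spend": 200}
```

The human in the loop sees only the flag; the model's biases (in the training data behind those weights) stay out of sight, which is the risk pattern discussed below.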
The risks of these single-link systems fall into several categories, as we discussed in Part 1 of this series:
- Bias in data selection
- Training on data sets that are not representative of future conditions (the Black Swan problem)
- Societal or external bias
However, these single-link use cases represent only a tiny fraction of potential AI/DI “injection and inflection” points in a typical enterprise or other large organization.
Future AI systems will involve multiple models in a cause-and-effect cascade, as the systems are used for both decisions (decision intelligence) and process improvement.
For instance, a system used for capital allocation might include one model to predict future customer growth, another to predict the impact of marketing budget spent on customer service on likelihood-to-recommend (L2R), and a third model to estimate the likely conversion rate from a sales campaign. A decision regarding the best course of action to increase revenues might use a system that relies on all of the models working together. This is qualitatively more complex – and therefore involves a qualitatively different level of risk – than single-link AI systems.
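The capital-allocation example above can be sketched as a chain of model outputs feeding one decision. Everything here is a hypothetical stand-in: the three functions represent trained models (their formulas are made-up placeholders), and `best_allocation` is the decision layer that searches a fixed budget across the three levers. The point of the sketch is structural – a flaw in any one link skews the final allocation, even if the other two links are sound.

```python
# Three stand-ins for trained models; formulas are illustrative only.
def predict_growth(marketing_spend):
    """New customers gained from marketing spend (diminishing returns)."""
    return 100 * marketing_spend ** 0.5

def predict_l2r_lift(service_spend):
    """Likelihood-to-recommend lift from customer-service spend, capped."""
    return min(0.2, 0.002 * service_spend)

def predict_conversion(campaign_spend):
    """Sales-campaign conversion rate, capped."""
    return min(0.3, 0.0005 * campaign_spend)

def expected_revenue(marketing, service, campaign,
                     base_customers=10_000, rev_per_customer=50):
    """Cascade the three model outputs into one revenue estimate."""
    customers = base_customers + predict_growth(marketing)
    customers *= 1 + predict_l2r_lift(service)   # referrals from happier customers
    converted = customers * predict_conversion(campaign)
    return converted * rev_per_customer

def best_allocation(budget=1000, step=100):
    """Decision layer: brute-force the budget split that maximizes revenue."""
    best = None
    for m in range(0, budget + 1, step):
        for s in range(0, budget - m + 1, step):
            c = budget - m - s
            rev = expected_revenue(m, s, c)
            if best is None or rev > best[1]:
                best = ((m, s, c), rev)
    return best
```

Note that the decision layer never sees the training data behind any link; a bias baked into `predict_growth` alone would silently reshape the entire allocation, which is exactly the multilink risk the text describes.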
As this example shows, AI-driven decision processes will increasingly determine goals and incentives and make or contribute to multiple decisions that underlie all of an organization’s strategic and tactical choices. Finding a way to manage the associated risks will become critical to the success or failure of enterprise initiatives or even the survival of the entire enterprise.
Complexity naturally increases with each new level of AI maturity and as multiple AI systems are woven into a firm’s processes. As AI-driven enterprise systems become more complex and deeply embedded, new and emerging risks also become much harder to identify and address.
Unintended and intended decision externalities
As described above, decisions automated by AI/DI within an enterprise are separated by layers of context from both source data and external risk points. That makes it more difficult to identify biases in the data and to trace any issues back to their root cause.
But a deeper risk can arise in circumstances that don’t even reflect bias from the enterprise’s point of view.
One of us (Kamesh) worked with a law firm as part of a strategic planning initiative. The project found that, to maximize profits, the firm should focus on high-net-worth clients. Logically, this means that fewer resources and less attention would be focused on cases coming from, for example, the U.S. Consumer Financial Protection Bureau.
But, while those cases were fundamentally less lucrative for the firm, they arguably provided an equally or more desirable outcome from the point of view of social justice.
This kind of optimization is, of course, pervasive within enterprises. Businesses are in most cases obligated to deliver profitability or other results that benefit their immediate stakeholders, and they often treat external societal impacts as cost-free externalities.
This is nothing new.
What is new is when this kind of selection is supercharged first by the availability of masses of data and, as we move forward, by AI models that do an increasingly better job of excluding all but the most lucrative of clients from any business.
Telecom companies might, for the first time, learn that some customers cost more to serve through the call center than the revenue they bring in. On the flip side, they will provide “marquee” service to VIP customers with wide social influence, such as politicians and media stars.
In a multilink context, this is how net-neutrality decisions made based on optimizing network-management practices might trickle down to help determine different service outcomes for different groups of customers.
The inferior services available to less-wealthy clients are not the result of deliberate design; they are the result of myriad decisions driven by unconscious and self-amplifying biases.
The same pattern plays out every time an enterprise decides which suppliers to do business with or where to set up shop based on regulatory structures or any of the millions of decisions that underlie business relationships and structures.
And most of this is happening out of sight, deep inside enterprise systems and in databases that have grown like coral over decades and through billions of transactions. The impact of any one of the billions of decisions that underlie each transaction is so remote from the final output or outcome as to be invisible.
There truly are ghosts in the machines already.
In the context of AI, this means that the data we intend to use to train hugely powerful AI systems is opaque. We can’t see what biases are incorporated – whether by design or default – in the data and the architectures of our databases and repositories. But biases are there, and these tools are guaranteed to amplify them.
Indeed, the use of AI creates a new smoke-screen layer for companies that embed bias deliberately: a knowing disregard for external societal impacts, hidden behind the model.
Traditional IT risk models do not catch the external societal implications that can flow from ignoring non-economic internal risks and externalities. In many cases, those non-economic risks arise when decisions are unknowingly made based on flawed or biased data, and they are amplified when AI/DI is introduced into enterprise business-support systems.
For these reasons, we must develop a new discipline to manage enterprise AI risk.
The sheer scale of enterprise data that is planned to be used for AI is considerable. It is potentially all the data that humans have ever created, or ever will create, and store electronically. With massive data, computing power, and ever-more sophisticated algorithms, the outcome space – the possible results of a chain of decisions made by AI/DI – is equally large.
The risk of unintended outcomes may very well grow nonlinearly with the complexity of the application in which AI is used. We need to do everything we can to stay ahead of that risk.
We are experiencing the results of delaying the hard work on climate change. We have learned an immense amount, but it is only a fraction of what we need to begin healing our planet. We don’t really understand how to put that knowledge to work as we literally push against the tide. The resulting damages are already massive, and some are likely irreversible in our lifetimes – or even those of our children’s children.
We can’t make that same error when it comes to AI. If we wait for the biases and other flaws hiding in our data to emerge from AI-based decision systems, it will be too late. Whether we like it or not, these powerful tools sit on a knife’s edge: Get this right and we could make a tremendous save; get it wrong, and we are doomed to live with the unintended consequences.
Today is the time to begin. We need the brilliant minds designing these systems and imagining how they might be used to also turn their attention to understanding what is really going on inside the black boxes of our data repositories and computing processes. We need to be able to see where the properties inherent in those systems reside and how they might interact.
These risks will emerge from the depths of data, processes, and decisions. They will not be obvious. And they will propagate across the massively interconnected webs of commerce. These risks can be addressed only by associating them correctly with the AI/DI-driven decision points (across the layers of context). This requires that we invest now and address those risks in the core setup of AI/DI systems.
We need to look into the future to identify where and how AI-related risks will emerge. And we need to share that knowledge with everyone who has a stake in this project, which is everyone on this planet. This is the only way we can hope to reach a consensus on the level of risk we can accept and the rules we need to put in place to control those risks.
Time for a new risk-management discipline
The good news is this: When the new engineering disciplines that built airplanes and skyscrapers emerged, robust quality assurance and risk management practices grew alongside them. Those disciplines learned to detect, mitigate, and eliminate unintended consequences from these powerful new technologies.
By taking AI and DI as seriously as engineering disciplines, which means recognizing that AI and DI models are artifacts that must be rigorously managed, we have every hope of obtaining the very best from these powerful new tools while limiting any negative consequences.
In conclusion, each one of us plays a part in contributing to this new discipline.
- Developers should go beyond simply creating good training data to consider the larger context in which that data was developed, so they can understand any selection or other biases it may contain. As they build models, developers should go beyond evaluating accuracy to modeling the decision-making context in which their models will be used and mapping the models’ effects within and beyond the organization that will use them. They should also add human inspection and control points wherever feasible, especially as systems become more and more complex.
- Business decision-makers should support developers as they strive, as above, to answer broader questions regarding the impact of their AI and DI models.
- Policy-makers should sponsor programs that create the new discipline of AI/DI risk management and advance technologies that lead to transparency and accountability. Policy-makers should insist on rigorous answers as to the potential unintended consequences of this powerful technology.
- Risk managers should expand IT risk initiatives to cover both model and decision risk (decisions usually involve multiple models, as above) and create collaborative structures that allow multiple stakeholders to participate in the identification, review, and mitigation of the risks that emerge from the automation of predictions and decisions that were previously done only by humans.
For more on this topic, see “The Human Side Of Machine Learning.”