Machine learning is on every CXO’s mind right now. We’ve all heard about and tested many use cases for machine learning across various domains. In this post, I’d like to focus on one point: the emergence of two distinct categories of machine learning solutions, depending on the type of problem that needs to be solved.
This week and next, I will describe in detail these approaches, which I call “macro modeling/predictive models as a service” and “micro modeling/training as a service.” In addition, I’ll highlight some of the challenges that technology executives need to be aware of when making investment decisions.
Predictive models as a service—macro modeling
When talking about machine learning, some obvious use cases that come to mind are autonomous vehicles and machine learning-powered translation systems. These could be described as general-purpose systems powered by predictive or machine learning techniques. Let’s see how these are generated and consumed.
First, we’re talking about one system that has been trained on a very large corpus of data so that it produces correct results across many different situations. For example, Toyota has said it needs 8.8 billion miles of test driving to create a safe autonomous car. The same is true for image recognition, where public image data sets contain more than 100 million images (and the true internal data sets used by Yahoo and others are much bigger than that).
To build a general-purpose predictive model thus requires a gigantic amount of data that you have the right to use for this purpose.
How many intelligent general-purpose systems do we need? Today, many teams are focused on autonomous vehicles, image classification, or translation systems. But how many of these systems do we need on the planet? Once a system for driving autonomous vehicles is good enough to beat the competition, you can expect around 10 such systems to equip all the cars on the planet. We expect these systems to work well in cities, in the countryside, during day and night, and so on. Producers will compete on the price and reliability of the sensors.
The same is true for translation systems or even image recognition systems. This is a winner-take-all market. If we push it to the limit, how many true artificial intelligence systems do we need on the earth?
As always with predictive modeling and machine learning, this is almost never a “fire-and-forget” activity. Your systems need to be continuously updated as new data comes in or rare situations occur, which means you need to connect them to continuous data feeds for ongoing updates and monitoring.
This continuous improvement has cost implications, of course, which will drive the need for continuous or incremental learning. This in turn will also make it possible to start from general-purpose predictive models and derive specific models for specific contexts.
Of course, the portability of these systems is also important. It’s convenient to have them available as REST APIs in the cloud, but that means they are available only within connected environments, which covers many, but not all, of the use cases. Typically, an autonomous car must keep running even when there is no connection.
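Consuming a predictive model as a cloud REST API boils down to a simple request/response exchange. The sketch below is hypothetical: the endpoint path and the JSON response schema are placeholders, and an in-process mock server stands in for a real hosted model so the example is self-contained:

```python
# Hedged sketch: calling a prediction service over HTTP.
# The mock handler below stands in for a real hosted model; a real
# platform would define its own endpoint, auth, and response contract.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class MockModelHandler(BaseHTTPRequestHandler):
    """Pretends to be a hosted image-classification model."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        # A real service would run inference here; we return a fixed label.
        payload = json.dumps({"label": "cat", "bytes_received": len(body)})
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload.encode())
    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(("127.0.0.1", 0), MockModelHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def classify(endpoint: str, data: bytes) -> dict:
    """POST raw bytes to the (hypothetical) prediction endpoint."""
    req = urllib.request.Request(endpoint, data=data, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

result = classify(f"http://127.0.0.1:{server.server_port}/predict", b"\x89PNG...")
server.shutdown()
print(result)
```

The dependence on the network round trip in `classify` is exactly why a purely cloud-hosted model cannot serve disconnected use cases such as a car in a tunnel.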
For models shared through services, speed and concurrency are very important, as are the security and privacy of the exchanged data. These are technical challenges that have been solved in the SAP Leonardo machine-learning foundation based on SAP Cloud Platform with Cloud Foundry.
Finally, we’re talking about very large data volumes and very large computing power, which directly drive operating costs. Consider, too, that we can expect improvements in this financial equation in the future (such as the introduction of ASICs), and even in pure electricity consumption, not to mention the amount of data traffic.
Next week, I’ll discuss training as a service, or micro modeling.