Predictive Analytics And Machine Learning For Developers And Data Scientists

Mary Carol Madigan

Part 3 in the 4-part Predictive Analytics and Machine Learning series

In Part 2 of this series, my colleague John Schitka discussed the need to bring data science, along with predictive analytics and machine learning (PAML) technologies, to the business user. This is one of the central points Forrester made in a new report: “Powering The Intelligent Enterprise With AI, Machine Learning, And Predictive Analytics.”

The current problem, as the report says, is that data science teams cannot keep pace with the business’ demand for PAML-based solutions. John talked about attacking this problem at its source. The idea is to alleviate demand by empowering business analysts to build and use PAML solutions themselves, without expert assistance. Let’s call this self-service PAML for everyday businesspeople.

Tools for developers and data scientists

Self-service features for businesspeople are important for meeting increased PAML demand – but they’re only part of the overall solution. What about developers and data scientists? These roles are key for creating and implementing the PAML solutions with the deepest insights or the most potential for real business impact. They have more technical expertise than the average line-of-business person, and typically they’re pursuing more ambitious PAML projects, ones that would not be possible with self-service tools alone.

This means that in addition to self-service options, your PAML platform should also include a set of machine learning tools focused on the needs of developers and data scientists.


According to the Forrester paper: “Companies flail when it comes to infusing machine learning models in production applications. Roughly half of companies struggle with the complexity of deploying and managing models used in production applications.”

A robust set of application programming interfaces (APIs) can help address this challenge by allowing developers to embed machine learning algorithms in the applications they develop. APIs mean that you don’t have to build the technology you’re trying to connect to. Instead, the API gives you protocols and data standards to access preexisting technology.

The idea is that the data science team can develop algorithms and publish them along with the APIs needed for developers to make use of them. As the Forrester paper points out: “Data-science professionals more commonly felt like their company had a broad application of PAML across various use cases compared to business end users who indicated the number of PAML use cases was still relatively small.” Data scientists, in other words, often pump out the algorithms – but then these algorithms sit on the shelf unused. APIs are a way to encourage wider machine learning adoption.
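To make the publish-and-consume idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the model name `churn-v1`, the toy scoring formula, the `handle_predict` handler, and the JSON fields are all hypothetical, and the in-process function call stands in for what would normally be an HTTP request. It does not represent any particular vendor's API.

```python
import json

# Hypothetical model from the data science team: a toy churn score.
# The weights are illustrative only, not taken from any real model.
def churn_score(features):
    raw = (0.6 * features["support_tickets"] / 10
           + 0.4 * (1 - features["monthly_logins"] / 30))
    return max(0.0, min(1.0, raw))  # clamp the score to [0, 1]

# "Publishing" a model here simply means registering it under a name.
REGISTRY = {"churn-v1": churn_score}

def handle_predict(request_body: str) -> str:
    """JSON in, JSON out: the contract a developer codes against."""
    request = json.loads(request_body)
    model = REGISTRY.get(request["model"])
    if model is None:
        return json.dumps({"error": "unknown model " + request["model"]})
    score = model(request["features"])
    return json.dumps({"model": request["model"], "score": score})

def call_api(model_name, features):
    """Developer side: build a request, send it, parse the response.
    In a real landscape this would be an HTTP call, not a function call."""
    body = json.dumps({"model": model_name, "features": features})
    return json.loads(handle_predict(body))

response = call_api("churn-v1", {"support_tickets": 5, "monthly_logins": 15})
print(response)
```

The point of the sketch is the division of labor: the developer only needs the model name and the shape of the request and response, never the scoring logic itself.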

Ready-to-use machine learning models

A good PAML platform should also provide preconfigured and ready-to-use models that act as foundational components for larger and more complex business scenarios. Re-trainable versions of these components would also lend themselves well to more specific needs across a variety of business functions.

But it’s important that these models focus on the high-level scenarios that machine learning can address. Three important areas are:

  • Images: What objects or text are shown? Are any identical?
  • Speech: How can you convert speech to text or the other way around?
  • Text: How can you classify text content and identify keywords?

From starting points like these, developers and data scientists can move forward faster on a number of fronts. Need item recognition for a set of images? Start with an image-based machine learning model and train it for your specific purposes. Want to do invoice matching? Move forward with a text-based model that uses optical character recognition to read through volumes of invoices and identify patterns. The possibilities are intriguing.
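As a rough illustration of what "start with a model and train it for your specific purposes" can look like, the sketch below uses a deliberately simple word-count classifier written for this example. The class name `KeywordClassifier`, the labels, and the training sentences are all hypothetical; a production text model would be far more sophisticated, but the workflow of starting from a preconfigured base and retraining with your own examples is the same.

```python
from collections import Counter, defaultdict

class KeywordClassifier:
    """Toy text classifier: scores each label by how often the
    words of a text appeared in that label's training examples."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, examples):
        # Training is additive, so calling it again "retrains" the
        # model with new examples without discarding the old ones.
        for text, label in examples:
            self.counts[label].update(text.lower().split())
        return self

    def classify(self, text):
        words = text.lower().split()
        scores = {
            label: sum(counter[word] for word in words)
            for label, counter in self.counts.items()
        }
        return max(scores, key=scores.get)

# Preconfigured, ready-to-use starting point (illustrative data).
base_model = KeywordClassifier().train([
    ("payment due for invoice", "finance"),
    ("reset my password please", "it_support"),
])
print(base_model.classify("invoice payment overdue"))

# Retrain the same model with examples from a specific business function.
base_model.train([("purchase order mismatch on delivery", "procurement")])
print(base_model.classify("purchase order for delivery"))
```

The second call to `train` is the retraining step: the general-purpose categories stay available while the model learns a category specific to one department.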

Productivity tools for data scientists

It’s also important that your PAML platform addresses the often-tedious tasks that data scientists encounter when developing their models. Data scientists want tools that support experimentation and collaboration – tools like Jupyter notebooks. They also want to use the tools they’re most comfortable with, which requires flexible support for TensorFlow and other technologies.

Critically, data scientists need an underlying data platform that is up to the PAML task. This means readily available data pipelines to orchestrate machine learning services – with all the pre- and post-processing required. And because PAML solutions exist in live corporate landscapes, data scientists (and all technical people) need robust lifecycle management tools that support solutions across all phases – from design and deployment to operations and retirement.
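A pipeline with pre- and post-processing can be pictured as a chain of small steps around the model. The sketch below is one possible shape, not a description of any specific platform: the step names, the record fields, the risk threshold, and the stand-in `score` function are all assumptions made for this example.

```python
from functools import reduce

def make_pipeline(*steps):
    """Chain steps so the output of one becomes the input of the next."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

def preprocess(record):
    # Clean raw input into the form the model expects.
    return {
        "amount": float(record["amount"].replace(",", "")),
        "country": record["country"].strip().upper(),
    }

def score(features):
    # Stand-in for a call to a machine learning service (hypothetical rule).
    risk = 0.9 if features["amount"] > 10000 else 0.1
    return {**features, "risk": risk}

def postprocess(scored):
    # Turn the raw model output into something the business can act on.
    decision = "review" if scored["risk"] >= 0.5 else "approve"
    return {**scored, "decision": decision}

pipeline = make_pipeline(preprocess, score, postprocess)
result = pipeline({"amount": "12,500.00", "country": " de "})
print(result)
```

Because each step is an ordinary function, a step can be replaced, tested, or retired on its own, which is the kind of flexibility lifecycle management depends on.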

Moving forward

As a technology and solution space, PAML is increasingly moving out of the laboratory and into live, productive use. The platform choices your organization makes can mean the difference between persistent obstacles that impede success and a rationalized approach to PAML development that lets you deploy and run solutions quickly and at scale.

For an in-depth look into the intelligent possibilities for your business, review the August 2018 Forrester Consulting study, “Powering The Intelligent Enterprise With AI, Machine Learning, And Predictive Analytics,” commissioned by SAP.

About Mary Carol Madigan

Mary Carol Madigan leads strategy for machine learning at SAP, where she is in charge of defining how SAP leverages artificial intelligence to build the intelligent enterprise. Mary Carol also focuses on implementing AI ethics at SAP, as well as other strategic topics key to the success of innovation at SAP.