How To Approach A Cognitive AI Project For Unstructured Data Processing

Bharti Maan

Businesses have entered an era ruled by data. Business teams need the right tools and insights to handle intricate, multidimensional decision-making. CIOs work behind the scenes, orchestrating the tools these teams need.

Both business leaders and CIOs know that most of the data that could inform genuine business decisions is already in their hands, and that there is a lot of it, thanks to e-forms, digitized documents, computerized workplaces, and systems of record. However, this data is highly unstructured and scattered, and it sits in disparate formats and channels. It may lack a precise context or design, and it can comprise anything from a customer complaint to a user query response, internal memo, social media post, transaction record, scanned form, product brochure, presentation, or policy document.

Taming the wild horse

But just because this source of insights is in an unruly state doesn’t mean effort shouldn’t be made to harness it. Processing unstructured data may require more effort, but once accomplished, it can take organizational capabilities to a whole new level.

Cognitive AI-powered systems can “understand” the information and the context of the unstructured data. Through self-learning capabilities, these systems can take analytics to the next level by automating data extraction, document classification, and clustering of documents without prior classification.

For example, cognitive AI can help hospitals better manage patient records and help organizations sort invoices, prepare better RFP responses, enhance security, and protect customer data by detecting sensitive information. It can even flag the data for special handling to prevent unauthorized viewing or alteration. A cognitive AI-powered system could automate invoice processing by recognizing details such as invoice numbers, line items, and so on, even when these details appear in different locations and varied fonts and sizes for different companies.

Is it worth the leap?

The term “cognitive computing,” popularized by IBM, is something of a misnomer, as computers aren’t cognitive. Regardless, today the term refers to the application of artificial intelligence and machine learning to unstructured data analytics, including natural language processing. Cognitive computing is characterized by purpose, adaptiveness, self-learning, contextuality, and human interaction.

While the advent of cognitive AI offers unique possibilities, it’s important to understand the considerations on which to base the “go/no-go” decision, including tangible business outcomes, feasibility, cost, timelines, and project complexity. Here are a few elements that come into play.

Accuracy. Cognitive technology is not 100% accurate. In 2011, IBM’s Watson supercomputer won Jeopardy!, a popular game show, against two of its most successful human contestants. But Watson did not get every question right. Many systems claim 99% accuracy, but that is rarely true in practice. Even ~70% accuracy is tough to achieve. So be sure to define the expected business outcomes in line with these limitations.
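One way to keep accuracy expectations grounded is to measure them on a small hand-labeled sample before committing to a target. The following is a minimal sketch, not a definitive method: the function name `field_accuracy`, the field names, and the sample invoices are all hypothetical, and exact matching after trimming whitespace and ignoring case is only one possible scoring rule.

```python
from collections import defaultdict


def field_accuracy(predictions, ground_truth):
    """Return per-field accuracy for a labeled sample of documents.

    Both arguments are lists of dicts (one dict per document) mapping a
    field name to its value. A field counts as correct only on an exact
    match after trimming whitespace and ignoring case (an assumed rule).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, expected in zip(predictions, ground_truth):
        for field, expected_value in expected.items():
            total[field] += 1
            predicted_value = predicted.get(field)
            if predicted_value is None:
                continue  # a missing field counts as an error
            if str(predicted_value).strip().lower() == str(expected_value).strip().lower():
                correct[field] += 1
    return {field: correct[field] / total[field] for field in total}


# Hypothetical sample: two invoices labeled by hand.
truth = [
    {"invoice_number": "INV-001", "total": "120.00"},
    {"invoice_number": "INV-002", "total": "75.50"},
]
# Hypothetical output of an extraction system for the same invoices.
extracted = [
    {"invoice_number": "inv-001", "total": "120.00"},
    {"invoice_number": "INV-002", "total": "78.50"},
]

scores = field_accuracy(extracted, truth)
for field, score in scores.items():
    print(f"{field}: {score:.0%}")
```

Reporting accuracy per field rather than as a single number makes it easier to see which business outcomes are realistic: a system may read invoice numbers reliably while still misreading totals.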

Input conduits: formats, integrated applications, and sources. Few systems are as diverse as cognitive AI systems. They take inputs in all kinds of formats, from handwritten documents to databases, and from multiple internal and external ancillary systems, like email, document repositories, or websites. They also consult a gamut of lookup systems during processing.

The complexity of the system, its use of resources, and the time to develop it will multiply with the number of formats, input sources, and lookup systems with which it communicates. Keep in mind that the go-live time will stretch considerably with each new document format and application, because each one requires collaboration with the teams that own those systems, and those teams may have different priorities. It is therefore essential to identify and calibrate expectations around coverage, scale, and accuracy at the outset. An approach that tries to cover every variety of organizational data and every problem is a sure red flag.
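The multiplication described above can be made concrete with a rough count. This is only an illustrative sketch under simple assumptions: every format can arrive from every source and may need every lookup, which gives an upper bound rather than a cost model. The function name `integration_paths` and all format, source, and lookup names are hypothetical.

```python
from itertools import product


def integration_paths(formats, sources, lookups):
    """List every (format, source, lookup) combination the system may need
    to handle, assuming any format can arrive from any source and may
    require any lookup. Real projects usually need only a subset."""
    return list(product(formats, sources, lookups))


# A narrowly scoped pilot: one format, one source, one lookup system.
pilot = integration_paths(
    ["pdf_invoice"],
    ["email"],
    ["vendor_master"],
)

# A broad first release: three formats, three sources, two lookup systems.
broad = integration_paths(
    ["pdf_invoice", "scanned_form", "handwritten_note"],
    ["email", "document_repository", "website"],
    ["vendor_master", "crm"],
)

print(len(pilot), "path in the pilot")
print(len(broad), "paths in the broad release")
```

Under these assumptions, the broad release has 18 combinations to build, test, and coordinate across owning teams, compared with one for the pilot.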

System capabilities. The key cost components of the system are architecture, availability, and response times, and with cognitive computing, cost is a major factor. The implementation team must consider how quickly the cognitive automation system is expected to respond. In some cases, the required response times are too demanding to be feasible. In other cases, architectural or scope tweaks can do wonders and help manage the user experience, accuracy, deployment architecture, application integration, and so forth. Spend your time getting this right.

An always-on, real-time application, like a chatbot or Alexa, can consume enormous resources if it is validating and collecting data from multiple sources. The variety and number of formats, input sources, and lookup systems with which it communicates also affect the cost.

Commitment, data, and efforts to train the systems. The systems will only be as accurate as their training. Unless the beneficiary teams train the system using the right historical data, the system will not be effective. Their commitment to training the system is essential, as are the availability and accuracy of the training data.

Human assistive mode. An IT application, and an AI system in particular, is expected to run on its own. But can it? In the real world, human intervention is still needed for training the systems, collaboration, onboarding, troubleshooting, upgrades, and so on. You can push the system to its true potential only when you stay in control of it; otherwise, it will be a wasted opportunity.
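One common way to keep people in the loop is to route low-confidence results to a human reviewer and let only high-confidence results pass through automatically. The sketch below is an assumption about how this might look, not a description of any particular product: the `Extraction` class, the `route` function, the confidence scores, and the 0.85 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    document_id: str
    field: str
    value: str
    confidence: float  # assumed self-reported model score between 0 and 1


def route(extractions, threshold=0.85):
    """Split results into those accepted automatically and those sent to a
    human reviewer. The threshold is an assumed, tunable value."""
    auto_accepted, needs_review = [], []
    for item in extractions:
        if item.confidence >= threshold:
            auto_accepted.append(item)
        else:
            needs_review.append(item)
    return auto_accepted, needs_review


# Hypothetical extraction results from three invoices.
results = [
    Extraction("doc-1", "invoice_number", "INV-001", 0.97),
    Extraction("doc-2", "invoice_number", "INV-0O2", 0.62),
    Extraction("doc-3", "total", "75.50", 0.85),
]

auto_accepted, needs_review = route(results)
print("auto-accepted:", [item.document_id for item in auto_accepted])
print("needs review:", [item.document_id for item in needs_review])
```

Corrections made by reviewers can then be fed back as training data, which ties the human assistive mode to the training commitment discussed earlier.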

Think big, build small, one step at a time

Cognitive AI system deployments raise many questions, ranging from cost and feasibility to ethical considerations. To ensure that the project is financially and technically viable, design it as a series of small, precise steps.



About Bharti Maan

Bharti is the director of Innovation and Digital Transformation Advisory at SAP, architecting and mentoring digital transformation projects for key and strategic accounts of SAP. She is a thought leader and technology champion for disruptive technologies of cloud, data analytics, digital twins, AI, IoT, cognitive intelligence, and RPA. Her team ensures alignment to business goals by defining appropriate strategies and ensuring that “the rubber meets the road.”