Data governance is hot once again. It is “strategic.” A core competency. A foundational requirement. An investment focus area.
If this sounds familiar, perhaps you recall the days when companies first moved their operations into enterprise applications, or built their first data warehouses, or Sarbanes-Oxley went into effect. At those brief transition points, information governance was critical for the organization’s long-term health, and businesses made major investments in people, processes, and technologies supporting those capabilities. But since the last wave of interest in the topic, information governance has often been deprioritized in favor of efficiency, innovation, and growth.
Now the pendulum has swung again, and data governance is back. Every day, we get requests to “help frame a data governance strategy.” In fact, so many of those calls have come in that we just opened a new Center of Excellence (CoE) focused exclusively on helping our customers with data governance and process excellence, of which I’m a part. Although many of the fundamental governance challenges persist from years past, there are several important differences this time around:
- Much, much more data: The rise of cloud platforms, IoT solutions, data lakes, and numerous applications means that there is an unprecedented volume, variety, and velocity of data in enterprises today. Given how easy and cheap it is to manage data now, much of this data is being overseen by lines of business (LOBs) outside the formal control of IT.
- Growing regulatory and reputational risk: While these freewheeling data environments are fabulous places to nurture innovation and entrepreneurship within large companies, they also pose increasing risks. Europe’s new General Data Protection Regulation (GDPR) threatens massive fines for companies that don’t strictly manage personally identifiable information (PII), and data breaches are creating large civil liabilities as well as lasting reputational damage.
- Digital business transformation: Many organizations are also on a long-term path toward digital transformation. Whether that means digitizing and automating a core process or completely reinventing a business model to leverage digital technologies, a critical requirement for these initiatives is high-quality, trusted data.
So how are businesses and other organizations approaching data governance? Well, details are of course complicated and specific to each organization, but as a high-level parallel, data governance has a lot in common with airport security. The basic idea in airport security is to manage two zones – a less-secure “public zone” where people come and go with minimal interference, and a “secure zone” behind a screening perimeter. In the less-secure public zone, airport police are focused on informing visitors of basic protocol (don’t park here! don’t leave bags unattended!) and unobtrusively monitoring for blatant dangers (using cameras, bomb-sniffing dogs, observation, etc.). But moving into the secure zone requires passing through a perimeter with much more rigorous screening – document and ID checks, baggage x-rays, metal detectors, etc.
Applying this analogy to data governance, a typical secure zone would include trusted enterprise data, business applications, data warehouses, and analytic data marts secured by a data-quality perimeter. The public zone, by contrast, is an uncharted universe of data lakes, cloud services, business applications, and ad hoc databases with an unknown number of users, largely unconstrained by IT or business policy.
If that sounds vaguely like your enterprise data landscape, here are a few insights from the parallel world of airports that might be helpful:
1. Unobtrusively police the public zone (i.e., invest in reactive governance)
By all means, let business users experiment and data scientists innovate. However, ensure they know the data governance policies in your organization and use automation to continually monitor for risks. Every data scientist should get regular reminders that it is not OK to include PII in their data sets, that they are responsible for the security of their data, and that there are repercussions if a breach occurs. And use technology (especially data profiling tools) to automatically scan your enterprise landscape for PII or other sensitive data, and develop a process to quickly escalate any discovered issues. Bring in the data police when needed!
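As a rough illustration of what automated scanning for PII can look like, here is a minimal Python sketch. The two regular expressions, the `scan_rows` function, and the sample rows are all assumptions made up for this example; commercial data profiling tools rely on much richer detection (dictionaries, checksums, context), so treat this as the idea rather than a working control.

```python
import re

# Hypothetical patterns for illustration only; real PII detection
# needs far broader coverage than two regular expressions.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_rows(rows):
    """Return a list of (row_index, column, pii_type) findings."""
    findings = []
    for i, row in enumerate(rows):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # this sketch only inspects text values
            for pii_type, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    findings.append((i, column, pii_type))
    return findings


# Made-up sample data standing in for a data set in the "public zone".
sample = [
    {"id": "1", "note": "contact jane.doe@example.com"},
    {"id": "2", "note": "no sensitive content"},
    {"id": "3", "note": "SSN 123-45-6789 on file"},
]

findings = scan_rows(sample)
for row_index, column, pii_type in findings:
    print(f"row {row_index}, column '{column}': possible {pii_type}")
```

In practice, each finding would feed the escalation process rather than a print statement.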
2. Expand and harden your trusted data perimeter (i.e., extend and deepen active governance)
High-quality enterprise data is a key factor for increased efficiency, better analytics, and deeper insight into your operations. Many companies take only half-hearted steps toward data quality and reap limited benefits. With the three business drivers described above converging, this is a great time to “do data governance right.” Expand your perimeter to include all enterprise applications and analytic tools (not just a few process areas). Then develop a comprehensive strategy for data governance, including appointing stewards, enabling key stakeholders, and developing processes as needed. And, of course, use technology to automate and improve data governance as much as possible.
Some of the capabilities most in demand right now are:
- Data profiling and monitoring – Automatically discover the data sources in your landscape, assess data quality, and apply rules continuously to monitor data quality.
- Data cleansing – Standardize, consolidate, match, correct, and enrich enterprise data.
- Master data management – Manage authoritative lists of your key enterprise data (customers, materials, suppliers, products, employees, etc.).
- Data lineage and impact analysis – Trace the provenance of your enterprise data to determine the origin of data-quality problems and understand downstream impact.
- Data privacy – Encrypt, mask, and anonymize fields to protect sensitive information.
- Archiving – Move older data out of production environments to improve system performance, accelerate backups/migrations, ensure compliance, and reduce costs.
- Lifecycle management – Manage retention and destruction of data, while supporting legal holds and providing an auditable record for compliance needs.
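To make the first capability in this list (applying rules continuously to monitor data quality) a little more concrete, here is a small Python sketch. The rule names, the `profile` function, the sample records, and the 90% threshold are all hypothetical choices for illustration, not a description of any particular product.

```python
def profile(records, rules):
    """Return the share of records (0.0 to 1.0) that pass each rule."""
    total = len(records)
    report = {}
    for name, check in rules.items():
        passed = sum(1 for record in records if check(record))
        report[name] = passed / total if total else 0.0
    return report


def is_iso2(value):
    """Very loose check: a two-letter uppercase string."""
    return isinstance(value, str) and len(value) == 2 and value.isupper()


# Hypothetical data-quality rules; a steward would define the real ones.
rules = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "country_is_iso2": lambda r: is_iso2(r.get("country")),
}

# Made-up customer records for the example.
records = [
    {"customer_id": "C001", "country": "DE"},
    {"customer_id": "", "country": "US"},
    {"customer_id": "C003", "country": "Germany"},
    {"customer_id": "C004", "country": "FR"},
]

THRESHOLD = 0.9  # assumed target pass rate

report = profile(records, rules)
failing = [name for name, score in report.items() if score < THRESHOLD]
print(report)
print("rules below threshold:", failing)
```

Run on a schedule against each source, a report like this turns a one-time quality assessment into ongoing monitoring.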
3. Watch data as it leaves the secure zone (i.e., rethink data extraction)
Until a few years ago, the focus of data governance was getting good data into the secure zone; few organizations worried about what went out. Now, with GDPR and ever-larger security breaches, companies must take great care to ensure that PII and other sensitive data is not freely transferred beyond the security perimeters. Access and process controls are critical, but tools like data masking, encryption, and anonymization can also play an important role in ensuring that enterprise data can circulate without creating business risk.
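Here is a minimal Python sketch of masking data on its way out of the secure zone. The field names, the `prepare_extract` function, and the hard-coded key are assumptions for illustration; a real deployment would keep the key in a secrets manager. Note also that a keyed hash is pseudonymization rather than full anonymization, since anyone holding the key can re-link the values.

```python
import hashlib
import hmac

# Assumption: in practice this key would come from a managed secret store.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str) -> str:
    """Replace an identifier with a stable keyed hash (truncated for readability)."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}" if domain else "***"


def prepare_extract(record: dict) -> dict:
    """Return a copy of the record that is safer to send outside the perimeter."""
    safe = dict(record)
    safe["customer_id"] = pseudonymize(record["customer_id"])
    safe["email"] = mask_email(record["email"])
    safe.pop("ssn", None)  # drop fields the recipient never needs
    return safe


# Made-up record for the example.
record = {
    "customer_id": "C001",
    "email": "jane.doe@example.com",
    "ssn": "123-45-6789",
    "country": "DE",
}
extract = prepare_extract(record)
print(extract)
```

Because the same input always yields the same pseudonym, downstream analysts can still join and count by customer without seeing the real identifier.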
4. Embrace the robots (i.e., RPA is coming to data quality!)
As robotic process automation (RPA) matures, it will surely be applied to data governance, automating many of the most labor-intensive processes, such as validating matches and monitoring quality. With the right systems in place, humans can spend their time fixing deeper problems at their root cause – other humans.
This is an exciting time to reframe how organizations ingest, improve, use, and dispose of their data. Improving data quality while enhancing security benefits everyone.
If you’d like to discuss your company’s data governance strategy, register for our Let’s Talk Data webcast to learn more and join in the live Q&A – I’d love to hear from you.