Emerging Technologies Enabling Data-Driven Policy And Practice

Ryan van Leent

In October 2017, the Melbourne Institute released findings from a working paper series on intergenerational disadvantage, which show that young people are almost twice as likely to need social assistance if their parents are on benefits. While this conclusion will be unsurprising to those with experience in social protection, it is a landmark study in terms of the data analyzed and the analytics approach applied. Indeed, at the time the study was initiated, the technologies that would ultimately facilitate the analysis did not yet exist. Now, with emerging predictive analytics, machine learning, and real-time computing technologies, there is an unprecedented opportunity for data-driven policy and program insights to deliver better social and economic outcomes.

Part of what makes the Melbourne Institute’s findings unique is that the study was conducted against a full dataset, not just a sample, which makes its evidence of intergenerational disadvantage in Australia unusually strong. The Department of Social Services’ Transgenerational Data Set (TDS) provides access to the records of 124,285 Australians born between October 1987 and March 1988, 98% of whom could be matched to their primary carers and therefore included in the study. A longitudinal analysis is being conducted over 18 years and has already been applied across 126 million fortnightly social assistance payments, with transaction data currently available for these young Australians through age 26. This is an excellent use case for Big Data analytics within a public sector context.

Big Data analytics enables us to challenge preconceptions that might have been formed by a limited view of the data. In the case of the Melbourne Institute’s study, the data argues against the notion of a widespread welfare culture in which values are shaped and disadvantage becomes increasingly entrenched. Rather, the data shows that disadvantage caused by circumstance (e.g., disability) is much harder to overcome than that caused by personal choice.

Big Data analytics also has the potential to provide new insights across datasets, enabling us to develop a more complete understanding of people and their circumstances. Again, in the case of the Melbourne Institute’s study, the data shows a strong cross-program correlation across the spectrum of social benefits. This is particularly pronounced in the case of parental mental health disability, which is identified as having a broad range of consequences for young people who take on the burden of caring for their parents.

Now, three emerging technologies have the potential to shape how we consume Big Data and how we might apply new insights to deliver better social and economic outcomes:

  1. Predictive analytics is a form of advanced analytics that uses both new and historical data to forecast future activity, behavior, and trends. It encompasses a range of statistical techniques used to predict the probability of certain outcomes for individuals, based on observed patterns in historical data of people with a similar profile. These predictions can be applied to inform and influence policy decisions, such as identifying high-risk cases in a child protection scenario and prompting early intervention to prevent child abuse and neglect.
  2. Machine learning is a type of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes without being explicitly programmed. It extends predictive analytics through the computational exploration of correlations between sample inputs and known outputs, which can be used to refine predictive models over time. This approach can be applied to optimize service plans by uncovering hidden patterns in the data and proposing interventions with the highest probability of delivering the desired outcomes in a given circumstance.
  3. Real-time computing is the use of, or the capacity to use, data and related resources as soon as the data enters the system. It enables analytical techniques to be applied at the point of service, in the operational system, thereby reducing the lag time traditionally associated with data warehousing. This capability is key to making the analytics proactive rather than reactive so that, for example, new customers can be segmented based on their risk profile at the time of intake.
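To make these three ideas concrete, here is a minimal sketch in Python of how they might fit together. It is an illustration under stated assumptions, not a description of any system mentioned in this article: the data is synthetic, the two features are hypothetical, and the names `OnlineRiskModel` and `segment` and the 0.5 threshold are invented for the example. A simple logistic regression estimates the probability of an outcome (predictive analytics), its weights are refined one record at a time as outcomes are observed (machine learning), and a new case is scored and segmented the moment it arrives (real-time computing).

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineRiskModel:
    """Logistic regression updated one record at a time (stochastic gradient descent)."""

    def __init__(self, n_features: int, learning_rate: float = 0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        """Predicted probability of the outcome for one case."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def update(self, features, outcome):
        """Refine the model with one observed outcome (1 = occurred, 0 = did not)."""
        error = self.predict_proba(features) - outcome
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


def segment(probability: float, threshold: float = 0.5) -> str:
    """Assign a case to a segment at intake. The threshold is an arbitrary assumption."""
    return "high-risk" if probability >= threshold else "standard"


# Synthetic "historical" records with two hypothetical binary features.
rng = random.Random(0)


def synthetic_record():
    parent_on_benefit = rng.randint(0, 1)
    carer_disability = rng.randint(0, 1)
    # Assumed relationship, used only to generate example data.
    true_p = sigmoid(-1.5 + 1.0 * parent_on_benefit + 1.5 * carer_disability)
    outcome = 1 if rng.random() < true_p else 0
    return [parent_on_benefit, carer_disability], outcome


model = OnlineRiskModel(n_features=2)
for _ in range(5000):
    features, outcome = synthetic_record()
    model.update(features, outcome)

# Score two new cases as they "arrive" at intake.
low = model.predict_proba([0, 0])
high = model.predict_proba([1, 1])
print(f"no risk factors: {low:.2f} ({segment(low)})")
print(f"both risk factors: {high:.2f} ({segment(high)})")
```

A production model would need far richer data, careful validation, and attention to fairness and privacy. The point of the sketch is only that the same model can keep learning from outcomes while also scoring each new case at the point of service.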

One U.S. state government agency is a leading adopter of emerging technologies through its Management and Performance Hub (MPH). The MPH is a real-time computing platform with predictive analytics capabilities. In 2014-15, the state successfully piloted the MPH to statistically quantify the importance of risk factors driving a persistently high infant mortality rate. The pilot project applied predictive analytics to 9 billion rows of data across 50+ datasets to establish correlations and causal links among previously unknown risk factors, which enabled the state to identify subpopulations with underlying drivers of infant mortality. As a result, the state secured an additional $13.5 million budget appropriation for new programs targeting early intervention for high-risk cohorts. Having established an enterprise-wide data analytics asset, the state has recently applied the MPH to combating the opioid epidemic and reducing criminal recidivism.

In another example, an Australian government agency has recently completed an eight-week trial of machine learning technology. The government’s strategic objective is to reduce citizen debt propensity by enabling earlier notice and targeted interventions through root cause analysis. The purpose of the trial was to apply machine learning to a vast array of data to provide early indicators of customers who may not have the capacity to pay compulsory contributions. In the trial, 187 million transaction records were analyzed across 97,000 customers, and the prototype achieved a 71% debtor prediction rate after four weeks of training. Implementing this capability on a real-time computing platform will enable the agency to build dynamic risk profiles that can inform evidence-based decision-making and customer segmentation. The agency continues to refine the machine learning model and intends to apply this capability to reduce total liabilities.
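A figure such as a "debtor prediction rate" can be read in more than one way, and the trial's exact definition is not given here. The sketch below, in Python, shows three common candidates computed from the same predictions: recall (the share of actual debtors who were flagged), precision (the share of flagged customers who became debtors), and accuracy (the share of all customers classified correctly). The function name `debtor_metrics` and the eight-customer sample are hypothetical and are not data from the trial.

```python
def debtor_metrics(predicted, actual):
    """Compare predicted debtor flags with observed outcomes (1 = debtor, 0 = not)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # flagged, became debtor
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # flagged, did not
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # missed debtor
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # correctly not flagged
    return {
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical sample of eight customers.
predicted = [1, 1, 1, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 1, 0, 1, 0]

metrics = debtor_metrics(predicted, actual)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this sample the three measures differ (0.75, 0.6, and 0.625), which is why it matters which one a headline rate refers to when results from different models or trials are compared.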

Emerging technologies have the potential to substantially automate and significantly improve the business processes associated with needs analysis, risk assessment, service planning, service delivery, and outcome tracking. In some cases, as has been demonstrated by the Melbourne Institute’s study into intergenerational disadvantage, we already have the data and the ability to analyze it retrospectively. The opportunity is to leverage this and apply today’s emerging technologies, as demonstrated by the Management and Performance Hub in the U.S. and the Australian government’s machine learning prototype, to deliver better social and economic outcomes for future generations.

About Ryan van Leent

Ryan van Leent is a member of Global Solution Management for Public Sector at SAP. He is responsible for solutions that cover social protection, debt collection management, and fraud and compliance. Ryan is the author of the SAP reference architecture for the social protection industry solution and has a key role in defining the solution road map. He is an active member of the Institute for Digital Government at SAP and a frequent contributor to social media discussions on digital transformation and data-driven government practices.