‘Having the ability to link data is just the first step: let’s start linking research to data’

This article was originally published on Croakey.

Researchers from the Centre for Health Policy at The University of Melbourne have conducted a systematic review to see how linked hospital data has been used for research purposes across Australia over the past two decades.

They found that linked data has produced significant, high-value research, but that output is very uneven between states, for a variety of reasons.

In the article below, timely for both the ongoing debate about Census data and the current public inquiry by the Productivity Commission on Data Availability and Use, they warn that Australia should not fall into the trap of developing more data linkage infrastructure without also investing in improving access and in researchers to realise its full potential.

Professor Philip Clarke, Dr Kim Dalziel, Dr Dennis Petrie and Michelle Tew write:

Over the past few weeks there has been considerable public debate about the collection of personal information in the Census to permit linking this with other administrative information. While there has been significant focus on the privacy risk of collecting and linking such data, largely absent from this debate has been researchers extolling the benefits of linking data to answer a wide range of significant health and medical questions.

Having the ability to link data is only the first step to realising the “value” of such data. Here we focus on Australia's progress in turning data into research publications, using evidence from our recent study, Growth of linked hospital data use in Australia: a systematic review.

Data linkage of hospital records commenced in Western Australia in the 1970s, but its adoption by other states has been relatively slow. To facilitate national linkage capabilities, the Population Health Research Network (PHRN) was established in 2009. Since then, significant public investments of over $93 million have funded initiatives to build data linkage infrastructure and to establish a data linkage unit in each state.

Our study undertook a systematic review of published research to identify Australian studies utilising linked hospital data. The extracted information was coded by the state in which the data linkage was conducted, the type of linkage used and area of research addressed.

Linked data of this kind has produced significant, high-value research. For example, data linkage has played an important role in the 45 and Up Study, an ongoing longitudinal study, by providing individual patient data on hospital, pharmaceutical and health resource use. This has helped researchers answer a variety of research questions on hospitalisations due to specific infections, use of general practice, and healthcare costs at end of life.

Variation across the states

As the figure below shows, there is a great deal of variation across Australian states in publication outputs using linked hospital data. The overwhelming majority (83 per cent) of the published literature comes from just two states, Western Australia and New South Wales. The breakdown of PHRN funding indicates that, with the exception of Tasmania, all Australian states each received a total of $5–9.5 million over recent years. Yet the publication outputs using linked data from the other states are modest. For example, in 2014, there was an average of only three publications per year across South Australia and the Northern Territory, Queensland and Victoria, and there were no identified publications using Tasmanian linked hospital data.


The variation in research output across states provides an opportunity to learn and to understand why linkage does not routinely involve multiple states. A recent study assessing the feasibility of cross-jurisdictional linkage indicates that common complaints among researchers wanting to use data linkage were the extensive and duplicative paperwork needed for ethics, data custodian and data linkage unit applications, the lack of an informative timeline, lengthy delays and financial barriers.

The more productive data linkage units, such as those in WA and NSW, charge for their data linkage services, which may indicate the need for resources to support research and skilled personnel to provide linked data. Additionally, these productive units appear to have clear, established guidelines and processes available to guide researchers, as well as significant additional funding for studies, such as the 45 and Up Study, that enhance the existing administrative data.

In the states with a low publication output, it is hard to determine whether this is due to the failure of the linkage units to supply data or to insufficient demand from researchers for linked datasets. Beyond developing the infrastructure for comprehensive administrative datasets, there is clearly a need to provide training and funding for analysis if data linkage is to translate into research publications and further impact.

Looking to the future, the Australian Department of Health has just released a linked 10% sample of administrative health data from Medicare and pharmaceutical benefit records. This amounts to providing researchers with access to around 1 billion lines of de-identified health use data involving around 3 million Australians. This has the potential to answer many important questions in Australia and provides momentum for greater recognition of the value of linked data.

A further expansion of the use of de-identified linked health data that can answer research questions would provide a much better evidence base for monitoring current policies and developing new ones. In Scotland it is possible to link a 5% semi-random sample of the Census to other administrative datasets, including hospitalisations and cancer registries (Scottish Longitudinal Study). Such linkage, among many other things, has permitted the monitoring of health inequalities, spatial variations in care and the impact of policy changes.

Linked data can be regarded very much like a natural resource such as iron ore: it has to be accessed and transformed by researchers into something clinicians and policy makers can use to inform practice and policy. Australia should not fall into the trap of developing more data linkage infrastructure without also investing in improving access and in researchers to realise its full potential.

About the authors:

Professor Philip Clarke is the Head of the Health Economics Unit at the Centre for Health Policy within the Melbourne School of Population and Global Health, University of Melbourne. His health economic research interests include developing methods to value the benefits of improving access to health care, health inequalities and the use of simulation models in health economic evaluation.

Dr Kim Dalziel is a Senior Research Fellow and recipient of a McKenzie Fellowship. She has considerable expertise in modelling health interventions, including for regulatory authorities such as PBAC and MSAC in Australia and for the National Institute for Health and Clinical Excellence (NICE) in the UK. She specialises in paediatric health economic evaluation and health services research.

Dr Dennis Petrie is a Senior Research Fellow in the Centre for Health Policy, University of Melbourne, and holds a Discovery Early Career Researcher Award from the Australian Research Council. He has experience in analysing large linked datasets from Scotland and Sweden and has worked on the longitudinal measurement and evaluation of health inequalities, the economics of drugs and alcohol, improving the safety and quality of prescribing, and the economics of adherence. He has also been involved in a number of economic evaluations of healthcare interventions.

Michelle Tew is a Health Economics Research Assistant at the Centre for Health Policy, The University of Melbourne. She has worked on a number of costing studies utilising hospital administrative data including analysing costs associated with Caesarean section for overweight women and in the implementation of a sepsis protocol pathway.
