Privacy and the use of health data for research

Christine M O'Keefe and Chris J Connolly
Med J Aust 2010; 193 (9): 537-541. || doi: 10.5694/j.1326-5377.2010.tb04041.x
Published online: 1 November 2010

Analysis of the vast archives of clinical and health system data can yield information that is vital to effective health policy development and evaluation. It can also lead to enhanced clinical care through evidence-based practice and safety and quality monitoring. However, analysis must be conducted in such a way that standards of privacy and confidentiality are not compromised for individual health care consumers. In recognition of Australia’s international leadership in the scope and extent of health-related data collected at the population level, the Population Health Research Network (PHRN) has been established to provide Australian researchers with access to linkable, de-identified data from a wide range of health data sets, across jurisdictions and sectors.

The use of sophisticated data analysis and data-mining tools can increase the risks of privacy breaches occurring.1-7 This topic is becoming more important in the context of the PHRN investments designed to improve accessibility to Australian health-related data for the research sector.8

In this article, we review privacy regulation and privacy perception in Australia, in the context of health data used in research. We also provide a summary of resources for researchers interested in privacy issues related to secondary use of health data.

Privacy regulation in Australia
General privacy laws

The general privacy legislation currently in place in Australia is shown in Box 1. Key issues arising from this legislation include:

  • Definition of personal information. There are some minor inconsistencies between the definitions of personal information in the different Acts. The definitions are regularly tested in privacy complaints.9

  • Definitions of use and disclosure. Within the privacy regulation framework, there are different provisions for use and disclosure, although it can be difficult to determine whether a given scenario involves use, disclosure or both.

  • Consent for disclosure. There are considerable inconsistencies in federal and state privacy regulation surrounding consent for disclosure.10-12 Strict application of the consent provisions in some cases has forced researchers to seek alternative methods of access to data without triggering consent provisions.

  • De-identification. Some privacy laws include specific provisions for de-identification and de-identified data, limited to certain types of research.

Health-specific privacy laws

The main health-specific privacy laws are shown in Box 2. Health information is represented differently in different laws: it is included either in personal information or sensitive information, or is defined separately.

Other legislative privacy requirements

Enforceable guidelines provide an additional layer of privacy regulation for health research without consent. The key federal guidelines are those approved under section 95 (for medical research) and section 95A (for national privacy principles about health information) of the Privacy Act 1988.

An analysis conducted by the National Health and Medical Research Council (NHMRC) of the use of these guidelines in practice13 found:

  • Among consumers, there was a low level of awareness of privacy legislation and people had difficulty distinguishing between “confidentiality” and “privacy”. Consumers were uncertain about providing consent for the use of their data.

  • Health professionals tended to equate confidentiality with privacy and always maintained patient confidentiality.

  • Researchers reported difficulty in accessing registries and inconsistencies in decisions made by human research ethics committees regarding access and disclosure.

  • Data custodians believed that there was no need for researchers to have access to identified data and felt that researchers obtained the same benefit from de-identified information.

  • Ethics committees believed that interpreting privacy legislation was complex, and they were most strongly opposed to researchers having access to health information without consent.

  • The research community has had some difficulty in using the guidelines, and the initial test for compliance rests with ethics committees that appear to have applied the test inconsistently.

Australian Law Reform Commission Review of the Privacy Act

The final report of the Australian Law Reform Commission (ALRC) Review of the Privacy Act was delivered on 30 May 2008,14 and the government released the first stage of its response on 14 October 2009.15 Exposure draft legislation containing an important element of the first stage response, the proposed Australian privacy principles, was released on 24 June 2010.16

In response to the ALRC recommendations 6-1 to 6-7, there are likely to be revisions to definitions in the Privacy Act, including that in recommendation 6-1, to define “personal information” as “information or an opinion, whether true or not, and whether recorded in a material form or not, about an identified or reasonably identifiable individual”. The Office of the Privacy Commissioner is likely to publish guidance on the meaning of terms such as identified and reasonably identifiable.

Regarding health research, recommendation 65-1 is likely to give rise to the replacement of the various existing guidelines on privacy and research by a formal set of research rules issued primarily by the NHMRC.

Other key accepted recommendations can be summarised as follows:

  • Recommendations 65-2 and 65-3. “Research” should be extended to include human research more generally and the compilation or analysis of statistics.

  • Recommendation 65-6. When a research proposal seeks to rely on the research exceptions in the Privacy Act, it must be reviewed and approved by a human research ethics committee.

  • Recommendation 66-1. The research rules issued by the NHMRC should address the collection, use or disclosure of personal information without consent for inclusion in a database or register for research purposes, and should make clear that approval to establish such a database does not extend to future unspecified uses.

  • Recommendation 66-3. The research rules issued by the NHMRC should address the circumstances and conditions under which it is appropriate to collect, use or disclose personal information without consent in order to identify potential participants in research.

Privacy perception in Australia — the evidence

In this section we review available evidence of community attitudes and public perceptions regarding privacy in the context of using health data for research, focusing on de-identification, consent and participation.


De-identification

In reporting on Australian Government Department of Health and Ageing (DoHA) qualitative research, Taylor17 noted that “consumers are not familiar with the term ‘de-identified data’ and even when it’s explained to them, it’s a concept that they are not all that comfortable with”.

In a poll conducted by the Australian Medical Association (AMA) in 2005,18 60% of respondents reported that they were slightly or very concerned about the de-identification process.


Consent

Surveys by the Office of the Privacy Commissioner in 2001 and 2004 found that about 64% of respondents said consent should be sought for the use of de-identified data for research, while 33% said that use without consent was fine.19,20 In the 2007 survey, 51% said that consent should be sought, while 46% said that consent should not be sought.21

Similarly, the DoHA research17 found that consumers supported the use of data in research and registers, provided the data were de-identified and the purpose was legitimate and worthwhile. If identified data were to be used, consumers expected to be informed and their consent to be sought.

In the contrasting AMA poll,18 about 80% of respondents thought that their doctor should ask permission before allowing their de-identified data to be used for medical research, government purposes or commercial purposes. The comments provided suggest that some respondents may have overlooked the fact that the survey was only about de-identified data.

Comparative results can be seen in the United States National Consumer Health Privacy Survey 200522 and an Australian perspective is provided by the Australian Consumer Association.23

It is interesting to compare individuals’ preferences for consent to the use of de-identified data for research with the legislation which permits the use of de-identified and sometimes even identifiable health data for research without consent, under certain conditions.


Participation

The use of an individual’s health data for research can be viewed as participation by that individual in the research. An individual may have an objection to the purpose of the research on moral grounds even when there is no risk of identification or personal consequences.24

The AMA poll18 found that 67% of respondents would give permission for their de-identified data to be used for research, 45% would give permission for government purposes and 32% would give permission for commercial purposes, showing that some participation concerns existed for a significant number of respondents.25

De-identification, consent and bias

De-identification is a complex issue surrounded by a lack of standard terminology and clarity. However, it is important because it underpins many health information privacy regulations.

First, it is often not clear what is meant when the term “de-identified” is used to refer to data. Sometimes it appears to mean simply that nominated identifiers such as name, address, date of birth and Medicare number have been removed from the data. At other times its use appears to imply that individuals represented in a dataset cannot be identified from the data, although it can also be unclear what this means. Simply removing nominated identifiers is often insufficient to ensure that individuals represented in a dataset cannot be identified. It can be straightforward to match some of the available data fields with the corresponding fields from external datasets, and thereby obtain enough information to determine individuals’ names either uniquely or with a low degree of uncertainty. This is particularly true of health information or of information which contains times and/or dates of events.
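The matching step described above can be sketched in a few lines of Python. This is a minimal illustration only: the records, the field names and the helper functions `link` and `unique_matches` are hypothetical and are not drawn from any real dataset or from the studies cited in this article. It assumes exact matching on shared fields, which is simpler than the probabilistic linkage used in practice.

```python
# Hypothetical toy data: every name, field and value is invented for illustration.

# A release with nominated identifiers (name, address) removed,
# but with demographic fields and an event date retained.
released = [
    {"postcode": "2600", "birth_year": 1950, "sex": "F",
     "admitted": "2009-03-02", "diagnosis": "asthma"},
    {"postcode": "2600", "birth_year": 1950, "sex": "F",
     "admitted": "2009-07-19", "diagnosis": "fracture"},
    {"postcode": "2000", "birth_year": 1982, "sex": "M",
     "admitted": "2009-05-11", "diagnosis": "diabetes"},
]

# An external dataset that still carries names alongside the same fields.
external = [
    {"name": "Person A", "postcode": "2600", "birth_year": 1950, "sex": "F",
     "admitted": "2009-03-02"},
    {"name": "Person B", "postcode": "2600", "birth_year": 1950, "sex": "F",
     "admitted": "2009-07-19"},
    {"name": "Person C", "postcode": "2000", "birth_year": 1982, "sex": "M",
     "admitted": "2009-05-11"},
]


def link(released, external, keys):
    """Map each released record's index to the sorted names of external
    records that agree with it exactly on every field in `keys`."""
    matches = {}
    for i, rec in enumerate(released):
        matches[i] = sorted(
            person["name"]
            for person in external
            if all(person.get(k) == rec[k] for k in keys)
        )
    return matches


def unique_matches(matches):
    """Keep only the released records that match exactly one name."""
    return {i: names[0] for i, names in matches.items() if len(names) == 1}


demographics = ["postcode", "birth_year", "sex"]
by_demographics = link(released, external, demographics)
by_demographics_and_date = link(released, external, demographics + ["admitted"])
```

In this toy example, matching on the three demographic fields alone identifies one of the three released records uniquely and narrows the other two to two candidates each. Adding the event date identifies all three uniquely, which is the sense in which dates of events make data easier to re-identify.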

In Australia, the National statement on ethical conduct in human research26 avoided the term “de-identified data” because its meaning is unclear. Instead, it proposed that data may be collected, stored or disclosed in three mutually exclusive forms: individually identifiable, re-identifiable, and non-identifiable. One problem with this approach is that some datasets do not fit into any of the defined categories.

In contrast, the US Health Insurance Portability and Accountability Act 1996 (HIPAA) provides a useful legislative test for de-identification that gives certainty to the research community and to ethics committees.

Considering the issues surrounding the concept of de-identification in the Australian context would help to address some of the concerns highlighted by the NHMRC analysis,13 outlined under “Other legislative privacy requirements” above. The de-identification test contained in the HIPAA is a useful example of a legislative test that provides certainty for the research community. On the other hand, there may be a significant burden of compliance: if an organisation has many datasets then it would take a great deal of time for a person to perform the tasks outlined.
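To give a sense of what a rule-based test of this kind involves, the sketch below applies a few of the rules from the "Safe Harbor" method of the HIPAA Privacy Rule: direct identifiers are dropped, dates are reduced to the year, ZIP codes are cut to their first three digits (or replaced with 000 for sparsely populated areas), and ages of 90 or more are pooled. It is a partial illustration under stated assumptions, not a compliance tool. The field names, the `safe_harbor_style` function and the `populous_zip3` parameter are hypothetical, and the full method covers many more identifier types than are handled here.

```python
# Hypothetical field names; a real dataset would need its own mapping.
DIRECT_IDENTIFIERS = {
    "name", "street_address", "phone", "email", "ssn", "medical_record_number",
}


def safe_harbor_style(record, populous_zip3=frozenset()):
    """Return a copy of `record` with a few Safe Harbor-style rules applied.

    `populous_zip3` is an assumed, caller-supplied set of three-digit ZIP
    prefixes covering areas with more than 20,000 people.
    """
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if field == "zip":
            prefix = value[:3]
            # Keep the prefix only for populous areas; otherwise use 000.
            out["zip3"] = prefix if prefix in populous_zip3 else "000"
        elif field.endswith("_date"):
            # Assumes ISO 'YYYY-MM-DD' strings; keep the year only.
            out[field[: -len("_date")] + "_year"] = value[:4]
        elif field == "age":
            # Pool ages of 90 and over into a single category.
            out["age"] = "90+" if value >= 90 else value
        else:
            out[field] = value
    return out


example = {
    "name": "Person A",
    "zip": "02139",
    "admission_date": "2009-03-02",
    "age": 94,
    "diagnosis": "asthma",
}
cleaned = safe_harbor_style(example, populous_zip3={"021"})
```

Even this reduced version shows where the compliance burden comes from: each dataset needs its fields mapped to the rules before any of them can be applied.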

Consent and bias

Bias refers to the distortion of study results due to flaws in design or analysis. There is concern and some evidence that selection effects from consent processes lead to bias in research results.

Some investigations have been done on the possibility that consent processes may lead to bias in the makeup of study groups, and that this in turn may jeopardise the quality and applicability of the results. Woolf and colleagues27 concluded that:

Patients who release personal information for health services research differ in important characteristics from those who do not ... older patients and those in poorer health were more likely to grant consent. Quality and health services research restricted to patients who give consent may misrepresent outcomes for the general population.

With regard to population health, Stanley28,29 has stated that:

The advantage of population record linkage [without consent], from an epidemiological perspective, is that it is not biased and no-one is excluded. This relates to human rights because generally the people who are excluded from studies are the most marginalised. The results are useful for the whole population.
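The selection effect described by Woolf and colleagues can be illustrated with simple arithmetic. The sketch below is a minimal example under invented assumptions: the two groups, their sizes, consent rates and outcome rates are hypothetical numbers chosen only to show the direction of the effect, and the function `outcome_rate` is not taken from any cited study.

```python
# Hypothetical population: all figures are invented for illustration.
groups = [
    {"label": "younger, better health", "size": 6000,
     "consent_rate": 0.3, "outcome_rate": 0.05},
    {"label": "older, poorer health", "size": 4000,
     "consent_rate": 0.6, "outcome_rate": 0.20},
]


def outcome_rate(groups, consenters_only=False):
    """Expected outcome rate in the whole population, or among the
    expected number of consenters when `consenters_only` is True."""
    people = 0.0
    cases = 0.0
    for g in groups:
        members = g["size"] * (g["consent_rate"] if consenters_only else 1.0)
        people += members
        cases += members * g["outcome_rate"]
    return cases / people


population_rate = outcome_rate(groups)                      # 1100 / 10000
consented_rate = outcome_rate(groups, consenters_only=True)  # 570 / 4200
```

With these assumed figures the outcome rate is 11% in the whole population but about 13.6% among consenters, because the group with the higher outcome rate is also the group more likely to consent. A study restricted to consenters would overstate the rate without any error in measurement.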

Balancing privacy and research

There is no fundamental disagreement in the literature that the rights of the individual with respect to privacy need to be balanced against the public interest in the outcomes of health research. However, there is a range of views on where the appropriate balance lies.30-32 A mutually satisfactory balance for consumers and the community is likely to be achieved by a combination of policy-centric33 and technology-centric34 measures.

There are at least four arguments that excessive privacy regulation has a negative effect on public health research. These include arguments about efficiency and quality of health research, the risk of avoidable harm to research subjects and the interests of the community versus those of the individual.

Efficiency of health research

The perception is that overheads resulting from privacy regulation hamper research efficiency in Australia.35,36 ALRC recommendations 65-3, 65-6 and 66-2 potentially exacerbate this situation.

Quality of health research

The fear is that selection effects from privacy-related processes including consent will lead to results bias. ALRC recommendations 66-1 and 66-3 (that organisations developing systems “to allow the linkage of personal information for research purposes should conduct a Privacy Impact Assessment”) were accepted in principle and potentially exacerbate this situation.

Risk of avoidable harm to research subjects

Avoidable harm may be caused to research subjects if they are exposed to sensitive medical information during overt data collection. For example, a request for consent to link treatment records with cancer registries could cause anxiety.37

Interests of the community versus those of the individual

There is a perception that excessive privacy regulation denies the community the full potential benefits of health research based on more complete data. The moral dimension of this work has been addressed directly by Australian researchers as follows:

The examples provided demonstrate that only complete population data obtained by such linkage is inclusive of all those often underrepresented or excluded in many studies ...28

This relates to human rights because generally the people who are excluded from studies are the most marginalised.29

How does the ethics committee, or privacy officer in an organisation interpret [the Privacy Act’s public interest exceptions to consent gathering]? You might expect that the ethical considerations would determine the outcome. However, it is more likely that the overriding consideration will be legal liability.35

Discussion and conclusion

There is little evidence of privacy complaints or breaches in health research. However, privacy regulation and privacy perception are both key factors in the health research context, acting as potential restraints on some types of research that could deliver considerable public benefit. Further, significant concerns regarding consent and de-identification remain in the community. In particular, the proportion of individuals who believe that consent should be required even where information is de-identified is likely to remain at significant levels (perhaps somewhere between one-quarter and one-third of the population) for some time to come.

Will these community concerns impact on health research? Ultimately, decisions on research are made by ethics committees applying guidelines that allow some balance between privacy and research. The decision is therefore taken out of the hands of individual consumers, but these community concerns help to shape privacy regulation and will have an indirect influence on the decisions of ethics committees.

Under the changes proposed by the ALRC, a single set of formal research rules issued by the Privacy Commissioner will guide all decisions by ethics committees. This may lead to improved consistency in outcomes that attempt to balance privacy rights with the public interest.

The ALRC recommendations also leave room for technical solutions to play an increased role in allowing personal information to be de-identified for research purposes. Recent advances in the techniques for de-identifying personal information34 provide some hope that de-identification can occur without a negative impact on data quality.

1 General privacy legislation currently in place in Australia

Jurisdiction                  Privacy legislation                                   Regulator
Commonwealth                  Privacy Act 1988 (Cwlth)                              Federal Privacy Commissioner
Australian Capital Territory  Privacy Act 1988 (Cwlth)                              Federal Privacy Commissioner
New South Wales               Privacy and Personal Information Protection Act 1998  NSW Privacy Commissioner
Northern Territory            Information Act 2002                                  NT Information Commissioner
Queensland                    Information Privacy Act 2009                          QLD Information Commissioner
South Australia               Cabinet Administrative Instruction 1/89 2009          Privacy Committee of South Australia
Tasmania                      Personal Information Protection Act 2004              Ombudsman Tasmania
Victoria                      Information Privacy Act 2000                          Victorian Privacy Commissioner
Western Australia             No laws                                               Not applicable

Cwlth = Commonwealth.

2 Health privacy legislation currently in place in Australia

Jurisdiction                  Health privacy legislation                       Regulator
Commonwealth                  Privacy Act 1988 (Cwlth)                         Federal Privacy Commissioner
Australian Capital Territory  Health Records (Privacy and Access) Act 1997     Community and Health Services Complaints Commissioner
New South Wales               Health Records and Information Privacy Act 2002  Public sector: internal review; private sector: Privacy NSW
Northern Territory            None currently in place                          Not applicable
Queensland                    Information Privacy Act 2009                     Health Quality and Complaints Commission
South Australia               Code of Fair Information Practice                Not applicable
Tasmania                      None currently in place                          Not applicable
Victoria                      Health Records Act 2001                          Health Services Commissioner
Western Australia             None currently in place                          Not applicable

Cwlth = Commonwealth.

Received 8 October 2009, accepted 3 May 2010

  • Christine M O'Keefe1
  • Chris J Connolly2

  • 1 Preventative Health National Research Flagship, CSIRO Mathematics, Informatics and Statistics, Canberra, ACT.
  • 2 Galexia, Sydney, NSW.



Most of this work was conducted as a project for the Preventative Health National Research Flagship, and we thank the other project stakeholder representatives who gave valuable feedback. Additional research and guidance was provided by Galexia consultants Peter van Dijk and Francis Vierboom.

  • 1. Gomatam S, Karr AF, Reiter JP, et al. Data dissemination and disclosure limitation in a world without microdata: a risk-utility framework for remote access servers. Stat Sci 2005; 20: 163-177.
  • 2. O’Keefe CM, Good NM. Regression output from a remote analysis server. Data Knowl Eng 2009; 68: 1175-1186.
  • 3. Reiter JP. Model diagnostics for remote-access regression servers. Stat Comput 2003; 13: 371-380.
  • 4. Reiter JP, Kohnen CN. Categorical data regression diagnostics for remote servers. J Stat Comput Sim 2005; 75: 889-903.
  • 5. Reznek AP. Recent confidentiality research related to access to enterprise microdata. Washington, DC: US Census Bureau, Center for Economic Studies, 2006. (accessed Sep 2010).
  • 6. Sparks R, Carter C, Donnelly J, et al. A framework for performing statistical analyses of unit record health data without violating either privacy or confidentiality of individuals. Proceedings of the 55th session of the International Statistical Institute, Sydney, 2005 (CD-ROM).
  • 7. Sparks R, Carter C, Donnelly J, et al. Remote access methods for exploratory data analysis and statistical modelling: privacy-preserving analytics. Comput Methods Programs Biomed 2008; 91: 208-222.
  • 8. Australian Government Department of Innovation, Industry, Science and Research. Strategic roadmap for Australian research infrastructure. Canberra: Commonwealth of Australia, 2008. (accessed Sep 2010).
  • 9. Office of the Privacy Commissioner. Privacy Commissioner concludes investigation into CAMM Pacific and Health Communications Network Limited [media release]. (accessed Sep 2010).
  • 10. New South Wales Administrative Decisions Tribunal Appeal Panel. Vice Chancellor, Macquarie University v FM [2003] NSWADTAP 43. (accessed Sep 2010).
  • 11. New South Wales Administrative Decisions Tribunal. KJ v Wentworth Area Health Service [2004] NSWADT 84. (accessed Sep 2010).
  • 12. New South Wales Administrative Decisions Tribunal. MT v Director General, NSW Department of Education and Training [2004] NSWADT 194. (accessed Sep 2010).
  • 13. Hill D. Consumer attitudes to privacy in an e-health environment [speech]. Proceedings of the National Health Information Summit: transforming Australia’s health system; 2004 Dec 6-7; Melbourne. Canberra: Australian Health Information Council, 2005.
  • 14. Australian Law Reform Commission. For your information: Australian privacy law and practice (ALRC Report 108). Sydney: Commonwealth of Australia, 2008. (accessed Oct 2010).
  • 15. Australian Government Department of the Prime Minister and Cabinet. Enhancing national privacy protection: Australian Government first stage response to the Australian Law Reform Commission report 108. Canberra: DPMC, 2009. (accessed Sep 2010).
  • 16. Parliament of Australia, Senate. Exposure drafts of Australian privacy amendment legislation. Canberra: Senate Finance and Public Administration Committee, 2010. (accessed Sep 2010).
  • 17. Taylor J. Consumer attitudes to privacy in an e-health environment [speech]. Proceedings of the National Health Information Summit: transforming Australia’s health system; 2004 Dec 6-7; Melbourne. Canberra: Australian Health Information Council, 2005.
  • 18. Australian Medical Association. AMA poll shows patients are concerned about the privacy and security of their medical records. Canberra: AMA, 2005. (accessed Sep 2010).
  • 19. Office of the Federal Privacy Commissioner. Privacy and the community. Sydney: OFPC, 2001. (accessed Oct 2010).
  • 20. Office of the Federal Privacy Commissioner. Community attitudes towards privacy 2004. Sydney: OFPC, 2004. (accessed Oct 2010).
  • 21. Office of the Privacy Commissioner, Australia. Community attitudes to privacy 2007. Sydney: OFPC, 2007. (accessed Oct 2010).
  • 22. Forrester Research. National Consumer Health Privacy Survey 2005. Oakland, CA: California HealthCare Foundation, 2005. (accessed Sep 2010).
  • 23. Ballenden N. Transforming Australia’s health system [speech]. Proceedings of the National Health Information Summit: transforming Australia’s health system; 2004 Dec 6-7; Melbourne. Canberra: Australian Health Information Council, 2005.
  • 24. Privacy in Research Ethics and Law. Recommendations from PRIVIREAL to the European Commission. Sheffield, UK: PRIVIREAL, 2003. (accessed Sep 2010).
  • 25. Newell C. Health information management and information communications technology for public health and population research: a consumer perspective [speech]. Proceedings of the National Health Information Summit: transforming Australia’s health system; 2004 Dec 6-7; Melbourne. Canberra: Australian Health Information Council, 2005.
  • 26. National Health and Medical Research Council. National statement on ethical conduct in human research (2007). Canberra: NHMRC, 2007. (accessed Sep 2010).
  • 27. Woolf SH, Rothemich SF, Johnson RE, Marsland DW. Selection bias from requiring patients to give consent to examine data for health services research. Arch Fam Med 2000; 9: 1111-1118.
  • 28. Stanley F. Record linkage — public good or invasion of privacy? Proceedings of the 25th International Conference of Data Protection and Privacy Commissioners; 2003 Sep 10–12; Sydney, Australia. (accessed Sep 2010).
  • 29. Stanley F. Australian Health Information Council conference 2004 [speech]. Proceedings of the National Health Information Summit: transforming Australia’s health system; 2004 Dec 6-7; Melbourne. Canberra: Australian Health Information Council, 2005.
  • 30. Kelman CW, Pearson S-A, Day R, et al. Evaluating medicines: let’s use all the evidence. Med J Aust 2007; 186: 249-252.
  • 31. Peto J, Fletcher O, Gilham C. Data protection, informed consent, and research. BMJ 2004; 328: 1029-1030.
  • 32. Wanless D. Securing good health for the whole population. London: HM Treasury, 2004. (accessed Sep 2010).
  • 33. Room S. Data protection, informed consent, and research: Data Protection Act does not bar medical research [letter]. BMJ 2004; 328: 1437.
  • 34. O’Keefe CM. Privacy and the use of health data — reducing disclosure risk. Electronic J Hlth Informatics 2008; 3: e5.
  • 35. Gun R. Privacy law is kneecapping epidemiological research. Australas Epidemiologist 2005; 12.1: 2-4.
  • 36. Skene L. Proliferating ethics committees and privacy legislation: new fetters on scientific research. Australas Epidemiologist 2005; 12.1: 16-18.
  • 37. Hill D. Testimony for the National Health and Medical Research Council, Senate Legal and Constitutional References Committee, inquiry into the Privacy Act 1988, 20 May 2005. Canberra: Commonwealth of Australia, 2005. (accessed Sep 2010).

