NEJM: Protection or Harm? Suppressing Substance-Use Data

14 May 2015 10:55 AM

Retrieved from New England Journal of Medicine
May 14, 2015  |  Austin B. Frakt, Ph.D., and Nicholas Bagley, J.D.

What if it were impossible to closely study a disease affecting 1 in 11 Americans over 11 years of age — a disease that's associated with more than 60,000 deaths in the United States each year, that tears families apart, and that costs society hundreds of billions of dollars?1 What if the affected population included vulnerable and underserved patients and those more likely than most Americans to have costly and deadly communicable diseases, including HIV–AIDS? What if we could not thoroughly evaluate policies designed to reduce costs or improve care for such patients?

These questions are not rhetorical. In an unannounced break with long-standing practice, the Centers for Medicare and Medicaid Services (CMS) began in late 2013 to withhold from research data sets any Medicare or Medicaid claim with a substance-use–disorder diagnosis or related procedure code. This move, the result of privacy-protection concerns, affects about 4.5% of inpatient Medicare claims and about 8% of inpatient Medicaid claims from key research files (see table), impeding a wide range of research evaluating policies and practices intended to improve care for patients with substance-use disorders.

The timing could not be worse. Just as states and federal agencies are implementing policies to address epidemic opioid abuse and coincident with the arrival of new and costly drugs for hepatitis C — a disease that disproportionately affects drug users — we are flying blind.

The affected data sources include Medicare and Medicaid Research Identifiable Files, which contain beneficiary ZIP Codes, dates of birth and death, and in some cases Social Security numbers. For tasks common to most health services research — such as combining patient-level data across systems (e.g., Medicare, Medicaid, and the Veterans Health Administration [VHA]), associating them with community or market factors (e.g., provider density or type of health insurance plans available), or studying mortality as an outcome — these are essential variables.

For decades, CMS has released data on claims related to substance-use disorders to allow researchers to study health systems and medical practice. One early example of such work is a study based on 1991 Medicare claims data that showed that few elderly patients received follow-up outpatient mental health care after being discharged with a substance-use–disorder diagnosis. Patients who received prompt follow-up care were less likely to die, a finding that could not have been obtained without information on patients' precise date of death.2 More recently, a 2010 study used 2003–2004 Medicare claims data linked by Social Security number to records from the VHA to assess the extent to which patients with substance-use disorders relied on the VHA for care.3 Substance-use disorders are among the diagnoses that have been included in the Dartmouth Atlas analyses of geographic variation in Medicare spending — which rely on ZIP Code identifiers — going back to at least 1998. To our knowledge, no patients have been harmed because of data breaches associated with studies such as these.

CMS has justified the data suppression by pointing to privacy regulations that prescribe the stringent conditions under which information related to the treatment of substance-use disorders may be shared.4 These regulations, which are overseen by the Substance Abuse and Mental Health Services Administration (SAMHSA), already frustrate accountable care organizations and health-information exchanges, since their elaborate consent requirements make it difficult or impossible to share patient data related to substance-use disorders. As a result, many organizations exclude such information from their systems, undercutting efforts to improve care and efficiency.

For researchers, the problem is more acute. Although the privacy regulations authorize providers to disclose data on substance-use disorders for research purposes, they prohibit third-party payers — including CMS — from doing so. In 1976, when the regulations were first adopted, this prohibition was not a substantial impediment to research. Before computers came into widespread use, researchers could not look to insurers or CMS to provide large claims-based data sets. Even if they could, crunching those data would have been exceedingly difficult.

But the world has changed. Access to reliable Medicare and Medicaid data has long offered researchers a window into U.S. health care.2,3 Indeed, given the unwillingness of private insurers to share their data, Medicare and Medicaid data often provide our only way of gathering information about medical practice, patient outcomes, and costs. The very importance of the data may explain why CMS has long overlooked the prohibition on disclosure.

In 2013, however, SAMHSA advised CMS that the privacy regulations require suppression of claims related to substance-use disorders. The agency's sudden insistence on this point is puzzling. The law that the privacy regulations are intended to implement states that identifiable data on substance-use disorders “may be disclosed,” even without patient consent, “to qualified personnel for the purpose of conducting scientific research.” Banning CMS from sharing such data with researchers is difficult to square with that statutory exemption.

Nonetheless, in November 2013, CMS began scrubbing Medicare data of claims related to substance-use disorders. It did the same for Medicaid data in early 2014. No notice was given to the research community about the policy change. Most of our colleagues have been shocked to learn of it; many others probably remain unaware of the change.

The suppression has skewed Medicaid data more than Medicare data, a disparity that reflects differences between the populations served by the two programs (see table, and the Supplementary Appendix, available with the full text of this article). In both programs, inpatient claims are much more likely to be affected than outpatient claims.

In the vast majority of cases, claims are suppressed because the patients have secondary diagnoses of substance-use disorders. That raises an additional concern: many of the withheld data pertain to admissions for services that address not substance-use disorders but rather conditions that may be exacerbated by substance abuse. In other words, the data suppression extends well beyond its intended domain.

The effects of the CMS actions are thus much broader than they might initially seem. Clearly, it is now infeasible to conduct any study of patients with substance-use disorders based on Research Identifiable Files. But studies of conditions disproportionately affecting such patients — such as hepatitis C or HIV — will also be hampered. Moreover, any study relying on those files cannot make full diagnosis-based risk adjustments that include substance-use–disorder diagnoses. And because the data have been altered in a systematic, nonrandom manner — with suppression affecting different populations, age groups, regions, and providers to different degrees — the results of many studies that have no apparent connection to substance use will be biased.

And to what end? Without question, protecting patient confidentiality is essential, especially when it comes to potentially stigmatizing diagnoses and treatments. But there is no evidence that researchers — who, under current rules, must adhere to strict data-protection protocols, backed by criminal penalties — cannot appropriately secure research data. And most Americans want their health data to be available for research.5 At the same time, data suppression and access limitations remove from scrutiny a great deal of taxpayer-financed care.

We believe that the federal government's short-sighted policy will harm the very people it was meant to protect. We encourage SAMHSA and CMS, in dialogue with researchers and providers, to restore access to data that are necessary to improving care for patients with substance-use disorders.

Massachusetts Health Data Consortium
460 Totten Pond Road | Suite 690
Waltham, Massachusetts 02451

© Massachusetts Health Data Consortium