It’s Time to Rethink How We Measure Remission from Depression

Article · February 5, 2017

For the estimated 15.7 million U.S. adults with depression, the growing focus on treatment is promising, particularly in primary care settings. This shift in care delivery also demands rigorous measurement of the quality of depression services and their effects on population health. Several measure stewards and clearinghouses (the National Quality Forum [NQF], the National Committee for Quality Assurance, and Minnesota Community Measurement [MNCM]), as well as the Centers for Medicare & Medicaid Services (CMS), are converging on measures that emphasize discrete, narrow follow-up windows.

Our collaborative care for depression program at NYC Health + Hospitals has shown us that measures are most useful when the time frame for follow-up is broader. In this brief article, we aim to make a case for reconsidering how remission from depression is measured.

Widen the Time Frame for Follow-Up Assessment

Our program’s bottom-line quality metric, aligned with New York State Office of Mental Health standards, uses this fraction to measure depression improvement:

  • Numerator: Patients with demonstrable clinical improvement
  • Denominator: Patients enrolled in collaborative care for ≥70 days (patients must have a PHQ-9 score >9 to enroll in the program)

We define clinical improvement as either a Patient Health Questionnaire (PHQ)-9 score <10 (i.e., no or mild depression) or a current PHQ-9 score that is <50% of the patient’s baseline score. For patients enrolled in the program during Quarter 1 of 2016, our improvement rate was 57.6%.
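
The improvement fraction described above can be sketched in a few lines of Python. This is a minimal illustration, not our production registry logic: the `Patient` record, its field names, and the four sample patients are hypothetical and exist only to show how the numerator and denominator rules combine.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout; field names are illustrative only.
    baseline_phq9: int  # PHQ-9 score at enrollment (must be > 9 to enroll)
    current_phq9: int   # most recent PHQ-9 score
    days_enrolled: int  # days in collaborative care


def is_improved(p: Patient) -> bool:
    """Clinical improvement: PHQ-9 < 10, or current score < 50% of baseline."""
    return p.current_phq9 < 10 or p.current_phq9 < 0.5 * p.baseline_phq9


def improvement_rate(patients):
    """Improved patients divided by patients enrolled for >= 70 days."""
    denominator = [
        p for p in patients
        if p.days_enrolled >= 70 and p.baseline_phq9 > 9
    ]
    if not denominator:
        return None  # rate is undefined for an empty denominator
    return sum(is_improved(p) for p in denominator) / len(denominator)


# Made-up patients, not program data.
cohort = [
    Patient(baseline_phq9=20, current_phq9=9, days_enrolled=90),    # improved (< 10)
    Patient(baseline_phq9=24, current_phq9=11, days_enrolled=100),  # improved (< 50%)
    Patient(baseline_phq9=15, current_phq9=12, days_enrolled=80),   # not improved
    Patient(baseline_phq9=18, current_phq9=4, days_enrolled=30),    # excluded (< 70 days)
]
rate = improvement_rate(cohort)
print(f"{rate:.1%}")  # 66.7%
```

Note that the fourth patient is excluded from both numerator and denominator because the 70-day enrollment threshold has not been met, even though the score has dropped.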

Contrast that method with the NQF measure of depression remission (derived from MNCM and adopted by CMS):

  • Numerator: Adults with major depression or dysthymia and an initial PHQ-9 score >9 who achieve remission at 12 months
  • Denominator: Adults with major depression or dysthymia and an index PHQ-9 score >9

Notably, NQF defines remission as a PHQ-9 score <5 (no depression) at 12 months (+/–30 days); if multiple scores are taken within that 60-day window, the most recent score is used.
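
To make the window rule concrete, here is a hedged Python sketch of the numerator test as described above. It is not the official NQF specification. It assumes follow-up scores are stored as (days since index PHQ-9, score) pairs, approximates "12 months (+/–30 days)" as days 335 through 395, and treats a patient with no score in the window as not in remission. The function name and data layout are hypothetical.

```python
def nqf_remission(followups, window=(335, 395)):
    """Return True if the most recent in-window PHQ-9 score is < 5.

    followups: list of (days_since_index, phq9_score) pairs.
    window: inclusive day range; 365 +/- 30 days is an assumed
            approximation of "12 months (+/- 30 days)".
    """
    lo, hi = window
    in_window = [(day, score) for day, score in followups if lo <= day <= hi]
    if not in_window:
        # Assumption: no score inside the window means the patient
        # cannot be counted in the numerator.
        return False
    _, latest_score = max(in_window, key=lambda pair: pair[0])
    return latest_score < 5


# Remitted at day 90 but never rescreened inside the window: not counted.
early_remitter = nqf_remission([(90, 3)])

# Two scores inside the window: only the most recent one (day 380) is used.
late_remitter = nqf_remission([(340, 6), (380, 4)])
```

Under these assumptions, `early_remitter` evaluates to `False` and `late_remitter` to `True`, which illustrates how a patient who recovers quickly can be missed entirely by a narrow window.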

Although improvement and remission are separate clinical outcomes, we nevertheless believe that a metric similar to the one we use for improvement would be a better metric of remission than what the NQF currently endorses. That’s because the NQF approach narrows the time frame for scoring the follow-up PHQ-9 to 60 days (i.e., between months 11 and 13 after the initial PHQ-9). This narrowing is not rooted in the evidence. For example, in a study in the Journal of the American Board of Family Medicine, the median time to remission for depression in collaborative-care programs was 86 days (i.e., by month 3). Therefore, the NQF numerator does not capture patients who are successfully identified and treated before the 60-day follow-up window opens at month 11.

We deepened our analysis with data from our own depression registry and electronic health record. Specifically, using the NQF measure, we created three time windows for follow-up PHQ-9 scoring: 11 to 13 months (the prescribed window), 9 to 15 months, and 3 to 12 months after the baseline PHQ-9. The table shows what we found.

Broadening the PHQ-9 Follow-Up Timeframe Captures More Remissions

The data show that most patients enrolled in collaborative care (305 of 430, or about 71%) had no PHQ-9 administered within the narrow follow-up time frame of 11 to 13 months. Using a broader time frame, anchored roughly on median time to remission (86 days) and ending one year from baseline (i.e., 3–12 months), increases the number of patients with a follow-up PHQ-9 within that time frame and dramatically increases the remission rate.

In our program, when patients show significant, persistent improvement, they are “graduated” from collaborative care (to free up resources for other patients). Graduates receive the same care that the rest of our primary care population does, including universal screening for depression, generally with the PHQ-2, a two-question validated screening tool often used to decide whether to administer a PHQ-9 (a “yes” answer to at least one PHQ-2 question prompts a PHQ-9). Using either a PHQ-2 result equal to zero or a PHQ-9 score <5 as an indicator of depression remission, we found that the remission rate increases dramatically when the follow-up window is wider: from 26.0% (using the 11–13 month time frame) to 51.6% (using the 3–12 month time frame). This wider time frame better reflects how many of our patients in collaborative care actually experience remission from depression, rather than identifying only patients who improved and also happened to undergo rescreening during the narrower (60-day) window.
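
The window comparison can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis code behind our results: it assumes each screening is stored as (months since baseline, instrument, score), that the most recent screening inside the window decides the outcome, and that patients with no screening in the window remain in the denominator. The four patients are invented toy data.

```python
WINDOWS = {
    "11-13 months": (11, 13),
    "9-15 months": (9, 15),
    "3-12 months": (3, 12),
}


def indicates_remission(instrument, score):
    """Remission indicator: PHQ-9 < 5 or PHQ-2 equal to zero."""
    return (instrument == "PHQ-9" and score < 5) or (
        instrument == "PHQ-2" and score == 0
    )


def remission_rate(cohort, window):
    """Share of all patients whose latest in-window screening shows remission."""
    lo, hi = window
    remitted = 0
    for screenings in cohort:
        eligible = [s for s in screenings if lo <= s[0] <= hi]
        if eligible:
            _, instrument, score = max(eligible, key=lambda s: s[0])
            remitted += indicates_remission(instrument, score)
    return remitted / len(cohort)


# Toy cohort: one list of (months_since_baseline, instrument, score) per patient.
toy_cohort = [
    [(4, "PHQ-9", 3)],                     # early remission, never rescreened
    [(6, "PHQ-2", 0)],                     # graduated, negative PHQ-2 screen
    [(12, "PHQ-9", 4)],                    # remission measured at 12 months
    [(5, "PHQ-9", 14), (14, "PHQ-9", 12)], # still symptomatic
]
rates = {name: remission_rate(toy_cohort, w) for name, w in WINDOWS.items()}
```

With this toy cohort, the 11–13 month and 9–15 month windows each yield a rate of 0.25, and the 3–12 month window yields 0.75. The numbers are artificial, but the mechanism is the same one described above: the wider window picks up patients who remitted early and were not rescreened near the 12-month mark.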

Finding the Optimal Measure of Remission

These results cast doubt on whether the NQF-endorsed measure optimally captures depression remission, particularly in a primary care population. Indeed, narrow follow-up windows end up excluding most of the population whose quality of care we wish to assess. Nevertheless, the NQF depression measure is now included in multiple measure sets, including a consensus set being defined by the Core Quality Measure Collaborative, led by America’s Health Insurance Plans, CMS, and the NQF. The need for the measure to account for depression services delivered in primary care is underlined by CMS’ move toward paying for collaborative care.

An optimal depression-remission measure might instead follow the example of NQF-endorsed quality metrics for other chronic diseases, such as hypertension and diabetes. For instance, the hypertension measure uses adults with diagnosed hypertension as its denominator and patients with adequately controlled blood pressure at their most recent visit in the measurement year as its numerator. For one core diabetes measure, the denominator includes adults with a diagnosis of diabetes (type 1 or 2), and the numerator includes patients whose most recent HbA1c level during the measurement year was >9.0% (poor control) and those for whom an HbA1c test was not done during that year. Because that numerator counts poor control, a lower rate indicates better performance.

A feasible revised depression measure might use the current NQF-specified denominator, but with a numerator representing patients with depression or dysthymia whose most recent PHQ-9 is <5 (using the most recent PHQ-2 if it is newer than the PHQ-9). Another option, similar to the NQF-endorsed diabetes-control measure, could have a numerator that includes patients with depression or dysthymia who have a PHQ-9 score ≥10 or who lack a PHQ-2 or PHQ-9 for the measurement year. In either case, a 12-month performance period (i.e., without a prespecified follow-up window) could simplify reporting while measuring the population-wide effect of depression care more accurately.

Perhaps the best and simplest solution would roughly mirror the NQF-endorsed hypertension measure: Use a denominator of patients with diagnosed depression or dysthymia (but without the NQF measure’s initial elevated PHQ-9 requirement) and a numerator of patients whose symptoms are adequately controlled (most recent PHQ-9 score <5 or PHQ-2 = 0), again with a 12-month performance period.
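
A hedged sketch of this hypertension-style measure might look like the following. It is one possible reading of the proposal rather than a finished specification. It assumes screenings are stored as (date, instrument, score), that the most recent screening in the 12-month performance period decides the outcome, and that a patient with no screening in the period counts as not controlled. Function names and the sample panel are hypothetical.

```python
from datetime import date


def symptoms_controlled(screenings, period_start, period_end):
    """True if the latest in-period screening is PHQ-9 < 5 or PHQ-2 == 0."""
    in_period = [s for s in screenings if period_start <= s[0] <= period_end]
    if not in_period:
        # Assumption: no screening in the period counts as not controlled.
        return False
    _, instrument, score = max(in_period, key=lambda s: s[0])
    if instrument == "PHQ-9":
        return score < 5
    return instrument == "PHQ-2" and score == 0


def control_rate(panel, period_start, period_end):
    """Controlled patients divided by all patients with a diagnosis."""
    if not panel:
        return None
    controlled = sum(
        symptoms_controlled(p, period_start, period_end) for p in panel
    )
    return controlled / len(panel)


# Invented panel of diagnosed patients; each entry is one patient's screenings.
panel = [
    [(date(2016, 2, 1), "PHQ-9", 14), (date(2016, 9, 15), "PHQ-2", 0)],  # controlled
    [(date(2016, 5, 10), "PHQ-9", 11)],                                  # not controlled
    [(date(2015, 11, 3), "PHQ-9", 3)],                                   # no 2016 screening
    [(date(2016, 3, 1), "PHQ-9", 4)],                                    # controlled
]
rate_2016 = control_rate(panel, date(2016, 1, 1), date(2016, 12, 31))
```

For this invented panel, `rate_2016` is 0.5. There is no baseline-score requirement and no follow-up window, so the only inputs are the diagnosis list and each patient's screening history for the year.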

***

It’s heartening to see care for depression getting more attention, but that progress is not enough. We must ensure that systems of care are built on rigorous, meaningful quality measures. Our experience with collaborative care for depression leads us to favor practical changes in how remission of depression is measured. We hope our concrete suggestions advance the conversation.

 

The views expressed in this article are those of the authors and do not necessarily represent the views or policy of NYC Health + Hospitals.

This article originally appeared in NEJM Catalyst on November 9, 2016.
