
It’s Time to Rethink How We Measure Remission from Depression

Article · February 5, 2017

For the estimated 15.7 million U.S. adults with depression, the growing focus on treatment is promising, particularly in primary care settings. This shift in care delivery also demands rigorous measurement of the quality of depression services and their effects on population health. Several measure stewards and clearinghouses, among them the National Quality Forum (NQF), the National Committee for Quality Assurance, and Minnesota Community Measurement (MNCM), as well as the Centers for Medicare & Medicaid Services (CMS), are converging on measures that emphasize discrete, narrow follow-up windows.

Our collaborative care for depression program at NYC Health + Hospitals has shown us that measures are most useful when the time frame for follow-up is broader. In this brief article, we aim to make a case for reconsidering how remission from depression is measured.

Widen the Time Frame for Follow-Up Assessment

Our program’s bottom-line quality metric, aligned with New York State Office of Mental Health standards, uses this fraction to measure depression improvement:

  • Numerator: Patients with demonstrable clinical improvement
  • Denominator: Patients enrolled in collaborative care for ≥70 days (patients must have a PHQ-9 score >9 to enroll in the program)

We define clinical improvement as either a Patient Health Questionnaire (PHQ)-9 score <10 (i.e., no or mild depression) or a current PHQ-9 score that is <50% of the patient’s baseline score. For patients enrolled in the program during Quarter 1 of 2016, our improvement rate was 57.6%.
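The improvement fraction above can be sketched in a few lines of Python. This is only an illustration of the definition as stated here, not our production reporting logic: the `Enrollee` class, its field names, and the sample patients are hypothetical, and the sketch assumes every enrolled patient has a current PHQ-9 on file.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Enrollee:
    baseline_phq9: int  # PHQ-9 at enrollment (must be >9 to enroll)
    current_phq9: int   # most recent PHQ-9 (assumed to exist for every patient)
    days_enrolled: int  # days in collaborative care


def is_improved(p: Enrollee) -> bool:
    """Clinical improvement: PHQ-9 < 10, or current score < 50% of baseline."""
    return p.current_phq9 < 10 or p.current_phq9 < 0.5 * p.baseline_phq9


def improvement_rate(patients: list[Enrollee], min_days: int = 70) -> float:
    """Share of patients enrolled for at least min_days who show improvement."""
    denominator = [p for p in patients if p.days_enrolled >= min_days]
    if not denominator:
        return float("nan")
    return sum(is_improved(p) for p in denominator) / len(denominator)


# Made-up patients for illustration only (not program data).
sample = [
    Enrollee(baseline_phq9=18, current_phq9=8, days_enrolled=90),    # improved: score < 10
    Enrollee(baseline_phq9=24, current_phq9=11, days_enrolled=120),  # improved: 11 < 12
    Enrollee(baseline_phq9=15, current_phq9=12, days_enrolled=80),   # not improved
    Enrollee(baseline_phq9=20, current_phq9=4, days_enrolled=30),    # excluded: < 70 days
]
print(f"{improvement_rate(sample):.1%}")  # prints 66.7%
```

Note that the fourth patient never enters the denominator, so an early responder who has been enrolled for fewer than 70 days neither helps nor hurts the rate.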

Contrast that method with the NQF measure of depression remission (derived from MNCM and adopted by CMS):

  • Numerator: Adults with major depression or dysthymia and an initial PHQ-9 score >9 who achieve remission at 12 months
  • Denominator: Adults with major depression or dysthymia and an index PHQ-9 score >9

Notably, NQF defines remission as a PHQ-9 score <5 (no depression) at 12 months (±30 days); if multiple scores are taken within that 60-day window, the most recent score is used.
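To make the windowing rule concrete, here is a minimal sketch of the numerator test as described in this article. It is not the official measure specification: the function name is hypothetical, and "12 months" is approximated as 365 days after the index PHQ-9, which is an assumption of this sketch.

```python
from __future__ import annotations

from datetime import date, timedelta

# Assumption: "12 months" is approximated as 365 days after the index PHQ-9.
TARGET = timedelta(days=365)
TOLERANCE = timedelta(days=30)


def remission_in_window(index_date: date, followups: list[tuple[date, int]]) -> bool:
    """True if the most recent PHQ-9 within 12 months +/- 30 days is < 5."""
    window_start = index_date + TARGET - TOLERANCE
    window_end = index_date + TARGET + TOLERANCE
    in_window = [(d, score) for d, score in followups if window_start <= d <= window_end]
    if not in_window:
        return False  # no score in the window: the patient stays out of the numerator
    _, latest_score = max(in_window, key=lambda item: item[0])
    return latest_score < 5


index = date(2015, 1, 15)  # made-up dates and scores, for illustration only

# Two scores fall inside the window; the later one (4) decides the result.
rescreened = [(date(2015, 4, 10), 3), (date(2016, 1, 5), 7), (date(2016, 2, 1), 4)]
print(remission_in_window(index, rescreened))       # True

# A patient who remits at month 3 and is never rescreened near month 12.
early_remitter = [(date(2015, 4, 10), 3)]
print(remission_in_window(index, early_remitter))   # False
```

The second case is the one that matters for the argument here: a score below 5 recorded outside the window does not count toward the numerator.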

Although improvement and remission are separate clinical outcomes, we nevertheless believe that a metric similar to the one we use for improvement would be a better metric of remission than what the NQF currently endorses. That’s because the NQF approach narrows the time frame for scoring the follow-up PHQ-9 to 60 days (i.e., between months 11 and 13 after the initial PHQ-9). This narrowing is not rooted in the evidence. For example, in a study in the Journal of the American Board of Family Medicine, the median time to remission for depression in collaborative-care programs was 86 days (i.e., by month 3). Therefore, the NQF numerator does not capture patients who are successfully identified and treated before the 60-day follow-up window opens at month 11.

We deepened our analysis with data from our own depression registry and electronic health record. Specifically, using the NQF measure, we created three time windows for follow-up PHQ-9 scoring: 11 to 13 months (the prescribed window), 9 to 15 months, and 3 to 12 months after the baseline PHQ-9. The table shows what we found.

Broadening the PHQ-9 Follow-Up Timeframe Captures More Remissions


The data show that most patients enrolled in collaborative care (305 of 430, or about 71%) had no PHQ-9 administered within the narrow follow-up time frame of 11 to 13 months. Using a broader time frame, anchored roughly on median time to remission (86 days) and ending one year from baseline (i.e., 3–12 months), increases the number of patients with a follow-up PHQ-9 within that time frame, and it dramatically increases the remission rate.

In our program, when patients show significant, persistent improvement, they are “graduated” from collaborative care (to free up resources for other patients). Graduates receive the same care that the rest of our primary care population does, including universal screening for depression, generally with the PHQ-2, a two-question validated screening tool often used to decide whether to administer a PHQ-9 (a “yes” answer to at least one PHQ-2 question prompts a PHQ-9). Using either a PHQ-2 result equal to zero or a PHQ-9 score <5 as an indicator of depression remission, we found that the remission rate increases dramatically when the follow-up window is wider: from 26.0% (using the 11–13 month time frame) to 51.6% (using the 3–12 month time frame). This wider time frame better reflects how many of our patients in collaborative care actually experience remission from depression, rather than identifying only patients who improved and also happened to undergo rescreening during the narrower (60-day) window.
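The window comparison can be sketched as follows. This is an illustrative reconstruction, not the analysis code behind our registry figures: the `Screen` type, the representation of follow-up timing as months after baseline, and the choice to let the latest screen within each window decide the outcome are all assumptions, and the four-patient cohort is invented.

```python
from __future__ import annotations

from typing import NamedTuple


class Screen(NamedTuple):
    months_after_baseline: float
    tool: str   # "PHQ-9" or "PHQ-2"
    score: int


def indicates_remission(s: Screen) -> bool:
    """PHQ-9 < 5 or PHQ-2 == 0, the two indicators described above."""
    return (s.tool == "PHQ-9" and s.score < 5) or (s.tool == "PHQ-2" and s.score == 0)


def remission_rate(patients: list[list[Screen]], start: float, end: float) -> float:
    """Share of all patients whose latest screen within [start, end] months indicates remission."""
    if not patients:
        return float("nan")
    remitted = 0
    for screens in patients:
        in_window = [s for s in screens if start <= s.months_after_baseline <= end]
        if in_window:
            latest = max(in_window, key=lambda s: s.months_after_baseline)
            remitted += indicates_remission(latest)
    return remitted / len(patients)


# Invented cohort for illustration only (not registry data).
cohort = [
    [Screen(3.0, "PHQ-9", 4), Screen(8.0, "PHQ-2", 0)],    # remits early, then graduates
    [Screen(4.0, "PHQ-9", 12), Screen(12.0, "PHQ-9", 3)],  # remits, rescreened at month 12
    [Screen(6.0, "PHQ-9", 14)],                            # still symptomatic
    [Screen(14.0, "PHQ-9", 2)],                            # screened only at month 14
]

for start, end in [(11, 13), (9, 15), (3, 12)]:
    print(f"{start}-{end} months: {remission_rate(cohort, start, end):.1%}")
```

In this toy cohort the first patient counts as remitted only under the 3 to 12 month window, which mirrors the situation of graduates who are rescreened with the PHQ-2 outside the prescribed window.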

Finding the Optimal Measure of Remission

These results cast doubt on whether the NQF-endorsed measure optimally captures depression remission, particularly in a primary care population. Indeed, narrow follow-up windows end up excluding most of the population whose quality of care we wish to assess. Nevertheless, the NQF depression measure is now included in multiple measure sets, including a consensus set being defined by the Core Quality Measure Collaborative, led by America’s Health Insurance Plans, CMS, and the NQF. The need for the measure to account for depression services delivered in primary care is underlined by CMS’ move toward paying for collaborative care.

An optimal depression-remission measure might instead follow the example of NQF-endorsed quality metrics for other chronic diseases, such as hypertension and diabetes. For instance, the hypertension measure uses adults with diagnosed hypertension as its denominator and patients with adequately controlled blood pressure at their most recent visit in the measurement year as its numerator. One core diabetes measure uses adults with a diagnosis of diabetes (type 1 or 2) as its denominator; its numerator includes patients whose most recent HbA1c level during the measurement year was >9.0% (poor control) and those for whom an HbA1c test was not done during that year.

A feasible revised depression measure might use the current NQF-specified denominator, but with a numerator representing patients with depression or dysthymia whose most recent PHQ-9 is <5 (using the most recent PHQ-2 if it is newer than the PHQ-9). Another option, similar to the NQF-endorsed diabetes-control measure, could have a numerator that includes patients with depression or dysthymia who have a PHQ-9 score ≥10 or who lack a PHQ-2 or PHQ-9 for the measurement year. In either case, a 12-month performance period (i.e., without a prespecified follow-up window) could simplify reporting while measuring the population-wide effect of depression care more accurately.

Perhaps the best and simplest solution would roughly mirror the NQF-endorsed hypertension measure: Use a denominator of patients with diagnosed depression or dysthymia (but without the NQF measure’s initial elevated PHQ-9 requirement) and a numerator of patients whose symptoms are adequately controlled (most recent PHQ-9 score <5 or PHQ-2 = 0), again with a 12-month performance period.
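A minimal sketch of this hypertension-style measure follows. It is one possible reading of the proposal rather than a finished specification: the `Screen` class and function names are hypothetical, a patient with no screen during the performance period is counted as not controlled, and a PHQ-9 taken on the same day as a PHQ-2 is assumed to take precedence.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Screen:
    when: date
    tool: str   # "PHQ-9" or "PHQ-2"
    score: int


def latest_in_period(screens: list[Screen], start: date, end: date) -> Optional[Screen]:
    """Most recent screen in the period; a same-day PHQ-9 outranks a PHQ-2 (assumption)."""
    in_period = [s for s in screens if start <= s.when <= end]
    return max(in_period, key=lambda s: (s.when, s.tool == "PHQ-9"), default=None)


def is_controlled(screens: list[Screen], start: date, end: date) -> bool:
    """Numerator test: most recent PHQ-9 < 5, or most recent PHQ-2 == 0."""
    latest = latest_in_period(screens, start, end)
    if latest is None:
        return False  # assumption: no screen in the period counts as not controlled
    if latest.tool == "PHQ-9":
        return latest.score < 5
    return latest.score == 0


def controlled_rate(patients: list[list[Screen]], start: date, end: date) -> float:
    """Denominator: every diagnosed patient passed in, with no index PHQ-9 requirement."""
    if not patients:
        return float("nan")
    return sum(is_controlled(p, start, end) for p in patients) / len(patients)


period = (date(2016, 1, 1), date(2016, 12, 31))  # a 12-month performance period

# Invented patients for illustration only.
diagnosed = [
    [Screen(date(2016, 2, 1), "PHQ-9", 16), Screen(date(2016, 9, 1), "PHQ-2", 0)],  # newer PHQ-2 = 0
    [Screen(date(2016, 3, 10), "PHQ-9", 4)],                                        # PHQ-9 < 5
    [Screen(date(2016, 5, 5), "PHQ-2", 3), Screen(date(2016, 5, 5), "PHQ-9", 13)],  # still symptomatic
    [Screen(date(2015, 11, 20), "PHQ-9", 2)],                                       # no screen in period
]
print(f"{controlled_rate(diagnosed, *period):.1%}")  # prints 50.0%
```

Because the only time constraint is the performance period itself, any screen during the year can count, regardless of how long after diagnosis it was administered.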


It’s heartening to see care for depression getting more attention, but that progress is not enough. We must ensure that systems of care are built on rigorous, meaningful quality measures. Our experience with collaborative care for depression leads us to favor practical changes in how remission of depression is measured. We hope our concrete suggestions advance the conversation.


The views expressed in this article are those of the authors and do not necessarily represent the views or policy of NYC Health + Hospitals.

This article originally appeared in NEJM Catalyst on November 9, 2016.
