It’s Time to Rethink How We Measure Remission from Depression

Article · February 5, 2017

For the estimated 15.7 million U.S. adults with depression, the growing focus on treatment is promising, particularly in primary care settings. This shift in care delivery also demands rigorous measurement of the quality of depression services and their effects on population health. Several measure stewards and clearinghouses, including the National Quality Forum (NQF), the National Committee for Quality Assurance, and Minnesota Community Measurement (MNCM), as well as the Centers for Medicare and Medicaid Services (CMS), are converging on measures that emphasize discrete, narrow follow-up windows.

Our collaborative care for depression program at NYC Health + Hospitals has shown us that measures are most useful when the time frame for follow-up is broader. In this brief article, we aim to make a case for reconsidering how remission from depression is measured.

Widen the Time Frame for Follow-Up Assessment

Our program’s bottom-line quality metric, aligned with New York State Office of Mental Health standards, uses this fraction to measure depression improvement:

  • Numerator: Patients with demonstrable clinical improvement
  • Denominator: Patients enrolled in collaborative care for ≥70 days (patients must have a PHQ-9 score >9 to enroll in the program)

We define clinical improvement as either a Patient Health Questionnaire (PHQ)-9 score <10 (i.e., no or mild depression) or a current PHQ-9 score that is <50% of the patient’s baseline score. For patients enrolled in the program during Quarter 1 of 2016, our improvement rate was 57.6%.
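The improvement fraction defined above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not our registry's actual code: the `Enrollee` record and its field names are hypothetical, and "current" is assumed to mean the most recent PHQ-9 on file.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Enrollee:
    """Hypothetical registry record; field names are illustrative only."""
    baseline_phq9: int   # PHQ-9 at enrollment (must be > 9 to enroll)
    current_phq9: int    # most recent PHQ-9 (assumed meaning of "current")
    days_enrolled: int   # days in collaborative care


def is_improved(p: Enrollee) -> bool:
    """Clinical improvement: PHQ-9 < 10, or current score below 50% of baseline."""
    return p.current_phq9 < 10 or p.current_phq9 < 0.5 * p.baseline_phq9


def improvement_rate(patients: Sequence[Enrollee]) -> Optional[float]:
    """Numerator: improved patients. Denominator: patients enrolled >= 70 days."""
    eligible = [p for p in patients if p.days_enrolled >= 70]
    if not eligible:
        return None  # rate is undefined with an empty denominator
    return sum(is_improved(p) for p in eligible) / len(eligible)
```

Note that the second criterion matters only for patients with high baseline scores: a patient who falls from 27 to 13 counts as improved even though 13 is not below 10, whereas a patient who falls from 24 to 12 does not, because 12 is exactly (not less than) 50% of baseline.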

Contrast that method with the NQF measure of depression remission (derived from MNCM and adopted by CMS):

  • Numerator: Adults with major depression or dysthymia and an initial PHQ-9 score >9 who achieve remission at 12 months
  • Denominator: Adults with major depression or dysthymia and an index PHQ-9 score >9

Notably, NQF defines remission as a PHQ-9 score <5 (no depression) at 12 months (+/–30 days); if multiple scores are taken within that 60-day window, the most recent score is used.
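To make the consequence of that window concrete, here is a rough sketch of the remission rule as described above. It is not the official measure specification: the function name is hypothetical, and "12 months" is approximated as 365 days, which is an assumption on our part.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def nqf_style_remission(
    index_date: date,
    index_score: int,
    followups: List[Tuple[date, int]],   # (date administered, PHQ-9 score)
    target_days: int = 365,              # assumption: 12 months ~ 365 days
    tolerance_days: int = 30,            # the +/-30-day window
) -> Optional[bool]:
    """Return None if the patient is outside the denominator, else True/False."""
    if index_score <= 9:
        return None  # denominator requires an index PHQ-9 > 9
    window_open = index_date + timedelta(days=target_days - tolerance_days)
    window_close = index_date + timedelta(days=target_days + tolerance_days)
    in_window = [(d, s) for d, s in followups if window_open <= d <= window_close]
    if not in_window:
        return False  # no score in the window: counted as not remitted
    _, latest_score = max(in_window, key=lambda pair: pair[0])
    return latest_score < 5  # remission = PHQ-9 < 5 on the most recent score
```

Under this rule, a patient whose PHQ-9 falls to 3 at month 3 and who is never rescored between months 11 and 13 is counted as not remitted.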

Although improvement and remission are separate clinical outcomes, we believe that a metric similar to the one we use for improvement would measure remission better than the one the NQF currently endorses. That’s because the NQF approach narrows the time frame for scoring the follow-up PHQ-9 to 60 days (i.e., between months 11 and 13 after the initial PHQ-9). This narrowing is not rooted in the evidence. For example, in a study in the Journal of the American Board of Family Medicine, the median time to remission for depression in collaborative-care programs was 86 days (i.e., by month 3). Therefore, the NQF numerator misses patients who are successfully identified and treated before the 60-day follow-up window opens at month 11, unless they happen to be rescreened within that window.

We deepened our analysis with data from our own depression registry and electronic health record. Specifically, using the NQF measure, we created three time windows for follow-up PHQ-9 scoring: 11 to 13 months (the prescribed window), 9 to 15 months, and 3 to 12 months after the baseline PHQ-9. The table shows what we found.

Broadening the PHQ-9 Follow-Up Timeframe Captures More Remissions

The data show that the vast majority of patients (305/430) enrolled in collaborative care had no PHQ-9 administered within the narrow follow-up time frame of 11 to 13 months. Using a broader time frame, anchored roughly on median time to remission (86 days) and ending one year from baseline (i.e., 3–12 months), increases the number of patients with a follow-up PHQ-9 within that time frame — and dramatically increases the remission rate.

In our program, when patients show significant, persistent improvement, they are “graduated” from collaborative care (to free up resources for other patients). Graduates receive the same care that the rest of our primary care population does, including universal screening for depression, generally with the PHQ-2, a two-question validated screening tool often used to decide whether to administer a PHQ-9 (a “yes” answer to at least one PHQ-2 question prompts a PHQ-9). Using either a PHQ-2 result equal to zero or a PHQ-9 score <5 as an indicator of depression remission, we found that the remission rate increases dramatically when the follow-up window is wider: from 26.0% (using the 11–13 month time frame) to 51.6% (using the 3–12 month time frame). This wider time frame better reflects how many of our patients in collaborative care actually experience remission from depression, rather than identifying only patients who improved and also happened to undergo rescreening during the narrower (60-day) window.
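The window comparison described above can be sketched as follows. This is a simplified illustration on made-up data, not our actual analysis code. It assumes that a month is 365.25/12 days, that the most recent screen inside the window is the one that counts (as in the NQF rule), and that patients with no screen in the window stay in the denominator as not remitted.

```python
from typing import Dict, List, Optional, Tuple

Screen = Tuple[int, str, int]  # (days since baseline, "PHQ-2" or "PHQ-9", score)

DAYS_PER_MONTH = 365.25 / 12  # assumption used to convert months to days

WINDOWS: Dict[str, Tuple[int, int]] = {
    "11-13 months": (11, 13),  # the prescribed window
    "9-15 months": (9, 15),
    "3-12 months": (3, 12),
}


def indicates_remission(kind: str, score: int) -> bool:
    """PHQ-9 < 5 or PHQ-2 equal to zero."""
    return (kind == "PHQ-9" and score < 5) or (kind == "PHQ-2" and score == 0)


def remission_rate(patients: List[List[Screen]], window: Tuple[int, int]) -> Optional[float]:
    """Share of all patients whose latest in-window screen indicates remission."""
    if not patients:
        return None
    lo, hi = window[0] * DAYS_PER_MONTH, window[1] * DAYS_PER_MONTH
    remitted = 0
    for screens in patients:
        in_window = [s for s in screens if lo <= s[0] <= hi]
        if in_window:
            _, kind, score = max(in_window, key=lambda s: s[0])
            remitted += indicates_remission(kind, score)
    return remitted / len(patients)


# Made-up example: four patients, one of whom remits early and is never rescreened late.
toy = [
    [(100, "PHQ-9", 3)],   # early remission, only visible in the 3-12 month window
    [(350, "PHQ-2", 0)],   # negative PHQ-2 near month 11.5
    [(360, "PHQ-9", 12)],  # still symptomatic
    [],                    # no follow-up screen at all
]
rates = {name: remission_rate(toy, w) for name, w in WINDOWS.items()}
```

In the toy data, the patient who remits at day 100 is counted only when the window opens at month 3, which is the same mechanism behind the difference we observed in our registry.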

Finding the Optimal Measure of Remission

These results cast doubt on whether the NQF-endorsed measure optimally captures depression remission, particularly in a primary care population. Indeed, narrow follow-up windows end up excluding most of the population whose quality of care we wish to assess. Nevertheless, the NQF depression measure is now included in multiple measure sets, including a consensus set being defined by the Core Quality Measure Collaborative, led by America’s Health Insurance Plans, CMS, and the NQF. The need for the measure to account for depression services delivered in primary care is underlined by CMS’ move toward paying for collaborative care.

An optimal depression-remission measure might instead follow the example of NQF-endorsed quality metrics for other chronic diseases, such as hypertension and diabetes. For instance, the hypertension measure uses adults with diagnosed hypertension as its denominator and patients with adequately controlled blood pressure at their most recent visit in the measurement year as its numerator. One core diabetes measure uses adults with a diagnosis of diabetes (type 1 or 2) as its denominator; its numerator comprises patients whose most recent HbA1c level during the measurement year was >9.0% (poor control) and those for whom an HbA1c test was not done during that year.

A feasible revised depression measure might use the current NQF-specified denominator, but with a numerator representing patients with depression or dysthymia whose most recent PHQ-9 is <5 (using the most recent PHQ-2 if it is newer than the PHQ-9). Another option, similar to the NQF-endorsed diabetes-control measure, could have a numerator that includes patients with depression or dysthymia whose most recent PHQ-9 score is ≥10 or who lack a PHQ-2 or PHQ-9 for the measurement year (as with the diabetes measure, a lower rate would indicate better performance). In either case, a 12-month performance period (i.e., without a prespecified follow-up window) could simplify reporting while measuring the population-wide effect of depression care more accurately.

Perhaps the best and simplest solution would roughly mirror the NQF-endorsed hypertension measure: Use a denominator of patients with diagnosed depression or dysthymia (but without the NQF measure’s initial elevated PHQ-9 requirement) and a numerator of patients whose symptoms are adequately controlled (most recent PHQ-9 score <5 or PHQ-2 = 0), again with a 12-month performance period.
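A minimal sketch of this hypertension-style measure might look like the following. The function names and data layout are hypothetical. One detail is our assumption rather than part of the proposal as stated: a patient with no PHQ-2 or PHQ-9 during the performance period is counted as not controlled.

```python
from datetime import date
from typing import List, Optional, Tuple

Screen = Tuple[date, str, int]  # (date administered, "PHQ-2" or "PHQ-9", score)


def symptoms_controlled(screens: List[Screen], start: date, end: date) -> bool:
    """Most recent screen in the period is a PHQ-9 < 5 or a PHQ-2 equal to 0."""
    in_period = [s for s in screens if start <= s[0] <= end]
    if not in_period:
        return False  # assumption: no screen in the period counts as not controlled
    _, kind, score = max(in_period, key=lambda s: s[0])
    return (kind == "PHQ-9" and score < 5) or (kind == "PHQ-2" and score == 0)


def controlled_rate(
    diagnosed_patients: List[List[Screen]], start: date, end: date
) -> Optional[float]:
    """Denominator: all patients with diagnosed depression or dysthymia."""
    if not diagnosed_patients:
        return None
    controlled = sum(symptoms_controlled(p, start, end) for p in diagnosed_patients)
    return controlled / len(diagnosed_patients)
```

Because the rule looks only at the most recent screen in a 12-month period, it needs no baseline date and no follow-up window, which is what would make it simpler to report.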


It’s heartening to see care for depression getting more attention, but that progress is not enough. We must ensure that systems of care are built on rigorous, meaningful quality measures. Our experience with collaborative care for depression leads us to favor practical changes in how remission of depression is measured. We hope our concrete suggestions advance the conversation.


The views expressed in this article are those of the authors and do not necessarily represent the views or policy of NYC Health + Hospitals.

This article originally appeared in NEJM Catalyst on November 9, 2016.
