ProPublica: Can You Grade Surgeons On Their Mistakes?

How do you find a good doctor? You know, the person who has the biggest role in your health—how do you know if they’re good? Come to think of it, how much do you even know about your doctor? Sure, you can probably find where they went to school, maybe a few angry or praising reviews on various websites, but what about important things like the number of people who have died under their care from preventable causes? What if you’re looking for a surgeon, someone who will actually be cutting you open and handling your organs? Doesn’t that seem like the kind of thing you should know?

Despite the years of extensive training they undergo, it is unreasonable to expect that every surgeon is equally skilled at the procedures they do. If you click anything in this article (and don’t tend to vomit immediately upon seeing organs), please watch this video and this video of two surgeries to see how immediately apparent the difference is between good and bad surgeons. How do you know which one is operating on you?

Last month, ProPublica released a surgeon scorecard: a searchable database that rates surgeons based on death and complication rates from eight routine elective surgeries (including my personal favorite, gallbladder removal). The scorecard draws on five years of Medicare data (mostly patients over age 65), covering 2.3 million surgeries performed by 17,000 surgeons. The analysis limited complications to those most directly attributable to the surgeon (like infections and uncontrolled bleeding), excluded ER transfers, screened out high-risk patients, and adjusted for variables like hospital quality, patient age, and luck (stat nerds can read their complete methodology here). Basically, these are common, non-life-threatening surgeries done on patients in good health, used to see how surgeons stacked up on the mistakes most likely to be their fault.
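For the curious, here is a toy sketch of what "adjusting for luck" means. ProPublica's actual analysis uses a mixed-effects model (see their methodology); this simplified version only illustrates the core idea of shrinking a surgeon's raw complication rate toward the overall average, with less shrinkage for surgeons who have performed more operations. The `prior_strength` value and the 3% average rate below are made-up numbers for illustration, not figures from the scorecard.

```python
def adjusted_rate(complications, surgeries, overall_rate, prior_strength=50):
    """Shrink a surgeon's raw complication rate toward the overall rate.

    prior_strength acts like a count of "pseudo-surgeries" performed at
    the overall rate: a low-volume surgeon's estimate gets pulled hard
    toward the mean, while a high-volume surgeon's barely moves.
    """
    return (complications + prior_strength * overall_rate) / (surgeries + prior_strength)


overall = 0.03  # assume a ~3% average complication rate (illustrative)

# A surgeon with 1 complication in 10 surgeries: raw rate is 10%,
# but with so few cases it could easily just be bad luck.
low_volume = adjusted_rate(1, 10, overall)

# A surgeon with 100 complications in 1,000 surgeries: same 10% raw
# rate, but with that much data it is unlikely to be luck.
high_volume = adjusted_rate(100, 1000, overall)

print(f"low-volume adjusted rate:  {low_volume:.3f}")   # pulled close to 3%
print(f"high-volume adjusted rate: {high_volume:.3f}")  # stays near 10%
```

Both surgeons have the same raw rate, but the adjusted numbers differ sharply—which is why a scorecard can't just divide complications by surgeries and call it a day.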

Overall, the results were fine: Even surgeons with the highest complication rates have 90% of their surgeries go off without a hitch. Sure, some error is to be expected, right? But then, 11% of the surgeons accounted for 25% of the complications, while over 2,000 surgeons had one complication, or none at all. What level of error is acceptable when it’s your life at stake?

The analysis led to some surprising findings as well, such as a surgeon at the nationally renowned Johns Hopkins Hospital falling far below the grade. Meanwhile, two surgeons in Alabama who see mostly overweight smokers have a complication rate of zero, which they attribute to tag-teaming surgeries and personal follow-up.

But more importantly, this is huge for consumers—being able to search for a surgeon not based on anecdotal recommendations or widely varying Yelp reviews (which are generally not good indicators), but based on your chances of not dying. People should be excited about this, right?

Basically every medical professional I talked to expressed extreme reservations about the ProPublica database. A doctor was concerned about how they accounted for difficult cases (again, see the methodology). A consultant said that rampant Medicare fraud makes the data questionable (it’s still the best data available). A friend noted that #capitalism would mean the best surgeons would become too exclusive to be available to those who needed them. Some have compared it to the disastrous experiment in paying teachers based on test scores. Will doctors turn away difficult patients to uphold their rankings? And what about the poor doctors who have made it through years of frantic studying and competing, through college, through med school, only to keep being graded for their entire lives? Can you imagine having what is essentially your annual performance review available publicly?

There is some precedent for doctor-specific evaluation. In an effort to increase transparency, New York began publishing mortality rates for individual cardiovascular surgeons in 1989, and as with the ProPublica ratings, the numbers were adjusted to account for patient risk (for example, a patient would be excluded if they were receiving more than one procedure). As a result, hospitals upgraded protocols and mortality rates dropped dramatically, by a third.

Or did they? One study found that two-thirds of cardiovascular surgeons admitted to not treating a patient in order to protect their scores. New York Magazine found instances of patients receiving unnecessary procedures in order to exclude them from stats, risky patients being sent out of state, and risk level being fudged in order to make the doctor look more skilled. There are numerous ways to game the system, despite the part of the Hippocratic oath where doctors promise, “I will remember that I remain a member of society, with special obligations to all my fellow human beings, those sound of mind and body as well as the infirm.”

One school of thought is that surgeons’ statistics should be available at the hospital level instead of being made public. A recent article on Medium explored this possibility with a system called Amplio that collects extensive data before, during, and after operations, and can provide results taking a number of different factors into account. The scores differ from the ProPublica numbers in that they are multifaceted (meaning surgeons can’t try to optimize along only one measure, like hospital readmissions) and take all patients into account (not just Medicare), but, most importantly, they stay anonymous within the hospital. By keeping results confidential, Amplio hopes to avoid system-gaming while still encouraging surgeons to seek help if needed. The article argues that “patients actually make little use of objective outcomes data when it’s available, that in fact they’re much more likely to choose a surgeon or hospital based on reputation or raw proximity.”

However, ProPublica points out that hospitals are unlikely to self-regulate. The sad truth is that readmissions result in more money for the hospital, though Medicare is putting incentives in place to curb this. Questionable cases are put before a peer review panel, which largely does nothing (the strictest punishment most members can remember handing out is a “letter of inquiry”). ProPublica notes, “Peer review is often undermined by the personal and financial ties that bind doctors together, according to experts and a 2008 study.”

In 1999, a report called “To Err Is Human” from the Institute of Medicine shocked the medical world, revealing that as many as 98,000 deaths a year were caused by medical errors. It avoided placing blame on individual doctors, and instead suggested several system-wide improvements to avoid mistakes. Newer estimates put that number as high as 400,000, which would make medical error the third leading cause of death in the United States.

We need to change something, and tools like the Surgeon Scorecard seem as good as anything else. Surgeons need the data in order to know how to improve, and the public should be able to use performance when making medical decisions. While I don’t think the ProPublica ratings are the one true answer to solve this problem, and measures certainly need to be in place to safeguard against gaming, I think it’s clear that public accountability is a step in the right direction.


EDIT: There’s a pretty healthy debate on my Facebook page between some doctors and the Stat Badgers. Feel free to weigh in!
