The foundation of medical research, considered its gold standard, is the randomized controlled trial (RCT), in which individuals are matched with others and then randomized to one treatment or another. While the beauty of an RCT lies in its process of deliberate randomness, very little attention has been paid to the integrity of its building blocks: categories. It’s time to take a more in-depth look.
Much of the ambiguity in an RCT’s results lies in how closely the treatment and control groups are matched to one another. Variables that are continuous along a spectrum, like age, are grouped into “buckets” in an attempt to make the heterogeneous more homogeneous. But as our knowledge increases, the underlying diversity within these variables is coming to light, and even ones we formerly thought of as discrete, yes/no categories are being challenged. Among the challenged are age, gender, and race.
“See, it’s not about races
Where your blood comes from
Is where your space is
I’ve seen the bright get duller
I’m not going to spend my life being a color”.
– Black or White, Michael Jackson
Race has long been a description of phenotype: black, brown, yellow, and white. The more contemporary, and politically charged, statement is that race is a social construct, a cultural rather than “objective” view. This view has generated a growing tempest in medicine around a standard test of kidney function, the glomerular filtration rate (GFR). GFR measures the ability of the kidneys to filter waste from the blood and is a primary marker of kidney health or failure. Actually measuring GFR requires a 24-hour collection of urine, so various nomograms (the newer term might be algorithms) estimate those 24-hour values from a single point-in-time blood test.
For a variety of reasons, including a belief that black individuals were more muscular than comparable white individuals and therefore had higher levels of creatinine, the substance measured to estimate GFR, the algorithm adjusted for being black. As a result, estimated clearances for black patients were greater than those for comparable white individuals. The downstream result is that black patients were more likely to have false-negative GFR tests; their kidney function was worse than the estimated GFR would suggest. These factitiously good results not only delayed concern about developing renal insufficiency but also adversely affected their priority on recipient lists for kidney donation. In the last few weeks, UCSF, Zuckerberg San Francisco General Hospital, the University of Washington, and Vanderbilt, among others, have dropped the racial component of the algorithm, in some cases substituting some indirect measure of muscle mass.
“This equation assumes that Black people are a homogeneous group of people, and doesn’t take into account, how Black is Black enough?” – Vanessa Grubbs, MD, Associate Professor of Nephrology, UCSF
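To make the effect of the race adjustment concrete, here is a minimal sketch. The article does not say which estimating equation it has in mind; the sketch assumes the widely used four-variable MDRD study equation (IDMS-traceable form), and the function name, parameter names, and example patient are hypothetical illustrations, not taken from any hospital's system.

```python
def egfr_mdrd(creatinine_mg_dl: float, age_years: float,
              female: bool, black: bool) -> float:
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    Assumption: the article does not name its equation; MDRD is used here
    only to illustrate how a fixed race multiplier shifts the estimate.
    """
    egfr = 175.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742   # sex coefficient
    if black:
        egfr *= 1.212   # race coefficient, the term institutions are dropping
    return egfr


# Hypothetical patient: same blood test, same age, same sex.
without_race = egfr_mdrd(1.4, 60, female=False, black=False)
with_race = egfr_mdrd(1.4, 60, female=False, black=True)
print(f"no race term: {without_race:.1f}, with race term: {with_race:.1f}")
```

For this hypothetical 60-year-old man with a creatinine of 1.4 mg/dL, the estimate is roughly 52 without the race term and roughly 63 with it. Since an estimated GFR below 60 is a commonly used cut-off for reduced kidney function, the identical blood test lands on opposite sides of that line depending on a single yes/no category.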
23andMe has made a business of separating us based on our genetics rather than our phenotype. Their calculated equivalent of “race” is ancestry composition. To give you a sense of how genetics differs from phenotype, consider the US census, which describes six categories, and 23andMe, which has six main categories with 18 sub-categories and an additional 38 sub-sub-categories.
Our earliest gender assignment was based on the most obvious of phenotypes, our external genitalia: male, female, and individuals born with both, hermaphrodites. With expanding knowledge, our definition has progressed to the internal phenotype, the presence of a uterus and ovaries or a prostate and testes, and then to the presence of sex chromosomes. Today, gender can be measured by the release of hormones, a metabolic pattern. This metabolic definition has come forward in the discussion around long-distance runner Caster Semenya and her ability to compete as a female athlete. Further complicating the problem is the entanglement of gender identity and orientation, which are related but separate. While we see more and more often that the question of gender on surveys is couched as “gender identified at birth,” this wording may satisfy political correctness but fails to provide much scientific precision.
You would think that age is pretty straightforward. Even creating buckets of ages should be easy. Of course, it all depends on what you mean by age. In looking at underage drinking, the definition of underage may vary from locale to locale, but what it means to be chronologically 18 or 19 should be constant. On the other hand, what happens when we want to study health impacts? In this setting, are we interested in chronologic or physiologic age (often considered as frailty or some measure of co-morbidities)? It will make a difference. For example, how important is age versus frailty in the high mortality associated with COVID-19? Capturing a chronologic value is probably not sufficient.
Medicine – the applied art
All of these categories are basic to applied medical research. And while the cultural positioning around categorizing race and, to a lesser degree, gender burns with a hot white light that may shed more heat than necessary, the concerns they raise are nonetheless valid. We need not only to develop consistent definitions but also to recognize that the appropriate definition changes with the area we study. This further fractionation of categories may improve precision medicine, letting us compare more apples to apples. Still, it will require far more participants in studies to replace the statistical power lost as categories increase.
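The cost in participants can be sketched with a standard textbook approximation. The example below assumes a two-arm trial comparing means, a two-sided 5% significance level, 80% power, a normal approximation, and an analysis run separately within each category; the function names and the chosen effect size are hypothetical, not drawn from any particular study.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Participants needed in each arm to detect a standardized mean
    difference `effect_size` (normal approximation, two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def total_participants(n_categories: int, effect_size: float) -> int:
    """Total enrollment if every category must be powered on its own
    (assumption: two arms per category, equal-sized categories)."""
    return 2 * n_categories * n_per_arm(effect_size)


for k in (1, 6, 18):
    print(k, "categories ->", total_participants(k, effect_size=0.5), "participants")
```

Under these assumptions the required enrollment grows in direct proportion to the number of categories, so moving from one pooled group to six separately analyzed groups multiplies the trial size by six. Real categories are rarely equal in size, which makes the smallest ones even harder to power.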
Dr. Charles Dinerstein, M.D., MBA, FACS is the Medical Director at the American Council on Science and Health. He has over 25 years of experience as a vascular surgeon. He completed his MBA with distinction in the George Washington University Healthcare MBA program and has served as a consultant to hospitals.
A version of this article was originally published at the American Council on Science and Health’s website and has been republished here with permission. The American Council on Science and Health can be found on Twitter @ACSHorg