A 2018 analysis of studies looking for genetic variants associated with disease found that under-representation [of minorities] persists: 78% of study participants were of European ancestry, compared to 10% of Asian ancestry and 2% of African ancestry. Other ancestries each represented less than 1% of the total. Several projects, such as H3Africa, are starting to increase participation of under-represented groups, both among participants and among researchers. Large biobanks assembled in Europe and North America, combining biological samples with health-related data, also set sampling targets to increase diversity.
But even when data from minority groups are available, many researchers discard them. Although there can be valid reasons to restrict analyses to a particular population, discarding such data by default is ethically problematic: it worsens under-representation and negates participants’ efforts to contribute to research.
Funding agencies have taken steps to improve the diversity of participants who are recruited for studies; notably, this has led to better representation of women in clinical trials since the 1990s. But agencies have less control over researchers’ decisions about what to analyse. Scientists are drawn to statistical convenience and driven by publishing incentives, both of which can conflict with the collective goal of greater equity.
There are good reasons to follow precedent: using standard analytical pipelines reduces development cost and the need for extensive validation and explanation…. By omitting data, scientists squander an opportunity to build useful knowledge about minority populations.