You’ve probably heard the joke. Two campers see a bear approaching. One starts putting on her sneakers. The other asks: “What are you doing? You can’t outrun a bear.” The response is obvious: “I don’t have to outrun the bear; I just have to outrun you.”
So it is with the “objectivity” of information systems. The question is not whether digital services are—or ever can be—completely unbiased. It’s whether they can outrun the available alternatives. Much more often than not, they can and will.
That bias exists in these systems is undeniable. A machine-learning system is only as objective as its underlying dataset. If that data reflects historical biases, then so will the system. If certain populations are underrepresented in the data, then the system will perform less well for them.
…
The good news is that just as survey professionals strive to design random samples and ask neutral questions, so can AI developers take steps to ensure the quality of their underlying data. Systems can, for example, be tested for bias by isolating factors such as race, gender and location. Weaknesses in facial recognition can be, and have been, corrected through better data. These and similar techniques will only improve over time, as business practices mature and as the volume of relevant data steadily increases.
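
To make "testing for bias by isolating factors" a little more concrete, here is a minimal sketch in Python. It assumes you already have a system's predictions alongside the true outcomes, and it simply compares accuracy across the values of one factor at a time. The records, field names and helper functions are all hypothetical, invented for illustration. Real audits use larger samples and more than one metric.

```python
from collections import defaultdict


def accuracy_by_group(records, factor):
    """Return {group_value: accuracy} for a single isolated factor."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[factor]
        total[group] += 1
        correct[group] += int(record["predicted"] == record["actual"])
    return {group: correct[group] / total[group] for group in total}


def largest_gap(rates):
    """Difference between the best-served and worst-served group."""
    return max(rates.values()) - min(rates.values())


# Made-up records for illustration only; not real data.
records = [
    {"gender": "F", "location": "urban", "predicted": 1, "actual": 1},
    {"gender": "F", "location": "rural", "predicted": 0, "actual": 1},
    {"gender": "F", "location": "urban", "predicted": 0, "actual": 0},
    {"gender": "F", "location": "rural", "predicted": 1, "actual": 0},
    {"gender": "M", "location": "urban", "predicted": 1, "actual": 1},
    {"gender": "M", "location": "rural", "predicted": 0, "actual": 0},
    {"gender": "M", "location": "urban", "predicted": 1, "actual": 1},
    {"gender": "M", "location": "rural", "predicted": 1, "actual": 0},
]

for factor in ("gender", "location"):
    rates = accuracy_by_group(records, factor)
    print(factor, rates, "gap:", largest_gap(rates))
```

A large gap for any one factor does not prove unfairness on its own, but it shows where the underlying data deserves a closer look.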