Amongst the most widely used algorithms is one whose output is so ingrained in our lives and in our societal experience that most of us never notice it. Yet this algorithm has pernicious effects on the lives of some people, delaying their development, disadvantaging them in their careers, and denting their self-image. What is this source of injustice, hidden in plain sight? Nothing more than the humble alphabetical sort. The harm caused by our preference for lists ordered from A to Z is, however, well documented. There are the schoolchildren who get fewer turns in PE lessons because those with a surname starting with W rather than B might not get to run again in the time available, and the applications to college or university which get less attention because they languish at the bottom of the pile.
There are the collaborators on academic papers whose names are hidden in et al., and even the subtle motivational harm of lacklustre clapping at the end of a graduation ceremony because the candidates were presented in alphabetical order. I give these examples precisely because they are mundane, but all are foreseeable given some thought. In many of these cases there is not even an advantage to having ordered data. So why is it the sort, rather than a “random order” feature, that takes pride of place in our spreadsheets’ menus?
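For the curious, here is a minimal sketch in Python of that missing alternative: ordering a list randomly rather than alphabetically. The surnames are invented for illustration.

```python
# A minimal sketch: the default A-to-Z ordering versus a "random order"
# alternative for cases where the order itself carries no meaning.
import random

pupils = ["Watson", "Baker", "Zhou", "Adams", "Khan", "Novak"]

alphabetical = sorted(pupils)      # the familiar, default ordering
shuffled = pupils.copy()
random.shuffle(shuffled)           # gives Watson the same chance of an early turn as Adams

print(alphabetical)
print(shuffled)
```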
Ethical challenges
There is a temptation to imagine the ethical challenges in AI as a consequence of impenetrable complexity and “general AI” behaviour. A general AI, which we could relate to as a sentient entity and turn to many varied tasks, is a relatively distant proposition, and that distance in turn encourages us to kick the ethical questions into the long grass. Instead, we should first consider those relatively simple automatic decisions, made with personal data, that produce disproportionately negative effects. Such labour-saving automations may not be as headline-grabbing as a malevolent AI, Skynet-like in its evil intent, but they are vastly more important and influential. When a bank refuses credit to a current customer, or even refuses to offer banking services to a new customer, those decisions could be based on biased training data or poorly fitting algorithms. Similarly, the first stages of recruitment in larger firms depend on the rapid processing of large numbers of CVs. Biases introduced at that early stage will inevitably affect the overall hiring profile of the company, and the rejected applicants will have no idea that it was the system, not their data, which counted against them. If the most mundane automation can produce disproportionate harm, then the new deep neural networks, whose hidden layers of processing drive their inferences, have even greater potential for it: establishing how an inference was arrived at in any classical way becomes difficult or impossible.
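To make the mechanism concrete, here is a hypothetical sketch in Python (using NumPy and scikit-learn). The data, the “postcode” proxy feature and the numbers are all invented for illustration, not drawn from any real screening system; it simply shows how a model that never sees a protected attribute can still reproduce a historical bias through a correlated proxy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

group = rng.integers(0, 2, n)                    # protected attribute (never shown to the model)
postcode = group * 0.8 + rng.normal(0, 0.3, n)   # proxy feature the model *does* see
skill = rng.normal(0, 1, n)                      # genuinely job-relevant signal

# Historical hiring decisions were biased against group 1, independent of skill.
hired = ((skill - 0.7 * group + rng.normal(0, 0.5, n)) > 0).astype(int)

features = np.column_stack([skill, postcode])    # "group" itself is deliberately excluded
model = LogisticRegression().fit(features, hired)

print(model.coef_)  # the postcode proxy takes on a negative weight, reproducing the old bias
```

A rejected applicant sees only the outcome; nothing in the model’s inputs mentions the protected attribute, yet the bias persists.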
Bias - creations and creators
Of course, there is no reason we should expect more from our creations than we do from their creators: we can draw a parallel here with human reasoning. Often, the reasons we give for a decision are a post hoc justification and not the real reason at all. Bias goes undetected in human decision making all the time, even by the decision maker, and an AI could behave in the same way. Who takes responsibility for the decision made by a biased staff member? Normally, it is the organisation, not the individual. So must it be for AI decision making.
Our examples so far reflect decision making and the ethical risks of leaving that to machines. What happens when machines output human language? In May of this year, a particular kind of natural language processing model, GPT-3, hit the news. Developed by a private company called OpenAI, it was released in a restricted way for research purposes. Its creators openly admitted that many potential use cases were unexplored and possibly harmful, and some of the early output confirmed their fears, showing leanings towards antisemitism and gender bias. That was unsurprising, since much of the training text came from the uncensored web, where these views are often shared under the cover of anonymity. GPT-3 is the latest in a series of models trained on ever larger corpora of the written word, but it differs from its smaller ancestors in that it is too large to be trained outside of the lab environment. Instead, it is provided in its trained state and simply prompted by the user with phrases, descriptions or requests. It then outputs some paragraphs of related text, which could be answers to the questions asked, an interview or screenplay, the beginnings of a story or even an entire article on the subject prompted. Some of the output will be verbatim from the gigabytes of documents it was trained on (including Wikipedia articles and scientific papers), but much is strung together from word and phrase associations in the training data and resembles original thought. Controversy is widespread, much of it related to how ‘human’ the output seems at first glance. Much of its writing is of a higher standard than the clickbait articles that have become so familiar in the advertising-funded model of the modern web. GPT-3 responses to human questions even went unnoticed on Reddit for a week, even though some of the topics were very sensitive.[1] There are concerns about the sheer rate of unedited output which is possible, although the developers, in collaboration with Microsoft, are reviewing each new use case to filter out the most likely abusers. Even with access strictly controlled, it will not be long before GPT-3 and competing technologies are widely accessible to whoever should want to use them, regardless of their goals. Misinformation campaigns, spam email content, SEO filler text: all are now cheaply within reach, with human input reduced to editing the junk rather than acting creatively to produce it.
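As an illustration of the prompt-then-generate workflow described above, here is a short Python sketch using GPT-2, one of GPT-3’s smaller, openly available ancestors, via the Hugging Face transformers library. GPT-3 itself sits behind OpenAI’s access-controlled API, so the model choice and prompt here are assumptions for demonstration only.

```python
# A sketch of prompting a pre-trained language model and receiving generated
# text back, using GPT-2 rather than the access-controlled GPT-3.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The role of oversight in automated decision making is"
result = generator(prompt, max_length=60, num_return_sequences=1)

print(result[0]["generated_text"])  # the prompt, continued with model-generated text
```

The user supplies nothing but the opening phrase; everything that follows is assembled from associations learned during training, which is precisely why the character of the training corpus matters so much.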
Options
So what options do we have to limit the ethical traps that accompany the manifold benefits of AI to society? There are efforts to regulate the use of AI in decision making, but the groups developing frameworks for algorithmic accountability[3] cannot expect to keep pace with developments in the area. Rather than trying to regulate every use and output, we might consider a principles-based approach which mandates particular standards and oversight in all such systems, whether the intelligences behind them are purely artificial, purely human or a hybrid of the two. Freedom from bias (what might be described as fairness), a right of appeal to a competent authority and an oversight body to investigate the most egregious examples of abuse might be starting points.
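As a sketch of what a “freedom from bias” standard could test in practice, the Python fragment below compares favourable-decision rates between two groups, one simple fairness measure often called demographic parity. The decisions, group labels and tolerance are invented for illustration, not drawn from any proposed framework.

```python
# A hypothetical check a principles-based oversight regime might require:
# compare the rate of favourable decisions across a protected attribute.
def demographic_parity_gap(decisions, groups):
    """Absolute difference in favourable-decision rate between groups 0 and 1."""
    def rate(g):
        members = [d for d, grp in zip(decisions, groups) if grp == g]
        return sum(members) / len(members)
    return abs(rate(0) - rate(1))

decisions = [1, 0, 1, 1, 0, 0, 1, 0]   # e.g. credit granted (1) or refused (0)
groups    = [0, 0, 0, 0, 1, 1, 1, 1]   # protected-attribute membership

if demographic_parity_gap(decisions, groups) > 0.2:   # tolerance chosen arbitrarily here
    print("Disparity exceeds tolerance: refer to the oversight body for review.")
```

The same check applies whether the decisions came from a person, an algorithm or a hybrid of the two, which is the attraction of stating the principle rather than regulating the technology.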
One thing that is certain is that getting this right early on is paramount, rather than running to catch up with the developing technology and its broadening use cases. Frameworks and oversight for social media might have been beneficial ten years ago, but the cat is very much out of the bag now. We should not miss the same opportunity where algorithms are concerned, no matter how mundane the technology and its applications are at present. Maybe this is one area, at least, where we should work through the challenges and impacts from A to Z?
References
[3] The EU’s effort for example: https://www.europarl.europa.eu/thinktank/en/document.html?reference=EPRS_STU(2019)624262