Artificial intelligence systems are getting better and smarter, but are they ready to make impartial predictions, recommendations or decisions for us? Not quite, Gartner research vice president Darin Stewart said at the 2018 Gartner Catalyst event in San Diego.
Just like in our society, bias in AI is ubiquitous, Stewart said. These AI biases tend to arise from the priorities that the developer and the designer set when developing the algorithm and training the model.
Direct bias in AI arises when the model makes predictions, recommendations and decisions based on sensitive or prohibited attributes — aspects like race, gender, sexual orientation and religion. Fortunately, with the right tools and processes in place, direct bias can be “pretty easy to detect and prevent,” Stewart said.
According to Stewart, preventing bias requires situational testing on the inputs, turning off each of the sensitive attributes as you’re training the model and then measuring the impact on the output. The problem is that one of machine learning’s fundamental characteristics is to compensate for missing data. Therefore, nonsensitive attributes that are strongly correlated with the sensitive attributes are going to be weighted more strongly to compensate. This introduces — or at least reinforces — indirect bias in AI systems.
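The situational test Stewart describes can be sketched in a few lines of Python. This is a minimal illustration, not a production audit: `toy_score` is a hypothetical stand-in for a trained model, the records are invented, and the test is applied to a fixed model after the fact rather than during training.

```python
from statistics import mean

def toy_score(record):
    """Hypothetical stand-in for a trained model; returns a risk score."""
    score = 0.2
    if record["prior_offenses"] > 2:
        score += 0.3
    if record["group"] == "B":  # deliberately planted direct bias
        score += 0.25
    return score

def attribute_impact(model, records, attribute, neutral_value):
    """Mean absolute change in output when `attribute` is switched off."""
    shifts = []
    for record in records:
        masked = dict(record, **{attribute: neutral_value})
        shifts.append(abs(model(record) - model(masked)))
    return mean(shifts)

# Invented records, for illustration only.
records = [
    {"prior_offenses": 0, "group": "A"},
    {"prior_offenses": 3, "group": "A"},
    {"prior_offenses": 0, "group": "B"},
    {"prior_offenses": 3, "group": "B"},
]

impact_group = attribute_impact(toy_score, records, "group", None)
impact_priors = attribute_impact(toy_score, records, "prior_offenses", 0)
print(f"group: {impact_group:.3f}, prior_offenses: {impact_priors:.3f}")
```

A nonzero impact for a sensitive attribute such as `group` means the model is using it directly. An impact of zero does not clear the model, because correlated nonsensitive attributes can still carry the same signal.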
AI bias in criminal sentencing
A distressing real-life example of indirect bias reinforcement comes from criminal justice, where an AI sentencing tool called Compas is in use in several U.S. states, Stewart said. The system takes a defendant’s profile and generates a risk score reflecting how likely the defendant is to reoffend and pose a risk to the community. Judges then take these risk scores into account when sentencing.
A study looked at several thousand verdicts associated with the AI system and found that African-Americans were 77% more likely than white defendants to be incorrectly classified as high risk. Conversely, white defendants were 40% more likely to be misclassified as low risk and then go on to reoffend.
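Disparities like these are measured by comparing error rates across groups. The sketch below shows the arithmetic under stated assumptions: the group names and confusion-matrix counts are invented for illustration and are not the study's data, and `expand` and `error_rates` are hypothetical helper names.

```python
def expand(tp, fp, fn, tn):
    """Turn confusion-matrix counts into (predicted_high_risk, reoffended) pairs."""
    return ([(True, True)] * tp + [(True, False)] * fp
            + [(False, True)] * fn + [(False, False)] * tn)

def error_rates(outcomes):
    """False positive and false negative rates for one group."""
    fp = sum(1 for pred, actual in outcomes if pred and not actual)
    tn = sum(1 for pred, actual in outcomes if not pred and not actual)
    fn = sum(1 for pred, actual in outcomes if not pred and actual)
    tp = sum(1 for pred, actual in outcomes if pred and actual)
    did_not_reoffend = fp + tn
    reoffended = fn + tp
    return {
        # Wrongly labeled high risk, among those who did not reoffend.
        "false_positive_rate": fp / did_not_reoffend if did_not_reoffend else 0.0,
        # Wrongly labeled low risk, among those who did reoffend.
        "false_negative_rate": fn / reoffended if reoffended else 0.0,
    }

# Invented counts, for illustration only.
groups = {
    "group_a": expand(tp=30, fp=30, fn=10, tn=70),
    "group_b": expand(tp=30, fp=15, fn=20, tn=85),
}

rates = {name: error_rates(outcomes) for name, outcomes in groups.items()}
fpr_ratio = (rates["group_a"]["false_positive_rate"]
             / rates["group_b"]["false_positive_rate"])
print(rates, fpr_ratio)
```

Comparing the two rates side by side matters: a system can look accurate overall while its mistakes fall in opposite directions for different groups.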
Even though race is not part of the underlying data set, Compas’ predictions are highly correlated with it because more weight is given to related nonsensitive attributes like geography and education level.
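One way to spot such proxy attributes is to check how strongly each nonsensitive attribute correlates with the sensitive one. The following is a rough sketch that assumes the sensitive attribute is still available for auditing even though it is withheld from training. The column names and values are invented, and Pearson correlation is only one of several measures that could be used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented audit data: 1/0 encodings for eight hypothetical records.
sensitive = [1, 1, 1, 1, 0, 0, 0, 0]      # held out of training
candidates = {
    "neighborhood": [1, 1, 1, 1, 1, 0, 0, 0],
    "signup_day":   [1, 0, 1, 0, 1, 0, 1, 0],
}

proxy_strength = {
    name: abs(pearson(values, sensitive))
    for name, values in candidates.items()
}
for name, strength in sorted(proxy_strength.items()):
    print(f"{name}: {strength:.2f}")
```

Here `neighborhood` tracks the sensitive attribute closely and would act as a proxy, while `signup_day` carries no information about it.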
“You’re kind of in a Catch 22,” Stewart said. “If you omit all of the sensitive attributes, yes, you’re eliminating direct bias, but you’re reintroducing and reinforcing indirect bias. And if you have separate classifiers for each of the sensitive attributes, then you’re reintroducing direct bias.”
One of the best ways IT pros can combat this, Stewart said, is to determine at the outset what the threshold of acceptable differentiation should be and then measure each attribute against it. An attribute that exceeds the threshold is excluded from the model. One that falls under the limit is included.
“You should use those thresholds, those measures of fairness, as constraints on the training process itself,” Stewart said.
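A minimal sketch of both ideas, the threshold check and the training constraint, might look like the following. Everything here is an assumption made for illustration: the differentiation scores, the 0.30 and 0.2 thresholds, the choice of a parity gap as the fairness measure, and the penalty weight. Stewart did not specify any of them, nor how to treat an attribute that lands exactly on the threshold (this sketch includes it).

```python
def split_by_threshold(differentiation, threshold):
    """Partition attributes into (included, excluded) by their score."""
    included = sorted(n for n, v in differentiation.items() if v <= threshold)
    excluded = sorted(n for n, v in differentiation.items() if v > threshold)
    return included, excluded

def parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between groups."""
    rates = []
    for group in set(groups):
        preds = [p for p, g in zip(predictions, groups) if g == group]
        rates.append(sum(preds) / len(preds))
    return max(rates) - min(rates)

def constrained_loss(base_loss, gap, threshold, penalty_weight=10.0):
    """Add a penalty only when the fairness gap exceeds the threshold."""
    return base_loss + penalty_weight * max(0.0, gap - threshold)

# Hypothetical differentiation scores per attribute (higher = more suspect).
differentiation = {"zip_code": 0.77, "education": 0.35, "age": 0.10}
included, excluded = split_by_threshold(differentiation, threshold=0.30)

# Hypothetical binary predictions for two groups of four people each.
gap = parity_gap([1, 1, 0, 1, 0, 0, 0, 1], ["A"] * 4 + ["B"] * 4)
loss = constrained_loss(base_loss=0.4, gap=gap, threshold=0.2)
print(included, excluded, gap, round(loss, 2))
```

Folding the penalty into the loss means the optimizer is pushed toward models that stay inside the fairness threshold, rather than having fairness checked only after training is finished.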
If you are creating an AI system that is going to “materially impact someone’s life,” you also need to have a human in the loop who understands why decisions are being made, he added.
Context is key
Stewart also warned IT practitioners to be wary when training an AI system on historical records. AI systems are optimized to match previous decisions, and with them previous biases. He pointed to the racist practice of “redlining” in Portland, Ore., which was legal in the city from 1856 until 1990 and prevented people of color from purchasing homes in certain neighborhoods for decades. AI systems used in real estate could potentially reinstate this practice, Stewart said.
“Even though the laws change and those bias practices are no longer allowed, there’s 144 years of precedent data and a lot of financial activity-based management solutions are trained on those historical records,” Stewart said.
To avoid perpetuating that type of bias in AI, Stewart said it’s critical that IT pros pay close attention to the context surrounding their training data.
“This goes beyond basic data hygiene,” Stewart said. “You’re not just looking for corrupted and duplicate values, you’re looking for patterns. You’re looking for context.”
If IT pros are using unstructured data, text analytics is their best friend, Stewart said. It can help them uncover patterns they wouldn’t find otherwise. Ideally, IT pros will also have a master list of “don’t-go-there” items they check against when searching for bias.
“Develop a list of suspect results so that if something unusual started popping out of the model, it would be a red flag that needs further investigation,” Stewart said.
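A watch list of this kind could be as simple as a check that flags any watched field where one value accounts for an outsized share of a given outcome. The sketch below is one possible shape for it. The field names, the sample decisions and the 60% cutoff are all hypothetical.

```python
from collections import Counter

def red_flags(results, watched_fields, outcome, max_share):
    """Flag (field, value, share) where one value dominates an outcome."""
    hits = [row for row in results if row["decision"] == outcome]
    flagged = []
    if not hits:
        return flagged
    for field in watched_fields:
        counts = Counter(row[field] for row in hits)
        for value, count in counts.items():
            share = count / len(hits)
            if share > max_share:
                flagged.append((field, value, share))
    return flagged

# Invented model output, for illustration only.
results = [
    {"neighborhood": "north", "branch": "east", "decision": "denied"},
    {"neighborhood": "north", "branch": "west", "decision": "denied"},
    {"neighborhood": "north", "branch": "east", "decision": "denied"},
    {"neighborhood": "south", "branch": "west", "decision": "denied"},
    {"neighborhood": "south", "branch": "east", "decision": "approved"},
    {"neighborhood": "south", "branch": "west", "decision": "approved"},
]

flags = red_flags(results, ["neighborhood", "branch"], "denied", max_share=0.6)
print(flags)  # [('neighborhood', 'north', 0.75)]
```

A flag is a prompt for further investigation by a person, not proof of bias on its own.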
Intentionally inserting bias in AI
Is there ever a case where IT pros would want to inject bias into an AI system? With all the talk about the dangers of perpetuating AI bias, it may seem odd to even consider the possibility. But if one is injecting that bias to correct a past inequity, Stewart’s advice was to go for it.
“That is perfectly acceptable if it is a legitimate and ethical target,” he said. “There are legitimate cases where a big disparity between two groups is the correct outcome, but if you see something that isn’t right or that isn’t reflected in the natural process, you can inject bias into the algorithm and optimize it to maximize [a certain] outcome.”
Inserting bias in AI systems could, for instance, be used to correct gender disparities in certain industries, he said. The only proviso he would put on the practice of purposefully inserting bias into an AI algorithm is to document it and be transparent about what you’re doing.
“That way, people know what’s going on inside the algorithm and if suddenly things shift to the other extreme, you know how to dial it back,” Stewart said.
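In code, that transparency could mean keeping the adjustment as an explicit, recorded setting rather than burying it in the training data. The sketch below assumes the bias is injected through per-sample training weights, which is only one possible mechanism. The `BiasAdjustment` class, its fields and all values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class BiasAdjustment:
    """A documented, deliberate adjustment. A weight of 1.0 means none."""
    group: str
    weight: float
    rationale: str
    approved_by: str

def sample_weights(groups, adjustment):
    """Per-record training weights that upweight the targeted group."""
    return [adjustment.weight if g == adjustment.group else 1.0 for g in groups]

# Hypothetical adjustment, for illustration only.
adjustment = BiasAdjustment(
    group="women",
    weight=1.5,
    rationale="Correct historical underrepresentation in hiring data",
    approved_by="ethics-review-board",
)

weights = sample_weights(["women", "men", "women"], adjustment)

# The audit record makes the adjustment visible and reversible.
audit_record = json.dumps(asdict(adjustment), sort_keys=True)
print(weights)
print(audit_record)
```

Because the weight is a single recorded number, dialing the adjustment back is a matter of moving it toward 1.0 and logging the change.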