To prevent discrimination, you have to include race in the algorithm

What happens if you let the computer steer the selection using characteristics such as origin and gender? Then you get a much better result, researchers from the Central Planning Bureau found.

Joost van Egmond

It’s time for a change of course. Because if there’s one thing we avoid now, it’s including data about race or gender in an automated selection procedure. Too great a risk of profiling, we tend to think. A frightening example was the childcare benefits scandal, in which people with a second nationality were singled out. But you should actually use this data, researchers from the Central Planning Bureau (CPB) argue.

On Wednesday they publish a report under the heading “fair algorithms”. That is quite a claim, but the CPB thinks it can live up to it. “Or rather,” researcher Mark Kattenberg hastens to say, “this method gives you the opportunity to steer toward what you consider fair. We make no claims about what is fair.”

First effectiveness, then representation

The idea is not that complicated. It all starts with the first round, in which the focus is on effectiveness, as is often the case now. The computer, often a self-learning program, is instructed to find the people with the highest chance of belonging to the group you are looking for. That can be anything: fraud detection, where you look for potential fraudsters, but also, for example, picking which of thousands of applicants is best suited for medical school. The latter is not currently done by a computer, but the researchers tried it in a test.
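A first round of this kind could look roughly like the sketch below. This is not the CPB’s actual code: the data columns, the outcome label “completed_on_time”, and the choice of model are assumptions made purely for illustration.

```python
# Hypothetical sketch of an effectiveness-only first round (not the CPB's code).
# Assumes a historical cohort with a known outcome and a pool of new applicants.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

def rank_by_effectiveness(history: pd.DataFrame, applicants: pd.DataFrame,
                          feature_cols: list[str], n_select: int) -> pd.DataFrame:
    # Learn from past applicants which characteristics predicted finishing on time.
    model = GradientBoostingClassifier()
    model.fit(history[feature_cols], history["completed_on_time"])

    # Score the new applicants and keep the n_select with the highest predicted chance.
    scored = applicants.copy()
    scored["p_success"] = model.predict_proba(applicants[feature_cols])[:, 1]
    return scored.sort_values("p_success", ascending=False).head(n_select)
```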

That first round produces a particular selection, and such a list is not without risk, as became clear recently when it emerged that student finance agency DUO was primarily investigating MBO students with a migration background for fraud. You counter that with the second step: checking the selection for representation. Are certain groups over- or underrepresented? To find out, you need precisely the data we currently keep out of the picture as much as possible, such as race, gender, parental income, and so on.
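A representation check of this kind might look like the sketch below. The group column and its values are illustrative names, not taken from the report.

```python
# Hedged sketch of the representation check: compare each group's share in the
# selection with its share in the full applicant pool. Column names are illustrative.
import pandas as pd

def representation_gap(selected: pd.DataFrame, pool: pd.DataFrame,
                       group_col: str) -> pd.DataFrame:
    groups = pool[group_col].unique()
    share_selected = (selected[group_col].value_counts(normalize=True)
                      .reindex(groups, fill_value=0.0))
    share_pool = (pool[group_col].value_counts(normalize=True)
                  .reindex(groups, fill_value=0.0))
    # Positive gap = over-represented in the selection, negative = under-represented.
    return (share_selected - share_pool).rename("gap").to_frame()
```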

The best of both worlds?

With this data, you can make adjustments until you reach the combination of effectiveness and representation that you, as the algorithm’s manager, consider optimal. Kattenberg and colleagues demonstrate this with a study of admission to medical school. A weighted lottery was used there for years, in which candidates with the highest grades had the best chance. An algorithm designed by CPB staff comfortably beat the effectiveness of that lottery: it was much better at predicting who would complete medical school within the allotted time. This came at the expense of representation, however, as students with a migration background hardly appeared in the selection. But once that is corrected, you get the best of both worlds: still a much higher success rate than the lottery, while the percentage of students with a migration background in the selection equals the enrollment percentage.
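One simple way to make such a correction is sketched below: keep ranking by predicted success within each group, but fix each group’s share of the admitted cohort to its share among applicants. This only illustrates the idea; the CPB’s actual adjustment method is not reproduced here.

```python
# Illustrative correction step (not the CPB's method): proportional quotas per group,
# filled with the highest-scoring candidates within each group.
import pandas as pd

def select_with_representation(scored: pd.DataFrame, group_col: str,
                               n_select: int) -> pd.DataFrame:
    pool_shares = scored[group_col].value_counts(normalize=True)
    picks = []
    for group, share in pool_shares.items():
        quota = round(share * n_select)  # seats proportional to the applicant pool
        group_rows = scored[scored[group_col] == group]
        picks.append(group_rows.nlargest(quota, "p_success"))
    # Rounding can leave the total a seat or two away from n_select.
    return pd.concat(picks)
```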

“That trade-off is what matters here,” Kattenberg asserts. “You can see what happens if you adjust the algorithm for representation, and what that costs in terms of effectiveness. That is of course a difficult discussion, but we have to have it.”
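That trade-off can be made visible by sweeping a fairness weight between the purely score-based selection and the group-balanced one, recording at each setting both the average predicted success and the largest representation gap. The sketch below reuses the hypothetical helpers from the earlier sketches and is, again, only an illustration.

```python
# Hedged sketch of a trade-off curve, reusing the illustrative helpers sketched above.
import pandas as pd

def tradeoff_curve(scored: pd.DataFrame, group_col: str, n_select: int) -> pd.DataFrame:
    rows = []
    for weight in [0.0, 0.25, 0.5, 0.75, 1.0]:
        # weight = 0: pure effectiveness; weight = 1: group shares match the pool.
        n_quota = int(round(weight * n_select))
        balanced = select_with_representation(scored, group_col, n_quota)
        top_up = scored.drop(balanced.index).nlargest(n_select - len(balanced), "p_success")
        cohort = pd.concat([balanced, top_up])
        rows.append({
            "weight": weight,
            "avg_p_success": cohort["p_success"].mean(),
            "max_gap": representation_gap(cohort, scored, group_col)["gap"].abs().max(),
        })
    return pd.DataFrame(rows)
```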

Iris Muis of the Utrecht Data School at Utrecht University follows the reasoning. The data ethicist, who was not involved in the research, sees potential in the concept. “You always have to check for bias to see whether an algorithm discriminates. This method is an elaboration of that. If you don’t include such sensitive data in your analysis, you don’t know what’s going on. And then you only find out when it’s too late.”

The handling of such personal data remains sensitive, however. It is only permitted if the distinction is made on the basis of relevant factors and is objectively justified. The Dutch Data Protection Authority (AP) oversees the proper use of algorithms as “coordinating supervisor”. It would not yet give a judgment on the CPB method. “We will take note of the report with great interest,” the authority said.

Read also:

Algorithms are not abstract at all: this is how they work

Algorithms do nothing without human instructions; even self-learning algorithms simply do what you tell them to. This is how the algorithm works, explained in plain language.

Awareness of data ethics is growing in government, but results have yet to follow

The benefits scandal, a court ban on an algorithm for detecting benefits fraud: major affairs have made clear in recent years that oversight of automated decisions falls short. But there is still a lot to do before that changes.
