Amazon scraps secret AI recruiting tool that showed bias against women

SAN FRANCISCO: Amazon.com’s machine-learning specialists uncovered a big problem: their new recruiting engine did not like women.
The team had been building computer programs since 2014 to review job applicants’ resumes with the aim of mechanizing the search for top talent, five people familiar with the effort said.
Automation has been key to Amazon’s e-commerce dominance, be it inside warehouses or driving pricing decisions. The company’s experimental hiring tool used artificial intelligence to give job candidates scores ranging from one to five stars – much like shoppers rate products on Amazon, some of the people said.
“Everyone wanted this holy grail,” one of the people said. “They literally wanted it to be an engine where I’m going to give you 100 resumes, it will spit out the top five, and we’ll hire those.”
But by 2015, the company realized its new system was not rating candidates for software developer jobs and other technical posts in a gender-neutral way.
That is because Amazon’s computer models were trained to vet applicants by observing patterns in resumes submitted to the company over a 10-year period. Most came from men, a reflection of male dominance across the tech industry.
In effect, Amazon’s system taught itself that male candidates were preferable. It penalized resumes that included the word “women’s,” as in “women’s chess club captain.” And it downgraded graduates of two all-women’s colleges, according to people familiar with the matter. They did not specify the names of the schools.
Amazon edited the programs to make them neutral to these particular terms. But that was no guarantee that the machines would not devise other ways of sorting candidates that could prove discriminatory, the people said.
The Seattle company ultimately disbanded the team by the start of last year because executives lost hope for the project, according to the people, who spoke on condition of anonymity. Amazon’s recruiters looked at the recommendations generated by the tool when searching for new hires, but never relied solely on those rankings, they said.
Amazon declined to comment on the recruiting engine or its challenges, but the company says it is committed to workplace diversity and equality.
The company’s experiment, which Reuters is first to report, offers a case study in the limitations of machine learning. It also serves as a lesson to the growing list of large companies including Hilton Worldwide Holdings and Goldman Sachs that are looking to automate portions of the hiring process.
Some 55 percent of US human resources managers said artificial intelligence, or AI, would be a regular part of their work within the next five years, according to a 2017 survey by talent software firm CareerBuilder.
Employers have long dreamed of harnessing technology to widen the hiring net and reduce reliance on subjective opinions of human recruiters. But computer scientists such as Nihar Shah, who teaches machine learning at Carnegie Mellon University, say there is still much work to do.
“How to ensure that the algorithm is fair, how to make sure the algorithm is really interpretable and explainable – that’s still quite far off,” he said.
Amazon’s experiment began at a pivotal moment for the world’s largest online retailer. Machine learning was gaining traction in the technology world, thanks to a surge in low-cost computing power. And Amazon’s Human Resources department was about to embark on a hiring spree: Since June 2015, the company’s global headcount has more than tripled to 575,700 workers, regulatory filings show.
So, it set up a team in Amazon’s Edinburgh engineering hub that grew to around a dozen people. Their goal was to develop AI that could rapidly crawl the web and spot candidates worth recruiting, the people familiar with the matter said.
The group created 500 computer models focused on specific job functions and locations. They taught each to recognize some 50,000 terms that showed up on past candidates’ resumes. The algorithms learned to assign little significance to skills that were common across IT applicants, such as the ability to write various computer codes, the people said.
Instead, the technology favored candidates who described themselves using verbs more commonly found on male engineers’ resumes, such as “executed” and “captured,” one person said.
Gender bias was not the only issue. Problems with the data that underpinned the models’ judgements meant that unqualified candidates were often recommended for all manner of jobs, the people said.
With the technology returning results almost at random, Amazon shut down the project, they said.
The company managed to salvage some of what it learned from its failed AI experiment. It now uses a “much-watered down version” of the recruiting engine to help with some rudimentary chores, including culling duplicate candidate profiles from databases, one of the people familiar with the project said.
Another said a new team in Edinburgh has been formed to give automated employment screening another try, this time with a focus on diversity.

Amazon scraps secret AI recruiting tool that showed bias against women