The Limits of Machine Learning in Recruiting: Garbage In, Garbage Out?

Paul Breloff
Hardly a day goes by without someone asking: “So Shortlist is basically using artificial intelligence to screen CVs faster?” My response: “Well, not really…”

There’s no doubt that AI, neural nets, natural language processing, and machine learning are having a moment. They’re the shiny new toys in Silicon Valley right now, and with good reason — the capabilities are powerful, and if the vision is realized, the world will change, big-time. And no doubt there are a number of companies looking at ways to apply AI, machine learning and related concepts to recruiting.

It’s true that we incorporate sophisticated data analysis into our screening. We collect hundreds of data points from each candidate, and we’re starting to engineer even more features or variables that can predict who gets hired, who performs well, and who sticks around. These data feedback loops can get even more powerful as we do more and more hiring with a single employer, learning what they like and who does well at their company.

That said, we see glaring limitations to applying machine learning and AI to screen candidates based on their CVs and social media profiles alone, summed up in the old computer science adage of “garbage in, garbage out.” GIGO is the idea that an output can only be as good as its inputs: any predictive algorithm is only as good as the data going in and the outcome it is trained to predict. In this case, most fancy new machine learning-based solutions in the recruiting space use CV data as the primary input, and “similarity to other candidates/employees” or to a job description as the outcome to be predicted.

But this is myopic: CVs have been shown to have only modest predictive relevance to performance in a job when considered on their own. Of course prior experience can matter, but CVs alone fail to capture individual performance and contribution, raw talent, actual competence, motivation, and a host of other factors that are important in assessing quality.

There’s also the pesky reality that CVs are often embellished and written to be picked up by keyword-driven screening engines, which distorts analysis and results. Think back to your own hiring experiences: how many times has that candidate who looked so great on paper disappointed in reality? And have you ever had great colleagues who didn’t go to great schools or work at fancy corporates but who have shone in the real world? So don’t ignore a CV, but don’t rely on it exclusively!
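To make the keyword problem concrete, here is a minimal Python sketch of a naive screener that scores a CV by its word overlap with a job description. It is an illustration only, not how any particular screening product works: the function names, the job description, and both CVs are invented for this example, and the scoring rule (Jaccard similarity on word sets) is just one simple stand-in for "similarity to a job description."

```python
import re


def tokens(text):
    """Return the set of lowercase words in text (a deliberately naive tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_overlap(cv, job_description):
    """Jaccard similarity between the word sets of a CV and a job description."""
    cv_words, jd_words = tokens(cv), tokens(job_description)
    union = cv_words | jd_words
    return len(cv_words & jd_words) / len(union) if union else 0.0


# Hypothetical inputs, made up for illustration.
JOB = "sales manager with negotiation and leadership experience"
HONEST_CV = "led a team of six and closed major retail accounts"
STUFFED_CV = "sales manager negotiation leadership experience"

honest_score = keyword_overlap(HONEST_CV, JOB)
stuffed_score = keyword_overlap(STUFFED_CV, JOB)

print(f"honest CV:  {honest_score:.2f}")
print(f"stuffed CV: {stuffed_score:.2f}")
```

The CV that simply echoes the job description scores far higher than the one describing real accomplishments in different words, even though the score says nothing about how either person would perform.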

And there are limits on the availability of “outcome data” — i.e., how do people actually perform once on the job? Ultimately, you need good “training data” and time to build a good algorithm, which means you need a data set with outcomes you care about — i.e., what was the person’s productivity on the job, did they stick around, were they a great culture fit? Unfortunately, that data is rarely available when training an algorithm in recruiting contexts.

Unless you’re running predictions about actual performance, actual retention, and actual outcomes that matter, you run a great risk of simply pattern matching the status quo, which may entrench the same hiring mistakes, the same biases, and the same lack of diversity that we already see.
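A toy Python sketch can show why the choice of outcome matters. Everything here is an assumption made for illustration: the eight candidate records are hand-made, the field names are hypothetical, and the "model" is a one-rule learner that picks the single "feature >= threshold" rule that best matches a label. The data also pretends we can observe on-the-job performance for everyone, including people who were never hired, which real recruiting data rarely allows.

```python
# Synthetic, hand-made records (hypothetical, not real candidate data).
# In this toy history, past hiring followed school pedigree,
# while actual performance followed the test score.
CANDIDATES = [
    {"fancy_school": 1, "test_score": 55, "was_hired": 1, "performed_well": 0},
    {"fancy_school": 1, "test_score": 82, "was_hired": 1, "performed_well": 1},
    {"fancy_school": 1, "test_score": 48, "was_hired": 1, "performed_well": 0},
    {"fancy_school": 1, "test_score": 90, "was_hired": 1, "performed_well": 1},
    {"fancy_school": 0, "test_score": 88, "was_hired": 0, "performed_well": 1},
    {"fancy_school": 0, "test_score": 76, "was_hired": 0, "performed_well": 1},
    {"fancy_school": 0, "test_score": 40, "was_hired": 0, "performed_well": 0},
    {"fancy_school": 0, "test_score": 61, "was_hired": 0, "performed_well": 0},
]
FEATURES = ["fancy_school", "test_score"]


def accuracy(rows, feature, threshold, label):
    """Share of rows where the rule 'feature >= threshold' agrees with the label."""
    hits = sum((r[feature] >= threshold) == bool(r[label]) for r in rows)
    return hits / len(rows)


def fit_rule(rows, features, label):
    """Return the (feature, threshold) pair whose rule best matches the label."""
    best, best_acc = None, -1.0
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            acc = accuracy(rows, feature, threshold, label)
            if acc > best_acc:
                best, best_acc = (feature, threshold), acc
    return best


# Trained to imitate past hiring decisions, the learner latches onto pedigree.
status_quo_rule = fit_rule(CANDIDATES, FEATURES, "was_hired")
# Trained on the outcome we actually care about, it finds the test score.
performance_rule = fit_rule(CANDIDATES, FEATURES, "performed_well")

print("rule learned from past hires: ", status_quo_rule)
print("rule learned from performance:", performance_rule)
print("status-quo rule scored on performance:",
      accuracy(CANDIDATES, *status_quo_rule, "performed_well"))
```

In this made-up data, the rule learned from past hires reproduces those hires perfectly yet is no better than a coin flip at identifying who performed well, which is the status-quo trap in miniature.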

But if CV data is all we’ve got, what’s a recruiter to do? Well, we can start to generate new data and new signals. Our technology automates the collection of dozens of new, user-generated data points: raw data about experience and salary expectation, performance data drawn from cognitive and competency tests, and metadata about how a candidate goes through the process, which can be mined for motivation, speed, and curiosity. As we use this treasure trove of new data to supplement traditional CV data, we become even more excited about the promise of machine learning approaches to make sense of it, yielding better matches for companies and candidates alike.

So, it’s an exciting time for AI (particularly as my brilliant brother joins one of the coolest AI companies out there; congrats, Tom!), but I don’t think it will be the standalone silver bullet in recruiting for some time to come. Humans are just too darn complicated.