![AI Robot hand selecting image of a person out of a line of polaroid photos, implying that AI can assist in the selection of job candidates](/sites/default/files/styles/scaled_140_/public/2025-01/iStock-924555488.jpg.webp?itok=PJNvcGzV)
Using AI for a Better, Cheaper Recruitment Process
With the right model, AI can use video recordings of interviews to identify the job candidates best suited to sales jobs.
Hiring new employees is expensive. There are many applicants and, often, several rounds of interviews; there is internal deliberation. This is particularly true for jobs that rely on softer interpersonal skills. Recruiting college graduates for entry-level sales jobs in 2018, for example, cost more than $6,000 per person.
It’s no wonder, then, that “firms are keen on tech-driven solutions to cut entry-level hiring costs,” says Yale SOM’s K. Sudhir. In an article for Marketing Science, Sudhir showcases research on a novel AI model that analyzes video interviews and assesses candidates’ “latent sales ability.” With three colleagues — Ishita Chakraborty from the University of Wisconsin-Madison and Khai Chiong and Howard Dover from the University of Texas at Dallas — he finds that the AI effectively discriminates among good job prospects while driving down screening costs.
Of particular value is the AI’s ability to correctly decipher and interpret signals from body language and conversational dynamics — an advancement unique to this model. While other AI tools have been used to assess textual and linguistic cues, this is the first to take full advantage of the many other dimensions of information available in a video recording.
Equally important, because the European Union and certain U.S. jurisdictions label the use of AI in hiring as “high-risk,” meaning the ethical ramifications are significant, the researchers took pains to make the model’s decision-making transparent. This allowed them to peer into the mechanics and, by observing the model’s approach to candidate assessment, derive several managerial insights about which attributes revealed in interviews make for good salespeople.
Sudhir and his colleagues trained the model on 195 role-play interviews of 15 minutes each, in which an applicant pretended to sell a software product to an interviewer. The candidates were university students; the interviewers were sales professionals from a range of companies and sectors.
The model used a video recording to assess three distinct verbal features tied to persuasiveness and sales ability. First, it looked at the content of the interview, meaning what candidates said. Second, it looked at the way in which things were said, defined along eight dimensions; these included metrics like politeness, complexity of language, confidence, and optimism. Finally, introducing a new avenue of analysis, the model examined the interactivity of the conversation — how much give and take there was — along with the degree to which candidates matched the linguistic style of the interviewer, a shift (often unconscious) that “can lead to better rapport building and more effective negotiations,” Sudhir writes.
Alongside this verbal analysis, the AI was trained to identify 24 points along the body (eyes, hands, legs, etc.) and then quantify motion related to things like hand gestures, posture, and head movement. In the right proportion, the model took these motions to suggest confidence, comfort, and warmth, while something like too much rapid hand movement was interpreted as nervousness.
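The article does not say how motion was quantified, so the following is only a minimal sketch of one plausible measure: the average distance each tracked body point moves between consecutive video frames. The function name `mean_motion`, the `Frame` type, and the toy coordinates are hypothetical and are not taken from the researchers' model.

```python
import math

# A frame is a list of (x, y) positions, one per tracked body point.
# Hypothetical representation; the real model tracks 24 points per frame.
Frame = list[tuple[float, float]]

def mean_motion(frames: list[Frame]) -> float:
    """Average per-point displacement between consecutive frames."""
    total = 0.0
    count = 0
    for prev, curr in zip(frames, frames[1:]):
        for (x0, y0), (x1, y1) in zip(prev, curr):
            total += math.hypot(x1 - x0, y1 - y0)
            count += 1
    # With fewer than two frames there is no movement to measure.
    return total / count if count else 0.0

# Toy clip: three frames, two tracked points (made-up coordinates).
clip = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(3.0, 4.0), (1.0, 1.0)],
    [(3.0, 4.0), (1.0, 2.0)],
]
motion = mean_motion(clip)
print(motion)
```

A score like this could then be compared against a typical range, in the spirit of the article's point that too much rapid movement reads as nervousness.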
The key finding was that AI alone could effectively identify candidates with good sales ability. That said, when one human was integrated into the process, the model performed significantly better while remaining cost-effective. As more people were added to the loop, performance increased, but not enough to justify the additional cost.
The researchers measured the model’s performance in several ways, but most simply they found that the AI model on its own improved the quality of the workforce by about 40% relative to one selected at random. Adding one human raised this figure to 67%.
To put this in context, “suppose a firm wants to fill 25 salesforce positions from 100 interviewees,” Sudhir writes. With AI, nearly 12 out of the 25 selected candidates would be good hires, while random hiring would recruit about 6 good hires per 25. The hybrid model results in nearly 15 out of 25 being good hires.
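The example above can be restated as simple arithmetic. The sketch below uses only the rounded counts quoted in the example (about 6, nearly 12, and nearly 15 good hires out of 25), so the results are approximate, and the ratios it prints are not meant to reproduce the 40% and 67% figures, which come from the researchers' own performance measure. The dictionary keys and function names are illustrative.

```python
# Rounded counts from the example: good hires among 25 selected candidates.
SELECTED = 25
good_hires = {"random": 6, "ai_only": 12, "ai_plus_one_human": 15}

def share_good(good: int, selected: int = SELECTED) -> float:
    """Fraction of the selected candidates who are good hires."""
    return good / selected

def ratio_to_random(good: int, baseline: int = good_hires["random"]) -> float:
    """How many times more good hires than random selection yields."""
    return good / baseline

summary = {
    name: (share_good(n), ratio_to_random(n)) for name, n in good_hires.items()
}
for name, (share, ratio) in summary.items():
    print(f"{name}: {share:.0%} good hires, {ratio:.1f}x random")
```

On these rounded numbers, roughly a quarter of randomly chosen hires are good, compared with about half under AI screening and about three in five under the hybrid approach.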
Sudhir and his colleagues also looked at the most valuable way to deploy humans in this process. They found that the most cost-effective approach — and one that doesn’t result in significant loss of quality — is to have AI do the initial screening and then humans join for the second round of final selection.
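One way to picture this division of labor is a two-stage filter: the AI ranks every candidate and passes along a shortlist, and a human reviewer chooses the final hires from that shortlist only. The sketch below is an assumption about how such a pipeline could be wired, not a description of the researchers' procedure; `two_stage_select`, the scores, and the ratings are all hypothetical.

```python
from typing import Callable

def two_stage_select(
    ai_scores: dict[str, float],
    human_score: Callable[[str], float],
    shortlist_size: int,
    hires: int,
) -> list[str]:
    """AI screens everyone; a human ranks only the shortlisted candidates."""
    # Stage 1: keep the candidates with the highest AI scores.
    shortlist = sorted(ai_scores, key=ai_scores.get, reverse=True)[:shortlist_size]
    # Stage 2: the human reviewer picks the final hires from the shortlist.
    return sorted(shortlist, key=human_score, reverse=True)[:hires]

# Made-up AI scores and human ratings for five candidates.
ai_scores = {"A": 0.9, "B": 0.8, "C": 0.4, "D": 0.7, "E": 0.2}
human_ratings = {"A": 3, "B": 5, "C": 4, "D": 4, "E": 5}

chosen = two_stage_select(
    ai_scores, human_ratings.__getitem__, shortlist_size=3, hires=2
)
print(chosen)
```

In this toy run the human reviews three candidates instead of five, which is where the cost saving would come from; candidates C and E are never seen by a person.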
The model’s unusual degree of transparency provided insights into what signals matter when trying to predict sales ability. The most important variable that the AI model tied to candidate success was “information density,” or the number of words spoken by a seller during the 15-minute interview. “A below-average information density leads to a sharp decline in the predicted latent sales ability, suggesting that a salesperson with little to say tends to have poor performance,” Sudhir writes. “However, an increase in information density beyond the median does not have much incremental benefit.”
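Taking the definition above at face value, information density can be sketched as a word count over the seller's turns in a transcript. The transcript format (a list of speaker and text pairs), the speaker labels, and the function name `information_density` are assumptions made for illustration, and splitting on whitespace is a rough stand-in for whatever tokenization the researchers used.

```python
def information_density(turns: list[tuple[str, str]], speaker: str = "seller") -> int:
    """Count the words spoken by one speaker across all of their turns."""
    return sum(len(text.split()) for who, text in turns if who == speaker)

# Made-up transcript excerpt.
turns = [
    ("seller", "Thanks for meeting with me today."),
    ("buyer", "Happy to. What does the product do?"),
    ("seller", "It automates invoicing for small teams."),
]

density = information_density(turns)
buyer_words = information_density(turns, speaker="buyer")
print(density, buyer_words)
```

The same count taken for the buyer gives a simple view of how the talking time is split between the two parties.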
The researchers also found a fine balance in how much each party speaks: sellers must let potential buyers talk, but not so much that the buyers dominate the conversation. Politeness and “collaborativeness” also matter more than the other six linguistic variables when assessing sales performance.
For Sudhir, this project represents the first of many steps. “The video interview is particularly effective for recognizing subtle persuasion skills essential for sales roles, which résumés fail to capture for candidates who are similar in age, education, and experience,” he writes. But the potential for this application extends beyond job interviews and sales skills alone.
“Overall, we believe our work can serve as a starting point for a novel and richer stream of work using video data in the persuasion and selling literature,” the researchers write.