
How to find the perfect candidate buried in 50 000 CVs?
2 min read • Written by human: Oliver Cingl, Founder & CEO of GrowByte
Atollon, a long-established ERP software company in Prague, asked us to build a tool that lets their client find the perfect job candidate in a database of 50 000+ CV documents. Without such a tool, there was essentially no smart way to find anything. With thousands of CVs, each in a different format and structure, even a simple keyword search can fall short. We had to build a robust solution that could handle every kind of CV and provide a useful, structured overview of each candidate.
The first challenge was to extract all of the data from the documents into a unified, structured format. We needed a good OCR tool that could parse even the weirdest PDFs. After some research and testing, we chose unstructured.io, which performed best in our tests. It is also open source, which let us host it on our own servers to prevent any data leaks. After extracting the data, we also needed to vectorize it so that it could be queried using semantic search.
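To make the vectorizing step more concrete, here is a minimal sketch of how extracted CV text can be turned into vectors and queried by similarity. It is an illustration, not our production code: the `embed` function is a toy bag-of-words stand-in so the example runs without any model (a real system would call an embedding model there), and the CV texts, IDs and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: a real pipeline would return a dense vector from an
    embedding model here. This version only matches exact words.
    """
    return Counter(re.findall(r"[a-z0-9+#]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical texts, as they might look after extraction from PDFs.
cvs = {
    "cv_001": "Senior Python developer, 12 years in fintech, payments APIs",
    "cv_002": "Graphic designer with branding and illustration experience",
}

# Vectorize every CV once, up front, and keep the vectors in an index.
index = {cv_id: embed(text) for cv_id, text in cvs.items()}


def search(query: str, top_k: int = 5) -> list[tuple[str, float]]:
    """Return the top_k (cv_id, similarity) pairs, best match first."""
    q = embed(query)
    scored = [(cv_id, cosine(q, vec)) for cv_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

Calling `search("python developer fintech")` ranks `cv_001` first here. The point of the design is that each CV is vectorized once at import time, so a query only costs one embedding plus a similarity comparison.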
Next, the search engine. To correctly rank the candidates, we needed a combination of semantic search, BM25 keyword scoring and structured filters. The semantic search found the best matches for the search query, while the structured filters narrowed the results down to the most relevant candidates. I didn't tell you everything about the data extraction part, though. Besides extracting and vectorizing the data, we also created a unified structured format for skills, education and past job experience. AI takes the extracted text from each CV and maps it to this format, adding a quick AI summary of the candidate. Only thanks to this could we achieve peak performance.
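As a rough illustration of how a structured candidate format and keyword ranking can work together, here is a small sketch. Everything in it is an assumption made for the example: the `Candidate` fields (including `years_experience`), the sample people, and the choice to filter first and then rank the remaining summaries with a plain BM25 implementation. A semantic similarity score could be blended into the ranking as well, which this sketch leaves out.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical schema: the article only names skills, education,
    # past job experience and an AI summary as parts of the format.
    name: str
    skills: list[str]
    education: list[str]
    years_experience: int
    summary: str


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9+#]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = (sum(len(d) for d in tokenized) / n) or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(t for d in tokenized for t in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(doc) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


def search(candidates, query, min_years=0, required_skills=()):
    """Apply structured filters first, then rank the rest by BM25."""
    required = {s.lower() for s in required_skills}
    pool = [
        c for c in candidates
        if c.years_experience >= min_years
        and required <= {s.lower() for s in c.skills}
    ]
    scores = bm25_scores(query, [c.summary for c in pool])
    ranked = sorted(zip(pool, scores), key=lambda pair: pair[1], reverse=True)
    return [c for c, _ in ranked]


# Hypothetical sample data.
candidates = [
    Candidate("Alice", ["Python", "SQL"], ["MSc Computer Science"], 12,
              "Senior Python developer building fintech payment systems"),
    Candidate("Bob", ["Python"], ["BSc Physics"], 3,
              "Junior Python developer in e-commerce"),
    Candidate("Carol", ["Java", "Python"], ["BSc Computer Science"], 11,
              "Backend engineer focused on logistics software"),
]

result = search(candidates, "python developer fintech",
                min_years=10, required_skills=["Python"])
print([c.name for c in result])  # -> ['Alice', 'Carol']
```

Bob is removed by the experience filter before any ranking happens, and Alice outranks Carol because only her summary contains the query terms. Filtering on structured fields first keeps hard requirements from being diluted by a fuzzy text score.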
Thanks to this tool, Atollon's client can now search easily in natural language. For example, they can search for "senior Python developer with 10+ years of experience in the fintech industry and a CS degree from MIT" and the system returns the best-matching candidates, ranked from best to worst. They can now focus on the actually important stuff instead of drowning in thousands of CVs.
Want to level up your processes just like Atollon's client? Call me or send me a message:
If you're unsure what to send, just say "hey" and I'll get back to you.