CS-led startup secures $17.7M to transform AI training data | Cheriton School of Computer Science

As AI companies mature, the industry is now on the hunt for high-quality training data.

In an earlier interview for the article “Four Waterloo-founded startups earn $2 million seed funding,” Serena Ge, a former Waterloo computer science student, explained: “For large language models to work efficiently they must be trained on a lot of data so they can understand how the world works.”

During her co-op term at Cohere, Serena was responsible for improving the reasoning capabilities of the company’s large language models. While there, she noticed that many companies lack good-quality code data.

This gap inspired her to co-found Datacurve, alongside former computer science student Charley Lee. Their startup provides companies with high-quality training sets to fine-tune their AI models.

What sets Datacurve apart is its “bounty hunter” system, which incentivizes software engineers to provide high-quality datasets by completing various challenges in exchange for rewards. So far, Datacurve has distributed $1 million in bounties.

“We treat this as a consumer product, not a data labelling operation,” says Serena. “We spend a lot of time thinking about: How can we optimize it so that the people we want are interested and get onto our platform?”

Recently, Datacurve closed a $15 million Series A funding round led by Chemistry. This comes after a $2.7 million seed round, bringing the company’s total funding to $17.7 million.

This funding announcement adds to a long list of accolades for Datacurve. Last year, they were accepted into Y Combinator, the world’s leading startup accelerator, and named to Forbes’s illustrious 30 Under 30 list.

To learn more about their funding deal, please read the full article on TechCrunch.