As AI companies mature, the industry is now on the hunt for high-quality training data.
“For large language models to work efficiently, they must be trained on a lot of data so they can understand how the world works,” says Serena Ge, a former Waterloo computer science student.
During her co-op term at Cohere, Serena was responsible for improving the reasoning capabilities of the company’s large language models (LLMs). In that role, she noticed that many companies lack good-quality code data.
This gap inspired her to co-found Datacurve, alongside former computer science student Charley Lee. Their startup provides companies with high-quality training sets to fine-tune their AI models.

What sets Datacurve apart is its “bounty hunter” system, which rewards software engineers for completing challenges that produce high-quality datasets. So far, Datacurve has distributed $1 million in bounties.
“We treat this as a consumer product, not a data labeling operation,” says Serena. “We spend a lot of time thinking about: How can we optimize it so that the people we want are interested and get onto our platform?”
Recently, Datacurve closed a $15 million Series A funding round led by Chemistry. This comes after they raised $2.7 million in seed funding, bringing their total to $17.7 million.
This funding announcement adds to a long list of accolades for Datacurve. Last year, they were accepted into Y Combinator, the world’s leading startup accelerator, and named to Forbes’s illustrious 30 Under 30 list.
To learn more about their funding deal, please read the full press release on TechCrunch.