Epoch AI, a nonprofit organization dedicated to advancing artificial intelligence, finds itself at the center of controversy after revelations about its funding sources. The organization developed FrontierMath, a set of mathematical benchmarks intended to objectively evaluate AI performance. However, it was recently disclosed that Epoch AI received funding from OpenAI, a fact that was kept from contributors until OpenAI announced its upcoming AI model, o3, on December 20. Concerns have emerged on social media about how this secrecy might affect FrontierMath's reputation as an impartial benchmark.
Although OpenAI had verbally agreed not to use FrontierMath's problem set for AI training, it had access to many of the benchmark's problems and solutions. This arrangement raised eyebrows among industry observers and contributors alike, especially given that OpenAI used FrontierMath to demonstrate o3’s capabilities. Tamay Besiroglu, associate director of Epoch AI, admitted that the organization erred in not being more forthright about OpenAI's involvement.
“We were restricted from disclosing the partnership until around the time o3 launched, and in hindsight we should have negotiated harder for the ability to be transparent to the benchmark contributors as soon as possible,” said Besiroglu.
Elliot Glazer, the lead mathematician at Epoch AI, emphasized the organization's commitment to maintaining an objective evaluation process. Although Glazer acknowledged OpenAI's support of Epoch AI's decision to keep a separate, unseen holdout set for independent verification, he noted that Epoch AI has not yet independently verified OpenAI's FrontierMath o3 results.
“However, we can’t vouch for them until our independent evaluation is complete,” Glazer commented. He added his personal belief that OpenAI's score was legitimate and that the company had no incentive to misrepresent its internal benchmarking performance.
A contractor for Epoch AI, identified by the username "Meemi," revealed that communication regarding OpenAI's involvement lacked transparency, leaving many contributors uninformed until the public disclosure. Meemi’s statement sparked further debate about the ethical considerations of such non-disclosures in collaborative scientific endeavors.
“Our mathematicians deserved to know who might have access to their work. Even though we were contractually limited in what we could say, we should have made transparency with our contributors a non-negotiable part of our agreement with OpenAI,” Besiroglu reflected.
Epoch AI relies primarily on funding from Open Philanthropy, a foundation known for its research and grantmaking activities. Despite this primary source of financial support, OpenAI's involvement raised questions about potential biases in benchmark results and the nonprofit's future credibility.
In response to criticisms, Epoch AI has reiterated its commitment to maintaining rigorous standards for independent verification. The separate holdout set is designed to uphold the integrity of FrontierMath’s results, ensuring they are free from external influence.
“OpenAI has … been fully supportive of our decision to maintain a separate, unseen holdout set,” affirmed Besiroglu, highlighting ongoing efforts to preserve the benchmark's objectivity.