OpenAI introduces GeneBench Pro for genomic AI evaluation

OpenAI has introduced GeneBench Pro, an advanced benchmark for evaluating AI systems on complex genomics workflows, measuring long-horizon scientific reasoning, data analysis, and research decision-making.

The benchmark is designed to evaluate how AI systems perform on realistic genomics and quantitative biology research tasks.

Unlike traditional biology benchmarks that focus on isolated questions, GeneBench Pro measures performance across multi-stage scientific workflows, including data cleaning, exploratory analysis, statistical modeling, quality control, and interpretation of results.

The benchmark contains expert-designed evaluations with verifiable answers, modeled on real research challenges that computational biologists encounter. OpenAI says GeneBench Pro provides a more rigorous assessment of AI capabilities in scientific research and helps track progress toward reliable AI systems that can assist scientists with complex, end-to-end genomics analysis.

OpenAI