ScienceAgentBench Evaluation

Evaluate BIOLABS Skills on bioinformatics benchmark tasks.

Evaluation Settings

Run generated code

Evaluation Progress

0 / 0 tasks (0%). Waiting to start...

Evaluation History

No evaluations yet. Run an evaluation to see results here.

With BIOLABS Skills: --% (0 evaluation runs)
Without Knowledge: --% (0 evaluation runs)

Success Rate Comparison

Skill Usage Distribution

Performance Over Time

Error Analysis

No errors recorded yet

Generated Code