Team

Why this project

Agentic AI for science is moving fast, and the tooling is now mature enough to be worth evaluating on real research tasks. A hackathon is a natural venue to find out what is actually useful, identify where current agents fall short, and prototype a small extension.

What a team could build in one day

  • Pick one of the available frameworks (see Resources).
  • Run a real research task end-to-end, for example literature triage, a small bioinformatics pipeline, or execution of a lab protocol.
  • Document what works, what breaks, and where a small extension would yield outsized value.

Minimum viable demo: one task run end-to-end, with a written analysis of what succeeded and what failed.
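
One way to keep the written analysis grounded is to record the outcome of each step while the task runs. The sketch below is a minimal, framework-agnostic illustration and is not taken from any of the frameworks under Resources. The names RunLog, StepResult, search_papers, and rank_papers are hypothetical, and the two step functions are stand-ins for real agent calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    name: str
    ok: bool
    note: str  # step output on success, error message on failure


@dataclass
class RunLog:
    task: str
    steps: List[StepResult] = field(default_factory=list)

    def record(self, name: str, fn: Callable[[], str]) -> None:
        """Run one step and store its output, or the error if it raises."""
        try:
            self.steps.append(StepResult(name, True, fn()))
        except Exception as exc:  # log the failure and continue with the next step
            self.steps.append(StepResult(name, False, f"{type(exc).__name__}: {exc}"))

    def report(self) -> str:
        """Return a plain-text summary suitable for pasting into the write-up."""
        passed = sum(s.ok for s in self.steps)
        lines = [f"Task: {self.task}", f"Steps passed: {passed}/{len(self.steps)}"]
        for s in self.steps:
            status = "OK" if s.ok else "FAIL"
            lines.append(f"  [{status}] {s.name}: {s.note}")
        return "\n".join(lines)


# Hypothetical stand-ins for agent calls; replace with the chosen framework.
def search_papers() -> str:
    return "retrieved 12 abstracts"


def rank_papers() -> str:
    raise ValueError("agent returned malformed JSON")


log = RunLog("literature triage")
log.record("search", search_papers)
log.record("rank", rank_papers)
print(log.report())
```

Because failures are captured rather than raised, a run that breaks midway still produces a complete report, which is the raw material for the analysis.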

Stretch directions

  • Build a custom skill or tool that fills a gap discovered during the evaluation.
  • Compose multiple agents in sequence for a more ambitious workflow.
  • Benchmark across two frameworks on the same task.

Resources