Evaluating and extending agentic AI for science

Team

Why this project

Agentic AI for science is moving fast, and the tooling has reached a point where it is genuinely worth evaluating on real research tasks. A hackathon is a natural venue to figure out what’s actually useful, find where current agents fall short, and prototype a small extension.

What a team could build in one day

Pick one of the available frameworks (see Resources).
Run a real research task end-to-end — for example, a literature triage, a small bioinformatics pipeline, or a protocol-execution agent.
Document what works, what breaks, and where a small extension would yield outsized value.

Minimum viable demo: one task run end-to-end, with a written analysis of what succeeded and what failed.
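One lightweight way to get the written analysis is to log each step's outcome while the task runs, then print a summary. The sketch below is only an illustration and is not tied to any of the frameworks under Resources: `RunLog`, `StepResult`, and the two example steps are hypothetical names, and the lambdas stand in for real agent calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    name: str
    ok: bool
    note: str = ""


@dataclass
class RunLog:
    """Hypothetical helper: collects per-step outcomes for one task run."""
    task: str
    steps: List[StepResult] = field(default_factory=list)

    def record(self, name: str, fn: Callable[[], str]) -> None:
        """Run one step; store its returned note, or the exception if it raised."""
        try:
            self.steps.append(StepResult(name, True, fn()))
        except Exception as exc:  # keep going so later steps are still logged
            self.steps.append(StepResult(name, False, f"{type(exc).__name__}: {exc}"))

    def report(self) -> str:
        """Return a plain-text summary suitable for pasting into the write-up."""
        passed = sum(s.ok for s in self.steps)
        lines = [f"Task: {self.task}", f"Steps passed: {passed}/{len(self.steps)}"]
        for s in self.steps:
            status = "OK" if s.ok else "FAIL"
            lines.append(f"[{status}] {s.name}: {s.note}")
        return "\n".join(lines)


def failing_fetch() -> str:
    # Stand-in for an agent step that breaks (assumed failure, for illustration).
    raise ValueError("no PDF access")


log = RunLog("literature triage")
log.record("search", lambda: "12 papers found")  # stand-in for a real agent call
log.record("fetch full text", failing_fetch)
print(log.report())
```

Catching the exception inside `record` is a deliberate choice: the run continues past a broken step, so the report shows every failure rather than only the first one.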

Stretch directions

Build a custom skill or tool that fills a gap discovered during the evaluation.
Compose multiple agents in sequence for a more ambitious workflow.
Benchmark across two frameworks on the same task.
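For the benchmarking direction, a small harness can run the same task through each framework and record pass/fail and wall-clock time. This is a minimal sketch under stated assumptions: each framework is wrapped in a plain function that takes a task string and returns a string, and `benchmark`, `check`, and the two stub agents are hypothetical names, not part of any framework's API.

```python
import time
from typing import Callable, Dict


def benchmark(
    task: str,
    agents: Dict[str, Callable[[str], str]],
    check: Callable[[str], bool],
) -> Dict[str, dict]:
    """Run `task` through every agent wrapper and collect comparable results."""
    results = {}
    for name, run in agents.items():
        start = time.perf_counter()
        try:
            passed = check(run(task))
            error = ""
        except Exception as exc:  # a crash counts as a failed run, not a harness error
            passed = False
            error = f"{type(exc).__name__}: {exc}"
        results[name] = {
            "passed": passed,
            "seconds": time.perf_counter() - start,
            "error": error,
        }
    return results


def stub_agent_a(task: str) -> str:
    # Placeholder: replace with a call into the first framework.
    return f"summary of {task}"


def stub_agent_b(task: str) -> str:
    # Placeholder: replace with a call into the second framework.
    raise RuntimeError("tool not available")


results = benchmark(
    "literature triage",
    {"framework_a": stub_agent_a, "framework_b": stub_agent_b},
    check=lambda output: "summary" in output,  # assumed success criterion
)
for name, row in results.items():
    print(name, row["passed"], row["error"])
```

The `check` function is where the real judgment lives; for most research tasks it will need a task-specific criterion or a manual review rather than a string match.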

Resources

Biomni — biomedical agent framework.
Claude Scientific Skills.
Claude for Life Sciences.
Kaggle 5-Day Agents course.