This repo tracks problems that may be worth attacking with frontier reasoning models, human experts, and tight verification loops.
It is not a list of important topics. It is a list of problems where a model could plausibly search a large idea space and produce an artifact humans can check.
A problem belongs here only if it has:
- A precise target.
- A verifier that is cheaper than discovery.
- Public data, code, or literature to start from.
- A route to a first 7-day experiment.
- A reason general reasoning could matter, not just scale or data cleaning.
| Rank | Problem | Domain | Why it is attackable |
|---|---|---|---|
| 1 | Hadwiger-Nelson: search for a 6-chromatic unit-distance graph | math | finite graph certificates, mechanical validation, direct analogy to the OpenAI unit-distance result. |
| 2 | Cancer dependency synthesis from DepMap | medicine / biology | public CRISPR/RNAi screens, held-out dependency prediction, wet-lab follow-up possible. |
| 3 | C. elegans connectome-to-function gap | neurobiology | full connectome plus perturbation atlas; clear prediction gap between anatomy and function. |
| 4 | ARC / ConceptARC cognitive abstraction models | cognitive science | human baselines, task generators, explicit verifier, useful for understanding human-like abstraction. |
| 5 | Small Ramsey number certificate search | math | exact finite combinatorics, SAT/proof certificates, historically hard but checkable. |
| 6 | Collateral sensitivity treatment design | medicine / evolution | sequential therapy is an optimization problem with experimental validation. |
| 7 | FlyWire circuit hypothesis generation | neurobiology | whole fly brain connectome, programmatic access, testable circuit hypotheses. |
| 8 | Self-healing reach adapters for agent runtimes | agent infrastructure | adapter drift has cheap verify loops; code-level repairs could keep long-tail web/app CLIs usable. |
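
To make the "checkable" claim in row 5 concrete, here is a minimal sketch of a Ramsey lower-bound verifier. It assumes a certificate is a 2-coloring of the edges of the complete graph on `n` vertices, given as a function; the names `has_mono_clique`, `certifies_lower_bound`, and `pentagon` are illustrative and not part of this repo. The check is brute force, so it only suits small cases, but it shows why verification is far cheaper than finding the coloring.

```python
from itertools import combinations


def has_mono_clique(n, color, c, size):
    """Return True if some `size`-subset of vertices 0..n-1 has every edge colored `c`."""
    for subset in combinations(range(n), size):
        if all(color(u, v) == c for u, v in combinations(subset, 2)):
            return True
    return False


def certifies_lower_bound(n, color, s, t):
    """True if the coloring has no color-0 K_s and no color-1 K_t, which shows R(s, t) > n."""
    return not has_mono_clique(n, color, 0, s) and not has_mono_clique(n, color, 1, t)


def pentagon(u, v):
    """Classic certificate for R(3, 3) > 5: cycle edges of C5 get color 0, chords get color 1."""
    return 0 if (u - v) % 5 in (1, 4) else 1


print(certifies_lower_bound(5, pentagon, 3, 3))  # True
```

A real attack on open Ramsey numbers would need SAT or proof certificates for the upper-bound side; this sketch covers only the lower-bound side.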
Start with problem 001 (Hadwiger-Nelson, rank 1 in the table above).
Output target:
- A cleaned corpus of known 5-chromatic unit-distance graphs.
- A verifier that checks the unit-distance embedding and the k-colorability certificate of each candidate graph.
- A model/human search loop that proposes graph transformations.
- A "kill criterion": if no candidate shows pressure toward 6-chromaticity (that is, toward non-5-colorability) after one week, pause the math search and switch to DepMap.
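
The verifier in the list above could start from something like the following sketch. It is an assumption-laden illustration, not the repo's implementation: the function names, the dictionary-based graph format, and the floating-point tolerance `tol` are all hypothetical. A rigorous verifier would likely use exact algebraic coordinates instead of floats.

```python
import math


def check_unit_distance(coords, edges, tol=1e-9):
    """Return True if every edge joins two points at Euclidean distance 1 (within tol)."""
    return all(abs(math.dist(coords[u], coords[v]) - 1.0) <= tol for u, v in edges)


def check_proper_coloring(edges, coloring, k):
    """Return True if `coloring` uses only colors 0..k-1 and no edge is monochromatic."""
    if any(not (0 <= c < k) for c in coloring.values()):
        return False
    return all(
        u in coloring and v in coloring and coloring[u] != coloring[v]
        for u, v in edges
    )


# Toy instance: two unit equilateral triangles sharing the edge B-C.
h = math.sqrt(3) / 2
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, h), "D": (1.5, h)}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]
coloring = {"A": 0, "B": 1, "C": 2, "D": 0}

print(check_unit_distance(coords, edges))         # True
print(check_proper_coloring(edges, coloring, 3))  # True
```

Note the asymmetry: a proper k-coloring only certifies that a graph is k-colorable. Showing that a candidate is not 5-colorable needs an exhaustive or SAT-based proof certificate, which this sketch does not attempt.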
- `rubric.md`: scoring rubric.
- `safety.md`: safety boundary, especially for medicine / biology.
- `docs/workflow.md`: intake, promotion, and first-week attack protocol.
- `problems/`: one file per candidate.
- `templates/problem.md`: template for future entries.
This repo was created after OpenAI announced a general-purpose reasoning model had disproved the planar unit distance conjecture on May 20, 2026: