Skip to content

[submission] contractibleSpace_houseWithTwoRooms #243

@daouid

Description

@daouid

Submission URL

https://gist.github.com/daouid/73182095262a4e329bcec8579bdd573a

Model

Antigravity (Multi-Model Ensemble: Gemini 3.1 Pro, Gemini 3 Flash, Claude 4.6 Sonnet/Opus)

How this solution was produced (optional)

POC of how a moderately advanced harness / scaffolding can deliver: antigravity, SKILLS.md / AGENTS.md , MCP server.

The human in the loop is not a mathematician, nor a software engineer, just someone curious and armed with patience, and acting as a babysitter:

with simple encouragements like
"remember, if we dont have the needed bricks, we build them and lay them
search online for guidance if needed
step by step, brick by brick, we are progressing we have time, you are doing great, try to address 1 thing at a time
think using sequential thinking tool as needed, take your time and proceed with care
no shortcuts, no cheating, we have time
the sky is the limit, this is not an open problem, you got this ! "

AI did all the thinking.

No golfing done.

This 7,000+ line proof was produced interactively via an AI-driven pair-programming workflow using Antigravity, Google's agentic IDE (a VS Code fork). Due to the massive scale of the deformation retraction, the solution was driven by cycling through the entire suite of available frontier models as quota constraints required.

The formal verification relied on a specialized scaffolding pipeline:

Iterative Prompt Engineering: The task was initialized using structured prompts (contractibleSpace_houseWithTwoRooms.md that enforce Mathlib style guides (e.g., "Rely on existing Mathlib structural lemmas" and "Abstract into lemmas parameterized by characteristics").

Failure Feedback Loop: When intermediate attempts failed (e.g., due to type class synthesis errors or unsolved goals), the raw lake build trace logs and compiler outputs were automatically captured and injected back into the prompt context. This allowed the models to iteratively diagnose and correct their own errors.

Model Context Protocol (MCP): The agents interacted with the Lean 4 compiler in real-time via the lean-lsp MCP server, gaining high-fidelity "Language Server to Agent" access:
Proof State Tracking (lean_goal): Real-time extraction of tactic states to track the 24 nested sub-cubes of the retraction.
Diagnostics (lean_diagnostic_messages): Immediate compiler feedback on type mismatches and syntax errors to keep the massive 7,000-line construction mathematically sound.

Structural Synthesis: The proof was built constructively by the ensemble over multiple sessions. The agents defined the 24 topological spaces (C_1 through C_24), constructed explicit piecewise continuous projections (proj_1 through proj_24), and systematically eliminated sorry placeholders until the full deformation retraction onto the point was rigorously verified by the Lean compiler.

Acknowledgements

  • I understand that the lean-eval CI will fetch my submission URL and run comparator on every lakefile.toml whose name matches a benchmark problem id.
  • I understand that only the set of solved problem IDs, along with the metadata I entered above, will be published to the public leaderboard results store.

Metadata

Metadata

Assignees

No one assigned

    Labels

    submissionBenchmark submission issue that triggers the submission workflow

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions