doc: define trust in "Validating a Lean Proof"#847
Conversation
Trust was previously discussed in an informal way. Trust is now defined along three axes: proof authors, verification software, and correctness of statement. The trust definition is also referred to in "Elaboration and Compilation".
|
Some more context can be found in this Zulip discussion. |
|
Preview for this PR is ready! 🎉 (also as a proofreading version). built with commit 5d349db. |
| In particular, we use {deftech}_honest_ when the goal is to create a valid proof. | ||
| This allows for mistakes and bugs in proofs and meta-code (tactics, attributes, commands, etc.), but not for code that clearly only serves to circumvent the system (such as using the {option}`debug.skipKernelTC`). | ||
| Note that the {keyword}`unsafe` marker on API functions is unrelated to whether this API can be used in an dishonest way. | ||
| * Regarding the _proof author_, it matters a lot whether one is dealing with an {tech}[honest] proof attempt, and needs protection against only benign mistakes, or a possibly-{tech}[malicious] proof attempt that actively tries to mislead. |
There was a problem hiding this comment.
“matters a lot” is a bit informal for the reference manual.
Also, the phrasing is a bit off: the sentence beings with “proof author” but then talks about honest/malicoius proof “attempts”. And below its suddenly “code”. Not sure how to fix best, though.
|
|
||
| Depending on the circumstances, additional steps may be recommended to rule out misleading proofs. | ||
| In particular, it matters a lot whether one is dealing with an {tech}[honest] proof attempt, and needs protection against only benign mistakes, or a possibly-{tech}[malicious] proof attempt that actively tries to mislead. | ||
| The steps needed to rule out a misleading proof depend on the author's {deftech}[trust] assumptions towards the _proof author_ (including authors that proved depended-upon theorems), the _verification software_, and the _correctness of the statement_. |
There was a problem hiding this comment.
This is a good start. I wonder if we should differentiate between “proof author” and “theorem statement author”, but probably confusing at this point; the difference really only matters in the comparator workflow.
|
|
||
| * Regarding the verification software, Lean takes {ref "elaboration-compilation"}[a number of steps] to process a theorem and its proof. | ||
| Different uses correspond to trusting different steps of this pipeline. | ||
| At a minimum, the Lean kernel or an alternative kernel such as [`nanoda`](https://github.com/ammkrn/nanoda_lib) has to be trusted. |
There was a problem hiding this comment.
The middle sentence feels rather unhelpful here, not sure what it should tell the user. If this bullet point is meant to introduce the concept of the trusted code base, maybe it should say say.
Maybe it should be a third bullet and say something like
* Depending on the circumstances and trust assumptions described in the two bullets above, the validity of the result relies on different software components. This is called the “trusted computing base” (TCB), and typically comprises at least the Lean kernel. Most of Leans {ref "elaboration-compilation"}[elaboration pipeline], in particular proof tactics, are not part of the TCB.
| * Regarding the correctness of the statement, it is important to distinguish the question “does the theorem have a valid proof” from “what does the theorem statement mean”. | ||
| No matter what software is used and how trusted the environment is, a theorem is meaningful only if its author(s) and user(s) are certain that its statement mathematically expresses its intended informal meaning. | ||
|
|
||
| As written, the statement is {tech (key := "Lean elaborator")}[elaborated] before being passed to the kernel. | ||
| To avoid trusting the {tech (key := "Lean elaborator")}[elaborator], one has to {keywordOf Lean.Parser.Command.print}`#print` the statement to get the elaborated type and manually verify that it indeed expresses the desired statement. | ||
| This brings {keywordOf Lean.Parser.Command.print}`#print` into the trusted base, which is less than 1000 lines of code. |
There was a problem hiding this comment.
This is good in spirit, but rather flawed in content:
#printuses the delaborator, which is far more than 1000 lines of code, is known to be easily cheateable (custom delaborator, custom notation). One would rather have to usepp.raw.- Reading
pp.rawoutput is not user friendly, once type classes come into play, even experts will have trouble with this. - Recursive definitions are elaborated into something unintelligle. There the best bet is to manually verify equational theorems that characterize the theorems.
- And even then malicious code can easily compromise
#print. So to be bulletproof, one would have to use a pretty-printer from an external checker, making all the usability things even worse.
TL;DR: We don't have a practical story for this problem yet, and I definitely don’t want to point out half-baked incomplete solutions here. The best we have right now is “Assume theorem statement author is honest and include the elaborator in your TCB for understanding theorem statement” followed by “Review the raw dumps of the code of the theorem statement and all involved definitions generated from the external checker and hire an expert to make sense of them”.
There was a problem hiding this comment.
That's quite a big issue, given existing claims about a small trusted base. The small trusted base claim doesn't appear only in the docs "enabling the trusted kernel to be very small", but also in the Lean FPO home page. Assuming what you say is precise, then this claim is flat out incorrect.
What makes the situation even worse is that the need to trust the elaborator (or to "review raw dumps of code") is present both when we care whether "the theorem has a valid proof" or "what does the theorem statement mean", so it seems to me we can't lawyer our way into making the small trusted base partially correct.
Trust was previously discussed in an informal way.
Trust is now defined along three axes: proof authors, verification software, and correctness of statement.
The trust definition is also referred to in "Elaboration and Compilation".