feat: time monad + MergeSort analysis #165
base: main
Conversation
#eval mergeSort [4,1,3,4,5,6]
#eval MS_REC 6

theorem mergeSort_time_eq_MS_REC (xs : List ℕ) :
See https://github.com/leanprover/cslib?tab=contributing-ov-file#style-and-documentation for a link to the style documentation.
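For context, MS_REC itself is not shown in this excerpt; judging from the theorem above, it presumably encodes the merge-sort time recurrence. Under that assumption, the intended recurrence would be the standard

$$T(0) = T(1) = 0, \qquad T(n) = T(\lceil n/2 \rceil) + T(\lfloor n/2 \rfloor) + n \quad (n \ge 2),$$

which solves to $\Theta(n \log n)$. Whether MS_REC charges exactly $n$ per merge pass (rather than, say, the number of comparisons) is also an assumption here.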
let sortedRight ← mergeSort right
merge sortedLeft sortedRight

#check mergeSort
These checks and evals should be removed. If you like, you can add tests to the CslibTests directory.
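For what it's worth, such a test might look roughly like this (a sketch only: the import path is inferred from the file path mentioned later in this thread, the test file name is hypothetical, and the expected list simply assumes the usual sorted order):

-- Hypothetical test file, e.g. CslibTests/MergeSort.lean
import Cslib.Algorithms.Lean.MergeSort.MergeSort

-- `#guard` fails elaboration when the check is false, so it doubles as a CI test.
#guard (mergeSort [4, 1, 3, 4, 5, 6]).ret = [1, 3, 4, 4, 5, 6]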
I am holding off on reviewing style until the discussion on the content is settled. There was a previous Zulip thread that raised some issues I would like to see addressed with respect to this PR. Also, where does this fit in with CSLib at large? Boole is also going to cover this ground, yes? Will there be two parallel developments of time complexity?
Personally I think this model leaves too much room for tinkering in the specification: users manually annotate their function calls with tick. A good specification method should minimise the extent to which a user can decide where complexity is counted, for example by fixing a query model upfront and counting all queries. There have been meetings within the community going as far back as March 2023 where we discussed ways to impose a model in which the user does not get to pick and choose when an operation is counted inside the algorithm's definition.
That's a valuable perspective, and I appreciate the concern about specification rigor. You're right that there's an inherent trade-off between flexibility and formal control in complexity analysis.

I see both approaches as complementary rather than mutually exclusive. The manual annotation model gives users more discretion, with both drawbacks and benefits. From my experience with TCS research, this flexibility enables more proofs to be developed, especially when working with existing libraries. The lightweight tracking approach functions somewhat like a "soft sorry": the algorithmic statement requires human verification. Still, the advantage is that we can directly leverage whatever Mathlib provides and annotate the time complexity of operations.

A fully automated system with a predefined query model would be the ideal end goal. However, there's real value in having algorithms with manual runtime tracking as intermediate artifacts. These can serve as concrete examples and test cases to inform the development of more rigorous automation, similar to how early formalization work often guides better tooling.

Given the discussions since March 2023 about imposing a model where users don't control what gets counted, I'd suggest two constructive paths forward: we could continue monitoring projects like Boole as they develop automated approaches, or, if you'd like to implement the stricter specification model you're envisioning, I'd be very interested in comparing both approaches. Having concrete implementations of both paradigms would provide valuable data on their respective trade-offs. What do you think about this?
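For readers following along, here is a rough sketch of the kind of tick-annotated time monad being discussed. The identifiers tick and .ret appear in the diffs quoted in this thread, but the definitions below are illustrative guesses, not the PR's actual code:

-- Illustrative sketch only; the PR's actual TimeM may differ.
structure TimeM (α : Type) where
  ret  : α     -- the computed result
  time : Nat   -- accumulated cost in ticks

instance : Monad TimeM where
  pure a := ⟨a, 0⟩
  bind x f :=
    let y := f x.ret
    ⟨y.ret, x.time + y.time⟩

/-- Charge one unit of time and return `a`; the author inserts this inside the algorithm. -/
def tick {α : Type} (a : α) : TimeM α := ⟨a, 1⟩

-- The manual-annotation style under discussion: the author decides that each
-- comparison costs one tick.
def cmpLe (a b : Nat) : TimeM Bool := tick (Nat.ble a b)

The critique above is precisely that nothing in this design prevents an author from forgetting or misplacing a tick, which a fixed query model would rule out.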
I think the Boole model has its issues too. In TCS papers what we are typically counting are calls to black-box procedures, i.e. queries, and Boole likely won't capture that. So I agree with your point that we actually need some control over which operations we count.

My point, more concretely, is that this can still be done in a different way that limits the user to specifying which operations to count by listing them out. This is much easier to review than a spec style that allows users to insert the ticks in the algorithmic functions directly. The monadic computation should then collect and sum up these operations internally, without user annotations inside the algorithm spec, and an interpretation function should take the monadic computation and run the operations. This way, as long as one can see the basic operations being counted, all in one place, and can trust how the complexity operations compose with monadic bind, one can review the algorithm spec more easily.

In the array case, this means you write a monadic spec which simply collects operation tokens for array reads, compares, and swaps. The interpretation function then takes this spec and runs the operations corresponding to these tokens, while the complexity spec simply counts the tokens.

To summarise: I am saying that the monadic approach can (and, for reliability purposes, should) be much more robust.

EDIT: The DeepMind debate formalisation (in a non-main branch) shows the comparison-based sorting lower bound. I think that's a good example to build on.
@Shreyas4991 do I understand correctly that you're suggesting providing annotations/attributes about what should 'tick' externally to the algorithm definition? Do you have an example of this?
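Not speaking for @Shreyas4991, but one way such an externalised scheme might look (all names below are hypothetical; this is neither the PR's code nor Tanner Duve's free-monad work) is to write the algorithm against a fixed signature of chargeable operations, with cost and execution given by separate interpreters, so that no tick ever appears in the algorithm text:

-- Hypothetical sketch of counting via an operation signature; not taken from the PR.
/-- The operations an algorithm may be charged for: here, only comparisons between indices. -/
inductive SortQuery : Type
  | compare (i j : Nat)

/-- Computations that may issue `SortQuery` queries and branch on a Boolean answer. -/
inductive QueryM (α : Type) : Type
  | pure  : α → QueryM α
  | query : SortQuery → (Bool → QueryM α) → QueryM α

def QueryM.bind {α β : Type} : QueryM α → (α → QueryM β) → QueryM β
  | .pure a,    f => f a
  | .query q k, f => .query q (fun b => QueryM.bind (k b) f)

instance : Monad QueryM where
  pure := QueryM.pure
  bind := QueryM.bind

/-- Run the computation against an oracle that answers the queries. -/
def QueryM.run {α : Type} (oracle : SortQuery → Bool) : QueryM α → α
  | .pure a    => a
  | .query q k => QueryM.run oracle (k (oracle q))

/-- Count the queries issued along the execution path chosen by `oracle`.
This is where complexity is defined, once, outside any algorithm. -/
def QueryM.cost {α : Type} (oracle : SortQuery → Bool) : QueryM α → Nat
  | .pure _    => 0
  | .query q k => 1 + QueryM.cost oracle (k (oracle q))

With something along these lines, reviewing a complexity claim amounts to reviewing SortQuery and cost; the algorithm definitions themselves only use query and monadic bind.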
all_goals (simp_all only [IsSorted, sorted_cons, and_imp, Bind.bind, tick, Nat.cast_one, ret_bind,
  implies_true, and_true, forall_const]; apply min_all_merge <;> grind)

theorem MSCorrect_sorted (xs : List α) : IsSorted (mergeSort xs).ret := by
mergeSort_sorted
lemma merge_perm (l₁ l₂ : List α) : (merge l₁ l₂).ret ~ l₁ ++ l₂ := by
  fun_induction merge <;> simp only [Bind.bind, tick, ret_bind, cons_append, perm_cons] <;> grind

theorem MSCorrect_perm (xs : List α) : (mergeSort xs).ret ~ xs := by
mergeSort_perm, I guess?
I'd also add a docstring.
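Concretely, the renamed lemma with a docstring could read as follows; the wording is only a suggestion, the statement is copied from the diff above, and in practice the existing lemma would simply be renamed rather than wrapped:

/-- The list returned by `mergeSort` is a permutation of its input. -/
theorem mergeSort_perm (xs : List α) : (mergeSort xs).ret ~ xs :=
  MSCorrect_perm xs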
The Lean Action CI is now not working; the error is: … Maybe this branch needs to be updated?
This will mainly be used for analyzing theoretical algorithms that often appear in theoretical computer science papers. In TCS papers it is rare for mistracking time to be the correctness concern, since most algorithms in theory papers are easy to track. The main advantage is flexibility in expressing algorithms and running times across many models. This will also serve as a good example for the future development of Boole languages. As Boole matures, these Lean developments may be ported to it; until then, nothing is holding us back from proving the analysis of algorithms in Lean.

I am aware of the heated discussion about time complexity. There is no real harm in accepting multiple proposals; it is healthy to have many options to choose from, depending on the context. While there may be many proposals at the start, it is premature to rule out good ones before seeing them grow. If someone believes in their design, we should let them shine and do their thing. In the end, good design will stand the test of time.
I maintain that the free-monad-based query approach (Tanner Duve's WIP PR) is a far more systematic approach, better suited to TCS papers than explicitly marking operations to count inside an algorithm. The plain tick approach is far too arbitrary to allow this level of compositionality of complexity results and clean abstractions of models.

Note that I support the lightweight approach to formalisation of time complexity and have argued for it since March 2023; I think it is the only feasible way to modularly formalise frontier results in algorithms theory. I just think such an effort needs to be systematic for a library of CSLib's proposed scale, to avoid mistakes creeping into the spec.
I got the following Lean Action CI error message: /home/runner/work/cslib/cslib/Cslib/Algorithms/Lean/MergeSort/MergeSort.lean: … However, I cannot remove this, as I use this tactic in my proof. How do I proceed? @chenson2018, do you have advice?
fmontesi left a comment:
This looks pretty good now, thanks for all the changes!
@chenson2018 are you happy with the latest changes? GitHub still shows me a cross for your review, but I think it's the comment about the tactic import?
Let me do a pass for style; I had held off while the more conceptual discussion was ongoing.
Authors: Sorrachai Yingchareonthawornchai
-/

import Cslib.Algorithms.Lean.TimeM
Are we sure about this directory structure? I probably would have gone with Cslib/Algorithms/TimeM and had Basic.lean and MergeSort.lean within it.
For now we're pretty sure. We discussed this a long time ago in the SG, the motivation being that there will be a directory per language/approach, e.g., Cslib/Algorithms/Boole.
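For reference, the two module layouts under discussion (module names taken from the import and paths above; the exact file split is an assumption):

As in this PR (one directory per language/approach):
  Cslib.Algorithms.Lean.TimeM
  Cslib.Algorithms.Lean.MergeSort.MergeSort
  Cslib.Algorithms.Boole.*   (future)

Alternative suggested above:
  Cslib.Algorithms.TimeM.Basic
  Cslib.Algorithms.TimeM.MergeSort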
chenson2018 left a comment:
Please note that I have not reviewed at all from a design perspective, only maintainability and style. I left one comment about the module naming. Besides that, the commits I pushed removed the unsetting of tactic hygiene and restructured some proofs involving simp and grind. If you feel any changes are too opinionated, please let me know.
@fmontesi I would ask that, if you are planning to merge, you do so after the release that is planned for tomorrow.
Roger, thanks! (I plan on merging and will wait for the release.)
Introducing a time monad and an analysis of mergeSort, including correctness and running time.