-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hint mode #2229
Hint mode #2229
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the very detailed explanation! I don't follow some of the points you made, but I appreciate the effort you took to understand this and explain it.
The changes here aren't too bad, but it's a bit annoying to have to write cat_comp (A:=Foo)
in a number of places. Is there a way to tell Coq to try using the second map to determine A
? Or any other trick we could use?
If not, some ideas are:
- Make a notation (maybe
f $o[A] g
?) that specifies the category. - Make a notation (maybe
f oG g
) that specifically works with group homomorphisms.
My broader question is about the benefit. The description of the PR talks about making rapply
faster. Can you give an example of something that is faster with this PR than without it?
Yes, I will try to construct examples of I think the |
I like the idea of this notation, but I would prefer a separate PR for it. There are two points to address. Firstly, we need to do some formatting tweaks which I can help with in the PR, and secondly we need to find |
This PR was motivated by downstream breakages in the rapply tactic while I was trying to work on my bicategory development, so I hope it's okay for me to draw examples from that PR rather than trying to find examples in the current master. In #2215 in theories/WildCat/TwoOneCat.v, at the definition of
So, at each iteration before the final one, unification fails (it's the wrong number of holes for fmap_comp to unify with the goal) and this takes 13 seconds before giving up. Typeclass search is looking for 1-category structures on types ?X and ?Y and a 1-functor structure on ?F : ?X -> ?Y (all unknown because unification failed) and there are a lot of possible instantiations of these given a bicategory structure on ?X, so search takes a while. Adding |
For contrast, without using |
Why not do |
I don't follow your point. The example I gave was not an instance of But the general scope here is beyond "Robustness" of Coq code is pretty complex and subjective. That said, if you write a term On the other hand it's definitely a convenient feature that makes the code more concise because typeclass search often hits the right answer quickly and helps the elaborator figure things out where we would otherwise need lots of annotations. So I think there's a tradeoff between concision and robustness here. The opposite extreme here is |
Have you looked into how much would need to change if we made |
Really? I thought we often needed the wildcat instances to be found. Here's the data for WildCat/*.v:
Anyways, I think this is beside the point. It's a good idea to make rapply behave better, maybe by using I'm in favour of moving forward with this PR, but think we should first solve the composition issue.
If that's true, then it could be added to this PR and maybe we'd be done. If it doesn't work, then I think we should have a separate PR that introduces a notation for composition in a wildcat. I proposed |
If we do rely on the current behaviour a lot, we could keep that behaviour and add a third version with the new behaviour. Here's one idea on the naming:
(Or the first two could be swapped, if we need most current uses of All three would have Ideally, we wouldn't need three, so this is only a worthwhile option if many uses of |
The original tactic I wrote down has slightly different shelving behavior than
If I wrote this correctly, it should try If I applied the timing experiment correctly, the results of this change are -1% time difference on average, which seems negligible. But I am not positive I correctly applied the timing script. Here is the result: Anyway, skimming through that branch |
e29c577
to
897c2c5
Compare
Some minor stylistic changes. Added a coercion GroupIsomorphism >-> Hom which resolves four of the six explicit Group category annotations. I tried adding a coercion GroupHomomorphism >-> Hom as well and this fixed the remaining two, but this seemed to break more things downstream so I skipped it. Instead I replaced |
897c2c5
to
7e7e14f
Compare
I think you are right about how this behaves. I suspect that it could be done more directly, maybe by using If you instead try your original way, where you just use
Interesting, I would have suspected a bigger improvement. But when
Unfortunately, in addition to the 69 lines that changed in an important way, that commit contains 52 lines with whitespace changes. You have to avoid putting such unrelated changes into your commits. Luckily, I found a way to view the changes without showing the whitespace changes: The library contains around 2000 uses of (s)rapply, so it's great that only 69 of them needed to change. And 4 or 5 of those changes are improvements anyways, where |
We should move the discussion of Also, can you redo the latest commit to this PR with the whitespace changes removed? |
This won't work, because |
4897236
to
87c3adf
Compare
87c3adf
to
85063b5
Compare
Whitespace changes removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve, but I'll wait to see what @Alizter thinks before merging.
It will be a couple of days before I can have a closer look. If there is a rush to get this through however, I don't mind @jdchristensen confirming the change and merging. |
I don't think there's any rush. The diff is now quite clean, so it should be quick to review when you get a chance. |
1a4fa8f
to
9f6160c
Compare
@Alizter Do you still want more time to look at this one? |
I'm just going to comment here with an example of this robustness issue I mentioned above, because I find it pretty complicated and I need a convenient place to write down a simple example that illustrates the problem. @jdchristensen said in response to the original post that he didn't follow everything I said, so I'm trying to make a simpler example.
This definition succeeds, but if you swap the order of the instances, it fails. What's happening is this. So I consider this a robustness issue, because whether elaboration of a term succeeds or not should not be dependent upon subtle details such as typeclass ordering. Let us now discuss possible solutions to this issue, which are not necessarily mutually exclusive.
The point here is that in this situation, the argument
|
Co-authored-by: Dan Christensen <[email protected]>
@@ -771,6 +771,10 @@ Defined. | |||
Global Instance isgraph_group : IsGraph Group | |||
:= Build_IsGraph Group GroupHomomorphism. | |||
|
|||
(** We add this coercion to make it easier to use a group isomorphism where Coq expects a category morphism. *) | |||
Definition isHom_GroupIsomorphism (G H : Group) : GroupIsomorphism G H -> Hom G H := idmap. | |||
Coercion isHom_GroupIsomorphism : GroupIsomorphism >-> Hom. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Coercion isHom_GroupIsomorphism : GroupIsomorphism >-> Hom. | |
Identity Coercion isHom_GroupIsomorphism : GroupIsomorphism >-> Hom. |
Documentation on this is terrible:
https://coq.inria.fr/doc/master/refman/addendum/implicit-coercions.html#identity-coercions
Basically its a coercion that doesn't insert a term. This is useful for elaboration. So GroupIsomoprhism
will be treated as a Hom
and elaboration will work on that, but isHom_GroupIsomorphism
won't be inserted.
Also I don't like the name. Something like groupisomorphism_to_hom
is better IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This gives me the error message "GroupIsomorphism must be a transparent constant." Do you know a fix for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also had trouble trying something similar. BTW, if this trick works, could it be applied to GroupHomomorphism
instead of GroupIsomorphism
without leading to cycles? That would help us use WildCat results more often when dealing with group homomorphisms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Identity Coercion
only works when both constants are definitions and one (eventually) unfolds to the other.
@patrick-nicodemus Thanks for the great overview.
I'll add a couple of things:
I also don't think it's really that serious a problem if the behaviour sometimes depends on the order in which files are imported. We should in general import the most generic things first and the most specific things later, which will often mean that the desired instances are tried first. Even if either order works, this should speed up type class search. (And it's worth noting that many of the suggestions in our lists will not only improve how often typeclass search succeeds, but will likely speed it up as well.) |
I've thought a bit more about the idea of using We could adapt the pointed types library to use WildCat notation (e.g. by making the existing notations like We could add more coercions to WildCat Definition pMap (A B : pType) := pForall A (pfam_const B).
Notation "A ->* B" := (pMap A B) : pointed_scope. (This also suggests that changing So my general summary is: Adding BTW, adding Existing Instance isgraph_ptype | 0. to the top of pSusp.v makes Coq find the right instance immediately. So I think adding some lines like that to appropriate files is a good way to control typeclass search and make it find what we want quickly. |
I agree with this. I didn't want to dramatically transform the library in one PR, so I thought it was best to make one small change (to IsGraph) and fix the breakages. But you're right that only removing one of the instances will have no effect as long as typeclass search continues for the other ones.
This made me curious about how much work it would take. I have created a branch of this PR at patrick-nicodemus/HoTT called "hint_mode_wildcat" where I use
Here is the most recent commit if you want to see the overall size of the changes that have to be made. https://github.com/patrick-nicodemus/HoTT/tree/hint_mode_wildcat Two somewhat big changes to make things work: I redefined I should note that I didn't go through and diagnose the exact reason for each build error, I think this would be valuable information. I figured the first step is to fix it, and then I can go back afterwards and worry about why it broke in the first place. I ran a timing test and the difference in runtimes was negligible. So apparently it does not affect performance much. I still think that these changes will result in fewer breakages due to refactoring on average. (This was my original motivation, because I kept breaking things when I was trying to refactor (2,1)-categories - |
Thanks for doing this! I think we have to decide to either go fully in with this approach, using The changes are much less than I expected. But it's still the case that various things become just slightly more awkward. Being able to smoothly apply WildCat lemmas to situations not using WildCat notation is really valuable, and I'm reluctant to impose these "papercuts" on all uses of the WildCat library, especially since there still don't seem to be any concrete benefits. You've mentioned challenges while refactoring, but most of the time people are proving things, and we want that to be as easy as possible. My suggestion is that we focus on changes that will make the typeclass search more efficient. For example, we can use priorities/costs more. Also, instead of importing all of WildCat, we should be selective, so we don't have the wildcat structures on Unit and Empty in the typeclass database, cluttering up the search. And at the top of a file, we can re-add important instances to the database, so they come first in the search. Etc. |
Assuming we decide not to merge this PR, would you be willing to merge many of the "paper cuts" in this PR on the grounds that these are "accidents waiting to happen", i.e. the things changed in this PR identify things that could possibly break later during refactoring? This would have no effect on the end user. Many of the changes are not too ugly, they are little one line fixes, sometimes a single character. |
To me, a lot of the changes clutter things up, so I would have a preference to not make them. A few would be fine, e.g. changing |
I've changed my mind, so I'm retracting my approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have marked the changes we can take from here. Let's open a new PR as to keep the discussion on point. This PR was about proposing Hint Mode
and it seems it wasn't successful due to reasons we don't fully understand. Please open a new PR with the changes I marked, since they are a net readability improvement.
@@ -205,7 +205,7 @@ Defined. | |||
Section MonoidEnriched. | |||
Context {A : Type} `{HasEquivs A} `{!HasBinaryProducts A} | |||
(unit : A) `{!IsTerminal unit} {x y : A} | |||
`{!HasMorExt A} `{forall x y, IsHSet (x $-> y)}. | |||
`{!HasMorExt A} `{forall x y : A, IsHSet (x $-> y)}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave this
rapply Build_Is0Bifunctor''. | ||
Defined. | ||
: Is0Bifunctor (A:=AbGroup^op) ab_ext | ||
:= Build_Is0Bifunctor'' _. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these
I don't agree with this summary. @patrick-nicodemus has shown that it is possible to use Either way, we've also learned about how to reduce the amount of typeclass search, and I'll make a PR soon with a few minor changes illustrating this. I agree that some of the changes in this PR should also be made in a separate PR. Your list seems like a pretty reasonable subset to choose. |
I may have said this before but the VSCode Coq plugin does not yet support the very useful feature of interrupting a tactic that is taking a long time to run, so typeclass searches that loop or take exponential time to run require me to kill VSCode and reopen it. This irritation has been in mind when opening these related PR's. I think I'll switch back to ProofGeneral until that feature gets added, because otherwise interactive development with this library becomes quite painful. (Even "Check fmap11" can cause a typeclass loop and halt development since fmap11 has so many maximally inserted implicit arguments.) |
If you use |
@patrick-nicodemus Thanks again for experimenting with this. It's been very useful. But I think we should close this now. Sound good? |
@jdchristensen Sounds good, it's a little disappointing but I understand the criticism expressed. If I continue to run into breakages while refactoring, I might propose another PR in the same spirit, but I'll keep in mind that the policy is not to compromise on the strength of elaboration of newly proved theorems. |
One possible way to avoid this issue is that you can stick all the hints in a module, and only import it in files where the hints help, if it is feasible, for example, inside the wildcats library but not outside |
This pull request is similar to the previous one, where the
Hint Mode
ofIsGraph
is changed to!
, meaning that typeclass search will never try to solveIsGraph ?A
unless?A
is known.I've done some more experimenting and debugging and I feel I have a little more to say regarding Coq's unification, elaboration and typeclass inference. Elaboration is a complex beast and there's a lot I really don't understand and I'm early in learning. Still, I learned a lot in the process of working on this so I'd like to document what I learned.
Let me start with the justification for this PR: I claim that the current setup makes the
rapply
tactic slow and inefficient, and to explain this I will discuss the refine tactic and elaboration in Coq.In short, the way elaboration works in Coq for an incomplete term
t
declared to be of typeT
is:U = type(t)
fort
, which may contain holes.T
withU
, solving all holes in the process.This is very much a "bottom up" process: the declared type
T
has no effect at all at stage 1. It's possible to change this now using bidirectionality hints, which are not part of this PR, but I'll comment on these later, I think they could be useful.Actually this process is apparently recursive, in the sense that while traversing the term
t
from the ground up, both steps are repeated for each subterm. For example, iff a
is a small subterm oft
, wheref
is a variable of declared typeforall x :A, B x
anda
is a complex term, then:U_a = type(a)
, which may contain holes, andU_a
withA
and if successful reportsB a
as the type off a
.Typeclass search seems to be part of step 2; if unification fails, then by default we launch a typeclass search to instantiate the values within terms (if this flag is set to its default value, Typeclass resolution for conversion).
Typeclass search seems to be regarded as a "success" for a given subterm if it successfully instantiates all the typeclass holes in a consistent way, even if these choices cause elaboration of the broader term to fail.
An important thing that can interfere with this process is coercions. In order for coercions
x -> f x
to work right in partially elaborated terms, we need to know the type thatx
needs to be in order to apply the coercion. For example,g $o f
will fail in general ifg
needs to be coerced (say, fromIso x y
toHom x y
) in order to be of the correct type, because we can't derive the Hom type fromg
alone.Some of the downstream changes I made have a simple, obvious explanation. For example, in
it should be clear that there is no way to "honestly" elaborate the expected type
Reflexive Hom
orfun a => Id a
by deriving it from constraints, becauseA
is not mentioned. Instead, in trying to elaborateHom ?A ?IsGraph ?Is01Cat
,typeclass search would in theory search through the entire database plugging in all possible values of
?IsGraph
, but it starts with the hints prioritized in the context, and since the hints in the context give a valid solution to the typing problem, there's no need to continue the rest of the search. This seems harmless, although I think I have an aesthetic preference for explicit specification when things cannot be uniquely derived.I think there are a few points in the code base where we are relying on a somewhat surprising elaboration strategy for terms. I will attach a small example, which is a paraphrased version of one of the changes in this commit.
From reading the unification debug logs, the first thing that happens (after
f : Hom A B
is fully elaborated) is that we try to check that the application ofcat_comp
toinv B
is valid by unifying the type ofinv B
,GroupIsomorphism B B
, withHom ?A ?IsGraph ?Is01Cat ?x ?y
This unification fails almost immediately (thank you very much Gaetan Gilbert for helping me understand why!) so we try and elaborate both the type ofinv B
andHom ?A ?IsGraph ?Is01Cat ?x ?y
independently using typeclass search.It's important to note here that when we are trying to elaborate this latter term, we are truly starting from scratch and starting to "elaborate" the term
Hom _ _ _ _ _
from nothing. The typeclass search is not being driven by the type argument, because there is no type argument, so it is just generating any arbitrary typeclasses of the formIs01Cat
andIsGraph
, and instantiating the type this way. In this case we get lucky and get the right answer, but only by the coincidence thatisgraph_abgroup
happened to be the most recent hint in the database - if you changeAbGroup
toGroup
ininv
andabc1
, it will fail because it still findsisgraph_abgroup
as the most recent hint in the database and we can't convert a Group to an AbGroup.So, in my opinion, this is not very robust, we are getting lucky, and I think we shouldn't rely on this elaboration technique, if possible we should try and understand why unification is failing and fix it. This is one argument for using
Hint Mode +
in some cases.Now let me return to the discussion of
rapply
. Something likeBuild_Is0Bifunctor''
takes, say, a dozen arguments, so we wantrefine Build_Is0Bifunctor'' || refine Build_Is0Bifunctor'' _ || refine Build_Is0Bifunctor'' _ _
to fail quickly at each case so we quickly reach 12 arguments. The way this works is that we try to unify the type of(Build_Is01Bifunctor'' _ _ _ _ _ _)
with the type of the goal, and if this fails, we launch typeclass search to elaborate the term and try again, but this is not guided from the top down, so we are trying every combination of hints in the database for the typeclass holes. This is very expensive. UsingHint Mode !
on the parameter types would make this faster because we basically wouldn't carry out a typeclass search at all unless unification succeeded in instantiating the parameter types.Now, bidirectionality is complementary to this. If we write
then whenever we call
Build_Is0Bifunctor''
, it will first look to the context of the expected type to solve forA, B, C
, and then try to elaborate the rest of everything from this. In this situation, as long asBuild_Is0Bifunctor
is being used in a situation where the correct types can be inferred from the external context then you will not have this unguided typeclass search. And in many cases where we remove typeclass inference withHint Mode +
we could restore the same elaboration behavior with bidirectionality hints, so hopefully not too many more annotations. On the other hand if unification fails completely (as in the first few iterations ofrapply
) then typeclass search will still start from a blank slate, so bidirectionality will not make as much of an impact there.I experimented with bidirectionality and I found that in some cases you want to use it and in some you don't. If you have
cat_comp _ _ _ g f
whereg
needs to be coerced fromIso x y -> Hom x y
or something, then bidirectionality is helping you, because deriving the category structure fromIso x y
would give you the wrong answer. But ifg, f
are (say) group homomorphisms and you have a function application(cat_comp _ _ _ g f) x
, then bidirectionality could hurt you here because based on the application it will derive that(cat_comp _ _ _ g f)
should be of a function type, and thereforeHom x y
should be a function type, and maybe it will now coerceg, f
from group homomorphisms to functions before composing them. Or, if this unification fails, it will carry out typeclass search and just find any type with a category structure whose homs are function types.In some of the commits below, we have to add additional type annotations.
In some,
rapply
takes too much time to run; I checked that these can be fixed by adding moreHint Mode +
types to makerapply
fail faster, but I chose a simpler patch.A complementary fix for
rapply
would be perhaps to replacerefine (f _ _ _ _)
withnotypeclasses refine (f _ _ _ _); try(typeclasses eauto)
which should fail faster if it cannot successfully unify the type of the input with the type of the term at all, making the earlier stages much faster.