Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Toolchain dependencies that have circular self-dependencies should not be uninstalled before reinstalling/upgrading them #25857

Closed
embray opened this issue Jul 13, 2018 · 76 comments

Comments

@embray
Copy link
Contributor

embray commented Jul 13, 2018

On Cygwin I tried doing ./sage -f mpir and everything broke.

This is because my C compiler uses mpc and mpfr, and on Cygwin (see #25816) at least, it ends up using Sage's mpc and mpfr DLLs. This may be less of a problem on other platforms but I haven't given much thought to it yet.

The mpc and mpfr DLLs in Sage break when Sage's mpir gets uninstalled, so it's necessary at a minimum to also uninstall mpc and mpfr. But at that point it might also make sense to recursively uninstall anything that depends on mpir. This seems reasonable given that they'll have to be re-built anyways.

Component: build

Author: Erik Bray

Branch/Commit: fb51d4e

Reviewer: Dima Pasechnik, Jeroen Demeyer

Issue created by migration from https://trac.sagemath.org/ticket/25857

@embray embray added this to the sage-8.4 milestone Jul 13, 2018
@jdemeyer
Copy link
Contributor

Replying to @embray:

On Cygwin I tried doing ./sage -f mpir and everything broke.

Can you give more details :-)

I disagree with your solution: it just needs to reorder the operations from

  1. Uninstall
  2. Build
  3. Install

to

  1. Build
  2. Uninstall
  3. Install

That shouldn't break anything.

If this is not easy to fix, maybe we should just disable the uninstall feature for now?

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:2

Replying to @jdemeyer:

Replying to @embray:

On Cygwin I tried doing ./sage -f mpir and everything broke.

Can you give more details :-)

I think I did in the description. By "everything" I mean that as soon as mpir's configure tries to detect a C compiler, gcc breaks because it's loading DLLs from $SAGE_LOCAL (ugh), and those DLLs are in turn compiled against the mpir from Sage, and are incompatible with my system's GMP.

The problem here is my system's gcc using libraries from Sage, when it really shouldn't. But this is a little difficult to avoid, unless I can patch things somewhere so that executables from the system such as gcc or git are explicitly not run with Sage's environment modifications (e.g. to $PATH). I believe this is less of a problem on Linux.

I disagree with your solution: it just needs to reorder the operations from

Why? I think it's a rather sensible solution.

  1. Uninstall
  2. Build
  3. Install

to

  1. Build
  2. Uninstall
  3. Install

It sort of depends on what you mean by "order of operations". Currently, sage-spkg does the second process, which is certainly my preference. (This process has its own problems, e.g. when building Python--this is partly Python's fault, and partly our fault for not providing better isolation between build and runtime environments.)

However, when you run ./sage -f like I did, it manually calls make <pkg>-clean before calling make <pkg>. Perhaps it shouldn't.

I don't consider this a blocker though, unless the problem can be reproduced other than on Cygwin, since I'm pretty much the only person who does development on Cygwin, and this isn't a normal use case.

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:3

FWIW I did lots of testing of the uninstaller on Linux, including doing odd things like ./sage -f mpir (and I did specifically focus on mpir due to its involvement in tricky dependency cycles). Never saw any problem like this.

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:4

Anyways, a lot more packages have problems if you don't remove their previous versions before building new versions of those packages, than the other way around. In this case, it's because mpir has dependents that are in turn dependencies of the compiler. In that case I would just uninstall all of mpir's dependents as well, hence my suggestion of recursive uninstall.

I don't like uninstalling things before their replacements are built, but it seems that's pretty necessary a lot of the time. For that reason I think I'll place a higher priority on #25202.

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:5

This reminds me--did you ever get anywhere with creating a separate build-time package for mpir? Something like that might also help. Same for the other "toolchain-deps" (zlib, mpfr, mpc). They all share the problem that breaking them can break the C compiler which, in turn, needs them to build.

@jdemeyer
Copy link
Contributor

comment:6

Replying to @embray:

In that case I would just uninstall all of mpir's dependents as well, hence my suggestion of recursive uninstall.

How would you deal with a Sage-built GCC? That will obviously depend on Sage's MPIR too.

@jdemeyer
Copy link
Contributor

comment:7

Replying to @embray:

This reminds me--did you ever get anywhere with creating a separate build-time package for mpir?

Creating a separate package might simplify the build system, but it wouldn't fix the uninstall problem.

@jdemeyer
Copy link
Contributor

comment:8

Replying to @embray:

FWIW I did lots of testing of the uninstaller on Linux, including doing odd things like ./sage -f mpir (and I did specifically focus on mpir due to its involvement in tricky dependency cycles). Never saw any problem like this.

Did you do that with a Sage-built GCC? That's where things get tricky.

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:9

A few additional thoughts:

a) As for sage-gcc, recursively uninstalling all dependents doubly makes sense. If you rebuild mpir you have to rebuild gcc anyways.

b) I'm thinking at the very least recursive uninstall should happen for all the "toolchain-deps"

c) Something I think would make all of this work better (either instead of, or parallel to separate packages), would be to have a separate install prefix, either under $SAGE_LOCAL or parallel to $SAGE_LOCAL or all toolchain dependencies. This prefix would be superseded by the usual SAGE_LOCAL paths, so the bootstrap toolchain never entirely goes away, but packages that are otherwise dependents of the toolchain can still be upgraded without breaking the existing toolchain.

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:10

Replying to @jdemeyer:

Replying to @embray:

FWIW I did lots of testing of the uninstaller on Linux, including doing odd things like ./sage -f mpir (and I did specifically focus on mpir due to its involvement in tricky dependency cycles). Never saw any problem like this.

Did you do that with a Sage-built GCC? That's where things get tricky.

I did, and I don't remember having any problems (again, on Linux), but that was a long time ago now so my memory is hazy.

@jdemeyer
Copy link
Contributor

comment:11

Replying to @embray:

I did, and I don't remember having any problems (again, on Linux)

Maybe your system GMP libraries were compatible with the Sage MPIR libraries, so the GCC-in-Sage used your system GMP library while MPIR was being rebuilt? Just guessing...

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:12

I'd think it would almost have to. I don't see how else it could work.

All the more reason to have separate copies as "toolchain dependencies", and even then only needed when doing silly things like building gcc. Then, a developer wanting to do things (such as I was doing) like experiment with patching and reconfiguring mpir can do so on top of that without breaking the toolchain.

@jdemeyer
Copy link
Contributor

comment:13

The question is what can we do right now to fix this serious bug?

@embray
Copy link
Contributor Author

embray commented Jul 13, 2018

comment:14

Unless someone can reproduce this outside of Cygwin, I don't think it's that serious. I just installed sage-gcc on a Linux build and am trying all manner of ./sage -f mpir/gcc/etc without any problems.

I'm going to keep thinking about it though. If I can find a simple solution that works for Cygwin I believe it would work on Linux as well. Perhaps one very localized solution would be that if one of the toolchain-deps gets uninstalled they should all be uninstalled and rebuilt.

@jdemeyer
Copy link
Contributor

comment:15

Replying to @embray:

Unless someone can reproduce this outside of Cygwin, I don't think it's that serious.

Given the analysis, I don't see any reason why this would be limited to Cygwin.

@jdemeyer
Copy link
Contributor

comment:16

Just reproduced it on Ubuntu 16.04.1 LTS (CPU architecture ppc64le) with system compiler

gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

When running ./sage -f mpir on Sage 8.3.rc1, the problem is as I predicted:

$ ldd local/libexec/gcc/powerpc64le-unknown-linux-gnu/7.2.0/cc1
        linux-vdso64.so.1 =>  (0x00003fffa2290000)
        libmpc.so.3 => /home/jdemeyer/sage-test/local/lib/libmpc.so.3 (0x00003fffa2260000)
        libmpfr.so.6 => /home/jdemeyer/sage-test/local/lib/libmpfr.so.6 (0x00003fffa21c0000)
        libgmp.so.23 => not found
        libdl.so.2 => /lib/powerpc64le-linux-gnu/libdl.so.2 (0x00003fffa2180000)
        libz.so.1 => /home/jdemeyer/sage-test/local/lib/libz.so.1 (0x00003fffa2140000)
        libm.so.6 => /lib/powerpc64le-linux-gnu/libm.so.6 (0x00003fffa2050000)
        libc.so.6 => /lib/powerpc64le-linux-gnu/libc.so.6 (0x00003fffa1e80000)
        /lib64/ld64.so.2 (0x00003fffa22b0000)
        libgmp.so.23 => not found
        libgmp.so.23 => not found

causing obvious build failures.

@embray
Copy link
Contributor Author

embray commented Jul 16, 2018

comment:17

I'll try to reproduce that on x86_64, as I don't have access to that architecture. But I see what you're saying in general.

There are two cases here:

  • without sage-gcc
  • with sage-gcc

The reason I focused here on Cygwin is that this a problem for Cygwin even without sage-gcc due to problems with the DLL search path on Windows. It non-sage-gcc should not have any problem on Linux.

But it is a problem certainly for sage-gcc if it's built against a sage-mpir that is not compatible somehow or other with the system's libgmp.

This all goes back to (one of) Sage's original sins, which is that you're building software packages in a build environment that is also the installation target. For most packages this is not a big problem, but for your build toolchain itself it certainly is. This is why I think that for bootstrapping purposes the build toochain and its dependencies should have some level of isolation from the rest of the install target.

As for a solution, I'd certainly be willing to disable the new-style uninstallation as the default behavior for now. Honestly, I would have preferred to wait until more/all of DESTDIR-related updates were in, as well as #25140, though be design it's supposed to be able to work without that (and does in most cases). But I'm hesitant because for the majority of packages it does work well now, and I'm not entirely sure what the impact would be of fully disabling it now (but maybe it would be small).

So I think for now it still might be best to find a middle-ground. What if, for now, for toolchain dependencies only, we ensure that their files are not removed before reinstalling them? This could work in part with cooperation from sage-spkg by adding a flag similar to pip's --ignore-installed which causes it to ignore dependencies that are already satisfied and just install on top of them, rather than uninstalling conflicting/existing dependencies first.

@jdemeyer
Copy link
Contributor

comment:18

Personally, I would prefer to revert uninstall completely because that's the safest operation. But if you have a different solution which is reasonably simple, that might be OK for me too.

@jdemeyer jdemeyer modified the milestones: sage-8.4, sage-8.3 Jul 17, 2018
@embray
Copy link
Contributor Author

embray commented Jul 17, 2018

comment:19

I mean, this issue needs to be resolved somehow anyways--the new package uninstallation is not going away long-term.

@embray
Copy link
Contributor Author

embray commented Jul 17, 2018

comment:20

I'm currently testing a fix that I think is pretty uninvasive. I would still consider it a work-around but it's reasonable.

@embray
Copy link
Contributor Author

embray commented Jul 17, 2018

Branch: u/embray/ticket-25857

@embray
Copy link
Contributor Author

embray commented Jul 17, 2018

Author: Erik Bray

@embray
Copy link
Contributor Author

embray commented Jul 30, 2018

comment:56

Replying to @jdemeyer:

Some other comments:

  1. the flags SAGE_INSTALL_FETCH_ONLY, SAGE_CHECK and SAGE_KEEP_BUILT_SPKGS are meant to be usable as environment variables. So I would do the same for KEEP_EXISTING (rename it to SAGE_KEEP_EXISTING and remove KEEP_EXISTING=0).

Ok.

  1. keep the whitespace line in the -clean rule.

Yes, weirdly I thought I already fixed that. I don't know why the fix went away again.

@embray
Copy link
Contributor Author

embray commented Jul 30, 2018

comment:57

Replying to @jdemeyer:

Replying to @embray:

I'm sorry but you'll have to live with that. Read the docs?

Understanding this syntactically is not the same as actually understanding it semantically. Whenever I try to review this ticket, I also end up realizing that there is no way to be sure that this Makefile syntax does what it is supposed to do.

And your suggestion to run make -f build/make/Makefile -n DEBUG_RULES=1 would be useful if it would actually do what it said: the toolchain-deps rules that you change on this ticket do not appear there.

That's just for printing the automatically generated rules, not for printing every rule in the makefile, not statically-defined targets, like toolchain-deps.

I'm not sure what's unclear about that target.

I'm not making this comment just to be annoying or pedantic. I really do think that it's a serious problem with #21524.

I believe your intentions are sincere, but I still find this objection very confusing. I feel like you're just stubbornly refusing to learn for some reason, because I find it easy to understand, and to check; maybe just not as easy as, say, a Python script. I don't know if that's what's going on; that's just what it feels like to me. Maybe a tutorial on make would help; I don't know.

Anyways, I think that discussion is outside the scope of this ticket.

@embray
Copy link
Contributor Author

embray commented Jul 30, 2018

comment:59

Replying to @embray:

Replying to @jdemeyer:

Some other comments:

  1. the flags SAGE_INSTALL_FETCH_ONLY, SAGE_CHECK and SAGE_KEEP_BUILT_SPKGS are meant to be usable as environment variables. So I would do the same for KEEP_EXISTING (rename it to SAGE_KEEP_EXISTING and remove KEEP_EXISTING=0).

Ok.

Actually, the more I think about it, I think we should avoid adding yet another environment variable for now. Is this something we actually need? If there's not an obvious use case for setting this flag for all packages I think it should be avoided. The name SAGE_KEEP_EXISTING is also very unclear as to what it does. Keep existing what? In the context of the sage-spkg script it makes sense, but it's very unobvious otherwise.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jul 30, 2018

Changed commit from 2da7f8e to 85a368f

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jul 30, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

85a368feach of these templates should have an empty newline at the end to make them more readable when printed one after the other

@embray
Copy link
Contributor Author

embray commented Jul 30, 2018

comment:62

Replying to @embray:

Replying to @jdemeyer:

Replying to @embray:

I'm sorry but you'll have to live with that. Read the docs?

Understanding this syntactically is not the same as actually understanding it semantically. Whenever I try to review this ticket, I also end up realizing that there is no way to be sure that this Makefile syntax does what it is supposed to do.

Which syntax specifically? What are you trying to check?

@jdemeyer
Copy link
Contributor

jdemeyer commented Aug 7, 2018

comment:63

Replying to @embray:

If there's not an obvious use case for setting this flag for all packages I think it should be avoided.

Why? I would argue the opposite: it doesn't harm to support it as an environment variable, so why should we not do that?

The name SAGE_KEEP_EXISTING is also very unclear as to what it does.

Then call it SAGE_SKIP_UNINSTALL or whatever.

@jdemeyer
Copy link
Contributor

jdemeyer commented Aug 7, 2018

comment:64

Replying to @embray:

Which syntax specifically? What are you trying to check?

I'm just trying to check that this branch does what it seems to be doing. It looks like it does the right thing (skipping uninstall for toolchain packages) but how I can be sure?

By now, I've looked at this ticket a few times and I guess it's OK. I'll wait for your reply on the environment variable issue before doing a final review.

@embray
Copy link
Contributor Author

embray commented Aug 8, 2018

comment:65

Replying to @jdemeyer:

Replying to @embray:

Which syntax specifically? What are you trying to check?

I'm just trying to check that this branch does what it seems to be doing. It looks like it does the right thing (skipping uninstall for toolchain packages) but how I can be sure?

I'm still not sure what you're asking here. Are you just asking how to test code in a makefile? How can you be sure any code does what you think it looks like it does? I'm not trying to be facetious--I'm just trying to get to the bottom of what your concern is.

@embray
Copy link
Contributor Author

embray commented Aug 8, 2018

comment:66

Replying to @jdemeyer:

Replying to @embray:

If there's not an obvious use case for setting this flag for all packages I think it should be avoided.

Why? I would argue the opposite: it doesn't harm to support it as an environment variable, so why should we not do that?

I don't agree. Adding any new "feature" means you have to support it. The main purpose of this change (although it does add new features in the form of new command-line flags to some scripts) was not really to add new features so much as to workaround a specific problem. So better to keep the feature surface area of the change minimal. I also believe the YAGNI principle applies here.

But if you really want it for some reason a compromise might be to change the name but not document it anywhere. I'm also a little unsure about this in that it's really a behavior of the uninstallation that this is changing, not the installation. If anything it should be an environment variable read by the uninstaller, but there again it adds needless complication... (That last sentence didn't make sense; I misremembered what my own patch does :)

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 10, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

fb51d4eallow 'sage-spkg -k' to also be implied by setting the environment varible SAGE_SPKG_SKIP_UNINSTALL=yes

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 10, 2018

Changed commit from 85a368f to fb51d4e

@embray
Copy link
Contributor Author

embray commented Aug 10, 2018

comment:68

How about this? I called it SAGE_SPKG_SKIP_UNINSTALL and noted it in the documentation at the top of sage-spkg, but not elsewhere, at least for now.

I'd like to get this done already; I think this fix is too important to be held up any longer by bikeshedding.

@dimpase
Copy link
Member

dimpase commented Sep 7, 2018

comment:69

Perhaps this should get #25188 as a dependence, so that the latter and this are merged together, with just one new configure tarball instead of two...

@embray
Copy link
Contributor Author

embray commented Sep 7, 2018

comment:70

Looks like the latest patchbot build failed due to #18438.

Dima, that's not a bad idea, but I'd like to see a positive review on both first.

@dimpase
Copy link
Member

dimpase commented Oct 15, 2018

Reviewer: Dima Pasechnik, Jeroen Demeyer

@vbraun
Copy link
Member

vbraun commented Oct 20, 2018

Changed branch from u/embray/ticket-25857 to fb51d4e

@embray
Copy link
Contributor Author

embray commented Oct 28, 2018

comment:73

This should be re-targeted for 8.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants