bundler downloads gem dependencies rather than Bazel downloader #16
Comments
I agree it would be ideal, though I wonder how hard it is going to be to implement this. There are a few questions for me at the moment:
One alternative I planned to do was to try using |
So let's not forget the main goal of Bundler: to resolve the versions of the gems, assuming the lock file does not exist. In Ruby we generally check both Gemfile and Gemfile.lock into git for apps (Rails apps, CLI apps, etc.). However, for gems we never commit the Gemfile.lock, because we want the gem to be pulled into a third-party app with its own versions. |
Bundler has the following common tasks:
|
This used to be the case, but currently Bundler recommends checking in Gemfile.lock for gems as well - https://bundler.io/guides/faq.html#using-gemfiles-inside-gems:
|
Yup, another way to explain it: Bundler should still be used to read the user's description of dependencies and run its resolver and constraint solver, then the user must check in the result. We just need to parse it in Starlark and try to download exactly the same files Bundler would. |
> However, for gems we never git commit the Gemfile.lock because we want the gem to be pulled into a 3rd party app with its own versions.
> This used to be the case, but currently Bundler recommends checking in Gemfile.lock for gems as well - https://bundler.io/guides/faq.html#using-gemfiles-inside-gems:
Wow! Learn something new every day!
I had no idea.
So, in that case — it is my understanding that Bazel does not write anything into the sources folder. So if I have a Ruby project with a Gemfile, and we run a workspace rule `bundle_install` — is it even possible to copy/place the generated Gemfile.lock file next to the Gemfile so that it can be checked in?
|
I suppose we need to re-implement https://github.com/rubygems/bundler/blob/master/lib/bundler/lockfile_parser.rb in Starlark first, then iterate over all the gems and download them. The Ruby code relies heavily on regular expressions, and I am not sure that's possible in Starlark; as far as I know there are no regexps there. @alexeagle Can we instead do this using a Ruby script that would produce a list of files to download, and then use this list in a repository rule? |
Getting a Ruby interpreter in the repository rule context is very difficult, and doing work in repository rules is bad practice since the result isn't cached. We can write a parser in Starlark; we've already had to do YAML (for pnpm) and TOML (for Python pdm). |
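To illustrate that the lockfile can be read without regular expressions, here is a minimal sketch in Python, restricted to plain string methods so it should port to Starlark with few changes. The function name `parse_lockfile_specs` and the indentation rule (4 spaces for a gem, 6 for its dependencies) are assumptions based on how a typical Gemfile.lock looks, not on Bundler's own parser; the sample data reuses the gems quoted later in this thread.

```python
# Sketch only: reads the `specs:` blocks of a Gemfile.lock by indentation.
# Assumption: gems sit at 4 spaces of indent, their dependencies at 6.

LOCKFILE = """\
GEM
  remote: https://rubygems.org/
  specs:
    jar-dependencies (0.4.1)
    psych (5.1.1.1)
      stringio
    psych (5.1.1.1-java)
      jar-dependencies (>= 0.1.7)
    stringio (3.0.9)

PLATFORMS
  java
  ruby
"""


def parse_lockfile_specs(text):
    """Return {(name, version): [dependency names]} for every locked gem."""
    specs = {}
    current = None
    in_specs = False
    for line in text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip(" "))
        if stripped == "specs:":
            in_specs = True
            continue
        if not stripped or indent < 4:
            # A blank line or a shallower line ends the current specs block.
            in_specs = False
            current = None
            continue
        if not in_specs:
            continue
        if indent == 4:
            # "psych (5.1.1.1-java)" -> ("psych", "5.1.1.1-java")
            name, _, rest = stripped.partition(" (")
            current = (name, rest.rstrip(")"))
            specs[current] = []
        elif indent == 6 and current:
            # "jar-dependencies (>= 0.1.7)" -> "jar-dependencies"
            specs[current].append(stripped.partition(" ")[0])
    return specs


SPECS = parse_lockfile_specs(LOCKFILE)
```

Keying on `(name, version)` rather than on the name alone keeps both `psych` entries, which matters for the platform discussion further down.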
We should require that the user has already done this using the typical developer tooling before invoking |
Oh, that sounds like fun :D Can you share some links here to see how it was done?
Yes, there is |
I suppose it's in https://github.com/aspect-build/rules_js/blob/fdb95cffe366e2e3b62de41b2921c906621ec4aa/js/private/yaml.bzl, can't find TOML parser though. |
So let's not forget what Bundler does as its perhaps primary task. I believe most other features are extraneous. The syntax that the
gem 'rails', '7.1.0'
gem 'rack', '>=1.0'
gem 'puma', '~>6'
gem 'puma-daemon'
Version specifiers are defined by RubyGems. When you run This is far from a trivial problem, and the algorithm itself went through multiple iterations. The latest update completely revamped it. In other words, I think that if you can skip reimplementing any of this in Starlark, you absolutely should skip it.
TL;DR
I think it might make most sense that if a Bazel Ruby project has a
Caveat
Another complication is that some Ruby projects might have more than one Thoughts? |
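For readers less familiar with the version specifiers in the Gemfile lines above, here is a rough Python sketch of how a single requirement such as `>=1.0` or `~>6` can be checked against one version. The helper names are hypothetical, only plain numeric versions are handled (no prereleases), and this is the easy part: choosing a mutually compatible set of versions across all gems is the resolver's job, which is exactly what the comment above suggests not reimplementing.

```python
# Sketch only: checks ONE version against ONE requirement string.
# Assumptions: numeric dotted versions, no prerelease segments.

OPERATORS = ("~>", ">=", "<=", "!=", ">", "<", "=")


def parse_version(text):
    return [int(part) for part in text.strip().split(".")]


def compare(a, b):
    """Return -1, 0 or 1, treating missing segments as zero (6 == 6.0)."""
    width = max(len(a), len(b))
    a = a + [0] * (width - len(a))
    b = b + [0] * (width - len(b))
    return (a > b) - (a < b)


def satisfies(version, requirement):
    requirement = requirement.strip()
    op = "="  # a bare version such as '7.1.0' means an exact match
    for candidate in OPERATORS:
        if requirement.startswith(candidate):
            op = candidate
            requirement = requirement[len(candidate):]
            break
    v = parse_version(version)
    bound = parse_version(requirement)
    order = compare(v, bound)
    if op == "~>":
        # Pessimistic operator: drop the last segment (if there is more
        # than one) and bump the new last segment to get the upper bound.
        upper = bound[:-1] if len(bound) > 1 else list(bound)
        upper[-1] += 1
        return order >= 0 and compare(v, upper) < 0
    return {
        "=": order == 0,
        "!=": order != 0,
        ">=": order >= 0,
        "<=": order <= 0,
        ">": order > 0,
        "<": order < 0,
    }[op]
```

With this sketch, `satisfies("6.4.0", "~>6")` holds while `satisfies("7.0.0", "~>6")` does not, which is the intent of the `puma` line above.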
The environment variable |
Yes, this is how it works in other languages. You must run the resolver from the package manager in order to create the lockfile that our repository rule begins from. |
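Once the lockfile exists, turning its entries into downloads is mostly string formatting. The sketch below is an illustration, not the ruleset's code: it assumes gems are served from `<remote>/gems/<name>-<version>.gem`, where `<version>` is the string as written in Gemfile.lock (including any platform suffix such as `-java`), and the function names are hypothetical.

```python
# Sketch only: map locked (name, version) pairs to files a repository rule
# could fetch. The URL layout is an assumption about the gem server.

def gem_file_name(name, lock_version):
    # "psych", "5.1.1.1-java" -> "psych-5.1.1.1-java.gem"
    return "{}-{}.gem".format(name, lock_version)


def download_plan(specs, remote="https://rubygems.org"):
    base = remote.rstrip("/")
    plan = []
    for name, lock_version in specs:
        file_name = gem_file_name(name, lock_version)
        plan.append({"url": base + "/gems/" + file_name, "output": file_name})
    return plan


PLAN = download_plan([("psych", "5.1.1.1-java"), ("stringio", "3.0.9")])
```

Each entry carries the two pieces a downloader needs, a URL and an output name, so the repository rule itself never has to run Bundler.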
I think that makes everything a lot easier.
An Idea about Bundler
What if there was a
By vendoring gems with the project and providing Gemfile.lock we can make bundle a build rule instead of a workspace rule.
A Question about shared cache
My next question is about how a giant Ruby monorepo benefits from Ruby projects being built by Bazel. With compiled languages I get the point of a shared cache 100%. But what are we sharing in a distributed cache for a large team of Ruby developers? Especially if the lock files are already resolved and gems are vendored? Do we just get the benefit of faster build times because of a small number of gems with native extensions? |
I was just reading the blog post on Tinder Engineering about migrating their iOS app to Bazel. They mentioned the exact same issue we are facing with Bundler. Here is what they said: |
@kigster This is interesting, can you share the link? If there is already a ruleset that implemented a parser for Podfile.lock, we can re-use and adapt it for Gemfile.lock. |
The link is above, just click on "Tinder Engineering" and it will take you to the blog. I don't know where their sources are. |
Thank you, my bad! |
Yesterday I met the engineers who own the Ruby rules at Stripe, and confirmed the design @p0deje and I have discussed is pretty much what they do. @sushain97 can help answer questions about the details. |
@p0deje here's my simplistic toml parser for a Python package manager |
👋 I'm happy to share code if it would help. I'd just need to redact anything sensitive. We have a |
@sushain97 That would be amazing, I'm still trying to figure out a proper way to tell |
All of this is wonderful news. I would like to get your feedback on the idea of
The main use case is this: let's say we have an existing Rails app that is being pulled into a monorepo using the new/now-standard Ruby rules. Rails projects thrive on following conventions. That's what makes Rails so great for starting web apps. Many repos will have a There may or may not be a
My main question has always been this: what is the process of "Bazelifying" such a repo? I.e., which of the commands/rake tasks/tests do we want to be able to invoke via Bazel? And in the end we should be able to clearly formulate the biggest selling point of migrating a large Ruby or Rails app to Bazel. Is it only driven by the desire to consolidate the code into a monorepo, or are there additional advantages for Ruby developers too?
Shared remote build cache is one of the "killer" features in Bazel that makes development of huge compiled codebases possible. But does it also help Ruby projects? Will Ruby developers working on the repo see Bazel integration as a boost to their productivity or a hindrance?
I remember seeing that some languages had "bridge" CLI tooling that auto-generated BUILD files by scanning the project across many folders, making the process of migrating the repo much easier. Is this a valid pattern for Bazel integrations? Are there real-world examples? Would it make sense to develop such a Ruby gem in parallel with the rules themselves?
Sorry for the long post. TBH these are the questions I've been struggling with from day one of learning about Bazel and being tasked with creating Ruby rules. |
I don't know if it's easier, but the point is that some repos, in particular gems, might have several Gemfiles aimed at different parts of the source tree. So we can't just assume that it's always called A repo with alternative files will have some macro (a Makefile target, a shell script) that invokes commands by overriding the default In other words, it might be a good idea to create a Bazel struct or some other representation of both the Gemfile and Gemfile.lock that can be parametrized with a file name. Then any further bundle commands would have to depend on that target. |
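As one possible shape for such a representation, here is a small Python sketch. The class name `GemfilePair` is hypothetical. It relies on two Bundler behaviours: the lockfile sits next to the Gemfile and is named after it (`Gemfile` gives `Gemfile.lock`, `gems.rb` gives `gems.locked`), and the `BUNDLE_GEMFILE` environment variable selects a non-default Gemfile. That this is the variable meant in the comment above is an assumption.

```python
# Sketch only: a value object pairing a Gemfile with its lockfile so that
# later steps can depend on the pair instead of hard-coding "Gemfile".
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class GemfilePair:
    gemfile: str  # workspace-relative path, e.g. "gemfiles/rails_7.gemfile"

    @property
    def lockfile(self):
        directory, base = posixpath.split(self.gemfile)
        # Bundler names the lockfile after the Gemfile; gems.rb is special.
        name = "gems.locked" if base == "gems.rb" else base + ".lock"
        return posixpath.join(directory, name)

    def env(self):
        # Environment a bundle command would need to pick up this Gemfile.
        return {"BUNDLE_GEMFILE": self.gemfile}


DEFAULT = GemfilePair("Gemfile")
ALTERNATIVE = GemfilePair("gemfiles/rails_7.gemfile")
```

In a ruleset the same idea would more likely be a pair of label attributes on a rule, but the naming logic would be the same.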
Sorry for the late reply. First, some answers:
We're generally trying to avoid introducing Rake into any tasks executed by Bazel. Rake has its own dependency graph, unlike e.g. scripts in We have a
A non-exhaustive listing of our reasons:
For Ruby, we generally don't expose Bazel to our developers yet. An exception is for defining deployable container images.
Our approach is in many ways informed by having very bespoke tooling around basically all our application Ruby so I don't have a super useful perspective here. Our
Yes, we have a number of I've excerpted some of the code that we use here and uploaded it to https://github.com/sushain97/rules_ruby. The repository is private because the code itself is of varying quality and usefulness. I'm happy to discuss specifics in Slack. I've scrubbed the repository of anything sensitive but in case I missed anything, I'd request that we leave it private. The contents are licensed under MIT so relevant bits can be copied as appropriate. |
Here is a quick update on the progress I have so far. You can refer to https://github.com/bazel-contrib/rules_ruby/tree/bundle-fetch-attempt-304 to see WIP code and there are even some builds on CI passing (https://github.com/bazel-contrib/rules_ruby/actions/runs/7175240646), but there are multiple issues I'm running into. First of all, thanks to @sushain97 I managed to have
jar-dependencies (0.4.1)
psych (5.1.1.1)
  stringio
psych (5.1.1.1-java)
  jar-dependencies (>= 0.1.7)
stringio (3.0.9)
When resolving this part, we should install Moving on, some gems are restrictable in
# Gemfile
gem 'debug', '>= 1.0.0', platforms: %i[mri mswin64]
# Gemfile.lock
debug (1.6.3)
Bundle will not attempt to install I'm going to try using Another approach I'm thinking about is to generate |
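To make the first problem concrete, the sketch below picks one locked variant per gem for a given platform, using the `psych` entries quoted above. It is a simplification under stated assumptions: the platform is whatever follows the first `-` in the locked version, a missing suffix means the generic `ruby` platform, and platforms are compared by exact string match, which is far cruder than Bundler's own matching. The function names are hypothetical. It cannot address the second problem, because the `platforms:` restriction on `debug` is only written in the Gemfile.

```python
# Sketch only: choose one lockfile variant per gem for a target platform.

def split_platform(lock_version):
    """'5.1.1.1-java' -> ('5.1.1.1', 'java'); '3.0.9' -> ('3.0.9', 'ruby')."""
    base, sep, platform = lock_version.partition("-")
    return base, (platform if sep else "ruby")


def select_for_platform(specs, platform):
    """Prefer the platform-specific variant, else fall back to 'ruby'."""
    chosen = {}
    for name, lock_version in specs:
        _, spec_platform = split_platform(lock_version)
        if spec_platform == platform:
            chosen[name] = lock_version
        elif spec_platform == "ruby" and name not in chosen:
            chosen[name] = lock_version
    return chosen


LOCKED = [
    ("jar-dependencies", "0.4.1"),
    ("psych", "5.1.1.1"),
    ("psych", "5.1.1.1-java"),
    ("stringio", "3.0.9"),
]

ON_JRUBY = select_for_platform(LOCKED, "java")
ON_MRI = select_for_platform(LOCKED, "ruby")
```

Note that on the `ruby` platform this still selects `jar-dependencies`, which only the `-java` variant of `psych` depends on; a fuller version would walk dependencies from the selected variants instead of taking every locked gem.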
Most Ruby developers would prefer that
be used to install gems, because potential future changes to Bundler would require constant rewrites of the Bazel parsing rules. Also, I thought Gemfile.lock does have a platform section? |
It does, but it only informs about the platforms of the whole Gemfile, not a specific relationship between a gem and its platforms. |
But I want to commend you for doing excellent work on pushing these rules forward and I wish I had more time to help. |
@sushain97 could you please elaborate on Are you referring to packwerk or something else? |
Completely internal system. I don't think there's anything published externally about it but it's effectively a way for a set of Ruby files to package itself and export/import specific constants within that package's namespace. edit: It does look similar to |
Personally I like where this is going with packwerk. |
As always, there are (partial) answers in other interpreted languages. @jvolkman has been working in python where we have the same problem: interpreting packages to be installed from the lockfile is platform- and interpreter-dependent. Some thoughts:
|
There is another option. I happen to know the maintainer of Bundler. If we can propose a change to the lock file that would retain all the information needed on each platform; perhaps running Maybe other languages already resolved this? |
They are not too common in enterprise Rails apps, but more common within libraries, e.g. gems themselves. Also, as we discussed, earlier versions of RubyGems advocated against including the lock file with the gem. Now they recommend it, so more gems will have both.
Personally I don't think it's a good idea. The algorithm for computing the versions that satisfy all constraints in the
I can't answer that, but here is a blog post detailing the last time Bundler's algorithm changed, which sped up resolution by an order of magnitude.
I think that's right. |
I had the wrong link for the resolver algorithm, but it's now been updated to this: |
It's not so common, but we already have this in our example because rules_ruby/examples/gem/Gemfile Line 6 in a6ab8c2
My implementation is already doing that, however we need to know which gems to install on which platforms. This information is only present in Gemfile.
To reiterate, the platform information is not present in Gemfile.lock, only in Gemfile, which cannot be parsed without an interpreter since it is Ruby code. People write full-blown Ruby code there.
Yes, but the problem here is that a rule needs to make requests to rubygems.org to fetch information, which means that this action can only run with Not to mention that running
That would be fantastic if Gemfile.lock had:
So given we have this Gemfile.lock:
In a perfect world we would have the following:
|
Is this a public thread? Can anyone see it? I just reached out to @tenderlove and @indirect to weigh in. |
👋 I've been following this issue, apologies for not jumping in earlier. I agree with the sense being expressed here that we should probably just rely on With respect to checksums, bundler 2.5.0.dev (which will be in the upcoming Ruby 3.3 release) does write checksums to the lock file, in a new section named
For more information on this new feature, start with rubygems/rubygems#6374 HTH |
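If the lockfile carries checksums, a fetcher can pass them straight to the `sha256` argument of `rctx.download`. The sketch below assumes each entry in the new section looks like `  name (version) sha256=<hex digest>` under a `CHECKSUMS` header; that layout is an assumption and should be checked against the linked pull request. The digest is an obvious placeholder and `parse_checksums` is a hypothetical name.

```python
# Sketch only: read a CHECKSUMS section into {(name, version): sha256}.
# The section layout is assumed, and the digest below is a placeholder.

FAKE_DIGEST = "0" * 64

LOCKFILE_TAIL = (
    "CHECKSUMS\n"
    "  psych (5.1.1.1) sha256=" + FAKE_DIGEST + "\n"
    "  stringio (3.0.9)\n"
    "\n"
    "BUNDLED WITH\n"
    "   2.5.0\n"
)


def parse_checksums(text):
    checksums = {}
    in_section = False
    for line in text.splitlines():
        if line == "CHECKSUMS":
            in_section = True
            continue
        if not in_section:
            continue
        if not line.startswith("  "):
            break  # a blank line or the next section ends CHECKSUMS
        spec, _, digest = line.strip().rpartition(" sha256=")
        if not spec:
            continue  # entry listed without a checksum
        name, _, rest = spec.partition(" (")
        checksums[(name, rest.rstrip(")"))] = digest
    return checksums


CHECKSUMS = parse_checksums(LOCKFILE_TAIL)
```

Gems listed without a digest are simply skipped, so a caller can decide whether to fail or to download them unverified.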
Thank you for this info! It seems the time is right to ask Bundler maintainers for what Bazel needs. @p0deje Perhaps we need to spec your proposal on the rubygems issues page? |
I finally reached a point where CI passes. It took a lot of trial and error to make it green given the variety of interpreter/OS combinations. What worked fine on MRI on Linux/macOS caused problems on Windows; likewise, making JRuby/TruffleRuby work took a lot of effort. Here is a general overview of the current implementation:
The implementation has several limitations at the moment:
|
#48 is ready for initial review, and I would appreciate it if anyone took a look.
Ideally the ruleset should use
rctx.download[_and_extract]
to fetch dependency gems so that: