
This adds support for "extras". #18

Merged (1 commit) Nov 16, 2017

Conversation

mattmoor (Contributor)
"Extras" are additional dependencies of a given library, which are consumed by passing the "extra" name in brackets after the distribution name, for example:

mock[docs]==1.0.1

We see this in the dependencies of several Google Cloud libraries, which depend on: googleapis_common_protos[grpc]

I've added a simple test that the dependency structure we synthesize for this kind of thing is correct via an "extras" test that has a requirements.txt of:

google-cloud-language==0.27.0

Fixes: #12
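As a rough illustration of how such a requirement decomposes, here is a minimal, stdlib-only sketch. This is not the PR's actual parser; the full PEP 508 grammar (environment markers, URLs, whitespace) is more involved:

```python
import re

def split_extras(requirement):
    # Illustrative only: split a requirement like 'mock[docs]==1.0.1'
    # into (distribution, extras, version specifier).
    match = re.match(r'^([A-Za-z0-9._-]+)(?:\[([^\]]*)\])?(.*)$', requirement)
    distro, extras, version = match.groups()
    return distro, extras.split(',') if extras else [], version

print(split_extras('mock[docs]==1.0.1'))
# -> ('mock', ['docs'], '==1.0.1')
```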

@mattmoor (Contributor Author)

Looks like my test is affected by this issue. :-/

@mattmoor (Contributor Author)

Ok, I commented out the test until the other issue is resolved, since AIUI (from @damienmg) there isn't currently a way to disable a test on particular platforms.

This should be RFAL.

```python
# dependencies to make sure they are fully satisfied.
for extra_dep in whl.dependencies(extra=extra):
    dep_distro, dep_extra = parse_requirement(extra_dep)
    if not is_possible(dep_distro, dep_extra):
```
Contributor:

This doesn't work for me with google-cloud-datastore 1.4.0.

The keys of whl_map have underscores (e.g. "googleapis_common_protos"), as does distro. dep_distro has dashes, e.g. "googleapis-common-protos". So if an extra depends on something with dashes, it fails to find it in whl_map, even though the version with underscores is there. For google-cloud-datastore this occurs because, e.g., proto-google-cloud-datastore-v1[grpc] depends on googleapis-common-protos[grpc].

See workaround: drigz/rules_python@2d6ce0bf
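One way to sidestep the mismatch (and presumably what the linked workaround does; I have not verified it) is to normalize both sides before the lookup, PEP 503-style:

```python
import re

def normalize(name):
    # PEP 503: runs of '-', '_' and '.' in distribution names are
    # equivalent; compare with a single canonical separator, lowercased.
    return re.sub(r'[-_.]+', '-', name).lower()

# Both spellings now map to the same whl_map key:
print(normalize('googleapis_common_protos'))  # googleapis-common-protos
print(normalize('googleapis-common-protos'))  # googleapis-common-protos
```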

Contributor Author:

thanks

```
@@ -0,0 +1 @@
+google-cloud-language==0.27.0
```

FYI the 0.27.0, 0.28.x, and 0.29 releases all had wide-ranging changes to deps, and various problems with deps, unrelated to the issue at hand.

```
@@ -93,6 +48,47 @@ def repository_name(self):
  parser.add_argument('--directory', action='store',
                      help=('The directory into which to put .whl files.'))

def determine_possible_extras(whls):
```

Can you document the argument and return value?

Contributor Author:

Yeah, this definitely needed a docstring.
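For readers following along, here is a toy model of what determine_possible_extras computes. The dict-based wheel representation is an assumption for illustration only; the real helper works from wheel metadata and is not this simple:

```python
def determine_possible_extras(whls):
    """Determines which extras can be satisfied by the given wheels.

    Simplified, illustrative model (not the real implementation).

    Args:
      whls: dict mapping distribution name to a dict of
        {extra name: list of distributions that extra requires}.

    Returns:
      dict mapping distribution name to the set of extras whose
      requirements are all present among the given wheels.
    """
    return {
        distro: {
            extra for extra, deps in extras.items()
            if all(dep in whls for dep in deps)
        }
        for distro, extras in whls.items()
    }

whls = {
    'googleapis_common_protos': {'grpc': ['grpcio']},
    'grpcio': {},
}
# 'grpc' is possible because grpcio was also downloaded.
print(determine_possible_extras(whls))
```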

```
    for whl in whls
}

def parse_requirement(name):
  if '[' not in name:
```

Can you use pkg_resources.Requirement.parse(name) instead? This might be a little brittle.

Contributor Author:

Done, thanks for the pointer.
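For reference, the suggested pkg_resources API (it ships with setuptools) handles the full requirement grammar that a hand-rolled bracket check would not:

```python
from pkg_resources import Requirement

# Requirement.parse understands extras, version specifiers, etc.,
# which is why it is less brittle than splitting on '['.
req = Requirement.parse('mock[docs]==1.0.1')
print(req.project_name)  # mock
print(req.extras)        # ('docs',)
```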

```
@@ -63,13 +63,12 @@ def metadata(self):
  def name(self):
    return self.metadata().get('name')

-  def dependencies(self):
+  def dependencies(self, extra=None):
```

Can you document that passing extra=None means the dependencies of the base wheel?

Contributor Author:

Done
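A sketch of the documented contract, with a stand-in Wheel class whose hard-coded metadata is purely illustrative (the real method reads the wheel's own metadata):

```python
class Wheel(object):
    """Minimal stand-in for the real whl.Wheel, for illustration only."""

    def metadata(self):
        # Hard-coded sample: base deps have no 'extra' key;
        # extra-specific deps carry the extra's name.
        return {
            'run_requires': [
                {'requires': ['protobuf>=3.0.0']},
                {'extra': 'grpc', 'requires': ['grpcio>=1.0.0']},
            ]
        }

    def dependencies(self, extra=None):
        """Yields dependency names.

        Args:
          extra: name of the extra whose dependencies to yield, or
            None (the default) for the dependencies of the base wheel.
        """
        for entry in self.metadata().get('run_requires', []):
            if entry.get('extra') == extra:
                for req in entry.get('requires', []):
                    yield req

whl = Wheel()
print(list(whl.dependencies()))              # base dependencies only
print(list(whl.dependencies(extra='grpc')))  # deps the extra adds
```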

@mattmoor (Contributor Author) left a comment:

Should all be done. thanks, and sorry for the delayed response.


@mattmoor (Contributor Author)

The failure here was the same one we were seeing with Flask. I'm rebasing this on this PR to pick up @duggelz's fixes, and CI is already clean on OSX!

THANKS @duggelz YOU ARE A MAGICIAN :)

@mattmoor force-pushed the just-par branch 2 times, most recently from 6b26ef9 to b519c4d on November 16, 2017 at 20:22
@mattmoor (Contributor Author)

@duggelz This should be RFAL. I just rebased it on the other merged PR.

@duggelz left a comment:

Approved with some comments/questions.

```
'requirement("%s")' % d
for d in whl.dependencies()
])))

    name = "pkg",
```

Is this an intentional change from 2 to 4 space indent, because it goes into a BUILD file?

Side note: I indent everything open-source with 4 spaces, instead of the Google 2, except when I forget.

Contributor Author:

It's been a while, so let's say yes :)

Yeah, the external emacs mode screws with me. I didn't realize the world didn't do 2 space.

```
'requirement("%s")' % d
for d in whl.dependencies()
]),
extras='\n\n'.join([
```

This is a pretty long list comprehension, but I'll let it slide.


```
licenses(["notice"])  # Apache 2.0

# TODO(mattmoor): Enable testing once we resolve:
```

Does this test work yet or no?

Contributor Author:

LOL, it should, the hard part was already uncommented. Thanks for catching this.

"Extras" are additional dependencies of a given library, which are consumed by passing the "extra" name in brackets after the distribution name, for example:
```
mock[docs]==1.0.1
```

We see this in the dependencies of several Google Cloud libraries, which depend on: `googleapis_common_protos[grpc]`

I've added a simple test that the dependency structure we synthesize for this kind of thing is correct via an "extras" test that has a `requirements.txt` of:
```
google-cloud-language==0.27.0
```

Fixes: bazelbuild#12
@mattmoor (Contributor Author)

Thanks @duggelz

@mattmoor merged commit 3e167dc into bazelbuild:master Nov 16, 2017
@hutchk commented Nov 22, 2017

I just commented out both "requirement" deps from the BUILD file and the test still passes... Is that expected?

```
macbook-pro:extras hk$ git diff .
diff --git a/examples/extras/BUILD b/examples/extras/BUILD
index 94880ce..bc22772 100644
--- a/examples/extras/BUILD
+++ b/examples/extras/BUILD
@@ -22,8 +22,8 @@ py_test(
     name = "extras_test",
     srcs = ["extras_test.py"],
     deps = [
-        requirement("google-cloud-language"),
+        # requirement("google-cloud-language"),
         # Make sure that we can resolve the "extra" dependency
-        requirement("googleapis-common-protos[grpc]"),
+        # requirement("googleapis-common-protos[grpc]"),
     ],
 )
macbook-pro:extras hk$ bazel test :extras_test
INFO: Found 1 test target...
Target //examples/extras:extras_test up-to-date:
  bazel-bin/examples/extras/extras_test
INFO: Elapsed time: 0.116s, Critical Path: 0.00s
//examples/extras:extras_test                                   (cached) PASSED in 0.2s

Executed 0 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
```

@drigz (Contributor) commented Nov 22, 2017

@hutchk See this comment in extras_test.py:

# The test is the build itself, which should not work if extras are missing.

@hutchk commented Nov 23, 2017

@drigz So I commented out the extras in the BUILD file. Shouldn't the build fail?

@drigz (Contributor) commented Nov 23, 2017

@hutchk As I understand it, the test is that, if you have a requirement with an extra (in this case "googleapis-common-protos[grpc]"), the build succeeds. In Python, if a dependency is not in the BUILD file, you will only notice at runtime, but this test doesn't care about what's available at runtime (as stated in the comment).

Therefore, deleting the requirement from the BUILD file is a bit like deleting the assert from a normal test - you turn it into a no-op.

@hutchk commented Nov 26, 2017

@drigz Thanks. At least to me it wasn't immediately obvious what is being done under the hood, which may be a combination of: a) my lack of proficiency in Python, b) my lack of understanding of Bazel, and c) not knowing how pip_import works with transitive dependencies.

I've been trying to wrap my head around and would appreciate some confirmation if possible...

Questions:

  1. In requirements.txt for example, even though we've specified a single line "google-cloud-language==0.27.0", pip_import will download all of the package dependencies (transparently), including googleapis-common-protos[grpc]? And this is what Issue 35 is trying to address?

  2. The test seems to only address that the build system is able to parse (and not generate an error) the following from the BUILD file:
    requirement("googleapis-common-protos[grpc]").
    However, if the underlying implementation in determine_possible_extras() imported the incorrect dependencies or didn't work at all, then the test would still pass, is this correct? If so, I guess this was the intention to begin with and we didn't mean to fully test the implementation?

  3. (slightly off topic) Is there a Bazel query I can use to see which pip modules are available (or would be downloaded) for a particular target?

  4. (slightly off topic) Would you be able to provide best practice advice on how to organize my dependencies?
    For instance, I was thinking of having the following in my WORKSPACE for importing all pip requirements.

```
pip_import(
    name = "third_party_py",
    requirements = "//third_party:python_requirements.txt",
)
```

But I'm worried that it won't be maintainable and I don't know all the caveats. Perhaps I should just have a unique requirements.txt file for each target or package?

  5. (slightly off topic) Would it be feasible/recommended to create a unique py_library build target for each 3rd party dependency? This way, whether I'm using pip_import underneath or building from source would be transparent to the user?
    For example, in //third_party/python I could have the following BUILD file:
```
py_library(
    name = "google_cloud_language",
    srcs = [ ??? ],
    deps = [
        requirement("google-cloud-language"),
        requirement("googleapis-common-protos[grpc]"),
    ],
)
```

This way, the user of //third_party/python:google_cloud_language would not need to determine what other requirements they need to pull in.

Thanks in advance and sorry for the long post.

@drigz (Contributor) commented Nov 27, 2017

This is all off-topic for comments on a pull request, but I'll try to respond anyway. In future, the other Bazel support forums (SO, IRC, bazel-discuss@) would be more appropriate.

I'm hoping @mattmoor will correct me if I'm wrong on any of this:

> In requirements.txt for example, even though we've specified a single line "google-cloud-language==0.27.0", pip_import will download all of the package dependencies (transparently), including googleapis-common-protos[grpc]?

Correct. Before this pull request (#18), others would have been downloaded but googleapis-common-protos[grpc] would have been ignored.

> And this is what Issue 35 is trying to address?

As I read it, #35 is about allowing the user to specify transitive dependencies in the WORKSPACE rather than having them "magically" downloaded, so that the WORKSPACE can specify a single version to resolve conflicts.

For example, you depend on google-cloud-language==0.27.0 which depends on google-cloud-core==0.27.0, but you also depend on google-cloud-datastore==0.26.0 which depends on google-cloud-core==0.26.0. If google-cloud-core==0.27.0 were actually compatible with both, you'd want to be able to choose that yourself, rather than downloading both versions.
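In requirements.txt terms, the resolution described above might look like this (the version numbers are taken from the example, not checked against PyPI):

```
google-cloud-language==0.27.0
google-cloud-datastore==0.26.0
# Pin the shared transitive dependency to a single version:
google-cloud-core==0.27.0
```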

> However, if the underlying implementation in determine_possible_extras() imported the incorrect dependencies or didn't work at all, then the test would still pass, is this correct? If so, I guess this was the intention to begin with and we didn't mean to fully test the implementation?

Correct. Based on the comment in extras_test.py, I assume it was intentional.

> Perhaps I should just have a unique requirements.txt file for each target or package?

I don't know if there's a best practice here. I've been using multiple requirements.txt files so that it's clear why each requirement is in the file. However, I'm worried about what happens when I start to get version conflicts...

> Would it be feasible/recommended to create a unique py_library build target for each 3rd party dependency?

I don't see a need to do it. You shouldn't need to manually list googleapis-common-protos[grpc] - I think that's just done for the test - so requirement("google-cloud-language") should be enough.

@mattmoor (Contributor Author)

@drigz thanks for chiming in while I've been out. I think you've been right on about almost everything, but a few clarifications below.

> but googleapis-common-protos[grpc]

Specifically, googleapis-common-protos would have been pulled in, but any dependencies required by the grpc extra would have been ignored.

> if the underlying implementation in determine_possible_extras() imported the incorrect dependencies or didn't work at all, then the test would still pass, is this correct?

It should not still pass if it doesn't import it at all, because the py_library target for requirement("googleapis-common-protos[grpc]") would not exist, which is what I wanted to cover.

It wasn't my intent to provide enough functional coverage to verify that we include the grpc extra's dependencies correctly.

> I don't know if there's a best practice here.

You should use a single requirements.txt per binary target (multiple binary targets may recycle the same requirements.txt). The reason for this is to avoid exactly the problem you describe, which is that distinct resolutions may produce different versions. This can happen even if you use ==, if you only specify your direct dependencies (which likely depend on transitive deps through inequalities, which AIUI is a pip best practice).

We debated making this a requirement (no pun intended) by taking away the ability to name the rule, but erred on the side of flexibility (for better or for worse).
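Concretely, the one-requirements.txt-per-binary pattern might look like this in a WORKSPACE (the repository and target names here are illustrative, and the load/pip_install idiom follows rules_python's README of that era):

```
pip_import(
    name = "server_py_deps",
    requirements = "//server:requirements.txt",
)

load("@server_py_deps//:requirements.bzl", "pip_install")
pip_install()
```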

> Would it be feasible/recommended to create a unique py_library build target for each 3rd party dependency?

That's exactly what requirement("foo[bar]") points you at. If you produce a single binary, then this level of specificity is probably overkill. If you use a single requirements.txt for multiple binaries, then you get consistent external dependency resolution, but may only want a subset of the external dependencies included in a single binary's runfiles (or PAR/zip).
