State of the CI #3924

Open
apostasie opened this issue Feb 25, 2025 · 5 comments
Labels
area/ci e.g., CI failure

Comments

@apostasie
Contributor

Description

In the past week(s) or so, the CI has been increasingly unstable.

While the root cause is unclear, I suspect GitHub networking is under pressure, or maybe we are in a tier that is somehow throttled?

Specifically:

Maybe these are all related (e.g., GitHub networking is degraded), or maybe they are not and it is a coincidence.

One way or the other, we are getting to the point where it is hard to get a green build on the first try (on top of our test flakiness, which is a long-fought battle of its own).

While some easy, localized actions can be taken to reduce outbound traffic for routine operations (#3915), I only see two possibilities moving forward:
a. we get some help / information from GitHub about network quality
b. we take a serious, hard look at how we do things and significantly reduce our outbound dependencies

I do not have an insider contact for a. Does anyone have one?

For b.:

  • we could consider dropping our reliance on Docker Hub entirely and hosting everything we need on ghcr instead (on the assumption that it will be better), and systematically hunt down and remove unneeded outbound traffic (e.g., golangci)
  • we rethink both the way we build and use our base image and the way we use the GitHub cache, which has been a growing PITA (quota, sluggishness)

Do people have thoughts about all this?

Steps to reproduce the issue

Describe the results you received and expected

na

What version of nerdctl are you using?

main

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

None

Host information

No response

@apostasie apostasie added the kind/unconfirmed-bug-claim Unconfirmed bug claim label Feb 25, 2025
@apostasie
Contributor Author

Mmmm... looks like Ubuntu is indeed having issues right now (ns2.canonical.com is down)

@AkihiroSuda AkihiroSuda added area/ci e.g., CI failure and removed kind/unconfirmed-bug-claim Unconfirmed bug claim labels Feb 25, 2025
@AkihiroSuda
Member

we could consider getting rid entirely of reliance on Docker Hub and host everything we need on ghcr instead

👍

@apostasie
Contributor Author

More data:

Something like TestIssue3425 taking over a minute does not make any sense either.

Or even something like TestHostNetworkHostName at 11.23s.

It is very hard not to think that we are throttled.

github.com/containerd/nerdctl/v2/cmd/nerdctl/builder TestBuilder 1m39.85s
github.com/containerd/nerdctl/v2/cmd/nerdctl/builder TestBuilder/WithPull 1m19.66s
github.com/containerd/nerdctl/v2/cmd/nerdctl/issues TestIssue3425 1m1.51s
github.com/containerd/nerdctl/v2/cmd/nerdctl/image TestRemove 55.4s
github.com/containerd/nerdctl/v2/cmd/nerdctl/builder TestBuilder/WithPull/pull_true 44.79s
github.com/containerd/nerdctl/v2/cmd/nerdctl/ipfs TestIPFSAddrWithKubo 37.24s
github.com/containerd/nerdctl/v2/cmd/nerdctl/issues TestIssue3425/with_ipfs 27.39s
github.com/containerd/nerdctl/v2/cmd/nerdctl/image TestImagePrune 27.28s
github.com/containerd/nerdctl/v2/cmd/nerdctl/system TestSystemPrune/volume_prune_all_success 24.4s
github.com/containerd/nerdctl/v2/cmd/nerdctl/system TestEventFilters/UnsupportedEventFilter 23.45s
github.com/containerd/nerdctl/v2/cmd/nerdctl/system TestEventFilters/UnsupportedStatusFilter 23.43s
github.com/containerd/nerdctl/v2/cmd/nerdctl/system TestEventFilters/StatusFilter 23.43s
github.com/containerd/nerdctl/v2/cmd/nerdctl/system TestEventFilters/CapitalizedFilter 23.41s
github.com/containerd/nerdctl/v2/cmd/nerdctl/image TestImages 19.59s
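
For comparing runs over time, a listing in this shape can be parsed and filtered mechanically. The following is only a minimal sketch, assuming each line is exactly `<package> <test> <duration>` separated by whitespace; the names `timing`, `parseTimings` and `slowerThan` are hypothetical and not part of nerdctl or its test tooling, and the 30-second threshold is an arbitrary choice for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
	"time"
)

// timing is one "<package> <test> <duration>" line from the listing.
// (Hypothetical type, introduced only for this sketch.)
type timing struct {
	Package  string
	Test     string
	Duration time.Duration
}

// parseTimings reads whitespace-separated "<package> <test> <duration>" lines.
// Blank lines are skipped; any other line without exactly three fields is an error.
func parseTimings(input string) ([]timing, error) {
	var out []timing
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		if len(fields) != 3 {
			return nil, fmt.Errorf("unexpected line: %q", sc.Text())
		}
		// time.ParseDuration accepts values such as "1m39.85s" and "55.4s".
		d, err := time.ParseDuration(fields[2])
		if err != nil {
			return nil, fmt.Errorf("bad duration in %q: %w", sc.Text(), err)
		}
		out = append(out, timing{Package: fields[0], Test: fields[1], Duration: d})
	}
	return out, sc.Err()
}

// slowerThan returns the entries at or above threshold, slowest first.
func slowerThan(ts []timing, threshold time.Duration) []timing {
	var slow []timing
	for _, t := range ts {
		if t.Duration >= threshold {
			slow = append(slow, t)
		}
	}
	sort.SliceStable(slow, func(i, j int) bool {
		return slow[i].Duration > slow[j].Duration
	})
	return slow
}

func main() {
	// Three lines copied from the listing above.
	const sample = `
github.com/containerd/nerdctl/v2/cmd/nerdctl/issues TestIssue3425 1m1.51s
github.com/containerd/nerdctl/v2/cmd/nerdctl/image TestImages 19.59s
github.com/containerd/nerdctl/v2/cmd/nerdctl/builder TestBuilder 1m39.85s
`
	ts, err := parseTimings(sample)
	if err != nil {
		panic(err)
	}
	// The 30s threshold is an arbitrary assumption for illustration.
	for _, t := range slowerThan(ts, 30*time.Second) {
		fmt.Printf("%-10s %s %s\n", t.Duration, t.Package, t.Test)
	}
}
```

Feeding it the listings from several CI runs would make it easier to tell whether the same tests are consistently slow or whether the slowness moves around, which is what one would expect if the cause were throttling.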

@apostasie
Contributor Author

The GitHub cache is (also) clearly giving us a birdy.

#13 importing cache manifest from gha:16802566802771389591
#13 ERROR: maximum timeout reached: Request was blocked due to exceeding usage of resource 'Count' in namespace ''

@apostasie
Contributor Author

Linking here scattered issues that pertain to this overall:
