timeouts preventing installer from completing properly #843
Comments
I'm seeing a similar issue during install. Also using libvirt.
I was consistently getting the exact same error messages as the originally reported issue #843 (comment).
What I did that ended up working was:
I'm not really sure which steps were required, but I suspect previous retry attempts got something in a weird state and --dir was the trick that enabled the freshest bits to work properly.
I also hit this when creating a cluster on AWS.
After destroying the cluster and removing the demo directory, re-running "./openshift-install create cluster --dir demo" worked well.
There is a lot of duplicate discussion of this one, e.g. #857. The install I just did also shows the However:
So I suspect the problem is with the logic for how we retry in the installer while waiting for the event.
I also saw this today, like @cgwalters, and was unable to bring a cluster up:
I think this is because your pull secret is out of date; update it from try.openshift.com.
I changed the cluster name and --dir to completely new ones, and the install succeeded without this error with the version below:
Getting new auth from try.openshift.com fixed this for me as well. Having feedback that this is the cause would be very welcome to this new user. It is also unclear how to re-enter the new auth. Is there a way to re-prompt during openshift-install create cluster? The --help doesn't indicate anything.
Hopefully at this point most or all of these have been resolved. If there are further bugs, please open a Bugzilla.
Version
Platform (aws|libvirt|openstack):
libvirt
What happened?
Installer did not complete, hit timeouts:
ERROR: logging before flag.Parse: E1207 18:24:44.544541 16824 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=3, ErrCode=NO_ERROR, debug=""
WARNING RetryWatcher - getting event failed! Re-creating the watcher. Last RV: 210
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.10:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.11:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.11:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.11:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.11:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.10:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.10:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.10:6443: connect: connection refused
WARNING Failed to connect events watcher: Get https://test1-api.tt.testing:6443/api/v1/namespaces/kube-system/events?resourceVersion=210&watch=true: dial tcp 192.168.126.11:6443: connect: connection refused
FATAL Error executing openshift-install: waiting for bootstrap-complete: watch closed before UntilWithoutRetry timeout
I also ran
oc get pods --all-namespaces
and got:
$ oc get pods --all-namespaces
NAMESPACE                   NAME                                        READY   STATUS    RESTARTS   AGE
openshift-cluster-version   cluster-version-operator-5bd8d79d6c-nnj92   0/1     Pending   0          25m
What you expected to happen?
I expected the installer to complete successfully and not time out.
How to reproduce it (as minimally and precisely as possible)?
I just ran the installer.
References
I have been experiencing sporadic timeouts (though not during install) as described here: openshift/origin#21612