Rolling upgrades between major versions for a 2-node Elasticsearch cluster can be impossible. If the first node that is upgraded is the master node, then the rolling upgrade gets stuck because we never reach a state where we have two nodes in the cluster. The cluster health stays yellow. The second node can't join the cluster formed by the first upgraded node.
Log from the second node:
{
"type": "server",
"timestamp": "2022-02-03T09:29:04,631Z",
"level": "WARN",
"component": "o.e.c.c.JoinHelper",
"cluster.name": "test-version-up-2-to-8x-wddt",
"node.name": "test-version-up-2-to-8x-wddt-es-masterdata-0",
"message": "last failed join attempt was 7ms ago, failed to join {test-version-up-2-to-8x-wddt-es-masterdata-1}{Dm1aKtG4QtqAWdG1qYhxrg}{66D72IvmSR2LTi77eKbd_g}{10.42.176.208}{10.42.176.208:9300}{cdfhilmrstw}{k8s_node_name=gke-thbkrkr-dev-cluster-default-pool-9292dcc0-3h08, ml.machine_memory=2147483648, xpack.installed=true, ml.max_jvm_size=1073741824} with JoinRequest{sourceNode={test-version-up-2-to-8x-wddt-es-masterdata-0}{A50qiG3DRo-dL0toCbiQxA}{_tnSXZa1QOGe97A63qoKDw}{10.42.177.120}{10.42.177.120:9300}{cdfhilmrstw}{k8s_node_name=gke-thbkrkr-dev-cluster-default-pool-a617668f-p82k, ml.machine_memory=2147483648, xpack.installed=true, transform.node=true, ml.max_open_jobs=512, ml.max_jvm_size=1073741824}, minimumTerm=2, optionalJoin=Optional[Join{term=2, lastAcceptedTerm=1, lastAcceptedVersion=65, sourceNode={test-version-up-2-to-8x-wddt-es-masterdata-0}{A50qiG3DRo-dL0toCbiQxA}{_tnSXZa1QOGe97A63qoKDw}{10.42.177.120}{10.42.177.120:9300}{cdfhilmrstw}{k8s_node_name=gke-thbkrkr-dev-cluster-default-pool-a617668f-p82k, ml.machine_memory=2147483648, xpack.installed=true, transform.node=true, ml.max_open_jobs=512, ml.max_jvm_size=1073741824}, targetNode={test-version-up-2-to-8x-wddt-es-masterdata-1}{Dm1aKtG4QtqAWdG1qYhxrg}{66D72IvmSR2LTi77eKbd_g}{10.42.176.208}{10.42.176.208:9300}{cdfhilmrstw}{k8s_node_name=gke-thbkrkr-dev-cluster-default-pool-9292dcc0-3h08, ml.machine_memory=2147483648, xpack.installed=true, ml.max_jvm_size=1073741824}}]}",
"cluster.uuid": "En4wbZQ-Ru-J0IXK-Ysl0g",
"node.id": "A50qiG3DRo-dL0toCbiQxA" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [test-version-up-2-to-8x-wddt-es-masterdata-1][10.42.176.208:9300][internal:cluster/coordination/join]",
"Caused by: java.lang.IllegalStateException:
node version [7.17.0] may not join a cluster comprising only nodes of version [8.0.0] or greater",
"at org.elasticsearch.cluster.coordination.JoinTaskExecutor.ensureVersionBarrier(JoinTaskExecutor.java:325) ~[elasticsearch-7.17.0-SNAPSHOT.jar:7.17.0-SNAPSHOT]",
"at org.elasticsearch.cluster.coordination.Coordinator.validateJoinRequest(Coordinator.java:585) ~[elasticsearch-7.17.0-SNAPSHOT.jar:7.17.0-SNAPSHOT]",
"at org.elasticsearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$9(Coordinator.java:556) ~[elasticsearch-7.17.0-SNAPSHOT.jar:7.17.0-SNAPSHOT]",
"at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) ~[elasticsearch-7.17.0-SNAPSHOT.jar:7.17.0-SNAPSHOT]",
...
I think we should extend the condition for "forced upgrades" to include 2-node clusters and not restrict it to major version upgrades. There will always be a loss of availability on a two-node cluster when upgrading, so it does not make sense to try to orchestrate it gracefully.
This seems to only affect major version upgrades; I have not been able to reproduce the issue on a minor upgrade.
I was originally tempted to do all cluster changes on single- and two-node clusters in a full-restart fashion, but there is an argument for sticking to rolling upgrades for most changes: even after the cluster breaks down, individual nodes can still serve partial search results (depending on shard placement, of course).
YAML manifest to reproduce:
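The manifest itself was not captured here, so the following is only a minimal sketch of an ECK Elasticsearch resource matching the scenario: the cluster name, nodeSet name, and node count are taken from the log above, while starting on 7.17.0 and bumping to 8.0.0 is an assumption based on the error message.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: test-version-up-2-to-8x-wddt
spec:
  # Start on 7.17.0; changing spec.version to 8.0.0 afterwards triggers
  # the rolling major-version upgrade that can get stuck.
  version: 7.17.0
  nodeSets:
  - name: masterdata
    count: 2
    config:
      # Typical for dev clusters without vm.max_map_count tuning.
      node.store.allow_mmap: false

After the cluster is green, changing spec.version to 8.0.0 should reproduce the stuck upgrade whenever the first node the operator restarts happens to be the elected master.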
ECK should do a full cluster restart on clusters with 1 or 2 voting/master-eligible nodes.