[OKD SCOS 4.17] Only 4 CPU cores are recognized, no matter the actual number of cores of the Node #2119
This problem seems related to hyperthreading. Can you start a terminal on your SNO node (using
Additionally, I noticed that you posted your initial
Hello @melledouwsma and thank you for your reply. I executed the commands and this is the output:
I have recreated the cluster, so the credentials do not matter any more. Thanks for pointing it out :-)
I also ran into this issue with 4.17. The workaround we've just now implemented is to deploy 2 additional MachineConfig objects, one per role (`$role` = `master` and `worker`). When they are deployed, OpenShift will cordon/drain and reboot the nodes; when they come back online, SMT is enabled.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: $role
  name: 99-$role-enable-hyperthreading
spec:
  kernelArguments:
    - mitigations=auto
```

This results in `/proc/cmdline` having `mitigations=auto` at the end, which overrides the earlier `mitigations=auto,nosmt`. I believe this issue stems from a change in CoreOS from `mitigations=auto` to `mitigations=auto,nosmt`; see for example this discussion: coreos/fedora-coreos-tracker#181. I don't know exactly when that CoreOS change happened or when it trickled into OpenShift/OKD.

I think another consequence of this is that the minimum vCPU requirement for control plane nodes doubled with OKD 4.17. I've installed OKD 4.15 on AWS successfully with both t3.xlarge and m6i.xlarge control plane nodes (both 4 vCPU and 16 GB RAM), but with the OKD 4.17 version of openshift-install the installation kept failing until I increased the control plane EC2 instance type to c6i.2xlarge (8 vCPU and 16 GB RAM).

I've reported this issue to Red Hat as well, because as of right now the `hyperthreading: Enabled` setting in the openshift-install install-config.yaml doesn't seem to have any effect: the resulting machines run with `mitigations=auto,nosmt` instead of `mitigations=auto`. Common Linux distros like RHEL (and its derivatives) and Amazon Linux all run with `mitigations=auto` by default, but CoreOS decided to disable SMT by default.
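The override works because the kernel honors the last occurrence of a repeated boot parameter. A quick way to see which value wins, shown here against a hypothetical cmdline string (on a real node, read `/proc/cmdline` instead, e.g. via `oc debug node/<node>`):

```shell
# Hypothetical kernel command line; on a node, use: cat /proc/cmdline
cmdline="BOOT_IMAGE=(hd0,gpt3)/ostree/... mitigations=auto,nosmt root=UUID=... mitigations=auto"

# The kernel applies the LAST occurrence of a repeated parameter,
# so the final mitigations= entry is the effective one:
echo "$cmdline" | tr ' ' '\n' | grep '^mitigations=' | tail -n1
# → mitigations=auto
```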
@bergner thank you very much for your reply.
I have a bastion host on AWS, and I did the following to prepare for the creation of a new SNO OKD v4.17 SCOS node:
I used this install-config.yaml file (the "t3a.2xlarge" EC2 instance type has 8 vCPUs and 32 GB of memory):
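For reference, the SMT-related setting in install-config.yaml is the `hyperthreading` field on each machine pool. A minimal illustrative fragment (field names are from the OpenShift installer configuration schema; the rest of the config is omitted here):

```yaml
controlPlane:
  name: master
  hyperthreading: Enabled   # the default; intended to keep SMT on
compute:
- name: worker
  hyperthreading: Enabled
```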
Then, I created the SNO OKD node:
Result output:
So far, everything seems good.
The Bug:
When I log in to the OKD web console, I see in the Cluster Utilization section that only 4 CPU cores are available, instead of 8. I tried the same process with the "m5.2xlarge" EC2 instance type; still, only 4 cores are recognized instead of 8.
This is indeed a problem, because I also tested the same install configuration and EC2 type with an OpenShift installation, and OpenShift recognizes all 8 CPU cores.
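The exact count of 4 is consistent with SMT being disabled by a `nosmt` kernel argument: these instance types expose 8 vCPUs as 4 physical cores with 2 threads each, so with SMT off only one thread per core is online. A sanity check of the arithmetic (the topology values below are assumptions based on AWS's published instance specs):

```shell
# t3a.2xlarge / m5.2xlarge topology (per AWS specs): 4 physical cores, 2 threads each
physical_cores=4
threads_per_core=2
echo "SMT off: $((physical_cores)) CPUs online"
echo "SMT on:  $((physical_cores * threads_per_core)) CPUs online"
# → SMT off: 4 CPUs online
# → SMT on:  8 CPUs online
```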
Could you suggest how to fix this issue?