feat: support NodeCountScaler #7258

Merged

merged 14 commits from support/node-aware-scaler into main on May 7, 2024

Conversation

free6om (Contributor) commented May 7, 2024

fixed #6552
fixed #5795

Background

The requirement originated from the issues linked above.
When the requirement was raised, the KB ClusterDefinition API had a WorkloadType field, and the underlying implementation relied on the native Kubernetes StatefulSet API when the WorkloadType was Consensus, Replication, or Stateful. In that architecture, the requirement could have been fulfilled by introducing a new Daemon-like WorkloadType.
Starting from version 0.8, KB introduced a new API architecture that no longer has a WorkloadType field. In the final implementation, all components are backed by the InstanceSet, a common workload API that directly manages more fundamental Kubernetes objects such as Pods and PVCs.
In the new architecture, adding support for Daemon-style workloads is no longer as straightforward as before. The specific design is as follows.

Design

Goals

  1. KB supports the ability to be aware of the number of nodes, i.e., dynamically adjusting the Replicas value of certain ComponentSpecs under the Cluster.
  2. KB supports enforcing a one-to-one relationship between Pods and Nodes through settings such as Pod anti-affinity, Node affinity, and taint tolerations.
  3. KB supports elevating the priority of Pods by setting PriorityClassName, reducing the risk of Pods being evicted due to insufficient Node resources.

The above three points are the core features of DaemonSet.
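
For context, the pod-level settings that goals 2 and 3 rely on look roughly like the following core/v1 structs; this is a minimal sketch, and the selector labels, toleration, and priority class shown here are illustrative values only, not KB defaults:

// Illustrative only: values are examples, not KB defaults.
package example

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// daemonLikePodSpec sketches the scheduling-related Pod settings behind goals 2 and 3.
func daemonLikePodSpec() corev1.PodSpec {
	return corev1.PodSpec{
		// Goal 2: at most one Pod per Node, via required pod anti-affinity on the hostname topology key.
		Affinity: &corev1.Affinity{
			PodAntiAffinity: &corev1.PodAntiAffinity{
				RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
					LabelSelector: &metav1.LabelSelector{
						MatchLabels: map[string]string{"app.kubernetes.io/instance": "my-cluster"},
					},
					TopologyKey: corev1.LabelHostname, // "kubernetes.io/hostname"
				}},
			},
		},
		// Goal 2: tolerate taints so Pods can be scheduled onto every Node (broadest possible example).
		Tolerations: []corev1.Toleration{{
			Operator: corev1.TolerationOpExists,
		}},
		// Goal 3: raise Pod priority to reduce the risk of eviction under Node resource pressure.
		PriorityClassName: "system-cluster-critical", // example; any suitable PriorityClass works
	}
}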

API

Points 2 and 3 are already supported by KB and can be configured through the relevant fields of APIs such as Cluster.
Point 1 is achieved by introducing an experimental API called NodeCountScaler.

apiVersion: experimental.kubeblocks.io/v1alpha1
kind: NodeCountScaler
metadata:
  name: my-cluster-scaler
spec:
  targetClusterName: "my-cluster"
  targetComponentNames: ["component-0"]
status:
  componentStatuses:
  - name: "component-0"
    currentReplicas: 3
    readyReplicas: 3
    availableReplicas: 3
    desiredReplicas: 3
  conditions:
  - type: ScaleReady
    status: "True"
    reason: ScaleReady
    message: "scale ready"
    lastTransitionTime: "2024-05-07T16:10:01Z"
  lastScaleTime: "2024-05-07T16:05:15Z"
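
For reference, a rough sketch of the Go types such a manifest implies, assuming the field names simply mirror the YAML above; the authoritative definitions are the ones added under the experimental API group in this PR:

// Sketch only; field names assumed from the manifest above.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

type NodeCountScalerSpec struct {
	// TargetClusterName is the name of the Cluster whose components are scaled.
	TargetClusterName string `json:"targetClusterName"`
	// TargetComponentNames lists the ComponentSpecs whose Replicas should track the Node count.
	TargetComponentNames []string `json:"targetComponentNames,omitempty"`
}

type ComponentStatus struct {
	Name              string `json:"name"`
	CurrentReplicas   int32  `json:"currentReplicas,omitempty"`
	ReadyReplicas     int32  `json:"readyReplicas,omitempty"`
	AvailableReplicas int32  `json:"availableReplicas,omitempty"`
	// DesiredReplicas is the number of Nodes observed by the controller.
	DesiredReplicas int32 `json:"desiredReplicas,omitempty"`
}

type NodeCountScalerStatus struct {
	ComponentStatuses []ComponentStatus  `json:"componentStatuses,omitempty"`
	Conditions        []metav1.Condition `json:"conditions,omitempty"`
	LastScaleTime     metav1.Time        `json:"lastScaleTime,omitempty"`
}

type NodeCountScaler struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   NodeCountScalerSpec   `json:"spec,omitempty"`
	Status NodeCountScalerStatus `json:"status,omitempty"`
}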

Implementation

The NodeCountScaler controller watches Node and Cluster objects.
When a Create or Delete event occurs for a Node, it triggers a reconciliation of all NodeCountScaler objects.
When a Create or Update event occurs for a Cluster, it triggers a reconciliation of the corresponding NodeCountScaler object (if one exists).
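
For illustration only, this event wiring could be expressed with a controller-runtime map function roughly like the one below; the function name, the import path of the experimental API group, and the choice to fan Node events out to every NodeCountScaler are assumptions of this sketch, not necessarily the PR's exact code:

// Sketch only; names and import paths are assumptions.
package experimental

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	// assumed import path for the experimental API group introduced by this PR
	experimentalv1alpha1 "github.com/apecloud/kubeblocks/apis/experimental/v1alpha1"
)

// nodeEventsToRequests fans a Node Create/Delete event out to a reconcile request
// for every NodeCountScaler, since a change in the Node count may affect all of them.
// A Cluster Create/Update event would instead be mapped only to the NodeCountScaler
// objects whose spec.targetClusterName matches that Cluster.
func nodeEventsToRequests(ctx context.Context, reader client.Reader) []reconcile.Request {
	scalers := &experimentalv1alpha1.NodeCountScalerList{}
	if err := reader.List(ctx, scalers); err != nil {
		return nil
	}
	requests := make([]reconcile.Request, 0, len(scalers.Items))
	for i := range scalers.Items {
		requests = append(requests, reconcile.Request{
			NamespacedName: client.ObjectKeyFromObject(&scalers.Items[i]),
		})
	}
	return requests
}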

During reconciliation, the NodeCountScaler controller retrieves the current list of Nodes; the length of that list becomes DesiredReplicas.
The Replicas value of each target ComponentSpec in the target Cluster object is then updated to DesiredReplicas, achieving the goal of dynamically adjusting Replicas.
While the Cluster and its secondary resources are being reconciled, the Status field of the corresponding NodeCountScaler object is updated in real time so that the progress of the dynamic adjustment can be observed.
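
As a minimal sketch (not the PR's actual reconciler code), that scaling step could look like this, assuming the target ComponentSpec is matched by name and that its Replicas field is a plain int32:

// Sketch only; not the PR's reconciler code.
package experimental

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	appsv1alpha1 "github.com/apecloud/kubeblocks/apis/apps/v1alpha1"
)

// scaleToNodeCount sets Replicas of the target ComponentSpecs to the current Node count.
func scaleToNodeCount(ctx context.Context, c client.Client, clusterKey client.ObjectKey, targetComponents []string) error {
	nodeList := &corev1.NodeList{}
	if err := c.List(ctx, nodeList); err != nil {
		return err
	}
	desired := int32(len(nodeList.Items)) // DesiredReplicas == current Node count

	cluster := &appsv1alpha1.Cluster{}
	if err := c.Get(ctx, clusterKey, cluster); err != nil {
		return err
	}
	for i := range cluster.Spec.ComponentSpecs {
		for _, name := range targetComponents {
			if cluster.Spec.ComponentSpecs[i].Name == name {
				cluster.Spec.ComponentSpecs[i].Replicas = desired
			}
		}
	}
	// Status (currentReplicas, readyReplicas, conditions, lastScaleTime) would be
	// updated separately while the Cluster converges.
	return c.Update(ctx, cluster)
}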

Testing

  1. Create the target Cluster object
    Create the Cluster using the following command. At this time, the Replicas value of the ComponentSpec named "mysql" is 0, and the Cluster is in the Stopped state.
kbcli cluster create mytest --cluster-definition=apecloud-mysql --set replicas=0
  2. Create the NodeCountScaler object
    Run the following command to create the NodeCountScaler:
kubectl create -f -<<EOF
apiVersion: experimental.kubeblocks.io/v1alpha1
kind: NodeCountScaler
metadata:
  name: mytest
spec:
  targetClusterName: mytest
  targetComponentNames: ["mysql"]
EOF

Expected results

  1. The Replicas value of the ComponentSpec named "mysql" in the Cluster object should be updated to match the number of Nodes.
  2. After the Cluster reconciliation is complete (i.e., the Phase in the Status is Running), retrieving the NodeCountScaler object named "mytest" with kubectl should produce output like the following:
# kubectl get ncs mytest
# output should be like this:
NAME     TARGET-CLUSTER-NAME   READY   REASON     MESSAGE            LAST-SCALE-TIME
mytest   mytest                True    Ready      scale ready        4m55s

free6om added this to the Release 0.9.0 milestone May 7, 2024
free6om self-assigned this May 7, 2024
github-actions bot added the size/XXL label (Denotes a PR that changes 1000+ lines.) May 7, 2024
apecloud-bot requested a review from realzyy May 7, 2024 07:20

codecov bot commented May 7, 2024

Codecov Report

Attention: Patch coverage is 54.31034%, with 106 lines in your changes missing coverage. Please review.

Project coverage is 64.88%. Comparing base (a89c4b2) to head (07f16ba).
Report is 30 commits behind head on main.

Files Patch % Lines
controllers/experimental/tree_loader.go 42.85% 16 Missing and 8 partials ⚠️
...ntrollers/experimental/reconciler_update_status.go 70.66% 17 Missing and 5 partials ⚠️
...rollers/experimental/nodecountscaler_controller.go 0.00% 16 Missing ⚠️
controllers/experimental/cluster_handler.go 0.00% 14 Missing ⚠️
...rs/experimental/reconciler_scale_target_cluster.go 62.16% 9 Missing and 5 partials ⚠️
controllers/experimental/node_scaling_handler.go 0.00% 12 Missing ⚠️
pkg/controllerutil/util.go 75.00% 2 Missing and 1 partial ⚠️
pkg/controller/instanceset/instance_util.go 83.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7258      +/-   ##
==========================================
- Coverage   64.96%   64.88%   -0.09%     
==========================================
  Files         337      345       +8     
  Lines       41723    41936     +213     
==========================================
+ Hits        27107    27210     +103     
- Misses      12260    12345      +85     
- Partials     2356     2381      +25     
Flag Coverage Δ
unittests 64.88% <54.31%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown.


free6om changed the title from "feat: support NodeAwareScaler" to "feat: support NodeCountScaler" May 7, 2024
}
cluster, _ := object.(*appsv1alpha1.Cluster)
nodes := tree.List(&corev1.Node{})
// TODO(free6om): filter nodes that satisfy pod template spec of each component (by nodeSelector, nodeAffinity&nodeAntiAffinity, tolerations)
Contributor

If some Nodes have taints that the Component cannot tolerate, which situation will happen?

  • Some Pods are in pending status.
  • Some Nodes have multiple Pods deployed.
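
(Illustration only, not part of this PR: the TODO in the diff above hints at filtering the counted Nodes by the component's scheduling constraints. A rough sketch of toleration-based filtering, using the ToleratesTaint helper from k8s.io/api/core/v1 and hypothetical function and parameter names, could look like this.)

// Sketch only; not part of this PR. Function and parameter names are hypothetical.
package experimental

import corev1 "k8s.io/api/core/v1"

// filterTolerableNodes keeps only the Nodes whose taints are tolerated by the
// component's tolerations, so that untolerated Nodes do not inflate DesiredReplicas
// and leave Pods in Pending.
func filterTolerableNodes(nodes []*corev1.Node, tolerations []corev1.Toleration) []*corev1.Node {
	toleratesTaint := func(taint corev1.Taint) bool {
		for i := range tolerations {
			if tolerations[i].ToleratesTaint(&taint) {
				return true
			}
		}
		return false
	}
	var result []*corev1.Node
	for _, node := range nodes {
		schedulable := true
		for _, taint := range node.Spec.Taints {
			// PreferNoSchedule does not block scheduling; NoSchedule/NoExecute do.
			if taint.Effect == corev1.TaintEffectPreferNoSchedule {
				continue
			}
			if !toleratesTaint(taint) {
				schedulable = false
				break
			}
		}
		if schedulable {
			result = append(result, node)
		}
	}
	return result
}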

}
statusList = append(statusList, status)
}
instanceset.MergeList(&statusList, &scaler.Status.ComponentStatuses,
Contributor

Seems MergeList should be moved into a more suitable package.

apecloud-bot added the approved (PR Approved Test) label May 7, 2024
apecloud-bot removed the approved (PR Approved Test) label May 7, 2024
apecloud-bot added the approved (PR Approved Test) label May 7, 2024
free6om merged commit 70df1e8 into main May 7, 2024
52 checks passed
free6om deleted the support/node-aware-scaler branch May 7, 2024 11:25
free6om (Contributor, Author) commented May 7, 2024

/cherry-pick release-0.9


github-actions bot commented May 7, 2024

🤖 says: cherry pick action finished successfully 🎉!
See: https://github.com/apecloud/kubeblocks/actions/runs/8984506513

github-actions bot pushed a commit that referenced this pull request May 7, 2024
(cherry picked from commit 70df1e8)
Labels
approved PR Approved Test · area/user-interaction · size/XXL Denotes a PR that changes 1000+ lines.

Successfully merging this pull request may close these issues:

  • [Proposal] Support daemonset workloadType
  • [Features] Support for DaemonSet Workloads in KubeBlocks