Add CreateNodePool and ScaleOut actions for Cassandra #256
Conversation
@wallrj PR needs rebase
Force-pushed d1d6c3f to 42f02a1
Force-pushed 42f02a1 to 5c14447
@wallrj PR needs rebase
Automatic merge from submit-queue. Move seed pod labelling to a separate controller.
* In #256 I want to refactor nodepool.Sync so that its only responsibility is to update the NodePool status.
* The seed labelling seems like a separate concern that can live in its own module, with its own unit tests.

**Release note**:
```release-note
NONE
```
Force-pushed 5c14447 to 46059e8
rebased
/retest
This generally looks good, I'm happy to merge
pkg/apis/navigator/v1alpha1/types.go (outdated)
```diff
@@ -44,7 +44,7 @@ type CassandraClusterSpec struct {
 // CassandraClusterNodePool describes a node pool within a CassandraCluster.
 type CassandraClusterNodePool struct {
 	Name string `json:"name"`
-	Replicas int64 `json:"replicas"`
+	Replicas int32 `json:"replicas"`
```
Why are these changed to `int32`s?
I changed it to match the type of `StatefulSet.Status.ReadyReplicas`.
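For reference, the upstream field being matched (abridged from `k8s.io/api/apps/v1beta1`; only the relevant field is shown, other fields and JSON tags omitted):

```go
// StatefulSetStatus (abridged): ReadyReplicas is declared as int32
// upstream, which is the type the Replicas field now matches.
type StatefulSetStatus struct {
	// ReadyReplicas is the number of Pods created by the StatefulSet
	// controller that have a Ready condition.
	ReadyReplicas int32
	// ...other fields omitted...
}
```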
Missing corresponding change in the internal api version
Force-pushed 46059e8 to ca8ae71
```go
if actual.Namespace != expected.Namespace {
	t.Errorf("Namespace %q != %q", expected.Namespace, actual.Namespace)
}
if expected.Replicas != nil {
```
What if `expected.Replicas == nil` but `actual.Spec.Replicas != nil`? (A sketch of a symmetric check follows.)
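One way to catch that case: a symmetric nil check, sketched here as a hypothetical `assertReplicasMatch` helper (assuming both fields are `*int32`, per the diff above):

```go
package generate_test // hypothetical placement alongside the tests above

import "testing"

// assertReplicasMatch fails the test when exactly one side is nil, or
// when both are set but disagree.
func assertReplicasMatch(t *testing.T, expected, actual *int32) {
	switch {
	case expected == nil && actual != nil:
		t.Errorf("expected nil Replicas, got %d", *actual)
	case expected != nil && actual == nil:
		t.Errorf("expected Replicas %d, got nil", *expected)
	case expected != nil && actual != nil && *expected != *actual:
		t.Errorf("Replicas %d != %d", *expected, *actual)
	}
}
```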
```diff
@@ -118,3 +120,46 @@ func StatefulSet(c StatefulSetConfig) *apps.StatefulSet {
 		},
 	}
 }
+
+func AssertStatefulSetMatches(t *testing.T, expected StatefulSetConfig, actual *apps.StatefulSet) {
```
Not sure if we should put assertions into this package. `generate.AssertStatefulSetMatches` doesn't really 'flow' when read out loud. Perhaps create `internal/test/util/assert` and call `assert.StatefulSetMatches`?
Also, given this is used for unit tests, can we just use `reflect.DeepEqual` instead? This method could easily return true for two very different statefulsets with its current implementation, which if reused could cause confusion (and if this function is extended in future to check more fields, it may cause tests to fail, making it not very reusable).
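A minimal sketch of that suggestion, assuming the `internal/test/util/assert` package name proposed above and a vendored `k8s.io/api/apps/v1beta1` import path:

```go
package assert // hypothetical internal/test/util/assert package

import (
	"reflect"
	"testing"

	apps "k8s.io/api/apps/v1beta1" // import path is an assumption
)

// StatefulSetMatches fails the test unless the two StatefulSets are
// deeply equal, avoiding the partial-field comparison criticised above.
func StatefulSetMatches(t *testing.T, expected, actual *apps.StatefulSet) {
	if !reflect.DeepEqual(expected, actual) {
		t.Errorf("expected:\n%#v\ngot:\n%#v", expected, actual)
	}
}
```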
Removed that assertion and simplified the tests.
```go
ss := nodepool.StatefulSetForCluster(a.Cluster, a.NodePool)
_, err := s.Clientset.AppsV1beta1().StatefulSets(ss.Namespace).Create(ss)
if k8sErrors.IsAlreadyExists(err) {
	return nil
```
Should always return an error here if `err != nil` (these actions shouldn't be idempotent, in favour of `nextAction` choosing what to execute next sensibly).
There are a few good reasons for them to be idempotent (see the sketch after this list):
* The `CassandraCluster.Status.NodePools` map (upon which this action is calculated) may not have been updated.
* Future actions (e.g. ScaleIn) may perform a more complex sequence of steps which need to be "ratcheted" through, by repeated calls to `ScaleIn.Execute`, e.g.:
  1. Decommission the node.
  2. Decrease the statefulset replica count.
  In which case each of those intermediate steps must be idempotent, and the final step may as well be idempotent too.
* The lister used to calculate the action may not have the most up-to-date resources, so it appears there isn't a StatefulSet until you attempt to create it, which fails with AlreadyExists (although I might be making that up, because the current implementation doesn't check the lister first).

On the whole, it makes sense to me that these actions should all be idempotent, but happy to discuss further.
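A minimal sketch of the idempotent style argued for here, reusing the diffed code above (the `CreateNodePool` and `State` type names are assumptions about the surrounding code):

```go
// Execute creates the node pool's StatefulSet. AlreadyExists is treated
// as success so that repeated calls converge on the same state; every
// other error is surfaced to the caller.
func (a *CreateNodePool) Execute(s *State) error {
	ss := nodepool.StatefulSetForCluster(a.Cluster, a.NodePool)
	_, err := s.Clientset.AppsV1beta1().StatefulSets(ss.Namespace).Create(ss)
	if k8sErrors.IsAlreadyExists(err) {
		// The lister may lag the API server, so the StatefulSet can
		// already exist; the desired outcome is reached either way.
		return nil
	}
	return err
}
```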
```go
if ss.Spec.Replicas == nil || *ss.Spec.Replicas < a.NodePool.Replicas {
	ss.Spec.Replicas = &a.NodePool.Replicas
	_, err = s.Clientset.AppsV1beta1().StatefulSets(ss.Namespace).Update(ss)
	if err == nil {
```
For convention with everywhere else, let's return `err` if it is non-nil, and otherwise log the event and return nil, i.e.:
```go
if err != nil {
	return err
}
// <log event>
return nil
```
```diff
@@ -34,77 +36,65 @@ func NewControl(
 	}
 }
 
-func (e *defaultCassandraClusterNodepoolControl) removeUnusedStatefulSets(
+func (e *defaultCassandraClusterNodepoolControl) clusterStatefulSets(
```
Can you collapse the method signature to a single line?
```go
	}
} else {
	delete(cluster.Status.NodePools, np.Name)
}
```
Can you refactor out this `else`? (Turn it into an `if` with a `continue` after it, as in the sketch below.) It reduces the amount of nesting going on here.
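A sketch of the suggested flattening, where `nodePoolIsWanted` is a hypothetical predicate standing in for the real condition in the loop above:

```go
for _, np := range nodePools {
	// Invert the condition and continue, so the happy path stays at
	// one level of nesting and the else branch disappears.
	if !nodePoolIsWanted(np) {
		delete(cluster.Status.NodePools, np.Name)
		continue
	}
	// ...update the status for the wanted node pool...
}
```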
The whole function has been simplified and merged in a separate PR #299
```go
	cluster.Status.NodePools[np.Name] = nps
}
if nps.ReadyReplicas != ss.Status.ReadyReplicas {
	nps.ReadyReplicas = ss.Status.ReadyReplicas
```
It took a little while to spot that this is where we actually set the fields in the node pool status. Perhaps create a function `statusForNodePool` to handle this, which returns a NodePoolStatus structure? (Sketched below.)
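A sketch of the hypothetical `statusForNodePool` helper, deriving a node pool's status from its StatefulSet in one place (type names follow this PR's v1alpha1 API):

```go
// statusForNodePool builds the status entry for one node pool from the
// observed state of its StatefulSet.
func statusForNodePool(ss *apps.StatefulSet) v1alpha1.CassandraClusterNodePoolStatus {
	return v1alpha1.CassandraClusterNodePoolStatus{
		ReadyReplicas: ss.Status.ReadyReplicas,
	}
}
```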
```go
// (the statefulset has been removed by a DeleteNodePool action - not yet implemented)
func (e *defaultCassandraClusterNodepoolControl) Sync(cluster *v1alpha1.CassandraCluster) error {
	if cluster.Status.NodePools == nil {
		cluster.Status.NodePools = map[string]v1alpha1.CassandraClusterNodePoolStatus{}
```
Not 100% sure how maps are ordered when serialised. Perhaps we shouldn't be using a map here at all, in favour of a slice that has a `name` field? (A sketch of that shape follows.)
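A sketch of the slice-based alternative, where a named entry per node pool keeps serialisation order deterministic (any fields beyond `Name` and `ReadyReplicas` are assumptions):

```go
// CassandraClusterNodePoolStatus is one entry in the status list,
// identified by the node pool's name rather than by a map key.
type CassandraClusterNodePoolStatus struct {
	Name          string `json:"name"`
	ReadyReplicas int32  `json:"readyReplicas"`
}

// CassandraClusterStatus holds a slice instead of a map, so the
// serialised order is stable across syncs.
type CassandraClusterStatus struct {
	NodePools []CassandraClusterNodePoolStatus `json:"nodePools"`
}
```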
This version of the function was just wrong. Instead, in #299 I re-populate `CassandraCluster.Status.NodePools` each time, based on the statefulsets that are owned by the cluster and which have the expected name (sketched below).
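A sketch of that #299 approach, where `ownedStatefulSets` is a hypothetical helper returning the cluster's owned StatefulSets keyed by node pool name, and `statusForNodePool` is the helper sketched earlier:

```go
// Sync rebuilds Status.NodePools from scratch on every call, based only
// on the StatefulSets actually owned by the cluster.
func (e *defaultCassandraClusterNodepoolControl) Sync(cluster *v1alpha1.CassandraCluster) error {
	sets, err := e.ownedStatefulSets(cluster)
	if err != nil {
		return err
	}
	cluster.Status.NodePools = map[string]v1alpha1.CassandraClusterNodePoolStatus{}
	for name, ss := range sets {
		cluster.Status.NodePools[name] = statusForNodePool(ss)
	}
	return nil
}
```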
```go
}
// 25% chance of ScaleOut
if rand.Intn(4) == 0 {
	np.Replicas++
```
Will increasing it by 1 ever result in it being particularly large? Should we not just set this to some random value output by the RNG?
(maybe you are following some best practice I am not aware of though!)
The initial value is between 0 and 5 (see code above), so it should never be more than 6.
This is a way to simulate an occasional difference between `Spec.NodePool.Replicas` and `Status.NodePools.ReadyReplicas` (restated in the sketch below).
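A restatement of that fuzz logic as a short sketch (`np.Replicas` is `int32` per this PR; the draw ranges follow the explanation above):

```go
// The initial replica count is drawn from [0, 5], then incremented
// with 1-in-4 probability, so it never exceeds 6 and occasionally
// diverges from the ReadyReplicas reported in the status.
np.Replicas = rand.Int31n(6)
if rand.Intn(4) == 0 {
	np.Replicas++
}
```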
@wallrj PR needs rebase
…ned by the cluster
* Update E2E tests to check the CassandraCluster status rather than checking statefulset status.
* Issue an API UpdateStatus with the CassandraCluster.Status after running all control.Sync methods.
* Modify the RBAC role to allow the controller to update CassandraCluster.Status.
* Add a statefulset informer callback to trigger a controller sync when our statefulsets change.

Extracted from: jetstack#256. Part of: jetstack#253.
Force-pushed 490933c to 8b5ab5c
Force-pushed 8b5ab5c to 49253c8
Noticed a weird E2E test error. Maybe caused by opencontainers/runc#1326 or moby/moby#31230.
Force-pushed 49253c8 to 414be89
@wallrj: The following test failed, say `/retest` to rerun it.
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
* These become the only changes supported by the Cassandra controller.
* ScaleIn and CassandraUpgrade actions will be implemented in followup branches.
* Some initial documentation on supported configuration changes.

Fixes: jetstack#253
Force-pushed 414be89 to 36b773f
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: munnerz. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing `/approve` in a comment.
These become the only changes supported by the Cassandra controller until ScaleIn and CassandraUpgrade actions are implemented in followup branches.
Fixes: #253
Release note: