Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISCOVERY: Cleanup AbstractDisruptionTestCase #34808

Conversation

original-brownbear
Copy link
Member

  • Make the internal test cluster manage minimum master nodes where we used the default of (nodes / 2 + 1) before
  • Manually set master nodes in the cases where we didn't use the default
  • Remove use of the NodeConfigurationSource indirection
  • Relates Fix port assignment and discovery in tests #33675

* Make the internal test cluster manage minimum master nodes where we used the default of (nodes / 2 + 1) before
* Manually set master nodes in the cases where we didn't use the default
* Remove use of the `NodeConfigurationSource` indirection
* Relates elastic#33675
@original-brownbear original-brownbear added >test Issues or PRs that are addressing/adding tests v7.0.0 >refactoring :Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. v6.5.0 labels Oct 24, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@colings86 colings86 added v6.6.0 and removed v6.5.0 labels Oct 25, 2018
@original-brownbear
Copy link
Member Author

@DaveCTurner ping (just in case this didn't get noticed, it's not urgent :))

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noted a few things that I don't think we need to do (see final comment for my realisation about why they were being done). Also asked whether we can make use of super.nodeSettings(). I think there's more things we can do to simplify this, but this looks like a step in the right direction.

String masterNode = internalCluster().startMasterOnlyNode(Settings.EMPTY);
internalCluster().startDataOnlyNode(Settings.EMPTY);

setMinimumMasterNodes(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? I think that the cluster should know it only has one master node and sets this accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it's not :) Removing.

List<String> nodes = startCluster(2, 1);

List<String> nodes = startCluster(2);
setMinimumMasterNodes(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we should explicitly start a master and a non-master node rather than starting two master-eligible nodes and then breaking the config like this.

final List<String> allMasterEligibleNodes = internalCluster().startMasterOnlyNodes(3);
final String dataNode = internalCluster().startDataOnlyNode();
setMinimumMasterNodes(2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this shouldn't be necessary because the cluster knows it only has three master nodes so will already have set minimum_master_nodes to 2.


@Override
protected Settings nodeSettings(int nodeOrdinal) {
return Settings.builder().put(discoveryConfig.nodeSettings(nodeOrdinal))
return Settings.builder().put(currentSettings)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also defer to super.nodeSettings(nodeOrdinal) so we don't also need to set DISCOVERY_HOSTS_PROVIDER_SETTING and MAX_LOCAL_STORAGE_NODES_SETTING here? (This also picks up a correct value for DISCOVERY_ZEN_PING_UNICAST_HOSTS_SETTING).

@@ -60,21 +57,22 @@
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.not;

@ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST, numDataNodes = 0, transportClientRatio = 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh ok it makes sense that this has gone from the subclasses. However I think I prefer it to remain on the subclasses for the sake of being explicit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will revert :)

@original-brownbear
Copy link
Member Author

original-brownbear commented Oct 30, 2018

@DaveCTurner thanks for taking a look! All points addressed I think => see if you like it :)

Also, I was able to simplify things some more with your comments :) Now that we're never manually configuring a certain number of min nodes the configureCluster method that takes a node count as input could go away :) (changed the logic here so that we assert that we never overwrite the settings ... which is better than just silently skipping if there's already settings anyway imo).

@@ -59,7 +59,7 @@ public void testDisruptionOnSnapshotInitialization() throws Exception {
.put(DiscoverySettings.COMMIT_TIMEOUT_SETTING.getKey(), "30s") // wait till cluster state is committed
.build();
final String idxName = "test";
configureCluster(settings, 4, 2);
configureCluster(settings);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only test case in this suite, so we could pass the settings to each node by overriding nodeSettings ourselves, and drop the configureCluster() call. That then means that this suite can just inherit directly from ESIntegTestCase - it doesn't use any of the other functionality that AbstractDisruptionTestCase provides.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DaveCTurner actually, I'm not so sure here. We're also using the DEFAULT_SETTINGS on AbstractDisruptionTestCase, maybe cd1ed87 isn't so bad afterall?

Copy link
Member Author

@original-brownbear original-brownbear Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise we're forced to duplicate the logic in org.elasticsearch.discovery.AbstractDisruptionTestCase#nodeSettings

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant this:

diff --git a/server/src/test/java/org/elasticsearch/discovery/SnapshotDisruptionIT.java b/server/src/test/java/org/elasticsearch/discovery/SnapshotDisruptionIT.java
index 7d4fcff2f47..e825e2f53cd 100644
--- a/server/src/test/java/org/elasticsearch/discovery/SnapshotDisruptionIT.java
+++ b/server/src/test/java/org/elasticsearch/discovery/SnapshotDisruptionIT.java
@@ -32,6 +32,7 @@ import org.elasticsearch.snapshots.SnapshotInfo;
 import org.elasticsearch.snapshots.SnapshotMissingException;
 import org.elasticsearch.snapshots.SnapshotState;
 import org.elasticsearch.test.ESIntegTestCase;
+import org.elasticsearch.test.discovery.TestZenDiscovery;
 import org.elasticsearch.test.disruption.NetworkDisruption;
 import org.elasticsearch.test.junit.annotations.TestLogging;

@@ -50,11 +51,13 @@ import static org.hamcrest.Matchers.instanceOf;
  */
 @TestLogging("org.elasticsearch.snapshot:TRACE")
 @ESIntegTestCase.ClusterScope(scope = ESIntegTestCase.Scope.TEST, numDataNodes = 0, transportClientRatio = 0)
-public class SnapshotDisruptionIT extends AbstractDisruptionTestCase {
+public class SnapshotDisruptionIT extends ESIntegTestCase {

     @Override
     protected Settings nodeSettings(int nodeOrdinal) {
         return Settings.builder().put(super.nodeSettings(nodeOrdinal))
+            .put(AbstractDisruptionTestCase.DEFAULT_SETTINGS)
+            .put(TestZenDiscovery.USE_MOCK_PINGS.getKey(), false)
             .put(DiscoverySettings.COMMIT_TIMEOUT_SETTING.getKey(), "30s")
             .build();
     }

Basically I don't like the statefulness of currentSettings and would prefer to be explicit about the settings we're using when starting up each node. Looking at the code in more detail I see that we only call configureCluster within ClusterDisruptionIT, in two places. In one place we call it to add settings, and I think it'd be better to add those settings when starting the nodes instead. In the other place we pass in Settings.EMPTY; however that entire test uses nothing else from AbstractDisruptionTestCase so maybe it should be in an ESIntegTestCase fixture on its own instead.

@original-brownbear
Copy link
Member Author

@DaveCTurner Fixed settings setup in cd1ed87 :)

@original-brownbear
Copy link
Member Author

Nevermind one second, I just realize I didn't do what you asked me to do. My bad :) fixing

@original-brownbear
Copy link
Member Author

original-brownbear commented Oct 30, 2018

See my comment, let me know what you like better there :) #34808 (comment)

@original-brownbear
Copy link
Member Author

@DaveCTurner there we go (I hope :)).

  • Split the ClusterDisruptionIT into suits so I can have one without the settings override
  • Don't extend AbstractDisruptionIT in the snapshot disruption test
    => no more mutable settings field :)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. I asked for a couple of small changes but this is coming together very nicely.

.put(TestZenDiscovery.USE_MOCK_PINGS.getKey(), false).build();
}

@Before
public void clearConfig() {
discoveryConfig = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@Override
protected Collection<Class<? extends Plugin>> nodePlugins() {
return Arrays.asList(MockTransportService.TestPlugin.class);
}

void configureCluster(int numberOfNodes, int minimumMasterNode) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 🎉

* node but already deleted on the source node. Search request should still work.
*/
public void testSearchWithRelocationAndSlowClusterStateProcessing() throws Exception {
internalCluster().startMasterOnlyNode();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment about why this couldn't use AbstractDisruptionTestCase.DEFAULT_SETTINGS was useful and should be preserved (even though I don't yet fully understand it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added it back :)

import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.not;

public abstract class AbstractDisruptionTestCase extends ESIntegTestCase {

static final TimeValue DISRUPTION_HEALING_OVERHEAD = TimeValue.timeValueSeconds(40); // we use 30s as timeout in many places.

private NodeConfigurationSource discoveryConfig;
static final Settings DEFAULT_SETTINGS = Settings.builder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could this move back down to where it was before, since it's not being changed? Would make the diff smaller.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done :)

@original-brownbear
Copy link
Member Author

@DaveCTurner thanks :) => All comments addressed

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Really pleased with where this ended up, thanks for the extra iterations.

@original-brownbear
Copy link
Member Author

@DaveCTurner np + thanks for the patient review :) => merging

@original-brownbear original-brownbear merged commit 3fa67c5 into elastic:master Oct 31, 2018
@original-brownbear original-brownbear deleted the tidy-abstract-disruption-testcase branch October 31, 2018 06:52
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Nov 16, 2018
* DISCOVERY: Cleanup AbstractDisruptionTestCase

* Make the internal test cluster manage minimum master nodes where we used the default of (nodes / 2 + 1) before
* Remove use of the `NodeConfigurationSource` indirection
* Relates elastic#33675
original-brownbear added a commit that referenced this pull request Nov 16, 2018
* DISCOVERY: Cleanup AbstractDisruptionTestCase

* Make the internal test cluster manage minimum master nodes where we used the default of (nodes / 2 + 1) before
* Remove use of the `NodeConfigurationSource` indirection
* Relates #33675
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. >refactoring >test Issues or PRs that are addressing/adding tests v6.6.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants