Shard coordinator failing on movement to non-shard and proxy nodes. #3277

Closed
marrick66 opened this issue Jan 18, 2018 · 2 comments

Comments

@marrick66

With Akka.Cluster.Sharding 1.3.2-beta54, I've noticed that if a non-shard node or proxy node is selected to become the singleton shard coordinator after a failure, the coordinator will not start until that new node is restarted and the coordinator is moved back to a node with a shard region started. The two cases I can reproduce are:

  1. Movement to a node with no shard region or proxy started. This fails with an error of "Error [Cannot find serializer with id [13]", which matches ClusterShardingMessageSerializer in the code. This would make sense, since that serializer is probably never loaded on that node.

  2. Movement to a node with a shard region proxy started. This shows as the coordinator being moved to the shard, but no persistence recovery is attempted and no new registrations can be completed. No exceptions seem to be thrown, however.

As an example, I have two nodes with shard regions started on them, and a single node that has either a shard proxy or neither. If I crash the shard nodes and bring them back up, the coordinator is moved to the single node, and the behavior described above occurs.

Is this a valid use case I'm attempting?

@Horusiath
Contributor

Horusiath commented Jan 19, 2018

@marrick66 When you start a cluster sharding region, the settings let you specify a role that every node hosting regions of that type must have:

var sharding = ClusterSharding.Get(system);
var region = await sharding.StartAsync(
    typeName: nameof(MyActor),
    entityProps: Props.Create<MyActor>(),
    settings: ClusterShardingSettings.Create(system).WithRole(nameof(MyActor) + "-region"),
    messageExtractor: new MessageExtractor());

All nodes with that role are expected to have shard regions started on them.

For a shard region proxy, the role should be provided as well, but the node that hosts the proxy shouldn't have that role itself.
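A rough sketch of the proxy side, assuming the same type name and role as in the region snippet above. The `Envelope` class, the `MessageExtractor` body, the shard count of 100, and the `ProxyNode` helper are all made up for illustration, since the thread doesn't show the real ones:

```csharp
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Cluster.Sharding;

// Hypothetical envelope; the real message types are not shown in this thread.
public sealed class Envelope
{
    public Envelope(string entityId, object payload)
    {
        EntityId = entityId;
        Payload = payload;
    }

    public string EntityId { get; }
    public object Payload { get; }
}

// Assumed stand-in for the MessageExtractor used when starting the region.
public sealed class MessageExtractor : HashCodeMessageExtractor
{
    // 100 shards is an arbitrary choice for this sketch.
    public MessageExtractor() : base(100) { }

    // Returns null for anything that is not an Envelope.
    public override string EntityId(object message) => (message as Envelope)?.EntityId;
}

public static class ProxyNode
{
    // Starts a proxy only; no entities are hosted on this node.
    public static Task<IActorRef> StartMyActorProxyAsync(ActorSystem system)
    {
        return ClusterSharding.Get(system).StartProxyAsync(
            "MyActor",          // typeName: must match the type name used by the regions
            "MyActor-region",   // role of the nodes hosting the regions, not this node's role
            new MessageExtractor());
    }
}
```

The node running this would be configured without `MyActor-region` in its own cluster roles.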

@marrick66
Author

marrick66 commented Jan 19, 2018

I had roles set for the cluster, but not the shards. Adding them to the shard configuration prevents the non-shard nodes from attempting to become the coordinator. Thanks for the help, I appreciate it.
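For anyone landing here later, the setup that fix describes might look roughly like this in HOCON, assuming the role name from the reply above and a made-up `frontend` role for the node without regions. As far as I know, `ClusterShardingSettings.Create(system)` picks up `akka.cluster.sharding.role`, so this would be an alternative to calling `WithRole` in code. Remoting and seed-node settings are left out:

```csharp
using Akka.Configuration;

public static class NodeConfigs
{
    // Node that hosts shard regions: it carries the role, and sharding is restricted to that role.
    public static readonly Config RegionNode = ConfigurationFactory.ParseString(@"
        akka.cluster.roles = [""MyActor-region""]
        akka.cluster.sharding.role = ""MyActor-region""
    ");

    // Node without regions (hypothetical 'frontend' role): it does not carry the region role.
    public static readonly Config OtherNode = ConfigurationFactory.ParseString(@"
        akka.cluster.roles = [""frontend""]
    ");
}
```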
