[autoparallel] handled illegal sharding strategy #1728
Merged
What is the problem?
There is currently no systematic way to filter out the illegal sharding strategies generated by the `StrategyGenerator`. Illegal sharding strategies can be generated in several cases. These strategies are allowed by default in the current implementation and thus lead to wrong results.
What does this PR do?
This PR designs and implements a systematic, hierarchical way of handling illegal sharding strategies. The illegal ones are captured in three layers:

- `ShardingSpec`: when a `ShardingSpec` is instantiated, it automatically checks whether the specs are correct. If not, it throws a `ShardingSpecException`.
- `StrategyGenerator`: for a `StrategyGenerator`, we need to implement one method per strategy. Each method must be decorated with `ignore_sharding_exception`; this decorator captures the `ShardingSpecException` and returns `None` upon exception. These `None` values are removed automatically later.
- `NodeHandler`: during logical-physical sharding spec conversion in the `post_process` method, the developer needs to catch the `ShardingSpecException` manually. The `NodeHandler` will check the validity of the sharding strategy in the `register_strategy` method (not implemented in this PR because it would cause many tests to fail; I will put up a separate PR to deal with this).

In this way, we can ensure each node has the correct sharding strategies.
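The first two layers can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the constructor arguments of `ShardingSpec`, the concrete validity rules (no mesh axis used twice, dimension size divisible by the sharding factor), and the `shard_rows` helper are assumptions made only for illustration. Only the names `ShardingSpec`, `ShardingSpecException` and `ignore_sharding_exception` come from the description above.

```python
import functools


class ShardingSpecException(Exception):
    """Raised when a sharding spec is invalid."""


class ShardingSpec:
    """Simplified stand-in; the constructor signature is an assumption."""

    def __init__(self, shape, mesh_shape, dim_partition):
        self.shape = tuple(shape)
        self.mesh_shape = tuple(mesh_shape)
        # Maps a tensor dimension to the list of mesh axes it is sharded over.
        self.dim_partition = dict(dim_partition)
        # Layer 1: the spec checks itself on instantiation.
        self._sanity_check()

    def _sanity_check(self):
        # Hypothetical rules, chosen only to make the example concrete.
        used_axes = set()
        for dim, axes in self.dim_partition.items():
            factor = 1
            for axis in axes:
                if axis in used_axes:
                    raise ShardingSpecException(f"mesh axis {axis} is used twice")
                used_axes.add(axis)
                factor *= self.mesh_shape[axis]
            if self.shape[dim] % factor != 0:
                raise ShardingSpecException(
                    f"dim {dim} of size {self.shape[dim]} is not divisible by {factor}"
                )


def ignore_sharding_exception(func):
    """Layer 2: turn a ShardingSpecException into a None result."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ShardingSpecException:
            # Only sharding exceptions are swallowed; other errors still propagate.
            return None

    return wrapper


@ignore_sharding_exception
def shard_rows(shape, mesh_shape, axes):
    # Hypothetical strategy method: shard dimension 0 over the given mesh axes.
    return ShardingSpec(shape, mesh_shape, {0: axes})


candidates = [
    shard_rows((8, 4), (2, 2), [0]),     # legal
    shard_rows((8, 4), (2, 2), [0, 0]),  # illegal: mesh axis 0 reused
    shard_rows((6, 4), (4, 2), [0]),     # illegal: 6 is not divisible by 4
]
# The None placeholders are dropped afterwards.
valid = [spec for spec in candidates if spec is not None]
```

Because the decorator catches only `ShardingSpecException`, a genuine bug in a strategy method (for example an `IndexError`) still surfaces instead of being silently discarded.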
A summary of the code change
`ShardingSpecException` is defined in `sharding_spec.py` so that every exception we throw has better semantics.

In this PR, the API of the `StrategyGenerator` is refactored by introducing an additional `collate_strategies` method. This method is introduced so that we don't have to manually remove illegal sharding strategies and update costs in every generator. That work is taken over by the `generate` method, which removes the code redundancy. Therefore, for every child `StrategyGenerator`, the `generate` method is changed to a `collate_strategies` method, and the `generate` method only exists in the parent class.

Every strategy method in the child `StrategyGenerator` implementations is decorated with `ignore_sharding_exception`. Two changes are included for this decorator. One concerns `exception_handler`: we don't catch the general exception, only the sharding exception.

The `validate` method is now called inside the `__init__` method of the `StrategyGenerator` class. `validate` was defined previously but never called anywhere, so it is now called to do first-hand checking.

As an example, I added stricter test code in `test_linear_node_handler.py` to ensure all generated sharding strategies are valid.
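The split between `generate` and `collate_strategies` can be sketched roughly as follows. This is an illustrative reduction under stated assumptions, not the PR's implementation: the `op_data` dictionary, the `update_cost` formula, the `LinearGenerator` class and its `split_rows` method are hypothetical, and strategies are plain dictionaries rather than real strategy objects.

```python
import functools
from abc import ABC, abstractmethod


class ShardingSpecException(Exception):
    """Raised when a sharding spec is invalid."""


def ignore_sharding_exception(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ShardingSpecException:
            return None
    return wrapper


class StrategyGenerator(ABC):
    def __init__(self, op_data):
        self.op_data = op_data  # hypothetical container for operand metadata
        self.validate()         # first-hand check at construction time

    def validate(self):
        """Children may override this to reject unsupported inputs early."""

    @abstractmethod
    def collate_strategies(self):
        """Return candidate strategies; illegal ones appear as None."""

    def update_cost(self, strategy):
        # Hypothetical cost: number of weight elements held by each shard.
        rows, cols = self.op_data["weight_shape"]
        strategy["cost"] = (rows // strategy["parts"]) * cols

    def generate(self):
        # Lives only in the parent: drop None entries, then update costs.
        strategies = [s for s in self.collate_strategies() if s is not None]
        for strategy in strategies:
            self.update_cost(strategy)
        return strategies


class LinearGenerator(StrategyGenerator):
    def validate(self):
        if len(self.op_data["weight_shape"]) != 2:
            raise ValueError("expected a 2D weight")

    @ignore_sharding_exception
    def split_rows(self, parts):
        rows = self.op_data["weight_shape"][0]
        if rows % parts != 0:
            # Stands in for the exception a real ShardingSpec would raise.
            raise ShardingSpecException(f"{rows} rows cannot be split {parts} ways")
        return {"name": f"split_rows_{parts}", "parts": parts}

    def collate_strategies(self):
        return [self.split_rows(parts) for parts in (1, 2, 3, 4)]


strategies = LinearGenerator({"weight_shape": (8, 4)}).generate()
names = [s["name"] for s in strategies]
```

With this layout a child class only lists its candidate strategies; filtering and cost bookkeeping happen once in the parent, so a stricter test only has to assert that everything `generate` returns is valid.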