Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/07 08:29:03 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request #35419: [SPARK-38124][SQL][SS] Introduce StatefulOpClusteredDistribution and apply to all stateful operators

HeartSaVioR opened a new pull request #35419:
URL: https://github.com/apache/spark/pull/35419

### What changes were proposed in this pull request?

This PR revives `HashClusteredDistribution` and renames it to `StatefulOpClusteredDistribution` so that the rationale for the distribution is clear from the name. Renaming is safe because this class no longer needs to be a general one: in SPARK-35703 we moved the usages of `HashClusteredDistribution` over to `ClusteredDistribution`, with stateful operators being the exception.

Only `HashPartitioning` with the same expressions and the same number of partitions can satisfy `StatefulOpClusteredDistribution`. As a consequence, we cannot modify `HashPartitioning` in the future unless we clone `HashPartitioning` and assign the clone to `StatefulOpClusteredDistribution`.
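To illustrate the difference in strictness, here is a minimal, self-contained sketch. It is a toy model and not Spark's actual classes: expressions are plain strings, the optional partition-count requirement of `ClusteredDistribution` is omitted, and the object name `DistributionSketch` and the `satisfies` helper are hypothetical names used only for this example.

```scala
// Toy model for illustration only; these are NOT Spark's real classes.
object DistributionSketch {
  // A hash partitioning over some key expressions (modelled as strings).
  final case class HashPartitioning(expressions: Seq[String], numPartitions: Int)

  sealed trait Distribution

  // Relaxed: hash partitioning on any subset of the clustering keys is enough.
  final case class ClusteredDistribution(clustering: Seq[String]) extends Distribution

  // Strict: the expressions must match exactly (same order) and so must
  // the number of partitions.
  final case class StatefulOpClusteredDistribution(
      expressions: Seq[String],
      requiredNumPartitions: Int) extends Distribution

  def satisfies(p: HashPartitioning, d: Distribution): Boolean = d match {
    case ClusteredDistribution(clustering) =>
      p.expressions.forall(e => clustering.contains(e))
    case StatefulOpClusteredDistribution(exprs, n) =>
      p.expressions == exprs && p.numPartitions == n
  }

  def main(args: Array[String]): Unit = {
    val byA = HashPartitioning(Seq("a"), 4)
    val byAB = HashPartitioning(Seq("a", "b"), 4)
    val keys = Seq("a", "b")

    // Partitioning by "a" alone passes the relaxed check but not the strict one.
    println(satisfies(byA, ClusteredDistribution(keys)))
    println(satisfies(byA, StatefulOpClusteredDistribution(keys, 4)))
    // Partitioning by exactly ("a", "b") with 4 partitions passes the strict check.
    println(satisfies(byAB, StatefulOpClusteredDistribution(keys, 4)))
  }
}
```

In this sketch, a child that is already hash-partitioned on a subset of the grouping keys would be accepted by the relaxed distribution without a shuffle, which is the situation this PR aims to rule out for stateful operators.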

This PR documents the partitioning that stateful operators expect in the classdoc of `StatefulOpClusteredDistribution`.

This PR also changes all stateful operators to use `StatefulOpClusteredDistribution` instead of `ClusteredDistribution`. Stateful operators using `ClusteredDistribution`, which has more relaxed requirements than `HashClusteredDistribution`, is a long-standing issue (Spark 2.3.0+), and this PR fixes it.

This PR also has to introduce some changes to Aggregate, since the required child distribution of Aggregate is not the same as that of the stateful operators, which may bring unexpected shuffles between Aggregate and StateStoreRestoreExec / StateStoreSaveExec. This PR makes sure that the overall pipeline of `Aggregate -> StateStoreRestoreExec -> Aggregate -> StateStoreSaveExec -> Aggregate` uses the stateful operator's required child distribution.

### Why are the changes needed?

Spark does not guarantee stable physical partitioning for stateful operators across the lifetime of a query, and because of the relaxed distribution requirement it is hard to predict what the current physical partitioning of the state is.
(We expect hash partitioning on the grouping keys, but `ClusteredDistribution` does not "guarantee" that partitioning. It is much more relaxed.)

This PR enforces that the physical partitioning of stateful operators is hash partitioning on the grouping keys, which is our general expectation of state store partitioning.
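As a rough illustration of why unstable partitioning is a problem for state, the sketch below computes which partition a key lands in under two different partitionings. It is only a toy: `toyHash` is a made-up hash function (Spark uses its own hash for `HashPartitioning`), and `StatePartitionSketch` and `partitionId` are hypothetical names introduced for this example.

```scala
// Toy illustration only; toyHash is NOT the hash function Spark uses.
object StatePartitionSketch {
  // Combine the key columns into a single hash value.
  def toyHash(columns: Seq[Int]): Int =
    columns.foldLeft(17)((h, v) => 31 * h + v)

  // Non-negative modulo, so the result is always a valid partition index.
  def partitionId(columns: Seq[Int], numPartitions: Int): Int = {
    val m = toyHash(columns) % numPartitions
    if (m < 0) m + numPartitions else m
  }

  def main(args: Array[String]): Unit = {
    val numPartitions = 4
    val groupingKey = Seq(0, 0) // grouping key (a, b)

    // State was written while the data was partitioned on the full key (a, b).
    val writtenTo = partitionId(groupingKey, numPartitions)
    // Later the same row arrives partitioned on column a only.
    val lookedUpIn = partitionId(groupingKey.take(1), numPartitions)

    println(s"state written to partition $writtenTo, looked up in partition $lookedUpIn")
  }
}
```

When the two partition ids differ, the task handling the row reads a state store partition that does not contain the state for that key, so the operator would behave as if the key had never been seen.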

### Does this PR introduce _any_ user-facing change?

Yes. If users have streaming queries with a checkpoint whose state is NOT partitioned via hash partitioning on the grouping keys, the change may affect the query result. We have no way of knowing the current partitioning of the state in a checkpoint, so this is unfortunately the best we can do.

### How was this patch tested?

Existing tests.

TODO: New tests will be added soon.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
