Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/30 02:52:15 UTC

[GitHub] [spark] allisonwang-db opened a new pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

allisonwang-db opened a new pull request #30195:
URL: https://github.com/apache/spark/pull/30195


   Backport #30093 for branch-3.0. I've updated the configuration version to 2.4.8.
   
   ### What changes were proposed in this pull request?
   This PR aims to fix a correctness bug in the optimizer rule EliminateSorts. It also adds a new physical rule to remove redundant sorts that cannot be eliminated in the Optimizer rule after the bugfix.
   
   ### Why are the changes needed?
   A global sort should not be eliminated even if its child is already ordered, because we cannot tell whether the child's ordering is global or local. For example, in the following plan the outer (global) sort must not be removed: it gives a stronger guarantee than the inner (local) sort, even though both sorts use the same sort orders.
   ```
   Sort(orders, global = True, ...)
     Sort(orders, global = False, ...)
   ```
   Since there is no straightforward way to identify whether a node's output ordering is local or global, we should not remove a global sort even if its child is already ordered.
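
To make the difference between the two guarantees concrete, here is a minimal sketch in plain Python. It is not Spark code: it models a dataset as a list of partitions, and the helper names (`local_sort`, `global_sort`, `collect`, `is_globally_ordered`) are hypothetical, chosen only for illustration. It assumes a global sort can be modeled as sorting all rows and splitting them back into ordered chunks.

```python
from typing import List

Partitions = List[List[int]]


def local_sort(partitions: Partitions) -> Partitions:
    """Model of Sort(global = False): each partition is sorted on its own."""
    return [sorted(part) for part in partitions]


def global_sort(partitions: Partitions) -> Partitions:
    """Model of Sort(global = True): rows are ordered across all partitions.

    Assumption for this sketch: sort every row, then split the result into
    contiguous chunks, one per original partition.
    """
    rows = sorted(row for part in partitions for row in part)
    n = max(1, len(partitions))
    size = max(1, -(-len(rows) // n))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def collect(partitions: Partitions) -> List[int]:
    """Concatenate partitions in order, as a reader of the output would."""
    return [row for part in partitions for row in part]


def is_globally_ordered(partitions: Partitions) -> bool:
    """True if the concatenated output is in non-decreasing order."""
    rows = collect(partitions)
    return all(a <= b for a, b in zip(rows, rows[1:]))


data = [[5, 1, 9], [4, 8, 2]]

# Child only: every partition is ordered, but the whole output is not.
locally_sorted = local_sort(data)

# Parent on top of the child: the global sort is still doing real work.
with_global = global_sort(locally_sorted)
```

In this model both sorts use the same ordering, yet dropping the outer one would leave `locally_sorted`, whose concatenated rows are not in order. That is the situation the fix guards against.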
   
   ### Does this PR introduce any user-facing change?
   Yes
   
   ### How was this patch tested?
   Unit tests


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719165947


   Merged build finished. Test FAILed.



[GitHub] [spark] SparkQA removed a comment on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719139894


   **[Test build #130433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130433/testReport)** for PR 30195 at commit [`9526dee`](https://github.com/apache/spark/commit/9526deea2f24208dbd6ebd0ed29e8ddaadd84604).



[GitHub] [spark] SparkQA commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719161360


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35038/
   



[GitHub] [spark] AmplabJenkins commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719257007







[GitHub] [spark] SparkQA commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719139894


   **[Test build #130433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130433/testReport)** for PR 30195 at commit [`9526dee`](https://github.com/apache/spark/commit/9526deea2f24208dbd6ebd0ed29e8ddaadd84604).



[GitHub] [spark] cloud-fan commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719402939


   GA passed, merging to 3.0!



[GitHub] [spark] AmplabJenkins commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719165947







[GitHub] [spark] AmplabJenkins removed a comment on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719257007


   Merged build finished. Test FAILed.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719257030


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130433/
   Test FAILed.



[GitHub] [spark] SparkQA commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719252908


   **[Test build #130433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130433/testReport)** for PR 30195 at commit [`9526dee`](https://github.com/apache/spark/commit/9526deea2f24208dbd6ebd0ed29e8ddaadd84604).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719165926


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35038/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30195:
URL: https://github.com/apache/spark/pull/30195#issuecomment-719165952


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35038/
   Test FAILed.



[GitHub] [spark] cloud-fan closed pull request #30195: [SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a physical rule to remove redundant sorts

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #30195:
URL: https://github.com/apache/spark/pull/30195


   

