Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/19 08:24:36 UTC

[GitHub] [spark] MaxGekk opened a new pull request #31589: [SPARK-34424][SQL][TESTS][3.1] Fix failures of HiveOrcHadoopFsRelationSuite

MaxGekk opened a new pull request #31589:
URL: https://github.com/apache/spark/pull/31589


   ### What changes were proposed in this pull request?
   Modify `RandomDataGenerator.forType()` to allow generation of dates/timestamps that are valid in both the Julian and Proleptic Gregorian calendars. Currently, the function can produce a date (for example, `1582-10-06`) that is valid in the Proleptic Gregorian calendar but cannot be saved to ORC files as is, because the ORC format (the ORC libraries, in fact) assumes the Julian calendar. Spark therefore shifts `1582-10-06` to the next valid date, `1582-10-15`, while saving it to ORC files. As a consequence, the test fails because it compares the original date `1582-10-06` with the date `1582-10-15` loaded back from the ORC files.
   
   In this PR, I propose to generate dates/timestamps that are valid in both calendars for the ORC datasource until SPARK-34440 is resolved.
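
For illustration only, the idea of generating dates that exist in both calendars can be sketched in Python. This is not the code from the PR and not Spark's `RandomDataGenerator`; the names `valid_in_both_calendars` and `random_date_valid_in_both` are hypothetical. The sketch assumes that the calendar used on the ORC side is the hybrid Julian/Gregorian one, in which 1582-10-04 is followed directly by 1582-10-15, so the only Proleptic Gregorian labels without a counterpart are 1582-10-05 through 1582-10-14.

```python
import random
from datetime import date, timedelta

# Assumption: the hybrid Julian/Gregorian calendar skips these ten days,
# so 1582-10-04 is followed directly by 1582-10-15.
GAP_START = date(1582, 10, 5)
GAP_END = date(1582, 10, 14)


def valid_in_both_calendars(d: date) -> bool:
    """Return True if the Proleptic Gregorian label d is outside the skipped range."""
    return not (GAP_START <= d <= GAP_END)


def random_date_valid_in_both(
    rng: random.Random,
    lo: date = date(1, 1, 1),
    hi: date = date(9999, 12, 31),
) -> date:
    """Hypothetical helper: draw dates from [lo, hi] until one is outside the gap."""
    span = (hi - lo).days
    while True:
        candidate = lo + timedelta(days=rng.randint(0, span))
        if valid_in_both_calendars(candidate):
            return candidate
```

Calling `random_date_valid_in_both(random.Random(seed))` with a fixed seed keeps a run reproducible, in the same spirit as the seed quoted in the failure report. Rejection sampling is used here only because the gap is tiny compared with the full range, so retries are rare.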
   
   ### Why are the changes needed?
   The changes fix failures of `HiveOrcHadoopFsRelationSuite`. For instance, the test "test all data types" fails with the seed **610710213676**:
   ```
   == Results ==
   !== Correct Answer - 20 ==    == Spark Answer - 20 ==
    struct<index:int,col:date>   struct<index:int,col:date>
   ...
   ![9,1582-10-06]               [9,1582-10-15]
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   By running the modified test suite:
   ```
   $ build/sbt -Phive -Phive-thriftserver "test:testOnly *HiveOrcHadoopFsRelationSuite"
   ```
   
   Authored-by: Max Gekk <ma...@gmail.com>
   Signed-off-by: HyukjinKwon <gu...@apache.org>
   (cherry picked from commit 03161055de0c132070354407160553363175c4d7)
   Signed-off-by: Max Gekk <ma...@gmail.com>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org






[GitHub] [spark] SparkQA commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-781971551


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39844/
   




[GitHub] [spark] MaxGekk commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-781920665


   @HyukjinKwon I cherry-picked this onto 3.0, and ran the test locally.




   




[GitHub] [spark] AmplabJenkins commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-782095060


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135264/
   




[GitHub] [spark] SparkQA commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-781981963


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39844/
   




[GitHub] [spark] cloud-fan closed pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #31589:
URL: https://github.com/apache/spark/pull/31589


   




[GitHub] [spark] SparkQA commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-781938010


   **[Test build #135264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135264/testReport)** for PR 31589 at commit [`0031ab3`](https://github.com/apache/spark/commit/0031ab3dff8211b95b7dd499f2eb911ca5769801).




[GitHub] [spark] cloud-fan commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-782072943


   thanks, merging to 3.1/3.0!




   




[GitHub] [spark] AmplabJenkins commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-782003895


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39844/
   




[GitHub] [spark] SparkQA commented on pull request #31589: [SPARK-34424][SQL][TESTS][3.1][3.0] Fix failures of HiveOrcHadoopFsRelationSuite

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31589:
URL: https://github.com/apache/spark/pull/31589#issuecomment-782070659


   **[Test build #135264 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135264/testReport)** for PR 31589 at commit [`0031ab3`](https://github.com/apache/spark/commit/0031ab3dff8211b95b7dd499f2eb911ca5769801).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.

