Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/02/14 15:28:36 UTC

[GitHub] [spark] MaxGekk opened a new pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

MaxGekk opened a new pull request #31564:
URL: https://github.com/apache/spark/pull/31564


   ### What changes were proposed in this pull request?
   In this PR, I propose to document in the Spark SQL guide the SQL configs related to datetime rebasing:
   - spark.sql.legacy.parquet.int96RebaseModeInWrite
   - spark.sql.legacy.parquet.datetimeRebaseModeInWrite
   - spark.sql.legacy.parquet.int96RebaseModeInRead
   - spark.sql.legacy.parquet.datetimeRebaseModeInRead
   - spark.sql.legacy.avro.datetimeRebaseModeInWrite
   - spark.sql.legacy.avro.datetimeRebaseModeInRead
   
   Parquet options added by #31489:
   - datetimeRebaseMode
   - int96RebaseMode
   
   and Avro options added by #31529:
   - datetimeRebaseMode 
   
   ### Why are the changes needed?
   To inform users about supported DS options and SQL configs.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   By generating the doc and manually checking:
   ```
   $ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch
   ```
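For readers unfamiliar with why these rebase modes exist, the following is a minimal sketch of the calendar gap they have to deal with. It is not Spark's rebasing code: the function names and the offset constant are introduced here for illustration only, and the conversion uses the standard Julian Day Number formula for the Julian calendar. It shows that the same physical day carries different labels in the Julian and Proleptic Gregorian calendars, which is why ancient dates and timestamps are ambiguous.

```python
import datetime

# The Julian Day Number (JDN) of proleptic Gregorian 0001-01-01 is 1721426,
# and datetime.date gives that day ordinal 1, so ordinal = JDN - 1721425.
JDN_TO_ORDINAL_OFFSET = 1721425


def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Return the Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def same_day_in_proleptic_gregorian(year: int, month: int, day: int) -> datetime.date:
    """Label the same physical day in the proleptic Gregorian calendar.

    Hypothetical helper for illustration; Python's datetime.date is
    proleptic Gregorian, so it serves as the target calendar here.
    """
    jdn = julian_calendar_to_jdn(year, month, day)
    return datetime.date.fromordinal(jdn - JDN_TO_ORDINAL_OFFSET)


# Julian 1000-01-01 and Julian 1582-10-04 (the last Julian day before the switch).
print(same_day_in_proleptic_gregorian(1000, 1, 1))   # 1000-01-06
print(same_day_in_proleptic_gregorian(1582, 10, 4))  # 1582-10-14
```

The gap is 5 days around the year 1000 and grows to 10 days by October 1582, so a reader that ignores the writer's calendar can silently shift old dates.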


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r576561784



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -329,4 +365,54 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+<td>spark.sql.legacy.parquet.datetimeRebaseModeInRead</td>
+  <td><code>EXCEPTION</code></td>
+  <td>The rebasing mode for the values of the <code>DATE</code>, <code>TIMESTAMP_MILLIS</code>, <code>TIMESTAMP_MICROS</code> logical types from the Julian to Proleptic Gregorian calendar:<br>
+    <ul>
+      <li><code>EXCEPTION</code>: Spark will fail the reading if it sees ancient dates/timestamps that are ambiguous between the two calendars.</li>
+      <li><code>CORRECTED</code>: Spark will not do rebase and read the dates/timestamps as it is.</li>
+      <li><code>LEGACY</code>: Spark will rebase dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar to Proleptic Gregorian calendar when reading Parquet files.</li>
+    </ul>
+    This config is only effective if the writer info (like Spark, Hive) of the Parquet files is unknown.
+  </td>
+  <td>3.0.0</td>
+</tr>
+<tr>
+  <td>spark.sql.legacy.parquet.datetimeRebaseModeInWrite</td>

Review comment:
       - There is some kind of informal (or maybe formal) rule that all legacy configs are internal; see https://github.com/apache/spark/pull/27448
   - If you look at the Avro configs (http://spark.apache.org/docs/latest/sql-data-sources-avro.html), we already have the config `spark.sql.legacy.replaceDatabricksSparkAvro.enabled`, which is legacy, internal and documented publicly :-)






[GitHub] [spark] MaxGekk commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r576624487



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -329,4 +365,54 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+<td>spark.sql.legacy.parquet.datetimeRebaseModeInRead</td>
+  <td><code>EXCEPTION</code></td>
+  <td>The rebasing mode for the values of the <code>DATE</code>, <code>TIMESTAMP_MILLIS</code>, <code>TIMESTAMP_MICROS</code> logical types from the Julian to Proleptic Gregorian calendar:<br>
+    <ul>
+      <li><code>EXCEPTION</code>: Spark will fail the reading if it sees ancient dates/timestamps that are ambiguous between the two calendars.</li>
+      <li><code>CORRECTED</code>: Spark will not do rebase and read the dates/timestamps as it is.</li>
+      <li><code>LEGACY</code>: Spark will rebase dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar to Proleptic Gregorian calendar when reading Parquet files.</li>
+    </ul>
+    This config is only effective if the writer info (like Spark, Hive) of the Parquet files is unknown.
+  </td>
+  <td>3.0.0</td>
+</tr>
+<tr>
+  <td>spark.sql.legacy.parquet.datetimeRebaseModeInWrite</td>

Review comment:
       Regarding mentioning the rebasing SQL configs in the Spark SQL guide, I see at least three options:
   1. Remove `.internal()`, as Hyukjin proposed.
   2. Do not document them at all; document only the DS options.
   3. The approach of the current PR: document them and leave them as `.internal()`. I do believe we should document those configs since we mention them in Spark upgrade exceptions.






[GitHub] [spark] MaxGekk commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r576617058



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -329,4 +365,54 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+<td>spark.sql.legacy.parquet.datetimeRebaseModeInRead</td>
+  <td><code>EXCEPTION</code></td>
+  <td>The rebasing mode for the values of the <code>DATE</code>, <code>TIMESTAMP_MILLIS</code>, <code>TIMESTAMP_MICROS</code> logical types from the Julian to Proleptic Gregorian calendar:<br>
+    <ul>
+      <li><code>EXCEPTION</code>: Spark will fail the reading if it sees ancient dates/timestamps that are ambiguous between the two calendars.</li>
+      <li><code>CORRECTED</code>: Spark will not do rebase and read the dates/timestamps as it is.</li>
+      <li><code>LEGACY</code>: Spark will rebase dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar to Proleptic Gregorian calendar when reading Parquet files.</li>
+    </ul>
+    This config is only effective if the writer info (like Spark, Hive) of the Parquet files is unknown.
+  </td>
+  <td>3.0.0</td>
+</tr>
+<tr>
+  <td>spark.sql.legacy.parquet.datetimeRebaseModeInWrite</td>

Review comment:
       In the PR https://github.com/apache/spark/pull/31571, I made `spark.sql.legacy.replaceDatabricksSparkAvro.enabled` non-internal since it has already been documented publicly.






[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780821027


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39784/
   




[GitHub] [spark] AmplabJenkins commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780813229


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135203/
   




[GitHub] [spark] AmplabJenkins commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780848093


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39784/
   




[GitHub] [spark] SparkQA removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780788785


   **[Test build #135203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135203/testReport)** for PR 31564 at commit [`773cb0b`](https://github.com/apache/spark/commit/773cb0b6351a4a87ae755305e0e889e110168eee).




[GitHub] [spark] AmplabJenkins commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778812604


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39731/
   




[GitHub] [spark] HyukjinKwon commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r576542456



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -329,4 +365,54 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+<td>spark.sql.legacy.parquet.datetimeRebaseModeInRead</td>
+  <td><code>EXCEPTION</code></td>
+  <td>The rebasing mode for the values of the <code>DATE</code>, <code>TIMESTAMP_MILLIS</code>, <code>TIMESTAMP_MICROS</code> logical types from the Julian to Proleptic Gregorian calendar:<br>
+    <ul>
+      <li><code>EXCEPTION</code>: Spark will fail the reading if it sees ancient dates/timestamps that are ambiguous between the two calendars.</li>
+      <li><code>CORRECTED</code>: Spark will not do rebase and read the dates/timestamps as it is.</li>
+      <li><code>LEGACY</code>: Spark will rebase dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar to Proleptic Gregorian calendar when reading Parquet files.</li>
+    </ul>
+    This config is only effective if the writer info (like Spark, Hive) of the Parquet files is unknown.
+  </td>
+  <td>3.0.0</td>
+</tr>
+<tr>
+  <td>spark.sql.legacy.parquet.datetimeRebaseModeInWrite</td>

Review comment:
       We'll probably have to properly expose these configurations by removing `.internal()`.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r578040779



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -252,6 +252,42 @@ REFRESH TABLE my_table;
 
 </div>
 
+## Data Source Option

Review comment:
       Do you plan to list other datasource options here? If not, I would put this in a separate section, e.g. "Rebasing Datetime".






[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778799440


   **[Test build #135150 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135150/testReport)** for PR 31564 at commit [`4795e90`](https://github.com/apache/spark/commit/4795e90ac3463639b490b47194a40f1a2648a717).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780794774


   **[Test build #135203 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135203/testReport)** for PR 31564 at commit [`773cb0b`](https://github.com/apache/spark/commit/773cb0b6351a4a87ae755305e0e889e110168eee).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780813229


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135203/
   




[GitHub] [spark] HyukjinKwon closed pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #31564:
URL: https://github.com/apache/spark/pull/31564


   




[GitHub] [spark] MaxGekk commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-779091020


   @dongjoon-hyun @gengliangwang @cloud-fan @HyukjinKwon Could you review this PR, please?




[GitHub] [spark] SparkQA removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778797967


   **[Test build #135150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135150/testReport)** for PR 31564 at commit [`4795e90`](https://github.com/apache/spark/commit/4795e90ac3463639b490b47194a40f1a2648a717).




[GitHub] [spark] MaxGekk commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780768127


   Since https://github.com/apache/spark/pull/31576 has been merged by @cloud-fan, I documented the public configs here.




[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778803627


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39731/
   




[GitHub] [spark] MaxGekk commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780354284


   > These options are for migration purposes (same as the legacy configs).
   
   I believe the rebase configs were placed in the `legacy` namespace by mistake: they can be used not only for migration from previous Spark versions but also for reading (and writing) files written by other systems/frameworks/libs. So the configs will stay with us forever. I would like to propose "renaming" the existing configs:
   `ConfigBuilder` has the method `withAlternative`, so we can introduce an alternative for each legacy rebase config:
   ```
   spark.sql.legacy.parquet.int96RebaseModeInRead -> spark.sql.parquet.int96RebaseModeInRead
   ```
   and deprecate
   ```
   spark.sql.legacy.parquet.int96RebaseModeInRead
   ```
   After that, document `spark.sql.parquet.int96RebaseModeInRead` in the Spark SQL guide.
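The renaming idea above can be pictured with a toy model. This is a minimal sketch in plain Python, not Spark's `ConfigBuilder`: the `ConfigEntry` class, its `read` method, and the `EXCEPTION` default are assumptions made for illustration. It only shows the intended lookup order, where the new key wins, the legacy key is still honoured as a fallback, and the default applies when neither is set.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConfigEntry:
    """Toy config entry with fallback keys (hypothetical, not Spark's API)."""
    key: str
    default: str
    alternatives: List[str] = field(default_factory=list)

    def read(self, settings: Dict[str, str]) -> str:
        # Prefer the primary key, then each alternative in order, then the default.
        for candidate in [self.key, *self.alternatives]:
            if candidate in settings:
                return settings[candidate]
        return self.default


# The new name is primary; the legacy name is kept as an alternative.
# The default value here is a placeholder chosen for the example.
INT96_REBASE_IN_READ = ConfigEntry(
    key="spark.sql.parquet.int96RebaseModeInRead",
    default="EXCEPTION",
    alternatives=["spark.sql.legacy.parquet.int96RebaseModeInRead"],
)

# A user who still sets only the legacy key keeps the old behaviour.
legacy_only = {"spark.sql.legacy.parquet.int96RebaseModeInRead": "LEGACY"}
print(INT96_REBASE_IN_READ.read(legacy_only))  # LEGACY

# When both keys are set, the new key takes precedence.
both = dict(legacy_only, **{"spark.sql.parquet.int96RebaseModeInRead": "CORRECTED"})
print(INT96_REBASE_IN_READ.read(both))  # CORRECTED
```

The point of the fallback order is that existing jobs that set only the legacy key continue to work while the documentation can refer to the new name.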




[GitHub] [spark] HyukjinKwon commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-781184278


   Merged to master.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780848093


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39784/
   




[GitHub] [spark] AmplabJenkins commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778803981


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135150/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778812604


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39731/
   




[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778797967


   **[Test build #135150 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135150/testReport)** for PR 31564 at commit [`4795e90`](https://github.com/apache/spark/commit/4795e90ac3463639b490b47194a40f1a2648a717).




[GitHub] [spark] HyukjinKwon commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r576566797



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -329,4 +365,54 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </td>
   <td>1.6.0</td>
 </tr>
+<tr>
+<td>spark.sql.legacy.parquet.datetimeRebaseModeInRead</td>
+  <td><code>EXCEPTION</code></td>
+  <td>The rebasing mode for the values of the <code>DATE</code>, <code>TIMESTAMP_MILLIS</code>, <code>TIMESTAMP_MICROS</code> logical types from the Julian to Proleptic Gregorian calendar:<br>
+    <ul>
+      <li><code>EXCEPTION</code>: Spark will fail the reading if it sees ancient dates/timestamps that are ambiguous between the two calendars.</li>
+      <li><code>CORRECTED</code>: Spark will not do rebase and read the dates/timestamps as it is.</li>
+      <li><code>LEGACY</code>: Spark will rebase dates/timestamps from the legacy hybrid (Julian + Gregorian) calendar to Proleptic Gregorian calendar when reading Parquet files.</li>
+    </ul>
+    This config is only effective if the writer info (like Spark, Hive) of the Parquet files is unknown.
+  </td>
+  <td>3.0.0</td>
+</tr>
+<tr>
+  <td>spark.sql.legacy.parquet.datetimeRebaseModeInWrite</td>

Review comment:
       Once we document these configs, it's hard to say they are internal. Maybe we should remove `internal()` and fix the docs to say these configurations are subject to deprecation and removal.
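
A minimal sketch of how one of the three modes in the quoted diff (`EXCEPTION`, `CORRECTED`, `LEGACY`) might be chosen for a single read instead of session-wide. It assumes the Parquet data source option `datetimeRebaseMode` named in the PR description (added by #31489). The helper name `read_parquet_with_rebase_mode` is hypothetical and not part of Spark, and `spark` is assumed to be an existing `SparkSession`.

```python
# Hypothetical helper, not part of Spark: selects a rebase mode for one Parquet read.
VALID_REBASE_MODES = ("EXCEPTION", "CORRECTED", "LEGACY")


def read_parquet_with_rebase_mode(spark, path, mode="EXCEPTION"):
    """Read `path` as Parquet with the `datetimeRebaseMode` option set to `mode`.

    `spark` is assumed to be a SparkSession; any object exposing a
    DataFrameReader-like `.read` attribute works for this sketch.
    """
    normalized = mode.upper()
    if normalized not in VALID_REBASE_MODES:
        raise ValueError(
            f"unknown rebase mode {mode!r}; expected one of {VALID_REBASE_MODES}"
        )
    # The option name is taken from the PR description; it is set on this
    # reader only, so other reads in the session are not affected.
    return spark.read.option("datetimeRebaseMode", normalized).parquet(path)
```

With a real session this could be called as `read_parquet_with_rebase_mode(spark, "/tmp/old_dates.parquet", "LEGACY")`, where the path is a placeholder. The session-wide alternative discussed in the diff would be `spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "LEGACY")`.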






[GitHub] [spark] cloud-fan commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780333787


   These options are for migration purposes (same as the legacy configs). Do we really need to mention them in the public doc? How about the migration guide?




[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778808407


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39731/
   




[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780802446


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39784/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-778803981


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/135150/
   




[GitHub] [spark] MaxGekk commented on a change in pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on a change in pull request #31564:
URL: https://github.com/apache/spark/pull/31564#discussion_r578146476



##########
File path: docs/sql-data-sources-parquet.md
##########
@@ -252,6 +252,42 @@ REFRESH TABLE my_table;
 
 </div>
 
+## Data Source Option

Review comment:
       Yes, other options should be here.






[GitHub] [spark] SparkQA commented on pull request #31564: [SPARK-34437][SQL][DOCS] Update Spark SQL guide about the rebasing DS options and SQL configs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #31564:
URL: https://github.com/apache/spark/pull/31564#issuecomment-780788785


   **[Test build #135203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/135203/testReport)** for PR 31564 at commit [`773cb0b`](https://github.com/apache/spark/commit/773cb0b6351a4a87ae755305e0e889e110168eee).

