Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/30 14:15:18 UTC

[GitHub] [spark] MaxGekk opened a new pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

MaxGekk opened a new pull request #30551:
URL: https://github.com/apache/spark/pull/30551


   ### What changes were proposed in this pull request?
   Perform partition spec normalization in `ShowTablesCommand` according to the table schema before getting partitions from the catalog. The normalization via `PartitioningUtils.normalizePartitionSpec()` adjusts the column names in the partition specification to match the real partition column names, taking the case-sensitivity setting into account.
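
To illustrate the idea, here is a minimal Python sketch of this kind of normalization. It is not Spark's Scala implementation: the function name `normalize_partition_spec`, its parameters, and the error messages are hypothetical and chosen only for this example. It maps each user-supplied key to the table's real partition column name, comparing names case-insensitively unless `case_sensitive` is set.

```python
def normalize_partition_spec(spec, partition_columns, case_sensitive=False):
    """Return a copy of `spec` keyed by the table's real partition column names.

    Hypothetical helper for illustration only; not part of Spark's API.
    """
    def matches(column, key):
        # With case sensitivity off, compare names ignoring case.
        if case_sensitive:
            return column == key
        return column.lower() == key.lower()

    normalized = {}
    for key, value in spec.items():
        # Find the real partition column that the user-supplied key refers to.
        real = next((c for c in partition_columns if matches(c, key)), None)
        if real is None:
            raise ValueError(
                f"{key} is not a valid partition column; expected one of {partition_columns}"
            )
        if real in normalized:
            raise ValueError(f"Duplicate partition column: {real}")
        normalized[real] = value
    return normalized


# The spec from the example in this description resolves to the real column names.
print(normalize_partition_spec({"YEAR": 2015, "Month": 1}, ["year", "month"]))
# {'year': 2015, 'month': 1}
```

With `case_sensitive=True`, the same call raises an error because `YEAR` does not match `year` exactly, which mirrors the behavior this PR removes for the default configuration.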
   
   ### Why are the changes needed?
   Even when `spark.sql.caseSensitive` is `false`, which is the default value, v1 `SHOW TABLE EXTENDED` treats partition column names as case sensitive:
   ```sql
   spark-sql> CREATE TABLE tbl1 (price int, qty int, year int, month int)
            > USING parquet
            > partitioned by (year, month);
   spark-sql> INSERT INTO tbl1 PARTITION(year = 2015, month = 1) SELECT 1, 1;
   spark-sql> SHOW TABLE EXTENDED LIKE 'tbl1' PARTITION(YEAR = 2015, Month = 1);
   Error in query: Partition spec is invalid. The spec (YEAR, Month) must match the partition spec (year, month) defined in table '`default`.`tbl1`';
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. After the changes, the `SHOW TABLE EXTENDED` command respects the `spark.sql.caseSensitive` SQL config. For the example above, it returns the correct result:
   ```sql
   spark-sql> SHOW TABLE EXTENDED LIKE 'tbl1' PARTITION(YEAR = 2015, Month = 1);
   default	tbl1	false	Partition Values: [year=2015, month=1]
   Location: file:/Users/maximgekk/spark-warehouse/tbl1/year=2015/month=1
   Serde Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
   InputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
   OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
   Storage Properties: [serialization.format=1, path=file:/Users/maximgekk/spark-warehouse/tbl1]
   Partition Parameters: {transient_lastDdlTime=1606595118, totalSize=623, numFiles=1}
   Created Time: Sat Nov 28 23:25:18 MSK 2020
   Last Access: UNKNOWN
   Partition Statistics: 623 bytes
   ```
   
   ### How was this patch tested?
   By running the modified test suite via:
   ```
   $ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *DDLSuite"
   ```
   
   Authored-by: Max Gekk <ma...@gmail.com>
   Signed-off-by: Dongjoon Hyun <do...@apache.org>
   (cherry picked from commit 0054fc937f804660c6501d9d3f6319f3047a68f8)
   Signed-off-by: Max Gekk <ma...@gmail.com>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735819616


   **[Test build #131993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131993/testReport)** for PR 30551 at commit [`c2ca5dc`](https://github.com/apache/spark/commit/c2ca5dc8846638c3cc06a512e3b920219bce08a9).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735929802








[GitHub] [spark] AmplabJenkins commented on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735929802








[GitHub] [spark] AmplabJenkins removed a comment on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735854072








[GitHub] [spark] AmplabJenkins commented on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735854072








[GitHub] [spark] SparkQA commented on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735928952


   **[Test build #131993 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131993/testReport)** for PR 30551 at commit [`c2ca5dc`](https://github.com/apache/spark/commit/c2ca5dc8846638c3cc06a512e3b920219bce08a9).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735819616


   **[Test build #131993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131993/testReport)** for PR 30551 at commit [`c2ca5dc`](https://github.com/apache/spark/commit/c2ca5dc8846638c3cc06a512e3b920219bce08a9).




[GitHub] [spark] MaxGekk commented on pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #30551:
URL: https://github.com/apache/spark/pull/30551#issuecomment-735812273


   @dongjoon-hyun Please review this PR.




[GitHub] [spark] dongjoon-hyun closed pull request #30551: [SPARK-33588][SQL][2.4] Respect the `spark.sql.caseSensitive` config while resolving partition spec in v1 `SHOW TABLE EXTENDED`

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #30551:
URL: https://github.com/apache/spark/pull/30551


   

