You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2018/02/06 20:58:27 UTC

[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/20522

    [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not ignore table properties

    ## What changes were proposed in this pull request?
    
    Previously, SPARK-22158 fixed for `USING hive` syntax. This PR aims to fix for `STORED AS` syntax. Although the test case covers ORC part, the patch consider both `convertMetastoreOrc` and `convertMetastoreParquet`.
    
    ## How was this patch tested?
    
    Pass newly added test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-22158-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20522.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20522
    
----
commit 23d8205052fd52b4acbf71a8e51e4ce607c8af0e
Author: Dongjoon Hyun <do...@...>
Date:   2018-02-06T20:53:20Z

    [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not ignore table properties

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-23355][SQL] convertMetastore should not ig...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/20522


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by tgravescs <gi...@git.apache.org>.
Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    hey, sorry its unclear to me exactly what isn't support right now in 2.3.  The jira and this pr mention table properties not supported but #19382 seems like it fixed that.  The reason I'm wondering if to know what is/is not support in 2.3 to determine if I personally turn the config convertMetastoreOrc on or not.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    LGTM, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166440352
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    @cloud-fan and @gatorsmile .
    Unlike `USING` syntax, `STORED AS` syntax saves the property into `tableMeta.properties`.
    This PR considers both table properties.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Thank you for review and merge, @cloud-fan !


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #87131 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87131/testReport)** for PR 20522 at commit [`0f65eb9`](https://github.com/apache/spark/commit/0f65eb948892ddf9cc1fc94b002f746f9e7714d8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2703/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166459898
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    These are not the right things to do. The table properties should not be always put to the serde properties.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #89901 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89901/testReport)** for PR 20522 at commit [`06a9a45`](https://github.com/apache/spark/commit/06a9a4502957bb63c9e5960d55820228323c8c56).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87130/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Also cc @cloud-fan @gengliangwang 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2704/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Sure. I'll try, @gatorsmile . It'll take some time for me.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    @gatorsmile . Sorry for late updating. I updated this PR by narrowing the scope of configuration key names specifically for ORC and Parquet. The test coverage is reading and writing non-partitioned tables.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Just FYI, https://github.com/apache/spark/pull/19382 is to solve the issue of storage/serde properties. However the issue here is table properties. They are not related. Please create a new JIRA instead of treating it as a follow-up


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #87130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87130/testReport)** for PR 20522 at commit [`23d8205`](https://github.com/apache/spark/commit/23d8205052fd52b4acbf71a8e51e4ce607c8af0e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/643/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    @dongjoon-hyun Could you submit the PR to resolve all the related issues?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166460621
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    This place is only for `convertMetastore`. I think we need this inevitably to prevent regression.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #87131 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87131/testReport)** for PR 20522 at commit [`0f65eb9`](https://github.com/apache/spark/commit/0f65eb948892ddf9cc1fc94b002f746f9e7714d8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    I agree with you on all the issues. BTW, how can we put these existing Spark bugs into ORC migration guide?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166460402
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    That is the reason why my previous PR https://github.com/apache/spark/pull/20120 was closed. We need a comprehensive fix for resolving all the DDL issues.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166460947
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    It's surprising that Apache Spark doesn't respect table properties in `convertMetastoreParquet` until now. It has been `true` by default.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/642/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87131/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    @tgravescs .
    Yep. The title of SPARK-22158 was changed recently because it only supported Table SerDe properties. If you set the ORC property into Table SerDe properties (not in Table properties), it will work. This PR includes Table properties additionally.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166488226
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    The issues have been defined in the PR description of https://github.com/apache/spark/pull/20120:
    ```
    Currently, we ignore table-specific compression conf when the Hive serde tables are converted to the data source tables. We also ignore it when users set compression in the TBLPROPERTIES clause instead of the OPTIONS clause, even if the tables are native data source tables. 
    ```
    https://github.com/apache/spark/pull/20087 is also trying to resolve the related issues. Thus, we might still miss multiple critical bugs. 
    
    Simply adding the table properties to the options will introduce new bugs. For example, users might provide conflicting serde properties and change the properties through DDLs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore should not...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #87130 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87130/testReport)** for PR 20522 at commit [`23d8205`](https://github.com/apache/spark/commit/23d8205052fd52b4acbf71a8e51e4ce607c8af0e).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89901/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #20522: [SPARK-22158][SQL][FOLLOWUP] convertMetastore sho...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20522#discussion_r166488751
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -189,12 +189,13 @@ case class RelationConversions(
       private def convert(relation: HiveTableRelation): LogicalRelation = {
         val serde = relation.tableMeta.storage.serde.getOrElse("").toLowerCase(Locale.ROOT)
         if (serde.contains("parquet")) {
    -      val options = relation.tableMeta.storage.properties + (ParquetOptions.MERGE_SCHEMA ->
    -        conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties +
    +        (ParquetOptions.MERGE_SCHEMA ->
    +          conf.getConf(HiveUtils.CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING).toString)
           sessionCatalog.metastoreCatalog
             .convertToLogicalRelation(relation, options, classOf[ParquetFileFormat], "parquet")
         } else {
    -      val options = relation.tableMeta.storage.properties
    +      val options = relation.tableMeta.properties ++ relation.tableMeta.storage.properties
    --- End diff --
    
    We can't blindly add the table properties to the options. This also introduces the behavior changes. For example, some table properties might have the identical names as the serde properties with different semantics. 
    
    Thus, what we can do is to do this only for the table properties we need.
    
    Conceptually, the table properties should not take the serde-related confs. However, Hive basically does not follow the rule it sets. It is pretty strange. No idea why Hive did it.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #20522: [SPARK-23355][SQL] convertMetastore should not ignore ta...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/20522
  
    **[Test build #89901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89901/testReport)** for PR 20522 at commit [`06a9a45`](https://github.com/apache/spark/commit/06a9a4502957bb63c9e5960d55820228323c8c56).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org