Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2017/10/30 23:12:57 UTC

[GitHub] spark pull request #19615: [SPARK-19611][SQL][followup] set dataSchema corre...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/19615

    [SPARK-19611][SQL][followup] set dataSchema correctly in HiveMetastoreCatalog.convertToLogicalRelation

    ## What changes were proposed in this pull request?
    
    We made a mistake in https://github.com/apache/spark/pull/16944 . In `HiveMetastoreCatalog#inferIfNeeded` we infer the data schema, merge it with the full schema, and return the new full schema. On the caller side, we then treat that full schema as the data schema and set it on `HadoopFsRelation`.
    
    This doesn't cause any problems in practice, because both Parquet and ORC can work with a wrong data schema that has extra columns, but it's better to fix this mistake.
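    
    To illustrate the distinction, here is a minimal sketch in plain Scala. It does not use the real Spark classes: `Field`, `Table`, and `DataSchemaSketch` are hypothetical stand-ins, and the column names are made up. It assumes the data schema is simply the full schema with the partition columns removed.
    
    ```scala
    // Hypothetical, simplified stand-ins for a schema field and a catalog table.
    // These are NOT Spark's StructField / CatalogTable.
    final case class Field(name: String, dataType: String)
    
    final case class Table(schema: Seq[Field], partitionColumnNames: Seq[String]) {
      // Assumption: data schema = full schema minus the partition columns.
      def dataSchema: Seq[Field] =
        schema.filterNot(f => partitionColumnNames.contains(f.name))
    
      // The remaining columns form the partition schema.
      def partitionSchema: Seq[Field] =
        schema.filter(f => partitionColumnNames.contains(f.name))
    }
    
    object DataSchemaSketch {
      // A made-up table with two data columns and one partition column.
      val table: Table = Table(
        schema = Seq(Field("id", "bigint"), Field("name", "string"), Field("dt", "string")),
        partitionColumnNames = Seq("dt"))
    
      // The mistake: the full schema is used where a data schema is expected,
      // so the partition column "dt" shows up as an extra data column.
      val wrongDataSchema: Seq[Field] = table.schema
    
      // The intended value: only the non-partition columns.
      val fixedDataSchema: Seq[Field] = table.dataSchema
    
      def main(args: Array[String]): Unit = {
        println(wrongDataSchema.map(_.name).mkString(", ")) // prints "id, name, dt"
        println(fixedDataSchema.map(_.name).mkString(", ")) // prints "id, name"
      }
    }
    ```
    
    In this sketch the wrong value only carries an extra column, which matches why the mistake went unnoticed with file formats that tolerate extra columns.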
    
    ## How was this patch tested?
    
    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark infer

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19615.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19615
    
----
commit 46f530fe777c921d43a2f323abc91d8bb69423d5
Author: Wenchen Fan <we...@databricks.com>
Date:   2017-10-30T23:05:57Z

    set dataSchema correctly in HiveMetastoreCatalog.convertToLogicalRelation

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    cc @budde @gatorsmile 


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    **[Test build #83234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83234/testReport)** for PR 19615 at commit [`46f530f`](https://github.com/apache/spark/commit/46f530fe777c921d43a2f323abc91d8bb69423d5).


---



[GitHub] spark pull request #19615: [SPARK-19611][SQL][followup] set dataSchema corre...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19615


---



[GitHub] spark pull request #19615: [SPARK-19611][SQL][followup] set dataSchema corre...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19615#discussion_r147951101
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
    @@ -164,13 +164,12 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
                 }
               }
     
    -          val (dataSchema, updatedTable) =
    -            inferIfNeeded(relation, options, fileFormat, Option(fileIndex))
    +          val updatedTable = inferIfNeeded(relation, options, fileFormat, Option(fileIndex))
    --- End diff --
    
    Yeah, we missed this. The variable name actually indicates that it is the data schema...


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    Thank you so much, @cloud-fan !


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83234/
    Test PASSed.


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    also cc @dongjoon-hyun @viirya 


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    LGTM


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    **[Test build #83234 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83234/testReport)** for PR 19615 at commit [`46f530f`](https://github.com/apache/spark/commit/46f530fe777c921d43a2f323abc91d8bb69423d5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19615: [SPARK-19611][SQL][followup] set dataSchema correctly in...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19615
  
    thanks, merging to master/2.2!


---
