You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gengliangwang <gi...@git.apache.org> on 2018/08/20 08:21:51 UTC

[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

GitHub user gengliangwang opened a pull request:

    https://github.com/apache/spark/pull/22151

    [SPARK-25160][SQL]Avro: remove sql configuration spark.sql.avro.outputTimestampType

    ## What changes were proposed in this pull request?
    
    In the PR for supporting logical timestamp types https://github.com/apache/spark/pull/21935, a SQL configuration spark.sql.avro.outputTimestampType is added, so that user can specify the output timestamp precision they want.
    
    With PR https://github.com/apache/spark/pull/21847,  the output file can be written with user specified types. 
    
    So there is no need to have such trivial configuration. Otherwise to make it consistent we need to add configuration for all the Catalyst types that can be converted into different Avro types. 
    
    This PR also add a test case for user specified output schema with different timestamp types.
    
    ## How was this patch tested?
    
    Unit test


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark removeOutputTimestampType

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22151
    
----
commit 088bc83a7d5f767bcdfa98d72fd5bf6f1293f8f8
Author: Gengliang Wang <ge...@...>
Date:   2018-08-20T08:02:46Z

    remove sql configuration spark.sql.avro.outputTimestampType

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22151#discussion_r211243540
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
    @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable:
     
       private def newStructConverter(
           catalystStruct: StructType, avroStruct: Schema): InternalRow => Record = {
    -    if (avroStruct.getType != RECORD) {
    +    if (avroStruct.getType != RECORD || avroStruct.getFields.size() != catalystStruct.length) {
           throw new IncompatibleSchemaException(s"Cannot convert Catalyst type $catalystStruct to " +
             s"Avro type $avroStruct.")
         }
    -    val avroFields = avroStruct.getFields
    -    assert(avroFields.size() == catalystStruct.length)
    --- End diff --
    
    Is this check redundant? Why remove it now?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    Merged to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    **[Test build #94949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94949/testReport)** for PR 22151 at commit [`088bc83`](https://github.com/apache/spark/commit/088bc83a7d5f767bcdfa98d72fd5bf6f1293f8f8).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22151#discussion_r211246362
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
    @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable:
     
       private def newStructConverter(
           catalystStruct: StructType, avroStruct: Schema): InternalRow => Record = {
    -    if (avroStruct.getType != RECORD) {
    +    if (avroStruct.getType != RECORD || avroStruct.getFields.size() != catalystStruct.length) {
           throw new IncompatibleSchemaException(s"Cannot convert Catalyst type $catalystStruct to " +
             s"Avro type $avroStruct.")
         }
    -    val avroFields = avroStruct.getFields
    -    assert(avroFields.size() == catalystStruct.length)
    --- End diff --
    
    Ah, I miss it. :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    **[Test build #94949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94949/testReport)** for PR 22151 at commit [`088bc83`](https://github.com/apache/spark/commit/088bc83a7d5f767bcdfa98d72fd5bf6f1293f8f8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22151


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22151#discussion_r211246064
  
    --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroLogicalTypeSuite.scala ---
    @@ -179,7 +192,7 @@ class AvroLogicalTypeSuite extends QueryTest with SharedSQLContext with SQLTestU
         }
       }
     
    -  test("Logical type: user specified schema") {
    +  test("Logical type: user specified read schema") {
    --- End diff --
    
    nit: `Logical type: user specified read schema with different timestamp types`.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22151#discussion_r211246175
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
    @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable:
     
       private def newStructConverter(
           catalystStruct: StructType, avroStruct: Schema): InternalRow => Record = {
    -    if (avroStruct.getType != RECORD) {
    +    if (avroStruct.getType != RECORD || avroStruct.getFields.size() != catalystStruct.length) {
           throw new IncompatibleSchemaException(s"Cannot convert Catalyst type $catalystStruct to " +
             s"Avro type $avroStruct.")
         }
    -    val avroFields = avroStruct.getFields
    -    assert(avroFields.size() == catalystStruct.length)
    --- End diff --
    
    Ah, yea, strictly sounds unrelated tho.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2315/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22151: [SPARK-25160][SQL]Avro: remove sql configuration ...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22151#discussion_r211246084
  
    --- Diff: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroSerializer.scala ---
    @@ -201,13 +201,11 @@ class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable:
     
       private def newStructConverter(
           catalystStruct: StructType, avroStruct: Schema): InternalRow => Record = {
    -    if (avroStruct.getType != RECORD) {
    +    if (avroStruct.getType != RECORD || avroStruct.getFields.size() != catalystStruct.length) {
           throw new IncompatibleSchemaException(s"Cannot convert Catalyst type $catalystStruct to " +
             s"Avro type $avroStruct.")
         }
    -    val avroFields = avroStruct.getFields
    -    assert(avroFields.size() == catalystStruct.length)
    --- End diff --
    
    I think he moved this condition above.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22151: [SPARK-25160][SQL]Avro: remove sql configuration spark.s...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22151
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94949/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org