Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2016/07/29 06:29:20 UTC
[jira] [Comment Edited] (SPARK-16777) Parquet schema converter depends on deprecated APIs

    [ https://issues.apache.org/jira/browse/SPARK-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398781#comment-15398781 ] 

Hyukjin Kwon edited comment on SPARK-16777 at 7/29/16 6:29 AM:
---------------------------------------------------------------

Let me leave a note here, since I have looked into this before :)

I guess this issue is about the warnings below:

{code}
[WARNING] .../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala:448: method listType in object ConversionPatterns is deprecated: see corresponding Javadoc for more information.
[WARNING]         ConversionPatterns.listType(
[WARNING]                            ^
[WARNING] .../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala:464: method listType in object ConversionPatterns is deprecated: see corresponding Javadoc for more information.
[WARNING]         ConversionPatterns.listType(
[WARNING]                            ^
{code}

This should not be changed unless we drop backward compatibility with Spark prior to 1.4.x, because {{listOfElements}}, the replacement for {{listType}}, checks that the element of Parquet's {{LIST}} is named {{element}} in the Parquet schema and throws an exception if it is not.

It seems that Spark prior to 1.4.x writes {{ArrayType}} as Parquet's {{LIST}}, but with {{array}} as the element name.

Therefore, switching to {{listOfElements}} throws the exception below:

{code}
List element type must be named 'element'
java.lang.IllegalArgumentException: List element type must be named 'element'
	at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:55)
	at org.apache.parquet.schema.ConversionPatterns.listOfElements(ConversionPatterns.java:123)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:448)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convert$1.apply(ParquetSchemaConverter.scala:313)
	at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convert$1.apply(ParquetSchemaConverter.scala:313)
{code}
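
To make the failure mode concrete, here is a minimal, self-contained model of the name check described above. It is only a sketch in plain Python, not the parquet-mr API: {{Field}}, {{list_type}}, {{list_of_elements}} and the list name {{f}} are hypothetical stand-ins, and the returned dictionary is a simplification of a real Parquet schema. The only behavior taken from this issue is that the stricter method rejects an element that is not named {{element}}, while the deprecated one accepts it.

```python
from dataclasses import dataclass


# Simplified stand-in for illustration only; this is NOT a parquet-mr class.
@dataclass(frozen=True)
class Field:
    name: str       # field name as it appears in the Parquet schema
    type_name: str  # e.g. "int32"


ELEMENT_NAME = "element"


def list_type(list_name: str, element: Field) -> dict:
    """Models the deprecated, lenient path: any element name is accepted."""
    return {"name": list_name, "annotation": "LIST", "element": element}


def list_of_elements(list_name: str, element: Field) -> dict:
    """Models the stricter replacement: the element must be named 'element'."""
    if element.name != ELEMENT_NAME:
        # Same message as in the stack trace above.
        raise ValueError("List element type must be named 'element'")
    return list_type(list_name, element)


# Element naming attributed above to Spark prior to 1.4.x.
legacy = Field(name="array", type_name="int32")
# Element naming that the stricter check expects.
standard = Field(name="element", type_name="int32")

list_type("f", legacy)           # accepted by the lenient path
list_of_elements("f", standard)  # accepted by the strict path

try:
    list_of_elements("f", legacy)
except ValueError as e:
    print(e)  # prints: List element type must be named 'element'
```

In this model, the legacy {{array}} name passes through the lenient path but is rejected by the strict one, which mirrors why swapping the deprecated call for its replacement would break the legacy layout.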



> Parquet schema converter depends on deprecated APIs
> ---------------------------------------------------
>
>                 Key: SPARK-16777
>                 URL: https://issues.apache.org/jira/browse/SPARK-16777
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: holdenk
>



