Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2016/07/29 06:29:20 UTC
[jira] [Comment Edited] (SPARK-16777) Parquet schema converter depends on deprecated APIs
[ https://issues.apache.org/jira/browse/SPARK-16777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398781#comment-15398781 ]
Hyukjin Kwon edited comment on SPARK-16777 at 7/29/16 6:29 AM:
---------------------------------------------------------------
Please let me leave a note because I actually took a look before :)
I guess it is about the warnings below:
{code}
[WARNING] .../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala:448: method listType in object ConversionPatterns is deprecated: see corresponding Javadoc for more information.
[WARNING] ConversionPatterns.listType(
[WARNING] ^
[WARNING] .../spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala:464: method listType in object ConversionPatterns is deprecated: see corresponding Javadoc for more information.
[WARNING] ConversionPatterns.listType(
[WARNING] ^
{code}
This should not be changed unless we drop backward compatibility for Spark prior to 1.4.x, because {{listOfElements}}, the replacement for the deprecated {{listType}}, checks that the element of Parquet's {{LIST}} is named {{element}} in the Parquet schema and throws an exception if it is not.
It seems Spark prior to 1.4.x writes {{ArrayType}} with Parquet's {{LIST}} but with {{array}} as its element name.
Therefore, changing this will throw an exception as below:
{code}
List element type must be named 'element'
java.lang.IllegalArgumentException: List element type must be named 'element'
at org.apache.parquet.Preconditions.checkArgument(Preconditions.java:55)
at org.apache.parquet.schema.ConversionPatterns.listOfElements(ConversionPatterns.java:123)
at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:448)
at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter.convertField(ParquetSchemaConverter.scala:321)
at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convert$1.apply(ParquetSchemaConverter.scala:313)
at org.apache.spark.sql.execution.datasources.parquet.ParquetSchemaConverter$$anonfun$convert$1.apply(ParquetSchemaConverter.scala:313)
{code}
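To make the difference concrete, here is a minimal, self-contained sketch of the name check described above. It is a simplified model, not Parquet's code: the class {{ListElementNameModel}} and the methods {{lenientList}} and {{strictList}} are hypothetical stand-ins for {{listType}} and {{listOfElements}}, and only the exception message is taken from the stack trace.

```java
// Simplified model of the element-name check; this is NOT Parquet's implementation.
// All names here are hypothetical and exist only for illustration.
final class ListElementNameModel {
    // The only element name the strict path accepts, per the exception message.
    static final String REQUIRED_NAME = "element";

    // Stand-in for the deprecated listType: accepts any element name.
    static String lenientList(String listName, String elementName) {
        return listName + " (LIST) -> " + elementName;
    }

    // Stand-in for listOfElements: rejects any element name other than "element".
    static String strictList(String listName, String elementName) {
        if (!REQUIRED_NAME.equals(elementName)) {
            throw new IllegalArgumentException("List element type must be named 'element'");
        }
        return lenientList(listName, elementName);
    }

    public static void main(String[] args) {
        // The older element name passes the lenient path.
        System.out.println(lenientList("f", "array"));
        // The standard element name passes the strict path.
        System.out.println(strictList("f", "element"));
        // The older element name is rejected by the strict path.
        try {
            strictList("f", "array");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this model, a schema that names its list element {{array}} only gets through the lenient path, which is why swapping the deprecated call for the strict one would break the older layout.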
> Parquet schema converter depends on deprecated APIs
> ---------------------------------------------------
>
> Key: SPARK-16777
> URL: https://issues.apache.org/jira/browse/SPARK-16777
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: holdenk
>