Posted to dev@parquet.apache.org by "Kristoffer Sjögren (JIRA)" <ji...@apache.org> on 2016/09/22 11:57:20 UTC

[jira] [Commented] (PARQUET-697) ProtoMessageConverter fails for unknown proto fields

    [ https://issues.apache.org/jira/browse/PARQUET-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513086#comment-15513086 ] 

Kristoffer Sjögren commented on PARQUET-697:
--------------------------------------------

Any comments on this?

> ProtoMessageConverter fails for unknown proto fields
> ----------------------------------------------------
>
>                 Key: PARQUET-697
>                 URL: https://issues.apache.org/jira/browse/PARQUET-697
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.8.1
>            Reporter: Kristoffer Sjögren
>
> Hi
> We have a Spark application that reads Parquet files and turns them into a Protobuf RDD, as in the code below [1]. However, if the Parquet schema contains fields that don't exist in the protobuf class, an IncompatibleSchemaModificationException [2] is thrown.
> For compatibility reasons it would be nice to be able to ignore such fields instead of throwing an exception, perhaps via a configuration option. The fix for ignoring fields is quite easy: just instantiate an empty PrimitiveConverter instead [3].
> Cheers,
> -Kristoffer
> [1]
> JobConf conf = new JobConf(ctx.hadoopConfiguration());
> FileInputFormat.setInputPaths(conf, rawPath);
> ProtoReadSupport.setProtobufClass(conf, Msg.class.getName());
> NewHadoopRDD<Void, Msg.Builder> rdd =
>       new NewHadoopRDD(ctx.sc(), ProtoParquetInputFormat.class, Void.class, Msg.Builder.class, conf);
> rdd.toJavaRDD().foreach(log -> {
>   System.out.println(log._2);
> });
> [2] https://github.com/apache/parquet-mr/blob/master/parquet-protobuf/src/main/java/org/apache/parquet/proto/ProtoMessageConverter.java#L84
> [3] converters[parquetFieldIndex - 1] = new PrimitiveConverter() {};


