Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2021/07/07 17:21:01 UTC

[jira] [Commented] (BEAM-11908) Deprecate .withProjection from ParquetIO

    [ https://issues.apache.org/jira/browse/BEAM-11908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376726#comment-17376726 ] 

Beam JIRA Bot commented on BEAM-11908:
--------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 7 days. It is now automatically unassigned. If you are still working on it, you can assign it to yourself again. Please also give an update about the status of the work.

> Deprecate .withProjection from ParquetIO
> ----------------------------------------
>
>                 Key: BEAM-11908
>                 URL: https://issues.apache.org/jira/browse/BEAM-11908
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-parquet
>            Reporter: Ismaël Mejía
>            Priority: P3
>              Labels: starter
>             Fix For: 2.32.0
>
>
> There are multiple issues with the API of withProjection:
> 1. The current API requires an extra encoderSchema that is not needed when projecting data in Parquet. The simplest way to project with the Parquet API is to pass the projectionSchema to both calls, like this:
> {quote}AvroReadSupport.setAvroReadSchema(conf, projectionSchema);
> AvroReadSupport.setRequestedProjection(conf, projectionSchema);
> {quote}
> We can offer an alternative method `withProjection(Configuration conf, List<String> fields)` so users don't have to build their own projection Schema, but historically we have let users rely on the upstream connector API. If we follow this approach, we can better document in ParquetIO how to project fields by relying on the Parquet APIs, and avoid maintaining this extra code on the Beam side.
>  
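
For illustration only, here is a minimal sketch of the field-selection step that a `withProjection(Configuration conf, List<String> fields)` style helper would need. It is not Beam or Parquet code: the class name ProjectionSketch and the method project are hypothetical, and a plain map of field name to type label stands in for an Avro Schema so that the example has no dependencies. A real helper would presumably build an Avro Schema from the selected fields and pass it to the two AvroReadSupport calls quoted above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Beam, Avro, or Parquet.
final class ProjectionSketch {
    private ProjectionSketch() {}

    /**
     * Returns the entries of fullSchema whose names appear in fields,
     * in the order the names were requested.
     * Assumption: fullSchema maps field name to a type label and
     * stands in for an Avro record schema.
     */
    static Map<String, String> project(Map<String, String> fullSchema, List<String> fields) {
        Map<String, String> projected = new LinkedHashMap<>();
        for (String name : fields) {
            String type = fullSchema.get(name);
            if (type == null) {
                // Fail early instead of silently dropping a misspelled field.
                throw new IllegalArgumentException("Unknown field: " + name);
            }
            projected.put(name, type);
        }
        return projected;
    }

    public static void main(String[] args) {
        Map<String, String> full = new LinkedHashMap<>();
        full.put("id", "long");
        full.put("name", "string");
        full.put("payload", "bytes");
        // prints {id=long, name=string}
        System.out.println(project(full, List.of("id", "name")));
    }
}
```

Rejecting unknown field names is a design choice made for this sketch; the issue itself does not say how such a helper should treat them.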



--
This message was sent by Atlassian Jira
(v8.3.4#803005)