Posted to dev@parquet.apache.org by "li xiang (JIRA)" <ji...@apache.org> on 2015/08/26 08:47:45 UTC

[jira] [Commented] (PARQUET-365) Class Summary does not provide a getter to return inputSchema

    [ https://issues.apache.org/jira/browse/PARQUET-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712604#comment-14712604 ] 

li xiang commented on PARQUET-365:
----------------------------------

The patch is at https://github.com/apache/parquet-mr/pull/265. Please review.

> Class Summary does not provide a getter to return inputSchema
> -------------------------------------------------------------
>
>                 Key: PARQUET-365
>                 URL: https://issues.apache.org/jira/browse/PARQUET-365
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0, 1.7.0, 1.8.0
>            Reporter: li xiang
>            Priority: Critical
>              Labels: easyfix, patch
>             Fix For: 1.8.0
>
>
> In the Pig code (https://github.com/apache/pig/blob/trunk/src/org/apache/pig/EvalFunc.java), a private member "inputSchemaInternal" represents the schema. A setter and a getter are also provided:
> {code}
>     private Schema inputSchemaInternal=null;
>
>     /**
>      * This method is for internal use. It is called by Pig core in both front-end
>      * and back-end to setup the right input schema for EvalFunc
>      */
>     public void setInputSchema(Schema input){
>         this.inputSchemaInternal=input;
>     }
>
>     /**
>      * This method is intended to be called by the user in {@link EvalFunc} to get the input
>      * schema of the EvalFunc
>      */
>     public Schema getInputSchema(){
>         return this.inputSchemaInternal;
>     }
> {code}
> In parquet-mr/parquet-pig/src/main/java/parquet/pig/summary/Summary.java, class Summary extends EvalFunc. It uses a new member called inputSchema (as opposed to inputSchemaInternal in Pig's EvalFunc) to represent the schema and overrides setInputSchema(), but it does not override getInputSchema() to return inputSchema.
> {code}
> public class Summary extends EvalFunc<String> implements Algebraic {
>   // ...
>   private Schema inputSchema;
>   // ...
>   @Override
>   public void setInputSchema(Schema input) {
>     try {
>       // relation.bag.tuple
>       this.inputSchema=input.getField(0).schema.getField(0).schema;
>       saveSchemaToUDFContext();
>     } catch (FrontendException e) {
>       throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from " + input, e);
>     } catch (RuntimeException e) {
>       throw new RuntimeException("Usage: B = FOREACH (GROUP A ALL) GENERATE Summary(A); Can not get schema from "+input, e);
>     }
>   }
>   // ...
> }
> {code}
> If setInputSchema() of class Summary is called, inputSchema is set. But a subsequent call to getInputSchema() returns the value of inputSchemaInternal, which may still be null.
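
The behaviour described above can be reproduced without Pig on the classpath. The following is only a minimal sketch under stated assumptions: EvalFuncLike, SummaryLike, FixedSummaryLike and InputSchemaDemo are hypothetical stand-ins, a String replaces Pig's Schema type, and the overridden getter is one possible fix, not necessarily what the pull request does.

```java
// Hypothetical stand-in for org.apache.pig.EvalFunc (not the real class).
// A String replaces Pig's Schema so the sketch compiles on its own.
class EvalFuncLike {
    private String inputSchemaInternal = null;

    public void setInputSchema(String input) {
        this.inputSchemaInternal = input;
    }

    public String getInputSchema() {
        return this.inputSchemaInternal;
    }
}

// Mirrors the reported pattern: the setter is overridden and writes to a
// new field, while the getter is inherited and still reads the parent's field.
class SummaryLike extends EvalFuncLike {
    protected String inputSchema;

    @Override
    public void setInputSchema(String input) {
        this.inputSchema = input; // the parent's inputSchemaInternal is never written
    }
}

// One possible fix (an assumption): override the getter as well,
// so that it returns the field the setter actually writes.
class FixedSummaryLike extends SummaryLike {
    @Override
    public String getInputSchema() {
        return this.inputSchema;
    }
}

class InputSchemaDemo {
    public static void main(String[] args) {
        SummaryLike broken = new SummaryLike();
        broken.setInputSchema("a:int");
        System.out.println(broken.getInputSchema()); // prints "null"

        FixedSummaryLike fixed = new FixedSummaryLike();
        fixed.setInputSchema("a:int");
        System.out.println(fixed.getInputSchema()); // prints "a:int"
    }
}
```

An alternative would be for the overriding setter to also call super.setInputSchema(input), but then the getter would return the outer schema rather than the nested one stored in inputSchema, so which behaviour is wanted is a design decision for the maintainers.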



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)