Posted to dev@pig.apache.org by "Jie Li (JIRA)" <ji...@apache.org> on 2012/07/18 01:46:33 UTC

[jira] [Commented] (PIG-2824) Pushing checking number of fields into LoadFunc

    [ https://issues.apache.org/jira/browse/PIG-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416727#comment-13416727 ] 

Jie Li commented on PIG-2824:
-----------------------------

The idea is similar to pushing projection down to the loader. We can create another interface (say, LoadCheckSchema); LoadFuncs that implement it will check the number of fields themselves, and for all other loaders Pig will insert a Foreach as usual.
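
To make the proposal concrete, here is a rough sketch in plain Java with no Pig dependencies. LoadCheckSchema is only the name suggested above, not an existing Pig interface; setExpectedFieldCount, FieldCountSketch, fitToSchema and needsForeach are hypothetical names made up for illustration. It shows the two pieces: the pad/truncate check a loader would do itself, and the planner-side decision of whether a Foreach is still needed.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical opt-in interface (name taken from the comment above).
 * A loader implementing it promises to return exactly the declared
 * number of fields, so no extra Foreach is required.
 */
interface LoadCheckSchema {
    // Hypothetical method: the planner tells the loader how many fields the schema declares.
    void setExpectedFieldCount(int expected);
}

final class FieldCountSketch {
    private FieldCountSketch() {}

    /**
     * The check described in the issue: pad with nulls when there are
     * fewer fields than declared, drop the extras when there are more.
     */
    static List<Object> fitToSchema(List<?> fields, int expected) {
        List<Object> out = new ArrayList<>(expected);
        for (int i = 0; i < expected; i++) {
            out.add(i < fields.size() ? fields.get(i) : null);
        }
        return out;
    }

    /** Planner-side decision: fall back to a Foreach unless the loader opted in. */
    static boolean needsForeach(Object loader) {
        return !(loader instanceof LoadCheckSchema);
    }
}
```

A loader that implements the interface would call something like fitToSchema on each tuple before returning it, which should be much cheaper than running a separate Foreach operator for the same check.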
                
> Pushing checking number of fields into LoadFunc
> -----------------------------------------------
>
>                 Key: PIG-2824
>                 URL: https://issues.apache.org/jira/browse/PIG-2824
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.9.0, 0.10.0
>            Reporter: Jie Li
>         Attachments: 2824.png
>
>
> As described in PIG-1188, if users define a schema (with or without types), we need to check the number of fields after loading data: if there are fewer fields than declared we need to pad with null fields, and if there are more we need to throw the extras away.
> For a schema with types, Pig used to insert a Foreach after the loader for type casting, which also checks the number of fields. For a schema without types there was no such Foreach, so PIG-1188 inserted one just for checking the number of fields. Unfortunately, a Foreach is too expensive for such a check, and ideally we can push it into the loader.
