You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Chris Pesto (JIRA)" <ji...@apache.org> on 2011/05/27 20:56:47 UTC

[jira] [Updated] (PIG-2099) AllLoader in trunk does not work properly with JSON schemas

     [ https://issues.apache.org/jira/browse/PIG-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Pesto updated PIG-2099:
-----------------------------

    Tags: piggybank  (was: piggbank)

> AllLoader in trunk does not work properly with JSON schemas
> -----------------------------------------------------------
>
>                 Key: PIG-2099
>                 URL: https://issues.apache.org/jira/browse/PIG-2099
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.8.0, 0.8.1
>            Reporter: Chris Pesto
>
> The AllLoader in the Piggybank in trunk does not pass JSON-defined schemas to the child loaders it instantiates.  If the schema is defined in the LOAD function, when Pig calls getSchema on the AllLoader the AllLoader instantiates the child loader and calls the child's getSchema if it respects the LoadMetadata interface.  If the AllLoader finds the JSON schema in a file, it does not instantiate the child loader until prepareToRead is called, and the child does not receive the schema.  I have hacked this in by adding to the AllLoader:
>         transient String location = null;
>         transient Job job = null;
> then in AllLoader::setLocation:
>         this.location = location;
>         this.job = job;
> then in AllLoader::prepareToRead:
>         if (childLoadFunc instanceof LoadMetadata) {
>                 ((LoadMetadata) childLoadFunc).getSchema(location, job);
>         }
> Although I suspect it is not good practice to store the location/job in the class variables like that, I don't know a better way to fix this.
> ------
> Also, getFuncSpecFromContent in the accompanying LoadFuncHelper class with the AllLoader should be modified:
>         funcSpec = new FuncSpec("org.apache.pig.piggybank.storage.PigStorageSchema()");
> since it currently instantiates a normal PigStorage object, which does not understand pre-defined schemas.  The documentation for the AllLoader should reference PigStorageSchema instead of PigStorage as well.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira