Posted to dev@pig.apache.org by "mark meissonnier (JIRA)" <ji...@apache.org> on 2009/10/08 01:05:31 UTC
[jira] Commented: (PIG-760) Serialize schemas for PigStorage() and other storage types.
[ https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763294#action_12763294 ]
mark meissonnier commented on PIG-760:
--------------------------------------
Is there any new development on this issue? I find it painful to update the input schema in every "child" Pig script whenever I modify my "root" Pig script. I was thinking of developing something quick myself, but I figured someone might already have done something, or that I could help with the overall effort.
Please let me know.
Thanks
> Serialize schemas for PigStorage() and other storage types.
> -----------------------------------------------------------
>
> Key: PIG-760
> URL: https://issues.apache.org/jira/browse/PIG-760
> Project: Pig
> Issue Type: New Feature
> Reporter: David Ciemiewicz
>
> I'm finding PigStorage() really convenient for storage and data interchange because it compresses well and imports into Excel and other analysis environments well.
> However, it is a pain when it comes to maintenance because the columns are in fixed locations and I'd like to add columns in some cases.
> It would be great if a load using PigStorage() could read a default schema from a .schema file stored with the data, and if a store using PigStorage() could write a .schema file alongside the data.
> I have tested this out and both Hadoop HDFS and Pig in -exectype local mode will ignore a file called .schema in a directory of part files.
> So, for example, if I have a chain of Pig scripts I execute such as:
> A = load 'data-1' using PigStorage() as ( a: int , b: int );
> store A into 'data-2' using PigStorage();
> B = load 'data-2' using PigStorage();
> describe B;
> describe B should output something like { a: int, b: int }
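To make the proposal above concrete, here is a minimal sketch of the sidecar idea in Python. It is not Pig code and not how PigStorage() is implemented. The function names `store` and `load`, the JSON encoding of the schema, and the single `part-00000` file are all assumptions made for illustration; only the `.schema` file name and the "skip hidden files" behaviour come from the description.

```python
import json
import os
import tempfile

SCHEMA_FILE = ".schema"  # sidecar name from the proposal; its JSON content is an assumption


def store(rows, schema, out_dir, delimiter="\t"):
    """Write rows to one delimited part file and the schema to a .schema sidecar."""
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "part-00000"), "w") as f:
        for row in rows:
            f.write(delimiter.join(str(v) for v in row) + "\n")
    with open(os.path.join(out_dir, SCHEMA_FILE), "w") as f:
        json.dump(schema, f)  # list of [name, type] pairs


def load(in_dir, schema=None, delimiter="\t"):
    """Read all part files; use the stored schema as a default when none is given."""
    if schema is None:
        path = os.path.join(in_dir, SCHEMA_FILE)
        if os.path.exists(path):
            with open(path) as f:
                schema = [tuple(field) for field in json.load(f)]
    casts = {"int": int, "chararray": str}  # only two types, for illustration
    rows = []
    for name in sorted(os.listdir(in_dir)):
        if name.startswith("."):  # hidden files such as .schema are not data
            continue
        with open(os.path.join(in_dir, name)) as f:
            for line in f:
                values = line.rstrip("\n").split(delimiter)
                if schema:
                    values = [casts[t](v) for (_, t), v in zip(schema, values)]
                rows.append(tuple(values))
    return schema, rows


# Mirrors the script chain above: store with a schema, then load without one.
with tempfile.TemporaryDirectory() as tmp:
    data2 = os.path.join(tmp, "data-2")
    store([(1, 2), (3, 4)], [("a", "int"), ("b", "int")], data2)
    b_schema, b_rows = load(data2)
    print(b_schema, b_rows)
```

An explicit schema passed to `load` takes precedence over the stored one, which matches the "default schema" wording in the description.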