Posted to dev@pig.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2011/07/22 21:14:57 UTC

[jira] [Updated] (PIG-2187) PigStorage should handle converting Tuple to text

     [ https://issues.apache.org/jira/browse/PIG-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated PIG-2187:
------------------------------

    Attachment: PIG-2187.patch

The patch adds a utility method in StorageUtils to convert a Tuple to Text. PigTextOutputFormat is now an alias for TextOutputFormat.

This looks clean; the only drawback is a couple of extra copies to make a Text from a Tuple. These copies didn't exist before. But PigTextInputFormat makes the same compromise, even though it handles an order of magnitude more data.
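
To illustrate where the extra copies come from, here is a minimal, self-contained sketch. It is not the code in PIG-2187.patch: the class and method names are hypothetical, a plain List stands in for a Pig Tuple, a byte[] stands in for Hadoop's Text, and writing a null field as an empty string is an assumption of the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a StorageUtils-style helper; not the actual patch.
class TupleToTextSketch {

    // Serializes the fields of a "tuple" into one delimited UTF-8 line.
    static byte[] tupleToBytes(List<?> fields, byte delimiter) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                buf.write(delimiter);
            }
            Object field = fields.get(i);
            if (field != null) { // assumption: a null field becomes an empty string
                byte[] utf8 = field.toString().getBytes(StandardCharsets.UTF_8);
                buf.write(utf8, 0, utf8.length); // extra copy 1: field bytes into the buffer
            }
        }
        return buf.toByteArray(); // extra copy 2: buffer into the final value
    }

    public static void main(String[] args) {
        byte[] line = tupleToBytes(Arrays.asList("alice", 42, null, 3.5), (byte) '\t');
        System.out.println(new String(line, StandardCharsets.UTF_8));
    }
}
```

Writing each field straight to the output stream, as PigTextOutputFormat did, avoids both copies. The trade-off is that an intermediate value like this can be handed to any output format that accepts text.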

> PigStorage should handle converting Tuple to text
> -------------------------------------------------
>
>                 Key: PIG-2187
>                 URL: https://issues.apache.org/jira/browse/PIG-2187
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>         Attachments: PIG-2187.patch
>
>
> Currently it is simple for users to use a different TextInputFormat with PigStorage, since the PigStorage loader just expects one line at a time and takes care of parsing the text into a tuple.
> This is not the case on the storage side. PigTextOutputFormat handles the conversion to Text (it actually writes the UTF-8 bytes of each field from the tuple directly to the output). This implies that a different TextOutputFormat cannot be used with PigStorage.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira