Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2010/08/22 15:23:17 UTC

[jira] Commented: (PIG-1237) Piggybank MutliStorage - specify field to write in output

    [ https://issues.apache.org/jira/browse/PIG-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901161#action_12901161 ] 

Dmitriy V. Ryaboy commented on PIG-1237:
----------------------------------------

Gerrit,
Sorry this fell through the cracks! Just noticed this ticket.

The ability to specify just one column seems very limited. Perhaps instead one could optionally specify whether to materialize the splitField? I think this would accomplish the same thing in a more general manner.

Also perhaps this warrants a second constructor, as introducing new arguments to the existing one will break backwards compatibility.
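
To illustrate the two suggestions above, here is a minimal Java sketch. It is not the Piggybank MultiStorage code: the class SplitStorageSketch, the flag name materializeSplitField, and the tab delimiter are all assumptions made for the example. The existing two-argument constructor keeps its signature and delegates to a new overload, so scripts written against the old form would keep working.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a storage function; NOT the real Piggybank class.
class SplitStorageSketch {
    private final String parentPath;
    private final int splitFieldIndex;
    private final boolean materializeSplitField;

    // Existing-style constructor: signature unchanged, split field is still written.
    SplitStorageSketch(String parentPath, int splitFieldIndex) {
        this(parentPath, splitFieldIndex, true);
    }

    // New overload (assumed): lets the caller choose whether the split field is written.
    SplitStorageSketch(String parentPath, int splitFieldIndex, boolean materializeSplitField) {
        this.parentPath = parentPath;
        this.splitFieldIndex = splitFieldIndex;
        this.materializeSplitField = materializeSplitField;
    }

    // Directory a tuple would be routed to, based on the value of the split field.
    String outputDir(List<String> tuple) {
        return parentPath + "/" + tuple.get(splitFieldIndex);
    }

    // Tab-separated output line, omitting the split field when it is not materialized.
    String format(List<String> tuple) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tuple.size(); i++) {
            if (i == splitFieldIndex && !materializeSplitField) {
                continue;
            }
            out.add(tuple.get(i));
        }
        return String.join("\t", out);
    }
}
```

Under these assumptions, new SplitStorageSketch("/out", 1) behaves as before, while new SplitStorageSketch("/out", 1, false) routes each tuple by field 1 but leaves that field out of the written record.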

> Piggybank MutliStorage - specify field to write in output
> ---------------------------------------------------------
>
>                 Key: PIG-1237
>                 URL: https://issues.apache.org/jira/browse/PIG-1237
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Gerrit Jansen van Vuuren
>            Assignee: Gerrit Jansen van Vuuren
>            Priority: Minor
>         Attachments: PIG-1237.patch
>
>
> I've modified the Piggybank MultiStorage class so that you can optionally specify the index of the field in each tuple to write to the output.
> This makes it possible to store records that carry metadata such as a sequence number or upload time, and then to combine these records into a single output that omits the metadata.
> e.g.
> 1: date type seq1 data
> 2: date type seq2 data
> then write output grouped by type and ordered by sequence:
> data
> data

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.