Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2016/12/01 21:19:58 UTC
[jira] [Resolved] (PIG-4901) To use Multistorage for each Group
[ https://issues.apache.org/jira/browse/PIG-4901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-4901.
-----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
+1. Patch committed to trunk. Thanks Adam!
> To use Multistorage for each Group
> ----------------------------------
>
> Key: PIG-4901
> URL: https://issues.apache.org/jira/browse/PIG-4901
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Affects Versions: 0.11.1, 0.16.0
> Environment: Hadoop 1.2.0
> Reporter: Divya
> Assignee: Adam Szita
> Priority: Minor
> Fix For: 0.17.0
>
> Attachments: PIG-4901.2.patch, PIG-4901.patch
>
>
> I am trying to group my data and store it in HDFS, with a folder for each 'name' and a subfolder for each 'YearMonth' under each name folder.
> Input:
> (Date) (name) (col3) (col4)
> 2015-02-02 abc y z
> 2016-01-02 xyz i j
> 2015-03-02 abc f b
> 2015-02-06 abc y z
> 2016-03-02 xyz a q
>
> Expected output in HDFS:
> abc folder
>   -> 201502 subfolder
>        2015-02-02 abc y z
>        2015-02-06 abc y z
>   -> 201503 subfolder
>        2015-03-02 abc f b
> xyz folder
>   -> 201601 subfolder
>        2016-01-02 xyz i j
>   -> 201603 subfolder
>        2016-03-02 xyz a q
> I am not sure how to use the MultiStorage option on the name column after grouping the tuples by date.
> Any help is appreciated.
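
The layout requested above amounts to deriving a two-level key, name/YearMonth, from each record. The following is a minimal Python sketch of that key derivation only, not of the Pig or MultiStorage API; the helper names partition_path and partition are hypothetical and exist purely for illustration. Whether a given Pig version's MultiStorage accepts such a nested key is not confirmed here.

```python
from collections import defaultdict


def partition_path(date_str, name):
    """Return the relative 'name/YYYYMM' directory for one record (hypothetical helper)."""
    year, month, _day = date_str.split("-")
    return f"{name}/{year}{month}"


def partition(records):
    """Group whitespace-delimited records by their 'name/YYYYMM' directory."""
    layout = defaultdict(list)
    for line in records:
        # The first two fields are the date and the name, as in the sample input.
        date_str, name = line.split()[:2]
        layout[partition_path(date_str, name)].append(line)
    return dict(layout)


# Sample input taken from the issue description.
records = [
    "2015-02-02 abc y z",
    "2016-01-02 xyz i j",
    "2015-03-02 abc f b",
    "2015-02-06 abc y z",
    "2016-03-02 xyz a q",
]

layout = partition(records)
for directory in sorted(layout):
    print(directory)
    for row in layout[directory]:
        print("    " + row)
```

In a Pig script the same idea would mean adding a derived YearMonth column next to the name column before the store step; the sketch only shows which records end up under which directory.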
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)