Posted to dev@pig.apache.org by "Divya (JIRA)" <ji...@apache.org> on 2016/05/20 15:38:12 UTC
[jira] [Created] (PIG-4901) To use Multistorage for each Group
Divya created PIG-4901:
--------------------------
Summary: To use Multistorage for each Group
Key: PIG-4901
URL: https://issues.apache.org/jira/browse/PIG-4901
Project: Pig
Issue Type: Task
Components: piggybank
Affects Versions: 0.11.1
Environment: Hadoop 1.2.0
Reporter: Divya
Priority: Minor
I am trying to group my data and store it in HDFS with one folder per 'name' and, under each name folder, one subfolder per 'YearMonth' (derived from the Date column).
Input:
(Date) (name) (col3) (col4)
2015-02-02 abc y z
2016-01-02 xyz i j
2015-03-02 abc f b
2015-02-06 abc y z
2016-03-02 xyz a q
Expected output in HDFS:
abc folder
  -> 201502 subfolder
       2015-02-02 abc y z
       2015-02-06 abc y z
  -> 201503 subfolder
       2015-03-02 abc f b
xyz folder
  -> 201601
       2016-01-02 xyz i j
  -> 201603
       2016-03-02 xyz a q
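To make the expected layout concrete, here is a minimal Python sketch (not Pig, and not a MultiStorage solution) of the partitioning I am after: each row is routed to a 'name/YearMonth' folder, where YearMonth is the first six digits of the date. The helper names partition_path and build_layout are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Sample rows from the input above: (date, name, col3, col4)
rows = [
    ("2015-02-02", "abc", "y", "z"),
    ("2016-01-02", "xyz", "i", "j"),
    ("2015-03-02", "abc", "f", "b"),
    ("2015-02-06", "abc", "y", "z"),
    ("2016-03-02", "xyz", "a", "q"),
]


def partition_path(date, name):
    """Return the relative folder 'name/YearMonth' for one row (hypothetical helper)."""
    year, month, _day = date.split("-")
    return f"{name}/{year}{month}"


def build_layout(rows):
    """Group rows under their 'name/YearMonth' folder, preserving input order."""
    layout = defaultdict(list)
    for row in rows:
        date, name = row[0], row[1]
        layout[partition_path(date, name)].append(row)
    return dict(layout)


layout = build_layout(rows)
for folder in sorted(layout):
    print(folder)
    for row in layout[folder]:
        print("  " + " ".join(row))
```

In other words, the split key is the pair (name, YearMonth), not the name column alone.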
I am not sure how to apply the piggybank MultiStorage option on the name column after grouping the tuples by date (YearMonth).
Any help is appreciated.