Posted to user@pig.apache.org by unmesha sreeveni <un...@gmail.com> on 2014/05/13 05:21:47 UTC
Inserting a field back to same position in a bag
Inserting a field back to same position in a bag <http://stackoverflow.com/questions/22957813/inserting-a-field-back-to-same-position-in-a-bag>
I have an input file, inputfile.csv <http://pastebin.com/jbgPDNYP>. My aim is
to split one field out of the Data bag and, after some other processing, join
it back into the same position. This is what I have done so far:
Data = load '$input' using PigStorage('$delimiter');
rankedoriginaldata = rank Data;
numericdata = foreach rankedoriginaldata generate $0,$split;
Run command:
pig -x local -f seperator.pig -param input=data/StringNum.csv -param output=OUT/Numericfile -param delimiter="," -param split='$2'
dump rankedoriginaldata <http://pastebin.com/cPj8Csxq>
dump numericdata <http://pastebin.com/eSjazap8>
The above script splits $2 from the Data bag and adds it to the numericdata
bag along with a row id, in order to keep an id for joining back.
1. From the Data bag I need to exclude $split (e.g. $2) and copy the remaining
fields to another bag.
Expected data, i.e. duplicate data excluding $2: <http://pastebin.com/kq01HZPB>
2. I need to join the numeric data back into its exact position and get the
inputdata <http://pastebin.com/jbgPDNYP> back as the result.
How can I do this? Please suggest a better way.
I am able to do this for a specific file, but I want the script to be generic
so that I can use it for any input file and perform the same operation.
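For illustration only, the intended round trip can be sketched in plain Python rather than Pig. This is not a Pig solution: the function names split_column and rejoin_column, the sample rows, and the 0-based column index are all assumptions made up for this sketch. It only shows the behaviour being asked for: pull one column out together with a row id, keep the remaining fields, then join on the row id and put the value back at its original position.

```python
# Illustration only: plain Python, not Pig. All names here are hypothetical.

def split_column(rows, index):
    """Split column `index` out of every row.

    Returns (rest, extracted):
      rest      -> list of (row_id, row without column `index`)
      extracted -> list of (row_id, value of column `index`)
    The row id plays the role of the rank added by RANK in the Pig script.
    """
    rest, extracted = [], []
    for row_id, row in enumerate(rows, start=1):
        extracted.append((row_id, row[index]))
        rest.append((row_id, row[:index] + row[index + 1:]))
    return rest, extracted


def rejoin_column(rest, extracted, index):
    """Join on row id and insert each value back at position `index`."""
    values = dict(extracted)  # row_id -> extracted value
    return [row[:index] + [values[row_id]] + row[index:] for row_id, row in rest]


# Made-up sample data standing in for the CSV input.
rows = [["a", "x", "1", "p"], ["b", "y", "2", "q"]]
rest, extracted = split_column(rows, 2)
restored = rejoin_column(rest, extracted, 2)
```

The column index is the only file-specific input here, which is the kind of parameterisation wanted from the Pig script via -param split.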
I hope someone can help me out with some hints.
--
*Thanks & Regards *
*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/