Posted to user@pig.apache.org by unmesha sreeveni <un...@gmail.com> on 2014/05/13 05:21:47 UTC

Inserting a field back to same position in a bag

Stack Overflow question:
<http://stackoverflow.com/questions/22957813/inserting-a-field-back-to-same-position-in-a-bag>

I have an input file, inputfile.csv <http://pastebin.com/jbgPDNYP>. My aim
is to split one field out of the Data bag and, after some other processing,
join it back at the same position. This is what I have done so far:

Data = load '$input' using PigStorage('$delimiter');
rankedoriginaldata = rank Data;
numericdata = foreach rankedoriginaldata generate $0,$split;

Run command

pig -x local -f seperator.pig -param input=data/StringNum.csv \
    -param output=OUT/Numericfile -param delimiter="," -param split='$2'

dump rankedoriginaldata <http://pastebin.com/cPj8Csxq>

dump numericdata <http://pastebin.com/eSjazap8>

The script above extracts $2 from the Data bag and adds it to the
numericdata bag along with the row id, in order to keep an id for joining
back.
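
For readers following along, here is a minimal plain-Python model of what the
script does. It is not Pig code, and the sample rows are made up for
illustration (the real input is in the pastebin links). It assumes that rank
prepends the row id as the first field, which means every original position
shifts by one in the ranked relation: $2 there is the field that was $1 in
Data.

```python
# Plain-Python model of the Pig script above (illustration only, not Pig).
# The sample rows below are hypothetical.

def rank_rows(rows):
    """Mimic `rank`: prepend a 1-based row id to every row."""
    return [(i, *row) for i, row in enumerate(rows, start=1)]

def project(ranked, position):
    """Mimic `generate $0, $position`: keep the row id and one field."""
    return [(row[0], row[position]) for row in ranked]

data = [("a", "10", "x"), ("b", "20", "y")]
ranked = rank_rows(data)       # row id is now field 0, old $1 is now $2
numeric = project(ranked, 2)   # same as split='$2' in the run command
print(ranked)
print(numeric)
```

With these sample rows, position 2 of the ranked rows picks the values "10"
and "20", that is, the second column of the original data.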

1. From the Data bag I need to exclude $split (e.g. $2) and copy the
remaining fields to another bag.

Expected data (a duplicate of the input, excluding $2):
<http://pastebin.com/kq01HZPB>

2. I need to join the numeric data back at its exact position and get
inputdata <http://pastebin.com/jbgPDNYP> back as the result.
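
The two steps can be sketched in plain Python to pin down the intended
behaviour. This is only an illustration of the logic, not a Pig solution; the
function names and the sample rows are hypothetical, and it assumes each row
already carries a unique row id in field 0 (as after rank).

```python
# Sketch of steps 1 and 2 in plain Python (illustration only, not Pig).
# Rows are tuples whose field 0 is a unique row id; `position` must be >= 1.

def split_field(ranked, position):
    """Step 1: remove one field per row, remembering it by row id."""
    rest = [row[:position] + row[position + 1:] for row in ranked]
    picked = {row[0]: row[position] for row in ranked}
    return rest, picked

def reinsert(rest, picked, position):
    """Step 2: put each remembered field back at its original position."""
    restored = []
    for row in rest:
        row_id = row[0]
        restored.append(row[:position] + (picked[row_id],) + row[position:])
    return restored

# Hypothetical ranked rows: (row id, field, field, field)
ranked = [(1, "a", "10", "x"), (2, "b", "20", "y")]
rest, picked = split_field(ranked, 2)
restored = reinsert(rest, picked, 2)

# Dropping the row id gives back rows in the original layout.
original = [row[1:] for row in restored]
print(rest)
print(original)
```

Because the position is a parameter and the row length is never hard-coded,
the same logic works for any number of columns, which is the behaviour wanted
from the Pig script.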

How can I do this? Please suggest a better way.

I am able to do this for a specific file, but I want the script to be
generic so that I can use it for any input file and perform the same
operation.


I hope someone can help me out with some hints.

-- 
*Thanks & Regards *


*Unmesha Sreeveni U.B*
*Hadoop, Bigdata Developer*
*Center for Cyber Security | Amrita Vishwa Vidyapeetham*
http://www.unmeshasreeveni.blogspot.in/