Posted to hdfs-user@hadoop.apache.org by Siddharth Tiwari <si...@live.com> on 2012/08/20 18:33:40 UTC
Streaming issue ( URGENT )
Hi team,
I have a Python script which normally runs locally like this:
python mapper.py file1 file2 2
How can I achieve this with the streaming API, using the script as the mapper? It joins the three files on a column whose (numeric) index is passed as a parameter.
Also, how can I use the paste command in a mapper to concatenate three files? For example:
paste file1 file2 file3 > file4
That is how it works in a normal shell.
How can I achieve the same over streaming?
If possible, please explain how I can achieve it using multiple mappers and one reducer. It would be great if I could get some examples; I tried searching a lot :(
Thanks in advance, please help.
*------------------------*
Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."
"Maybe other people will try to limit me but I don't limit myself"
Re: Streaming issue ( URGENT )
Posted by Bejoy Ks <be...@gmail.com>.
Hi Siddharth
Joins are better implemented in Hive and Pig. Try checking those out and
see whether they fit your requirements.
If you still want to implement the join in plain MapReduce, you can
take a look at this example, which uses MultipleInputs:
http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html
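For the streaming case asked about, a reduce-side join could be sketched roughly as below. This is only a minimal sketch under assumptions, not the code from the linked post: the script name join.py, the "map"/"reduce" roles, the zero-based key column, and tab-separated input are all hypothetical choices made for illustration. It relies on Hadoop Streaming exporting job properties to the script as environment variables with dots replaced by underscores (map_input_file on older releases, mapreduce_map_input_file on newer ones), so each mapper can tag its records with the file they came from.

```python
#!/usr/bin/env python
# join.py: hypothetical name; a sketch of a reduce-side join for Hadoop Streaming.
import os
import sys
from itertools import groupby


def source_name():
    """Return the base name of the file this map task is reading."""
    # Streaming exposes job properties as env vars, dots become underscores.
    path = (os.environ.get("mapreduce_map_input_file")
            or os.environ.get("map_input_file", ""))
    return os.path.basename(path)


def map_lines(lines, key_col, source, sep="\t"):
    """Emit key<TAB>source<TAB>original line for every usable record."""
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split(sep)
        if not line or len(fields) <= key_col:
            continue  # skip blank or short records
        yield "\t".join((fields[key_col], source, line))


def reduce_lines(lines, sources):
    """Join records that share a key; input must already be sorted by key."""
    parsed = (line.rstrip("\n").split("\t", 2) for line in lines)
    for key, group in groupby(parsed, key=lambda rec: rec[0]):
        by_source = {}
        for _, src, rest in group:
            by_source.setdefault(src, []).append(rest)
        # Inner join: emit only keys present in every source.
        # Assumption: keys are unique within a file, so take the first record.
        if all(s in by_source for s in sources):
            yield "\t".join([key] + [by_source[s][0] for s in sources])


def main(argv):
    if len(argv) >= 3 and argv[1] == "map":
        out = map_lines(sys.stdin, int(argv[2]), source_name())
    elif len(argv) >= 3 and argv[1] == "reduce":
        out = reduce_lines(sys.stdin, argv[2:])
    else:
        out = []  # no role given: do nothing
    for record in out:
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

With the streaming jar of your installation, this would be wired up with something like -mapper "join.py map 2" and -reducer "join.py reduce file1 file2 file3", shipping the script with -file join.py and listing all three files under -input. Many map tasks tag the records and the framework sorts them by key, so a single reducer sees all rows for one key together, which is the "multiple mappers and one reducer" layout asked about.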
Regards
Bejoy KS