Posted to hdfs-user@hadoop.apache.org by Siddharth Tiwari <si...@live.com> on 2012/08/20 18:33:40 UTC

Streaming issue ( URGENT )

Hi team,




I have a Python script which normally runs like this locally:


python mapper.py file1 file2 2


How can I achieve this with the streaming API, using the script as the mapper? It joins the three files on a column which is passed as a (numeric) parameter.



Also, how can I use the paste command in a mapper to concatenate three files?


For example: paste file1 file2 file3 > file4


That is how it works in a normal shell.


How can I achieve the same over streaming?

If possible, please explain how I can achieve it using multiple mappers and one reducer. It would be great if I could get some examples; I have tried searching a lot :(



Thanks in advance, please help.

*------------------------*

Cheers !!!

Siddharth Tiwari

Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."

"Maybe other people will try to limit me but I don't limit myself"
 		 	   		  

Re: Streaming issue ( URGENT )

Posted by Bejoy Ks <be...@gmail.com>.
Hi Siddharth

Joins are better implemented in Hive and Pig. Try checking those out and
see whether they fit your requirements.

If you still want to implement the join in plain MapReduce, you can
take a look at this example, which uses MultipleInputs:
http://kickstarthadoop.blogspot.in/2011/09/joins-with-plain-map-reduce.html

Regards
Bejoy KS
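
For the streaming route asked about in the original question, a reduce-side join could be sketched roughly as below. This is only a sketch under several assumptions, not the approach from the linked post: the file name join.py, the functions map_line and reduce_lines, tab-separated input, and a zero-based key column are all made up for illustration. It relies on Hadoop Streaming exporting job configuration as environment variables with dots replaced by underscores, so the current input file should be visible as mapreduce_map_input_file (map_input_file on older releases). The mapper tags every record with its join key and source file; the reducer receives the records sorted by key and joins those that share one.

```python
import os
import sys
from itertools import groupby, product


def map_line(line, source, key_col, sep="\t"):
    """Emit 'key <sep> source <sep> remaining fields' for one input record."""
    fields = line.rstrip("\n").split(sep)
    key = fields[key_col]  # key_col is zero-based (an assumption)
    rest = fields[:key_col] + fields[key_col + 1:]
    return sep.join([key, source] + rest)


def reduce_lines(lines, sources, sep="\t"):
    """Inner-join tagged records sharing a key; lines must be sorted by key."""
    parsed = (l.rstrip("\n").split(sep) for l in lines)
    for key, group in groupby(parsed, key=lambda f: f[0]):
        by_source = {}
        for fields in group:
            by_source.setdefault(fields[1], []).append(fields[2:])
        # Skip keys that are missing from any of the expected files.
        if not all(s in by_source for s in sources):
            continue
        # Cross product handles keys that repeat within a file.
        for combo in product(*(by_source[s] for s in sources)):
            row = [key]
            for part in combo:
                row.extend(part)
            yield sep.join(row)


if __name__ == "__main__" and len(sys.argv) > 1:
    mode = sys.argv[1]
    if mode == "map":
        key_col = int(sys.argv[2])
        # Streaming exposes the input path through the environment.
        path = (os.environ.get("mapreduce_map_input_file")
                or os.environ.get("map_input_file", ""))
        source = os.path.basename(path)
        for line in sys.stdin:
            if line.strip():
                print(map_line(line, source, key_col))
    elif mode == "reduce":
        # Remaining arguments are the file names, in output column order.
        for row in reduce_lines(sys.stdin, sys.argv[2:]):
            print(row)
```

With the streaming jar, the script would be passed along the lines of -mapper "python join.py map 1" and -reducer "python join.py reduce file1 file2 file3", shipped with -file join.py, with all three files given as -input paths and the number of reducers set to one. Note that paste-style concatenation by line position does not map cleanly onto this pattern, because mappers see splits of a file independently and line order across files is not preserved; a shared key column is what makes the join possible.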
