Posted to common-user@hadoop.apache.org by Alexey Tigarev <al...@gmail.com> on 2010/01/21 18:18:27 UTC

Re: Anyone has sample program for Hadoop Streaming using shell scripting for Map/Reduce

On Thu, Jan 21, 2010 at 8:53 AM, Sunil Kulkarni
<su...@persistent.co.in> wrote:
> I am new to hadoop. Presently, I am reading Hadoop Streaming related documents.
> Anyone has sample program Hadoop Streaming using shell script used for Map/Reduce.
> Please help me on this.

Here's an article with a simple example,
"Finding Similar Items with Amazon Elastic MapReduce, Python, and
Hadoop Streaming", Pete Skomoroch:
http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2294

Hope this helps.

-- 
Best regards, Alexey Tigarev
<ti...@nlp.od.ua> Jabber: tigra@jabber.od.ua Skype: t__gra

How a programmer can become a freelancer and earn their first $1000 on oDesk:
http://freelance-start.com/earn-first-1000-on-odesk

Re: Anyone has sample program for Hadoop Streaming using shell scripting for Map/Reduce

Posted by prasenjit mukherjee <pm...@quattrowireless.com>.
Here is a sample I am working on: a distributed S3-fetch Pig script that
streams its input through a Python script.

s3fetch.pig:
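-- Ship the Python script to the task nodes and register it as a streaming command.
DEFINE CMD `s3fetch.py` SHIP('/root/s3fetch.py');
-- Each input line is the name of one file to fetch from S3.
r1 = LOAD '/ip/s3fetch_input_files' AS (filename:chararray);
-- Group by file name so the work is spread over 5 reduce tasks, then flatten back.
grp_r1 = GROUP r1 BY filename PARALLEL 5;
r2 = FOREACH grp_r1 GENERATE FLATTEN(r1);
-- Pipe every file name through s3fetch.py and keep its output as a debug log.
r3 = STREAM r2 THROUGH CMD;
STORE r3 INTO '/op/s3fetch_debug_log';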

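And here is my s3fetch.py:
import subprocess
import sys

# Each input line is a file name; copy that file from S3 into HDFS.
for word in sys.stdin:
  word = word.rstrip()
  cmd = ('/usr/local/hadoop-0.20.0/bin/hadoop fs -cp '
         's3n://<s3-credentials>@bucket/dir-name/' + word + ' /ip/data/.')
  sys.stdout.write('\n\n' + word + ':\t' + cmd + '\n')
  # subprocess stands in for os.popen4 (removed in Python 3); stderr is merged into stdout.
  proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
  for line in proc.stdout:
    sys.stdout.write('\t' + word + '::\t' + line)
  proc.wait()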
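
For the original question about plain Hadoop Streaming, here is a minimal sketch of the contract a streaming mapper and reducer follow: each reads lines from stdin and writes tab-separated key/value lines to stdout, and the reducer receives its input sorted by key. The word-count task, the function names, and the script name wordcount.py are illustrative assumptions, not something taken from this thread or from the linked article.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Reducer: sum the counts per word; assumes input is sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield "%s\t%d" % (word, total)


# Hypothetical convention: run as "wordcount.py map" or "wordcount.py reduce".
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in step(sys.stdin):
        sys.stdout.write(out + "\n")
```

A script like this would be passed to the streaming jar through its -mapper and -reducer options; the jar location depends on the installation. A shell script follows the same stdin/stdout contract, so any executable that behaves this way can take the place of the Python functions.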


