Posted to common-user@hadoop.apache.org by Prasan Ary <vo...@yahoo.com> on 2008/03/25 21:07:15 UTC
Map/reduce with input files on S3
I am running hadoop on EC2. I want to run a jar MR application on EC2 such that input and output files are on S3.
I configured hadoop-site.xml so that the fs.default.name property points to my S3 bucket with all required credentials (e.g. s3://<ID>:<secret key>@<bucket>). I created an input directory in this bucket and put an input file in it. Then I restarted Hadoop so that the new configuration takes effect.
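For reference, the setting described above lives in hadoop-site.xml; a minimal sketch of what it would look like (the bucket name and keys here are placeholders, not real values):

```xml
<!-- hadoop-site.xml: route the default filesystem to S3.
     YOUR_AWS_ID, YOUR_AWS_SECRET, and mybucket are placeholders. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>s3://YOUR_AWS_ID:YOUR_AWS_SECRET@mybucket</value>
  </property>
</configuration>
```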
When I try to run the jar file now, I get the message "Hook previously registered" and the application dies.
Any idea what might have gone wrong?
Thanks.
Re: Map/reduce with input files on S3
Posted by Prasan Ary <vo...@yahoo.com>.
I changed the configuration a little so that the MR jar file now runs on my local Hadoop cluster but takes its input files from S3.
I get the following output:
08/03/26 17:32:39 INFO mapred.FileInputFormat: Total input paths to process : 1
08/03/26 17:32:44 INFO mapred.JobClient: Running job: job_200803031605_0476
08/03/26 17:32:45 INFO mapred.JobClient: map 100% reduce 100%
Job failed!
A quick check of the JobTracker log file shows:
2008-03-26 17:32:45,001 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_200803031605_0476'
2008-03-26 17:19:18,301 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /192.168.0.240:54311
at org.apache.hadoop.ipc.Server.bind(Server.java:193)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:252)
at org.apache.hadoop.ipc.Server.<init>(Server.java:973)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:639)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:124)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2114)
2008-03-26 17:19:18,302 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at 192.168.0.240/192.168.0.240
************************************************************/
2008-03-26 17:32:10,784 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG: host = 192.168.0.240/192.168.0.240
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.16.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 618351; compiled by 'hadoopqa' on Mon Feb 4 19:29:11 UTC 2008
************************************************************/
2008-03-26 17:32:10,943 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to /192.168.0.240:54311
at org.apache.hadoop.ipc.Server.bind(Server.java:193)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:252)
at org.apache.hadoop.ipc.Server.<init>(Server.java:973)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:393)
at org.apache.hadoop.ipc.RPC.getServer(RPC.java:355)
at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:639)
at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:124)
at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2114)
2008-03-26 17:32:10,944 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at 192.168.0.240/192.168.0.240
************************************************************/
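The repeated BindException above is the generic "address already in use" failure: something is still bound to 192.168.0.240:54311, typically a previous JobTracker that never fully shut down, so the restarting JobTracker cannot take the port. The condition is easy to reproduce in miniature; this Python sketch (standing in for the Java server, not Hadoop code) binds one listener and then attempts a second bind on the same port:

```python
import socket

# Bind a listener on an OS-chosen free port, then try to bind a second
# socket to the same port -- this reproduces the condition behind the
# JobTracker's java.net.BindException ("Problem binding to ...:54311").
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
first.listen(1)
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))  # same port while `first` still holds it
    conflict = False
except OSError:                       # EADDRINUSE, analogous to BindException
    conflict = True
finally:
    second.close()
    first.close()

print(conflict)
```

On a real cluster the fix is usually to find and stop the stale process holding the port before restarting the JobTracker.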
Any help?
---------------------------------------------------------
Tom White <to...@gmail.com> wrote:
> I wonder if it is related to:
> https://issues.apache.org/jira/browse/HADOOP-3027
>
I think it is - the same problem is fixed for me when using HADOOP-3027.
Tom
Re: Map/reduce with input files on S3
Posted by Prasan Ary <vo...@yahoo.com>.
Owen,
Yes, I am using Hadoop 0.16.1.
No, the JIRA doesn't relate to my case.
The message "Hook previously registered" comes up only if I try to access files on S3 from my Java application running on EC2. The same application runs smoothly if the input file is copied to the EC2 image and accessed from there.
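"Hook previously registered" appears to be the error Java's Runtime.addShutdownHook raises when the same shutdown hook is registered a second time, which is the class of bug HADOOP-3027 addresses. A sketch of the defensive pattern, in Python for brevity (register_cleanup_once and close_filesystems are illustrative names, not Hadoop API):

```python
import atexit

_registered = set()

def register_cleanup_once(fn):
    """Register fn as a process-exit hook at most once (illustrative helper).

    Java's Runtime.addShutdownHook fails loudly ("Hook previously
    registered") on a duplicate registration; here the duplicate is
    simply ignored instead.
    """
    if fn in _registered:
        return False          # already installed: skip instead of failing
    _registered.add(fn)
    atexit.register(fn)
    return True

def close_filesystems():      # hypothetical cleanup, standing in for a
    pass                      # filesystem shutdown hook

print(register_cleanup_once(close_filesystems))  # first call registers
print(register_cleanup_once(close_filesystems))  # duplicate is ignored
```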
------------------------------------------------------------------------------
Owen O'Malley <oo...@yahoo-inc.com> wrote:
On Mar 25, 2008, at 1:07 PM, Prasan Ary wrote:
> I am running hadoop on EC2. I want to run a jar MR application on
> EC2 such that input and output files are on S3.
That is an expected case, although I haven't ever used EC2.
> When I try to run the jar file now, I get the message "Hook
> previously registered" and the application dies.
>
> Any Idea what might have gone wrong?
I wonder if it is related to:
https://issues.apache.org/jira/browse/HADOOP-3027
Are you running 0.16.1?
-- Owen