Posted to common-user@hadoop.apache.org by shangan <sh...@corp.kaixin001.com> on 2010/08/27 05:59:35 UTC

mapreduce attempts killed


From: shangan
Sent: 2010-08-27 11:43:32
To: hadoop-user
Cc:
Subject: mapreduce attempts killed

On some runs there are quite a lot of failed/killed task attempts, while on other runs of the same job there are none. Can anybody explain this? The failed attempts always take a long time, which obviously increases the overall job execution time. I have tried to trace these failures, but I can't find the corresponding attempt in the logs based on the information from the JobTracker, such as the entry below:

attempt                                task                              machine   state
attempt_201008261846_0016_m_000039_1   task_201008261846_0016_m_000039   vm203     KILLED

Sometimes I see a lot of errors like the following in the TaskTracker log:

2010-08-27 10:17:39,772 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 10.11.2.203:58713, dest: 10.11.2.163:55856, bytes: 0, op: MAPRED_SHUFFLE, cliID: attempt_201008261846_0011_m_000029_0
2010-08-27 10:17:39,772 ERROR org.mortbay.log: /mapOutput
java.lang.IllegalStateException: Committed
        at org.mortbay.jetty.Response.resetBuffer(Response.java:994)
        at org.mortbay.jetty.Response.sendError(Response.java:240)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2963)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:324)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)


Could you please give me a hand figuring out these problems? Any help would be much appreciated.


2010-08-27 



shangan 

RE: multi-thread problem in map

Posted by xiujin yang <xi...@hotmail.com>.
Hi Amareshwari,

Thank you for your great help. 

I will check the source in 0.21 or trunk. 

Best

Xiujin Yang. 

> From: amarsri@yahoo-inc.com
> To: common-user@hadoop.apache.org
> Date: Fri, 27 Aug 2010 13:50:50 +0530
> Subject: Re: multi-thread problem in map
> 
> You can have a look at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper in 0.21 or trunk.

Re: multi-thread problem in map

Posted by Amareshwari Sri Ramadasu <am...@yahoo-inc.com>.
You can have a look at org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper in 0.21 or trunk.



multi-thread problem in map

Posted by xiujin yang <xi...@hotmail.com>.
Hi All,

Under Hadoop 0.20.2, according to the Mapper documentation, if you want to use multiple threads in a map task, you can override the run method of Mapper.

Has anyone used this successfully? Could someone give an example? Thank you.

********************************************************************************
Comment about run method: 

  *
 * <p>Applications may override the {@link #run(Context)} method to exert 
 * greater control on map processing e.g. multi-threaded <code>Mapper</code>s 
 * etc.</p>


run method: 

  /**
   * Expert users can override this method for more complete control over the
   * execution of the Mapper.
   * @param context
   * @throws IOException
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }
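
A minimal sketch of the idea, not Hadoop code: the run() loop quoted above handles one record at a time, and a multi-threaded variant lets several worker threads pull records from the same input and call map() concurrently. SketchContext, nextValueOrNull and the word-count map() below are hypothetical stand-ins invented for illustration so the example runs with the JDK alone; the real Mapper.Context has a different API. The class that ships with Hadoop for this purpose is org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper, and its source is the place to check how the framework itself does it.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MultiThreadedRunSketch {

    /** Hypothetical stand-in for Hadoop's Mapper.Context; NOT the real API. */
    static class SketchContext {
        private final Iterator<String> input;
        // Thread-safe output store, because several threads write to it.
        final ConcurrentHashMap<String, Integer> output = new ConcurrentHashMap<>();

        SketchContext(List<String> records) {
            this.input = records.iterator();
        }

        // Reading is synchronized on the assumption that the underlying
        // record source is not safe for concurrent use.
        synchronized String nextValueOrNull() {
            return input.hasNext() ? input.next() : null;
        }

        // merge() is atomic on ConcurrentHashMap, so no extra locking is needed.
        void write(String key, int value) {
            output.merge(key, value, Integer::sum);
        }
    }

    /** Example map(): counts words. It must be thread-safe. */
    static void map(String value, SketchContext context) {
        for (String word : value.split("\\s+")) {
            if (!word.isEmpty()) {
                context.write(word, 1);
            }
        }
    }

    /** Multi-threaded counterpart of the sequential run() loop. */
    static void run(SketchContext context, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                String value;
                // Each worker keeps pulling records until the input is exhausted.
                while ((value = context.nextValueOrNull()) != null) {
                    map(value, context);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("map threads did not finish in time");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SketchContext context = new SketchContext(Arrays.asList("a b a", "b c", "a"));
        run(context, 4);
        System.out.println(context.output);
    }
}
```

This approach mainly pays off when map() is slow per record (for example, it waits on an external service); for CPU-bound maps, running more map tasks is usually the simpler option. Records are processed in no particular order, so map() must not depend on ordering or on unsynchronized shared state.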
 		 	   		  

Re: mapreduce attempts killed

Posted by Amareshwari Sri Ramadasu <am...@yahoo-inc.com>.
You should look at the task logs to figure out why the tasks failed. They are accessible from the web UI and also on the TaskTracker nodes under the ${hadoop.log.dir}/userlogs directory.
If a task attempt is KILLED, it was killed by the framework, either because it was a speculative attempt or because the job itself failed or was killed.
You can ignore the Jetty exceptions in the TaskTracker log (see http://issues.apache.org/jira/browse/MAPREDUCE-5).

Thanks
Amareshwari
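
To make the log lookup above more concrete, here is a small sketch that takes an attempt ID such as the one in the original post apart and builds a candidate log directory for it. The class and method names are hypothetical, and the per-attempt subdirectory under userlogs is an assumption about the on-disk layout, which may differ between Hadoop versions; only the ${hadoop.log.dir}/userlogs location comes from this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttemptIdSketch {

    // attempt_<jobtracker start time>_<job number>_<m|r>_<task number>_<attempt number>
    private static final Pattern ATTEMPT =
        Pattern.compile("attempt_(\\d+)_(\\d+)_([mr])_(\\d+)_(\\d+)");

    private static Matcher parse(String attemptId) {
        Matcher m = ATTEMPT.matcher(attemptId);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a task attempt id: " + attemptId);
        }
        return m;
    }

    /** Job the attempt belongs to, e.g. job_201008261846_0016. */
    static String jobId(String attemptId) {
        Matcher m = parse(attemptId);
        return "job_" + m.group(1) + "_" + m.group(2);
    }

    /** Task the attempt belongs to, e.g. task_201008261846_0016_m_000039. */
    static String taskId(String attemptId) {
        Matcher m = parse(attemptId);
        return "task_" + m.group(1) + "_" + m.group(2) + "_" + m.group(3) + "_" + m.group(4);
    }

    /** Zero-based attempt counter; 1 means a second attempt of the same task. */
    static int attemptNumber(String attemptId) {
        return Integer.parseInt(parse(attemptId).group(5));
    }

    /**
     * Candidate log directory on the TaskTracker node.
     * ASSUMPTION: one subdirectory per attempt under userlogs.
     */
    static String userlogDir(String hadoopLogDir, String attemptId) {
        parse(attemptId); // validate before building the path
        return hadoopLogDir + "/userlogs/" + attemptId;
    }

    public static void main(String[] args) {
        String id = "attempt_201008261846_0016_m_000039_1";
        System.out.println(jobId(id));
        System.out.println(taskId(id));
        System.out.println(attemptNumber(id));
        // "/var/log/hadoop" is a placeholder for the node's hadoop.log.dir.
        System.out.println(userlogDir("/var/log/hadoop", id));
    }
}
```

The trailing _1 in the ID from the original post marks a second attempt of map task 39, which fits either a retry or a speculative attempt. The directory has to be checked on the node that ran the attempt (vm203 in that listing), not on the JobTracker.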