Posted to mapreduce-user@hadoop.apache.org by Steve Sonnenberg <st...@gmail.com> on 2012/07/20 18:36:42 UTC

Fail to start mapreduce tasks across nodes

I have a 2-node Fedora setup, and in cluster mode I run into the following
issue that I can't resolve.

Hadoop 1.0.3
I'm running with the filesystem set to file:/// and invoking the simple 'grep' example

hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
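
For reference, here is roughly what the relevant conf entries look like on my nodes (a minimal sketch; the property names are the stock Hadoop 1.x ones, but the jobtracker host:port below is a placeholder, not necessarily my exact value):

  core-site.xml:
    <property>
      <name>fs.default.name</name>
      <value>file:///</value>
    </property>

  mapred-site.xml:
    <property>
      <name>mapred.job.tracker</name>
      <value>master:9001</value>   <!-- placeholder host:port -->
    </property>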

The initiator displays

Error initializing attempt_201207201103_0003_m_000004_0:
   java.io.FileNotFoundException: File
file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does
not exist.
     getFileStatus(RawLocalFileSystem.java)
     localizeJobTokenFile(TaskTracker.java:4268)
     initializeJob(TaskTracker.java:1177)
     localizeJob
     run

The /tmp/hadoop-hadoop/mapred/system directory contains only a
'jobtracker.info' file (on all systems).
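
For what it's worth, I checked that with a plain listing on each node, e.g.:

  ls -l /tmp/hadoop-hadoop/mapred/system/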

On the target system, in the tasktracker log file, I get the following:

2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker:
Got heartbeatResponse from JobTracker with responseId: 641 and 1
actions
2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0
task's state:UNASSIGNED
2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1
slots
2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In
TaskLauncher, current free slots : 2 and trying to launch
attempt_201207201103_0003_m_000006_0 which needs 1 slots
2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker:
Error initializing attempt_201207201103_0003_m_000006_0:
java.io.FileNotFoundException: File
file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
        at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
        at java.lang.Thread.run(Thread.java:636)

2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus:
Trying to set finish time for task
attempt_201207201103_0003_m_000006_0 when no start time is set,
stackTrace is : java.lang.Exception
        at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
        at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
        at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
        at java.lang.Thread.run(Thread.java:636)

On both systems, ownership of all files and directories under
/tmp/hadoop-hadoop is user/group hadoop/hadoop.

Any ideas?

Thanks


-- 
Steve Sonnenberg

Re: Fail to start mapreduce tasks across nodes

Posted by Shanu Sushmita <sh...@gmail.com>.
Yes, we can see it :-)

SS
On 20 Jul 2012, at 12:15, Steve Sonnenberg wrote:

> Sorry this is my first posting and I haven't gotten a copy nor any  
> response.
> Could someone please respond if you are seeing this?
>
> Thanks,
> Newbie
>
> On Fri, Jul 20, 2012 at 12:36 PM, Steve Sonnenberg <steveisoft@gmail.com> wrote:
> I have a 2-node Fedora system and in cluster mode, I have the following issue that I can't resolve.
>
> Hadoop 1.0.3
> I'm running with filesystem, file:/// and invoking the simple 'grep' example
>
> hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
>
> The initiator displays
>
> Error initializing attempt_201207201103_0003_m_000004_0:
>    java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
>      getFileStatus(RawLocalFileSystem.java)
>      localizeJobTokenFile(TaskTracker.java:4268)
>      initializeJob(TaskTracker.java:1177)
>      localizeJob
>      run
>
> The /tmp/hadoop-hadoop/mapred/system directory only contains a 'jobtracker.info' file (on all systems)
>
> On the target system, in the tasktracker log file, I get the following:
> 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's state:UNASSIGNED
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201207201103_0003_m_000006_0:
> java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
>         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
>         at java.lang.Thread.run(Thread.java:636)
>
> 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201207201103_0003_m_000006_0 when no start time is set, stackTrace is : java.lang.Exception
>         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
>         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
>         at java.lang.Thread.run(Thread.java:636)
>
> On both systems, ownership of all files directories under /tmp/hadoop-hadoop is the user/group hadoop/hadoop.
>
>
>
> Any ideas?
>
> Thanks
>
> -- 
> Steve Sonnenberg
>
>
>
>
> -- 
> Steve Sonnenberg
>


Re: Fail to start mapreduce tasks across nodes

Posted by Steve Sonnenberg <st...@gmail.com>.
Sorry, this is my first posting and I haven't received a copy or any response.
Could someone please respond if you are seeing this?

Thanks,
Newbie

On Fri, Jul 20, 2012 at 12:36 PM, Steve Sonnenberg <st...@gmail.com> wrote:

> I have a 2-node Fedora system and in cluster mode, I have the following
> issue that I can't resolve.
>
> Hadoop 1.0.3
> I'm running with filesystem, file:/// and invoking the simple 'grep'
> example
>
> hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
>
> The initiator displays
>
> Error initializing attempt_201207201103_0003_m_000004_0:
>    java.io.FileNotFoundException: File
> file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does
> not exist.
>      getFileStatus(RawLocalFileSystem.java)
>      localizeJobTokenFile(TaskTracker.java:4268)
>      initializeJob(TaskTracker.java:1177)
>      localizeJob
>      run
>
> The /tmp/hadoop-hadoop/mapred/system directory only contains a '
> jobtracker.info' file (on all systems)
>
> On the target system, in the tasktracker log file, I get the following:
>
> 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's state:UNASSIGNED
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201207201103_0003_m_000006_0:
> java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
>         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
>         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
>         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
>         at java.lang.Thread.run(Thread.java:636)
>
> 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201207201103_0003_m_000006_0 when no start time is set, stackTrace is : java.lang.Exception
>         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
>         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
>         at java.lang.Thread.run(Thread.java:636)
>
> On both systems, ownership of all files directories under /tmp/hadoop-hadoop is the user/group hadoop/hadoop.
>
> Any ideas?
>
> Thanks
>
>
> --
> Steve Sonnenberg
>
>


-- 
Steve Sonnenberg

Re: Fail to start mapreduce tasks across nodes

Posted by Steve Sonnenberg <st...@gmail.com>.
I will try this.

For the HDFS:
The M/R Admin web UI on port 50030 (from the Apache example) shows 2 nodes registered.
All of the jobs show as completed on only one of the nodes.
I will package up a set of clean logs.
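
Along with the logs, I'll also grab the tracker/datanode view from the command line, something like this (assuming the stock Hadoop 1.x tools):

  hadoop dfsadmin -report              # live datanodes as HDFS sees them
  hadoop job -list-active-trackers     # tasktrackers known to the JobTracker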

Thanks
-s

On Mon, Jul 23, 2012 at 2:08 PM, Harsh J <ha...@cloudera.com> wrote:

> Steve,
>
> If you're going to use NFS, make sure your "hadoop.tmp.dir" property
> points to the mount point that is NFS. Can you change that property
> and restart the cluster and retry?
>
> Regarding the HDFS issue, its hard to tell without logs. Did you see
> two nodes alive in the Web UI after configuring HDFS for two nodes and
> configuring MR to use HDFS?
>
> On Mon, Jul 23, 2012 at 11:23 PM, Steve Sonnenberg <st...@gmail.com> wrote:
> > Thanks Harsh,
> >
> > 1) I was using NFS
> > 2) I don't believe that anything under /tmp is distributed even when running
> > 3) When I use HDFS, it doesn't attempt to send ANY jobs to my second node
> >
> > Any clues?
> >
> > -steve
> >
> >
> > On Fri, Jul 20, 2012 at 11:52 PM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> A 2-node cluster is a fully-distributed cluster and cannot use a
> >> file:/// FileSystem as thats not a distributed filesystem (unless its
> >> an NFS mount). This explains why some of your tasks aren't able to
> >> locate an earlier written file on the /tmp dir thats probably
> >> available on the JT node alone, not the TT nodes.
> >>
> >> Use hdfs:// FS for fully-distributed operation.
> >>
> >> On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <steveisoft@gmail.com> wrote:
> >> > I have a 2-node Fedora system and in cluster mode, I have the following issue that I can't resolve.
> >> >
> >> > Hadoop 1.0.3
> >> > I'm running with filesystem, file:/// and invoking the simple 'grep' example
> >> >
> >> > hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
> >> >
> >> > The initiator displays
> >> >
> >> > Error initializing attempt_201207201103_0003_m_000004_0:
> >> >    java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
> >> >      getFileStatus(RawLocalFileSystem.java)
> >> >      localizeJobTokenFile(TaskTracker.java:4268)
> >> >      initializeJob(TaskTracker.java:1177)
> >> >      localizeJob
> >> >      run
> >> >
> >> > The /tmp/hadoop-hadoop/mapred/system directory only contains a 'jobtracker.info' file (on all systems)
> >> >
> >> > On the target system, in the tasktracker log file, I get the following:
> >> >
> >> > 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's state:UNASSIGNED
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201207201103_0003_m_000006_0 which needs 1 slots
> >> > 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201207201103_0003_m_000006_0:
> >> > java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
> >> >         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
> >> >         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
> >> >         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
> >> >         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201207201103_0003_m_000006_0 when no start time is set, stackTrace is : java.lang.Exception
> >> >         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
> >> >         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
> >> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
> >> >         at java.lang.Thread.run(Thread.java:636)
> >> >
> >> > On both systems, ownership of all files directories under /tmp/hadoop-hadoop is the user/group hadoop/hadoop.
> >> >
> >> >
> >> > Any ideas?
> >> >
> >> > Thanks
> >> >
> >> >
> >> > --
> >> > Steve Sonnenberg
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> >
> >
> > --
> > Steve Sonnenberg
> >
>
>
>
> --
> Harsh J
>



-- 
Steve Sonnenberg

Re: Fail to start mapreduce tasks across nodes

Posted by Harsh J <ha...@cloudera.com>.
Steve,

If you're going to use NFS, make sure your "hadoop.tmp.dir" property
points to the NFS mount point. Can you change that property,
restart the cluster, and retry?
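
Something along these lines in core-site.xml on every node (the NFS path below is only an example; substitute your actual mount point), then bounce the daemons:

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/nfs/hadoop-tmp</value>   <!-- example NFS mount path -->
  </property>

  bin/stop-all.sh && bin/start-all.sh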

Regarding the HDFS issue, it's hard to tell without logs. Did you see
two nodes alive in the Web UI after configuring HDFS for two nodes and
configuring MR to use HDFS?

On Mon, Jul 23, 2012 at 11:23 PM, Steve Sonnenberg <st...@gmail.com> wrote:
> Thanks Harsh,
>
> 1) I was using NFS
> 2) I don't believe that anything under /tmp is distributed even when running
> 3) When I use HDFS, it doesn't attempt to send ANY jobs to my second node
>
> Any clues?
>
> -steve
>
>
> On Fri, Jul 20, 2012 at 11:52 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>> A 2-node cluster is a fully-distributed cluster and cannot use a
>> file:/// FileSystem as thats not a distributed filesystem (unless its
>> an NFS mount). This explains why some of your tasks aren't able to
>> locate an earlier written file on the /tmp dir thats probably
>> available on the JT node alone, not the TT nodes.
>>
>> Use hdfs:// FS for fully-distributed operation.
>>
>> On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <st...@gmail.com>
>> wrote:
>> > I have a 2-node Fedora system and in cluster mode, I have the following
>> > issue that I can't resolve.
>> >
>> > Hadoop 1.0.3
>> > I'm running with filesystem, file:/// and invoking the simple 'grep'
>> > example
>> >
>> > hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir
>> > simple-pattern
>> >
>> > The initiator displays
>> >
>> > Error initializing attempt_201207201103_0003_m_000004_0:
>> >    java.io.FileNotFoundException: File
>> > file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
>> > does
>> > not exist.
>> >      getFileStatus(RawLocalFileSystem.java)
>> >      localizeJobTokenFile(TaskTracker.java:4268)
>> >      initializeJob(TaskTracker.java:1177)
>> >      localizeJob
>> >      run
>> >
>> > The /tmp/hadoop-hadoop/mapred/system directory only contains a
>> > 'jobtracker.info' file (on all systems)
>> >
>> > On the target system, in the tasktracker log file, I get the following:
>> >
>> > 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got
>> > heartbeatResponse from JobTracker with responseId: 641 and 1 actions
>> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
>> > LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0
>> > task's
>> > state:UNASSIGNED
>> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
>> > Trying to
>> > launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
>> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In
>> > TaskLauncher, current free slots : 2 and trying to launch
>> > attempt_201207201103_0003_m_000006_0 which needs 1 slots
>> > 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error
>> > initializing attempt_201207201103_0003_m_000006_0:
>> > java.io.FileNotFoundException: File
>> > file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken
>> > does
>> > not exist.
>> >         at
>> >
>> > org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>> >         at
>> >
>> > org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>> >         at
>> >
>> > org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
>> >         at
>> >
>> > org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
>> >         at
>> > org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
>> >         at
>> > org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
>> >         at java.lang.Thread.run(Thread.java:636)
>> >
>> > 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus:
>> > Trying to
>> > set finish time for task attempt_201207201103_0003_m_000006_0 when no
>> > start
>> > time is set, stackTrace is : java.lang.Exception
>> >         at
>> > org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
>> >         at
>> >
>> > org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
>> >         at
>> > org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
>> >         at java.lang.Thread.run(Thread.java:636)
>> >
>> > On both systems, ownership of all files directories under
>> > /tmp/hadoop-hadoop
>> > is the user/group hadoop/hadoop.
>> >
>> >
>> > Any ideas?
>> >
>> > Thanks
>> >
>> >
>> > --
>> > Steve Sonnenberg
>> >
>>
>>
>>
>> --
>> Harsh J
>
>
>
>
> --
> Steve Sonnenberg
>



-- 
Harsh J

Re: Fail to start mapreduce tasks across nodes

Posted by Steve Sonnenberg <st...@gmail.com>.
Thanks Harsh,

1) I was using NFS
2) I don't believe that anything under /tmp is distributed even when running
3) When I use HDFS, it doesn't attempt to send ANY jobs to my second node

Any clues?

-steve

On Fri, Jul 20, 2012 at 11:52 PM, Harsh J <ha...@cloudera.com> wrote:

> A 2-node cluster is a fully-distributed cluster and cannot use a
> file:/// FileSystem as thats not a distributed filesystem (unless its
> an NFS mount). This explains why some of your tasks aren't able to
> locate an earlier written file on the /tmp dir thats probably
> available on the JT node alone, not the TT nodes.
>
> Use hdfs:// FS for fully-distributed operation.
>
> On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <st...@gmail.com>
> wrote:
> > I have a 2-node Fedora system and in cluster mode, I have the following issue that I can't resolve.
> >
> > Hadoop 1.0.3
> > I'm running with filesystem, file:/// and invoking the simple 'grep' example
> >
> > hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
> >
> > The initiator displays
> >
> > Error initializing attempt_201207201103_0003_m_000004_0:
> >    java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
> >      getFileStatus(RawLocalFileSystem.java)
> >      localizeJobTokenFile(TaskTracker.java:4268)
> >      initializeJob(TaskTracker.java:1177)
> >      localizeJob
> >      run
> >
> > The /tmp/hadoop-hadoop/mapred/system directory only contains a 'jobtracker.info' file (on all systems)
> >
> > On the target system, in the tasktracker log file, I get the following:
> >
> > 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's state:UNASSIGNED
> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> > 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201207201103_0003_m_000006_0 which needs 1 slots
> > 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing attempt_201207201103_0003_m_000006_0:
> > java.io.FileNotFoundException: File file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does not exist.
> >         at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
> >         at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> >         at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
> >         at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
> >         at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
> >         at java.lang.Thread.run(Thread.java:636)
> >
> > 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to set finish time for task attempt_201207201103_0003_m_000006_0 when no start time is set, stackTrace is : java.lang.Exception
> >         at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
> >         at org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
> >         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
> >         at java.lang.Thread.run(Thread.java:636)
> >
> > On both systems, ownership of all files directories under /tmp/hadoop-hadoop is the user/group hadoop/hadoop.
> >
> >
> > Any ideas?
> >
> > Thanks
> >
> >
> > --
> > Steve Sonnenberg
> >
>
>
>
> --
> Harsh J
>



-- 
Steve Sonnenberg

Re: Fail to start mapreduce tasks across nodes

Posted by Harsh J <ha...@cloudera.com>.
A 2-node cluster is a fully-distributed cluster and cannot use a
file:/// FileSystem, as that's not a distributed filesystem (unless it's
an NFS mount). This explains why some of your tasks aren't able to
locate an earlier-written file in the /tmp dir that's probably
available on the JT node alone, not the TT nodes.

Use hdfs:// FS for fully-distributed operation.
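
Concretely, something like this (the hostname and ports below are only examples; adjust them to your cluster):

  core-site.xml:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://master:9000</value>   <!-- example NameNode host:port -->
    </property>

  mapred-site.xml:
    <property>
      <name>mapred.job.tracker</name>
      <value>master:9001</value>          <!-- example JobTracker host:port -->
    </property>

Then format and start HDFS, copy your input in, and rerun the job:

  bin/hadoop namenode -format
  bin/start-dfs.sh
  bin/start-mapred.sh
  bin/hadoop fs -put inputdir inputdir
  bin/hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern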

On Fri, Jul 20, 2012 at 10:06 PM, Steve Sonnenberg <st...@gmail.com> wrote:
> I have a 2-node Fedora system and in cluster mode, I have the following
> issue that I can't resolve.
>
> Hadoop 1.0.3
> I'm running with filesystem, file:/// and invoking the simple 'grep' example
>
> hadoop jar hadoop-examples-1.0.3.jar grep inputdir outputdir simple-pattern
>
> The initiator displays
>
> Error initializing attempt_201207201103_0003_m_000004_0:
>    java.io.FileNotFoundException: File
> file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does
> not exist.
>      getFileStatus(RawLocalFileSystem.java)
>      localizeJobTokenFile(TaskTracker.java:4268)
>      initializeJob(TaskTracker.java:1177)
>      localizeJob
>      run
>
> The /tmp/hadoop-hadoop/mapred/system directory only contains a
> 'jobtracker.info' file (on all systems)
>
> On the target system, in the tasktracker log file, I get the following:
>
> 2012-07-20 11:35:59,954 DEBUG org.apache.hadoop.mapred.TaskTracker: Got
> heartbeatResponse from JobTracker with responseId: 641 and 1 actions
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker:
> LaunchTaskAction (registerTask): attempt_201207201103_0003_m_000006_0 task's
> state:UNASSIGNED
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: Trying to
> launch : attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,954 INFO org.apache.hadoop.mapred.TaskTracker: In
> TaskLauncher, current free slots : 2 and trying to launch
> attempt_201207201103_0003_m_000006_0 which needs 1 slots
> 2012-07-20 11:35:59,955 WARN org.apache.hadoop.mapred.TaskTracker: Error
> initializing attempt_201207201103_0003_m_000006_0:
> java.io.FileNotFoundException: File
> file:/tmp/hadoop-hadoop/mapred/system/job_201207201103_0003/jobToken does
> not exist.
>         at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>         at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>         at
> org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4268)
>         at
> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1177)
>         at
> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1118)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2430)
>         at java.lang.Thread.run(Thread.java:636)
>
> 2012-07-20 11:35:59,955 ERROR org.apache.hadoop.mapred.TaskStatus: Trying to
> set finish time for task attempt_201207201103_0003_m_000006_0 when no start
> time is set, stackTrace is : java.lang.Exception
>         at
> org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
>         at
> org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3142)
>         at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2440)
>         at java.lang.Thread.run(Thread.java:636)
>
> On both systems, ownership of all files directories under /tmp/hadoop-hadoop
> is the user/group hadoop/hadoop.
>
>
> Any ideas?
>
> Thanks
>
>
> --
> Steve Sonnenberg
>



-- 
Harsh J