Posted to mapreduce-user@hadoop.apache.org by Shuja Rehman <sh...@gmail.com> on 2011/04/04 17:02:24 UTC

Distributed Cache not working

Hi All,
I have implemented the distributed cache according to the following article:

http://chasebradford.wordpress.com/2011/02/05/distributed-cache-static-objects-and-fast-setup/

but when I run the program on the cluster, I get the following
exception:

SEVERE: null
java.io.FileNotFoundException: File does not exist: tmp/extract/7ecfcd44-dd47-4dbf-a12e-464e9d285762
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:519)
        at org.apache.hadoop.filecache.DistributedCache.getFileStatus(DistributedCache.java:362)
        at org.apache.hadoop.filecache.TrackerDistributedCacheManager.determineTimestamps(TrackerDistributedCacheManager.java:750)
        at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:706)
        at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:609)
        at org.apache.hadoop.mapred.JobClient.access$300(JobClient.java:170)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:808)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:793)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:793)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:495)
        at AcuteNightProcessor.ProcessorDriver.runAlertJob(ProcessorDriver.java:346)
        at AcuteNightProcessor.ProcessorDriver.main(ProcessorDriver.java:461)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

Here is the code:

FileSystem fs = FileSystem.get(config);
Path temp = new Path("tmp/extract", UUID.randomUUID().toString());
ObjectOutputStream os = new ObjectOutputStream(fs.create(temp));
os.writeObject(dcDto);
os.close();
fs.deleteOnExit(temp);

// Register the file in the DC. Open the local file "targets".
DistributedCache.addCacheFile(new URI(temp + "#targets"), config);
DistributedCache.createSymlink(config);
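
One thing I suspect: "tmp/extract" is a relative path, so the file is
created under my HDFS home directory while the JobClient seems to resolve
the bare cache URI against a different base. Here is a variant I am
considering, with the path fully qualified (untested; assumes the default
FileSystem is HDFS):

FileSystem fs = FileSystem.get(config);
// makeQualified() adds the scheme, authority and absolute location,
// so the cache URI points at exactly the file that was written.
Path temp = fs.makeQualified(new Path("/tmp/extract", UUID.randomUUID().toString()));
ObjectOutputStream os = new ObjectOutputStream(fs.create(temp));
os.writeObject(dcDto);
os.close();
fs.deleteOnExit(temp);

// temp.toUri() keeps the scheme/authority; "#targets" still names the symlink.
DistributedCache.addCacheFile(new URI(temp.toUri() + "#targets"), config);
DistributedCache.createSymlink(config);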


Does anybody know the solution?

Thanks

-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>

Re: How to abort a job in a map task

Posted by Mehmet Tepedelenlioglu <me...@gmail.com>.
It might be better to keep a counter of bad records and let the job
terminate normally. I would be hesitant to shoot down the mothership.
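
Something along these lines (a rough sketch; the counter name and the
looksValid() check are placeholders, not from your job):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TolerantMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Visible in the web UI while the job runs, and readable from the
    // driver via job.getCounters() afterwards.
    enum Quality { BAD_RECORDS }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (!looksValid(value)) {
            context.getCounter(Quality.BAD_RECORDS).increment(1);
            return; // skip the bad record, keep the job alive
        }
        // ... normal processing and context.write(...) here ...
    }

    // Placeholder validation; substitute the real check.
    private boolean looksValid(Text value) {
        return value.getLength() > 0;
    }
}

The driver can then inspect job.getCounters() after waitForCompletion()
and decide whether the output is trustworthy.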

Mehmet

On Apr 6, 2011, at 5:40 PM, Haruyasu Ueda wrote:

> Hi all,
> 
> I'm writing a Java M/R program.
> 
> I want to abort the job itself from a map task when the map task finds
> irregular data.
> 
> I have two ideas for doing so:
> 1. execute "bin/hadoop job -kill jobID" in the map task, from the slave machine.
> 2. raise an IOException to abort.
> 
> I want to know which is the better way,
> or whether there is a better/recommended programming idiom.
> 
> If you have any experience about this, please share your case.
> 
> --HAL
> ========================================================================
> Haruyasu Ueda, Senior Researcher
>  Research Center for Cloud Computing
>  FUJITSU LABORATORIES LTD.
> E-mail: hal_ueda@jp.fujitsu.com
> Tel: +81 44 754 2575
> Ken-S602, 4-1-1, Kamikodanaka, Nakahara-ku, Kawasaki, 211-8588 Japan
> ========================================================================
> 
> 


Re: How to abort a job in a map task

Posted by David Rosenstrauch <da...@darose.net>.
On 04/06/2011 08:40 PM, Haruyasu Ueda wrote:
> Hi all,
>
> I'm writing a Java M/R program.
>
> I want to abort the job itself from a map task when the map task finds
> irregular data.
>
> I have two ideas for doing so:
>   1. execute "bin/hadoop job -kill jobID" in the map task, from the slave machine.
>   2. raise an IOException to abort.
>
> I want to know which is the better way,
> or whether there is a better/recommended programming idiom.
>
> If you have any experience about this, please share your case.
>
>   --HAL

I'd go with throwing the exception.  That way the cause of the job crash 
will get displayed right in the Hadoop GUI.
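
For example (a minimal sketch; looksValid() stands in for whatever
validation applies to your records):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StrictMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        if (!looksValid(value)) {
            // Fails this task attempt; after mapred.map.max.attempts
            // failures (4 by default) the job itself fails, and this
            // message appears in the web UI and the task logs.
            throw new IOException("Irregular record at offset " + key + ": " + value);
        }
        // ... normal processing ...
    }

    // Placeholder validation; substitute the real check.
    private boolean looksValid(Text value) {
        return value.getLength() > 0;
    }
}

Note that a single throw only fails the current attempt; the framework
retries the task before giving up on the job, so the abort is not
instantaneous.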

DR

How to abort a job in a map task

Posted by Haruyasu Ueda <ha...@jp.fujitsu.com>.
Hi all,

I'm writing a Java M/R program.

I want to abort the job itself from a map task when the map task finds
irregular data.

I have two ideas for doing so:
 1. execute "bin/hadoop job -kill jobID" in the map task, from the slave machine.
 2. raise an IOException to abort.

I want to know which is the better way,
or whether there is a better/recommended programming idiom.

If you have any experience about this, please share your case.

 --HAL
========================================================================
Haruyasu Ueda, Senior Researcher
  Research Center for Cloud Computing
  FUJITSU LABORATORIES LTD.
E-mail: hal_ueda@jp.fujitsu.com
Tel: +81 44 754 2575
Ken-S602, 4-1-1, Kamikodanaka, Nakahara-ku, Kawasaki, 211-8588 Japan
========================================================================


