Posted to user@hbase.apache.org by Pete Tyler <pe...@gmail.com> on 2011/04/26 06:18:26 UTC

Migrating pseudo distribute map reduce with user jar file to 0.90.1

My map reduce jobs, which were running fine with HBase 0.20.4 on Hadoop
0.20.2, are now failing as I try to upgrade to HBase 0.90.1 with Hadoop
0.20.2-CDH3B4.

Under 0.90.1 I see the following error,

Error initializing attempt_201104252111_0001_m_000002_0:
java.io.FileNotFoundException: File /tmp/hadoop-peter/mapred/staging/peter/.staging/job_201104252111_0001/job.jar does not exist.
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:383)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:207)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
	at org.apache.hadoop.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:61)
	at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1303)
	at org.apache.hadoop.mapred.JobLocalizer.localizeJobJarFile(JobLocalizer.java:273)
	at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:381)
	at org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:371)
	at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:198)
	at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1154)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1129)
	at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1055)
	at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2212)
	at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2176)

My code, which worked fine under 0.20.4, looks like this,

        Configuration conf = HBaseConfiguration.create();
        conf.set("mapred.job.tracker", "localhost:9001");

        Job job = new Job(conf);
        // getConfiguration() returns a Configuration; cast to JobConf to reach setJar()
        JobConf jobConf = (JobConf) job.getConfiguration();
        jobConf.setJar(dirName + "/" + jarName);

        job.setJarByClass(MyMapper.class);
        job.setJobName("My MapReduce Pseudo Distributed");

        Scan scan = new Scan();

        String tableNameIn  = "MyTableIn";
        String tableNameOut = "MyTableOut";

        TableMapReduceUtil.initTableMapperJob(
                  tableNameIn,
                  scan,
                  MyMapper.class,
                  ImmutableBytesWritable.class,
                  ImmutableBytesWritable.class,
                 job
         );

        TableMapReduceUtil.initTableReducerJob(
               tableNameOut,
               MyReducer.class,
               job
         );

        job.setOutputFormatClass(TableOutputFormat.class);
        job.setNumReduceTasks(1);

        job.waitForCompletion(true);

Any pointers on how to migrate my code so it works on 0.90.1 would be
brilliant. Many thanks.
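
(For context, one common way to get the job classes onto the cluster is to
bundle them into a jar and launch the driver with bin/hadoop jar; then
setJarByClass() can locate the jar on its own and the explicit setJar() call
is not needed. A minimal sketch, with hypothetical jar and driver class names:

    bin/hadoop jar my-hbase-mr-job.jar com.example.MyDriver
)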
-- 
View this message in context: http://old.nabble.com/Migrating-pseudo-distribute-map-reduce-with-user-jar-file-to-0.90.1-tp31475457p31475457.html
Sent from the HBase User mailing list archive at Nabble.com.


Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Jean-Daniel Cryans <jd...@apache.org>.
You might want to bring this issue to Cloudera, as for the moment they
have the only Hadoop release that supports security.

J-D

On Tue, Apr 26, 2011 at 8:34 PM, Pete Tyler <pe...@gmail.com> wrote:
> Thanks for the links but I'm having trouble applying them. I'm trying to
> upgrade my OS X single node pseudo distributed node to HBase 0.90.1. The
> following line from the doc you've sent me,
>
>   "When you upgrade to CDH3, two new Unix user accounts called hdfs and
> mapred are automatically created to support security:"
>
> makes no sense to me as my entire installation process involved unzipping
> your tar ball.
>
> This then leads to my confusion over the permissions guidelines,
>
> Directory                    Owner           Permissions (see Footnote 1)
> dfs.name.dir                 hdfs:hadoop     drwx------
> dfs.data.dir                 hdfs:hadoop     drwx------
> mapred.local.dir             mapred:hadoop   drwxr-xr-x
> mapred.system.dir (in HDFS)  mapred:hadoop   (see Footnote 2)
>
> If anyone can please help me that would be great. At the moment I feel I'm
> in danger of losing two years work due to the enormous gulf between the
> HBase 0.20.4 and the HBase 0.90.1. environments :(
>
> When I first installed Hadoop the example map reduce job ran fine, but now I
> see this error. *Sigh*, I seem to be going steadily backwards,
>
> Peters-MacBook-Pro:hadoop-0.20.2-CDH3B4 peter$ bin/hadoop jar
> hadoop-examples-0.20.2-CDH3B4.jar grep input output 'dfs[a-z.]+'
> 11/04/26 20:30:21 INFO mapred.JobClient: Cleaning up the staging area
> hdfs://localhost:9000/Users/peter/Development/Data/temp/mapred/staging/peter/.staging/job_201104262014_0003
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> hdfs://localhost:9000/user/peter/input
> at
> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:194)
> at
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
> at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:971)
> at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:963)
> at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
> at org.apache.hadoop.examples.Grep.run(Grep.java:69)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.examples.Grep.main(Grep.java:93)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Stack <st...@duboce.net>.
On Tue, Apr 26, 2011 at 8:34 PM, Pete Tyler <pe...@gmail.com> wrote:
>   "When you upgrade to CDH3, two new Unix user accounts called hdfs and
> mapred are automatically created to support security:"
>
> makes no sense to me as my entire installation process involved unzipping
> your tar ball.
>

OK.  So, the statement above probably goes along w/ the case where
cdh3 is 'installed'.  If you untar a tar ball, yeah, no users are
going to be made.


> This then leads to my confusion over the permissions guidelines,
>
> Directory                    Owner           Permissions (see Footnote 1)
> dfs.name.dir                 hdfs:hadoop     drwx------
> dfs.data.dir                 hdfs:hadoop     drwx------
> mapred.local.dir             mapred:hadoop   drwxr-xr-x
> mapred.system.dir (in HDFS)  mapred:hadoop   (see Footnote 2)
>

The above is just making dirs in hdfs available to users running these
processes.
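
A hedged sketch of what applying that table looks like on an install where
the hdfs and mapred accounts actually exist (the local paths below are
hypothetical placeholders; substitute whatever dfs.name.dir, dfs.data.dir,
mapred.local.dir and mapred.system.dir point to in your configuration):

    # local directories (drwx------ = 700, drwxr-xr-x = 755)
    sudo chown -R hdfs:hadoop /var/lib/hadoop/dfs/name /var/lib/hadoop/dfs/data
    sudo chmod 700 /var/lib/hadoop/dfs/name /var/lib/hadoop/dfs/data
    sudo chown -R mapred:hadoop /var/lib/hadoop/mapred/local
    sudo chmod 755 /var/lib/hadoop/mapred/local
    # mapred.system.dir lives in HDFS, so use the hadoop fs commands
    bin/hadoop fs -chown mapred:hadoop /mapred/system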



> If anyone can please help me that would be great. At the moment I feel I'm
> in danger of losing two years work due to the enormous gulf between the
> HBase 0.20.4 and the HBase 0.90.1. environments :(
>

Yeah, a bunch happened in between.

St.Ack

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Stack <st...@duboce.net>.
On Tue, Apr 26, 2011 at 8:43 PM, Pete Tyler <pe...@gmail.com> wrote:
> P.S. This is a pure development system, I have no interest in any security
> features. If 'chmod 777 *' will fix this issue then that is great for me.
> Thanks.
>
Try it.
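
For a dev-only box, a minimal sketch of that (path taken from the error in
the first message; adjust it if your staging dir is configured elsewhere,
and note this throws away all permission checking):

    chmod -R 777 /tmp/hadoop-peter
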
St.Ack

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Pete Tyler <pe...@gmail.com>.
P.S. This is a pure development system, I have no interest in any security
features. If 'chmod 777 *' will fix this issue then that is great for me.
Thanks.

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Pete Tyler <pe...@gmail.com>.
Thanks for the links, but I'm having trouble applying them. I'm trying to
upgrade my OS X single-node pseudo-distributed setup to HBase 0.90.1. The
following line from the doc you sent me,

   "When you upgrade to CDH3, two new Unix user accounts called hdfs and
mapred are automatically created to support security:"

makes no sense to me as my entire installation process involved unzipping
your tar ball.

This then leads to my confusion over the permissions guidelines,

Directory                    Owner           Permissions (see Footnote 1)
dfs.name.dir                 hdfs:hadoop     drwx------
dfs.data.dir                 hdfs:hadoop     drwx------
mapred.local.dir             mapred:hadoop   drwxr-xr-x
mapred.system.dir (in HDFS)  mapred:hadoop   (see Footnote 2)

If anyone can help me, that would be great. At the moment I feel I'm in
danger of losing two years' work due to the enormous gulf between the
HBase 0.20.4 and HBase 0.90.1 environments :(

When I first installed Hadoop, the example map reduce job ran fine, but now
I see this error. *Sigh*, I seem to be going steadily backwards,

Peters-MacBook-Pro:hadoop-0.20.2-CDH3B4 peter$ bin/hadoop jar hadoop-examples-0.20.2-CDH3B4.jar grep input output 'dfs[a-z.]+'
11/04/26 20:30:21 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/Users/peter/Development/Data/temp/mapred/staging/peter/.staging/job_201104262014_0003
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/peter/input
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:194)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:971)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:963)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1242)
at org.apache.hadoop.examples.Grep.run(Grep.java:69)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.examples.Grep.main(Grep.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
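
For what it's worth, the grep example expects an input directory under the
user's HDFS home (here hdfs://localhost:9000/user/peter/input), so a minimal
sketch to recreate it, using the Hadoop conf files as sample input, would be:

    bin/hadoop fs -mkdir input
    bin/hadoop fs -put conf/*.xml input
    bin/hadoop jar hadoop-examples-0.20.2-CDH3B4.jar grep input output 'dfs[a-z.]+'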

Re: Migrating pseudo distribute map reduce with user jar file to 0.90.1

Posted by Suraj Varma <sv...@gmail.com>.
With CDH3B4, the Hadoop processes run as separate users (hdfs,
mapred, etc.). Did you set the CDH3B4 directory permissions correctly
as described in the install document?
See: https://ccp.cloudera.com/display/CDHDOC/Upgrading+to+CDH3 and
search for "permissions".
Also see this: https://ccp.cloudera.com/display/CDHDOC/Upgrading+to+CDH3#UpgradingtoCDH3-ChangesinUserAccountsandGroupsinCDH3DuetoSecurity

In this case, make sure that the /tmp/.../staging/... directory is
accessible by the task user account (most likely mapred). What seems
to be happening in your case is that the task is not able to access
your /tmp directory due to user permissions.
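
A quick way to check that (a sketch; the path comes straight from the
FileNotFoundException, and whether it lives on the local filesystem or in
HDFS depends on your fs.default.name setting):

    ls -ld /tmp/hadoop-peter/mapred/staging/peter/.staging
    bin/hadoop fs -ls /tmp/hadoop-peter/mapred/staging/peter/.staging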

If this still doesn't work, try running one of the MR
hadoop-*-examples (like "pi") so that you can validate that the
cluster is set up correctly.
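
For example, a sketch of such a run (pi takes the number of maps and the
number of samples per map; the examples jar is the one shipped with CDH3B4):

    bin/hadoop jar hadoop-examples-0.20.2-CDH3B4.jar pi 2 10
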
--Suraj


On Mon, Apr 25, 2011 at 9:18 PM, Pete Tyler <pe...@gmail.com> wrote:
>
> My map reduce jobs, which were running fine with HBase 0.20.4 with Hadoop
> 0.20.2 are now failing as I try to upgrade to HBase 0.90.1 with Hadoop
> 0.20.2-CDH3B4.
>
> Under 0.90.1 I see the following error,
>
> Error initializing attempt_201104252111_0001_m_000002_0:
> java.io.FileNotFoundException: File
> /tmp/hadoop-peter/mapred/staging/peter/.staging/job_201104252111_0001/job.jar
> does not exist.
>        at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:383)
>        at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:207)
>        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>        at
> org.apache.hadoop.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:61)
>        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1303)
>        at
> org.apache.hadoop.mapred.JobLocalizer.localizeJobJarFile(JobLocalizer.java:273)
>        at
> org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:381)
>        at
> org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:371)
>        at
> org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:198)
>        at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1154)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>        at
> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1129)
>        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1055)
>        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2212)
>        at
> org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2176)
>
> My code, which worked fine under 0.20.4 looks like this,
>
>        Configuration conf = HBaseConfiguration.create();
>        conf.set("mapred.job.tracker", "localhost:9001");
>
>        Job job = new Job(conf);
>        JobConf jobConf = job.getConfiguration();
>        jobConf.setJar( dirName + "/" + jarName);
>
>        job.setJarByClass(MyMapper.class);
>        job.setJobName("My MapReduce Pseudo Distributed");
>
>        Scan scan = new Scan();
>
>        String tableNameIn  = "MyTableIn";
>        String tableNameOut = "MyTableOut";
>
>        TableMapReduceUtil.initTableMapperJob(
>                  tableNameIn,
>                  scan,
>                  MyMapper.class,
>                  ImmutableBytesWritable.class,
>                  ImmutableBytesWritable.class,
>                 job
>         );
>
>        TableMapReduceUtil.initTableReducerJob(
>               tableNameOut,
>               MyReducer.class,
>               job
>         );
>
>        job.setOutputFormatClass(TableOutputFormat.class);
>        job.setNumReduceTasks(1);
>
>        job.waitForCompletion(true);
>
> Any pointers on how to migrate my code so it works on 0.90.1 would be
> brilliant. Many thanks.
> --
> View this message in context: http://old.nabble.com/Migrating-pseudo-distribute-map-reduce-with-user-jar-file-to-0.90.1-tp31475457p31475457.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>