Posted to mapreduce-user@hadoop.apache.org by Lukas Kairies <lu...@googlemail.com> on 2014/01/27 11:05:05 UTC

Invalid URI in job start

Hello,

I am trying to use XtreemFS as an alternative file system for Hadoop 
2.x. There is an existing FileSystem implementation for Hadoop 1.x that 
works fine. The first thing I did was to implement a 
DelegateToFileSystem subclass to provide an AbstractFileSystem 
implementation for XtreemFS (just constructors that delegate to the 
FileSystem implementation). When I start the wordcount example 
application, I get the following exception on the NodeManager:

2014-01-20 14:18:19,349 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: 
Failed to parse resource-request
java.net.URISyntaxException: Expected scheme name at index 0: 
:///tmp/hadoop-yarn/staging/lkairies/.staging/job_1390223418764_0004/job.jar
         at java.net.URI$Parser.fail(URI.java:2829)
         at java.net.URI$Parser.failExpecting(URI.java:2835)
         at java.net.URI$Parser.parse(URI.java:3027)
         at java.net.URI.<init>(URI.java:753)
         at 
org.apache.hadoop.yarn.util.ConverterUtils.getPathFromYarnURL(ConverterUtils.java:80)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourceRequest.<init>(LocalResourceRequest.java:46)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:529)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:497)
         at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
         at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
         at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
         at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:864)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:73)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:815)
         at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:808)
         at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
         at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)
         at java.lang.Thread.run(Thread.java:724)
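The URISyntaxException above is easy to reproduce outside Hadoop: once 
the scheme is dropped, the remaining ":///path" string is no longer a 
valid URI. A minimal standalone check (not Hadoop code, just plain 
java.net.URI):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SchemeCheck {
    // Returns "ok" if the string parses as a URI, else the parser's message.
    static String tryParse(String s) {
        try {
            new URI(s);
            return "ok";
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The scheme-less form from the NodeManager log fails at index 0,
        // with the same "Expected scheme name" message as in the log.
        System.out.println(tryParse(":///tmp/hadoop-yarn/staging/job.jar"));
        // With any scheme in front, the same path parses fine.
        System.out.println(tryParse("file:///tmp/hadoop-yarn/staging/job.jar"));
    }
}
```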

Additionally the following is printed on the console:

14/01/27 11:02:14 INFO input.FileInputFormat: Total input paths to 
process : 1
14/01/27 11:02:14 INFO mapreduce.JobSubmitter: number of splits:1
14/01/27 11:02:15 INFO Configuration.deprecation: user.name is 
deprecated. Instead, use mapreduce.job.user.name
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.jar is 
deprecated. Instead, use mapreduce.job.jar
14/01/27 11:02:15 INFO Configuration.deprecation: 
mapred.output.value.class is deprecated. Instead, use 
mapreduce.job.output.value.class
14/01/27 11:02:15 INFO Configuration.deprecation: 
mapreduce.combine.class is deprecated. Instead, use 
mapreduce.job.combine.class
14/01/27 11:02:15 INFO Configuration.deprecation: mapreduce.map.class is 
deprecated. Instead, use mapreduce.job.map.class
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.job.name is 
deprecated. Instead, use mapreduce.job.name
14/01/27 11:02:15 INFO Configuration.deprecation: mapreduce.reduce.class 
is deprecated. Instead, use mapreduce.job.reduce.class
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.map.tasks is 
deprecated. Instead, use mapreduce.job.maps
14/01/27 11:02:15 INFO Configuration.deprecation: 
mapred.output.key.class is deprecated. Instead, use 
mapreduce.job.output.key.class
14/01/27 11:02:15 INFO Configuration.deprecation: mapred.working.dir is 
deprecated. Instead, use mapreduce.job.working.dir
14/01/27 11:02:15 INFO mapreduce.JobSubmitter: Submitting tokens for 
job: job_1390816735288_0001
14/01/27 11:02:15 INFO impl.YarnClientImpl: Submitted application 
application_1390816735288_0001 to ResourceManager at /0.0.0.0:8032
14/01/27 11:02:15 INFO mapreduce.Job: The url to track the job: 
http://ludiwg:8088/proxy/application_1390816735288_0001/
14/01/27 11:02:15 INFO mapreduce.Job: Running job: job_1390816735288_0001
14/01/27 11:02:19 INFO mapreduce.Job: Job job_1390816735288_0001 running 
in uber mode : false
14/01/27 11:02:19 INFO mapreduce.Job:  map 0% reduce 0%
14/01/27 11:02:19 INFO mapreduce.Job: Job job_1390816735288_0001 failed 
with state FAILED due to: Application application_1390816735288_0001 
failed 2 times due to AM Container for 
appattempt_1390816735288_0001_000002 exited with  exitCode: -1000 due 
to: .Failing this attempt.. Failing the application.
14/01/27 11:02:19 INFO mapreduce.Job: Counters: 0


The job files are created in XtreemFS. After a lot of debugging I still 
did not find the problem.
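For reference, an AbstractFileSystem implementation has to be registered 
in core-site.xml alongside the old FileSystem binding. The property name 
patterns below (fs.<scheme>.impl and fs.AbstractFileSystem.<scheme>.impl) 
are Hadoop's; the xtreemfs scheme and the class names are assumptions 
about the XtreemFS bindings, not taken from this thread:

```xml
<!-- core-site.xml (sketch; class names are assumed) -->
<property>
  <name>fs.xtreemfs.impl</name>
  <value>org.xtreemfs.common.clients.hadoop.XtreemFSFileSystem</value>
</property>
<property>
  <name>fs.AbstractFileSystem.xtreemfs.impl</name>
  <value>org.xtreemfs.common.clients.hadoop.XtreemFS</value>
</property>
```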

Any idea how to fix this?

Regards,
Lukas

Re: Invalid URI in job start

Posted by Lukas Kairies <lu...@googlemail.com>.
Hello,

first of all, thanks for your reply and sorry for my late answer. The 
JobClient prints the following URL after I added a System.out statement:

port: -1 file: 
"/tmp/hadoop-yarn/staging/lkairies/.staging/job_1391433848658_0001/job.jar" 
(for all job files, i.e. also for the job.xml,...).

So there is no scheme, and the port is not set either. The same URL is 
logged in the NodeManager log file.
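For comparison, this is what plain java.net.URI reports for a 
scheme-less path versus a fully qualified one (a standalone sketch, not 
the YARN URL type itself; the xtreemfs://host:32638 authority is an 
invented example):

```java
import java.net.URI;

public class UriFields {
    // Summarize the two fields the debug output above showed:
    // the scheme and the port of a resource URI.
    static String describe(String s) {
        URI u = URI.create(s);
        return u.getScheme() + " / " + u.getPort();
    }

    public static void main(String[] args) {
        // A bare path parses as a relative URI: no scheme, port -1.
        System.out.println(describe(
            "/tmp/hadoop-yarn/staging/lkairies/.staging/job.jar"));
        // A fully qualified URI carries both.
        System.out.println(describe(
            "xtreemfs://host:32638/tmp/hadoop-yarn/staging/job.jar"));
    }
}
```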

Thanks,
Lukas Kairies

Am 27.01.2014 20:00, schrieb Vinod Kumar Vavilapalli:
> Need your help to debug this. It seems like the scheme is getting lost 
> somewhere along the way. Clearly, as you say, if job.jar is on the 
> file system, then the JobClient is uploading it properly. There are 
> multiple things that you'll need to check:
>  - Check the NodeManager logs for the URL. It does print which URL it 
> is trying to download from. Check whether the scheme is there or not.
>  - If that doesn't tell you anything, change the JobClient to print the 
> URL before it constructs the ContainerLaunchContext for the 
> ApplicationMaster. You'll need to do this in YARNRunner.java, 
> specifically in the method createApplicationResource.
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Jan 27, 2014, at 2:05 AM, Lukas Kairies 
> <lukas.xtreemfs@googlemail.com <ma...@googlemail.com>> 
> wrote:
>
>> [quoted original message trimmed; identical to the message at the 
>> top of this thread]


>> mapreduce.job.output.value.class
>> 14/01/27 11:02:15 INFO Configuration.deprecation: 
>> mapreduce.combine.class is deprecated. Instead, use 
>> mapreduce.job.combine.class
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapreduce.map.class 
>> is deprecated. Instead, use mapreduce.job.map.class
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapred.job.name is 
>> deprecated. Instead, use mapreduce.job.name
>> 14/01/27 11:02:15 INFO Configuration.deprecation: 
>> mapreduce.reduce.class is deprecated. Instead, use 
>> mapreduce.job.reduce.class
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapred.input.dir is 
>> deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapred.output.dir 
>> is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapred.map.tasks is 
>> deprecated. Instead, use mapreduce.job.maps
>> 14/01/27 11:02:15 INFO Configuration.deprecation: 
>> mapred.output.key.class is deprecated. Instead, use 
>> mapreduce.job.output.key.class
>> 14/01/27 11:02:15 INFO Configuration.deprecation: mapred.working.dir 
>> is deprecated. Instead, use mapreduce.job.working.dir
>> 14/01/27 11:02:15 INFO mapreduce.JobSubmitter: Submitting tokens for 
>> job: job_1390816735288_0001
>> 14/01/27 11:02:15 INFO impl.YarnClientImpl: Submitted application 
>> application_1390816735288_0001 to ResourceManager at /0.0.0.0:8032
>> 14/01/27 11:02:15 INFO mapreduce.Job: The url to track the job: 
>> http://ludiwg:8088/proxy/application_1390816735288_0001/
>> 14/01/27 11:02:15 INFO mapreduce.Job: Running job: job_1390816735288_0001
>> 14/01/27 11:02:19 INFO mapreduce.Job: Job job_1390816735288_0001 
>> running in uber mode : false
>> 14/01/27 11:02:19 INFO mapreduce.Job:  map 0% reduce 0%
>> 14/01/27 11:02:19 INFO mapreduce.Job: Job job_1390816735288_0001 
>> failed with state FAILED due to: Application 
>> application_1390816735288_0001 failed 2 times due to AM Container for 
>> appattempt_1390816735288_0001_000002 exited with  exitCode: -1000 due 
>> to: .Failing this attempt.. Failing the application.
>> 14/01/27 11:02:19 INFO mapreduce.Job: Counters: 0
>>
>>
>> The job files are created in XtreemFS. After a lot of debugging I 
>> still did not find the problem.
>>
>> Any idea how to fix this?
>>
>> Regards,
>> Lukas
>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or 
> entity to which it is addressed and may contain information that is 
> confidential, privileged and exempt from disclosure under applicable 
> law. If the reader of this message is not the intended recipient, you 
> are hereby notified that any printing, copying, dissemination, 
> distribution, disclosure or forwarding of this communication is 
> strictly prohibited. If you have received this communication in error, 
> please contact the sender immediately and delete it from your system. 
> Thank You. 
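[Editor's note: one configuration angle worth ruling out, offered as an assumption rather than a diagnosis. In Hadoop 2.x the FileContext/AbstractFileSystem path is bound separately from the old FileSystem path, via the `fs.AbstractFileSystem.<scheme>.impl` key; if that binding or a scheme-qualified `fs.defaultFS` is missing, paths can end up unqualified. The class names and endpoint below are placeholders, not the real XtreemFS bindings.]

```xml
<!-- Hypothetical core-site.xml fragment; values are placeholders. -->
<property>
  <name>fs.defaultFS</name>
  <value>xtreemfs://dirservice:32638/volume</value>
</property>
<property>
  <!-- Binds the old FileSystem API to the scheme. -->
  <name>fs.xtreemfs.impl</name>
  <value>org.xtreemfs.hadoop.XtreemFSFileSystem</value>
</property>
<property>
  <!-- Binds the Hadoop 2.x AbstractFileSystem API to the same scheme. -->
  <name>fs.AbstractFileSystem.xtreemfs.impl</name>
  <value>org.xtreemfs.hadoop.XtreemFS</value>
</property>
```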

