Posted to user@spark.apache.org by Jianshi Huang <ji...@gmail.com> on 2014/12/05 04:37:50 UTC
Exception adding resource files in latest Spark
I got the following error during Spark startup (Yarn-client mode):
14/12/04 19:33:58 INFO Client: Uploading resource file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar -> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
java.lang.IllegalArgumentException: Wrong FS: hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
    at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
    at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
    at org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
    at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
    at $iwC$$iwC.<init>(<console>:9)
    at $iwC.<init>(<console>:18)
    at <init>(<console>:20)
    at .<init>(<console>:24)
I'm using the latest Spark, built from master HEAD yesterday. Is this a bug?
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
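For readers following the trace: the "Wrong FS" message is raised by Hadoop's FileSystem.checkPath when a filesystem object is asked about a path whose URI scheme does not match its own. Here a local filesystem (file:///) was asked for the status of an hdfs:// path. The Python sketch below is only a simplified model of that kind of check, not Hadoop's code; `check_path` and its parameters are hypothetical names chosen for illustration.

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Simplified model of a scheme/authority check (not Hadoop's actual logic).

    fs_uri is the URI of the filesystem object being queried;
    path is the path that filesystem is asked to resolve.
    """
    fs = urlparse(fs_uri)
    target = urlparse(path)

    # A path with no scheme is treated as relative to the filesystem itself.
    if not target.scheme:
        return

    same_scheme = target.scheme.lower() == fs.scheme.lower()
    same_authority = target.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


if __name__ == "__main__":
    staged = "hdfs://stampy/user/jianshuang/.sparkStaging/app/example.jar"
    check_path("hdfs://stampy", staged)  # schemes match, so this passes
    try:
        check_path("file:///", staged)  # local filesystem asked about an HDFS path
    except ValueError as err:
        print(err)
```

In this model the failure depends only on which filesystem object receives the path, which is consistent with the trace: the staging path is on HDFS, but the status lookup went through RawLocalFileSystem.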
Re: Exception adding resource files in latest Spark
Posted by Patrick Wendell <pw...@gmail.com>.
Thanks for flagging this. I reverted the relevant YARN fix in the Spark
1.2 release. We can try to debug this in master.
On Thu, Dec 4, 2014 at 9:51 PM, Jianshi Huang <ji...@gmail.com> wrote:
> I created a ticket for this:
>
> https://issues.apache.org/jira/browse/SPARK-4757
>
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 1:31 PM, Jianshi Huang <ji...@gmail.com>
> wrote:
>>
>> Correction:
>>
>> According to Liancheng, this hotfix might be the root cause:
>>
>>
>> https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang <ji...@gmail.com>
>> wrote:
>>>
>>> Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
>>> Yarn-client mode.
>>>
>>> Maybe this patch broke yarn-client.
>>>
>>>
>>> https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
>>>
>>> Jianshi
>>>
>>> On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <ji...@gmail.com>
>>> wrote:
>>>>
>>>> Actually my HADOOP_CLASSPATH has already been set to include
>>>> /etc/hadoop/conf/*
>>>>
>>>> export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
>>>>
>>>> Jianshi
>>>>
>>>> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Looks like somehow Spark failed to find the core-site.xml in
>>>>> /etc/hadoop/conf
>>>>>
>>>>> I've already set the following env variables:
>>>>>
>>>>> export YARN_CONF_DIR=/etc/hadoop/conf
>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>> export HBASE_CONF_DIR=/etc/hbase/conf
>>>>>
>>>>> Should I put $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>>>>
>>>>> Jianshi
>>>>>
>>>>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang
>>>>> <ji...@gmail.com> wrote:
>>>>>>
>>>>>> I got the following error during Spark startup (Yarn-client mode):
>>>>>>
>>>>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>>>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>>>>> ->
>>>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>>>>> expected: file:///
>>>>>> at
>>>>>> org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>>>>> at
>>>>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>>>>> at
>>>>>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>>>>> at
>>>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>>>>> at
>>>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>>>>> at
>>>>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>>>>> at scala.Option.foreach(Option.scala:236)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>>>>> at
>>>>>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>>>>> at
>>>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>>>>> at
>>>>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>>>>> at
>>>>>> org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>>>>> at
>>>>>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>>>>> at $iwC$$iwC.<init>(<console>:9)
>>>>>> at $iwC.<init>(<console>:18)
>>>>>> at <init>(<console>:20)
>>>>>> at .<init>(<console>:24)
>>>>>>
>>>>>> I'm using latest Spark built from master HEAD yesterday. Is this a
>>>>>> bug?
>>>>>>
>>>>>> --
>>>>>> Jianshi Huang
>>>>>>
>>>>>> LinkedIn: jianshi
>>>>>> Twitter: @jshuang
>>>>>> Github & Blog: http://huangjs.github.com/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jianshi Huang
>>>>>
>>>>> LinkedIn: jianshi
>>>>> Twitter: @jshuang
>>>>> Github & Blog: http://huangjs.github.com/
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jianshi Huang
>>>>
>>>> LinkedIn: jianshi
>>>> Twitter: @jshuang
>>>> Github & Blog: http://huangjs.github.com/
>>>
>>>
>>>
>>>
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>
>>
>>
>>
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
I created a ticket for this:
https://issues.apache.org/jira/browse/SPARK-4757
Jianshi
On Fri, Dec 5, 2014 at 1:31 PM, Jianshi Huang <ji...@gmail.com>
wrote:
> Correction:
>
> According to Liancheng, this hotfix might be the root cause:
>
>
> https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang <ji...@gmail.com>
> wrote:
>
>> Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
>> Yarn-client mode.
>>
>> Maybe this patch broke yarn-client.
>>
>>
>> https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <ji...@gmail.com>
>> wrote:
>>
>>> Actually my HADOOP_CLASSPATH has already been set to include
>>> /etc/hadoop/conf/*
>>>
>>> export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
>>>
>>> Jianshi
>>>
>>> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
>>> wrote:
>>>
>>>> Looks like somehow Spark failed to find the core-site.xml in
>>>> /etc/hadoop/conf
>>>>
>>>> I've already set the following env variables:
>>>>
>>>> export YARN_CONF_DIR=/etc/hadoop/conf
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>> export HBASE_CONF_DIR=/etc/hbase/conf
>>>>
>>>> Should I put $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>>>
>>>> Jianshi
>>>>
>>>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <jianshi.huang@gmail.com> wrote:
>>>>
>>>>> I got the following error during Spark startup (Yarn-client mode):
>>>>>
>>>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>>>> ->
>>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>>>> expected: file:///
>>>>> at
>>>>> org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>>>> at
>>>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>>>> at
>>>>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>>>> at
>>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>>>> at
>>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>>>> at
>>>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>>>> at scala.Option.foreach(Option.scala:236)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>>>> at
>>>>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>>>> at
>>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>>>> at
>>>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>>>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>>>> at
>>>>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>>>> at $iwC$$iwC.<init>(<console>:9)
>>>>> at $iwC.<init>(<console>:18)
>>>>> at <init>(<console>:20)
>>>>> at .<init>(<console>:24)
>>>>>
>>>>> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>>>>>
>>>>> --
>>>>> Jianshi Huang
>>>>>
>>>>> LinkedIn: jianshi
>>>>> Twitter: @jshuang
>>>>> Github & Blog: http://huangjs.github.com/
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jianshi Huang
>>>>
>>>> LinkedIn: jianshi
>>>> Twitter: @jshuang
>>>> Github & Blog: http://huangjs.github.com/
>>>>
>>>
>>>
>>>
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>>
>>
>>
>>
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
Correction:
According to Liancheng, this hotfix might be the root cause:
https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
Jianshi
On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang <ji...@gmail.com>
wrote:
> Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
> Yarn-client mode.
>
> Maybe this patch broke yarn-client.
>
>
> https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <ji...@gmail.com>
> wrote:
>
>> Actually my HADOOP_CLASSPATH has already been set to include
>> /etc/hadoop/conf/*
>>
>> export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
>> wrote:
>>
>>> Looks like somehow Spark failed to find the core-site.xml in
>>> /etc/hadoop/conf
>>>
>>> I've already set the following env variables:
>>>
>>> export YARN_CONF_DIR=/etc/hadoop/conf
>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>> export HBASE_CONF_DIR=/etc/hbase/conf
>>>
>>> Should I put $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>>
>>> Jianshi
>>>
>>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <ji...@gmail.com>
>>> wrote:
>>>
>>>> I got the following error during Spark startup (Yarn-client mode):
>>>>
>>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>>> ->
>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>>> expected: file:///
>>>> at
>>>> org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>>> at
>>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>>> at scala.Option.foreach(Option.scala:236)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>>> at
>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>>> at
>>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>>> at
>>>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>>> at $iwC$$iwC.<init>(<console>:9)
>>>> at $iwC.<init>(<console>:18)
>>>> at <init>(<console>:20)
>>>> at .<init>(<console>:24)
>>>>
>>>> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>>>>
>>>> --
>>>> Jianshi Huang
>>>>
>>>> LinkedIn: jianshi
>>>> Twitter: @jshuang
>>>> Github & Blog: http://huangjs.github.com/
>>>>
>>>
>>>
>>>
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>>
>>
>>
>>
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>>
>
>
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
Correction:
According to Liancheng, this hotfix might be the root cause:
https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce
Jianshi
On Fri, Dec 5, 2014 at 12:45 PM, Jianshi Huang <ji...@gmail.com>
wrote:
> Looks like the datanucleus*.jar shouldn't appear in the hdfs path in
> Yarn-client mode.
>
> Maybe this patch broke yarn-client.
>
>
> https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <ji...@gmail.com>
> wrote:
>
>> Actually my HADOOP_CLASSPATH has already been set to include
>> /etc/hadoop/conf/*
>>
>> export
>> HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase
>> classpath)
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
>> wrote:
>>
>>> Looks like somehow Spark failed to find the core-site.xml in
>>> /et/hadoop/conf
>>>
>>> I've already set the following env variables:
>>>
>>> export YARN_CONF_DIR=/etc/hadoop/conf
>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>> export HBASE_CONF_DIR=/etc/hbase/conf
>>>
>>> Should I put $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>>
>>> Jianshi
>>>
>>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <ji...@gmail.com>
>>> wrote:
>>>
>>>> I got the following error during Spark startup (Yarn-client mode):
>>>>
>>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>>> ->
>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>>> expected: file:///
>>>> at
>>>> org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>>> at
>>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>>> at
>>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>>> at scala.Option.foreach(Option.scala:236)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>>> at
>>>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>>> at
>>>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>>> at
>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>>> at
>>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>>> at
>>>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>>> at $iwC$$iwC.<init>(<console>:9)
>>>> at $iwC.<init>(<console>:18)
>>>> at <init>(<console>:20)
>>>> at .<init>(<console>:24)
>>>>
>>>> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>>>>
>>>> --
>>>> Jianshi Huang
>>>>
>>>> LinkedIn: jianshi
>>>> Twitter: @jshuang
>>>> Github & Blog: http://huangjs.github.com/
>>>>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
It looks like datanucleus*.jar shouldn't appear in the HDFS path in
yarn-client mode.
This patch may have broken yarn-client:
https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53
Jianshi
On Fri, Dec 5, 2014 at 12:02 PM, Jianshi Huang <ji...@gmail.com>
wrote:
> Actually my HADOOP_CLASSPATH has already been set to include
> /etc/hadoop/conf/*
>
> export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
> wrote:
>
>> Looks like somehow Spark failed to find the core-site.xml in
>> /etc/hadoop/conf
>>
>> I've already set the following env variables:
>>
>> export YARN_CONF_DIR=/etc/hadoop/conf
>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>> export HBASE_CONF_DIR=/etc/hbase/conf
>>
>> Should I add $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>>
>> Jianshi
>>
>> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <ji...@gmail.com>
>> wrote:
>>
>>> I got the following error during Spark startup (Yarn-client mode):
>>>
>>> 14/12/04 19:33:58 INFO Client: Uploading resource
>>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>>> ->
>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>>> java.lang.IllegalArgumentException: Wrong FS:
>>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>>> expected: file:///
>>> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>>> at
>>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>>> at
>>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>>> at
>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>>> at
>>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>>> at
>>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>>> at
>>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>>> at
>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>>> at
>>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>>> at scala.Option.foreach(Option.scala:236)
>>> at
>>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>>> at
>>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>>> at
>>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>>> at
>>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>>> at
>>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>>> at
>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>>> at
>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>>> at
>>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>>> at $iwC$$iwC.<init>(<console>:9)
>>> at $iwC.<init>(<console>:18)
>>> at <init>(<console>:20)
>>> at .<init>(<console>:24)
>>>
>>> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>>>
>>> --
>>> Jianshi Huang
>>>
>>> LinkedIn: jianshi
>>> Twitter: @jshuang
>>> Github & Blog: http://huangjs.github.com/
>>>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
Actually my HADOOP_CLASSPATH has already been set to include
/etc/hadoop/conf/*
export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)
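A quick way to confirm whether the Hadoop conf directory is really an entry in HADOOP_CLASSPATH is sketched below. It assumes a POSIX shell, and `on_classpath` is a hypothetical helper name, not part of Hadoop or Spark. The check accepts only a plain directory entry on purpose: a Java classpath wildcard such as `/etc/hadoop/conf/*` expands to the JAR files in that directory, so it would not by itself make core-site.xml visible as a classpath resource.

```shell
# Hypothetical helper: succeed if directory $2 is an entry of the
# colon-separated classpath $1 (with or without a trailing slash).
# A wildcard entry such as "$2/*" is deliberately NOT counted, because
# Java expands it to JAR files only, not to XML config files.
on_classpath() {
  case ":$1:" in
    *":$2:"*|*":$2/:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (left commented out so the snippet has no side effects):
# on_classpath "$HADOOP_CLASSPATH" "$HADOOP_CONF_DIR" \
#   && echo "conf dir is on HADOOP_CLASSPATH" \
#   || echo "conf dir is NOT on HADOOP_CLASSPATH"
```

With the export shown above, the result depends on what `$(hbase classpath)` expands to on the machine in question.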
Jianshi
On Fri, Dec 5, 2014 at 11:54 AM, Jianshi Huang <ji...@gmail.com>
wrote:
> Looks like somehow Spark failed to find the core-site.xml in
> /etc/hadoop/conf
>
> I've already set the following env variables:
>
> export YARN_CONF_DIR=/etc/hadoop/conf
> export HADOOP_CONF_DIR=/etc/hadoop/conf
> export HBASE_CONF_DIR=/etc/hbase/conf
>
> Should I add $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
>
> Jianshi
>
> On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <ji...@gmail.com>
> wrote:
>
>> I got the following error during Spark startup (Yarn-client mode):
>>
>> 14/12/04 19:33:58 INFO Client: Uploading resource
>> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
>> ->
>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
>> java.lang.IllegalArgumentException: Wrong FS:
>> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
>> expected: file:///
>> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
>> at
>> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
>> at
>> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
>> at
>> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
>> at
>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
>> at
>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
>> at
>> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
>> at
>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
>> at
>> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
>> at
>> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
>> at
>> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
>> at
>> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
>> at
>> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
>> at
>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
>> at
>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
>> at
>> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
>> at $iwC$$iwC.<init>(<console>:9)
>> at $iwC.<init>(<console>:18)
>> at <init>(<console>:20)
>> at .<init>(<console>:24)
>>
>> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>>
>> --
>> Jianshi Huang
>>
>> LinkedIn: jianshi
>> Twitter: @jshuang
>> Github & Blog: http://huangjs.github.com/
>>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
Re: Exception adding resource files in latest Spark
Posted by Jianshi Huang <ji...@gmail.com>.
It looks like Spark somehow failed to find core-site.xml in /etc/hadoop/conf.
I've already set the following env variables:
export YARN_CONF_DIR=/etc/hadoop/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HBASE_CONF_DIR=/etc/hbase/conf
Should I add $HADOOP_CONF_DIR/* to HADOOP_CLASSPATH?
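To see what the client would read from that directory, one option is to print the default filesystem declared in core-site.xml. This is a minimal sketch, assuming a POSIX shell with awk; `show_default_fs` is a hypothetical helper name, and the extraction assumes each `<name>` element is followed by its `<value>` on the next line, which may not hold for every config file.

```shell
# Hypothetical helper: print the value of fs.defaultFS (or the legacy
# fs.default.name) from core-site.xml in the given conf directory.
# Assumes <name> and <value> are on consecutive lines.
show_default_fs() {
  conf_dir="${1:-$HADOOP_CONF_DIR}"
  site="$conf_dir/core-site.xml"
  if [ ! -f "$site" ]; then
    echo "core-site.xml not found in $conf_dir" >&2
    return 1
  fi
  # On a matching <name> line, read the next line and strip the <value> tags.
  awk '/<name>fs\.defaultFS<\/name>/ || /<name>fs\.default\.name<\/name>/ {
         getline
         gsub(/.*<value>|<\/value>.*/, "")
         print
       }' "$site"
}

# Example use (commented out): show_default_fs /etc/hadoop/conf
```

If this prints an hdfs:// URI, the file itself is fine and the question becomes whether the Spark client actually loads it. Hadoop's built-in default for fs.defaultFS is file:///, which would be consistent with the "expected: file:///" in the exception if the file is not being picked up.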
Jianshi
On Fri, Dec 5, 2014 at 11:37 AM, Jianshi Huang <ji...@gmail.com>
wrote:
> I got the following error during Spark startup (Yarn-client mode):
>
> 14/12/04 19:33:58 INFO Client: Uploading resource
> file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar
> ->
> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
> java.lang.IllegalArgumentException: Wrong FS:
> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar,
> expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
> at
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
> at
> org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
> at
> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
> at
> org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
> at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
> at
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
> at
> org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
> at
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
> at
> org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
> at
> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
> at
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
> at
> org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
> at $iwC$$iwC.<init>(<console>:9)
> at $iwC.<init>(<console>:18)
> at <init>(<console>:20)
> at .<init>(<console>:24)
>
> I'm using latest Spark built from master HEAD yesterday. Is this a bug?
>
> --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
--
Jianshi Huang
LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/