Posted to user@spark.apache.org by samuel281 <sa...@gmail.com> on 2014/02/19 05:35:56 UTC

Unable to submit an application to a standalone cluster when the jar is on HDFS.

I'm trying to launch an application inside the cluster (standalone mode).

According to the docs, the jar-url can be in either file:// or hdfs:// format
(https://spark.incubator.apache.org/docs/latest/spark-standalone.html).

But when I tried to run spark-class, it seemed unable to parse the hdfs://xx
format.

<command>
spark-class org.apache.spark.deploy.Client launch \
    cds-test05:7077 \
    hdfs:///namenode:8020/user/datalab/filename.jar \
    my.package.Runner \
    -i /user/myself/input -o /user/myself/output -m spark://sparkmaster:7077

<output>
Jar url 'hdfs:///namenode:8020/user/datalab/filename.jar' is not a valid URL.
Jar must be in URL format (e.g. hdfs://XX, file://XX)

I've found that the ClientArguments class uses the java.net.URL class to parse
the jar-url, and java.net.URL does not support the hdfs protocol.





Re: Unable to submit an application to a standalone cluster when the jar is on HDFS.

Posted by "haikal.pribadi" <ha...@gmail.com>.
How do you remove the validation block from the code before compiling?

Thank you




Re: Unable to submit an application to a standalone cluster when the jar is on HDFS.

Posted by Patrick Wendell <pw...@gmail.com>.
Thanks for reporting this - this is a bug with the way it validates
the URL. I'm filing this as a blocker for 0.9.1. If you are able to
compile Spark, try just removing the validation block.
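
For anyone patching locally in the meantime, an alternative to deleting the
check outright is to validate with java.net.URI, which only parses syntax and
does not need a protocol handler for hdfs. The snippet below is only an
illustration of that idea, not the actual fix; the helper name and the accepted
schemes are made up:

    import java.net.URI
    import scala.util.Try

    // Hypothetical replacement check: accept anything that parses as a URI
    // with a scheme the client can fetch, instead of relying on java.net.URL.
    def looksLikeJarUrl(s: String): Boolean =
      Try(new URI(s)).toOption.exists { uri =>
        uri.getScheme match {
          case "hdfs" | "file" => uri.getPath != null && uri.getPath.endsWith(".jar")
          case _               => false
        }
      }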

On Tue, Feb 18, 2014 at 10:27 PM, samuel281 <sa...@gmail.com> wrote:
> Actually I tried them both.  (hdfs:///, hdfs://)
> Even I tried test to create java.net.URL instance by writing test code.
>
> URL test = new URL("hdfs://namenode:8020/path/to/jar");
>
> And it throws java.net.MalformedURLException. The message says that it
> doesn't support hdfs protocol.
>
> In the source code, ClientArguments just tries to instantiate URL object and
> that's all. No URLStreamHandler either.
> (https://github.com/apache/incubator-spark/blob/v0.9.0-incubating/core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala)
>
> Anybody come across the same issue?
>
>
> Akhil Das wrote
>> It says "*not a valid URL*"
>>
>> *hdfs:///  - Invalid*
>> *hdfs://   - Valid*
>>
>> Hope that helps!
>>
>>
>> Thanks
>> Best Regards.
>>
>>
>> On Wed, Feb 19, 2014 at 10:05 AM, samuel281 <samuel281@gmail.com> wrote:
>>
>>> I'm trying to launch application inside the cluster (standalone mode)
>>>
>>> According to docs, jar-url can be either file:// or hdfs:// format. (
>>> https://spark.incubator.apache.org/docs/latest/spark-standalone.html)
>>>
>>> But, when I tried to run spark-class It seemed unable to parse hdfs://xx
>>> format.
>>>
>>>
>> <command>
>>> spark-class org.apache.spark.deploy.Client launch \
>>>     cds-test05:7077 \
>>>     hdfs:///namenode:8020/user/datalab/filename.jar \
>>>     my.package.Runner \
>>>     -i /user/myself/input -o /user/myself/output -m
>>> spark://sparkmaster:7077
>>>
>>>
>> <output>
>>> Jar url 'hdfs:///namenode:8020/user/datalab/filename.jar' is not a valid
>>> URL.
>>> Jar must be in URL format (e.g. hdfs://XX, file://XX)
>>>
>>> I've found that *ClientArguments class is using java.net.URL class to
>>> parse jar-url, and It doesn't support hdfs protocol.*
>>>
>>>
>>
>>
>>
>> --
>> Thanks
>> Best Regards
>
> Quoted from:
> http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-submit-an-application-to-standalone-cluster-which-on-hdfs-tp1730p1731.html
>
>
>
>

Re: Unable to submit an application to a standalone cluster when the jar is on HDFS.

Posted by samuel281 <sa...@gmail.com>.
Actually, I tried both forms (hdfs:/// and hdfs://).
I even wrote a small test that creates a java.net.URL instance directly:

    URL test = new URL("hdfs://namenode:8020/path/to/jar");

It throws java.net.MalformedURLException, and the message says the hdfs
protocol is not supported.
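
A slightly expanded version of that test (the hostname and path are just
placeholders) also shows that java.net.URI accepts the exact same string,
because URI only checks syntax and never looks up a protocol handler:

    import java.net.{MalformedURLException, URI, URL}

    object HdfsUrlTest {
      def main(args: Array[String]): Unit = {
        val jar = "hdfs://namenode:8020/path/to/jar"

        // java.net.URL needs a URLStreamHandler for the scheme; the JDK ships
        // none for "hdfs", so this throws MalformedURLException.
        try {
          new URL(jar)
        } catch {
          case e: MalformedURLException => println("URL: " + e.getMessage)
        }

        // java.net.URI is purely syntactic, so the same string parses fine.
        val uri = new URI(jar)
        println("URI: scheme=" + uri.getScheme + ", host=" + uri.getHost +
          ", path=" + uri.getPath)
      }
    }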

In the source code, ClientArguments simply instantiates a URL object and does
nothing more with it; no URLStreamHandler is registered for hdfs either.
(https://github.com/apache/incubator-spark/blob/v0.9.0-incubating/core/src/main/scala/org/apache/spark/deploy/ClientArguments.scala)

Anybody come across the same issue?
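
As a possible workaround (just an idea, not something ClientArguments does
today; it assumes hadoop-common is on the classpath), Hadoop ships an
FsUrlStreamHandlerFactory that can be registered so java.net.URL understands
the hdfs scheme:

    import java.net.URL
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FsUrlStreamHandlerFactory

    object RegisterHdfsHandler {
      def main(args: Array[String]): Unit = {
        // A JVM accepts exactly one URLStreamHandlerFactory, so this must run
        // once, before any hdfs:// URL is constructed.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory(new Configuration()))

        // The string that previously threw MalformedURLException now parses.
        val url = new URL("hdfs://namenode:8020/path/to/jar")
        println("host=" + url.getHost + ", path=" + url.getPath)
      }
    }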


Akhil Das wrote
> It says "*not a valid URL*"
> 
> *hdfs:///  - Invalid*
> *hdfs://   - Valid*
> 
> Hope that helps!
> 
> 
> Thanks
> Best Regards.
> 
> 
> On Wed, Feb 19, 2014 at 10:05 AM, samuel281 <samuel281@gmail.com> wrote:
> 
>> I'm trying to launch application inside the cluster (standalone mode)
>>
>> According to docs, jar-url can be either file:// or hdfs:// format. (
>> https://spark.incubator.apache.org/docs/latest/spark-standalone.html)
>>
>> But, when I tried to run spark-class It seemed unable to parse hdfs://xx
>> format.
>>
>> 
> <command>
>> spark-class org.apache.spark.deploy.Client launch \
>>     cds-test05:7077 \
>>     hdfs:///namenode:8020/user/datalab/filename.jar \
>>     my.package.Runner \
>>     -i /user/myself/input -o /user/myself/output -m
>> spark://sparkmaster:7077
>>
>> 
> <output>
>> Jar url 'hdfs:///namenode:8020/user/datalab/filename.jar' is not a valid
>> URL.
>> Jar must be in URL format (e.g. hdfs://XX, file://XX)
>>
>> I've found that *ClientArguments class is using java.net.URL class to
>> parse jar-url, and It doesn't support hdfs protocol.*
>>
>>
> 
> 
> 
> -- 
> Thanks
> Best Regards

Quoted from: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-submit-an-application-to-standalone-cluster-which-on-hdfs-tp1730p1731.html





Re: Unable to submit an application to a standalone cluster when the jar is on HDFS.

Posted by Akhil Das <ak...@mobipulse.in>.
It says "*not a valid URL*"

*hdfs:///  - Invalid*
*hdfs://   - Valid*

Hope that helps!


Thanks
Best Regards.


On Wed, Feb 19, 2014 at 10:05 AM, samuel281 <sa...@gmail.com> wrote:

> I'm trying to launch application inside the cluster (standalone mode)
>
> According to docs, jar-url can be either file:// or hdfs:// format. (
> https://spark.incubator.apache.org/docs/latest/spark-standalone.html)
>
> But, when I tried to run spark-class It seemed unable to parse hdfs://xx
> format.
>
> <command>
> spark-class org.apache.spark.deploy.Client launch \
>     cds-test05:7077 \
>     hdfs:///namenode:8020/user/datalab/filename.jar \
>     my.package.Runner \
>     -i /user/myself/input -o /user/myself/output -m
> spark://sparkmaster:7077
>
> <output>
> Jar url 'hdfs:///namenode:8020/user/datalab/filename.jar' is not a valid
> URL.
> Jar must be in URL format (e.g. hdfs://XX, file://XX)
>
> I've found that *ClientArguments class is using java.net.URL class to parse
> jar-url, and It doesn't support hdfs protocol.*
>
>



-- 
Thanks
Best Regards