Posted to user@spark.apache.org by 朱旻 <zz...@126.com> on 2016/05/10 07:51:41 UTC

spark uploading resource error

hi all:
I found a problem using Spark.
When I use spark-submit to launch a task, it works:


spark-submit --num-executors 8 --executor-memory 8G --class com.icbc.nss.spark.PfsjnlSplit  --master yarn-cluster /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


but when I use the command created by spark-class, as below:


/home/nssbatch/huaweiclient/hadoopclient/JDK/jdk/bin/java -Djava.security.krb5.conf=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/var/krb5kdc/krb5.conf -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com -Djava.security.auth.login.config=/home/nssbatch/huaweiclient/hadoopclient/Spark/adapter/client/controller/jaas.conf -Dzookeeper.kinit=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/bin/kinit -cp /home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/spark-assembly-1.3.0-hadoop2.7.1.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-core-3.2.10.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Yarn/config/ org.apache.spark.deploy.SparkSubmit --master yarn-cluster --class com.icbc.nss.spark.PfsjnlSplit --num-executors 8 --executor-memory 8G /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


it didn't work.
I compared the logs and found that:


16/05/10 22:23:24 INFO Client: Uploading resource file:/tmp/spark-a4457754-7183-44ce-bd0d-32a071757c92/__hadoop_conf__4372868703234608846.zip -> hdfs://hacluster/user/nss/.sparkStaging/application_1462442311990_0057/__hadoop_conf__4372868703234608846.zip  


The conf file uploaded into HDFS was different.


Why did this happen?
Where can I find the list of resource files to be uploaded?
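One way to see why the uploaded conf archive differs is to diff the two candidate HADOOP_CONF_DIR trees directly. A minimal sketch, assuming the real candidates are the Spark client's conf dir and the Hadoop client's config dir from the classpath above; two tiny throwaway dirs stand in here so the commands run anywhere:

```shell
# Sketch: diff two candidate HADOOP_CONF_DIR trees to see which conf
# files differ. In this thread the real candidates would be
# .../Spark/spark/conf and .../Yarn/config; fabricated dirs stand in.
a=$(mktemp -d); b=$(mktemp -d)
printf '<configuration/>\n' > "$a/core-site.xml"
printf '<configuration><property/></configuration>\n' > "$b/core-site.xml"
diff -rq "$a" "$b" || true   # lists differing files without dumping contents
```

`diff -rq` recurses and reports only which files differ, which is usually enough to spot a stale or wrong conf dir being zipped up.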


Re:Re: Re: spark uploading resource error

Posted by 朱旻 <zz...@126.com>.
thanks!
I solved the problem.
spark-submit changed HADOOP_CONF_DIR to spark/conf, which was correct,
but using java *****...  directly didn't change HADOOP_CONF_DIR; it still used hadoop/etc/hadoop.
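For anyone hitting the same thing, a minimal sketch of the fix: set the same HADOOP_CONF_DIR that spark-submit's wrapper scripts set before invoking `org.apache.spark.deploy.SparkSubmit` via java directly. The path is taken from the classpath earlier in this thread; adjust for your install:

```shell
# Point HADOOP_CONF_DIR at the Spark client's conf dir, as spark-submit
# does, before launching SparkSubmit via a raw java command. The path
# below is the one from this thread; substitute your own.
export HADOOP_CONF_DIR=/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```

A bare `java ... org.apache.spark.deploy.SparkSubmit ...` inherits whatever the shell's environment has, which is why the staged conf zip differed between the two launch methods.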






At 2016-05-10 16:39:47, "Saisai Shao" <sa...@gmail.com> wrote:

The code is in Client.scala under yarn sub-module (see the below link). Maybe you need to check the vendor version about their changes to the Apache Spark code.


https://github.com/apache/spark/blob/branch-1.3/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala



Thanks
Saisai


On Tue, May 10, 2016 at 4:17 PM, 朱旻 <zz...@126.com> wrote:




It was a product sold by Huawei, named FusionInsight. It says Spark was 1.3 with Hadoop 2.7.1.


Where can I find the code or config file that defines the files to be uploaded?



At 2016-05-10 16:06:05, "Saisai Shao" <sa...@gmail.com> wrote:

What version of Spark are you using? From my understanding, there's no code in yarn#client that uploads "__hadoop_conf__" into the distributed cache.






On Tue, May 10, 2016 at 3:51 PM, 朱旻 <zz...@126.com> wrote:

hi all:
I found a problem using Spark.
When I use spark-submit to launch a task, it works:


spark-submit --num-executors 8 --executor-memory 8G --class com.icbc.nss.spark.PfsjnlSplit  --master yarn-cluster /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


but when I use the command created by spark-class, as below:


/home/nssbatch/huaweiclient/hadoopclient/JDK/jdk/bin/java -Djava.security.krb5.conf=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/var/krb5kdc/krb5.conf -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com -Djava.security.auth.login.config=/home/nssbatch/huaweiclient/hadoopclient/Spark/adapter/client/controller/jaas.conf -Dzookeeper.kinit=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/bin/kinit -cp /home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/spark-assembly-1.3.0-hadoop2.7.1.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-core-3.2.10.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Yarn/config/ org.apache.spark.deploy.SparkSubmit --master yarn-cluster --class com.icbc.nss.spark.PfsjnlSplit --num-executors 8 --executor-memory 8G /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


it didn't work.
I compared the logs and found that:


16/05/10 22:23:24 INFO Client: Uploading resource file:/tmp/spark-a4457754-7183-44ce-bd0d-32a071757c92/__hadoop_conf__4372868703234608846.zip -> hdfs://hacluster/user/nss/.sparkStaging/application_1462442311990_0057/__hadoop_conf__4372868703234608846.zip  


The conf file uploaded into HDFS was different.


Why did this happen?
Where can I find the list of resource files to be uploaded?


Re: Re: spark uploading resource error

Posted by Saisai Shao <sa...@gmail.com>.
The code is in Client.scala under yarn sub-module (see the below link).
Maybe you need to check the vendor version about their changes to the
Apache Spark code.

https://github.com/apache/spark/blob/branch-1.3/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala

Thanks
Saisai
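One way to follow this suggestion is to grep the vendor's Spark sources for the "__hadoop_conf__" marker seen in the upload log; a hit pinpoints the code that stages the conf zip. A self-contained sketch, where a fabricated one-file source tree stands in for a real checkout:

```shell
# Grep a Spark source tree for the "__hadoop_conf__" marker from the
# upload log. A fabricated one-file tree stands in for a real checkout;
# in practice, run the grep against the vendor's Spark sources.
src=$(mktemp -d)
mkdir -p "$src/yarn/src/main/scala/org/apache/spark/deploy/yarn"
printf 'val confArchive = "__hadoop_conf__"\n' \
  > "$src/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala"
grep -rl '__hadoop_conf__' "$src"   # prints the path of each matching file
```

If the string appears only in the vendor's Client.scala and not in the Apache branch-1.3 file linked above, that confirms the behavior is a vendor modification.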

On Tue, May 10, 2016 at 4:17 PM, 朱旻 <zz...@126.com> wrote:

>
>
> it was a product sold by huawei . name is FusionInsight. it says spark was
> 1.3 with hadoop 2.7.1
>
> where can i find the code or config file which define the files to be
> uploaded?
>
>
> At 2016-05-10 16:06:05, "Saisai Shao" <sa...@gmail.com> wrote:
>
> What is the version of Spark are you using? From my understanding, there's
> no code in yarn#client will upload "__hadoop_conf__" into distributed cache.
>
>
>
> On Tue, May 10, 2016 at 3:51 PM, 朱旻 <zz...@126.com> wrote:
>
>> hi all:
>> I found a problem using spark .
>> WHEN I use spark-submit to launch a task. it works
>>
>> *spark-submit --num-executors 8 --executor-memory 8G --class
>> com.icbc.nss.spark.PfsjnlSplit  --master yarn-cluster
>> /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar
>> /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join
>> /user/nss/output_join2*
>>
>> but when i use the command created by spark-class  as below
>>
>> */home/nssbatch/huaweiclient/hadoopclient/JDK/jdk/bin/java
>> -Djava.security.krb5.conf=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/var/krb5kdc/krb5.conf
>> -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com
>> <http://hadoop.hadoop.com>
>> -Djava.security.auth.login.config=/home/nssbatch/huaweiclient/hadoopclient/Spark/adapter/client/controller/jaas.conf
>> -Dzookeeper.kinit=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/bin/kinit
>> -cp
>> /home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/spark-assembly-1.3.0-hadoop2.7.1.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-core-3.2.10.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Yarn/config/
>> org.apache.spark.deploy.SparkSubmit --master yarn-cluster --class
>> com.icbc.nss.spark.PfsjnlSplit --num-executors 8 --executor-memory 8G
>> /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar
>> /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join
>> /user/nss/output_join2*
>>
>> it didn't work.
>> i compare the log.and found that:
>>
>> 16/05/10 22:23:24 INFO Client: Uploading resource
>> file:/tmp/spark-a4457754-7183-44ce-bd0d-32a071757c92/__hadoop_conf__4372868703234608846.zip
>> ->
>> hdfs://hacluster/user/nss/.sparkStaging/application_1462442311990_0057/__hadoop_conf__4372868703234608846.zip
>>
>>
>> the conf_file uploaded into hdfs was different.
>>
>> why is this happened?
>> where can i find the resource file to be uploading?

Re:Re: spark uploading resource error

Posted by 朱旻 <zz...@126.com>.


It was a product sold by Huawei, named FusionInsight. It says Spark was 1.3 with Hadoop 2.7.1.


Where can I find the code or config file that defines the files to be uploaded?



At 2016-05-10 16:06:05, "Saisai Shao" <sa...@gmail.com> wrote:

What version of Spark are you using? From my understanding, there's no code in yarn#client that uploads "__hadoop_conf__" into the distributed cache.






On Tue, May 10, 2016 at 3:51 PM, 朱旻 <zz...@126.com> wrote:

hi all:
I found a problem using Spark.
When I use spark-submit to launch a task, it works:


spark-submit --num-executors 8 --executor-memory 8G --class com.icbc.nss.spark.PfsjnlSplit  --master yarn-cluster /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


but when I use the command created by spark-class, as below:


/home/nssbatch/huaweiclient/hadoopclient/JDK/jdk/bin/java -Djava.security.krb5.conf=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/var/krb5kdc/krb5.conf -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com -Djava.security.auth.login.config=/home/nssbatch/huaweiclient/hadoopclient/Spark/adapter/client/controller/jaas.conf -Dzookeeper.kinit=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/bin/kinit -cp /home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/spark-assembly-1.3.0-hadoop2.7.1.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-core-3.2.10.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Yarn/config/ org.apache.spark.deploy.SparkSubmit --master yarn-cluster --class com.icbc.nss.spark.PfsjnlSplit --num-executors 8 --executor-memory 8G /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join /user/nss/output_join2


it didn't work.
I compared the logs and found that:


16/05/10 22:23:24 INFO Client: Uploading resource file:/tmp/spark-a4457754-7183-44ce-bd0d-32a071757c92/__hadoop_conf__4372868703234608846.zip -> hdfs://hacluster/user/nss/.sparkStaging/application_1462442311990_0057/__hadoop_conf__4372868703234608846.zip  


The conf file uploaded into HDFS was different.


Why did this happen?
Where can I find the list of resource files to be uploaded?

Re: spark uploading resource error

Posted by Saisai Shao <sa...@gmail.com>.
What version of Spark are you using? From my understanding, there's
no code in yarn#client that uploads "__hadoop_conf__" into the distributed cache.



On Tue, May 10, 2016 at 3:51 PM, 朱旻 <zz...@126.com> wrote:

> hi all:
> I found a problem using spark .
> WHEN I use spark-submit to launch a task. it works
>
> *spark-submit --num-executors 8 --executor-memory 8G --class
> com.icbc.nss.spark.PfsjnlSplit  --master yarn-cluster
> /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar
> /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join
> /user/nss/output_join2*
>
> but when i use the command created by spark-class  as below
>
> */home/nssbatch/huaweiclient/hadoopclient/JDK/jdk/bin/java
> -Djava.security.krb5.conf=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/var/krb5kdc/krb5.conf
> -Dzookeeper.server.principal=zookeeper/hadoop.hadoop.com
> <http://hadoop.hadoop.com>
> -Djava.security.auth.login.config=/home/nssbatch/huaweiclient/hadoopclient/Spark/adapter/client/controller/jaas.conf
> -Dzookeeper.kinit=/home/nssbatch/huaweiclient/hadoopclient/KrbClient/kerberos/bin/kinit
> -cp
> /home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/spark-assembly-1.3.0-hadoop2.7.1.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-core-3.2.10.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-rdbms-3.2.9.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/lib/datanucleus-api-jdo-3.2.6.jar:/home/nssbatch/huaweiclient/hadoopclient/Spark/spark/conf/:/home/nssbatch/huaweiclient/hadoopclient/Yarn/config/
> org.apache.spark.deploy.SparkSubmit --master yarn-cluster --class
> com.icbc.nss.spark.PfsjnlSplit --num-executors 8 --executor-memory 8G
> /home/nssbatch/nss_schedual/jar/SparkBigtableJoinSqlJava.jar
> /user/nss/nss-20151018-pfsjnl-004_024.txt /user/nss/output_join
> /user/nss/output_join2*
>
> it didn't work.
> i compare the log.and found that:
>
> 16/05/10 22:23:24 INFO Client: Uploading resource
> file:/tmp/spark-a4457754-7183-44ce-bd0d-32a071757c92/__hadoop_conf__4372868703234608846.zip
> ->
> hdfs://hacluster/user/nss/.sparkStaging/application_1462442311990_0057/__hadoop_conf__4372868703234608846.zip
>
>
> the conf_file uploaded into hdfs was different.
>
> why is this happened?
> where can i find the resource file to be uploading?