Posted to user@predictionio.apache.org by amal kumar <am...@gmail.com> on 2016/10/17 08:43:42 UTC

PredictionIO with Remote Spark | pio train error

Hi,

I am trying to use Spark installed on a remote cluster, separate from the
PredictionIO server (installed on EC2).

As per my understanding, I have made the following changes:
1. installed a matching version of Spark locally
2. updated SPARK_HOME in conf/pio-env.sh to point to the locally installed
Spark
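For reference, the change in conf/pio-env.sh is just the one line below (the install path and Spark version are specific to our setup and only illustrative):

```shell
# conf/pio-env.sh -- point PIO at the locally installed Spark
# (install path and version below are illustrative)
SPARK_HOME=/opt/spark-1.6.2-bin-hadoop2.6
```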

'pio status' succeeds, and I am also able to start the event server with 'pio
eventserver --ip 0.0.0.0 &'.


But I am getting an error while training the model.

$ pio train -- --master spark://MYSPARKCLUSTER:7077

[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@172.31.6.92:35117]
[WARN] [AppClient$ClientEndpoint] Failed to connect to master MYSPARKCLUSTER:7077
[WARN] [AppClient$ClientEndpoint] Failed to connect to master MYSPARKCLUSTER:7077
[WARN] [AppClient$ClientEndpoint] Failed to connect to master MYSPARKCLUSTER:7077
[ERROR] [SparkDeploySchedulerBackend] Application has been killed. Reason: All masters are unresponsive! Giving up.
[WARN] [SparkDeploySchedulerBackend] Application ID is not initialized yet.
[WARN] [AppClient$ClientEndpoint] Failed to connect to master MYSPARKCLUSTER:7077
[WARN] [AppClient$ClientEndpoint] Drop UnregisterApplication(null) because has not yet connected to master
[ERROR] [SparkContext] Error initializing SparkContext.



Can you advise what I am missing here?


Thanks,
Amal Kumar

Re: PredictionIO with Remote Spark | pio train error

Posted by Donald Szeto <do...@apache.org>.
Hi Amal,

For yarn-cluster mode, please use the --scratch-uri argument and point it
to a location on HDFS where you have write access. PIO will copy the
necessary files to the HDFS location for your yarn-cluster Spark driver to
read.
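For example (the HDFS path below is only illustrative, and I am assuming the option is given before the -- separator, since it is a pio option rather than a spark-submit one):

```shell
# Illustrative HDFS location; any directory you can write to works
pio train --scratch-uri hdfs:///tmp/pio-scratch -- --master yarn-cluster
```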

Regards,
Donald

On Tuesday, October 18, 2016, amal kumar <am...@gmail.com> wrote:


Re: PredictionIO with Remote Spark | pio train error

Posted by amal kumar <am...@gmail.com>.
Hi,

We are using Spark on YARN, and after referring to the URL below, we have been
able to submit jobs to YARN from the remote machine (i.e. the PredictionIO
server).

http://theckang.com/2015/remote-spark-jobs-on-yarn/

1. copied core-site.xml and yarn-site.xml from the YARN cluster onto the remote
machine (i.e. the PredictionIO server)
2. set the HADOOP_CONF_DIR environment variable in spark-env.sh (of the locally
installed copy) on the remote machine so that core-site.xml and yarn-site.xml
can be located
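For reference, the relevant line in spark-env.sh looks like the following (the Hadoop config path is specific to our setup and only illustrative):

```shell
# spark-env.sh of the locally installed Spark on the PredictionIO machine
# Path below is illustrative; it must contain core-site.xml and yarn-site.xml
export HADOOP_CONF_DIR=/etc/hadoop/conf
```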


Now, when I try to train the model using the command below, I get a new
error.

pio train -- --master yarn-cluster


Error:
[ERROR] [CreateWorkflow$] Error reading from file: File file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json does not exist. Aborting workflow


I also tried passing the file path explicitly, but no luck.

pio train -- --master yarn-cluster --files file:/home/user/PredictionIO/SimilarProductRecommendation/engine.json


Thanks,
Amal

Re: PredictionIO with Remote Spark | pio train error

Posted by Chan Lee <ch...@gmail.com>.
Hi Amal,

It seems that you're using a standalone Spark cluster. In this case,
instead of `pio eventserver &`, you would have to run `pio-start-all` to
start the master and slave processes. If you need to modify the default
settings, you can refer to
http://spark.apache.org/docs/latest/spark-standalone.html for further info
and start the master/slave processes yourself.
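If the processes are up but the driver still logs "Failed to connect", it may also be worth checking from the PredictionIO machine that the master's port is actually reachable (the hostname below is a placeholder). A rough sketch:

```shell
# Verify the standalone master's RPC port is reachable from the driver machine
nc -zv MYSPARKCLUSTER 7077

# If the master/worker are not running yet, the standard Spark 1.x scripts are:
$SPARK_HOME/sbin/start-master.sh
$SPARK_HOME/sbin/start-slave.sh spark://MYSPARKCLUSTER:7077
```

Note that the spark:// URL must match the master URL shown at the top of the master's web UI exactly, including the hostname form.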

Chan

On Mon, Oct 17, 2016 at 1:43 AM, amal kumar <am...@gmail.com>
wrote:
