Posted to user@spark.apache.org by Aakash Basu <aa...@gmail.com> on 2018/09/06 10:05:53 UTC
XGBoost Not distributing on cluster having more than 1 worker
Hi,
We're trying to use the XGBoost package from DMLC. It runs successfully on
a standalone machine, but it gets stuck whenever there are 2 or more workers.
PFA:
Code Filename: test.py
Data: trainvorg.csv
Spark Submit command:
spark-submit --master spark://192.168.80.10:7077 \
  --jars "$SPARK_HOME/jars/*.jar" \
  --num-executors 2 --executor-cores 5 --executor-memory 10G \
  --driver-cores 5 --driver-memory 25G \
  --conf spark.sql.shuffle.partitions=100 \
  --conf spark.driver.maxResultSize=2G \
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC" \
  --conf spark.default.parallelism=8 \
  --conf spark.scheduler.listenerbus.eventqueue.capacity=20000 \
  /appdata/test.py
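In case it matters: our understanding is that XGBoost's distributed training waits until all of its requested workers have started before it does anything, so a hang can happen if the cluster cannot schedule that many tasks at once. This is a minimal sketch (illustrative function names, not part of test.py) of the slot arithmetic we used to sanity-check the submit command above:

```python
# Hedged sketch: XGBoost4J-Spark's tracker reportedly blocks until all
# num_workers training tasks are running simultaneously. If the cluster
# has fewer concurrent task slots than num_workers, training hangs.
# Function names below are illustrative, not a real API.

def max_concurrent_tasks(num_executors, executor_cores, cores_per_task=1):
    """Number of tasks Spark can run at the same time."""
    return num_executors * (executor_cores // cores_per_task)

def can_start_training(num_workers, num_executors, executor_cores,
                       cores_per_task=1):
    """True if all XGBoost workers can be scheduled at once."""
    return num_workers <= max_concurrent_tasks(
        num_executors, executor_cores, cores_per_task)

# With the submit command above: 2 executors x 5 cores = 10 slots.
print(max_concurrent_tasks(2, 5))                # 10
print(can_start_training(2, 2, 5))               # True
print(can_start_training(16, 2, 5))              # False: would hang
```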
Issue being faced:
[image: Screen Shot 2018-09-04 at 5.34.31 PM.png]
Any help?
Thanks,
Aakash.
Re: XGBoost Not distributing on cluster having more than 1 worker
Posted by Aakash Basu <aa...@gmail.com>.
Hi all,
This is the error that is causing these retries and failures.
Can anyone help with understanding why it happens, and the probable fix for
this?
[image: Screen Shot 2018-09-06 at 4.40.31 PM.png]
Thanks,
Aakash.
On Thu, Sep 6, 2018 at 3:35 PM Aakash Basu <aa...@gmail.com>
wrote: