Posted to user@predictionio.apache.org by Shane Johnson <sh...@liftiq.com> on 2018/07/27 23:11:30 UTC

Optimizing num-cores, driver and executor mem in PIO

Hi team,

We just moved our ML prototype to AWS. I currently have everything
configured on a single r5d.xlarge machine, and training takes about an hour
with mostly default settings. Can someone advise what I can do to use the
cores differently to split up the jobs/tasks and speed up processing? If I
migrate to the r5d.4xlarge, is it reasonable to expect processing to be
roughly 4x faster because I am moving from 4 cores to 16 cores? Are there
parameters I need to set, or will Spark make the best use of the cores and
memory automatically? I am using a simple random forest model in the lead
scoring template.

Perhaps I need to adjust the Spark config or the spark-submit parameters.
Can someone help me understand how driver memory, executor memory, and the
number of cores play together, and how I should think about them and other
parameters to optimize the training process, given that I am still running
on a single machine and not a cluster?

I am trying to understand the optimal setup for training based on the
r5d.xlarge:
pio train -- --driver-memory 32G --executor-memory 32G --num-cores 4
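
For reference, here is what I think the local-mode equivalent would look
like. This is only a sketch based on my current understanding: I am assuming
PIO runs Spark in local mode on this single machine (4 vCPUs, 32 GiB RAM),
and that --master local[4] plus --driver-memory is the way to pin the core
count, since spark-submit itself exposes --executor-cores rather than
--num-cores:

# assumed local-mode sketch: all 4 vCPUs, leaving memory headroom for the OS and PIO services
pio train -- --master local[4] --driver-memory 24g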

I would like to move to an r5d.4xlarge to get training down to 15 minutes
or faster once I have a better handle on tuning Spark. Thank you for the
help.

Best,

Shane






*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>

Re: Optimizing num-cores, driver and executor mem in PIO

Posted by Shane Johnson <sh...@liftiq.com>.
Hi team,

I have been digging into this over the weekend and believe I am on the
right path to speeding up our training with the following settings on a
single machine (16 cores, 64 GB RAM). We are now moving to a cluster and
are having other issues; I will post a separate thread about those. I think
my initial question on this thread has been answered by this article:
https://stackoverflow.com/questions/37871194/how-to-tune-spark-executor-number-cores-and-executor-memory

I am now using an m5.4xlarge with the following spark-submit parameters:

pio train -- --executor-cores 5 --executor-memory 19g --num-executors 3
--driver-memory 48g
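
The rough sizing arithmetic behind those numbers, as I read the article
(assuming 16 vCPUs and 64 GB RAM on the node):

usable cores per node:  16 - 1 reserved for the OS = 15
executors per node:     15 / 5 cores each = 3
memory per executor:    64 GB / 3 ≈ 21 GB, less ~7% off-heap overhead ≈ 19 GB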


Thanks

*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>


