Posted to user@predictionio.apache.org by Digambar Bhat <di...@gmail.com> on 2016/09/06 04:05:08 UTC

Re: Setup PredictionIO for large events

Update please..

On 30-Aug-2016 8:06 pm, "Digambar Bhat" <di...@gmail.com> wrote:

> I am using Universal Recommender.
>
> On 30-Aug-2016 8:05 pm, "Pat Ferrel" <pa...@occamsmachete.com> wrote:
>
>> Training time is also template dependent, what template are you using?
>>
>> On Aug 30, 2016, at 12:21 AM, Digambar Bhat <di...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> I have been using PredictionIO for the last year, and it has been
>> working fine for me.
>>
>> Earlier, importing and training worked flawlessly, but training has
>> become very slow as the number of events has grown; it now takes
>> almost 9-10 hours.
>>
>> Currently, there are about 15 million events and about 10 million items.
>>
>> The architecture is as follows:
>> Spark and Elasticsearch run on two machines; Hadoop and HBase run on
>> another two machines.
>>
>> Each machine has the following configuration:
>> 160 GB RAM, 40 CPUs (10 cores per socket), 3000 MHz
>>
>> So please let me know the right configuration for this many events.
>> Also, what should I plan for as my events grow toward a billion? Will
>> it work for such a large data set?
>>
>> Thanks in advance.
>>
>> Thanks,
>> Digambar
>>
>>

Re: Setup PredictionIO for large events

Posted by Pat Ferrel <pa...@occamsmachete.com>.
If the question is about training time, post it to the UR support forum here: https://groups.google.com/forum/#!forum/actionml-user

The best way to answer this, as I said before, is to capture the Spark GUI timeline output. It will show, down to the task level, how long things are taking. Several things can be bottlenecks, like reading from HBase, writing to ES, or the math itself. The timeline has all of these broken out.

Spark does not persist the logs after the job is done, so you will need to tell it to persist them in the job params in order to examine them after completion.

There are many ways to set log persistence, so you can google that. To use the PIO CLI, set spark.eventLog.enabled to true:

    pio train -- --conf spark.eventLog.enabled=true

Notice that -- is a separator, so put any pio options before it on the command line; everything after it is passed raw to spark-submit, so see its docs for further options.
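
For example, you can also set spark.eventLog.dir so the logs end up somewhere you can find them later. A sketch only: the HDFS path below is an illustration, not a required location; any directory the Spark user can write to will do:

    pio train -- --conf spark.eventLog.enabled=true \
                 --conf spark.eventLog.dir=hdfs:///spark-events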

You will find the timeline by clicking the finished job on the front page of the GUI, then expanding the “timeline” link at the top of the job page.
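
If the job's UI is already gone, the persisted logs can also be replayed through the Spark history server. A rough sketch, assuming a standard Spark install and the same event log directory as above:

    # point the history server at the event log directory, e.g. in
    # $SPARK_HOME/conf/spark-defaults.conf:
    #   spark.history.fs.logDirectory hdfs:///spark-events
    $SPARK_HOME/sbin/start-history-server.sh
    # then browse to http://<history-server-host>:18080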


Re: Setup PredictionIO for large events

Posted by Tom Chan <yu...@gmail.com>.
It's been a while since I last did that, so I might be wrong or outdated. I
think it is in the HBase web UI (port 60010 by default), and from there you
can split the regions into more. If that's not the case, you can look up
ways to do it by searching for "hbase split region".
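
For example, splitting can also be done from the hbase shell. A sketch only: the table name below is an assumption (PredictionIO typically names the event table something like pio_event:events_<appId>), so check with "list" first:

    hbase shell
    > list                             # find your app's event table name
    > split 'pio_event:events_1'       # ask HBase to split each region of the table
    > split 'pio_event:events_1', 'k'  # or split at an explicit row key 'k'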

Re: Setup PredictionIO for large events

Posted by Digambar Bhat <di...@gmail.com>.
Thanks, Tom, for the reply.

I checked the number of cores: there are two CPUs with 10 cores each, and
virtualization is enabled, so we get 40 CPUs in total. The number of
regions for the app table is 2.

So may I know how to increase the number of regions for the app table?

Re: Setup PredictionIO for large events

Posted by Tom Chan <yu...@gmail.com>.
One quick thing to check is the number of regions in the HBase table for
your app. If it is less than the number of cores you have, then you won't
be utilizing all of your computing power. Hope this helps.
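
A quick way to count them, as a sketch (syntax may vary by HBase version; the master web UI's table page also shows the region count):

    hbase shell
    > scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}   # one row per region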

Tom
