Posted to user@predictionio.apache.org by George Yarish <gy...@griddynamics.com> on 2018/04/13 22:51:49 UTC

pio deploy without spark context

Hi all,

We use a PIO engine which doesn't require Apache Spark at serving time, but
from my understanding a SparkContext will be created by the "pio deploy"
process by default anyway.
My question is: is there any way to deploy an engine while avoiding creation
of a Spark application if I don't need it?

Thanks,
George

Re: pio deploy without spark context

Posted by Matthew Tovbin <to...@apache.org>.
Donald,

It would be great to collaborate on that!

- Matthew

On Sat, Apr 14, 2018, 10:23 Pat Ferrel <pa...@occamsmachete.com> wrote:

> The need for Spark at query time depends on the engine. Which one are you
> using? The Universal Recommender, which I maintain, does not require Spark
> for queries but uses PIO. We simply don’t use the Spark context, so it is
> ignored. To make PIO work you need to have the Spark code accessible, but
> that doesn’t mean there must be a Spark cluster: you can set the Spark
> master to “local”, and then no Spark resources are used by the deployed
> PIO PredictionServer.
>
> We have infra code to spin up a Spark cluster for training and bring it
> back down afterward. This all works just fine. The UR PredictionServer
> also has no need to be re-deployed, since the model is hot-swapped after
> training. Deploy once, run forever. And there is no real requirement for
> Spark to do queries.
>
> So depending on the engine, the requirement for Spark is code-level, not
> system-level.
>
>
> From: Donald Szeto <do...@apache.org>
> Reply: user@predictionio.apache.org <us...@predictionio.apache.org>
> Date: April 13, 2018 at 4:48:15 PM
> To: user@predictionio.apache.org <us...@predictionio.apache.org>
> Subject: Re: pio deploy without spark context
>
> Hi George,
>
> This is unfortunately not possible now without modifying the source code,
> but we are planning to refactor PredictionIO to be runtime-agnostic,
> meaning the engine server would be independent and SparkContext would not
> be created if not necessary.
>
> We will start a discussion on the refactoring soon. You are very welcome
> to add your input then, and any subsequent contribution would be highly
> appreciated.
>
> Regards,
> Donald
>
> On Fri, Apr 13, 2018 at 3:51 PM George Yarish <gy...@griddynamics.com>
> wrote:
>
>> Hi all,
>>
>> We use a PIO engine which doesn't require Apache Spark at serving time,
>> but from my understanding a SparkContext will be created by the "pio
>> deploy" process by default anyway.
>> My question is: is there any way to deploy an engine while avoiding
>> creation of a Spark application if I don't need it?
>>
>> Thanks,
>> George
>>
>>

Re: pio deploy without spark context

Posted by Pat Ferrel <pa...@occamsmachete.com>.
The need for Spark at query time depends on the engine. Which one are you
using? The Universal Recommender, which I maintain, does not require Spark
for queries but uses PIO. We simply don’t use the Spark context, so it is
ignored. To make PIO work you need to have the Spark code accessible, but
that doesn’t mean there must be a Spark cluster: you can set the Spark
master to “local”, and then no Spark resources are used by the deployed PIO
PredictionServer.
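
To make that concrete, here is a minimal sketch, assuming a standard
PredictionIO install where anything after `--` on a pio command is passed
through to spark-submit (verify the flags against your PIO version):

```shell
# Deploy with the Spark master pinned to local, so no cluster is contacted.
# (Assumes SPARK_HOME points at a local Spark distribution.)
pio deploy -- --master local
```

Training can still target a real cluster by passing a different master to
`pio train` in the same way.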

We have infra code to spin up a Spark cluster for training and bring it
back down afterward. This all works just fine. The UR PredictionServer also
has no need to be re-deployed, since the model is hot-swapped after
training. Deploy once, run forever. And there is no real requirement for
Spark to do queries.

So depending on the engine, the requirement for Spark is code-level, not
system-level.


From: Donald Szeto <do...@apache.org>
Reply: user@predictionio.apache.org <us...@predictionio.apache.org>
Date: April 13, 2018 at 4:48:15 PM
To: user@predictionio.apache.org <us...@predictionio.apache.org>
Subject: Re: pio deploy without spark context

Hi George,

This is unfortunately not possible now without modifying the source code,
but we are planning to refactor PredictionIO to be runtime-agnostic,
meaning the engine server would be independent and SparkContext would not
be created if not necessary.

We will start a discussion on the refactoring soon. You are very welcome to
add your input then, and any subsequent contribution would be highly
appreciated.

Regards,
Donald

On Fri, Apr 13, 2018 at 3:51 PM George Yarish <gy...@griddynamics.com>
wrote:

> Hi all,
>
> We use a PIO engine which doesn't require Apache Spark at serving time,
> but from my understanding a SparkContext will be created by the "pio
> deploy" process by default anyway.
> My question is: is there any way to deploy an engine while avoiding
> creation of a Spark application if I don't need it?
>
> Thanks,
> George
>
>

Re: pio deploy without spark context

Posted by Donald Szeto <do...@apache.org>.
Hi George,

This is unfortunately not possible now without modifying the source code,
but we are planning to refactor PredictionIO to be runtime-agnostic,
meaning the engine server would be independent and SparkContext would not
be created if not necessary.
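
For what it's worth, the general shape of that refactoring is an ordinary
lazy-initialization pattern. A hypothetical sketch in plain Scala (this is
not actual PredictionIO code; `buildContext` stands in for the expensive
`new SparkContext(conf)` call):

```scala
// Hypothetical sketch, not PredictionIO internals: create the runtime
// context only on first use, so engines that never touch Spark never
// trigger SparkContext creation.
object LazyRuntime {
  @volatile private var created = false

  // Stand-in for the expensive `new SparkContext(conf)` call.
  private def buildContext(): String = {
    created = true
    "spark-context"
  }

  // `lazy val` defers buildContext() until the first access.
  lazy val context: String = buildContext()

  def contextCreated: Boolean = created
}
```

An engine server built this way would serve queries indefinitely without a
Spark application unless some component actually dereferences the context.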

We will start a discussion on the refactoring soon. You are very welcome to
add your input then, and any subsequent contribution would be highly
appreciated.

Regards,
Donald

On Fri, Apr 13, 2018 at 3:51 PM George Yarish <gy...@griddynamics.com>
wrote:

> Hi all,
>
> We use a PIO engine which doesn't require Apache Spark at serving time,
> but from my understanding a SparkContext will be created by the "pio
> deploy" process by default anyway.
> My question is: is there any way to deploy an engine while avoiding
> creation of a Spark application if I don't need it?
>
> Thanks,
> George
>
>