Posted to dev@predictionio.apache.org by "Mars Hall (JIRA)" <ji...@apache.org> on 2017/11/09 16:55:01 UTC

[jira] [Created] (PIO-138) Batch predict fails when using a PersistentModel

Mars Hall created PIO-138:
-----------------------------

             Summary: Batch predict fails when using a PersistentModel
                 Key: PIO-138
                 URL: https://issues.apache.org/jira/browse/PIO-138
             Project: PredictionIO
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.12.0-incubating
            Reporter: Mars Hall


Issue based on a PR/issue opened on GitHub:
https://github.com/apache/incubator-predictionio/pull/441

h2. Problem

{quote}pio batchpredict --input /tmp/pio/batchpredict-input.json --output /tmp/pio/batchpredict-output.json

[WARN] [ALSModel] Product factor is not cached. Prediction could be slow.
Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. {quote}

h2. Root Cause

BatchPredict makes multiple SparkContexts:
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L160
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L183

When using a {{PersistentModel}}/{{PersistentModelLoader}}, PredictionIO doesn't stop the first SparkContext:
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/controller/Engine.scala#L241-L250
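A simplified sketch of the failing flow (an illustration, not the actual PredictionIO source; names like {{loadSc}} and {{predictSc}} are hypothetical). Because a {{PersistentModel}}'s RDDs hold a reference to the context used to load them, the first context is kept alive, and the second {{new SparkContext}} throws:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object BatchPredictSketch {
  def main(args: Array[String]): Unit = {
    // First context: created to deserialize the PersistentModel (e.g. ALSModel).
    val loadSc = new SparkContext(new SparkConf().setAppName("pio-load"))
    // ... model loaded via PersistentModelLoader, keeping loadSc alive ...

    // Second context for the batch-predict job. With a PersistentModel,
    // loadSc was never stopped, so this throws SparkException:
    // "Only one SparkContext may be running in this JVM (see SPARK-2243)"
    val predictSc = new SparkContext(new SparkConf().setAppName("pio-batchpredict"))
  }
}
{code}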

For example, the Recommendation Engine Template uses this technique:
https://github.com/apache/incubator-predictionio-template-recommender/blob/develop/src/main/scala/ALSModel.scala

h2. Solutions?

Given the variability in how engines use the SparkContext during deploy, how do we ensure a single viable SparkContext is available for running batch queries?
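One possible direction (an assumption, not a committed design): create a single SparkContext up front and share it between model loading and the batch-prediction job, rather than constructing a second context. A hedged sketch, where the commented call sites ({{inputRDD}}, {{outputPath}}) are placeholders:

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

object SharedContextSketch {
  def runBatchPredict(): Unit = {
    // One context for the whole batchpredict run, passed both to the
    // PersistentModelLoader and to the prediction job.
    val sc = new SparkContext(new SparkConf().setAppName("pio-batchpredict"))
    try {
      // val model   = loadPersistentModel(engineId, params, Some(sc)) // hypothetical
      // val results = inputRDD.map(query => model.predict(query))
      // results.saveAsTextFile(outputPath)
    } finally {
      sc.stop() // release the context once predictions are written
    }
  }
}
{code}

Alternatively, {{spark.driver.allowMultipleContexts=true}} suppresses the exception, but Spark documents multiple contexts per JVM as unsupported, so sharing one context seems safer.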



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)