Posted to issues@spark.apache.org by "Radoslaw Gasiorek (JIRA)" <ji...@apache.org> on 2016/06/13 12:43:20 UTC

[jira] [Comment Edited] (SPARK-8546) PMML export for Naive Bayes

    [ https://issues.apache.org/jira/browse/SPARK-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327167#comment-15327167 ] 

Radoslaw Gasiorek edited comment on SPARK-8546 at 6/13/16 12:43 PM:
--------------------------------------------------------------------

Hi there, [~josephkb], [~apachespark],
We would like to use models built with MLlib to classify data outside Spark, that is, without a SparkContext available. We would like to export the models built in Spark to PMML format, which would then be read by a standalone Java application without a SparkContext (but with the MLlib jar).
The Java application would load the model from the PMML file and use it to 'predict', or rather 'classify', the new data we get.
This feature would let us proceed without big architectural and operational changes. Without it, we might need to make the SparkContext available to the standalone application, which would be a bigger operational and architectural overhead.
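
As an illustration of the standalone scoring described above, here is a minimal sketch in plain Java with no Spark dependency. It assumes the model parameters (class labels, log class priors, and per-class log feature likelihoods, analogous to the labels, pi and theta fields of MLlib's NaiveBayesModel) have already been read from an exported file; the PMML parsing itself is not shown. The class name ExportedNaiveBayes and its constructor are hypothetical and only cover the multinomial case.

```java
// Hypothetical holder for naive Bayes parameters read from an exported model.
// Not part of Spark or JPMML; a sketch of multinomial scoring only.
final class ExportedNaiveBayes {
    private final double[] labels;           // class labels
    private final double[] logPriors;        // log P(class), one per label
    private final double[][] logLikelihoods; // log P(feature | class), [class][feature]

    ExportedNaiveBayes(double[] labels, double[] logPriors, double[][] logLikelihoods) {
        if (labels.length != logPriors.length || labels.length != logLikelihoods.length) {
            throw new IllegalArgumentException("parameter arrays must have one entry per class");
        }
        this.labels = labels.clone();
        this.logPriors = logPriors.clone();
        this.logLikelihoods = logLikelihoods.clone();
    }

    // Returns the label with the highest log posterior score:
    // score(c) = logPrior[c] + sum_j features[j] * logLikelihood[c][j]
    double predict(double[] features) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < labels.length; c++) {
            double score = logPriors[c];
            for (int j = 0; j < features.length; j++) {
                score += features[j] * logLikelihoods[c][j];
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return labels[best];
    }
}
```

With parameters loaded this way, classifying a new record is a single predict call on its feature vector, which is the part that would run without a SparkContext.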

We might need to use plain Java serialization for the proof of concept anyway, but surely not for a productionized product.

Can we prioritize this feature, as well as https://issues.apache.org/jira/browse/SPARK-8542 and https://issues.apache.org/jira/browse/SPARK-8543 ?
What would be the LOE and ETA for these?
Thanks in advance for your responses and feedback.


was (Author: rgasiorek):
Hi there, [~josephkb],
We would like to use models built with MLlib to classify data outside Spark, that is, without a SparkContext available. We would like to export the models built in Spark to PMML format, which would then be read by a standalone Java application without a SparkContext (but with the MLlib jar).
The Java application would load the model from the PMML file and use it to 'predict', or rather 'classify', the new data we get.
This feature would let us proceed without big architectural and operational changes. Without it, we might need to make the SparkContext available to the standalone application, which would be a bigger operational and architectural overhead.

We might need to use plain Java serialization for the proof of concept anyway, but surely not for a productionized product.

Can we prioritize this feature, as well as https://issues.apache.org/jira/browse/SPARK-8542 and https://issues.apache.org/jira/browse/SPARK-8543 ?
What would be the LOE and ETA for these?
Thanks in advance for your responses and feedback.

> PMML export for Naive Bayes
> ---------------------------
>
>                 Key: SPARK-8546
>                 URL: https://issues.apache.org/jira/browse/SPARK-8546
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>            Assignee: Xusen Yin
>            Priority: Minor
>
> The naive Bayes section of the PMML standard can be found at http://www.dmg.org/v4-1/NaiveBayes.html. We should first figure out how to generate PMML for both binomial and multinomial naive Bayes models using JPMML (maybe [~vfed] can help).


