Posted to issues@ambari.apache.org by "Ananda Verma (JIRA)" <ji...@apache.org> on 2017/03/31 16:40:41 UTC

[jira] [Updated] (AMBARI-18622) Integrate PredictionIO (Machine Learning Engine) With Ambari

     [ https://issues.apache.org/jira/browse/AMBARI-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ananda Verma updated AMBARI-18622:
----------------------------------
    Affects Version/s:     (was: 2.1.0)
                       2.5.0

> Integrate PredictionIO (Machine Learning Engine) With Ambari
> ------------------------------------------------------------
>
>                 Key: AMBARI-18622
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18622
>             Project: Ambari
>          Issue Type: New Feature
>          Components: ambari-server
>    Affects Versions: 2.5.0
>            Reporter: Ananda Verma
>
> It makes sense to integrate PredictionIO with Ambari, since it is now part of Apache and depends heavily on the current Ambari/HDP stack.
> The feature adds support for provisioning Apache PredictionIO clusters via Ambari.
> In general, PIO can be defined as a service in HDP with the following components:
> 1) Event Server: stores events (data).
> 2) Engine: responsible for making predictions. It contains one or more machine learning algorithms. An engine reads training data and builds predictive model(s). It is then deployed as a web service. A deployed engine responds to prediction queries from your application through a REST API in real time.
> PredictionIO also has external dependencies on the following:
> 1. HBase: Event Server uses Apache HBase as the data store. It stores imported events. If you are not using the PredictionIO Event Server, you do not need to install HBase.
> 2. Apache Spark: Spark is a large-scale data processing engine that powers the algorithm, training, and serving processing.
> 3. HDFS: The output of training has two parts: a model and its meta-data. The model is then stored in HDFS or a local file system.
> 4. Elasticsearch: It stores metadata such as model versions, engine versions, access key and app id mappings, evaluation results, etc.
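
As a rough illustration of the dependency list above, the sketch below works out which external services a PIO deployment would need for a given choice of components. The function name, the component and service identifiers, and the `model_store` parameter are all hypothetical; none of them come from Ambari or PredictionIO. Only the HBase-to-Event-Server link is stated explicitly in the description. Tying Spark and HDFS to the Engine, and treating Elasticsearch as needed whenever any component is installed, are assumptions made for this example.

```python
# Hypothetical identifiers; not part of any Ambari or PredictionIO API.
KNOWN_COMPONENTS = {"EVENT_SERVER", "ENGINE"}


def required_services(components, model_store="hdfs"):
    """Return the external services needed for the chosen PIO components.

    model_store is "hdfs" or "localfs", reflecting that the trained model
    can be stored in HDFS or on a local file system.
    """
    chosen = set(components)
    unknown = chosen - KNOWN_COMPONENTS
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")

    needed = set()
    if chosen:
        # Assumption: metadata storage is needed by any PIO component.
        needed.add("ELASTICSEARCH")
    if "EVENT_SERVER" in chosen:
        # From the description: HBase is only needed with the Event Server.
        needed.add("HBASE")
    if "ENGINE" in chosen:
        # Assumption: training and serving (Spark) belong to the Engine.
        needed.add("SPARK")
        if model_store == "hdfs":
            needed.add("HDFS")
    return needed


if __name__ == "__main__":
    print(sorted(required_services(["EVENT_SERVER", "ENGINE"])))
```

Under these assumptions, an Engine-only deployment that stores models on the local file system would need neither HBase nor HDFS.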



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)