Posted to dev@predictionio.apache.org by "Mars Hall (JIRA)" <ji...@apache.org> on 2017/07/15 00:07:01 UTC

[jira] [Updated] (PIO-105) Batch Predictions

     [ https://issues.apache.org/jira/browse/PIO-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mars Hall updated PIO-105:
--------------------------
    External issue URL: https://github.com/apache/incubator-predictionio/pull/412
           Description: 
Implement a new {{pio batchpredict}} command to enable massive, fast batch predictions from a trained model. Read a multi-object JSON file as the input, with one query object per line. Similarly, write results to a multi-object JSON file, with one prediction result plus its original query per line.
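The line-oriented JSON contract described above can be sketched as follows. This is only an illustration of the file format, not the actual {{pio batchpredict}} implementation: the file names ({{queries.json}}, {{predictions.json}}) and the stand-in {{predict}} function are hypothetical placeholders for whatever the trained engine supplies.

```python
import json

# Hypothetical input file: one JSON query object per line (JSON Lines style).
queries = [{"user": "u1", "num": 3}, {"user": "u2", "num": 5}]
with open("queries.json", "w") as f:
    for q in queries:
        f.write(json.dumps(q) + "\n")

# Stand-in for a trained model's prediction; the real engine provides this.
def predict(query):
    return {"itemScores": [{"item": "i1", "score": 0.9}]}

# Hypothetical output file: one result per line, pairing each
# prediction with the original query that produced it.
with open("predictions.json", "w") as f:
    for line in open("queries.json"):
        query = json.loads(line)
        result = {"query": query, "prediction": predict(query)}
        f.write(json.dumps(result) + "\n")
```

Because each line is an independent JSON object, a batch job can split the input and compute predictions in parallel, then write results without needing to hold the whole file in memory.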

Currently getting bulk predictions from PredictionIO is possible with either:

* a {{pio eval}} script, which will always train a fresh, unvalidated model before getting predictions
* a custom script that hits the {{queries.json}} HTTP API, which is a serious bottleneck when requesting hundreds of thousands or millions of predictions

Neither of these existing bulk-prediction hacks is adequate, for the reasons mentioned.

It's time for this use case to be a first-class command :D

Pull request https://github.com/apache/incubator-predictionio/pull/412

  was:
Implement a new {{pio batchpredict}} command to enable massive, fast batch predictions from a trained model. Read a multi-object JSON file as the input, with one query object per line. Similarly, write results to a multi-object JSON file, with one prediction result plus its original query per line.

Currently getting bulk predictions from PredictionIO is possible with either:

* a {{pio eval}} script, which will always train a fresh, unvalidated model before getting predictions
* a custom script that hits the {{queries.json}} HTTP API, which is a serious bottleneck when requesting hundreds of thousands or millions of predictions

Neither of these existing bulk-prediction hacks is adequate, for the reasons mentioned.

It's time for this use case to be a first-class command :D


> Batch Predictions
> -----------------
>
>                 Key: PIO-105
>                 URL: https://issues.apache.org/jira/browse/PIO-105
>             Project: PredictionIO
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Mars Hall
>            Assignee: Mars Hall
>
> Implement a new {{pio batchpredict}} command to enable massive, fast batch predictions from a trained model. Read a multi-object JSON file as the input, with one query object per line. Similarly, write results to a multi-object JSON file, with one prediction result plus its original query per line.
> Currently getting bulk predictions from PredictionIO is possible with either:
> * a {{pio eval}} script, which will always train a fresh, unvalidated model before getting predictions
> * a custom script that hits the {{queries.json}} HTTP API, which is a serious bottleneck when requesting hundreds of thousands or millions of predictions
> Neither of these existing bulk-prediction hacks is adequate, for the reasons mentioned.
> It's time for this use case to be a first-class command :D
> Pull request https://github.com/apache/incubator-predictionio/pull/412



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)