You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@predictionio.apache.org by do...@apache.org on 2016/08/25 20:14:32 UTC

[37/51] [partial] incubator-predictionio-site git commit: Update template gallery

http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/afc2fa9a/demo/tapster/index.html
----------------------------------------------------------------------
diff --git a/demo/tapster/index.html b/demo/tapster/index.html
index e31ee75..fa62ec6 100644
--- a/demo/tapster/index.html
+++ b/demo/tapster/index.html
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html><head><title>Comics Recommendation Demo</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><meta class="swiftype" name="title" data-type="string" content="Comics Recommendation Demo"/><link rel="canonical" href="https://docs.prediction.io/demo/tapster/"/><link href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet"/><link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" rel="stylesheet" type="text/css"/><script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script src="//cdn.mathjax.org/mathjax/latest/
 MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: true });}catch(e){}</script></head><body><div id="global"><header><div class="container" id="header-wrapper"><div class="row"><div class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a href="#"></a><a href="http://predictionio.incubator.apache.org/"><img alt="PredictionIO" id="logo" src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" href="//templates.prediction.io/">TEMPLATES</a> <a class="pill right" href="//github.com/apache/incubator-predictionio/">OPEN SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md hidden-lg" src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 col-xs-11"><div class="hidde
 n-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Comics Recommendation Demo</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img src="/images/icons/search-glass-704bd4ff.png"/><input type="text" id="st-search-input" class="st-search-input" placeholder="Search Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div class="row"><div id="left-menu-wrapper" class="co
 l-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO (incubating) Documentation</span></a><ul><li class="level-2"><a class="final" href="/"><span>Welcome to Apache PredictionIO (incubating)</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Started</span></a><ul><li class="level-2"><a class="final" href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a class="final" href="/install/"><span>Installing Apache PredictionIO (incubating)</span></a></li><li class="level-2"><a class="final" href="/start/download/"><span>Downloading an Engine Template</span></a></li><li class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li class="level-2"><a class="final" href="/start/customize/"><span>Customizing the Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Integrating with Your App</span></a><ul>
 <li class="level-2"><a class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web Service</span></a></li><li class="level-2"><a class="final" href="/cli/#engine-commands"><span>Engine Command-line Interface</span></a></li><li class="level-
 2"><a class="final" href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying Multiple Engine Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a class="final" href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li class="level-2"><a class="final" href="/customize/troubleshooting/"><span>Troubleshooting Engine Development</span></a></li><li class="level-2"><a class="final" href="/api/current/#package"><span>Engine Scala APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Collecting and Analyzing Data</span></a><ul><li class="level-2"><a class="final" hr
 ef="/datacollection/"><span>Event Server Overview</span></a></li><li class="level-2"><a class="final" href="/cli/#event-server-commands"><span>Event Server Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventapi/"><span>Collecting Data with REST/SDKs</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li class="level-2"><a class="final" href="/datacollection/webhooks/"><span>Unifying Multichannel Data with Webhooks</span></a></li><li class="level-2"><a class="final" href="/datacollection/channel/"><span>Channel</span></a></li><li class="level-2"><a class="final" href="/datacollection/batchimport/"><span>Importing Data in Batch</span></a></li><li class="level-2"><a class="final" href="/datacollection/analytics/"><span>Using Analytics Tools</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Choosing an Algorithm(s)</span></a><ul><li 
 class="level-2"><a class="final" href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to Another Algorithm</span></a></li><li class="level-2"><a class="final" href="/algorithm/multiple/"><span>Combining Multiple Algorithms</span></a></li><li class="level-2"><a class="final" href="/algorithm/custom/"><span>Adding Your Own Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a class="final" href="/evaluation/"><span>Overview</span></a></li><li class="level-2"><a class="final" href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li class="level-2"><a class="final" href="/evaluation/evaluationdashboard/"><span>Evaluation Dashboard</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricchoose/"><span>Choosing Evaluation Metrics</span></a></li><li class=
 "level-2"><a class="final" href="/evaluation/metricbuild/"><span>Building Evaluation Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>System Architecture</span></a><ul><li class="level-2"><a class="final" href="/system/"><span>Architecture Overview</span></a></li><li class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using Another Data Store</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li class="level-2"><a class="final" href="http://templates.prediction.io"><span>Browse</span></a></li><li class="level-2"><a class="final" href="/community/submit-template/"><span>Submit your Engine as a Template</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final active" href="/demo/tapster/"><span>Comics Recommendation Demo</span></a></li><li class="level-2"><a class="fi
 nal" href="/demo/community/"><span>Community Contributed Demo</span></a></li><li class="level-2"><a class="final" href="/demo/textclassification/"><span>Text Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a class="final" href="/community/contribute-code/"><span>Contribute Code</span></a></li><li class="level-2"><a class="final" href="/community/contribute-documentation/"><span>Contribute Documentation</span></a></li><li class="level-2"><a class="final" href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li class="level-2"><a class="final" href="/community/contribute-webhook/"><span>Contribute a Webhook</span></a></li><li class="level-2"><a class="final" href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Help</span></a><ul><li class="level-2"><a c
 lass="final" href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a class="final" href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" href="/resources/intellij/"><span>Developing Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li class="level-2"><a class="final" href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div class="col-md-9 col-sm-12"><div class="content-header hidden-md hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Comics Recommendation Demo</span></li></ul></div><div id="page-title"><h1>Comics Recommendation Demo</h1></div></div><div id="table-of-content-wrapper"><h5>On this page</h5><aside i
 d="table-of-contents"><ul> <li> <a href="#introduction">Introduction</a> </li> <li> <a href="#tapster-demo-application">Tapster Demo Application</a> </li> <li> <a href="#apache-predictionio-incubating-setup">Apache PredictionIO (incubating) Setup</a> </li> <li> <a href="#import-data">Import Data</a> </li> <li> <a href="#connect-demo-app-with-apache-predictionio-incubating">Connect Demo app with Apache PredictionIO (incubating)</a> </li> <li> <a href="#links">Links</a> </li> <li> <a href="#conclusion">Conclusion</a> </li> </ul> </aside><hr/><a id="edit-page-link" href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/demo/tapster.html.md"><img src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Comics Recommendation Demo</span></li></ul></div
 ><div id="page-title"><h1>Comics Recommendation Demo</h1></div></div><div class="content"><h2 id='introduction' class='header-anchors'>Introduction</h2><p>In this demo, we will show you how to build a Tinder-style web application (named &quot;Tapster&quot;) recommending comics to users based on their likes/dislikes of episodes interactively.</p><p>The demo will use <a href="https://docs.prediction.io/templates/similarproduct/quickstart/">Similar Product Template</a>. Similar Product Template is a great choice if you want to make recommendations based on immediate user activities or for new users with limited history. It uses MLLib Alternating Least Squares (ALS) recommendation algorithm, a <a href="http://en.wikipedia.org/wiki/Recommender_system#Collaborative_filtering">Collaborative filtering</a> (CF) algorithm commonly used for recommender systems. These techniques aim to fill in the missing entries of a user-item association matrix. Users and products are described by a small set
  of latent factors that can be used to predict missing entries. A layman&#39;s interpretation of Collaborative Filtering is &quot;People who like this comic, also like these comics.&quot;</p><p>All the code and data is on GitHub at: <a href="https://github.com/PredictionIO/Demo-Tapster">github.com/PredictionIO/Demo-Tapster</a>.</p><h3 id='data' class='header-anchors'>Data</h3><p>The source of the data is from <a href="http://tapastic.com/">Tapastic</a>. You can find the data files <a href="https://github.com/PredictionIO/Demo-Tapster/tree/master/data">here</a>.</p><p>The data structure looks like this:</p><p><a href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/episode_list.csv">Episode List</a> <code>data/episode_list.csv</code></p><p><strong>Fields:</strong> episodeId | episodeTitle | episodeCategories | episodeUrl | episodeImageUrls</p><p>1,000 rows. Each row represents one episode.</p><p><a href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/user
 _list.csv">User Like Event List</a> <code>data/user_list.csv</code></p><p><strong>Fields:</strong> userId | episodeId | likedTimestamp</p><p>192,587 rows. Each row represents one user like for the given episode.</p><p>The tutorial has four major steps: - Demo application setup - PredictionIO installation and setup - Import data into database and PredictionIO - Integrate demo application with PredictionIO</p><h2 id='tapster-demo-application' class='header-anchors'>Tapster Demo Application</h2><p>The demo application is built using Rails.</p><p>You can clone the existing application with:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+<!DOCTYPE html><html><head><title>Comics Recommendation Demo</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><meta class="swiftype" name="title" data-type="string" content="Comics Recommendation Demo"/><link rel="canonical" href="https://docs.prediction.io/demo/tapster/"/><link href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet"/><link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" rel="stylesheet" type="text/css"/><script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script src="//cdn.mathjax.org/mathjax/latest/
 MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: true });}catch(e){}</script></head><body><div id="global"><header><div class="container" id="header-wrapper"><div class="row"><div class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a href="#"></a><a href="http://predictionio.incubator.apache.org/"><img alt="PredictionIO" id="logo" src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" href="//templates.prediction.io/">TEMPLATES</a> <a class="pill right" href="//github.com/apache/incubator-predictionio/">OPEN SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md hidden-lg" src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11 col-xs-11"><div class="hidde
 n-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Comics Recommendation Demo</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img src="/images/icons/search-glass-704bd4ff.png"/><input type="text" id="st-search-input" class="st-search-input" placeholder="Search Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div class="row"><div id="left-menu-wrapper" class="co
 l-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO (incubating) Documentation</span></a><ul><li class="level-2"><a class="final" href="/"><span>Welcome to Apache PredictionIO (incubating)</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Started</span></a><ul><li class="level-2"><a class="final" href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a class="final" href="/install/"><span>Installing Apache PredictionIO (incubating)</span></a></li><li class="level-2"><a class="final" href="/start/download/"><span>Downloading an Engine Template</span></a></li><li class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li class="level-2"><a class="final" href="/start/customize/"><span>Customizing the Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Integrating with Your App</span></a><ul>
 <li class="level-2"><a class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web Service</span></a></li><li class="level-2"><a class="final" href="/cli/#engine-commands"><span>Engine Command-line Interface</span></a></li><li class="level-
 2"><a class="final" href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying Multiple Engine Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a class="final" href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li class="level-2"><a class="final" href="/customize/troubleshooting/"><span>Troubleshooting Engine Development</span></a></li><li class="level-2"><a class="final" href="/api/current/#package"><span>Engine Scala APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Collecting and Analyzing Data</span></a><ul><li class="level-2"><a class="final" hr
 ef="/datacollection/"><span>Event Server Overview</span></a></li><li class="level-2"><a class="final" href="/cli/#event-server-commands"><span>Event Server Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventapi/"><span>Collecting Data with REST/SDKs</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li class="level-2"><a class="final" href="/datacollection/webhooks/"><span>Unifying Multichannel Data with Webhooks</span></a></li><li class="level-2"><a class="final" href="/datacollection/channel/"><span>Channel</span></a></li><li class="level-2"><a class="final" href="/datacollection/batchimport/"><span>Importing Data in Batch</span></a></li><li class="level-2"><a class="final" href="/datacollection/analytics/"><span>Using Analytics Tools</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Choosing an Algorithm(s)</span></a><ul><li 
 class="level-2"><a class="final" href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to Another Algorithm</span></a></li><li class="level-2"><a class="final" href="/algorithm/multiple/"><span>Combining Multiple Algorithms</span></a></li><li class="level-2"><a class="final" href="/algorithm/custom/"><span>Adding Your Own Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a class="final" href="/evaluation/"><span>Overview</span></a></li><li class="level-2"><a class="final" href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li class="level-2"><a class="final" href="/evaluation/evaluationdashboard/"><span>Evaluation Dashboard</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricchoose/"><span>Choosing Evaluation Metrics</span></a></li><li class=
 "level-2"><a class="final" href="/evaluation/metricbuild/"><span>Building Evaluation Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>System Architecture</span></a><ul><li class="level-2"><a class="final" href="/system/"><span>Architecture Overview</span></a></li><li class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using Another Data Store</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li class="level-2"><a class="final" href="/gallery/template-gallery/"><span>Browse</span></a></li><li class="level-2"><a class="final" href="/community/submit-template/"><span>Submit your Engine as a Template</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final active" href="/demo/tapster/"><span>Comics Recommendation Demo</span></a></li><li class="level-2"><a class="final"
  href="/demo/community/"><span>Community Contributed Demo</span></a></li><li class="level-2"><a class="final" href="/demo/textclassification/"><span>Text Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a class="final" href="/community/contribute-code/"><span>Contribute Code</span></a></li><li class="level-2"><a class="final" href="/community/contribute-documentation/"><span>Contribute Documentation</span></a></li><li class="level-2"><a class="final" href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li class="level-2"><a class="final" href="/community/contribute-webhook/"><span>Contribute a Webhook</span></a></li><li class="level-2"><a class="final" href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Help</span></a><ul><li class="level-2"><a class
 ="final" href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a class="final" href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" href="/resources/intellij/"><span>Developing Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li class="level-2"><a class="final" href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div class="col-md-9 col-sm-12"><div class="content-header hidden-md hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Comics Recommendation Demo</span></li></ul></div><div id="page-title"><h1>Comics Recommendation Demo</h1></div></div><div id="table-of-content-wrapper"><h5>On this page</h5><aside id="t
 able-of-contents"><ul> <li> <a href="#introduction">Introduction</a> </li> <li> <a href="#tapster-demo-application">Tapster Demo Application</a> </li> <li> <a href="#apache-predictionio-incubating-setup">Apache PredictionIO (incubating) Setup</a> </li> <li> <a href="#import-data">Import Data</a> </li> <li> <a href="#connect-demo-app-with-apache-predictionio-incubating">Connect Demo app with Apache PredictionIO (incubating)</a> </li> <li> <a href="#links">Links</a> </li> <li> <a href="#conclusion">Conclusion</a> </li> </ul> </aside><hr/><a id="edit-page-link" href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/demo/tapster.html.md"><img src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Comics Recommendation Demo</span></li></ul></div><di
 v id="page-title"><h1>Comics Recommendation Demo</h1></div></div><div class="content"><h2 id='introduction' class='header-anchors'>Introduction</h2><p>In this demo, we will show you how to build a Tinder-style web application (named &quot;Tapster&quot;) recommending comics to users based on their likes/dislikes of episodes interactively.</p><p>The demo will use <a href="https://docs.prediction.io/templates/similarproduct/quickstart/">Similar Product Template</a>. Similar Product Template is a great choice if you want to make recommendations based on immediate user activities or for new users with limited history. It uses MLLib Alternating Least Squares (ALS) recommendation algorithm, a <a href="http://en.wikipedia.org/wiki/Recommender_system#Collaborative_filtering">Collaborative filtering</a> (CF) algorithm commonly used for recommender systems. These techniques aim to fill in the missing entries of a user-item association matrix. Users and products are described by a small set of 
 latent factors that can be used to predict missing entries. A layman&#39;s interpretation of Collaborative Filtering is &quot;People who like this comic, also like these comics.&quot;</p><p>All the code and data is on GitHub at: <a href="https://github.com/PredictionIO/Demo-Tapster">github.com/PredictionIO/Demo-Tapster</a>.</p><h3 id='data' class='header-anchors'>Data</h3><p>The source of the data is from <a href="http://tapastic.com/">Tapastic</a>. You can find the data files <a href="https://github.com/PredictionIO/Demo-Tapster/tree/master/data">here</a>.</p><p>The data structure looks like this:</p><p><a href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/episode_list.csv">Episode List</a> <code>data/episode_list.csv</code></p><p><strong>Fields:</strong> episodeId | episodeTitle | episodeCategories | episodeUrl | episodeImageUrls</p><p>1,000 rows. Each row represents one episode.</p><p><a href="https://github.com/PredictionIO/Demo-Tapster/blob/master/data/user_lis
 t.csv">User Like Event List</a> <code>data/user_list.csv</code></p><p><strong>Fields:</strong> userId | episodeId | likedTimestamp</p><p>192,587 rows. Each row represents one user like for the given episode.</p><p>The tutorial has four major steps: - Demo application setup - PredictionIO installation and setup - Import data into database and PredictionIO - Integrate demo application with PredictionIO</p><h2 id='tapster-demo-application' class='header-anchors'>Tapster Demo Application</h2><p>The demo application is built using Rails.</p><p>You can clone the existing application with:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
 2
 3</pre></td><td class="code"><pre><span class="gp">$ </span>git clone  https://github.com/PredictionIO/Demo-Tapster.git
 <span class="gp">$ </span><span class="nb">cd </span>Demo-Tapster

http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/afc2fa9a/demo/tapster/index.html.gz
----------------------------------------------------------------------
diff --git a/demo/tapster/index.html.gz b/demo/tapster/index.html.gz
index 656c69b..6b581d5 100644
Binary files a/demo/tapster/index.html.gz and b/demo/tapster/index.html.gz differ

http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/afc2fa9a/demo/textclassification/index.html
----------------------------------------------------------------------
diff --git a/demo/textclassification/index.html b/demo/textclassification/index.html
index 6aed967..70cb356 100644
--- a/demo/textclassification/index.html
+++ b/demo/textclassification/index.html
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html><head><title>Text Classification Engine Tutorial</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><meta class="swiftype" name="title" data-type="string" content="Text Classification Engine Tutorial"/><link rel="canonical" href="https://docs.prediction.io/demo/textclassification/"/><link href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet"/><link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" rel="stylesheet" type="text/css"/><script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script src="//cd
 n.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: true });}catch(e){}</script></head><body><div id="global"><header><div class="container" id="header-wrapper"><div class="row"><div class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a href="#"></a><a href="http://predictionio.incubator.apache.org/"><img alt="PredictionIO" id="logo" src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" href="//templates.prediction.io/">TEMPLATES</a> <a class="pill right" href="//github.com/apache/incubator-predictionio/">OPEN SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md hidden-lg" src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11
  col-xs-11"><div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Text Classification Engine Tutorial</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img src="/images/icons/search-glass-704bd4ff.png"/><input type="text" id="st-search-input" class="st-search-input" placeholder="Search Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div class="row"
 ><div id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO (incubating) Documentation</span></a><ul><li class="level-2"><a class="final" href="/"><span>Welcome to Apache PredictionIO (incubating)</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Started</span></a><ul><li class="level-2"><a class="final" href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a class="final" href="/install/"><span>Installing Apache PredictionIO (incubating)</span></a></li><li class="level-2"><a class="final" href="/start/download/"><span>Downloading an Engine Template</span></a></li><li class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li class="level-2"><a class="final" href="/start/customize/"><span>Customizing the Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>In
 tegrating with Your App</span></a><ul><li class="level-2"><a class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web Service</span></a></li><li class="level-2"><a class="final" href="/cli/#engine-commands"><span>Engine Command-line Inte
 rface</span></a></li><li class="level-2"><a class="final" href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying Multiple Engine Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a class="final" href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li class="level-2"><a class="final" href="/customize/troubleshooting/"><span>Troubleshooting Engine Development</span></a></li><li class="level-2"><a class="final" href="/api/current/#package"><span>Engine Scala APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Collecting and Analyzing Data</span></a><ul><
 li class="level-2"><a class="final" href="/datacollection/"><span>Event Server Overview</span></a></li><li class="level-2"><a class="final" href="/cli/#event-server-commands"><span>Event Server Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventapi/"><span>Collecting Data with REST/SDKs</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li class="level-2"><a class="final" href="/datacollection/webhooks/"><span>Unifying Multichannel Data with Webhooks</span></a></li><li class="level-2"><a class="final" href="/datacollection/channel/"><span>Channel</span></a></li><li class="level-2"><a class="final" href="/datacollection/batchimport/"><span>Importing Data in Batch</span></a></li><li class="level-2"><a class="final" href="/datacollection/analytics/"><span>Using Analytics Tools</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Choos
 ing an Algorithm(s)</span></a><ul><li class="level-2"><a class="final" href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to Another Algorithm</span></a></li><li class="level-2"><a class="final" href="/algorithm/multiple/"><span>Combining Multiple Algorithms</span></a></li><li class="level-2"><a class="final" href="/algorithm/custom/"><span>Adding Your Own Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a class="final" href="/evaluation/"><span>Overview</span></a></li><li class="level-2"><a class="final" href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li class="level-2"><a class="final" href="/evaluation/evaluationdashboard/"><span>Evaluation Dashboard</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricchoose/"><span>Choosing Evalua
 tion Metrics</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricbuild/"><span>Building Evaluation Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>System Architecture</span></a><ul><li class="level-2"><a class="final" href="/system/"><span>Architecture Overview</span></a></li><li class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using Another Data Store</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li class="level-2"><a class="final" href="http://templates.prediction.io"><span>Browse</span></a></li><li class="level-2"><a class="final" href="/community/submit-template/"><span>Submit your Engine as a Template</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final" href="/demo/tapster/"><span>Comics Recommendation Demo</span></a></li><
 li class="level-2"><a class="final" href="/demo/community/"><span>Community Contributed Demo</span></a></li><li class="level-2"><a class="final active" href="/demo/textclassification/"><span>Text Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a class="final" href="/community/contribute-code/"><span>Contribute Code</span></a></li><li class="level-2"><a class="final" href="/community/contribute-documentation/"><span>Contribute Documentation</span></a></li><li class="level-2"><a class="final" href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li class="level-2"><a class="final" href="/community/contribute-webhook/"><span>Contribute a Webhook</span></a></li><li class="level-2"><a class="final" href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Help<
 /span></a><ul><li class="level-2"><a class="final" href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a class="final" href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" href="/resources/intellij/"><span>Developing Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li class="level-2"><a class="final" href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div class="col-md-9 col-sm-12"><div class="content-header hidden-md hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Text Classification Engine Tutorial</span></li></ul></div><div id="page-title"><h1>Text Classification Engine Tutorial</h1></div></div><div id=
 "table-of-content-wrapper"><h5>On this page</h5><aside id="table-of-contents"><ul> <li> <a href="#introduction">Introduction</a> </li> <li> <a href="#prerequisites">Prerequisites</a> </li> <li> <a href="#engine-overview">Engine Overview</a> </li> <li> <a href="#quick-start">Quick Start</a> </li> </ul> </li> <li> <a href="#detailed-explanation-of-dase">Detailed Explanation of DASE</a> <ul> <li> <a href="#importing-data">Importing Data</a> </li> <li> <a href="#data-source-reading-event-data">Data Source: Reading Event Data</a> </li> <li> <a href="#preparator-data-processing-with-dase">Preparator : Data Processing With DASE</a> </li> <li> <a href="#algorithm-component">Algorithm Component</a> </li> <li> <a href="#serving-delivering-the-final-prediction">Serving: Delivering the Final Prediction</a> </li> <li> <a href="#evaluation-model-assessment-and-selection">Evaluation: Model Assessment and Selection</a> </li> <li> <a href="#engine-deployment">Engine Deployment</a> </li> </ul> </asid
 e><hr/><a id="edit-page-link" href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/demo/textclassification.html.md.erb"><img src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Text Classification Engine Tutorial</span></li></ul></div><div id="page-title"><h1>Text Classification Engine Tutorial</h1></div></div><div class="content"><p>(Updated for Text Classification Template version 3.1)</p><h2 id='introduction' class='header-anchors'>Introduction</h2><p>In the real world, there are many applications that collect text as data. For example, spam detectors take email and header content to automatically determine what is or is not spam; applications can gague the general sentiment in a geographical area by analyzing Twitter data; and news art
 icles can be automatically categorized based solely on the text content.There are a wide array of machine learning models you can use to create, or train, a predictive model to assign an incoming article, or query, to an existing category. Before you can use these techniques you must first transform the text data (in this case the set of news articles) into numeric vectors, or feature vectors, that can be used to train your model.</p><p>The purpose of this tutorial is to illustrate how you can go about doing this using PredictionIO&#39;s platform. The advantages of using this platform include: a dynamic engine that responds to queries in real-time; <a href="http://en.wikipedia.org/wiki/Separation_of_concerns">separation of concerns</a>, which offers code re-use and maintainability, and distributed computing capabilities for scalability and efficiency. Moreover, it is easy to incorporate non-trivial data modeling tasks into the DASE architecture allowing Data Scientists to focus on t
 asks related to modeling. This tutorial will exemplify some of these ideas by guiding you through PredictionIO&#39;s [text classification template(<a href="http://templates.prediction.io/PredictionIO/template-scala-parallel-textclassification/">http://templates.prediction.io/PredictionIO/template-scala-parallel-textclassification/</a>).</p><h2 id='prerequisites' class='header-anchors'>Prerequisites</h2><p>Before getting started, please make sure that you have the latest version of Apache PredictionIO (incubating) <a href="http://predictionio.incubator.apache.org/install/">installed</a>. We emphasize here that this is an engine template written in <strong>Scala</strong> and can be more generally thought of as an SBT project containing all the necessary components.</p><p>You should also download the engine template named Text Classification Engine that accompanies this tutorial by cloning the template repository:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody>
 <tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre>pio template get PredictionIO/template-scala-parallel-textclassification &lt; Your new engine directory &gt;
+<!DOCTYPE html><html><head><title>Text Classification Engine Tutorial</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><meta class="swiftype" name="title" data-type="string" content="Text Classification Engine Tutorial"/><link rel="canonical" href="https://docs.prediction.io/demo/textclassification/"/><link href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet"/><link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" rel="stylesheet" type="text/css"/><script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.2/html5shiv.min.js"></script><script src="//cd
 n.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: true });}catch(e){}</script></head><body><div id="global"><header><div class="container" id="header-wrapper"><div class="row"><div class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a href="#"></a><a href="http://predictionio.incubator.apache.org/"><img alt="PredictionIO" id="logo" src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" href="//templates.prediction.io/">TEMPLATES</a> <a class="pill right" href="//github.com/apache/incubator-predictionio/">OPEN SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md hidden-lg" src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-row"><div class="row"><div class="col-md-9 col-sm-11
  col-xs-11"><div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>Text Classification Engine Tutorial</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img src="/images/icons/search-glass-704bd4ff.png"/><input type="text" id="st-search-input" class="st-search-input" placeholder="Search Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div></div></div><div id="page" class="container-fluid"><div class="row"
 ><div id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO (incubating) Documentation</span></a><ul><li class="level-2"><a class="final" href="/"><span>Welcome to Apache PredictionIO (incubating)</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Started</span></a><ul><li class="level-2"><a class="final" href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a class="final" href="/install/"><span>Installing Apache PredictionIO (incubating)</span></a></li><li class="level-2"><a class="final" href="/start/download/"><span>Downloading an Engine Template</span></a></li><li class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li class="level-2"><a class="final" href="/start/customize/"><span>Customizing the Engine</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>In
 tegrating with Your App</span></a><ul><li class="level-2"><a class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web Service</span></a></li><li class="level-2"><a class="final" href="/cli/#engine-commands"><span>Engine Command-line Inte
 rface</span></a></li><li class="level-2"><a class="final" href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying Multiple Engine Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a class="final" href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li class="level-2"><a class="final" href="/customize/troubleshooting/"><span>Troubleshooting Engine Development</span></a></li><li class="level-2"><a class="final" href="/api/current/#package"><span>Engine Scala APIs</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Collecting and Analyzing Data</span></a><ul><
 li class="level-2"><a class="final" href="/datacollection/"><span>Event Server Overview</span></a></li><li class="level-2"><a class="final" href="/cli/#event-server-commands"><span>Event Server Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventapi/"><span>Collecting Data with REST/SDKs</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li class="level-2"><a class="final" href="/datacollection/webhooks/"><span>Unifying Multichannel Data with Webhooks</span></a></li><li class="level-2"><a class="final" href="/datacollection/channel/"><span>Channel</span></a></li><li class="level-2"><a class="final" href="/datacollection/batchimport/"><span>Importing Data in Batch</span></a></li><li class="level-2"><a class="final" href="/datacollection/analytics/"><span>Using Analytics Tools</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Choos
 ing an Algorithm(s)</span></a><ul><li class="level-2"><a class="final" href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to Another Algorithm</span></a></li><li class="level-2"><a class="final" href="/algorithm/multiple/"><span>Combining Multiple Algorithms</span></a></li><li class="level-2"><a class="final" href="/algorithm/custom/"><span>Adding Your Own Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a class="final" href="/evaluation/"><span>Overview</span></a></li><li class="level-2"><a class="final" href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li class="level-2"><a class="final" href="/evaluation/evaluationdashboard/"><span>Evaluation Dashboard</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricchoose/"><span>Choosing Evalua
 tion Metrics</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricbuild/"><span>Building Evaluation Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>System Architecture</span></a><ul><li class="level-2"><a class="final" href="/system/"><span>Architecture Overview</span></a></li><li class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using Another Data Store</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li class="level-2"><a class="final" href="/gallery/template-gallery/"><span>Browse</span></a></li><li class="level-2"><a class="final" href="/community/submit-template/"><span>Submit your Engine as a Template</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final" href="/demo/tapster/"><span>Comics Recommendation Demo</span></a></li><li c
 lass="level-2"><a class="final" href="/demo/community/"><span>Community Contributed Demo</span></a></li><li class="level-2"><a class="final active" href="/demo/textclassification/"><span>Text Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a class="final" href="/community/contribute-code/"><span>Contribute Code</span></a></li><li class="level-2"><a class="final" href="/community/contribute-documentation/"><span>Contribute Documentation</span></a></li><li class="level-2"><a class="final" href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li class="level-2"><a class="final" href="/community/contribute-webhook/"><span>Contribute a Webhook</span></a></li><li class="level-2"><a class="final" href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Help</spa
 n></a><ul><li class="level-2"><a class="final" href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a class="final" href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" href="/resources/intellij/"><span>Developing Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li class="level-2"><a class="final" href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div class="col-md-9 col-sm-12"><div class="content-header hidden-md hidden-lg"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Text Classification Engine Tutorial</span></li></ul></div><div id="page-title"><h1>Text Classification Engine Tutorial</h1></div></div><div id="tab
 le-of-content-wrapper"><h5>On this page</h5><aside id="table-of-contents"><ul> <li> <a href="#introduction">Introduction</a> </li> <li> <a href="#prerequisites">Prerequisites</a> </li> <li> <a href="#engine-overview">Engine Overview</a> </li> <li> <a href="#quick-start">Quick Start</a> </li> </ul> </li> <li> <a href="#detailed-explanation-of-dase">Detailed Explanation of DASE</a> <ul> <li> <a href="#importing-data">Importing Data</a> </li> <li> <a href="#data-source-reading-event-data">Data Source: Reading Event Data</a> </li> <li> <a href="#preparator-data-processing-with-dase">Preparator : Data Processing With DASE</a> </li> <li> <a href="#algorithm-component">Algorithm Component</a> </li> <li> <a href="#serving-delivering-the-final-prediction">Serving: Delivering the Final Prediction</a> </li> <li> <a href="#evaluation-model-assessment-and-selection">Evaluation: Model Assessment and Selection</a> </li> <li> <a href="#engine-deployment">Engine Deployment</a> </li> </ul> </aside><h
 r/><a id="edit-page-link" href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/demo/textclassification.html.md.erb"><img src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div class="content-header hidden-sm hidden-xs"><div id="breadcrumbs" class="hidden-sm hidden xs"><ul><li><a href="#">Demo Tutorials</a><span class="spacer">&gt;</span></li><li><span class="last">Text Classification Engine Tutorial</span></li></ul></div><div id="page-title"><h1>Text Classification Engine Tutorial</h1></div></div><div class="content"><p>(Updated for Text Classification Template version 3.1)</p><h2 id='introduction' class='header-anchors'>Introduction</h2><p>In the real world, there are many applications that collect text as data. For example, spam detectors take email and header content to automatically determine what is or is not spam; applications can gague the general sentiment in a geographical area by analyzing Twitter data; and news article
 s can be automatically categorized based solely on the text content.There are a wide array of machine learning models you can use to create, or train, a predictive model to assign an incoming article, or query, to an existing category. Before you can use these techniques you must first transform the text data (in this case the set of news articles) into numeric vectors, or feature vectors, that can be used to train your model.</p><p>The purpose of this tutorial is to illustrate how you can go about doing this using PredictionIO&#39;s platform. The advantages of using this platform include: a dynamic engine that responds to queries in real-time; <a href="http://en.wikipedia.org/wiki/Separation_of_concerns">separation of concerns</a>, which offers code re-use and maintainability, and distributed computing capabilities for scalability and efficiency. Moreover, it is easy to incorporate non-trivial data modeling tasks into the DASE architecture allowing Data Scientists to focus on tasks
  related to modeling. This tutorial will exemplify some of these ideas by guiding you through PredictionIO&#39;s [text classification template(<a href="http://templates.prediction.io/PredictionIO/template-scala-parallel-textclassification/">http://templates.prediction.io/PredictionIO/template-scala-parallel-textclassification/</a>).</p><h2 id='prerequisites' class='header-anchors'>Prerequisites</h2><p>Before getting started, please make sure that you have the latest version of Apache PredictionIO (incubating) <a href="http://predictionio.incubator.apache.org/install/">installed</a>. We emphasize here that this is an engine template written in <strong>Scala</strong> and can be more generally thought of as an SBT project containing all the necessary components.</p><p>You should also download the engine template named Text Classification Engine that accompanies this tutorial by cloning the template repository:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr>
 <td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre>pio template get PredictionIO/template-scala-parallel-textclassification &lt; Your new engine directory &gt;
 </pre></td></tr></tbody></table> </div> <h2 id='engine-overview' class='header-anchors'>Engine Overview</h2><p>The engine follows the DASE architecture which we briefly review here. As a user, you are tasked with collecting data for your web or application, and importing it into PredictionIO&#39;s Event Server. Once the data is in the server, it can be read and processed by the engine via the Data Source and Preparation components, respectively. The Algorithm component uses the processed, or prepared, data to train a set of predictive models. Once you have trained these models, you are ready to deploy your engine and respond to real-time queries via the Serving component which combines the results from different fitted models. The Evaluation component is used to compute an appropriate metric to test the performance of a fitted model, as well as aid in the tuning of model hyper parameters.</p><p>This engine template is meant to handle text classification which means you will be worki
 ng with text data. This means that a query, or newly observed documents, will be of the form:</p><p><code>{text : String}</code>.</p><p>In the running example, a query would be an incoming news article. Once the engine is deployed it can process the query, and then return a Predicted Result of the form</p><p><code>{category : String, confidence : Double}</code>.</p><p>Here category is the model&#39;s class assignment for this new text document (i.e. the best guess for this article&#39;s categorization), and confidence, a value between 0 and 1 representing your confidence in the category prediction (0 meaning you have no confidence in the prediction). The Actual Result is of the form</p><p><code>{category : String}</code>.</p><p>This is used in the evaluation stage when estimating the performance of your predictive model (how well does the model predict categories). Please refer to the <a href="https://docs.prediction.io/customize/">following tutorial</a> for a more detailed explanat
 ion of how your engine will interact with your web application, as well as an in depth-overview of DASE.</p><h2 id='quick-start' class='header-anchors'>Quick Start</h2><p>This is a quick start guide in case you want to start using the engine right away. Sample email data for spam classification will be used. For more detailed information, read the subsequent sections.</p><h3 id='1.-create-a-new-application.' class='header-anchors'>1. Create a new application.</h3><p>After the application is created, you will be given an access key and application ID for the application.</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre><span class="gp">$ </span>pio app new MyTextApp
 </pre></td></tr></tbody></table> </div> <h3 id='2.-import-the-tutorial-data.' class='header-anchors'>2. Import the tutorial data.</h3><p>There are three different data sets available, each giving a different use case for this engine. Please refer to the <strong>Data Source: Reading Event Data</strong> section to see how to appropriate modify the <code>DataSource</code> class for use with each respective data set. The default data set is an e-mail spam data set.</p><p>These data sets have already been processed and are ready for <a href="/datacollection/batchimport/">batch import</a>. Replace <code>***</code> with your actual application ID.</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
 2
@@ -386,7 +386,7 @@
 </pre></td></tr></tbody></table> </div> <p>The last and final object implemented in this class simply creates a Map with keys being class labels and values, the corresponding category.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
 2</pre></td><td class="code"><pre> <span class="c1">// 5. Finally extract category map, associating label to category.
 </span>  <span class="k">val</span> <span class="n">categoryMap</span> <span class="k">=</span> <span class="n">td</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">e</span> <span class="k">=&gt;</span> <span class="o">(</span><span class="n">e</span><span class="o">.</span><span class="n">label</span><span class="o">,</span> <span class="n">e</span><span class="o">.</span><span class="n">category</span><span class="o">)).</span><span class="n">collectAsMap</span>
-</pre></td></tr></tbody></table> </div> <h2 id='algorithm-component' class='header-anchors'>Algorithm Component</h2><p>The algorithm components in this engine, <code>NBAlgorithm</code> and <code>LRAlgorithm</code>, actually follows a very general form. Firstly, a parameter class must again be initialized to feed in the corresponding Algorithm model parameters. For example, NBAlgorithm incorporates NBAlgorithmParams which holds the appropriate additive smoothing parameter lambda for the Naive Bayes model.</p><p>The main class of interest in this component is the class that extends <a href="https://docs.prediction.io/api/current/#io.prediction.controller.P2LAlgorithm">P2LAlgorithm</a>. This class must implement a method named train which will output your predictive model (as a concrete object, this will be implemented via a Scala class). It must also implement a predict method that transforms a query to an appropriate feature vector, and uses this to predict with the fitted model. The
  vectorization function is implemented by a PreparedData object, and the categorization (prediction) is handled by an instance of the NBModel implementation. Again, this demonstrates the facility with which different models can be incorporated into PredictionIO&#39;s DASE architecture.</p><p>The model class itself will be discussed in the following section, however, turn your attention to the TextManipulationEngine object defined in the script <code>Engine.scala</code>. You can see here that the engine is initialized by specifying the DataSource, Preparator, and Serving classes, as well as a Map of algorithm names to Algorithm classes. This tells the engine which algorithms to run. In practice, you can have as many statistical learning models as you&#39;d like, you simply have to implement a new algorithm component to do this. However, this general design form will persist, and the main meat of the work should be in the implementation of your model class.</p><p>The following subsect
 ion will go over our Naive Bayes implementation in NBModel.</p><h3 id='naive-bayes-classification' class='header-anchors'>Naive Bayes Classification</h3><p>This Training Model class only uses the Multinomial Naive Bayes <a href="https://spark.apache.org/docs/latest/mllib-naive-bayes.html">implementation</a> found in the Spark MLLib library. However, recall that the predicted results required in the specifications listed in the overview are of the form:</p><p><code>{category: String, confidence: Double}</code>.</p><p>The confidence value should really be interpreted as the probability that a document belongs to a category given its vectorized form. Note that MLLib&#39;s Naive Bayes model has the class members pi (\(\pi\)), and theta (\(\theta\)). \(\pi\) is a vector of log prior class probabilities, which shows your prior beliefs regarding the probability that an arbitrary document belongs in a category. \(\theta\) is a C \(\times\) D matrix, where C is the number of classes, and D, 
 the number of features, giving the log probabilities that parametrize the Multinomial likelihood model assumed for each class. The multinomial model is easiest to think about as a problem of randomly throwing balls into bins, where the ball lands in each bin with a certain probability. The model treats each n-gram as a bin, and the corresponding t.f.-i.d.f. value as the number of balls thrown into it. The likelihood is the probability of observing a (vectorized) document given that it comes from a particular class.</p><p>Now, letting \(\mathbf{x}\) be a vectorized text document, then it can be shown that the vector</p><p>$$ \frac{\exp\left(\pi + \theta\mathbf{x}\right)}{\left|\left|\exp\left(\pi + \theta\mathbf{x}\right)\right|\right|} $$</p><p>is a vector with C components that represent the posterior class membership probabilities for the document given \(\mathbf{x}\). That is, the update belief regarding what category this document belongs to after observing its vectorized form. 
 This is the motivation behind defining the class NBModel which uses Spark MLLib&#39;s NaiveBayesModel, but implements a separate prediction method.</p><p>The private methods innerProduct and getScores are implemented to do the matrix computation above.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+</pre></td></tr></tbody></table> </div> <h2 id='algorithm-component' class='header-anchors'>Algorithm Component</h2><p>The algorithm components in this engine, <code>NBAlgorithm</code> and <code>LRAlgorithm</code>, actually follows a very general form. Firstly, a parameter class must again be initialized to feed in the corresponding Algorithm model parameters. For example, NBAlgorithm incorporates NBAlgorithmParams which holds the appropriate additive smoothing parameter lambda for the Naive Bayes model.</p><p>The main class of interest in this component is the class that extends <a href="https://docs.prediction.io/api/current/#org.apache.predictionio.controller.P2LAlgorithm">P2LAlgorithm</a>. This class must implement a method named train which will output your predictive model (as a concrete object, this will be implemented via a Scala class). It must also implement a predict method that transforms a query to an appropriate feature vector, and uses this to predict with the fitted 
 model. The vectorization function is implemented by a PreparedData object, and the categorization (prediction) is handled by an instance of the NBModel implementation. Again, this demonstrates the facility with which different models can be incorporated into PredictionIO&#39;s DASE architecture.</p><p>The model class itself will be discussed in the following section, however, turn your attention to the TextManipulationEngine object defined in the script <code>Engine.scala</code>. You can see here that the engine is initialized by specifying the DataSource, Preparator, and Serving classes, as well as a Map of algorithm names to Algorithm classes. This tells the engine which algorithms to run. In practice, you can have as many statistical learning models as you&#39;d like, you simply have to implement a new algorithm component to do this. However, this general design form will persist, and the main meat of the work should be in the implementation of your model class.</p><p>The followi
 ng subsection will go over our Naive Bayes implementation in NBModel.</p><h3 id='naive-bayes-classification' class='header-anchors'>Naive Bayes Classification</h3><p>This Training Model class only uses the Multinomial Naive Bayes <a href="https://spark.apache.org/docs/latest/mllib-naive-bayes.html">implementation</a> found in the Spark MLLib library. However, recall that the predicted results required in the specifications listed in the overview are of the form:</p><p><code>{category: String, confidence: Double}</code>.</p><p>The confidence value should really be interpreted as the probability that a document belongs to a category given its vectorized form. Note that MLLib&#39;s Naive Bayes model has the class members pi (\(\pi\)), and theta (\(\theta\)). \(\pi\) is a vector of log prior class probabilities, which shows your prior beliefs regarding the probability that an arbitrary document belongs in a category. \(\theta\) is a C \(\times\) D matrix, where C is the number of classe
 s, and D, the number of features, giving the log probabilities that parametrize the Multinomial likelihood model assumed for each class. The multinomial model is easiest to think about as a problem of randomly throwing balls into bins, where the ball lands in each bin with a certain probability. The model treats each n-gram as a bin, and the corresponding t.f.-i.d.f. value as the number of balls thrown into it. The likelihood is the probability of observing a (vectorized) document given that it comes from a particular class.</p><p>Now, letting \(\mathbf{x}\) be a vectorized text document, then it can be shown that the vector</p><p>$$ \frac{\exp\left(\pi + \theta\mathbf{x}\right)}{\left|\left|\exp\left(\pi + \theta\mathbf{x}\right)\right|\right|} $$</p><p>is a vector with C components that represent the posterior class membership probabilities for the document given \(\mathbf{x}\). That is, the update belief regarding what category this document belongs to after observing its vectori
 zed form. This is the motivation behind defining the class NBModel which uses Spark MLLib&#39;s NaiveBayesModel, but implements a separate prediction method.</p><p>The private methods innerProduct and getScores are implemented to do the matrix computation above.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
 2
 3
 4
@@ -513,7 +513,7 @@
     </span><span class="p">}</span><span class="w">
   </span><span class="p">]</span><span class="w">
 </span><span class="p">}</span><span class="w">
-</span></pre></td></tr></tbody></table> </div> <h2 id='serving:-delivering-the-final-prediction' class='header-anchors'>Serving: Delivering the Final Prediction</h2><p>The serving component is the final stage in the engine, and in a sense, the most important. This is the final stage in which you combine the results obtained from the different models you choose to run. The Serving class extends the <a href="https://docs.prediction.io/api/current/#io.prediction.controller.LServing">LServing</a> class which must implement a method called serve. This takes a query and an associated sequence of predicted results, which contains the predicted results from the different algorithms that are implemented in your engine, and combines the results to yield a final prediction. It is this final prediction that you will receive after sending a query.</p><p>For example, you could choose to slightly modify the implementation to return class probabilities coming from a mixture of model estimates for c
 lass probabilities, or any other technique you could conceive for combining your results. The default engine setting has this set to yield the label from the model predicting with greater confidence.</p><h2 id='evaluation:-model-assessment-and-selection' class='header-anchors'>Evaluation: Model Assessment and Selection</h2><p> A predictive model needs to be evaluated to see how it will generalize to future observations. PredictionIO uses cross-validation to perform model performance metric estimates needed to assess your particular choice of model. The script <code>Evaluation.scala</code> available with the engine template exemplifies what a usual evaluator setup will look like. First, you must define an appropriate metric. In the engine template, since the topic is text classification, the default metric implemented is category accuracy.</p><p> Second you must define an evaluation object (i.e. extends the class <a href="https://docs.prediction.io/api/current/#io.prediction.controll
 er.Evaluation">Evaluation</a>). Here, you must specify the actual engine and metric components that are to be used for the evaluation. In the engine template, the specified engine is the TextManipulationEngine object, and metric, Accuracy. Lastly, you must specify the parameter values that you want to test in the cross validation. You see in the following block of code:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+</span></pre></td></tr></tbody></table> </div> <h2 id='serving:-delivering-the-final-prediction' class='header-anchors'>Serving: Delivering the Final Prediction</h2><p>The serving component is the final stage in the engine, and in a sense, the most important. This is the final stage in which you combine the results obtained from the different models you choose to run. The Serving class extends the <a href="https://docs.prediction.io/api/current/#org.apache.predictionio.controller.LServing">LServing</a> class which must implement a method called serve. This takes a query and an associated sequence of predicted results, which contains the predicted results from the different algorithms that are implemented in your engine, and combines the results to yield a final prediction. It is this final prediction that you will receive after sending a query.</p><p>For example, you could choose to slightly modify the implementation to return class probabilities coming from a mixture of model estim
 ates for class probabilities, or any other technique you could conceive for combining your results. The default engine setting has this set to yield the label from the model predicting with greater confidence.</p><h2 id='evaluation:-model-assessment-and-selection' class='header-anchors'>Evaluation: Model Assessment and Selection</h2><p> A predictive model needs to be evaluated to see how it will generalize to future observations. PredictionIO uses cross-validation to perform model performance metric estimates needed to assess your particular choice of model. The script <code>Evaluation.scala</code> available with the engine template exemplifies what a usual evaluator setup will look like. First, you must define an appropriate metric. In the engine template, since the topic is text classification, the default metric implemented is category accuracy.</p><p> Second you must define an evaluation object (i.e. extends the class <a href="https://docs.prediction.io/api/current/#org.apache.p
 redictionio.controller.Evaluation">Evaluation</a>). Here, you must specify the actual engine and metric components that are to be used for the evaluation. In the engine template, the specified engine is the TextManipulationEngine object, and metric, Accuracy. Lastly, you must specify the parameter values that you want to test in the cross validation. You see in the following block of code:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
 2
 3
 4

http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/afc2fa9a/demo/textclassification/index.html.gz
----------------------------------------------------------------------
diff --git a/demo/textclassification/index.html.gz b/demo/textclassification/index.html.gz
index b95807d..ff59017 100644
Binary files a/demo/textclassification/index.html.gz and b/demo/textclassification/index.html.gz differ