Posted to commits@predictionio.apache.org by do...@apache.org on 2016/10/08 04:57:49 UTC

[13/51] [abbrv] [partial] incubator-predictionio-site git commit: Documentation based on apache/incubator-predictionio#237e17acfbae7ca7eba43c61ce1c4f5498f980be

http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/02715c51/templates/ecommercerecommendation/dase/index.html
----------------------------------------------------------------------
diff --git a/templates/ecommercerecommendation/dase/index.html b/templates/ecommercerecommendation/dase/index.html
new file mode 100644
index 0000000..33c4da4
--- /dev/null
+++ b/templates/ecommercerecommendation/dase/index.html
@@ -0,0 +1,806 @@
+<!DOCTYPE html><html><head><title>DASE Components Explained (E-Commerce Recommendation)</title><meta charset="utf-8"/><meta content="IE=edge,chrome=1" http-equiv="X-UA-Compatible"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><meta class="swiftype" name="title" data-type="string" content="DASE Components Explained (E-Commerce Recommendation)"/><link rel="canonical" href="https://docs.prediction.io/templates/ecommercerecommendation/dase/"/><link href="/images/favicon/normal-b330020a.png" rel="shortcut icon"/><link href="/images/favicon/apple-c0febcf2.png" rel="apple-touch-icon"/><link href="//fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800" rel="stylesheet"/><link href="//maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"/><link href="/stylesheets/application-a2a2f408.css" rel="stylesheet" type="text/css"/><script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv
 /3.7.2/html5shiv.min.js"></script><script src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script><script src="//use.typekit.net/pqo0itb.js"></script><script>try{Typekit.load({ async: true });}catch(e){}</script></head><body><div id="global"><header><div class="container" id="header-wrapper"><div class="row"><div class="col-sm-12"><div id="logo-wrapper"><span id="drawer-toggle"></span><a href="#"></a><a href="http://predictionio.incubator.apache.org/"><img alt="PredictionIO" id="logo" src="/images/logos/logo-ee2b9bb3.png"/></a></div><div id="menu-wrapper"><div id="pill-wrapper"><a class="pill left" href="/gallery/template-gallery">TEMPLATES</a> <a class="pill right" href="//github.com/apache/incubator-predictionio/">OPEN SOURCE</a></div></div><img class="mobile-search-bar-toggler hidden-md hidden-lg" src="/images/icons/search-glass-704bd4ff.png"/></div></div></div></header><div id="search-bar-row-wrapper"><div class="container-fluid" id="search-bar-ro
 w"><div class="row"><div class="col-md-9 col-sm-11 col-xs-11"><div class="hidden-md hidden-lg" id="mobile-page-heading-wrapper"><p>PredictionIO Docs</p><h4>DASE Components Explained (E-Commerce Recommendation)</h4></div><h4 class="hidden-sm hidden-xs">PredictionIO Docs</h4></div><div class="col-md-3 col-sm-1 col-xs-1 hidden-md hidden-lg"><img id="left-menu-indicator" src="/images/icons/down-arrow-dfe9f7fe.png"/></div><div class="col-md-3 col-sm-12 col-xs-12 swiftype-wrapper"><div class="swiftype"><form class="search-form"><img class="search-box-toggler hidden-xs hidden-sm" src="/images/icons/search-glass-704bd4ff.png"/><div class="search-box"><img src="/images/icons/search-glass-704bd4ff.png"/><input type="text" id="st-search-input" class="st-search-input" placeholder="Search Doc..."/></div><img class="swiftype-row-hider hidden-md hidden-lg" src="/images/icons/drawer-toggle-active-fcbef12a.png"/></form></div></div><div class="mobile-left-menu-toggler hidden-md hidden-lg"></div></div
 ></div></div><div id="page" class="container-fluid"><div class="row"><div id="left-menu-wrapper" class="col-md-3"><nav id="nav-main"><ul><li class="level-1"><a class="expandible" href="/"><span>Apache PredictionIO (incubating) Documentation</span></a><ul><li class="level-2"><a class="final" href="/"><span>Welcome to Apache PredictionIO (incubating)</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Getting Started</span></a><ul><li class="level-2"><a class="final" href="/start/"><span>A Quick Intro</span></a></li><li class="level-2"><a class="final" href="/install/"><span>Installing Apache PredictionIO (incubating)</span></a></li><li class="level-2"><a class="final" href="/start/download/"><span>Downloading an Engine Template</span></a></li><li class="level-2"><a class="final" href="/start/deploy/"><span>Deploying Your First Engine</span></a></li><li class="level-2"><a class="final" href="/start/customize/"><span>Customizing the Engine</span></a></li><
 /ul></li><li class="level-1"><a class="expandible" href="#"><span>Integrating with Your App</span></a><ul><li class="level-2"><a class="final" href="/appintegration/"><span>App Integration Overview</span></a></li><li class="level-2"><a class="expandible" href="/sdk/"><span>List of SDKs</span></a><ul><li class="level-3"><a class="final" href="/sdk/java/"><span>Java & Android SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/php/"><span>PHP SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/python/"><span>Python SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/ruby/"><span>Ruby SDK</span></a></li><li class="level-3"><a class="final" href="/sdk/community/"><span>Community Powered SDKs</span></a></li></ul></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Deploying an Engine</span></a><ul><li class="level-2"><a class="final" href="/deploy/"><span>Deploying as a Web Service</span></a></li><li class="level-2"><a class
 ="final" href="/cli/#engine-commands"><span>Engine Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/deploy/monitoring/"><span>Monitoring Engine</span></a></li><li class="level-2"><a class="final" href="/deploy/engineparams/"><span>Setting Engine Parameters</span></a></li><li class="level-2"><a class="final" href="/deploy/enginevariants/"><span>Deploying Multiple Engine Variants</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Customizing an Engine</span></a><ul><li class="level-2"><a class="final" href="/customize/"><span>Learning DASE</span></a></li><li class="level-2"><a class="final" href="/customize/dase/"><span>Implement DASE</span></a></li><li class="level-2"><a class="final" href="/customize/troubleshooting/"><span>Troubleshooting Engine Development</span></a></li><li class="level-2"><a class="final" href="/api/current/#package"><span>Engine Scala APIs</span></a></li></ul></li><li class="level-1"><a class="expa
 ndible" href="#"><span>Collecting and Analyzing Data</span></a><ul><li class="level-2"><a class="final" href="/datacollection/"><span>Event Server Overview</span></a></li><li class="level-2"><a class="final" href="/cli/#event-server-commands"><span>Event Server Command-line Interface</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventapi/"><span>Collecting Data with REST/SDKs</span></a></li><li class="level-2"><a class="final" href="/datacollection/eventmodel/"><span>Events Modeling</span></a></li><li class="level-2"><a class="final" href="/datacollection/webhooks/"><span>Unifying Multichannel Data with Webhooks</span></a></li><li class="level-2"><a class="final" href="/datacollection/channel/"><span>Channel</span></a></li><li class="level-2"><a class="final" href="/datacollection/batchimport/"><span>Importing Data in Batch</span></a></li><li class="level-2"><a class="final" href="/datacollection/analytics/"><span>Using Analytics Tools</span></a></li></ul
 ></li><li class="level-1"><a class="expandible" href="#"><span>Choosing an Algorithm(s)</span></a><ul><li class="level-2"><a class="final" href="/algorithm/"><span>Built-in Algorithm Libraries</span></a></li><li class="level-2"><a class="final" href="/algorithm/switch/"><span>Switching to Another Algorithm</span></a></li><li class="level-2"><a class="final" href="/algorithm/multiple/"><span>Combining Multiple Algorithms</span></a></li><li class="level-2"><a class="final" href="/algorithm/custom/"><span>Adding Your Own Algorithms</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>ML Tuning and Evaluation</span></a><ul><li class="level-2"><a class="final" href="/evaluation/"><span>Overview</span></a></li><li class="level-2"><a class="final" href="/evaluation/paramtuning/"><span>Hyperparameter Tuning</span></a></li><li class="level-2"><a class="final" href="/evaluation/evaluationdashboard/"><span>Evaluation Dashboard</span></a></li><li class="level-2"><a 
 class="final" href="/evaluation/metricchoose/"><span>Choosing Evaluation Metrics</span></a></li><li class="level-2"><a class="final" href="/evaluation/metricbuild/"><span>Building Evaluation Metrics</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>System Architecture</span></a><ul><li class="level-2"><a class="final" href="/system/"><span>Architecture Overview</span></a></li><li class="level-2"><a class="final" href="/system/anotherdatastore/"><span>Using Another Data Store</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Engine Template Gallery</span></a><ul><li class="level-2"><a class="final" href="/gallery/template-gallery/"><span>Browse</span></a></li><li class="level-2"><a class="final" href="/community/submit-template/"><span>Submit your Engine as a Template</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Demo Tutorials</span></a><ul><li class="level-2"><a class="final" href="/
 demo/tapster/"><span>Comics Recommendation Demo</span></a></li><li class="level-2"><a class="final" href="/demo/community/"><span>Community Contributed Demo</span></a></li><li class="level-2"><a class="final" href="/demo/textclassification/"><span>Text Classification Engine Tutorial</span></a></li></ul></li><li class="level-1"><a class="expandible" href="/community/"><span>Getting Involved</span></a><ul><li class="level-2"><a class="final" href="/community/contribute-code/"><span>Contribute Code</span></a></li><li class="level-2"><a class="final" href="/community/contribute-documentation/"><span>Contribute Documentation</span></a></li><li class="level-2"><a class="final" href="/community/contribute-sdk/"><span>Contribute a SDK</span></a></li><li class="level-2"><a class="final" href="/community/contribute-webhook/"><span>Contribute a Webhook</span></a></li><li class="level-2"><a class="final" href="/community/projects/"><span>Community Projects</span></a></li></ul></li><li class="le
 vel-1"><a class="expandible" href="#"><span>Getting Help</span></a><ul><li class="level-2"><a class="final" href="/resources/faq/"><span>FAQs</span></a></li><li class="level-2"><a class="final" href="/support/"><span>Support</span></a></li></ul></li><li class="level-1"><a class="expandible" href="#"><span>Resources</span></a><ul><li class="level-2"><a class="final" href="/resources/intellij/"><span>Developing Engines with IntelliJ IDEA</span></a></li><li class="level-2"><a class="final" href="/resources/upgrade/"><span>Upgrade Instructions</span></a></li><li class="level-2"><a class="final" href="/resources/glossary/"><span>Glossary</span></a></li></ul></li></ul></nav></div><div class="col-md-9 col-sm-12"><div class="content-header hidden-md hidden-lg"><div id="page-title"><h1>DASE Components Explained (E-Commerce Recommendation)</h1></div></div><div id="table-of-content-wrapper"><h5>On this page</h5><aside id="table-of-contents"><ul> <li> <a href="#the-engine-design">The Engine Des
 ign</a> </li> <li> <a href="#data">Data</a> </li> <li> <a href="#algorithm">Algorithm</a> </li> <li> <a href="#serving">Serving</a> </li> </ul> </aside><hr/><a id="edit-page-link" href="https://github.com/apache/incubator-predictionio/tree/livedoc/docs/manual/source/templates/ecommercerecommendation/dase.html.md.erb"><img src="/images/icons/edit-pencil-d6c1bb3d.png"/>Edit this page</a></div><div class="content-header hidden-sm hidden-xs"><div id="page-title"><h1>DASE Components Explained (E-Commerce Recommendation)</h1></div></div><div class="content"><p>PredictionIO&#39;s DASE architecture brings the separation-of-concerns design principle to predictive engine development. DASE stands for the following components of an engine:</p> <ul> <li><strong>D</strong>ata - includes Data Source and Data Preparator</li> <li><strong>A</strong>lgorithm(s)</li> <li><strong>S</strong>erving</li> <li><strong>E</strong>valuator</li> </ul> <p><p>Let&#39;s look at the code and see how you can customiz
 e the engine you built from the E-Commerce Recommendation Engine Template.</p><div class="alert-message note"><p>Evaluator will not be covered in this tutorial.</p></div></p><h2 id='the-engine-design' class='header-anchors'>The Engine Design</h2><p>As you can see from the Quick Start, <em>MyECommerceRecommendation</em> takes a JSON prediction query, e.g. <code>{ &quot;user&quot;: &quot;u1&quot;, &quot;num&quot;: 4 }</code>, and returns a JSON predicted result. In MyECommerceRecommendation/src/main/scala/<strong><em>Engine.scala</em></strong>, the <code>Query</code> case class defines the format of such a <strong>query</strong>:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">Query</span><span class="o">(</span>
+  <span class="n">user</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span>
+  <span class="n">num</span><span class="k">:</span> <span class="kt">Int</span><span class="o">,</span>
+  <span class="n">categories</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Set</span><span class="o">[</span><span class="kt">String</span><span class="o">]],</span>
+  <span class="n">whiteList</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Set</span><span class="o">[</span><span class="kt">String</span><span class="o">]],</span>
+  <span class="n">blackList</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Set</span><span class="o">[</span><span class="kt">String</span><span class="o">]]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>The <code>PredictedResult</code> case class defines the format of the <strong>predicted result</strong>, such as</p><div class="highlight json"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6</pre></td><td class="code"><pre><span class="p">{</span><span class="s2">"itemScores"</span><span class="p">:[</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span class="p">:</span><span class="s2">"22"</span><span class="p">,</span><span class="s2">"score"</span><span class="p">:</span><span class="mf">4.07</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span class="p">:</span><span class="s2">"62"</span><span class="p">,</span><span class="s2">"score"</span><span class="p">:</span><span class="mf">4.05</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span class="p">:</span><span class="s2">"75"</span><span class="p">,</span><span class="s2">"score"</span><span class="p">:</span><span class="mf">4.04</span><span class="p">},</span><span class="w">
+  </span><span class="p">{</span><span class="s2">"item"</span><span class="p">:</span><span class="s2">"68"</span><span class="p">,</span><span class="s2">"score"</span><span class="p">:</span><span class="mf">3.81</span><span class="p">}</span><span class="w">
+</span><span class="p">]}</span><span class="w">
+</span></pre></td></tr></tbody></table> </div> <p>with:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">PredictedResult</span><span class="o">(</span>
+  <span class="n">itemScores</span><span class="k">:</span> <span class="kt">Array</span><span class="o">[</span><span class="kt">ItemScore</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span>
+
+<span class="k">case</span> <span class="k">class</span> <span class="nc">ItemScore</span><span class="o">(</span>
+  <span class="n">item</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span>
+  <span class="n">score</span><span class="k">:</span> <span class="kt">Double</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>Finally, <code>ECommerceRecommendationEngine</code> is the <em>Engine Factory</em> that defines the components this engine will use: Data Source, Data Preparator, Algorithm(s) and Serving components.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="k">object</span> <span class="nc">ECommerceRecommendationEngine</span> <span class="k">extends</span> <span class="nc">IEngineFactory</span> <span class="o">{</span>
+  <span class="k">def</span> <span class="n">apply</span><span class="o">()</span> <span class="k">=</span> <span class="o">{</span>
+    <span class="k">new</span> <span class="nc">Engine</span><span class="o">(</span>
+      <span class="n">classOf</span><span class="o">[</span><span class="kt">DataSource</span><span class="o">],</span>
+      <span class="n">classOf</span><span class="o">[</span><span class="kt">Preparator</span><span class="o">],</span>
+      <span class="nc">Map</span><span class="o">(</span><span class="s">"ecomm"</span> <span class="o">-&gt;</span> <span class="n">classOf</span><span class="o">[</span><span class="kt">ECommAlgorithm</span><span class="o">]),</span>
+      <span class="n">classOf</span><span class="o">[</span><span class="kt">Serving</span><span class="o">])</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <h3 id='spark-mllib' class='header-anchors'>Spark MLlib</h3><p>The PredictionIO E-Commerce Recommendation Engine Template integrates Spark&#39;s MLlib ALS algorithm under the DASE architecture. We will take a closer look at the DASE code below.</p><p>The MLlib ALS algorithm takes training data as an RDD, i.e. <code>RDD[Rating]</code>, and trains a model, which is a <code>MatrixFactorizationModel</code> object.</p><p>See the <a href="https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html">MLlib collaborative filtering documentation</a> to learn more about MLlib&#39;s ALS collaborative filtering algorithm.</p><h2 id='data' class='header-anchors'>Data</h2><p>In the DASE architecture, data is prepared by two components in sequence: <em>DataSource</em> and <em>DataPreparator</em>. They take data from the data store and prepare it for the Algorithm.</p><h3 id='data-source' class='header-anchors'>Data Source</h3><p>In MyECommerceRecommendation/src/main/scala/<strong><e
 m>DataSource.scala</em></strong>, the <code>readTraining</code> method of class <code>DataSource</code> reads and selects data from the <em>Event Store</em> (data store of the <em>Event Server</em>). It returns <code>TrainingData</code>.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">DataSourceParams</span><span class="o">(</span><span class="n">appName</span><span class="k">:</span> <span class="kt">String</span><span class="o">)</span> <span class="k">extends</span> <span class="nc">Params</span>
+
+<span class="k">class</span> <span class="nc">DataSource</span><span class="o">(</span><span class="k">val</span> <span class="n">dsp</span><span class="k">:</span> <span class="kt">DataSourceParams</span><span class="o">)</span>
+  <span class="k">extends</span> <span class="nc">PDataSource</span><span class="o">[</span><span class="kt">TrainingData</span>,
+      <span class="kt">EmptyEvaluationInfo</span>, <span class="kt">Query</span>, <span class="kt">EmptyActualResult</span><span class="o">]</span> <span class="o">{</span>
+
+  <span class="nd">@transient</span> <span class="k">lazy</span> <span class="k">val</span> <span class="n">logger</span> <span class="k">=</span> <span class="nc">Logger</span><span class="o">[</span><span class="kt">this.</span><span class="k">type</span><span class="o">]</span>
+
+  <span class="k">override</span>
+  <span class="k">def</span> <span class="n">readTraining</span><span class="o">(</span><span class="n">sc</span><span class="k">:</span> <span class="kt">SparkContext</span><span class="o">)</span><span class="k">:</span> <span class="kt">TrainingData</span> <span class="o">=</span> <span class="o">{</span>
+
+    <span class="c1">// create an RDD of (entityID, User)
+</span>    <span class="k">val</span> <span class="n">usersRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">User</span><span class="o">)]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">aggregateProperties</span><span class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// create an RDD of (entityID, Item)
+</span>    <span class="k">val</span> <span class="n">itemsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">Item</span><span class="o">)]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">aggregateProperties</span><span class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// get all "user" "view" or "buy" "item" events from event store
+</span>    <span class="k">val</span> <span class="n">eventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">Event</span><span class="o">]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">find</span><span class="o">(...)</span> <span class="o">...</span>
+
+    <span class="c1">// filter all view events
+</span>    <span class="k">val</span> <span class="n">viewEventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">ViewEvent</span><span class="o">]</span> <span class="k">=</span> <span class="n">eventsRDD</span><span class="o">.</span><span class="n">filter</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span> <span class="o">...</span>
+
+    <span class="c1">// filter all buy events
+</span>    <span class="k">val</span> <span class="n">buyEventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">BuyEvent</span><span class="o">]</span> <span class="k">=</span> <span class="n">eventsRDD</span><span class="o">.</span><span class="n">filter</span> <span class="o">{</span> <span class="o">...}</span> <span class="o">...</span>
+
+    <span class="k">new</span> <span class="nc">TrainingData</span><span class="o">(</span>
+      <span class="n">users</span> <span class="k">=</span> <span class="n">usersRDD</span><span class="o">,</span>
+      <span class="n">items</span> <span class="k">=</span> <span class="n">itemsRDD</span><span class="o">,</span>
+      <span class="n">viewEvents</span> <span class="k">=</span> <span class="n">viewEventsRDD</span><span class="o">,</span>
+      <span class="n">buyEvents</span> <span class="k">=</span> <span class="n">buyEventsRDD</span>
+    <span class="o">)</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO automatically loads the <em>datasource</em> parameters specified in MyECommerceRecommendation/<strong><em>engine.json</em></strong>, including <em>appName</em>, into <code>dsp</code>.</p><p>In <strong><em>engine.json</em></strong>:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="o">{</span>
+  ...
+  <span class="s2">"datasource"</span>: <span class="o">{</span>
+    <span class="s2">"params"</span> : <span class="o">{</span>
+      <span class="s2">"appName"</span>: <span class="s2">"MyApp1"</span>
+    <span class="o">}</span>
+  <span class="o">}</span>,
+  ...
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>In <code>readTraining()</code>, <code>PEventStore</code> is an object that provides functions to access data collected by the PredictionIO Event Server.</p><p>This E-Commerce Recommendation Engine Template requires &quot;user&quot; and &quot;item&quot; entities that are set by events.</p><p><code>PEventStore.aggregateProperties(...)</code> aggregates the properties of the <code>user</code> and <code>item</code> entities that are set, unset, or deleted by the special events <strong>$set</strong>, <strong>$unset</strong> and <strong>$delete</strong>. Please refer to <a href="/datacollection/eventapi/#note-about-properties">Event API</a> for more details on using these events.</p><p>The following code aggregates the properties of <code>user</code> and then maps each result to a <code>User()</code> object.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre>
+  <span class="c1">// create an RDD of (entityID, User)
+</span>  <span class="k">val</span> <span class="n">usersRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">User</span><span class="o">)]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">aggregateProperties</span><span class="o">(</span>
+    <span class="n">appName</span> <span class="k">=</span> <span class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span class="o">,</span>
+    <span class="n">entityType</span> <span class="k">=</span> <span class="s">"user"</span>
+  <span class="o">)(</span><span class="n">sc</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="k">case</span> <span class="o">(</span><span class="n">entityId</span><span class="o">,</span> <span class="n">properties</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">val</span> <span class="n">user</span> <span class="k">=</span> <span class="k">try</span> <span class="o">{</span>
+      <span class="nc">User</span><span class="o">()</span>
+    <span class="o">}</span> <span class="k">catch</span> <span class="o">{</span>
+      <span class="k">case</span> <span class="n">e</span><span class="k">:</span> <span class="kt">Exception</span> <span class="o">=&gt;</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="o">(</span><span class="n">s</span><span class="s">"Failed to get properties ${properties} of"</span> <span class="o">+</span>
+          <span class="n">s</span><span class="s">" user ${entityId}. Exception: ${e}."</span><span class="o">)</span>
+        <span class="k">throw</span> <span class="n">e</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+    <span class="o">(</span><span class="n">entityId</span><span class="o">,</span> <span class="n">user</span><span class="o">)</span>
+  <span class="o">}.</span><span class="n">cache</span><span class="o">()</span>
+
+</pre></td></tr></tbody></table> </div> <p>In the template, the <code>User()</code> object is a simple placeholder for you to customize and expand.</p><p>Similarly, the following code aggregates <code>item</code> properties and then maps each result to an <code>Item()</code> object. By default, this template assumes each item has an optional property <code>categories</code>, which is a list of strings.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17</pre></td><td class="code"><pre>  <span class="c1">// create an RDD of (entityID, Item)
+</span>  <span class="k">val</span> <span class="n">itemsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">Item</span><span class="o">)]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">aggregateProperties</span><span class="o">(</span>
+    <span class="n">appName</span> <span class="k">=</span> <span class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span class="o">,</span>
+    <span class="n">entityType</span> <span class="k">=</span> <span class="s">"item"</span>
+  <span class="o">)(</span><span class="n">sc</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="k">case</span> <span class="o">(</span><span class="n">entityId</span><span class="o">,</span> <span class="n">properties</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">val</span> <span class="n">item</span> <span class="k">=</span> <span class="k">try</span> <span class="o">{</span>
+      <span class="c1">// Assume categories is an optional property of item.
+</span>      <span class="nc">Item</span><span class="o">(</span><span class="n">categories</span> <span class="k">=</span> <span class="n">properties</span><span class="o">.</span><span class="n">getOpt</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]](</span><span class="s">"categories"</span><span class="o">))</span>
+    <span class="o">}</span> <span class="k">catch</span> <span class="o">{</span>
+      <span class="k">case</span> <span class="n">e</span><span class="k">:</span> <span class="kt">Exception</span> <span class="o">=&gt;</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="o">(</span><span class="n">s</span><span class="s">"Failed to get properties ${properties} of"</span> <span class="o">+</span>
+          <span class="n">s</span><span class="s">" item ${entityId}. Exception: ${e}."</span><span class="o">)</span>
+        <span class="k">throw</span> <span class="n">e</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+    <span class="o">(</span><span class="n">entityId</span><span class="o">,</span> <span class="n">item</span><span class="o">)</span>
+  <span class="o">}.</span><span class="n">cache</span><span class="o">()</span>
+</pre></td></tr></tbody></table> </div> <p>The <code>Item</code> case class is defined as</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">Item</span><span class="o">(</span><span class="n">categories</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">]])</span>
+</pre></td></tr></tbody></table> </div> <p><code>PEventStore.find(...)</code> specifies the events that you want to read. In this case, &quot;user view item&quot; and &quot;user buy item&quot; events are read:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10</pre></td><td class="code"><pre>
+  <span class="c1">// get all "user" "view" "item" and "user" "buy" "item" events
+</span>  <span class="k">val</span> <span class="n">eventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">Event</span><span class="o">]</span> <span class="k">=</span> <span class="nc">PEventStore</span><span class="o">.</span><span class="n">find</span><span class="o">(</span>
+      <span class="n">appName</span> <span class="k">=</span> <span class="n">dsp</span><span class="o">.</span><span class="n">appName</span><span class="o">,</span>
+      <span class="n">entityType</span> <span class="k">=</span> <span class="nc">Some</span><span class="o">(</span><span class="s">"user"</span><span class="o">),</span>
+      <span class="n">eventNames</span> <span class="k">=</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">List</span><span class="o">(</span><span class="s">"view"</span><span class="o">,</span> <span class="s">"buy"</span><span class="o">)),</span>
+      <span class="c1">// targetEntityType is optional field of an event.
+</span>      <span class="n">targetEntityType</span> <span class="k">=</span> <span class="nc">Some</span><span class="o">(</span><span class="nc">Some</span><span class="o">(</span><span class="s">"item"</span><span class="o">)))(</span><span class="n">sc</span><span class="o">)</span>
+      <span class="o">.</span><span class="n">cache</span><span class="o">()</span>
+
+</pre></td></tr></tbody></table> </div> <p>Note that <code>.cache()</code> is used to cache the RDD data in memory since <code>eventsRDD</code> will be used multiple times later.</p><p>Then we filter the events we are interested in and map each event to a <code>ViewEvent</code> object.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17</pre></td><td class="code"><pre>
+  <span class="k">val</span> <span class="n">viewEventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">ViewEvent</span><span class="o">]</span> <span class="k">=</span> <span class="n">eventsRDD</span>
+      <span class="o">.</span><span class="n">filter</span> <span class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span> <span class="n">event</span><span class="o">.</span><span class="n">event</span> <span class="o">==</span> <span class="s">"view"</span> <span class="o">}</span>
+      <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span>
+        <span class="k">try</span> <span class="o">{</span>
+          <span class="nc">ViewEvent</span><span class="o">(</span>
+            <span class="n">user</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">entityId</span><span class="o">,</span>
+            <span class="n">item</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">targetEntityId</span><span class="o">.</span><span class="n">get</span><span class="o">,</span>
+            <span class="n">t</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">eventTime</span><span class="o">.</span><span class="n">getMillis</span>
+          <span class="o">)</span>
+        <span class="o">}</span> <span class="k">catch</span> <span class="o">{</span>
+          <span class="k">case</span> <span class="n">e</span><span class="k">:</span> <span class="kt">Exception</span> <span class="o">=&gt;</span>
+            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="o">(</span><span class="n">s</span><span class="s">"Cannot convert ${event} to ViewEvent."</span> <span class="o">+</span>
+              <span class="n">s</span><span class="s">" Exception: ${e}."</span><span class="o">)</span>
+            <span class="k">throw</span> <span class="n">e</span>
+        <span class="o">}</span>
+      <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p><code>ViewEvent</code> case class is defined as:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">ViewEvent</span><span class="o">(</span><span class="n">user</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">item</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">t</span><span class="k">:</span> <span class="kt">Long</span><span class="o">)</span>
+</pre></td></tr></tbody></table> </div> <p>We filter the buy events in a similar way and map them to <code>BuyEvent</code> objects for later use.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre>
+  <span class="k">val</span> <span class="n">buyEventsRDD</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">BuyEvent</span><span class="o">]</span> <span class="k">=</span> <span class="n">eventsRDD</span>
+      <span class="o">.</span><span class="n">filter</span> <span class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span> <span class="n">event</span><span class="o">.</span><span class="n">event</span> <span class="o">==</span> <span class="s">"buy"</span> <span class="o">}</span>
+      <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="n">event</span> <span class="k">=&gt;</span>
+        <span class="k">try</span> <span class="o">{</span>
+          <span class="nc">BuyEvent</span><span class="o">(</span>
+            <span class="n">user</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">entityId</span><span class="o">,</span>
+            <span class="n">item</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">targetEntityId</span><span class="o">.</span><span class="n">get</span><span class="o">,</span>
+            <span class="n">t</span> <span class="k">=</span> <span class="n">event</span><span class="o">.</span><span class="n">eventTime</span><span class="o">.</span><span class="n">getMillis</span>
+          <span class="o">)</span>
+        <span class="o">}</span> <span class="k">catch</span> <span class="o">{</span>
+          <span class="k">case</span> <span class="n">e</span><span class="k">:</span> <span class="kt">Exception</span> <span class="o">=&gt;</span>
+            <span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="o">(</span><span class="n">s</span><span class="s">"Cannot convert ${event} to BuyEvent."</span> <span class="o">+</span>
+              <span class="n">s</span><span class="s">" Exception: ${e}."</span><span class="o">)</span>
+            <span class="k">throw</span> <span class="n">e</span>
+        <span class="o">}</span>
+      <span class="o">}</span>
+
+</pre></td></tr></tbody></table> </div> <p><code>BuyEvent</code> case class is defined as:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">BuyEvent</span><span class="o">(</span><span class="n">user</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">item</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span> <span class="n">t</span><span class="k">:</span> <span class="kt">Long</span><span class="o">)</span>
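+</pre></td></tr></tbody></table> </div> <p>The same filter-and-map pattern can be tried without Spark. The following is a minimal sketch on plain Scala collections; <code>RawEvent</code> is a hypothetical stand-in for PredictionIO&#39;s <code>Event</code> (an assumption for illustration, not part of the template) that keeps only the fields used here:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>// Hypothetical stand-in for PredictionIO's Event; only the fields used here.
+case class RawEvent(
+  event: String,
+  entityId: String,
+  targetEntityId: Option[String],
+  eventTimeMillis: Long)
+
+case class ViewEvent(user: String, item: String, t: Long)
+case class BuyEvent(user: String, item: String, t: Long)
+
+val events = Seq(
+  RawEvent("view", "u1", Some("i1"), 1000L),
+  RawEvent("buy", "u1", Some("i2"), 2000L),
+  RawEvent("view", "u2", Some("i1"), 3000L))
+
+// Keep only "view" events and convert them, as viewEventsRDD does.
+val viewEvents = events
+  .filter(_.event == "view")
+  .map(e =&gt; ViewEvent(e.entityId, e.targetEntityId.get, e.eventTimeMillis))
+
+// Keep only "buy" events and convert them, as buyEventsRDD does.
+val buyEvents = events
+  .filter(_.event == "buy")
+  .map(e =&gt; BuyEvent(e.entityId, e.targetEntityId.get, e.eventTimeMillis))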
+</pre></td></tr></tbody></table> </div> <div class="alert-message info"><p>For flexibility, this template is designed to support String user IDs and item IDs.</p></div><p><code>TrainingData</code> contains RDDs of <code>User</code>, <code>Item</code>, <code>ViewEvent</code> and <code>BuyEvent</code> objects. The class definition of <code>TrainingData</code> is:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6</pre></td><td class="code"><pre><span class="k">class</span> <span class="nc">TrainingData</span><span class="o">(</span>
+  <span class="k">val</span> <span class="n">users</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">User</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">items</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">Item</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">viewEvents</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">ViewEvent</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">buyEvents</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">BuyEvent</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO then passes the returned <code>TrainingData</code> object to <em>Data Preparator</em>.</p><div class="alert-message note"><p>You could modify the DataSource to read events other than the default <strong>view</strong> and <strong>buy</strong> events.</p></div><h3 id='data-preparator' class='header-anchors'>Data Preparator</h3><p>In MyECommerceRecommendation/src/main/scala/<strong><em>Preparator.scala</em></strong>, the <code>prepare</code> method of class <code>Preparator</code> takes <code>TrainingData</code> as its input and performs any necessary feature selection and data processing tasks. At the end, it returns <code>PreparedData</code>, which should contain the data <em>Algorithm</em> needs.</p><p>By default, <code>prepare</code> simply copies the unprocessed <code>TrainingData</code> data to <code>PreparedData</code>:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18</pre></td><td class="code"><pre><span class="k">class</span> <span class="nc">Preparator</span>
+  <span class="k">extends</span> <span class="nc">PPreparator</span><span class="o">[</span><span class="kt">TrainingData</span>, <span class="kt">PreparedData</span><span class="o">]</span> <span class="o">{</span>
+
+  <span class="k">def</span> <span class="n">prepare</span><span class="o">(</span><span class="n">sc</span><span class="k">:</span> <span class="kt">SparkContext</span><span class="o">,</span> <span class="n">trainingData</span><span class="k">:</span> <span class="kt">TrainingData</span><span class="o">)</span><span class="k">:</span> <span class="kt">PreparedData</span> <span class="o">=</span> <span class="o">{</span>
+    <span class="k">new</span> <span class="nc">PreparedData</span><span class="o">(</span>
+      <span class="n">users</span> <span class="k">=</span> <span class="n">trainingData</span><span class="o">.</span><span class="n">users</span><span class="o">,</span>
+      <span class="n">items</span> <span class="k">=</span> <span class="n">trainingData</span><span class="o">.</span><span class="n">items</span><span class="o">,</span>
+      <span class="n">viewEvents</span> <span class="k">=</span> <span class="n">trainingData</span><span class="o">.</span><span class="n">viewEvents</span><span class="o">,</span>
+      <span class="n">buyEvents</span> <span class="k">=</span> <span class="n">trainingData</span><span class="o">.</span><span class="n">buyEvents</span><span class="o">)</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+
+<span class="k">class</span> <span class="nc">PreparedData</span><span class="o">(</span>
+  <span class="k">val</span> <span class="n">users</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">User</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">items</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[(</span><span class="kt">String</span>, <span class="kt">Item</span><span class="o">)],</span>
+  <span class="k">val</span> <span class="n">viewEvents</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">ViewEvent</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">buyEvents</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">BuyEvent</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO passes the returned <code>PreparedData</code> object to Algorithm&#39;s <code>train</code> function.</p><h2 id='algorithm' class='header-anchors'>Algorithm</h2><p>In MyECommerceRecommendation/src/main/scala/<strong><em>ECommAlgorithm.scala</em></strong>, the two methods of the algorithm class are <code>train</code> and <code>predict</code>. <code>train</code> is responsible for training the predictive model; <code>predict</code> is responsible for using this model to make predictions.</p><h3 id='algorithm-parameters' class='header-anchors'>Algorithm parameters</h3><p>The ECommAlgorithm takes the following parameters, as defined by the <code>ECommAlgorithmParams</code> case class:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10</pre></td><td class="code"><pre><span class="k">case</span> <span class="k">class</span> <span class="nc">ECommAlgorithmParams</span><span class="o">(</span>
+  <span class="n">appName</span><span class="k">:</span> <span class="kt">String</span><span class="o">,</span>
+  <span class="n">unseenOnly</span><span class="k">:</span> <span class="kt">Boolean</span><span class="o">,</span>
+  <span class="n">seenEvents</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">],</span>
+  <span class="n">similarEvents</span><span class="k">:</span> <span class="kt">List</span><span class="o">[</span><span class="kt">String</span><span class="o">],</span>
+  <span class="n">rank</span><span class="k">:</span> <span class="kt">Int</span><span class="o">,</span>
+  <span class="n">numIterations</span><span class="k">:</span> <span class="kt">Int</span><span class="o">,</span>
+  <span class="n">lambda</span><span class="k">:</span> <span class="kt">Double</span><span class="o">,</span>
+  <span class="n">seed</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Long</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Params</span>
+</pre></td></tr></tbody></table> </div> <p>Parameter description:</p> <ul> <li><strong>appName</strong>: Your App name. Events defined by &quot;seenEvents&quot; and &quot;similarEvents&quot; will be read from this app during <code>predict</code>.</li> <li><strong>unseenOnly</strong>: true or false. Set to true if you want to recommend unseen items only. Seen items are defined by <em>seenEvents</em>: if the user has performed any of these events on an item, the item is treated as <em>seen</em>.</li> <li><strong>seenEvents</strong>: A list of user-to-item events which will be treated as <em>seen</em> events. Used when <em>unseenOnly</em> is set to true.</li> <li><strong>similarEvents</strong>: A list of user-to-item events used to find items similar to the items on which the user has performed these events.</li> <li><strong>rank</strong>: Parameter of the MLlib ALS algorithm. Number of latent features.</li> <li><strong>numIterations</strong>: Parameter of the MLlib ALS algorithm. Number of iterations.</li> <li><strong>lambda</strong>: Regularization parameter of the MLlib ALS algorithm.</li> <li><strong>seed</strong>: Optional. A random seed for the MLlib ALS algorithm. Specify a fixed value if you want deterministic results.</li> </ul> <h3 id='train(...)' class='header-anchors'>train(...)</h3><p><code>train</code> is called when you run <strong>pio train</strong>. This is where the MLlib ALS algorithm, i.e. <code>ALS.trainImplicit()</code>, is used to train a predictive model. In addition, we count the number of times each item has been bought as a default model, which is used during <code>predict</code> when no ALS model or other useful information about the user is available.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36
+37
+38
+39</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">train</span><span class="o">(</span><span class="n">sc</span><span class="k">:</span> <span class="kt">SparkContext</span><span class="o">,</span> <span class="n">data</span><span class="k">:</span> <span class="kt">PreparedData</span><span class="o">)</span><span class="k">:</span> <span class="kt">ECommModel</span> <span class="o">=</span> <span class="o">{</span>
+    <span class="o">...</span>
+
+    <span class="c1">// create User and item's String ID to integer index BiMap
+</span>    <span class="k">val</span> <span class="n">userStringIntMap</span> <span class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span class="n">stringInt</span><span class="o">(</span><span class="n">data</span><span class="o">.</span><span class="n">users</span><span class="o">.</span><span class="n">keys</span><span class="o">)</span>
+    <span class="k">val</span> <span class="n">itemStringIntMap</span> <span class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span class="n">stringInt</span><span class="o">(</span><span class="n">data</span><span class="o">.</span><span class="n">items</span><span class="o">.</span><span class="n">keys</span><span class="o">)</span>
+
+    <span class="c1">// generate MLlibRating data for ALS algorithm
+</span>    <span class="k">val</span> <span class="n">mllibRatings</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">MLlibRating</span><span class="o">]</span> <span class="k">=</span> <span class="n">genMLlibRating</span><span class="o">(</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span class="n">itemStringIntMap</span><span class="o">,</span>
+      <span class="n">data</span> <span class="k">=</span> <span class="n">data</span>
+    <span class="o">)</span>
+
+    <span class="c1">// seed for MLlib ALS
+</span>    <span class="k">val</span> <span class="n">seed</span> <span class="k">=</span> <span class="n">ap</span><span class="o">.</span><span class="n">seed</span><span class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span class="nc">System</span><span class="o">.</span><span class="n">nanoTime</span><span class="o">)</span>
+
+    <span class="k">val</span> <span class="n">m</span> <span class="k">=</span> <span class="nc">ALS</span><span class="o">.</span><span class="n">trainImplicit</span><span class="o">(</span>
+      <span class="n">ratings</span> <span class="k">=</span> <span class="n">mllibRatings</span><span class="o">,</span>
+      <span class="n">rank</span> <span class="k">=</span> <span class="n">ap</span><span class="o">.</span><span class="n">rank</span><span class="o">,</span>
+      <span class="n">iterations</span> <span class="k">=</span> <span class="n">ap</span><span class="o">.</span><span class="n">numIterations</span><span class="o">,</span>
+      <span class="n">lambda</span> <span class="k">=</span> <span class="n">ap</span><span class="o">.</span><span class="n">lambda</span><span class="o">,</span>
+      <span class="n">blocks</span> <span class="k">=</span> <span class="o">-</span><span class="mi">1</span><span class="o">,</span>
+      <span class="n">alpha</span> <span class="k">=</span> <span class="mf">1.0</span><span class="o">,</span>
+      <span class="n">seed</span> <span class="k">=</span> <span class="n">seed</span><span class="o">)</span>
+
+    <span class="o">...</span>
+
+    <span class="c1">// count how many times each item was bought, to recommend popular items as the default case
+</span>    <span class="k">val</span> <span class="n">popularCount</span> <span class="k">=</span> <span class="n">trainDefault</span><span class="o">(</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span class="n">itemStringIntMap</span><span class="o">,</span>
+      <span class="n">data</span> <span class="k">=</span> <span class="n">data</span>
+    <span class="o">)</span>
+    <span class="o">...</span>
+
+  <span class="o">}</span>
+
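+</pre></td></tr></tbody></table> </div> <p>The body of <code>trainDefault</code> is not shown here. As a rough sketch of the idea only, assuming the default model is a per-item buy count, the snippet below uses plain Scala collections instead of RDDs and an ordinary <code>Map</code> instead of <code>BiMap</code>; the helper name <code>countBuys</code> is hypothetical:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>case class BuyEvent(user: String, item: String, t: Long)
+
+// Hypothetical helper: count how many times each item index was bought.
+// Events whose item ID is missing from the index map are skipped.
+def countBuys(
+  itemStringIntMap: Map[String, Int],
+  buyEvents: Seq[BuyEvent]): Map[Int, Int] =
+  buyEvents
+    .flatMap(e =&gt; itemStringIntMap.get(e.item))
+    .groupBy(identity)
+    .map { case (index, hits) =&gt; (index, hits.size) }
+
+val popularCount = countBuys(
+  Map("i1" -&gt; 0, "i2" -&gt; 1),
+  Seq(
+    BuyEvent("u1", "i1", 1L),
+    BuyEvent("u2", "i1", 2L),
+    BuyEvent("u1", "i2", 3L),
+    BuyEvent("u3", "i9", 4L))) // "i9" is unknown and is ignored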
+</pre></td></tr></tbody></table> </div> <h4 id='working-with-spark-mllib&#39;s-als.trainimplicit(....)' class='header-anchors'>Working with Spark MLlib&#39;s ALS.trainImplicit(....)</h4><p>MLlib ALS does not support <code>String</code> user IDs and item IDs, so <code>ALS.trainImplicit</code> expects <code>Rating</code> objects with integer IDs only. First, you can rename MLlib&#39;s Integer-only <code>Rating</code> to <code>MLlibRating</code> for clarity:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1</pre></td><td class="code"><pre>import org.apache.spark.mllib.recommendation.<span class="o">{</span>Rating <span class="o">=</span>&gt; MLlibRating<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>In order to use MLlib&#39;s ALS algorithm, we need to convert the <code>viewEvents</code> into <code>MLlibRating</code> objects. There are two things we need to handle:</p> <ol> <li>Map the String user and item IDs of each <code>ViewEvent</code> to Integer IDs, as required by <code>MLlibRating</code>.</li> <li>A <code>ViewEvent</code> is an implicit event that does not have an explicit rating value. <code>ALS.trainImplicit()</code> supports implicit preference: a higher rating value in an <code>MLlibRating</code> means higher confidence that the user prefers the item. Hence we can count how many times the user has viewed the item to indicate the confidence level that the user may prefer the item.</li> </ol> <p>You create a bi-directional map with <code>BiMap.stringInt</code> which maps each String record to an Integer index.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2</pre></td><td class="code"><pre><span class="k">val</span> <span class="n">userStringIntMap</span> <span class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span class="n">stringInt</span><span class="o">(</span><span class="n">data</span><span class="o">.</span><span class="n">users</span><span class="o">.</span><span class="n">keys</span><span class="o">)</span>
+<span class="k">val</span> <span class="n">itemStringIntMap</span> <span class="k">=</span> <span class="nc">BiMap</span><span class="o">.</span><span class="n">stringInt</span><span class="o">(</span><span class="n">data</span><span class="o">.</span><span class="n">items</span><span class="o">.</span><span class="n">keys</span><span class="o">)</span>
+</pre></td></tr></tbody></table> </div> <p>Then convert the user and item String IDs in each ViewEvent to Int with these BiMaps. We use -1 as the default if a user or item String ID cannot be found in the BiMap, and filter out the events with an invalid user or item ID afterwards. After filtering, we use <code>reduceByKey()</code> to add up all values for the same key (uindex, iindex) and finally map each result to an <code>MLlibRating</code> object. You can find the code inside the function <code>genMLlibRating()</code>:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">genMLlibRating</span><span class="o">(</span>
+    <span class="n">userStringIntMap</span><span class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Int</span><span class="o">],</span>
+    <span class="n">itemStringIntMap</span><span class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Int</span><span class="o">],</span>
+    <span class="n">data</span><span class="k">:</span> <span class="kt">PreparedData</span><span class="o">)</span><span class="k">:</span> <span class="kt">RDD</span><span class="o">[</span><span class="kt">MLlibRating</span><span class="o">]</span> <span class="k">=</span> <span class="o">{</span>
+
+    <span class="k">val</span> <span class="n">mllibRatings</span> <span class="k">=</span> <span class="n">data</span><span class="o">.</span><span class="n">viewEvents</span>
+      <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="n">r</span> <span class="k">=&gt;</span>
+        <span class="c1">// Convert user and item String IDs to Int index for MLlib
+</span>        <span class="k">val</span> <span class="n">uindex</span> <span class="k">=</span> <span class="n">userStringIntMap</span><span class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span class="n">r</span><span class="o">.</span><span class="n">user</span><span class="o">,</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span>
+        <span class="k">val</span> <span class="n">iindex</span> <span class="k">=</span> <span class="n">itemStringIntMap</span><span class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span class="n">r</span><span class="o">.</span><span class="n">item</span><span class="o">,</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span>
+
+        <span class="k">if</span> <span class="o">(</span><span class="n">uindex</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span>
+          <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="o">(</span><span class="n">s</span><span class="s">"Couldn't convert nonexistent user ID ${r.user}"</span>
+            <span class="o">+</span> <span class="s">" to Int index."</span><span class="o">)</span>
+
+        <span class="k">if</span> <span class="o">(</span><span class="n">iindex</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span>
+          <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="o">(</span><span class="n">s</span><span class="s">"Couldn't convert nonexistent item ID ${r.item}"</span>
+            <span class="o">+</span> <span class="s">" to Int index."</span><span class="o">)</span>
+
+        <span class="o">((</span><span class="n">uindex</span><span class="o">,</span> <span class="n">iindex</span><span class="o">),</span> <span class="mi">1</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">filter</span> <span class="o">{</span> <span class="k">case</span> <span class="o">((</span><span class="n">u</span><span class="o">,</span> <span class="n">i</span><span class="o">),</span> <span class="n">v</span><span class="o">)</span> <span class="k">=&gt;</span>
+        <span class="c1">// keep events with valid user and item index
+</span>        <span class="o">(</span><span class="n">u</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span> <span class="o">&amp;&amp;</span> <span class="o">(</span><span class="n">i</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">reduceByKey</span><span class="o">(</span><span class="k">_</span> <span class="o">+</span> <span class="k">_</span><span class="o">)</span> <span class="c1">// aggregate all view events of same user-item pair
+</span>      <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="k">case</span> <span class="o">((</span><span class="n">u</span><span class="o">,</span> <span class="n">i</span><span class="o">),</span> <span class="n">v</span><span class="o">)</span> <span class="k">=&gt;</span>
+        <span class="c1">// MLlibRating requires integer index for user and item
+</span>        <span class="nc">MLlibRating</span><span class="o">(</span><span class="n">u</span><span class="o">,</span> <span class="n">i</span><span class="o">,</span> <span class="n">v</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="o">.</span><span class="n">cache</span><span class="o">()</span>
+
+    <span class="n">mllibRatings</span>
+  <span class="o">}</span>
+
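+</pre></td></tr></tbody></table> </div> <p>To see the aggregation without a Spark cluster, here is a minimal sketch of the same steps on plain Scala collections. It assumes an ordinary <code>Map[String, Int]</code> in place of <code>BiMap</code> and returns (user index, item index, view count) tuples in place of <code>MLlibRating</code>; the name <code>countViews</code> is hypothetical:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="code"><pre>case class ViewEvent(user: String, item: String, t: Long)
+
+// Hypothetical helper mirroring the steps of genMLlibRating() on plain collections.
+def countViews(
+  userIndex: Map[String, Int],
+  itemIndex: Map[String, Int],
+  viewEvents: Seq[ViewEvent]): Seq[(Int, Int, Double)] =
+  viewEvents
+    // Unknown IDs map to -1, as in the template.
+    .map(e =&gt; (userIndex.getOrElse(e.user, -1), itemIndex.getOrElse(e.item, -1)))
+    // Keep events with a valid user and item index.
+    .filter { case (u, i) =&gt; u != -1 &amp;&amp; i != -1 }
+    // Stand-in for reduceByKey(_ + _): count views per (user, item) pair.
+    .groupBy(identity)
+    .map { case ((u, i), hits) =&gt; (u, i, hits.size.toDouble) }
+    .toSeq
+    .sortBy { case (u, i, _) =&gt; (u, i) }
+
+val ratings = countViews(
+  Map("u1" -&gt; 0, "u2" -&gt; 1),
+  Map("i1" -&gt; 0, "i2" -&gt; 1),
+  Seq(
+    ViewEvent("u1", "i1", 1L),
+    ViewEvent("u1", "i1", 2L),
+    ViewEvent("u2", "i2", 3L),
+    ViewEvent("u9", "i1", 4L))) // "u9" is unknown and is filtered out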
+</pre></td></tr></tbody></table> </div> <div class="alert-message note"><p>You can customize this function if you want to convert other events to MLlibRating or need different ways to aggregate the events into MLlibRating.</p></div><p>In addition to <code>RDD[MLlibRating]</code>, <code>ALS.trainImplicit</code> takes the following parameters: <em>rank</em>, <em>iterations</em>, <em>lambda</em> and <em>seed</em>.</p><p>The values of these parameters are specified in <em>algorithms</em> of MyECommerceRecommendation/<strong><em>engine.json</em></strong>:</p><div class="highlight shell"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19</pre></td><td class="code"><pre><span class="o">{</span>
+  ...
+  <span class="s2">"algorithms"</span>: <span class="o">[</span>
+    <span class="o">{</span>
+      <span class="s2">"name"</span>: <span class="s2">"als"</span>,
+      <span class="s2">"params"</span>: <span class="o">{</span>
+        <span class="s2">"appName"</span>: <span class="s2">"MyApp1"</span>,
+        <span class="s2">"unseenOnly"</span>: <span class="nb">true</span>,
+        <span class="s2">"seenEvents"</span>: <span class="o">[</span><span class="s2">"buy"</span>, <span class="s2">"view"</span><span class="o">]</span>,
+        <span class="s2">"similarEvents"</span> : <span class="o">[</span><span class="s2">"view"</span><span class="o">]</span>,
+        <span class="s2">"rank"</span>: 10,
+        <span class="s2">"numIterations"</span> : 20,
+        <span class="s2">"lambda"</span>: 0.01,
+        <span class="s2">"seed"</span>: 3
+      <span class="o">}</span>
+    <span class="o">}</span>
+  <span class="o">]</span>
+  ...
+<span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>The parameters <code>appName</code>, <code>unseenOnly</code>, <code>seenEvents</code> and <code>similarEvents</code> are used during <code>predict()</code>, which will be explained later.</p><p>PredictionIO automatically loads these values into the constructor argument <code>ap</code>, which has a corresponding case class <code>ECommAlgorithmParams</code>.</p><p>The <code>seed</code> parameter is optional; the MLlib ALS algorithm uses it internally to generate random values. If <code>seed</code> is not specified, the current system time is used, so each training run may produce different results. Specify a fixed value for <code>seed</code> if you want deterministic results (for example, when you are testing).</p><p><code>ALS.trainImplicit()</code> returns a <code>MatrixFactorizationModel</code>, which contains two RDDs: userFeatures and productFeatures. They correspond to the user X latent features matrix and item X latent features matrix, respectively.</p><p>In addition to the latent feature vector, the item properties (e.g. categories) and popular count are also used during <code>predict()</code>. Hence, we also save these data along with the feature vector by joining them and then collecting the result as a local Map. Each item is represented by a <code>ProductModel</code> class, which consists of the <code>item</code> information, <code>features</code> calculated by ALS, and <code>count</code> returned by <code>trainDefault()</code>.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre>
+<span class="k">case</span> <span class="k">class</span> <span class="nc">ProductModel</span><span class="o">(</span>
+  <span class="n">item</span><span class="k">:</span> <span class="kt">Item</span><span class="o">,</span>
+  <span class="n">features</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Array</span><span class="o">[</span><span class="kt">Double</span><span class="o">]],</span> <span class="c1">// features by ALS
+</span>  <span class="n">count</span><span class="k">:</span> <span class="kt">Int</span> <span class="c1">// popular count for default score
+</span><span class="o">)</span>
+
+</pre></td></tr></tbody></table> </div> <div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25</pre></td><td class="code"><pre>    <span class="c1">// join item with the trained productFeatures
+</span>    <span class="k">val</span> <span class="n">productFeatures</span><span class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span class="kt">Int</span>, <span class="o">(</span><span class="kt">Item</span>, <span class="kt">Option</span><span class="o">[</span><span class="kt">Array</span><span class="o">[</span><span class="kt">Double</span><span class="o">]])]</span> <span class="k">=</span>
+      <span class="n">items</span><span class="o">.</span><span class="n">leftOuterJoin</span><span class="o">(</span><span class="n">m</span><span class="o">.</span><span class="n">productFeatures</span><span class="o">).</span><span class="n">collectAsMap</span><span class="o">.</span><span class="n">toMap</span>
+
+    <span class="o">...</span>
+
+    <span class="k">val</span> <span class="n">productModels</span><span class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span class="kt">Int</span>, <span class="kt">ProductModel</span><span class="o">]</span> <span class="k">=</span> <span class="n">productFeatures</span>
+      <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="k">case</span> <span class="o">(</span><span class="n">index</span><span class="o">,</span> <span class="o">(</span><span class="n">item</span><span class="o">,</span> <span class="n">features</span><span class="o">))</span> <span class="k">=&gt;</span>
+        <span class="k">val</span> <span class="n">pm</span> <span class="k">=</span> <span class="nc">ProductModel</span><span class="o">(</span>
+          <span class="n">item</span> <span class="k">=</span> <span class="n">item</span><span class="o">,</span>
+          <span class="n">features</span> <span class="k">=</span> <span class="n">features</span><span class="o">,</span>
+          <span class="c1">// NOTE: use getOrElse because popularCount may not contain all items.
+</span>          <span class="n">count</span> <span class="k">=</span> <span class="n">popularCount</span><span class="o">.</span><span class="n">getOrElse</span><span class="o">(</span><span class="n">index</span><span class="o">,</span> <span class="mi">0</span><span class="o">)</span>
+        <span class="o">)</span>
+        <span class="o">(</span><span class="n">index</span><span class="o">,</span> <span class="n">pm</span><span class="o">)</span>
+      <span class="o">}</span>
+
+    <span class="k">new</span> <span class="nc">ECommModel</span><span class="o">(</span>
+      <span class="n">rank</span> <span class="k">=</span> <span class="n">m</span><span class="o">.</span><span class="n">rank</span><span class="o">,</span>
+      <span class="n">userFeatures</span> <span class="k">=</span> <span class="n">userFeatures</span><span class="o">,</span>
+      <span class="n">productModels</span> <span class="k">=</span> <span class="n">productModels</span><span class="o">,</span>
+      <span class="n">userStringIntMap</span> <span class="k">=</span> <span class="n">userStringIntMap</span><span class="o">,</span>
+      <span class="n">itemStringIntMap</span> <span class="k">=</span> <span class="n">itemStringIntMap</span>
+    <span class="o">)</span>
+
+</pre></td></tr></tbody></table> </div> <p>Note that <code>leftOuterJoin</code> is used because the productFeatures returned by ALS may not contain all items.</p><p>The <code>ECommModel</code> is defined as follows:</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7</pre></td><td class="code"><pre><span class="k">class</span> <span class="nc">ECommModel</span><span class="o">(</span>
+  <span class="k">val</span> <span class="n">rank</span><span class="k">:</span> <span class="kt">Int</span><span class="o">,</span>
+  <span class="k">val</span> <span class="n">userFeatures</span><span class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span class="kt">Int</span>, <span class="kt">Array</span><span class="o">[</span><span class="kt">Double</span><span class="o">]],</span>
+  <span class="k">val</span> <span class="n">productModels</span><span class="k">:</span> <span class="kt">Map</span><span class="o">[</span><span class="kt">Int</span>, <span class="kt">ProductModel</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">userStringIntMap</span><span class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Int</span><span class="o">],</span>
+  <span class="k">val</span> <span class="n">itemStringIntMap</span><span class="k">:</span> <span class="kt">BiMap</span><span class="o">[</span><span class="kt">String</span>, <span class="kt">Int</span><span class="o">]</span>
+<span class="o">)</span> <span class="k">extends</span> <span class="nc">Serializable</span>  <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+</pre></td></tr></tbody></table> </div> <p>PredictionIO will automatically store the returned model after training, i.e. <code>ECommModel</code> in this example.</p><h3 id='predict(...)' class='header-anchors'>predict(...)</h3><p><code>predict</code> is called when you send a JSON query to <a href="http://localhost:8000/queries.json">http://localhost:8000/queries.json</a>. PredictionIO converts the query, such as <code>{ &quot;user&quot;: &quot;u1&quot;, &quot;num&quot;: 4 }</code>, to the <code>Query</code> class you defined previously.</p><p>We can use the userFeatures and the product features (held in productModels) stored in ECommModel to calculate the scores of items for the user.</p><p>This template also supports additional business logic features, such as filtering items by categories, recommending items in the white list, excluding items in the black list, recommending unseen items only, and excluding unavailable items defined in a constraint event.</p><p>The <code>predict()</code> function does the following:</p> <ol> <li>Convert the items in the query&#39;s whiteList from string IDs to integer indices.</li> <li>Get the list of items seen by the user (defined by the <code>seenEvents</code> parameter).</li> <li>Get the latest unavailableItems, which is used to exclude unavailable items for all users.</li> <li>Combine the query&#39;s blackList, seenItems, and unavailableItems into a final black list of items to be excluded from recommendation.</li> <li>Get the user feature vector from the ECommModel.</li> <li>If there is a feature vector for the user, recommend the top N items based on the user feature and product features.</li> <li>If there is no feature vector for the user, use the recent items acted on by the user (defined by the <code>similarEvents</code> parameter) to recommend similar items.</li> <li>If there are no recent <code>similarEvents</code> available for the user, popular items are recommended instead (added in template version 0.4.0).</li> </ol> <p>Only items which satisfy the <code>isCandidate()</code> condition will be recommended. By default, an item can be recommended if:</p> <ul> <li>it belongs to one of the categories defined in the query.</li> <li>it is one of the white list items if a white list is defined.</li> <li>it is not in the black list.</li> </ul> <div class="alert-message info"><p>You can easily modify the <code>isCandidate()</code> check or related logic if you have different requirements or conditions for determining whether an item is a candidate to be recommended.</p></div><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
+14
+15
+16
+17
+18
+19
+20
+21
+22
+23
+24
+25
+26
+27
+28
+29
+30
+31
+32
+33
+34
+35
+36
+37
+38
+39
+40
+41
+42
+43
+44
+45
+46
+47
+48
+49
+50
+51
+52
+53
+54
+55
+56
+57
+58
+59
+60
+61
+62
+63
+64
+65
+66
+67
+68
+69
+70</pre></td><td class="code"><pre>
+  <span class="k">def</span> <span class="n">predict</span><span class="o">(</span><span class="n">model</span><span class="k">:</span> <span class="kt">ECommModel</span><span class="o">,</span> <span class="n">query</span><span class="k">:</span> <span class="kt">Query</span><span class="o">)</span><span class="k">:</span> <span class="kt">PredictedResult</span> <span class="o">=</span> <span class="o">{</span>
+
+    <span class="k">val</span> <span class="n">userFeatures</span> <span class="k">=</span> <span class="n">model</span><span class="o">.</span><span class="n">userFeatures</span>
+    <span class="k">val</span> <span class="n">productModels</span> <span class="k">=</span> <span class="n">model</span><span class="o">.</span><span class="n">productModels</span>
+
+    <span class="c1">// convert whiteList's string ID to integer index
+</span>    <span class="k">val</span> <span class="n">whiteList</span><span class="k">:</span> <span class="kt">Option</span><span class="o">[</span><span class="kt">Set</span><span class="o">[</span><span class="kt">Int</span><span class="o">]]</span> <span class="k">=</span> <span class="n">query</span><span class="o">.</span><span class="n">whiteList</span><span class="o">.</span><span class="n">map</span><span class="o">(</span> <span class="n">set</span> <span class="k">=&gt;</span>
+      <span class="n">set</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="n">model</span><span class="o">.</span><span class="n">itemStringIntMap</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="k">_</span><span class="o">)).</span><span class="n">flatten</span>
+    <span class="o">)</span>
+
+    <span class="c1">// generate final blackList based on additional constraints
+</span>    <span class="k">val</span> <span class="n">finalBlackList</span><span class="k">:</span> <span class="kt">Set</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="n">genBlackList</span><span class="o">(</span><span class="n">query</span> <span class="k">=</span> <span class="n">query</span><span class="o">)</span>
+      <span class="c1">// convert seen Items list from String ID to integer Index
+</span>      <span class="o">.</span><span class="n">flatMap</span><span class="o">(</span><span class="n">x</span> <span class="k">=&gt;</span> <span class="n">model</span><span class="o">.</span><span class="n">itemStringIntMap</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="n">x</span><span class="o">))</span>
+
+    <span class="c1">// look up user feature from model
+</span>    <span class="k">val</span> <span class="n">userFeature</span> <span class="k">=</span>
+      <span class="n">model</span><span class="o">.</span><span class="n">userStringIntMap</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="n">query</span><span class="o">.</span><span class="n">user</span><span class="o">).</span><span class="n">map</span> <span class="o">{</span> <span class="n">userIndex</span> <span class="k">=&gt;</span>
+        <span class="n">userFeatures</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="n">userIndex</span><span class="o">)</span>
+      <span class="o">}</span>
+      <span class="c1">// flatten Option[Option[Array[Double]]] to Option[Array[Double]]
+</span>      <span class="o">.</span><span class="n">flatten</span>
+
+    <span class="k">val</span> <span class="n">topScores</span><span class="k">:</span> <span class="kt">Array</span><span class="o">[(</span><span class="kt">Int</span>, <span class="kt">Double</span><span class="o">)]</span> <span class="k">=</span> <span class="k">if</span> <span class="o">(</span><span class="n">userFeature</span><span class="o">.</span><span class="n">isDefined</span><span class="o">)</span> <span class="o">{</span>
+      <span class="c1">// the user has a feature vector
+</span>      <span class="n">predictKnownUser</span><span class="o">(</span>
+        <span class="n">userFeature</span> <span class="k">=</span> <span class="n">userFeature</span><span class="o">.</span><span class="n">get</span><span class="o">,</span>
+        <span class="n">productModels</span> <span class="k">=</span> <span class="n">productModels</span><span class="o">,</span>
+        <span class="n">query</span> <span class="k">=</span> <span class="n">query</span><span class="o">,</span>
+        <span class="n">whiteList</span> <span class="k">=</span> <span class="n">whiteList</span><span class="o">,</span>
+        <span class="n">blackList</span> <span class="k">=</span> <span class="n">finalBlackList</span>
+      <span class="o">)</span>
+    <span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
+      <span class="c1">// the user doesn't have a feature vector.
+</span>      <span class="c1">// For example, a new user was created after the model was trained.
+</span>      <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="o">(</span><span class="n">s</span><span class="s">"No userFeature found for user ${query.user}."</span><span class="o">)</span>
+
+      <span class="c1">// check if the user has recent events on some items
+</span>      <span class="k">val</span> <span class="n">recentItems</span><span class="k">:</span> <span class="kt">Set</span><span class="o">[</span><span class="kt">String</span><span class="o">]</span> <span class="k">=</span> <span class="n">getRecentItems</span><span class="o">(</span><span class="n">query</span><span class="o">)</span>
+      <span class="k">val</span> <span class="n">recentList</span><span class="k">:</span> <span class="kt">Set</span><span class="o">[</span><span class="kt">Int</span><span class="o">]</span> <span class="k">=</span> <span class="n">recentItems</span><span class="o">.</span><span class="n">flatMap</span> <span class="o">(</span><span class="n">x</span> <span class="k">=&gt;</span>
+        <span class="n">model</span><span class="o">.</span><span class="n">itemStringIntMap</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="n">x</span><span class="o">))</span>
+
+      <span class="k">val</span> <span class="n">recentFeatures</span><span class="k">:</span> <span class="kt">Vector</span><span class="o">[</span><span class="kt">Array</span><span class="o">[</span><span class="kt">Double</span><span class="o">]]</span> <span class="k">=</span> <span class="n">recentList</span><span class="o">.</span><span class="n">toVector</span>
+        <span class="c1">// productModels may not contain the requested item
+</span>        <span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="n">i</span> <span class="k">=&gt;</span>
+          <span class="n">productModels</span><span class="o">.</span><span class="n">get</span><span class="o">(</span><span class="n">i</span><span class="o">).</span><span class="n">flatMap</span> <span class="o">{</span> <span class="n">pm</span> <span class="k">=&gt;</span> <span class="n">pm</span><span class="o">.</span><span class="n">features</span> <span class="o">}</span>
+        <span class="o">}.</span><span class="n">flatten</span>
+
+      <span class="k">if</span> <span class="o">(</span><span class="n">recentFeatures</span><span class="o">.</span><span class="n">isEmpty</span><span class="o">)</span> <span class="o">{</span>
+        <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="o">(</span><span class="n">s</span><span class="s">"No features vector for recent items ${recentItems}."</span><span class="o">)</span>
+        <span class="n">predictDefault</span><span class="o">(</span>
+          <span class="n">productModels</span> <span class="k">=</span> <span class="n">productModels</span><span class="o">,</span>
+          <span class="n">query</span> <span class="k">=</span> <span class="n">query</span><span class="o">,</span>
+          <span class="n">whiteList</span> <span class="k">=</span> <span class="n">whiteList</span><span class="o">,</span>
+          <span class="n">blackList</span> <span class="k">=</span> <span class="n">finalBlackList</span>
+        <span class="o">)</span>
+      <span class="o">}</span> <span class="k">else</span> <span class="o">{</span>
+        <span class="n">predictSimilar</span><span class="o">(</span>
+          <span class="n">recentFeatures</span> <span class="k">=</span> <span class="n">recentFeatures</span><span class="o">,</span>
+          <span class="n">productModels</span> <span class="k">=</span> <span class="n">productModels</span><span class="o">,</span>
+          <span class="n">query</span> <span class="k">=</span> <span class="n">query</span><span class="o">,</span>
+          <span class="n">whiteList</span> <span class="k">=</span> <span class="n">whiteList</span><span class="o">,</span>
+          <span class="n">blackList</span> <span class="k">=</span> <span class="n">finalBlackList</span>
+        <span class="o">)</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+
+    <span class="o">...</span>
+  <span class="o">}</span>
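+
+// Hypothetical sketch, not the template's code: one way to score items for a
+// known user is the dot product of the user and item feature vectors, keeping
+// the top N. Assumes both vectors have the same length (the ALS rank); the
+// names ScoreSketch, dotProduct and topN are illustrative.
+object ScoreSketch {
+  def dotProduct(a: Array[Double], b: Array[Double]): Double =
+    a.zip(b).map { case (x, y) =&gt; x * y }.sum
+
+  // itemFeatures: item index to optional feature vector (None if ALS has none)
+  def topN(
+    userFeature: Array[Double],
+    itemFeatures: Map[Int, Option[Array[Double]]],
+    blackList: Set[Int],
+    num: Int): Array[(Int, Double)] =
+    itemFeatures.toArray
+      // skip items without features and items in the black list
+      .collect { case (i, Some(f)) if !blackList.contains(i) =&gt;
+        (i, dotProduct(userFeature, f))
+      }
+      .sortBy { case (_, s) =&gt; -s } // highest score first
+      .take(num)
+}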
+</pre></td></tr></tbody></table> </div> <p>Note that the item IDs in the top N results are <code>Int</code> indices. They are mapped back to <code>String</code> IDs with <code>itemIntStringMap</code> before they are returned.</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre>  <span class="k">val</span> <span class="n">itemScores</span> <span class="k">=</span> <span class="n">topScores</span><span class="o">.</span><span class="n">map</span> <span class="o">{</span> <span class="k">case</span> <span class="o">(</span><span class="n">i</span><span class="o">,</span> <span class="n">s</span><span class="o">)</span> <span class="k">=&gt;</span>
+    <span class="k">new</span> <span class="nc">ItemScore</span><span class="o">(</span>
+      <span class="c1">// convert item int index back to string ID
+</span>      <span class="n">item</span> <span class="k">=</span> <span class="n">model</span><span class="o">.</span><span class="n">itemIntStringMap</span><span class="o">(</span><span class="n">i</span><span class="o">),</span>
+      <span class="n">score</span> <span class="k">=</span> <span class="n">s</span>
+    <span class="o">)</span>
+  <span class="o">}</span>
+
+  <span class="k">new</span> <span class="nc">PredictedResult</span><span class="o">(</span><span class="n">itemScores</span><span class="o">)</span>
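+
+// Hypothetical sketch, not the template's code: BiMap is PredictionIO's
+// two-way map; a plain Map shows the same index-to-ID lookup idea.
+object InverseMapSketch {
+  val itemStringIntMap: Map[String, Int] = Map("i1" -&gt; 0, "i2" -&gt; 1)
+  // swap each (ID, index) pair to look up string IDs by integer index
+  val itemIntStringMap: Map[Int, String] = itemStringIntMap.map(_.swap)
+}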
+</pre></td></tr></tbody></table> </div> <p>PredictionIO passes the returned <code>PredictedResult</code> object to <em>Serving</em>.</p><h2 id='serving' class='header-anchors'>Serving</h2><p>The <code>serve</code> method of class <code>Serving</code> processes the predicted result. It is also responsible for combining multiple predicted results into one if you have more than one predictive model. <em>Serving</em> then returns the final predicted result. PredictionIO will convert it to a JSON response automatically.</p><p>In MyECommerceRecommendation/src/main/scala/<strong><em>Serving.scala</em></strong>,</p><div class="highlight scala"><table style="border-spacing: 0"><tbody><tr><td class="gutter gl" style="text-align: right"><pre class="lineno">1
+2
+3
+4
+5
+6
+7
+8
+9</pre></td><td class="code"><pre><span class="k">class</span> <span class="nc">Serving</span>
+  <span class="k">extends</span> <span class="nc">LServing</span><span class="o">[</span><span class="kt">Query</span>, <span class="kt">PredictedResult</span><span class="o">]</span> <span class="o">{</span>
+
+  <span class="k">override</span>
+  <span class="k">def</span> <span class="n">serve</span><span class="o">(</span><span class="n">query</span><span class="k">:</span> <span class="kt">Query</span><span class="o">,</span>
+    <span class="n">predictedResults</span><span class="k">:</span> <span class="kt">Seq</span><span class="o">[</span><span class="kt">PredictedResult</span><span class="o">])</span><span class="k">:</span> <span class="kt">PredictedResult</span> <span class="o">=</span> <span class="o">{</span>
+    <span class="n">predictedResults</span><span class="o">.</span><span class="n">head</span>
+  <span class="o">}</span>
+<span class="o">}</span>
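+
+// Hypothetical sketch, not part of the template: if the engine had more than
+// one algorithm, serve() could merge the results instead of taking the head.
+// The nested case classes are minimal stand-ins that assume the field names
+// used earlier on this page.
+object MergeSketch {
+  case class ItemScore(item: String, score: Double)
+  case class PredictedResult(itemScores: Array[ItemScore])
+
+  // average each item's score over all models (a model that did not return
+  // the item counts as 0), then keep the top `num` items
+  def merge(results: Seq[PredictedResult], num: Int): PredictedResult = {
+    val merged = results
+      .flatMap(_.itemScores)
+      .groupBy(_.item)
+      .map { case (item, scores) =&gt;
+        ItemScore(item, scores.map(_.score).sum / results.size)
+      }
+      .toArray
+      .sortBy(-_.score)
+      .take(num)
+    PredictedResult(merged)
+  }
+}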
+</pre></td></tr></tbody></table> </div> <p>When you send a JSON query to <a href="http://localhost:8000/queries.json">http://localhost:8000/queries.json</a>, <code>PredictedResult</code> from all models will be passed to <code>serve</code> as a sequence, i.e. <code>Seq[PredictedResult]</code>.</p> <blockquote> <p>An engine can train multiple models if you specify more than one Algorithm component in <code>object RecommendationEngine</code> inside <strong><em>Engine.scala</em></strong>. Since only one <code>ECommAlgorithm</code> is implemented by default, this <code>Seq</code> contains one element.</p></blockquote> </div></div></div></div><footer><div class="container"><div class="seperator"></div><div class="row"><div class="col-md-6 col-xs-6 footer-link-column"><div class="footer-link-column-row"><h4>Community</h4><ul><li><a href="//docs.prediction.io/install/" target="blank">Download</a></li><li><a href="//docs.prediction.io/" target="blank">Docs</a></li><li><a href="//github.com/
 apache/incubator-predictionio" target="blank">GitHub</a></li><li><a href="mailto:user-subscribe@predictionio

<TRUNCATED>
http://git-wip-us.apache.org/repos/asf/incubator-predictionio-site/blob/02715c51/templates/ecommercerecommendation/dase/index.html.gz
----------------------------------------------------------------------
diff --git a/templates/ecommercerecommendation/dase/index.html.gz b/templates/ecommercerecommendation/dase/index.html.gz
new file mode 100644
index 0000000..1288aee
Binary files /dev/null and b/templates/ecommercerecommendation/dase/index.html.gz differ