Posted to dev@bigtop.apache.org by "jay vyas (JIRA)" <ji...@apache.org> on 2014/04/16 12:18:14 UTC

[jira] [Created] (BIGTOP-1271) BigPetStore: Embed user "types" into the generated data.

jay vyas created BIGTOP-1271:
--------------------------------

             Summary: BigPetStore: Embed user "types" into the generated data.
                 Key: BIGTOP-1271
                 URL: https://issues.apache.org/jira/browse/BIGTOP-1271
             Project: Bigtop
          Issue Type: New Feature
          Components: Blueprints
            Reporter: jay vyas


The data set generation in BigPetStore produces data with temporal and geographic patterns; however, there are no "personal" biases in the data.

We need to add personal biases into the data so that the Mahout recommender is capable of teasing out statistically significant product clusters for users. 

A simple implementation:  

{noformat} 
given 2 "types" of customers (e.g. dog people, cat people)
t = hash(customer_name) % 2
if (t == 0)
   customer buys only dog products
else if (t == 1)
   customer buys only cat products
{noformat}

This approach will scale easily and consistently embed a profile into each person's purchases.  Obviously, using some OO magic we can also create customers who buy both cat and dog products, but the basic approach remains the same (hash code -> customer type -> product biases).
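
As a rough illustration of the hash code -> customer type -> product bias chain, here is a minimal Java sketch. The class name CustomerTypes, the enum CustomerType, and the methods typeOf and buys are hypothetical names chosen for this example and are not part of the existing BigPetStore code. It assumes the customer name is a String and uses String.hashCode(), whose value is fixed by the Java specification, so the same name always maps to the same type. Math.floorMod is used instead of a plain % because hashCode() can be negative, in which case % would yield a negative index.

```java
// Hypothetical sketch: none of these names exist in BigPetStore today.
final class CustomerTypes {

    // Each customer type carries the single product category it is biased toward.
    enum CustomerType {
        DOG_PERSON("dog"),
        CAT_PERSON("cat");

        final String category;

        CustomerType(String category) {
            this.category = category;
        }
    }

    // hash code -> customer type.
    // floorMod keeps the index in [0, n) even when hashCode() is negative.
    static CustomerType typeOf(String customerName) {
        CustomerType[] types = CustomerType.values();
        int index = Math.floorMod(customerName.hashCode(), types.length);
        return types[index];
    }

    // customer type -> product bias: a customer buys only products
    // in the category that matches his or her type.
    static boolean buys(String customerName, String productCategory) {
        return typeOf(customerName).category.equals(productCategory);
    }
}
```

With this sketch, a data generator could call CustomerTypes.buys(name, category) as a filter before emitting a purchase record. Supporting customers who buy both kinds of products would mean adding another enum constant and letting each type carry a set of categories rather than a single one.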



