Posted to dev@bigtop.apache.org by "jay vyas (JIRA)" <ji...@apache.org> on 2014/04/16 12:18:14 UTC
[jira] [Created] (BIGTOP-1271) BigPetStore: Embed user "types" into
the generated data.
jay vyas created BIGTOP-1271:
--------------------------------
Summary: BigPetStore: Embed user "types" into the generated data.
Key: BIGTOP-1271
URL: https://issues.apache.org/jira/browse/BIGTOP-1271
Project: Bigtop
Issue Type: New Feature
Components: Blueprints
Reporter: jay vyas
The data set generation in BigPetStore results in data with temporal and geographic patterns; however, there are no "personal" biases in the data.
We need to add personal biases into the data so that the Mahout recommender is capable of teasing out statistically significant product clusters for users.
A simple implementation:
{noformat}
given 2 "types" of customers (i.e. dog people, cat people)
t = hash(customer_name) % 2
if (t == 0)
    customer buys only dog products
if (t == 1)
    customer buys only cat products
{noformat}
This approach will scale easily and consistently embed a profile into each person's purchases. Obviously, using some OO magic we can also create customers who buy both cat and dog products, but the basic approach remains the same (hash code -> customer type -> product biases).
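
For illustration, the hash code -> customer type -> product biases chain could be sketched in Java roughly as follows. This is only a sketch under stated assumptions, not the BigPetStore implementation: the class name CustomerProfiles, the CustomerType enum, and the product names are all hypothetical, and Math.floorMod assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from the BigPetStore code base.
public class CustomerProfiles {

    /** Assumed customer types; the product lists are made up for the example. */
    public enum CustomerType {
        DOG_PERSON(Arrays.asList("dog food", "leash")),
        CAT_PERSON(Arrays.asList("cat food", "litter"));

        public final List<String> products;

        CustomerType(List<String> products) {
            this.products = products;
        }
    }

    /** Maps a customer name to a type: hash code -> customer type. */
    public static CustomerType typeOf(String customerName) {
        CustomerType[] types = CustomerType.values();
        // String.hashCode() can be negative, and % would then yield a negative
        // index; floorMod always returns a value in [0, types.length).
        int t = Math.floorMod(customerName.hashCode(), types.length);
        return types[t];
    }

    public static void main(String[] args) {
        for (String name : new String[] {"alice", "bob", "carol"}) {
            CustomerType type = typeOf(name);
            // customer type -> product biases
            System.out.println(name + " -> " + type + " buys only " + type.products);
        }
    }
}
```

Because String.hashCode() is specified by the Java language, the same name always maps to the same type across runs and machines, which is what makes the embedded profile consistent. Adding a third type (for example, customers who buy both kinds of product) would only require another enum constant, since the modulus follows the number of types.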
--
This message was sent by Atlassian JIRA
(v6.2#6252)