Posted to mapreduce-user@hadoop.apache.org by Douglass Davis <do...@gmail.com> on 2013/02/18 00:21:55 UTC

product recommendations engine

Hello,

I don't have any prior experience with Hadoop.  I am also not a statistics
expert.  I am a software engineer; however, after looking at the docs,
Hadoop still seems pretty intimidating to set up.

I am interested in doing product recommendations.  However, I want to store
many things about user behavior, for example whether they click on a link
in an email, how they rate a product, whether they buy it, etc.  Then I
would like to come up with similar items that a user may like.  I have seen
an example just based on user ratings, but would like to add much more data.
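
As a rough illustration of folding several behavior signals into one
preference value per user and item, here is a minimal Python sketch.  The
event names and weights are made up for illustration; they are assumptions,
not anything defined by Hadoop or by a particular recommender library.

```python
from collections import defaultdict

# Hypothetical weights per event type; tune these for your own data.
EVENT_WEIGHTS = {"email_click": 1.0, "purchase": 5.0}


def preference_scores(events):
    """Fold (user, item, event_type, value) tuples into one score per (user, item).

    For "rating" events the value (e.g. 1-5 stars) is added directly;
    other event types contribute their fixed weight from EVENT_WEIGHTS.
    Unknown event types contribute nothing.
    """
    scores = defaultdict(float)
    for user, item, event_type, value in events:
        if event_type == "rating":
            scores[(user, item)] += float(value)
        else:
            scores[(user, item)] += EVENT_WEIGHTS.get(event_type, 0.0)
    return dict(scores)


events = [
    ("u1", "itemA", "email_click", None),
    ("u1", "itemA", "rating", 4),
    ("u1", "itemA", "purchase", None),
    ("u2", "itemA", "email_click", None),
]
print(preference_scores(events))
```

The resulting (user, item, score) triples have the same shape as the
ratings-only input, so they could in principle be fed to a recommender in
place of plain ratings.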

Also, I think clustering could be used to recommend items based on
similar descriptions, attributes, and keywords.

Or, I could use a combination of the two approaches.

Another question: I wonder whether Hadoop takes into account the passage of
time.  For example, a user may rate something high, then change their
rating a couple months later.
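
One common way to account for time, independent of Hadoop itself, is to give
older ratings less weight before they reach the recommender.  The sketch
below is only an assumption about how that could look: the 90-day half-life
and the function names are hypothetical.

```python
def decayed_weight(age_days, half_life_days=90.0):
    """Weight that halves every `half_life_days` (exponential decay)."""
    return 0.5 ** (age_days / half_life_days)


def current_preference(ratings, now_day, half_life_days=90.0):
    """Decay-weighted average of (day, rating) pairs for one user and item.

    Recent ratings dominate, but older ones still contribute a little.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for day, rating in ratings:
        w = decayed_weight(now_day - day, half_life_days)
        weighted_sum += w * rating
        weight_total += w
    return weighted_sum / weight_total


# A user rated an item 5 stars, then 2 stars about three months later.
history = [(0, 5.0), (90, 2.0)]
print(current_preference(history, now_day=90))
```

A simpler alternative is to keep only the most recent rating per user and
item; the weighted average is useful when the older signal should not be
discarded entirely.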

Lastly, my site is based on PHP.  I need to be able to integrate that with
Hadoop.

How feasible is this approach?  I saw a clustering example, and a
recommendation example based on user ratings.  Is there any other advice, or
are there docs or examples, that you could point me to that deal with any of
these issues?

Thanks,
Doug

Re: product recommendations engine

Posted by Sofia Georgiakaki <ge...@yahoo.com>.
Good morning,

Myrrix provides a Recommender that implements a specific recommendation algorithm based on matrix factorization, which is efficient in most cases. However, depending on your data and access pattern, it may be better to use Mahout as well, as it provides many different Recommenders. That way you can evaluate each implementation and use the recommender that best suits your dataset at a given time.
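
To make "evaluate each implementation" concrete, here is a minimal, library-independent Python sketch that compares two toy predictors by mean absolute error on held-out ratings. It does not use the Mahout or Myrrix APIs; the predictor names and the tiny dataset are hypothetical.

```python
def mean_absolute_error(predict, test_ratings):
    """Average |predicted - actual| over held-out (user, item, rating) triples."""
    errors = [abs(predict(user, item) - rating) for user, item, rating in test_ratings]
    return sum(errors) / len(errors)


def make_global_mean(train):
    """Baseline: always predict the mean of all training ratings."""
    mean = sum(rating for _, _, rating in train) / len(train)
    return lambda user, item: mean


def make_item_mean(train):
    """Predict each item's mean training rating, falling back to the global mean."""
    fallback = make_global_mean(train)
    sums = {}
    for _, item, rating in train:
        total, count = sums.get(item, (0.0, 0))
        sums[item] = (total + rating, count + 1)

    def predict(user, item):
        if item in sums:
            total, count = sums[item]
            return total / count
        return fallback(user, item)

    return predict


train = [("u1", "a", 5.0), ("u2", "a", 4.0), ("u1", "b", 1.0), ("u2", "b", 2.0)]
test = [("u3", "a", 5.0), ("u3", "b", 1.0)]

candidates = {
    "global_mean": make_global_mean(train),
    "item_mean": make_item_mean(train),
}
results = {name: mean_absolute_error(p, test) for name, p in candidates.items()}
best = min(results, key=results.get)
print(results, "best:", best)
```

The same loop works for any predictor exposing a `predict(user, item)` function, so real recommenders could be wrapped and compared on the same held-out split.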

Regards,
Sofia





>________________________________
> From: Manoj Babu <ma...@gmail.com>
>To: user@hadoop.apache.org 
>Sent: Tuesday, February 19, 2013 7:03 AM
>Subject: Re: product recommendations engine
> 
>
>Hi Sofia,
>
>I am just hearing about the Myrrix project; it looks interesting. Thanks for sharing the information.
>
>
>Cheers!
>Manoj.
>
>
>On Tue, Feb 19, 2013 at 12:45 AM, Douglass Davis <do...@gmail.com> wrote:
>
>Ok thanks.  Myrrix looks like it has much of the set-up work done so I am taking a closer look at that.
>>
>>
>>
>>
>>On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki <ge...@yahoo.com> wrote:
>>
>>Hello Douglass,
>>>
>>>you could take a look at the Mahout and Myrrix projects. These are two projects that provide implementations of recommendation & machine learning algorithms. There are MapReduce implementations as well, to support massive datasets.
>>>In addition, these systems provide client APIs/various integration points, so it's easy to integrate them into your system.
>>>
>>>Regards,
>>>Sofia
>>>
>>>
>>>
>>>
>>>
>>>
>>>>________________________________
>>>> From: Douglass Davis <do...@gmail.com>
>>>>To: user@hadoop.apache.org 
>>>>Sent: Monday, February 18, 2013 1:21 AM
>>>>Subject: product recommendations engine
>>>> 
>>>>
>>>>
>>>>Hello,
>>>>
>>>>I don't have any prior experience with Hadoop.  I am also not a statistics expert.  I am a software engineer, however, after looking at the docs, Hadoop still seems pretty intimidating to set up.  
>>>>
>>>>I am interested in doing product recommendations.  However, I want to store many things about user behavior, for example whether they click on a link in an email, how they rate a product, whether they buy it, etc.  Then I would like to come up with similar items that a user may like.  I have seen an example just based on user ratings, but would like to add much more data.
>>>>
>>>>Also, I think the clustering could be used in terms of recommending based on similar descriptions, attributes, and keywords. 
>>>>
>>>>Or, I could use a combination of the two approaches.
>>>>
>>>>Another question, I wonder if Hadoop takes into account the passage of time.  For example, a user may rate something high, then change their rating a couple months later.
>>>>
>>>>Lastly, my site is based on PHP.  I need to be able to integrate that with Hadoop.
>>>>
>>>>How feasible is this approach?  I saw a clustering example, and a recommendation example based on user ratings.  Are there any other advice, docs, or examples that you could point me to that deals with any of these issues?
>>>>
>>>>Thanks,
>>>>Doug
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>
>
>

Re: product recommendations engine

Posted by Manoj Babu <ma...@gmail.com>.
Hi Sofia,

I am just hearing about the Myrrix project; it looks interesting. Thanks for
sharing the information.

Cheers!
Manoj.


On Tue, Feb 19, 2013 at 12:45 AM, Douglass Davis
<do...@gmail.com>wrote:

> Ok thanks.  Myrrix looks like it has much of the set-up work done so I am
> taking a closer look at that.
>
>
>
> On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki <geosofie_tuc@yahoo.com
> > wrote:
>
>> Hello Douglass,
>>
>> you could take a look at Mahout and Myrrix projects. These are two
>> projects that provide implementations of recommendation & machine
>> learning algorithms. There are MapReduce implementations as well, to
>> support massive datasets.
>> In addition, these systems provide client APIs/various integration points,
>> so it's easy to integrate them into your system.
>>
>> Regards,
>> Sofia
>>
>>
>>   ------------------------------
>> *From:* Douglass Davis <do...@gmail.com>
>> *To:* user@hadoop.apache.org
>> *Sent:* Monday, February 18, 2013 1:21 AM
>> *Subject:* product recommendations engine
>>
>> Hello,
>>
>> I don't have any prior experience with Hadoop.  I am also not a
>> statistics expert.  I am a software engineer, however, after looking at the
>> docs, Hadoop still seems pretty intimidating to set up.
>>
>> I am interested in doing product recommendations.  However, I want to
>> store many things about user behavior, for example whether they click on a
>> link in an email, how they rate a product, whether they buy it, etc.  Then
>> I would like to come up with similar items that a user may like.  I have
>> seen an example just based on user ratings, but would like to add much more
>> data.
>>
>> Also, I think the clustering could be used in terms of recommending based
>> on similar descriptions, attributes, and keywords.
>>
>> Or, I could use a combination of the two approaches.
>>
>> Another question, I wonder if Hadoop takes into account the passage of
>> time.  For example, a user may rate something high, then change their
>> rating a couple months later.
>>
>> Lastly, my site is based on PHP.  I need to be able to integrate that
>> with Hadoop.
>>
>> How feasible is this approach?  I saw a clustering example, and a
>> recommendation example based on user ratings.  Are there any other advice,
>> docs, or examples that you could point me to that deals with any of these
>> issues?
>>
>> Thanks,
>> Doug
>>
>>
>>
>>
>>
>>
>>
>

Re: product recommendations engine

Posted by Douglass Davis <do...@gmail.com>.
Ok thanks.  Myrrix looks like it has much of the set-up work done so I am
taking a closer look at that.


On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki
<ge...@yahoo.com>wrote:

> Hello Douglass,
>
> you could take a look at Mahout and Myrrix projects. These are two
> projects that provide implementations of recommendation & machine
> learning algorithms. There are MapReduce implementations as well, to
> support massive datasets.
> In addition, these systems provide client APIs/various integration points,
> so it's easy to integrate them into your system.
>
> Regards,
> Sofia
>
>
>   ------------------------------
> *From:* Douglass Davis <do...@gmail.com>
> *To:* user@hadoop.apache.org
> *Sent:* Monday, February 18, 2013 1:21 AM
> *Subject:* product recommendations engine
>
> Hello,
>
> I don't have any prior experience with Hadoop.  I am also not a statistics
> expert.  I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
>
> I am interested in doing product recommendations.  However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc.  Then
> I would like to come up with similar items that a user may like.  I have
> seen an example just based on user ratings, but would like to add much more
> data.
>
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
>
> Or, I could use a combination of the two approaches.
>
> Another question, I wonder if Hadoop takes into account the passage of
> time.  For example, a user may rate something high, then change their
> rating a couple months later.
>
> Lastly, my site is based on PHP.  I need to be able to integrate that with
> Hadoop.
>
> How feasible is this approach?  I saw a clustering example, and a
> recommendation example based on user ratings.  Are there any other advice,
> docs, or examples that you could point me to that deals with any of these
> issues?
>
> Thanks,
> Doug
>
>
>
>
>
>
>

Re: product recommendations engine

Posted by Douglass Davis <do...@gmail.com>.
Ok thanks.  Myrrix looks like it has much of the set-up work done so I am
taking a closer look at that.


On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki
<ge...@yahoo.com>wrote:

> Hello Douglass,
>
> you could take a look at Mahout and Myrrix projects. These are two
> projects that provide implementations of recommendation & machine
> learning algorithms. There are MapReduce implementations as well, to
> support massive datasets.
> In addition, these systems provide client APIs/various integration points,
> so its easy to integrate them to your system.
>
> Regards,
> Sofia
>
>
>   ------------------------------
> *From:* Douglass Davis <do...@gmail.com>
> *To:* user@hadoop.apache.org
> *Sent:* Monday, February 18, 2013 1:21 AM
> *Subject:* product recommendations engine
>
> Hello,
>
> I don't have any prior experience with Hadoop.  I am also not a statistics
> expert.  I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
>
> I am interested in doing product recommendations.  However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc.  Then
> I would like to come up with similar items that a user may like.  I have
> seen an example just based on user ratings, but would like to add much more
> data.
>
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
>
> Or, I could use a combination of the two approaches.
>
> Another question, I wonder if Hadoop takes into account the passage of
> time.  For example, a user may rate something high, then change their
> rating a couple months later.
>
> Lastly, my site is based on PHP.  I need to be able to integrate that with
> Hadoop.
>
> How feasible is this approach?  I saw a clustering example, and a
> recommendation example based on user ratings.  Are there any other advice,
> docs, or examples that you could point me to that deals with any of these
> issues?
>
> Thanks,
> Doug
>
>
>
>
>
>
>

Re: product recommendations engine

Posted by Douglass Davis <do...@gmail.com>.
Ok thanks.  Myrrix looks like it has much of the set-up work done so I am
taking a closer look at that.


On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki
<ge...@yahoo.com>wrote:

> Hello Douglass,
>
> you could take a look at Mahout and Myrrix projects. These are two
> projects that provide implementations of recommendation & machine
> learning algorithms. There are MapReduce implementations as well, to
> support massive datasets.
> In addition, these systems provide client APIs/various integration points,
> so its easy to integrate them to your system.
>
> Regards,
> Sofia
>
>
>   ------------------------------
> *From:* Douglass Davis <do...@gmail.com>
> *To:* user@hadoop.apache.org
> *Sent:* Monday, February 18, 2013 1:21 AM
> *Subject:* product recommendations engine
>
> Hello,
>
> I don't have any prior experience with Hadoop.  I am also not a statistics
> expert.  I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
>
> I am interested in doing product recommendations.  However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc.  Then
> I would like to come up with similar items that a user may like.  I have
> seen an example just based on user ratings, but would like to add much more
> data.
>
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
>
> Or, I could use a combination of the two approaches.
>
> Another question, I wonder if Hadoop takes into account the passage of
> time.  For example, a user may rate something high, then change their
> rating a couple months later.
>
> Lastly, my site is based on PHP.  I need to be able to integrate that with
> Hadoop.
>
> How feasible is this approach?  I saw a clustering example, and a
> recommendation example based on user ratings.  Are there any other advice,
> docs, or examples that you could point me to that deals with any of these
> issues?
>
> Thanks,
> Doug
>
>
>
>
>
>
>

Re: product recommendations engine

Posted by Douglass Davis <do...@gmail.com>.
Ok thanks.  Myrrix looks like it has much of the set-up work done so I am
taking a closer look at that.


On Mon, Feb 18, 2013 at 4:00 AM, Sofia Georgiakaki
<ge...@yahoo.com>wrote:

> Hello Douglass,
>
> you could take a look at Mahout and Myrrix projects. These are two
> projects that provide implementations of recommendation & machine
> learning algorithms. There are MapReduce implementations as well, to
> support massive datasets.
> In addition, these systems provide client APIs/various integration points,
> so its easy to integrate them to your system.
>
> Regards,
> Sofia
>
>
>   ------------------------------
> *From:* Douglass Davis <do...@gmail.com>
> *To:* user@hadoop.apache.org
> *Sent:* Monday, February 18, 2013 1:21 AM
> *Subject:* product recommendations engine
>
> Hello,
>
> I don't have any prior experience with Hadoop.  I am also not a statistics
> expert.  I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
>
> I am interested in doing product recommendations.  However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc.  Then
> I would like to come up with similar items that a user may like.  I have
> seen an example just based on user ratings, but would like to add much more
> data.
>
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
>
> Or, I could use a combination of the two approaches.
>
> Another question, I wonder if Hadoop takes into account the passage of
> time.  For example, a user may rate something high, then change their
> rating a couple months later.
>
> Lastly, my site is based on PHP.  I need to be able to integrate that with
> Hadoop.
>
> How feasible is this approach?  I saw a clustering example, and a
> recommendation example based on user ratings.  Are there any other advice,
> docs, or examples that you could point me to that deals with any of these
> issues?
>
> Thanks,
> Doug
>

Re: product recommendations engine

Posted by Sofia Georgiakaki <ge...@yahoo.com>.
Hello Douglass,

you could take a look at the Mahout and Myrrix projects. These are two projects that provide implementations of recommendation & machine learning algorithms. There are MapReduce implementations as well, to support massive datasets.
In addition, these systems provide client APIs/various integration points, so it's easy to integrate them into your system.

Regards,
Sofia





>________________________________
> From: Douglass Davis <do...@gmail.com>
>To: user@hadoop.apache.org 
>Sent: Monday, February 18, 2013 1:21 AM
>Subject: product recommendations engine
> 
>
>Hello,
>
>I don't have any prior experience with Hadoop.  I am also not a statistics expert.  I am a software engineer, however, after looking at the docs, Hadoop still seems pretty intimidating to set up.  
>
>I am interested in doing product recommendations.  However, I want to store many things about user behavior, for example whether they click on a link in an email, how they rate a product, whether they buy it, etc.  Then I would like to come up with similar items that a user may like.  I have seen an example just based on user ratings, but would like to add much more data.
>
>Also, I think the clustering could be used in terms of recommending based on similar descriptions, attributes, and keywords. 
>
>Or, I could use a combination of the two approaches.
>
>Another question, I wonder if Hadoop takes into account the passage of time.  For example, a user may rate something high, then change their rating a couple months later.
>
>Lastly, my site is based on PHP.  I need to be able to integrate that with Hadoop.
>
>How feasible is this approach?  I saw a clustering example, and a recommendation example based on user ratings.  Are there any other advice, docs, or examples that you could point me to that deals with any of these issues?
>
>Thanks,
>Doug
>

Re: product recommendations engine

Posted by Ted Dunning <td...@maprtech.com>.
Yeah... you can make this work.

First, if your setup is relatively small, then you won't need Hadoop.

Second, having lots of kinds of actions is a very reasonable thing to have.
My own suggestion is that you analyze each of these for its predictive
power independently and then combine them at recommendation time.
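
A minimal sketch of the "score each action type separately, combine at
recommendation time" idea, in Python. The action names, the weights, and the
combine_scores helper are all assumptions made for illustration; in practice
each weight would come from measuring how well that action type predicts
later behaviour on its own.

```python
from collections import defaultdict

# Hypothetical per-action weights (assumed values, not from the thread).
ACTION_WEIGHTS = {"purchase": 3.0, "rating": 2.0, "email_click": 1.0}


def combine_scores(scores_by_action, weights=ACTION_WEIGHTS):
    """Merge per-action item scores into one ranked list of (item, score)."""
    combined = defaultdict(float)
    for action, item_scores in scores_by_action.items():
        weight = weights.get(action, 0.0)  # unknown action types contribute nothing
        for item, score in item_scores.items():
            combined[item] += weight * score
    # Highest combined score first; ties are broken by item id for stable output.
    return sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))


ranked = combine_scores({
    "purchase": {"A": 1.0},
    "rating": {"A": 0.5, "B": 1.0},
    "email_click": {"C": 1.0},
})
```

Keeping the per-action scores separate until this last step makes it cheap to
try different weightings without recomputing the underlying models.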

My own suggestion for how to deploy the recommendation model is in the form
of a search engine that has fields for each kind of recommendation cue that
you need to have.  You can combine any or all of these cues in the process
of doing a non-textual search using the recent history of the user as the
query.

This search-abuse style of recommendations is pretty easy to deploy and PHP
has a reasonably good package for sending queries to Solr, which is the
search engine I tend to recommend.
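
A rough sketch of what such a query could look like, written in Python for
brevity (a PHP client would send the same parameters). The field names
purchase_indicators and click_indicators, the boosts, and the
build_recommendation_query helper are hypothetical; q, rows, and fl are
standard Solr select parameters. Item ids are assumed not to contain quote
characters.

```python
from urllib.parse import urlencode


def build_recommendation_query(recent_history, field_boosts, rows=10):
    """Build Solr /select parameters that use a user's recent items as the query.

    recent_history: item ids the user recently interacted with.
    field_boosts:   hypothetical indicator field name -> boost factor.
    """
    if not recent_history:
        raise ValueError("recent_history must contain at least one item id")
    # Quote each id so it is matched as a single term.
    terms = " OR ".join('"{}"'.format(item) for item in recent_history)
    # One boosted clause per cue field, in a stable (sorted) order.
    clauses = [
        "{}:({})^{}".format(field, terms, boost)
        for field, boost in sorted(field_boosts.items())
    ]
    return {"q": " OR ".join(clauses), "rows": rows, "fl": "id,score"}


params = build_recommendation_query(
    ["p17", "p42"], {"purchase_indicators": 2.0, "click_indicators": 1.0}
)
query_string = urlencode(params)  # append to a Solr core's /select URL
```

Each cue lives in its own field, so dropping a cue or changing its boost is a
change to the query only, not to the index.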

You should also make a provision for A/B testing on different
recommendation approaches and combinations of inputs.  This is pretty
straightforward, but usually requires some sort of experimental condition
assignment and definitely requires good log recording and analysis.
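
One common way to do the condition assignment is to hash the user id, so the
same user always sees the same variant without any stored state. The sketch
below assumes that approach; the experiment name, the condition labels, and
both helper functions are made up for illustration.

```python
import hashlib


def assign_condition(user_id, experiment, conditions):
    """Deterministically map a user to one experimental condition."""
    # Including the experiment name gives each experiment its own split.
    key = "{}:{}".format(experiment, user_id).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(conditions)
    return conditions[bucket]


def log_line(user_id, experiment, condition, event):
    """Format one tab-separated log record for later analysis."""
    return "\t".join([experiment, condition, str(user_id), event])


conditions = ["ratings_only", "ratings_plus_clicks"]
condition = assign_condition(1234, "recs-v1", conditions)
record = log_line(1234, "recs-v1", condition, "recommendation_shown")
```

Logging the assigned condition with every event is what makes the later
comparison between approaches possible.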

That said, this isn't a tiny project.  It involves quite a bit of work.  It
isn't terribly hard at any point and the overall architecture is pretty
straightforward, but there is a good bit of work to be done.

On Sun, Feb 17, 2013 at 4:21 PM, Douglass Davis
<do...@gmail.com> wrote:

> Hello,
>
> I don't have any prior experience with Hadoop.  I am also not a statistics
> expert.  I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
>
> I am interested in doing product recommendations.  However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc.  Then
> I would like to come up with similar items that a user may like.  I have
> seen an example just based on user ratings, but would like to add much more
> data.
>
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
>
> Or, I could use a combination of the two approaches.
>
> Another question, I wonder if Hadoop takes into account the passage of
> time.  For example, a user may rate something high, then change their
> rating a couple months later.
>
> Lastly, my site is based on PHP.  I need to be able to integrate that with
> Hadoop.
>
> How feasible is this approach?  I saw a clustering example, and a
> recommendation example based on user ratings.  Are there any other advice,
> docs, or examples that you could point me to that deals with any of these
> issues?
>
> Thanks,
> Doug
>
