Posted to user@mahout.apache.org by Nishant Chandra <ni...@gmail.com> on 2012/01/03 19:02:48 UTC

Purchase prediction

Hi,

I am trying to predict shoppers' purchase and non-purchase intention in
an e-commerce context. I am more interested in detecting the latter.
A near-real-time approach would be great: given a sequence of pages
a shopper views, I would like the algorithm to predict the intention.

Are there any algorithms, in Mahout or elsewhere, that can help?

Thanks,
Nishant

Re: Purchase prediction

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hi Mike,
making predictions in real time is actually a very tough research task.

I would expect that you can tune hidden Markov models to work in semi-real time.

Furthermore, once you have a trained model, you can use it in real time. The big questions are how often you can and should rebuild the model, and how much computation time you want to spend per customer.
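
To illustrate why a trained model can be cheap to use per page view, here is a minimal sketch of HMM forward filtering in Python. It is only an illustration, not Mahout's HMM API: the two hidden states, the page types and every probability are invented values, and in practice they would come from an offline training run.

```python
# Hypothetical two-state HMM: the hidden intent is "browse" or "buy".
# All probabilities are made-up illustration values, not fitted to data.
STATES = ("browse", "buy")
START = {"browse": 0.8, "buy": 0.2}
TRANS = {
    "browse": {"browse": 0.9, "buy": 0.1},
    "buy": {"browse": 0.2, "buy": 0.8},
}
EMIT = {
    "browse": {"home": 0.5, "product": 0.4, "cart": 0.1},
    "buy": {"home": 0.1, "product": 0.4, "cart": 0.5},
}


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def start_belief(page):
    """Belief over hidden states after the first observed page."""
    return normalize({s: START[s] * EMIT[s][page] for s in STATES})


def update_belief(belief, page):
    """One forward step: predict with TRANS, then correct with EMIT."""
    predicted = {
        s: sum(belief[prev] * TRANS[prev][s] for prev in STATES)
        for s in STATES
    }
    return normalize({s: predicted[s] * EMIT[s][page] for s in STATES})


def intent_after(pages):
    """Filtered belief after a whole sequence of page views."""
    belief = start_belief(pages[0])
    for page in pages[1:]:
        belief = update_belief(belief, page)
    return belief


if __name__ == "__main__":
    print(intent_after(["home", "product", "cart", "cart"]))
```

Each new page view costs one call to update_belief, which is quadratic in the number of hidden states and independent of the session length, so the expensive part stays in the offline training.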

Perhaps the KDD Cup 2000 is valuable:
http://www.kdd.org/kddcup/index.php?section=2000&method=result

Tasks:
Given a set of page views, will the visitor view another page on the site or will the visitor leave?
Given a set of page views, which product brand will the visitor view in the remainder of the session?
...
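
As a rough baseline for the first task, one could estimate the probability of leaving from first-order transition counts. The sketch below is an assumption for illustration only, not a method used in the KDD Cup; the EXIT marker, the function names and the smoothing value are all hypothetical.

```python
from collections import Counter, defaultdict

EXIT = "<exit>"  # hypothetical marker for "the visitor left the site"


def fit_transitions(sessions):
    """Count page -> next-page transitions, with EXIT after the last page."""
    counts = defaultdict(Counter)
    for pages in sessions:
        followers = list(pages[1:]) + [EXIT]
        for current, following in zip(pages, followers):
            counts[current][following] += 1
    return counts


def exit_probability(counts, last_page, smoothing=1.0):
    """Smoothed estimate of P(exit | last page viewed)."""
    seen = counts.get(last_page, Counter())
    total = sum(seen.values())
    # Additive smoothing over two outcomes: exit versus continue.
    return (seen[EXIT] + smoothing) / (total + 2 * smoothing)


if __name__ == "__main__":
    history = [["home", "product", "cart"], ["home", "product"], ["home"]]
    model = fit_transitions(history)
    print(exit_probability(model, "product"))
```

The counts can be built offline and the lookup is a single dictionary access, so this kind of baseline is easy to serve in real time and gives a reference point for more elaborate models.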

Agarwal et al. described a method for semi-real-time recommendation of news stories:
Fast Online Learning through Offline Initialization for Time-sensitive Recommendation
http://users.cs.fiu.edu/~lzhen001/activities/KDD_USB_key_2010/docs/p703.pdf

Hope that helps. If you have any results I would be interested in them.

/Manuel


On 03.01.2012, at 20:59, Mike Spreitzer wrote:

> I suspect the original request was concerned with --- and I, on my own, am 
> concerned with --- a scenario in which it is desired to be able to quickly 
> make predictions based on very recent data.  Thus, approaches that 
> occasionally take a lot of time to build a model are non-solutions.  Are 
> there solutions for my scenario in what you mentioned, or elsewhere?
> 
> Thanks,
> Mike
> 
> 
> 
> From:   Manuel Blechschmidt <Ma...@gmx.de>
> To:     user@mahout.apache.org
> Date:   01/03/2012 02:40 PM
> Subject:        Re: Purchase prediction
> 
> 
> 
> Hello Nishan,
> you can use the recommender approaches with the boolean reference model.
> 
> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark your 
> results.
> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
> 
> 
> Further you could also use the hidden markov model to predict 
> probabilities of next purchases.
> http://isabel-drost.de/hadoop/slides/HMM.pdf
> https://issues.apache.org/jira/browse/MAHOUT-396
> 
> There are some papers describing how to combine some of these methods:
> 
> Rendle. et. al presented a paper using a combination of both:
> Factorizing Personalized Markov Chains for Next-Basket Recommendation
> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
> 
> 
> In my opinion some seasonal models could also help to better predict next 
> purchases.
> 
> There is currently an resolved enhancement request for 0.6 making 
> evaluation for a use case like yours better:
> https://issues.apache.org/jira/browse/MAHOUT-906
> 
> If you have further questions feel free to ask.
> 
> /Manuel
> 
> On 03.01.2012, at 19:02, Nishant Chandra wrote:
> 
>> Hi,
>> 
>> I am trying to predict shopper purchase and non-purchase intention in
>> E-Commerce context. I am more interested in finding the later.
>> A near-real time approach will be great. So given a sequence of pages
>> a shopper views, I would like the algorithm to predict the intention.
>> 
>> Any algorithms in Mahout or otherwise that can help?
>> 
>> Thanks,
>> Nishant
> 
> -- 
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
> 
> 

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B


Re: Purchase prediction

Posted by Sean Owen <sr...@gmail.com>.
The recommender idea is real-time, where real-time means "less than about a
second" at some moderate scale. Any model-building-like processes are done
online. I think you might model this as a simple most-similar-item problem,
which is even faster, though I can only guess whether the result is good
or not, as I've not tried to solve this kind of problem.
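
A minimal sketch of what a most-similar-item lookup could look like, written in plain Python rather than against Mahout's API. The cosine-on-co-occurrence similarity and the function names are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_similarity(sessions):
    """Cosine similarity between items from boolean session co-occurrence."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        items = sorted(set(session))
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1
    similar = defaultdict(dict)
    for (a, b), together in pair_count.items():
        score = together / sqrt(item_count[a] * item_count[b])
        similar[a][b] = score
        similar[b][a] = score
    return similar


def most_similar(similar, viewed, top_n=3):
    """Rank unseen items by summed similarity to the items viewed so far."""
    scores = defaultdict(float)
    for item in viewed:
        for other, score in similar.get(item, {}).items():
            if other not in viewed:
                scores[other] += score
    # Sort by descending score, breaking ties by item name.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["d", "b"]]
    print(most_similar(build_similarity(data), ["a"]))
```

Only most_similar runs per request, and it touches just the neighbours of the pages viewed so far, which is why this formulation tends to be fast.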

On Tue, Jan 3, 2012 at 7:59 PM, Mike Spreitzer <ms...@us.ibm.com> wrote:

> I suspect the original request was concerned with --- and I, on my own, am
> concerned with --- a scenario in which it is desired to be able to quickly
> make predictions based on very recent data.  Thus, approaches that
> occasionally take a lot of time to build a model are non-solutions.  Are
> there solutions for my scenario in what you mentioned, or elsewhere?
>
> Thanks,
> Mike
>
>
>
> From:   Manuel Blechschmidt <Ma...@gmx.de>
> To:     user@mahout.apache.org
> Date:   01/03/2012 02:40 PM
> Subject:        Re: Purchase prediction
>
>
>
> Hello Nishan,
> you can use the recommender approaches with the boolean reference model.
>
> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark your
> results.
>
> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
>
>
> Further you could also use the hidden markov model to predict
> probabilities of next purchases.
> http://isabel-drost.de/hadoop/slides/HMM.pdf
> https://issues.apache.org/jira/browse/MAHOUT-396
>
> There are some papers describing how to combine some of these methods:
>
> Rendle. et. al presented a paper using a combination of both:
> Factorizing Personalized Markov Chains for Next-Basket Recommendation
>
> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
>
>
> In my opinion some seasonal models could also help to better predict next
> purchases.
>
> There is currently an resolved enhancement request for 0.6 making
> evaluation for a use case like yours better:
>  https://issues.apache.org/jira/browse/MAHOUT-906
>
> If you have further questions feel free to ask.
>
> /Manuel
>
> On 03.01.2012, at 19:02, Nishant Chandra wrote:
>
> > Hi,
> >
> > I am trying to predict shopper purchase and non-purchase intention in
> > E-Commerce context. I am more interested in finding the later.
> > A near-real time approach will be great. So given a sequence of pages
> > a shopper views, I would like the algorithm to predict the intention.
> >
> > Any algorithms in Mahout or otherwise that can help?
> >
> > Thanks,
> > Nishant
>
> --
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
>
>
>

Re: Purchase prediction

Posted by Nishant Chandra <ni...@gmail.com>.
Hi Manuel,

Please send me the paper, as I don't have access. Thanks.

On Wed, Jan 4, 2012 at 11:02 PM, Manuel Blechschmidt
<Ma...@gmx.de> wrote:
> Hello Nishant,
> intent prediction based on the behavior on the website is a tough task.
>
> Here is a paper which trained bayes networks to guess the task that a person is doing:
>
> An approach to situational market segmentation on on-line newspapers based on current tasks
> Anne Gutschmidt
> http://dl.acm.org/citation.cfm?id=1864777
>
> For the overall data set, we attained a prediction accuracy of 57.69%.
>
> If you do not have access to ACM portal. I can send you the paper manually.
>
> /Manuel
>
> On 04.01.2012, at 08:15, Nishant Chandra wrote:
>
>> As for my use case and as Manuel pointed out is this:
>>
>> a. Given a set of page views happening in real time, will the visitor
>> view another page on the site or will the visitor leave or is he
>> comparing prices or just researching? The intention is what I want to
>> capture. Building the model offline sounds like the right approach.
>>
>> b. Given a set of page views, which product brand will the visitor
>> view in the remainder of the session? This is an addon and I would
>> like to explore it.
>>
>> To solve a), is HMM the right approach?
>>
>> Thanks,
>> Nishant
>>
>> --
>> Nishant Chandra
>> Bangalore, India
>> Cell : +91 9739131616
>
> --
> Manuel Blechschmidt
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter: http://twitter.com/Manuel_B
>



-- 
Nishant Chandra
Bangalore, India
Cell : +91 9739131616

Re: Purchase prediction

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hello Nishant,
intent prediction based on behavior on a website is a tough task.

Here is a paper that trained Bayesian networks to guess the task a person is performing:

An approach to situational market segmentation on on-line newspapers based on current tasks
Anne Gutschmidt
http://dl.acm.org/citation.cfm?id=1864777

The paper reports: "For the overall data set, we attained a prediction accuracy of 57.69%."

If you do not have access to the ACM portal, I can send you the paper directly.

/Manuel

On 04.01.2012, at 08:15, Nishant Chandra wrote:

> As for my use case and as Manuel pointed out is this:
> 
> a. Given a set of page views happening in real time, will the visitor
> view another page on the site or will the visitor leave or is he
> comparing prices or just researching? The intention is what I want to
> capture. Building the model offline sounds like the right approach.
> 
> b. Given a set of page views, which product brand will the visitor
> view in the remainder of the session? This is an addon and I would
> like to explore it.
> 
> To solve a), is HMM the right approach?
> 
> Thanks,
> Nishant
> 
> -- 
> Nishant Chandra
> Bangalore, India
> Cell : +91 9739131616

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B


Re: Purchase prediction

Posted by Ted Dunning <te...@gmail.com>.
Decision tree learning is fine for relatively small data, but it doesn't
model latent variables directly.  You can use any supervised classifier as
a component of something like a conditional random field, but the use of
decision tree learning isn't a deciding factor.

HMMs are a form of sequential pattern mining.  Most forms of sequential
pattern mining, however, don't handle latent factors well, since they
usually try to predict based only on recent events.

On Wed, Jan 4, 2012 at 8:44 AM, Nishant Chandra
<ni...@gmail.com> wrote:

> How about using decision tree learning or sequential pattern mining?
> Any thoughts?
>
> Thanks,
> Nishant
>
> On Wed, Jan 4, 2012 at 1:25 PM, Ted Dunning <te...@gmail.com> wrote:
> > On Tue, Jan 3, 2012 at 11:15 PM, Nishant Chandra
> > <ni...@gmail.com>wrote:
> >
> >> As for my use case and as Manuel pointed out is this:
> >>
> >> a. Given a set of page views happening in real time, will the visitor
> >> view another page on the site or will the visitor leave or is he
> >> comparing prices or just researching? The intention is what I want to
> >> capture. Building the model offline sounds like the right approach.
> >>
> > ...
> >
> > To solve a), is HMM the right approach?
> >
> >
> > It is a plausible approach.  But not the only one.  It is attractive in
> > that it tries to model intent.
> >
> > You might also look at something like a latent log-linear model.  That
> > would allow you to model per user bias in intent.
> >
> >
> >> b. Given a set of page views, which product brand will the visitor
> >> view in the remainder of the session? This is an addon and I would
> >> like to explore it.
> >>
> >
> > This is a reasonable task for recommendation engines.
>

Re: Purchase prediction

Posted by Nishant Chandra <ni...@gmail.com>.
How about using decision tree learning or sequential pattern mining?
Any thoughts?

Thanks,
Nishant

On Wed, Jan 4, 2012 at 1:25 PM, Ted Dunning <te...@gmail.com> wrote:
> On Tue, Jan 3, 2012 at 11:15 PM, Nishant Chandra
> <ni...@gmail.com>wrote:
>
>> As for my use case and as Manuel pointed out is this:
>>
>> a. Given a set of page views happening in real time, will the visitor
>> view another page on the site or will the visitor leave or is he
>> comparing prices or just researching? The intention is what I want to
>> capture. Building the model offline sounds like the right approach.
>>
> ...
>
> To solve a), is HMM the right approach?
>
>
> It is a plausible approach.  But not the only one.  It is attractive in
> that it tries to model intent.
>
> You might also look at something like a latent log-linear model.  That
> would allow you to model per user bias in intent.
>
>
>> b. Given a set of page views, which product brand will the visitor
>> view in the remainder of the session? This is an addon and I would
>> like to explore it.
>>
>
> This is a reasonable task for recommendation engines.

Re: Purchase prediction

Posted by Ted Dunning <te...@gmail.com>.
On Tue, Jan 3, 2012 at 11:15 PM, Nishant Chandra
<ni...@gmail.com> wrote:

> As for my use case and as Manuel pointed out is this:
>
> a. Given a set of page views happening in real time, will the visitor
> view another page on the site or will the visitor leave or is he
> comparing prices or just researching? The intention is what I want to
> capture. Building the model offline sounds like the right approach.
>
...

> To solve a), is HMM the right approach?


It is a plausible approach, but not the only one.  It is attractive in
that it tries to model intent.

You might also look at something like a latent log-linear model.  That
would allow you to model per-user bias in intent.


> b. Given a set of page views, which product brand will the visitor
> view in the remainder of the session? This is an addon and I would
> like to explore it.
>

This is a reasonable task for recommendation engines.

Re: Purchase prediction

Posted by Nishant Chandra <ni...@gmail.com>.
My use case, as Manuel pointed out, is this:

a. Given a set of page views happening in real time, will the visitor
view another page on the site, or leave, or is the visitor comparing
prices or just researching? The intention is what I want to
capture. Building the model offline sounds like the right approach.

b. Given a set of page views, which product brand will the visitor
view in the remainder of the session? This is an add-on that I would
like to explore.

To solve a), is HMM the right approach?

Thanks,
Nishant


On Wed, Jan 4, 2012 at 10:15 AM, Ted Dunning <te...@gmail.com> wrote:
> That doesn't help the cold-start problem, of course.
>
> On Tue, Jan 3, 2012 at 8:07 PM, Lance Norskog <go...@gmail.com> wrote:
>
>> If you can use an SVD-based recommender, here is a way to update an
>> SVD in constant time that is much much smaller than the original
>> decomposition.
>>
>> http://www.merl.com/papers/docs/TR2006-059.pdf
>>
>> On Tue, Jan 3, 2012 at 1:44 PM, Ted Dunning <te...@gmail.com> wrote:
>> > The recent data is usually just the user history, not the off-line
>> > item-item relationship build.
>> >
>> > For brand new items, there is the cold start problem, but this is often
>> > handled by putting these items on a "New Arrivals" page so that you can
>> > expose them to users until you get enough data to include them in the
>> next
>> > item-item build.  Enough data is usually around 10 clicks.
>> >
>> > It is also plausible to cold-start items based on feature similarity.
>> >
>> > On Tue, Jan 3, 2012 at 11:59 AM, Mike Spreitzer <ms...@us.ibm.com>
>> wrote:
>> >
>> >> I suspect the original request was concerned with --- and I, on my own,
>> am
>> >> concerned with --- a scenario in which it is desired to be able to
>> quickly
>> >> make predictions based on very recent data.  Thus, approaches that
>> >> occasionally take a lot of time to build a model are non-solutions.  Are
>> >> there solutions for my scenario in what you mentioned, or elsewhere?
>> >>
>> >> Thanks,
>> >> Mike
>> >>
>> >>
>> >>
>> >> From:   Manuel Blechschmidt <Ma...@gmx.de>
>> >> To:     user@mahout.apache.org
>> >> Date:   01/03/2012 02:40 PM
>> >> Subject:        Re: Purchase prediction
>> >>
>> >>
>> >>
>> >> Hello Nishan,
>> >> you can use the recommender approaches with the boolean reference model.
>> >>
>> >> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark
>> your
>> >> results.
>> >>
>> >>
>> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
>> >>
>> >>
>> >> Further you could also use the hidden markov model to predict
>> >> probabilities of next purchases.
>> >> http://isabel-drost.de/hadoop/slides/HMM.pdf
>> >> https://issues.apache.org/jira/browse/MAHOUT-396
>> >>
>> >> There are some papers describing how to combine some of these methods:
>> >>
>> >> Rendle. et. al presented a paper using a combination of both:
>> >> Factorizing Personalized Markov Chains for Next-Basket Recommendation
>> >>
>> >>
>> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
>> >>
>> >>
>> >> In my opinion some seasonal models could also help to better predict
>> next
>> >> purchases.
>> >>
>> >> There is currently an resolved enhancement request for 0.6 making
>> >> evaluation for a use case like yours better:
>> >>  https://issues.apache.org/jira/browse/MAHOUT-906
>> >>
>> >> If you have further questions feel free to ask.
>> >>
>> >> /Manuel
>> >>
>> >> On 03.01.2012, at 19:02, Nishant Chandra wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > I am trying to predict shopper purchase and non-purchase intention in
>> >> > E-Commerce context. I am more interested in finding the later.
>> >> > A near-real time approach will be great. So given a sequence of pages
>> >> > a shopper views, I would like the algorithm to predict the intention.
>> >> >
>> >> > Any algorithms in Mahout or otherwise that can help?
>> >> >
>> >> > Thanks,
>> >> > Nishant
>> >>
>> >> --
>> >> Manuel Blechschmidt
>> >> Dortustr. 57
>> >> 14467 Potsdam
>> >> Mobil: 0173/6322621
>> >> Twitter: http://twitter.com/Manuel_B
>> >>
>> >>
>> >>
>>
>>
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>



-- 
Nishant Chandra
Bangalore, India
Cell : +91 9739131616

Re: Purchase prediction

Posted by Ted Dunning <te...@gmail.com>.
That doesn't help the cold-start problem, of course.

On Tue, Jan 3, 2012 at 8:07 PM, Lance Norskog <go...@gmail.com> wrote:

> If you can use an SVD-based recommender, here is a way to update an
> SVD in constant time that is much much smaller than the original
> decomposition.
>
> http://www.merl.com/papers/docs/TR2006-059.pdf
>
> On Tue, Jan 3, 2012 at 1:44 PM, Ted Dunning <te...@gmail.com> wrote:
> > The recent data is usually just the user history, not the off-line
> > item-item relationship build.
> >
> > For brand new items, there is the cold start problem, but this is often
> > handled by putting these items on a "New Arrivals" page so that you can
> > expose them to users until you get enough data to include them in the
> next
> > item-item build.  Enough data is usually around 10 clicks.
> >
> > It is also plausible to cold-start items based on feature similarity.
> >
> > On Tue, Jan 3, 2012 at 11:59 AM, Mike Spreitzer <ms...@us.ibm.com>
> wrote:
> >
> >> I suspect the original request was concerned with --- and I, on my own,
> am
> >> concerned with --- a scenario in which it is desired to be able to
> quickly
> >> make predictions based on very recent data.  Thus, approaches that
> >> occasionally take a lot of time to build a model are non-solutions.  Are
> >> there solutions for my scenario in what you mentioned, or elsewhere?
> >>
> >> Thanks,
> >> Mike
> >>
> >>
> >>
> >> From:   Manuel Blechschmidt <Ma...@gmx.de>
> >> To:     user@mahout.apache.org
> >> Date:   01/03/2012 02:40 PM
> >> Subject:        Re: Purchase prediction
> >>
> >>
> >>
> >> Hello Nishan,
> >> you can use the recommender approaches with the boolean reference model.
> >>
> >> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark
> your
> >> results.
> >>
> >>
> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
> >>
> >>
> >> Further you could also use the hidden markov model to predict
> >> probabilities of next purchases.
> >> http://isabel-drost.de/hadoop/slides/HMM.pdf
> >> https://issues.apache.org/jira/browse/MAHOUT-396
> >>
> >> There are some papers describing how to combine some of these methods:
> >>
> >> Rendle. et. al presented a paper using a combination of both:
> >> Factorizing Personalized Markov Chains for Next-Basket Recommendation
> >>
> >>
> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
> >>
> >>
> >> In my opinion some seasonal models could also help to better predict
> next
> >> purchases.
> >>
> >> There is currently an resolved enhancement request for 0.6 making
> >> evaluation for a use case like yours better:
> >>  https://issues.apache.org/jira/browse/MAHOUT-906
> >>
> >> If you have further questions feel free to ask.
> >>
> >> /Manuel
> >>
> >> On 03.01.2012, at 19:02, Nishant Chandra wrote:
> >>
> >> > Hi,
> >> >
> >> > I am trying to predict shopper purchase and non-purchase intention in
> >> > E-Commerce context. I am more interested in finding the later.
> >> > A near-real time approach will be great. So given a sequence of pages
> >> > a shopper views, I would like the algorithm to predict the intention.
> >> >
> >> > Any algorithms in Mahout or otherwise that can help?
> >> >
> >> > Thanks,
> >> > Nishant
> >>
> >> --
> >> Manuel Blechschmidt
> >> Dortustr. 57
> >> 14467 Potsdam
> >> Mobil: 0173/6322621
> >> Twitter: http://twitter.com/Manuel_B
> >>
> >>
> >>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>

Re: Purchase prediction

Posted by Lance Norskog <go...@gmail.com>.
If you can use an SVD-based recommender, here is a way to update an
SVD in constant time, at a cost much, much smaller than that of the
original decomposition.

http://www.merl.com/papers/docs/TR2006-059.pdf
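
For comparison, the simplest way to use fresh data with an existing SVD is a fold-in projection, sketched below in plain Python. Note that this is not the update described in the linked paper: fold-in leaves the factors unchanged and only projects a new user's row into the existing latent space. It assumes a rank-k factorization R ~ U S V^T with nonzero singular values; all names are hypothetical.

```python
def fold_in(ratings, item_factors, singular_values):
    """Project a new user's row into an existing rank-k latent space.

    ratings:         list of n item values for the new user
    item_factors:    n rows of k values (the matrix V in R ~ U S V^T)
    singular_values: the k diagonal entries of S (assumed nonzero)
    """
    k = len(singular_values)
    return [
        sum(r * row[f] for r, row in zip(ratings, item_factors))
        / singular_values[f]
        for f in range(k)
    ]


def predict(user_factors, item_factors, singular_values):
    """Reconstruct the user's row as u S V^T."""
    return [
        sum(u * s * v for u, s, v in zip(user_factors, singular_values, row))
        for row in item_factors
    ]


if __name__ == "__main__":
    V = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # toy orthonormal item factors
    S = [2.0, 1.0]
    u = fold_in([4.0, 3.0, 0.0], V, S)
    print(u, predict(u, V, S))
```

The cost per user is proportional to the number of items times k, with no recomputation of the decomposition; the trade-off is that the factors drift out of date until the next rebuild or a proper incremental update.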

On Tue, Jan 3, 2012 at 1:44 PM, Ted Dunning <te...@gmail.com> wrote:
> The recent data is usually just the user history, not the off-line
> item-item relationship build.
>
> For brand new items, there is the cold start problem, but this is often
> handled by putting these items on a "New Arrivals" page so that you can
> expose them to users until you get enough data to include them in the next
> item-item build.  Enough data is usually around 10 clicks.
>
> It is also plausible to cold-start items based on feature similarity.
>
> On Tue, Jan 3, 2012 at 11:59 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:
>
>> I suspect the original request was concerned with --- and I, on my own, am
>> concerned with --- a scenario in which it is desired to be able to quickly
>> make predictions based on very recent data.  Thus, approaches that
>> occasionally take a lot of time to build a model are non-solutions.  Are
>> there solutions for my scenario in what you mentioned, or elsewhere?
>>
>> Thanks,
>> Mike
>>
>>
>>
>> From:   Manuel Blechschmidt <Ma...@gmx.de>
>> To:     user@mahout.apache.org
>> Date:   01/03/2012 02:40 PM
>> Subject:        Re: Purchase prediction
>>
>>
>>
>> Hello Nishan,
>> you can use the recommender approaches with the boolean reference model.
>>
>> You can use IRStatistics (Precision, Recall, F-Measure) to benchmark your
>> results.
>>
>> https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
>>
>>
>> Further you could also use the hidden markov model to predict
>> probabilities of next purchases.
>> http://isabel-drost.de/hadoop/slides/HMM.pdf
>> https://issues.apache.org/jira/browse/MAHOUT-396
>>
>> There are some papers describing how to combine some of these methods:
>>
>> Rendle. et. al presented a paper using a combination of both:
>> Factorizing Personalized Markov Chains for Next-Basket Recommendation
>>
>> http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf
>>
>>
>> In my opinion some seasonal models could also help to better predict next
>> purchases.
>>
>> There is currently an resolved enhancement request for 0.6 making
>> evaluation for a use case like yours better:
>>  https://issues.apache.org/jira/browse/MAHOUT-906
>>
>> If you have further questions feel free to ask.
>>
>> /Manuel
>>
>> On 03.01.2012, at 19:02, Nishant Chandra wrote:
>>
>> > Hi,
>> >
>> > I am trying to predict shopper purchase and non-purchase intention in
>> > E-Commerce context. I am more interested in finding the later.
>> > A near-real time approach will be great. So given a sequence of pages
>> > a shopper views, I would like the algorithm to predict the intention.
>> >
>> > Any algorithms in Mahout or otherwise that can help?
>> >
>> > Thanks,
>> > Nishant
>>
>> --
>> Manuel Blechschmidt
>> Dortustr. 57
>> 14467 Potsdam
>> Mobil: 0173/6322621
>> Twitter: http://twitter.com/Manuel_B
>>
>>
>>



-- 
Lance Norskog
goksron@gmail.com

Re: Purchase prediction

Posted by Ted Dunning <te...@gmail.com>.
The recent data is usually just the user history, not the off-line
item-item relationship build.

For brand new items, there is the cold start problem, but this is often
handled by putting these items on a "New Arrivals" page so that you can
expose them to users until you get enough data to include them in the next
item-item build.  Enough data is usually around 10 clicks.

It is also plausible to cold-start items based on feature similarity.
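
A minimal sketch of the two ideas above, in plain Python rather than Mahout's API. The threshold of 10 clicks comes from the message; the function names, the (user, item) click-log format, and the use of Jaccard similarity over feature sets are assumptions made for illustration.

```python
from collections import Counter

MIN_CLICKS = 10  # "around 10 clicks", as suggested in the message above


def split_items_for_build(click_log, min_clicks=MIN_CLICKS):
    """Split items into those with enough clicks for the next item-item
    build and those that should stay on a "New Arrivals" page.

    click_log is an iterable of (user, item) pairs (assumed format).
    """
    counts = Counter(item for _user, item in click_log)
    ready = {item for item, n in counts.items() if n >= min_clicks}
    new_arrivals = set(counts) - ready
    return ready, new_arrivals


def jaccard(a, b):
    """Jaccard similarity of two feature collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cold_start_neighbors(new_item_features, catalog_features, top_n=3):
    """Rank existing items by feature overlap with a brand-new item."""
    scored = [
        (jaccard(new_item_features, feats), item)
        for item, feats in catalog_features.items()
    ]
    # Highest similarity first; ties broken by item name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for score, item in scored[:top_n] if score > 0]
```

The neighbors found by feature similarity could stand in for a new item until it has collected enough clicks to enter the regular build.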

On Tue, Jan 3, 2012 at 11:59 AM, Mike Spreitzer <ms...@us.ibm.com> wrote:

> I suspect the original request was concerned with --- and I, on my own, am
> concerned with --- a scenario in which it is desired to be able to quickly
> make predictions based on very recent data.  Thus, approaches that
> occasionally take a lot of time to build a model are non-solutions.  Are
> there solutions for my scenario in what you mentioned, or elsewhere?
>
> Thanks,
> Mike

Re: Purchase prediction

Posted by Sebastian Schelter <ss...@apache.org>.
A very simple approach would be to use an item-based recommender with a
precomputed model (that might be a day old) and simply use the items
most similar to the latest items the user preferred as recommendations.

These recommendations can be found in "real time" where "real time"
means that a user fills a shopping cart and his recommendations are
immediately updated after each item he adds.
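
A rough sketch of this approach in plain Python, not Mahout's API. The similarity table, its item names, and the scoring rule (summing similarities over the latest cart items) are hypothetical; in practice the table would come from whatever offline job builds the item-item model.

```python
from collections import defaultdict

# Hypothetical precomputed item-item similarities (e.g. from a nightly job).
SIMILAR_ITEMS = {
    "tent": [("sleeping_bag", 0.9), ("camp_stove", 0.7)],
    "camp_stove": [("gas_canister", 0.8), ("sleeping_bag", 0.4)],
}


def recommend(cart, similar_items=SIMILAR_ITEMS, latest_n=3, top_n=5):
    """Recommend items similar to the most recent items in the cart.

    Only dictionary lookups happen at request time, so this can be re-run
    after every item the user adds; the model itself is never rebuilt here.
    """
    scores = defaultdict(float)
    in_cart = set(cart)
    for item in cart[-latest_n:]:
        for neighbor, similarity in similar_items.get(item, []):
            if neighbor not in in_cart:  # never recommend what is already in the cart
                scores[neighbor] += similarity
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _score in ranked[:top_n]]
```

Calling recommend(cart) again after each cart update gives the "real time" behaviour described here, while the similarity table may be a day old.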

--sebastian

On 03.01.2012 20:59, Mike Spreitzer wrote:
> I suspect the original request was concerned with --- and I, on my own, am 
> concerned with --- a scenario in which it is desired to be able to quickly 
> make predictions based on very recent data.  Thus, approaches that 
> occasionally take a lot of time to build a model are non-solutions.  Are 
> there solutions for my scenario in what you mentioned, or elsewhere?
> 
> Thanks,
> Mike


Re: Purchase prediction

Posted by Mike Spreitzer <ms...@us.ibm.com>.
I suspect the original request was concerned with --- and I, on my own, am 
concerned with --- a scenario in which it is desired to be able to quickly 
make predictions based on very recent data.  Thus, approaches that 
occasionally take a lot of time to build a model are non-solutions.  Are 
there solutions for my scenario in what you mentioned, or elsewhere?

Thanks,
Mike






Re: Purchase prediction

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hello Nishant,
you can use the recommender approaches with the boolean preference model.

You can use IRStatistics (Precision, Recall, F-Measure) to benchmark your results.
https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation
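
To make the three measures concrete, here is a small stand-alone sketch in plain Python. It is not Mahout's evaluator; the function name and the idea of comparing a recommendation list against a held-out set of purchased items are assumptions for illustration.

```python
def ir_stats(recommended, relevant):
    """Return (precision, recall, f_measure) for one user.

    recommended: items the recommender returned (e.g. its top N).
    relevant:    held-out items the user actually bought or viewed.
    """
    recommended = list(recommended)
    relevant = set(relevant)
    hits = sum(1 for item in recommended if item in relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

With boolean data there are no rating values to compare, so measuring how many held-out items show up in the recommendations is the natural way to benchmark.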

Furthermore, you could use a hidden Markov model to predict the probabilities of next purchases.
http://isabel-drost.de/hadoop/slides/HMM.pdf
https://issues.apache.org/jira/browse/MAHOUT-396
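
As an illustration of how a trained hidden Markov model could score a session, the sketch below runs the normalized forward algorithm in plain Python. It does not use Mahout's HMM code, and the two hidden states, the page types, and every probability are made-up values, not results from any data set.

```python
# Hypothetical model: all states, page types and probabilities are invented.
STATES = ("browsing", "buying")
START_P = {"browsing": 0.8, "buying": 0.2}
TRANS_P = {
    "browsing": {"browsing": 0.7, "buying": 0.3},
    "buying": {"browsing": 0.2, "buying": 0.8},
}
EMIT_P = {
    "browsing": {"home": 0.5, "product": 0.4, "cart": 0.1},
    "buying": {"home": 0.1, "product": 0.4, "cart": 0.5},
}


def _normalize(belief):
    total = sum(belief.values())
    return {state: p / total for state, p in belief.items()}


def forward_filter(observations, states=STATES, start_p=START_P,
                   trans_p=TRANS_P, emit_p=EMIT_P):
    """Return P(hidden state | pages seen so far) after the last page view."""
    if not observations:
        return dict(start_p)
    belief = _normalize(
        {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    )
    for obs in observations[1:]:
        # Predict one step ahead, then weight by how well each state explains obs.
        belief = _normalize({
            s: emit_p[s][obs] * sum(belief[prev] * trans_p[prev][s] for prev in states)
            for s in states
        })
    return belief
```

Each new page view costs only a handful of multiplications, so with an already trained model the purchase probability can be refreshed on every click.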

There are some papers describing how to combine some of these methods:

Rendle et al. presented a paper that combines both:
Factorizing Personalized Markov Chains for Next-Basket Recommendation
http://www.ismll.uni-hildesheim.de/pub/pdfs/RendleFreudenthaler2010-FPMC.pdf

In my opinion, some seasonal models could also help to predict next purchases better.

There is also a resolved enhancement request for 0.6 that improves evaluation for a use case like yours:
 https://issues.apache.org/jira/browse/MAHOUT-906

If you have further questions, feel free to ask.

/Manuel

On 03.01.2012, at 19:02, Nishant Chandra wrote:

> Hi,
> 
> I am trying to predict shopper purchase and non-purchase intention in
> an e-commerce context. I am more interested in finding the latter.
> A near-real time approach will be great. So given a sequence of pages
> a shopper views, I would like the algorithm to predict the intention.
> 
> Any algorithms in Mahout or otherwise that can help?
> 
> Thanks,
> Nishant

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B