Posted to user@mahout.apache.org by "Martin, Nick" <Ni...@pssd.com> on 2013/08/10 20:54:58 UTC

Clustering for customer segmentation

Hi all,

I'm new to Mahout and wondering if anyone could point me in the right direction for doing customer purchase behavior clustering in Mahout. Seems most of what I encounter in online and book examples for clustering is text/document based.

Basically, I'd like to be able to explore passing n years of customer transaction data into one of the clustering algorithms and have my customer population be segmented into similar groups. Key determinants of similarity would be things like sales volume, purchase frequency, sales channel, profitability, tenure, category mix, etc.

Anywhere I can see examples of this kind of thing?

Thanks!!
Nick




Re: Clustering for customer segmentation

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Aug 12, 2013 at 12:52 PM, Martin, Nick <Ni...@pssd.com> wrote:

> I'd love to contribute so I'll get on JIRA and sign up for the dev@ mailing list to start getting a feel for that process.
>

Sounds like you already know the drill.

Welcome!

RE: Clustering for customer segmentation

Posted by "Martin, Nick" <Ni...@pssd.com>.
Great info, thanks for the help. I pulled the paper and will start looking at some options.

I'd love to contribute so I'll get on JIRA and sign up for the dev@ mailing list to start getting a feel for that process.

Thanks,
Nick

Re: Clustering for customer segmentation

Posted by Ted Dunning <te...@gmail.com>.
The tasks that you need to do include:

a) group your history by user id
b) extract the features you want to use from each user history
c) repeat clustering and adjusting the scaling of your features until you
are happy
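
As a rough illustration of steps (a) through (c), here is a minimal sketch in plain Python rather than Mahout. The transaction layout, the three features (total sales, order count, web share), the `weights` parameter, and all function names are assumptions made up for this example. The k-means is an ordinary Lloyd's iteration written out by hand, not Mahout's implementation.

```python
from collections import defaultdict
import math
import random

# Hypothetical transactions: (customer_id, amount, sales_channel)
TRANSACTIONS = [
    ("c1", 120.0, "web"), ("c1", 80.0, "web"), ("c1", 100.0, "store"),
    ("c2", 15.0, "store"),
    ("c3", 110.0, "web"), ("c3", 95.0, "web"), ("c3", 105.0, "web"),
    ("c4", 20.0, "store"), ("c4", 10.0, "store"),
]

def group_by_customer(transactions):
    """Step (a): collect each customer's transactions under their id."""
    history = defaultdict(list)
    for customer_id, amount, channel in transactions:
        history[customer_id].append((amount, channel))
    return history

def extract_features(history):
    """Step (b): one vector per customer: [total sales, order count, web share]."""
    features = {}
    for customer_id, rows in history.items():
        total = sum(amount for amount, _ in rows)
        count = len(rows)
        web_share = sum(1 for _, channel in rows if channel == "web") / count
        features[customer_id] = [total, float(count), web_share]
    return features

def scale(features, weights):
    """Standardize each feature to a z-score, then apply per-feature weights."""
    ids = list(features)
    dims = len(weights)
    columns = [[features[i][d] for i in ids] for d in range(dims)]
    means = [sum(col) / len(col) for col in columns]
    # Fall back to 1.0 when a feature is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
            for col, m in zip(columns, means)]
    return {i: [weights[d] * (features[i][d] - means[d]) / stds[d]
                for d in range(dims)] for i in ids}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over id -> vector; returns id -> cluster index."""
    rng = random.Random(seed)
    ids = sorted(points)
    centroids = [points[i][:] for i in rng.sample(ids, k)]
    assignment = {}
    for _ in range(iterations):
        # Assign every customer to the nearest centroid.
        for i in ids:
            assignment[i] = min(
                range(k), key=lambda c: squared_distance(points[i], centroids[c]))
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in ids if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

def segment(transactions, weights, k):
    """Run steps (a) and (b), scale the features, and cluster once."""
    features = extract_features(group_by_customer(transactions))
    return kmeans(scale(features, weights), k)

if __name__ == "__main__":
    print(segment(TRANSACTIONS, weights=[1.0, 1.0, 1.0], k=2))
```

Step (c) then amounts to calling `segment` with different `weights`, looking at the resulting groups, and repeating until the segments make sense for the business.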

If you have a few hundred examples of customers broken down by the
segmentation that you want, then one thing that you might look at is this
paper:

http://www.cs.cmu.edu/~epxing/papers/Old_papers/xing_nips02_metric.pdf

It shows a method for learning a metric that optimizes clustering of
labeled and unlabeled points.
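
For orientation, the optimization problem in that paper can be written roughly as follows. This is stated from memory, so check it against the PDF. Here S is the set of customer pairs known to belong to the same segment, D the set of pairs known to belong to different segments, and A the matrix that defines the learned distance; the symbols are chosen for this note.

```latex
\min_{A \succeq 0} \sum_{(x_i, x_j) \in S} \lVert x_i - x_j \rVert_A^2
\quad \text{subject to} \quad
\sum_{(x_i, x_j) \in D} \lVert x_i - x_j \rVert_A \ge 1,
\qquad
\lVert x \rVert_A = \sqrt{x^\top A x}
```

If A is restricted to a diagonal matrix, learning it amounts to learning one scaling factor per feature, which is the same quantity that step (c) adjusts by hand.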

Mahout currently does not have support for this kind of metric learning,
but it would make an excellent addition.