Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2011/11/29 14:03:05 UTC

Clustering graph coloring and layout

Anyone have an easy algorithm for coloring clusters in a nice way?  That is, given k clusters, color each centroid and all of its associated points in such a way that it is visually appealing and avoids, to the extent it can, coloring two distinct clusters the same color.

Also, the same goes for laying out n-dimensional vectors in a 2-dimensional way such that one can get a sense of the distances, groupings, etc. and still navigate in a decent way.

I've got the plumbing in place in the ClusterDumper + Gephi to make this work (and have a lame implementation of both on https://issues.apache.org/jira/browse/MAHOUT-899) but would really like to be able to produce much prettier visualizations out of the box.
--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com




Re: Clustering graph coloring and layout

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Nice!

It is very obvious I cannot avoid learning R (sigh).

On Wed, Nov 30, 2011 at 2:58 PM, Ted Dunning <te...@gmail.com> wrote:

> Here is some that I just whipped up.  I have also attached an example of
> the output.
>
> In the sample output, notice how you can see different stories about what
> clusters the brown-ish and purple clusters are near.[image: xyz.png]

Re: Clustering graph coloring and layout

Posted by Grant Ingersoll <gs...@apache.org>.
On Dec 1, 2011, at 5:02 AM, Steven Bourke wrote:

> Sorry, I wasn't really following this thread. I've got lots of random
> throwaways for Gephi visualisations for clustering and graphs (but not
> Mahout-based). Does the patch (https://issues.apache.org/jira/browse/MAHOUT-899)
> have an output file that I can use to generate visuals from? I can clean
> something up and add it to the patch.

It does.  Apply the patch and then build.  Then use the ClusterDumper.  Here's my example:

bin/mahout clusterdump --seqFileDir ~/projects/content/apache/sfmum/clustering/kmeans/clusters-2-final/ -o ~/projects/content/apache/sfmum/clustering/kmeans/clusters.graphml -of GRAPH_ML --distanceMeasure org.apache.mahout.common.distance.CosineDistanceMeasure --pointsDir ~/projects/content/apache/sfmum/clustering/kmeans/clusteredPoints/ --dictionaryType sequencefile --dictionary ~/projects/content/apache/sfmum/clustering/seq2sparse/dictionary.file-0 -n 3 -sp 500
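
For anyone who wants to poke at the resulting file outside Gephi, here is a minimal sketch of pulling nodes and edges out of a GraphML document with the Python standard library. It assumes only that the file uses standard GraphML <node id="..."> and <edge source="..." target="..."> elements. The SAMPLE text and the read_graphml name are made up for illustration and are not actual ClusterDumper output.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}node' down to 'node'."""
    return tag.rsplit("}", 1)[-1]


def read_graphml(text):
    """Return (node_ids, edges) found in a GraphML document given as a string.

    Tags are matched by local name, so the parse works whether or not the
    file declares the GraphML namespace.
    """
    root = ET.fromstring(text)
    nodes, edges = [], []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "node":
            nodes.append(elem.get("id"))
        elif name == "edge":
            edges.append((elem.get("source"), elem.get("target")))
    return nodes, edges


# Hypothetical sample: one centroid "c0" linked to two points.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="c0"/>
    <node id="p1"/>
    <node id="p2"/>
    <edge source="c0" target="p1"/>
    <edge source="c0" target="p2"/>
  </graph>
</graphml>"""

if __name__ == "__main__":
    node_ids, edge_list = read_graphml(SAMPLE)
    print(len(node_ids), "nodes,", len(edge_list), "edges")
```

To try it on a real file, read the file's contents into a string and pass that to read_graphml.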

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com




Re: Clustering graph coloring and layout

Posted by Steven Bourke <sb...@gmail.com>.
Sorry, I wasn't really following this thread. I've got lots of random
throwaways for Gephi visualisations for clustering and graphs (but not
Mahout-based). Does the patch (https://issues.apache.org/jira/browse/MAHOUT-899)
have an output file that I can use to generate visuals from? I can clean
something up and add it to the patch.


On Thu, Dec 1, 2011 at 7:57 AM, Dawid Weiss <da...@cs.put.poznan.pl> wrote:

> This looks great, Ted, thanks for sharing.
>
> Dawid

Re: Clustering graph coloring and layout

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
This looks great, Ted, thanks for sharing.

Dawid

On Thu, Dec 1, 2011 at 3:32 AM, Ted Dunning <te...@gmail.com> wrote:
> Sure.  I attached it, but those get stripped.  I didn't realize that this
> was going to the list.
>
> Try here: http://dl.dropbox.com/u/36863361/cluster-viz.r
>
> And here for the image: http://dl.dropbox.com/u/36863361/xyz.png

Re: Clustering graph coloring and layout

Posted by Ted Dunning <te...@gmail.com>.
Sure.  I attached it, but those get stripped.  I didn't realize that this
was going to the list.

Try here: http://dl.dropbox.com/u/36863361/cluster-viz.r

And here for the image: http://dl.dropbox.com/u/36863361/xyz.png

On Wed, Nov 30, 2011 at 4:04 PM, Grant Ingersoll <gs...@apache.org> wrote:

> Can you share the R code too?

Re: Clustering graph coloring and layout

Posted by Grant Ingersoll <gs...@apache.org>.
Can you share the R code too?

On Nov 30, 2011, at 2:58 PM, Ted Dunning wrote:

> Here is some that I just whipped up.  I have also attached an example of the output.
> 
> In the sample output, notice how you can see different stories about what clusters the brown-ish and purple clusters are near.<xyz.png>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com




Re: Clustering graph coloring and layout

Posted by Ted Dunning <te...@gmail.com>.
Here is some that I just whipped up.  I have also attached an example of
the output.

In the sample output, notice how you can see different stories about what
clusters the brown-ish and purple clusters are near.[image: xyz.png]

On Tue, Nov 29, 2011 at 8:03 AM, Grant Ingersoll <gs...@apache.org> wrote:

> I'm still learning R. Do you have code handy that you could share?

Re: Clustering graph coloring and layout

Posted by Grant Ingersoll <gs...@apache.org>.
I'm still learning R. Do you have code handy that you could share?

On Nov 29, 2011, at 6:25 AM, Ted Dunning wrote:

> Coloring is pretty easy in R, which is what I use.  I just build a color
> map with the right number of indices and use the cluster id to index the
> colormap.  For grins, I vary the transparency according to how seriously
> down-sampled the cluster is.  That lets me get a good visual feel for the
> actual cluster size.





Re: Clustering graph coloring and layout

Posted by Ted Dunning <te...@gmail.com>.
Coloring is pretty easy in R, which is what I use.  I just build a color
map with the right number of indices and use the cluster id to index the
colormap.  For grins, I vary the transparency according to how seriously
down-sampled the cluster is.  That lets me get a good visual feel for the
actual cluster size.
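
For readers who are not using R, the same idea can be sketched in Python. This is only an illustration, not the script discussed in this thread: the evenly spaced HSV hues, the function names, and the exact opacity formula (points get more opaque the more heavily their cluster was down-sampled) are all assumptions.

```python
import colorsys


def cluster_palette(k, saturation=0.65, value=0.9):
    """Return k RGB tuples (components in 0..1) with hues spaced evenly
    around the color wheel, so the cluster id can index the list directly."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [colorsys.hsv_to_rgb(i / k, saturation, value) for i in range(k)]


def point_alpha(shown, total, min_alpha=0.2):
    """Opacity for each plotted point of one cluster.

    Assumption: `shown` of the cluster's `total` points are plotted. A cluster
    plotted in full gets min_alpha; the larger the dropped fraction, the closer
    the opacity gets to 1.0, so big clusters still look big.
    """
    if not 0 < shown <= total:
        raise ValueError("need 0 < shown <= total")
    kept = shown / total  # 1.0 means nothing was dropped
    return min_alpha + (1.0 - min_alpha) * (1.0 - kept)


def rgba_for(cluster_id, palette, shown, total):
    """Combine the palette color for a cluster id with its opacity."""
    r, g, b = palette[cluster_id]
    return (r, g, b, point_alpha(shown, total))


if __name__ == "__main__":
    palette = cluster_palette(5)
    # Cluster 2 has 10,000 members, of which 500 are plotted.
    print(rgba_for(2, palette, shown=500, total=10000))
```

Evenly spaced hues stay distinct for small k but become hard to tell apart as k grows, so for many clusters a hand-picked qualitative palette may work better.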

On Tue, Nov 29, 2011 at 5:03 AM, Grant Ingersoll <gs...@apache.org> wrote:

> Anyone have an easy algorithm for coloring clusters in a nice way?  That
> is, given k clusters, color each centroid and all of its associated points
> in such a way that it is visually appealing and avoids, to the extent it
> can, coloring two unique clusters the same color.
>