Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2011/11/29 14:03:05 UTC
Clustering graph coloring and layout
Does anyone have an easy algorithm for coloring clusters in a nice way? That is, given k clusters, color each centroid and all of its associated points in such a way that the result is visually appealing and avoids, to the extent it can, giving two distinct clusters the same color.
Also, the same goes for laying out n-dimensional vectors in two dimensions such that one can get a sense of the distances, groupings, etc. and still navigate in a decent way.
I've got the plumbing in place in the ClusterDumper + Gephi to make this work (and have a lame implementation of both on https://issues.apache.org/jira/browse/MAHOUT-899) but would really like to be able to produce much prettier visualizations out of the box.
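For the layout question, one common baseline is a linear projection onto the two directions of greatest variance (PCA). The sketch below is only an illustration under stated assumptions: PCA is a swapped-in technique that nobody in this thread proposes, the function names (center, covariance, mat_vec, top_eigenvector, pca_2d) are hypothetical, and the pure-Python power iteration is a toy that would be far too slow for large term vectors.

```python
import math
import random


def center(points):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    n, d = len(points), len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - means[j] for j in range(d)] for p in points]


def covariance(centered):
    """Return the d x d covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def top_eigenvector(matrix, iterations=200, seed=0):
    """Power iteration: multiply by the matrix and renormalize repeatedly."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to capture
            break
        v = [x / norm for x in w]
    return v


def pca_2d(points):
    """Project n-dimensional points onto their top two principal axes."""
    centered = center(points)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(2):
        v = top_eigenvector(cov)
        lam = sum(a * b for a, b in zip(v, mat_vec(cov, v)))  # Rayleigh quotient
        axes.append(v)
        # Deflate: remove the component just found so the next pass finds a new axis.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [tuple(sum(x * a for x, a in zip(row, axis)) for axis in axes)
            for row in centered]
```

A linear projection like this keeps distances exactly only when the data already lies in a plane; otherwise points that are far apart in n dimensions can land close together in the 2-D picture, so it is a starting point rather than an answer.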
--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com
Re: Clustering graph coloring and layout
Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Nice!
It is very obvious that I cannot avoid learning R (sigh).
On Wed, Nov 30, 2011 at 2:58 PM, Ted Dunning <te...@gmail.com> wrote:
> Here is some that I just whipped up. I have also attached an example of
> the output.
>
> In the sample output, notice how you can see different stories about what
> clusters the brown-ish and purple clusters are near.[image: xyz.png]
Re: Clustering graph coloring and layout
Posted by Grant Ingersoll <gs...@apache.org>.
On Dec 1, 2011, at 5:02 AM, Steven Bourke wrote:
> Sorry, I wasn't really following this thread. I've got lots of random
> throwaways for Gephi visualisations of clustering and graphs (but not
> Mahout-based). Does the patch (https://issues.apache.org/jira/browse/MAHOUT-899)
> have an output file that I can use to generate visuals from? I can clean
> something up and add it to the patch.
It does. Apply the patch and then build. Then use the ClusterDumper. Here's my example:
bin/mahout clusterdump \
  --seqFileDir ~/projects/content/apache/sfmum/clustering/kmeans/clusters-2-final/ \
  -o ~/projects/content/apache/sfmum/clustering/kmeans/clusters.graphml \
  -of GRAPH_ML \
  --distanceMeasure org.apache.mahout.common.distance.CosineDistanceMeasure \
  --pointsDir ~/projects/content/apache/sfmum/clustering/kmeans/clusteredPoints/ \
  --dictionaryType sequencefile \
  --dictionary ~/projects/content/apache/sfmum/clustering/seq2sparse/dictionary.file-0 \
  -n 3 -sp 500
Re: Clustering graph coloring and layout
Posted by Steven Bourke <sb...@gmail.com>.
Sorry, I wasn't really following this thread. I've got lots of random
throwaways for Gephi visualisations of clustering and graphs (but not
Mahout-based). Does the patch (https://issues.apache.org/jira/browse/MAHOUT-899)
have an output file that I can use to generate visuals from? I can clean
something up and add it to the patch.
On Thu, Dec 1, 2011 at 7:57 AM, Dawid Weiss <da...@cs.put.poznan.pl> wrote:
> This looks great, Ted, thanks for sharing.
>
> Dawid
Re: Clustering graph coloring and layout
Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
This looks great, Ted, thanks for sharing.
Dawid
On Thu, Dec 1, 2011 at 3:32 AM, Ted Dunning <te...@gmail.com> wrote:
> Sure. I attached it, but those get stripped. I didn't realize that this
> was going to the list.
>
> Try here: http://dl.dropbox.com/u/36863361/cluster-viz.r
>
> And here for the image: http://dl.dropbox.com/u/36863361/xyz.png
Re: Clustering graph coloring and layout
Posted by Ted Dunning <te...@gmail.com>.
Sure. I attached it, but those get stripped. I didn't realize that this
was going to the list.
Try here: http://dl.dropbox.com/u/36863361/cluster-viz.r
And here for the image: http://dl.dropbox.com/u/36863361/xyz.png
On Wed, Nov 30, 2011 at 4:04 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Can you share the R code too?
Re: Clustering graph coloring and layout
Posted by Grant Ingersoll <gs...@apache.org>.
Can you share the R code too?
On Nov 30, 2011, at 2:58 PM, Ted Dunning wrote:
> Here is some that I just whipped up. I have also attached an example of the output.
>
> In the sample output, notice how you can see different stories about what clusters the brown-ish and purple clusters are near.<xyz.png>
Re: Clustering graph coloring and layout
Posted by Ted Dunning <te...@gmail.com>.
Here is some code that I just whipped up. I have also attached an example of
the output.
In the sample output, notice how you can see different stories about which
clusters the brown-ish and purple clusters are near. [image: xyz.png]
On Tue, Nov 29, 2011 at 8:03 AM, Grant Ingersoll <gs...@apache.org> wrote:
> I'm still learning R, do you have code handy you could share?
Re: Clustering graph coloring and layout
Posted by Grant Ingersoll <gs...@apache.org>.
I'm still learning R; do you have code handy that you could share?
On Nov 29, 2011, at 6:25 AM, Ted Dunning wrote:
> Coloring is pretty easy in R, which is what I use. I just build a color
> map with the right number of indices and use the cluster id to index the
> colormap. For grins, I vary the transparency according to how seriously
> down-sampled the cluster is. That lets me get a good visual feel for the
> actual cluster size.
Re: Clustering graph coloring and layout
Posted by Ted Dunning <te...@gmail.com>.
Coloring is pretty easy in R, which is what I use. I just build a color
map with the right number of indices and use the cluster id to index the
colormap. For grins, I vary the transparency according to how seriously
down-sampled the cluster is. That lets me get a good visual feel for the
actual cluster size.
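The approach described above can be sketched roughly as follows. This is Python rather than R, and it is not the cluster-viz.r script shared elsewhere in the thread. The function names (cluster_palette, cluster_alphas, point_colors), the evenly spaced HSV hues, and the direction of the transparency scaling (clusters that were down-sampled more heavily are drawn more opaque, since each plotted point stands for more real points) are all assumptions made for illustration.

```python
import colorsys


def cluster_palette(k, saturation=0.65, value=0.9):
    """Build a color map with k entries, hues spaced evenly around the wheel."""
    return [colorsys.hsv_to_rgb(i / k, saturation, value) for i in range(k)]


def cluster_alphas(true_sizes, shown_sizes, min_alpha=0.2):
    """One opacity per cluster, scaled by how heavily it was down-sampled.

    Assumes every entry of shown_sizes is positive. The weight of a cluster is
    the number of real points each plotted point represents.
    """
    weights = [t / s for t, s in zip(true_sizes, shown_sizes)]
    top = max(weights)
    return [max(min_alpha, w / top) for w in weights]


def point_colors(cluster_ids, true_sizes, shown_sizes):
    """Index the color map by cluster id and attach that cluster's opacity."""
    palette = cluster_palette(len(true_sizes))
    alphas = cluster_alphas(true_sizes, shown_sizes)
    return [palette[c] + (alphas[c],) for c in cluster_ids]
```

Each result is an (r, g, b, a) tuple with components between 0 and 1. Evenly spaced hues are easy to tell apart for small k; for large k neighboring hues become hard to distinguish, so some clusters will inevitably look similar.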
On Tue, Nov 29, 2011 at 5:03 AM, Grant Ingersoll <gs...@apache.org> wrote:
> Anyone have an easy algorithm for coloring clusters in a nice way? That
> is, given k clusters, color each centroid and all of its associated points
> in such a way that it is visually appealing and avoids, to the extent it
> can, coloring two unique clusters the same color.