Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/06/09 20:57:07 UTC

Labels for Vectors and Matrices (MAHOUT-65)

What are people doing for keeping track of their vectors and  
matrices?  MAHOUT-65 attempted to address that, but it seems to have  
gotten stuck.

For MAHOUT-126 (document clustering prep work), I ended up outputting
Vector cell information to a separate file, which works but is
cumbersome.  MAHOUT-126, however, still needs a way to label and track
which vector belongs to which document, so that the results make sense
when the process is done.

Thoughts?

-Grant
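
As a rough illustration of the need described above (this is not the MAHOUT-65 patch and not Mahout's API), a minimal sketch in plain Java might pair each vector with its document ID and write both on one line, so that no separate file is needed. The class name `LabeledVector`, its methods, and the tab-separated line format are all hypothetical, and labels are assumed to contain no tab characters.

```java
/**
 * Hypothetical helper, NOT part of Mahout: keeps a vector's cell values
 * together with the label (e.g. a document ID) they belong to.
 */
final class LabeledVector {
    private final String label;
    private final double[] values;

    LabeledVector(String label, double[] values) {
        this.label = label;
        this.values = values.clone(); // defensive copy
    }

    String label() {
        return label;
    }

    double[] values() {
        return values.clone();
    }

    /** Serializes as: label, a tab, then comma-separated cell values. */
    String toLine() {
        StringBuilder sb = new StringBuilder(label).append('\t');
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    /** Parses a line produced by toLine(); assumes the label has no tab. */
    static LabeledVector fromLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("missing tab separator: " + line);
        }
        String label = line.substring(0, tab);
        String rest = line.substring(tab + 1);
        if (rest.isEmpty()) {
            return new LabeledVector(label, new double[0]);
        }
        String[] cells = rest.split(",");
        double[] values = new double[cells.length];
        for (int i = 0; i < cells.length; i++) {
            values[i] = Double.parseDouble(cells[i]);
        }
        return new LabeledVector(label, values);
    }
}
```

With something along these lines, each output line carries its own document label, so a clustering result can be traced back to its source document without a side file. A real solution would still have to fit Mahout's Vector and Matrix types and its serialization, which this sketch does not attempt.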

Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Ted Dunning <te...@gmail.com>.
I would say that a real user gets a bigger vote relative to theoretical
complainers.

Simple at first is fine by me.



-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)

Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
IIRC, the patch in M-65 works but was judged to be inadequate, so I never
committed it. After several subsequent postings on the requirements, I
started another version, but it ran into impossible
serialization/deserialization issues. Those could probably be addressed
now with JSON, but the patch itself got lost in the intervening year. I
was going to reinvent it again more recently, but then the discussion of
the new matrix library in commons came up and I thought, why bother? The
other night I asked Ted about it and he said that package also does not
have labels. For better or worse, it seems we still need them.

I'm ok with the simple-minded M-65 patch if that will set you text 
maniacs free <grin>
Jeff



Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Grant Ingersoll <gs...@apache.org>.
FWIW, I'm happy w/ a simple solution right now, which may very well be  
Jeff's initial patch.  Still, I'd like to hear more from Ted, Jeff and  
Karl.




Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Benson Margulies <bi...@gmail.com>.
OK, I came into the middle of this and misunderstood.

This doesn't precisely seem to leave much of a space for another
person to join in, but I'd be happy to be corrected by some
combination of the people cited below.


On Tue, Jun 9, 2009 at 6:18 PM, Grant Ingersoll<gs...@apache.org> wrote:
>
> On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
>
>> Grant,
>>
>> AFAIK, there's a perfectly adequate version sitting out there in
>> MAHOUT-65, waiting for a committer to commit it.
>>
>> If that's wrong and there's a concrete coding task I could undertake
>> that would render it committable, I'd be game.
>
>
> The discussion on M-65 doesn't seem to be conclusive to me.  Karl and Ted
> had suggestions on Jeff's initial patch, but Jeff hasn't posted his
> prototype that he alludes to in the final comment.
>

Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Grant Ingersoll <gs...@apache.org>.
On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:

> Grant,
>
> AFAIK, there's a perfectly adequate version sitting out there in
> MAHOUT-65, waiting for a committer to commit it.
>
> If that's wrong and there's a concrete coding task I could undertake
> that would render it committable, I'd be game.


The discussion on M-65 doesn't seem to be conclusive to me.  Karl and  
Ted had suggestions on Jeff's initial patch, but Jeff hasn't posted  
his prototype that he alludes to in the final comment.

Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Benson Margulies <bi...@gmail.com>.
Grant,

AFAIK, there's a perfectly adequate version sitting out there in
MAHOUT-65, waiting for a committer to commit it.

If that's wrong and there's a concrete coding task I could undertake
that would render it committable, I'd be game.

--benson




Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Grant Ingersoll <gs...@apache.org>.
On Jun 9, 2009, at 5:44 PM, Benson Margulies wrote:

> I've basically bailed on doing much with Mahout until something like
> this commits.

The only way it ever gets better is by people pitching in, but, heh,  
you know that already!  ;-)





Re: Labels for Vectors and Matrices (MAHOUT-65)

Posted by Benson Margulies <bi...@gmail.com>.
I've basically bailed on doing much with Mahout until something like
this commits.
