Posted to dev@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/06/09 20:57:07 UTC
Labels for Vectors and Matrices (MAHOUT-65)
What are people doing for keeping track of their vectors and
matrices? MAHOUT-65 attempted to address that, but it seems to have
gotten stuck.
For MAHOUT-126 (document clustering prep work), I ended up outputting
Vector cell information to a separate file, which works but is
cumbersome. This issue, however, still needs a way to actually label
and track what vector belongs to what document so as to make any sense
out of them when the process is done.
Thoughts?
-Grant
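The labeling need described above can be sketched roughly as follows. This is only a minimal illustration, not the MAHOUT-65 patch and not an existing Mahout class: the class name LabeledVector, its methods, and the choice of a dense double[] backing array are all assumptions made for the example. It shows a vector that carries a name (for instance a document id) and lets cells be addressed by label (for instance a term) as well as by index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical illustration only: NOT the MAHOUT-65 patch and NOT a Mahout API.
 * A dense vector that remembers which document it came from and which
 * label (e.g. a term) is bound to which cell.
 */
final class LabeledVector {
    private final String name;       // assumed: id of the source document
    private final double[] values;   // dense cell values
    private final Map<String, Integer> cellLabels = new LinkedHashMap<>(); // label -> cell index

    LabeledVector(String name, int cardinality) {
        this.name = name;
        this.values = new double[cardinality];
    }

    String getName() {
        return name;
    }

    /** Binds a label to a cell index (if not already bound) and sets the cell's value. */
    void set(String label, int index, double value) {
        if (index < 0 || index >= values.length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        Integer bound = cellLabels.putIfAbsent(label, index);
        if (bound != null && bound != index) {
            throw new IllegalArgumentException(
                "label '" + label + "' is already bound to index " + bound);
        }
        values[index] = value;
    }

    /** Looks a cell up by its label. */
    double get(String label) {
        Integer index = cellLabels.get(label);
        if (index == null) {
            throw new IllegalArgumentException("unknown label: " + label);
        }
        return values[index];
    }

    /** Looks a cell up by its index. */
    double get(int index) {
        return values[index];
    }
}
```

With something along these lines, a clustering job could read the document id back from each output vector via getName() instead of joining against a separate file. How such labels would be serialized alongside the vector is left open here.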
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Ted Dunning <te...@gmail.com>.
I would say that a real user gets a bigger vote relative to theoretical
complainers.
Simple at first is fine by me.
On Tue, Jun 9, 2009 at 4:09 PM, Jeff Eastman <jd...@windwardsolutions.com> wrote:
> IIRC, the patch in M-65 works but was judged to be inadequate so I never
> committed it. After several subsequent postings on the requirements I
> started another version but it ran into impossible
> serialization/deserialization issues. Those could probably be addressed now
> with JSON but the patch itself got lost in the intervening year. I was going
> to reinvent it again more recently but then the discussion of the new matrix
> library in commons came up and I thought why bother? The other night I asked
> Ted about it and he said that package also does not have labels. For better
> or worse, it seems we still need them.
>
> I'm ok with the simple-minded M-65 patch if that will set you text maniacs
> free <grin>
> Jeff
>
>
>
> Grant Ingersoll wrote:
>
>> FWIW, I'm happy w/ a simple solution right now, which may very well be
>> Jeff's initial patch. Still, I'd like to hear more from Ted, Jeff and Karl.
>>
>> On Jun 9, 2009, at 6:20 PM, Benson Margulies wrote:
>>
>>> OK, I came in to the middle and misunderstood.
>>>
>>> This doesn't precisely seem to leave much of a space for another
>>> person to join in, but I'd be happy to be corrected by some
>>> combination of the people cited below.
>>>
>>>
>>> On Tue, Jun 9, 2009 at 6:18 PM, Grant Ingersoll<gs...@apache.org>
>>> wrote:
>>>
>>>>
>>>> On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
>>>>
>>>>> Grant,
>>>>>
>>>>> AFAIK, there's a perfectly adequate version sitting out there in
>>>>> MAHOUT-65, waiting for a committer to commit it.
>>>>>
>>>>> If that's wrong and there's a concrete coding task I could undertake
>>>>> that would render it committable, I'd be game.
>>>>>
>>>>
>>>>
>>>> The discussion on M-65 doesn't seem to be conclusive to me. Karl and
>>>> Ted had suggestions on Jeff's initial patch, but Jeff hasn't posted his
>>>> prototype that he alludes to in the final comment.
>>>>
>>>>
--
Ted Dunning, CTO
DeepDyve
111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
IIRC, the patch in M-65 works but was judged to be inadequate so I never
committed it. After several subsequent postings on the requirements I
started another version but it ran into impossible
serialization/deserialization issues. Those could probably be addressed
now with JSON but the patch itself got lost in the intervening year. I
was going to reinvent it again more recently but then the discussion of
the new matrix library in commons came up and I thought why bother? The
other night I asked Ted about it and he said that package also does not
have labels. For better or worse, it seems we still need them.
I'm ok with the simple-minded M-65 patch if that will set you text
maniacs free <grin>
Jeff
Grant Ingersoll wrote:
> FWIW, I'm happy w/ a simple solution right now, which may very well be
> Jeff's initial patch. Still, I'd like to hear more from Ted, Jeff and
> Karl.
>
> On Jun 9, 2009, at 6:20 PM, Benson Margulies wrote:
>
>> OK, I came in to the middle and misunderstood.
>>
>> This doesn't precisely seem to leave much of a space for another
>> person to join in, but I'd be happy to be corrected by some
>> combination of the people cited below.
>>
>>
>> On Tue, Jun 9, 2009 at 6:18 PM, Grant Ingersoll<gs...@apache.org>
>> wrote:
>>>
>>> On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
>>>
>>>> Grant,
>>>>
>>>> AFAIK, there's a perfectly adequate version sitting out there in
>>>> MAHOUT-65, waiting for a committer to commit it.
>>>>
>>>> If that's wrong and there's a concrete coding task I could undertake
>>>> that would render it committable, I'd be game.
>>>
>>>
>>> The discussion on M-65 doesn't seem to be conclusive to me. Karl and
>>> Ted had suggestions on Jeff's initial patch, but Jeff hasn't posted his
>>> prototype that he alludes to in the final comment.
>>>
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Grant Ingersoll <gs...@apache.org>.
FWIW, I'm happy w/ a simple solution right now, which may very well be
Jeff's initial patch. Still, I'd like to hear more from Ted, Jeff and
Karl.
On Jun 9, 2009, at 6:20 PM, Benson Margulies wrote:
> OK, I came in to the middle and misunderstood.
>
> This doesn't precisely seem to leave much of a space for another
> person to join in, but I'd be happy to be corrected by some
> combination of the people cited below.
>
>
> On Tue, Jun 9, 2009 at 6:18 PM, Grant Ingersoll<gs...@apache.org>
> wrote:
>>
>> On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
>>
>>> Grant,
>>>
>>> AFAIK, there's a perfectly adequate version sitting out there in
>>> MAHOUT-65, waiting for a committer to commit it.
>>>
>>> If that's wrong and there's a concrete coding task I could undertake
>>> that would render it committable, I'd be game.
>>
>>
>> The discussion on M-65 doesn't seem to be conclusive to me. Karl and
>> Ted had suggestions on Jeff's initial patch, but Jeff hasn't posted his
>> prototype that he alludes to in the final comment.
>>
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Benson Margulies <bi...@gmail.com>.
OK, I came in to the middle and misunderstood.
This doesn't precisely seem to leave much of a space for another
person to join in, but I'd be happy to be corrected by some
combination of the people cited below.
On Tue, Jun 9, 2009 at 6:18 PM, Grant Ingersoll<gs...@apache.org> wrote:
>
> On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
>
>> Grant,
>>
>> AFAIK, there's a perfectly adequate version sitting out there in
>> MAHOUT-65, waiting for a committer to commit it.
>>
>> If that's wrong and there's a concrete coding task I could undertake
>> that would render it committable, I'd be game.
>
>
> The discussion on M-65 doesn't seem to be conclusive to me. Karl and Ted
> had suggestions on Jeff's initial patch, but Jeff hasn't posted his
> prototype that he alludes to in the final comment.
>
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Grant Ingersoll <gs...@apache.org>.
On Jun 9, 2009, at 6:11 PM, Benson Margulies wrote:
> Grant,
>
> AFAIK, there's a perfectly adequate version sitting out there in
> MAHOUT-65, waiting for a committer to commit it.
>
> If that's wrong and there's a concrete coding task I could undertake
> that would render it committable, I'd be game.
The discussion on M-65 doesn't seem to be conclusive to me. Karl and
Ted had suggestions on Jeff's initial patch, but Jeff hasn't posted
his prototype that he alludes to in the final comment.
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Benson Margulies <bi...@gmail.com>.
Grant,
AFAIK, there's a perfectly adequate version sitting out there in
MAHOUT-65, waiting for a committer to commit it.
If that's wrong and there's a concrete coding task I could undertake
that would render it committable, I'd be game.
--benson
On Tue, Jun 9, 2009 at 6:02 PM, Grant Ingersoll<gs...@apache.org> wrote:
>
> On Jun 9, 2009, at 5:44 PM, Benson Margulies wrote:
>
>> I've basically bailed on doing much with Mahout until something like
>> this commits.
>
> The only way it ever gets better is by people pitching in, but, heh, you
> know that already! ;-)
>
>
>>
>> On Tue, Jun 9, 2009 at 2:57 PM, Grant Ingersoll<gs...@apache.org>
>> wrote:
>>>
>>> What are people doing for keeping track of their vectors and matrices?
>>> MAHOUT-65 attempted to address that, but it seems to have gotten stuck.
>>>
>>> For MAHOUT-126 (document clustering prep work), I ended up outputting
>>> Vector cell information to a separate file, which works but is cumbersome.
>>> This issue, however, still needs a way to actually label and track what
>>> vector belongs to what document so as to make any sense out of them when
>>> the process is done.
>>>
>>> Thoughts?
>>>
>>> -Grant
>>>
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Grant Ingersoll <gs...@apache.org>.
On Jun 9, 2009, at 5:44 PM, Benson Margulies wrote:
> I've basically bailed on doing much with Mahout until something like
> this commits.
The only way it ever gets better is by people pitching in, but, heh,
you know that already! ;-)
>
> On Tue, Jun 9, 2009 at 2:57 PM, Grant Ingersoll<gs...@apache.org>
> wrote:
>> What are people doing for keeping track of their vectors and matrices?
>> MAHOUT-65 attempted to address that, but it seems to have gotten stuck.
>>
>> For MAHOUT-126 (document clustering prep work), I ended up outputting
>> Vector cell information to a separate file, which works but is cumbersome.
>> This issue, however, still needs a way to actually label and track what
>> vector belongs to what document so as to make any sense out of them when
>> the process is done.
>>
>> Thoughts?
>>
>> -Grant
>>
Re: Labels for Vectors and Matrices (MAHOUT-65)
Posted by Benson Margulies <bi...@gmail.com>.
I've basically bailed on doing much with Mahout until something like
this commits.
On Tue, Jun 9, 2009 at 2:57 PM, Grant Ingersoll<gs...@apache.org> wrote:
> What are people doing for keeping track of their vectors and matrices?
> MAHOUT-65 attempted to address that, but it seems to have gotten stuck.
>
> For MAHOUT-126 (document clustering prep work), I ended up outputting Vector
> cell information to a separate file, which works but is cumbersome. This
> issue, however, still needs a way to actually label and track what vector
> belongs to what document so as to make any sense out of them when the
> process is done.
>
> Thoughts?
>
> -Grant
>