Posted to dev@community.apache.org by Ross Gardler <rg...@apache.org> on 2011/05/14 12:20:09 UTC

Capturing mail (was Re: Stackoverflow)

Sent from my mobile device (so please excuse typos)

On 13 May 2011, at 02:31, David Blevins <da...@gmail.com> wrote:

> For me, tagging, voting and (I forgot) marking the question answered (thanks, Benson) are the parts I would love.
> 
> I write some really good responses sometimes and even *I* have a hard time finding some of my old responses in the list archive haystack.

Right. I always ask users to provide a patch if they find an answer in the mailing list useful. Of course it rarely happens (even with devs). 

Keeping things simple, could we provide a feature in the CMS that simply copies a mail from our archives (with backlinks) into the CMS system for the appropriate project?

A link to this could also be provided in the footer of each mail (only works for committers).

In the CMS we could have some magic system to build an index. 

I appreciate this has now moved away from stack overflow (I changed the subject) but for any Perl hackers looking for something useful to do on a weekend I would certainly use such a feature. 

This could grow to fancy tagging, tracking and more. But I believe this is a reasonably simple thing to do that would provide immediate benefits.

Ross

> 
> And to avoid the "tag names can be spam" issue, having it so that only committers can introduce new tags would be fine for me.  It could be a file in svn or something else equally lame but functional.
> 
> 
> -David
> 
> On May 12, 2011, at 6:04 PM, Ted Dunning wrote:
> 
>> There is another factor that comes into play.  QA sites like SO also blend
>> in wiki and trust mechanisms.  Thus, highly rated users can and do rewrite
>> questions to be more answerable/understandable.  They can also rewrite
>> answers if necessary.
>> 
>> Without automated karma, the moderation function has to be granted manually
>> which is a process that doesn't scale as easily and is subject to attack by
>> cabals.  That way lies Wikipedia's dictatorship of the editor proletariat
>> and associated drop in user participation.  That is fine for a largely
>> static knowledge base, but SO addresses much more dynamic topics in a way
>> that engages the readership much more strongly.  Moreover, the feedback
>> cycle essentially guarantees that the moderators reflect the interests of
>> the voting public.
>> 
>> On Thu, May 12, 2011 at 5:47 PM, David Blevins <da...@gmail.com> wrote:
>> 
>>> Another thought.  Sometimes I wonder how hard it would be to just allow
>>> tagging and voting on top of plain mailing list emails.  A simple DB with
>>> the messageId as the key for tags and vote count then a slightly fancier
>>> archive view than we have now.   And hey, markdown happens to look nice as
>>> plain email.  I've actually been indenting code snippets for years.
>>> 
>>> I admit I like getting SO points and badges but they do not factor in at
>>> all when looking for the right answer.
>>> 
> 
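
The "simple DB with the messageId as the key" idea quoted above, combined with committer-only tags, could be sketched roughly as follows. This is only an illustration under assumptions: the table layout, the function names and the ALLOWED_TAGS set (standing in for the "file in svn") are all hypothetical and not part of any existing ASF tooling.

```python
import sqlite3

# Hypothetical stand-in for a committer-maintained tag list (e.g. a file in svn).
ALLOWED_TAGS = {"configuration", "deployment", "testing"}


def open_store(path=":memory:"):
    """Open the store and create one table for votes and one for tags."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS votes "
        "(message_id TEXT PRIMARY KEY, vote_count INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS tags "
        "(message_id TEXT, tag TEXT, PRIMARY KEY (message_id, tag))"
    )
    return db


def vote(db, message_id, delta=1):
    """Add delta to a message's vote count, creating the row on first vote."""
    db.execute(
        "INSERT OR IGNORE INTO votes (message_id, vote_count) VALUES (?, 0)",
        (message_id,),
    )
    db.execute(
        "UPDATE votes SET vote_count = vote_count + ? WHERE message_id = ?",
        (delta, message_id),
    )
    db.commit()


def tag(db, message_id, name):
    """Attach a tag, rejecting names that are not on the approved list."""
    if name not in ALLOWED_TAGS:
        raise ValueError(f"unknown tag: {name}")
    db.execute(
        "INSERT OR IGNORE INTO tags (message_id, tag) VALUES (?, ?)",
        (message_id, name),
    )
    db.commit()


def summary(db, message_id):
    """Return (vote_count, sorted_tags) for one message."""
    row = db.execute(
        "SELECT vote_count FROM votes WHERE message_id = ?", (message_id,)
    ).fetchone()
    rows = db.execute(
        "SELECT tag FROM tags WHERE message_id = ? ORDER BY tag", (message_id,)
    )
    return (row[0] if row else 0, [t for (t,) in rows])
```

Keying both tables on the Message-ID means the archive itself never has to change; a fancier archive view would only need to look up each message it renders.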

Re: Capturing mail (was Re: Stackoverflow)

Posted by Grant Ingersoll <gs...@apache.org>.
We've also done some clustering of ASF Archives (See MAHOUT-588) that could show similar items, potentially.  I've often thought it would be cool to automatically classify email as to the expertise level required (which could also be done w/ Mahout) and I've always wanted our Mailing List manager to do a search of the archives first, before actually forwarding the email to the user list.   If the search turns up answers, send them back to the user and ask them if they answer the question or not.  If not, then the mail can go through.  
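
As a rough illustration of that "search first, then forward" flow, here is a minimal sketch. It assumes a naive word-overlap score in place of a real search index, and every name in it (search_archive, gate, the min_score threshold, the in-memory archive dict) is hypothetical; it is not how ezmlm or Mahout work.

```python
import re


def tokens(text):
    """Lowercased word set, ignoring words of one or two characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 2}


def search_archive(archive, question, min_score=0.5):
    """Return ids of archived mails that share enough words with the question."""
    wanted = tokens(question)
    if not wanted:
        return []
    hits = []
    for message_id, body in archive.items():
        score = len(wanted & tokens(body)) / len(wanted)
        if score >= min_score:
            hits.append((score, message_id))
    # Best score first; ties broken by message id for a stable order.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [message_id for _, message_id in hits]


def gate(archive, question, already_checked=False):
    """Decide whether to reply with candidate answers or forward to the list.

    already_checked is True when the user has seen the candidates and said
    they do not answer the question, so the mail goes through.
    """
    hits = [] if already_checked else search_archive(archive, question)
    if hits:
        return ("reply_with_candidates", hits)
    return ("forward_to_list", [])
```

The already_checked flag models the second pass: once the user says the candidates did not help, the mail is forwarded unconditionally.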

Just fun stuff to think about how to help people find answers better...

-Grant

On May 14, 2011, at 1:45 PM, Ted Dunning wrote:

> A student popped up a while ago on the Mahout mailing list with a very nice
> little magic program that would sift through email archives to find good
> question/answer pairs in email threads.
> 
> The results were quite impressively good.  The program didn't find a lot of
> pairs, but the pairs it did find were uniformly pretty excellent.
> 
> Maybe a secondary search index based on the output of such a program would
> be useful.

--------------------------
Grant Ingersoll
Lucene Revolution -- Lucene and Solr User Conference
May 25-26 in San Francisco
www.lucenerevolution.org


Re: Capturing mail (was Re: Stackoverflow)

Posted by David Blevins <da...@gmail.com>.
On May 14, 2011, at 10:45 AM, Ted Dunning wrote:

> A student popped up a while ago on the Mahout mailing list with a very nice
> little magic program that would sift through email archives to find good
> question/answer pairs in email threads.
> 
> The results were quite impressively good.  The program didn't find a lot of
> pairs, but the pairs it did find were uniformly pretty excellent.
> 
> Maybe a secondary search index based on the output of such a program would
> be useful.

My curiosity is definitely piqued.  How hard would it be to set it up in your people account or a zone or something as an experiment?


-David



Re: Capturing mail (was Re: Stackoverflow)

Posted by Ted Dunning <te...@gmail.com>.
A student popped up a while ago on the Mahout mailing list with a very nice
little magic program that would sift through email archives to find good
question/answer pairs in email threads.

The results were quite impressively good.  The program didn't find a lot of
pairs, but the pairs it did find were uniformly pretty excellent.

Maybe a secondary search index based on the output of such a program would
be useful.


Re: Capturing mail (was Re: Stackoverflow)

Posted by David Blevins <da...@gmail.com>.
On May 14, 2011, at 3:20 AM, Ross Gardler wrote:

> Sent from my mobile device (so please excuse typos)
> 
> On 13 May 2011, at 02:31, David Blevins <da...@gmail.com> wrote:
> 
>> For me, tagging, voting and (I forgot) marking the question answered (thanks, Benson) are the parts I would love.
>> 
>> I write some really good responses sometimes and even *I* have a hard time finding some of my old responses in the list archive haystack.
> 
> Right. I always ask users to provide a patch if they find an answer in the mailing list useful. Of course it rarely happens (even with devs). 
> 
> Keeping things simple, could we provide a feature in the CMS that simply copies a mail from our archives (with backlinks) into the CMS system for the appropriate project?
> 
> A link to this could also be provided in the footer of each mail (only works for committers).

I get it.  I read this a couple times and thought you were talking about being able to view all mail via the CMS.  You're talking about a simple way to get the chosen few "good content" emails into the CMS easily.

That could be pretty cool.  

> In the CMS we could have some magic system to build an index. 

Or maybe these emails just go to a "drafts" bucket.  Sort of a "queue" of things that perhaps need to be cleaned up a little before being added to the documentation at the appropriate location.

> I appreciate this has now moved away from stack overflow (I changed the subject) but for any Perl hackers looking for something useful to do on a weekend I would certainly use such a feature. 

Indeed.  I tend to like things that leverage what I'm already doing and just make it a little more effective.

I'm not too aware of the CMS internals, but it seems like something like this could get started in any language as the primary part would be checking it into the related CMS.  At least if I understand the CMS correctly.


The link I suppose would need the messageId and full list name.  Anyone know how we might technically do that?  Not sure how extensible ezmlm is in this regard.
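
To make the question concrete, here is one way such a footer link might carry the list name and messageId. This is a sketch only: the CAPTURE_BASE endpoint and the "list" and "mid" parameter names are made up for illustration, and it says nothing about whether ezmlm can actually insert a per-message footer like this.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical CMS endpoint; no such URL is known to exist.
CAPTURE_BASE = "https://cms.example.org/capture"


def capture_link(list_name, message_id):
    """Build a footer link that identifies one mail by list name and Message-ID."""
    # urlencode escapes the '<', '>' and '@' that Message-IDs normally contain.
    return CAPTURE_BASE + "?" + urlencode({"list": list_name, "mid": message_id})


def parse_capture_link(url):
    """Recover (list_name, message_id) from a link built by capture_link."""
    params = parse_qs(urlsplit(url).query)
    return params["list"][0], params["mid"][0]
```

The point of the round trip is that the CMS side would need nothing beyond the link itself to locate the mail in the archive.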


> This could grow to fancy tagging, tracking and more. But I believe this is a reasonably simple thing to do that would provide immediate benefits. 

Really the CMS itself could benefit from tagging if it doesn't have it already.

I admit I struggle with typical tree views of documentation.  I hate spending time having to think about "what is the one true parent topic of this doc and how should it fit in the larger picture".  I'd rather just tag the heck out of pages and let people find them that way.

The 'click tags to narrow scope' is such a simple and effective way to find things. 
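
For anyone who hasn't seen it, the narrowing logic is tiny. A minimal sketch, assuming pages are just a name-to-tags mapping (the narrow function and the sample page names are hypothetical, not anything the CMS provides):

```python
def narrow(pages, selected):
    """Return pages carrying every selected tag, plus tags that narrow further.

    pages maps a page name to its set of tags. The second return value maps
    each not-yet-selected tag to the number of matching pages that carry it.
    """
    selected = set(selected)
    matches = {name: set(tags) for name, tags in pages.items() if selected <= set(tags)}
    remaining = {}
    for tags in matches.values():
        for t in tags - selected:
            remaining[t] = remaining.get(t, 0) + 1
    return sorted(matches), remaining


pages = {
    "jndi-config": {"configuration", "jndi"},
    "test-setup": {"configuration", "testing"},
    "release-howto": {"release"},
}
print(narrow(pages, ["configuration"]))
```

Each click adds one tag to the selection, and the counts on the remaining tags tell the reader how much each further click would narrow the result.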


-David