Posted to java-user@lucene.apache.org by Erick Erickson <er...@gmail.com> on 2010/03/08 21:48:41 UTC

Searching Subversion comments:

Before I reinvent the wheel.....

Is there any convenient way to, say, find all the files associated with
patch XXXX? I realize one can (hopefully) get this information from JIRA,
but... This is a subset of the problem of searching Subversion comments.

I can see it being useful, especially for people coming into the code fresh.
Grep (or the equivalent in the IDE) only goes so far. If there's any
interest, I'm thinking of playing with http://svn-search.sourceforge.net/ to
see what I could see and report back. It should be easy enough to set up on
my machine at home, although I'm not set up to show it to others.

And it's even based on Lucene. This is feeling recursive..

Mostly I'm checking to see if something like this has already been done and
I just missed the boat. Besides, I'm curious...

Erick

RE: Searching Subversion comments:

Posted by Uwe Schindler <uw...@thetaphi.de>.
Ahm, you can open the JIRA issue and click on "Subversion Commits"...

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Searching Subversion comments:

Posted by Erick Erickson <er...@gmail.com>.
Hi Otis!

Your examples look JIRA-centric, not code-centric. Frankly I'm not
sure there's a difference for my use-case, but....

Let's say I want to answer the question "what source files were
changed for JIRA-1234". Currently I'd have to open up the JIRA and
collate all the changed files by opening the patches, writing down the
file names and making a list. Given the number of patches sometimes
attached, that looks like it can get tedious.
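
The manual collation described above could be approximated with a short script. What follows is only a minimal sketch, assuming the attached patches were produced by `svn diff`, which writes an `Index: <path>` line before each file's hunks. The function name `changed_files` and the sample patch text are made up for illustration.

```python
from __future__ import annotations

import re

# Matches the "Index: <path>" header that `svn diff` writes before each file.
INDEX_LINE = re.compile(r"^Index: (?P<path>.+?)\s*$", re.MULTILINE)


def changed_files(*patches: str) -> list[str]:
    """Return the sorted, de-duplicated paths touched by the given patch texts."""
    paths = set()
    for patch in patches:
        paths.update(m.group("path") for m in INDEX_LINE.finditer(patch))
    return sorted(paths)


# Hypothetical patch text, shaped like `svn diff` output.
SAMPLE_PATCH = """\
Index: src/java/org/example/Foo.java
===================================================================
--- src/java/org/example/Foo.java (revision 100)
+++ src/java/org/example/Foo.java (working copy)
@@ -1,3 +1,4 @@
+// new line
Index: src/test/org/example/TestFoo.java
===================================================================
--- src/test/org/example/TestFoo.java (revision 100)
+++ src/test/org/example/TestFoo.java (working copy)
@@ -10,2 +10,3 @@
+// another new line
"""

if __name__ == "__main__":
    for path in changed_files(SAMPLE_PATCH):
        print(path)
```

Passing several patch texts at once merges their file lists, so successive versions of a patch attached to one issue collapse into a single list.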

Either the committers have run across this many times and cursed
every time or there's *already* a way to handle it....

The link in my original mail allegedly lets me query on such a thing, as
well as on the source code, with full Lucene syntax. Of course, answering
the question I posed above depends on the fidelity of the SVN comments
at commit time...

You can always do a grep of the source code for stuff *in* the source,
but not with full Lucene syntax, and not in the SVN comments. And I've run
into a similar situation at work, so I'd be gathering information for there
too.

I'm not wedded to the idea, but I'd be willing to devote some time to it
if others thought it *might* be useful. I'd imagine a proof-of-concept hack,
one of the outcomes of which is "interesting, but not worth the maintenance",
or "utterly and completely useless in our situation", or "Why the heck didn't
we have this ages ago?". But I don't even want to do a POC if there's
already a mechanism that people like. I just hate manual collation....

Although I have the vision of *this* group saying "we could improve that,
where's the source code?"....

Erick


Re: Searching Subversion comments:

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Erick,

For what it's worth, we are considering indexing JIRA comments over on http://search-lucene.com/ , though I'm not entirely convinced searching in comments would be super valuable.  Would it?

But note that JIRA (and LucidFind) already do that.  For example, go to http://issues.apache.org/jira/browse/LUCENE-2061 and search for "Attached first cut python script nrtBench.py."~10 (it's in that issue's comments) and JIRA will find that issue.

What exactly are you looking to do/build?

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/




Re: Searching Subversion comments:

Posted by Jacob Rhoden <jr...@unimelb.edu.au>.
I am not trying to evangelise git, but I am curious whether you guys have ever looked at switching to a distributed source control system. The branching / merging capabilities mean you really don't have to use patches to collect changes from non-trusted parties.

See this Google tech talk if you're interested:
    http://www.youtube.com/watch?v=4XpnKHJAok8



Re: Searching Subversion comments:

Posted by Erick Erickson <er...@gmail.com>.
Thanks, all. This is the best of answers: "The
functionality you want is already available if you
just know where to look".

Of course my life would be easier if my employer
used either JIRA or IntelliJ, but perhaps Eclipse
will show me something similar....

Erick@NeverMind.com


RE: Searching Subversion comments:

Posted by Steven A Rowe <sa...@syr.edu>.
On 03/08/2010 at 4:37 PM, Robert Muir wrote:
> > Also, in the open source realm:
> >
> > 3. ViewVC (formerly ViewCVS) has a facility to query revision
> > history, including commit messages.  Apache's instance, which serves
> > Lucene's repository, doesn't expose this functionality, though....
>
> I think it does? Do you mean this functionality?
> http://svn.apache.org/viewvc?view=revision&revision=833298

Nope, there is no log comment query functionality there.  I'm talking about the functionality you get when you click on the "Commit Query" link up at the top of the page in modern ViewVC, e.g. from the ViewVC view over the ViewVC source repo:

   http://svn.collab.net/viewvc/viewvc/

Steve


Re: Searching Subversion comments:

Posted by Robert Muir <rc...@gmail.com>.
> Also, in the open source realm:
>
> 3. ViewVC (formerly ViewCVS) has a facility to query revision history, including commit messages.  Apache's instance, which serves Lucene's repository, doesn't expose this functionality, though....
>

I think it does? Do you mean this functionality?
http://svn.apache.org/viewvc?view=revision&revision=833298


-- 
Robert Muir
rcmuir@gmail.com



RE: Searching Subversion comments:

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Erick,

On 03/08/2010 at 3:48 PM, Erick Erickson wrote:
> Is there any convenient way to, say, find all the files associated with
> patch XXXX? I realize one can (hopefully) get this information from
> JIRA, but... This is a subset of the problem of searching Subversion
> comments.

I know of two commercial implementations (probably not what you're after):

1. Atlassian provides SVN access to open source software projects via hosted FishEye instances.  Here's their ASF instance:

  http://fisheye6.atlassian.com/

Mahout has apparently set this up already, and I assume Lucene-java could do the same:

  http://fisheye6.atlassian.com/browse/mahout

You can query comments and just about anything else -- see the "Query" tab.

2. IntelliJ IDEA's "Repository" tab in the "Changes" pane provides a VC system browser (Subversion among I don't know how many others) with a commit message search box. From a revision hit in a commit message search, you can see a tree view of modified files, and then double-click on each of them to see side-by-side colored differences.  Extremely slick.

Also, in the open source realm:

3. ViewVC (formerly ViewCVS) has a facility to query revision history, including commit messages.  Apache's instance, which serves Lucene's repository, doesn't expose this functionality, though....

Steve




Re: Searching Subversion comments:

Posted by N Hira <nh...@cognocys.com>.
I use "svn diff --change <revisionNumber>" to get the list of files associated with a given commit.
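
For the broader question of which files were touched by commits that mention a given issue, the commit log itself can be searched. This is a minimal sketch, assuming the issue key appears in the commit message and that the `svn` command-line client is installed. It parses the output of `svn log --verbose --xml`; the names `files_for_issue` and `fetch_log_xml`, the `EXAMPLE-` issue keys, and the sample log are invented for illustration.

```python
import re
import subprocess
import xml.etree.ElementTree as ET


def files_for_issue(log_xml, issue_key):
    """Map revision number -> changed paths for commits whose message mentions issue_key."""
    # The lookarounds keep EXAMPLE-1234 from matching inside EXAMPLE-12345.
    pattern = re.compile(r"(?<![\w-])" + re.escape(issue_key) + r"(?!\d)")
    result = {}
    for entry in ET.fromstring(log_xml).iter("logentry"):
        msg = entry.findtext("msg") or ""
        if pattern.search(msg):
            result[int(entry.get("revision"))] = [p.text for p in entry.iter("path")]
    return result


def fetch_log_xml(repo_url):
    """Run the svn client (needs `svn` on PATH and repository access); returns bytes."""
    return subprocess.run(
        ["svn", "log", "--verbose", "--xml", repo_url],
        check=True,
        capture_output=True,
    ).stdout


# Hypothetical log, shaped like `svn log --verbose --xml` output.
SAMPLE_LOG = """\
<log>
<logentry revision="1002">
<author>alice</author>
<paths>
<path kind="file" action="M">/trunk/src/Foo.java</path>
<path kind="file" action="A">/trunk/src/TestFoo.java</path>
</paths>
<msg>EXAMPLE-1234: fix off-by-one in Foo</msg>
</logentry>
<logentry revision="1001">
<author>bob</author>
<paths>
<path kind="file" action="M">/trunk/src/Bar.java</path>
</paths>
<msg>EXAMPLE-12345: unrelated change</msg>
</logentry>
</log>
"""

if __name__ == "__main__":
    for revision, paths in files_for_issue(SAMPLE_LOG, "EXAMPLE-1234").items():
        print(revision, paths)
```

As noted earlier in the thread, the result is only as good as the commit comments. For a single known revision, `svn log --verbose -r <revisionNumber>` prints the changed paths without the diff bodies.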

You might also want to look at http://freshmeat.net/projects/svnweb/

HTH

-h

