Posted to users@subversion.apache.org by Brian Brophy <br...@gmail.com> on 2010/12/03 16:07:25 UTC

Searching A SVN Repo

I realize this is not directly related to SVN itself; however, I am 
hoping this community may have some suggestions.  If there is a more 
appropriate forum I should be engaging, please let me know.

We have a repo with over 125 GB of data, containing everything from 
source code to requirements documents, etc.  The repo is accessible via 
https.  We'd like to be able to search the contents of the repo.  A use
case would be taking a phrase of interest and finding every place in
the repo where that text occurs (i.e., within the actual source code or
documents).

I have considered pointing something like a search appliance at the 
https interface and letting it crawl/spider/index the data.  That could 
be one option.

And yes, one could check out/update the repo to search it, but at 125 GB
and growing, that is a cumbersome way to run many client-side searches.
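
For illustration, a middle ground between a full checkout and a crawler
could be sketched in Python as below.  It lists the tree with
`svn list -R` and streams each file with `svn cat`, so no working copy
is created.  The names `svn`, `grep_repo`, and the `run` parameter are
made up for this sketch, and it assumes the `svn` command-line client is
installed and the paths contain no `@` characters.  It issues one
request per file, so at this repo size it would be slow and is no
substitute for a real index.

```python
import subprocess


def svn(*args):
    """Run the svn command-line client and return its standard output as text."""
    result = subprocess.run(
        ["svn", *args],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # binary files decode lossily instead of raising
        check=True,
    )
    return result.stdout


def grep_repo(url, phrase, run=svn):
    """Yield (path, line_number, line) for each line under url containing phrase.

    `run` is a hypothetical hook so the svn call can be swapped out in tests.
    """
    base = url.rstrip("/")
    for path in run("list", "-R", url).splitlines():
        if path.endswith("/"):  # `svn list` marks directories with a trailing slash
            continue
        content = run("cat", f"{base}/{path}")
        for number, line in enumerate(content.splitlines(), start=1):
            if phrase in line:
                yield path, number, line.strip()
```

Calling `grep_repo("https://svn.example.org/repo/trunk", "some text")`
(a placeholder URL) would yield one tuple per matching line in the HEAD
revision.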

Would anyone have any other options?  Has anyone done something similar?

Thank you,
Brian

Re: Searching A SVN Repo

Posted by Karl Heinz Marbaise <kh...@gmx.de>.
Hi there,

Maybe it's worth taking a look at:

http://supose.org/wiki/supose

It is written entirely in Java.  It scans not only the repository itself
but also the content of PDF, Word, and Excel documents, etc.

It has been tested with larger repos (the Apache Software Foundation
repo, about 32 GiB).

If you have any questions, feature requests, or bug reports, don't
hesitate to contact me.

Kind regards
Karl Heinz Marbaise
-- 
SoftwareEntwicklung Beratung Schulung    Tel.: +49 (0) 2405 / 415 893
Dipl.Ing.(FH) Karl Heinz Marbaise        ICQ#: 135949029
Hauptstrasse 177                         USt.IdNr: DE191347579
52146 Würselen                           http://www.soebes.de

RES: RES: Searching A SVN Repo

Posted by Luiz Guilherme Kimel <lk...@dba.com.br>.
The project is very interesting and well structured. It might attract more
contributions if someone volunteered to port it to Java or another free and
open platform. In the meantime, Visual Studio Express is free to use and
sufficient for customizing the SVNQuery project if necessary.

Its search database is based on the Apache Lucene project, which was
originally written in Java and ported to .NET in a separate project. The
ASP.NET client is VERY simple, which means you can write your own search
client in Java with minimal effort and meet your customization goals, if
you have any.
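
To give a rough idea of what such a search database does, here is a
minimal inverted-index sketch in Python.  It is not Lucene's or
SVNQuery's API; the function names and the sample paths are invented for
illustration.  It maps each lowercased word to the set of paths that
contain it and answers a query by intersecting those sets.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(documents):
    """Map each token to the set of paths whose content contains it."""
    index = defaultdict(set)
    for path, content in documents.items():
        for token in tokenize(content):
            index[token].add(path)
    return index


def search(index, query):
    """Return the sorted paths that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical sample data standing in for file contents from a repo.
docs = {
    "trunk/src/Parser.java": "class Parser { void parseRequirement() {} }",
    "trunk/docs/requirements.txt": "The parser shall accept UTF-8 input.",
}
index = build_index(docs)
print(search(index, "parser"))
```

A real indexer would also store revision numbers and extract text from
binary formats; this sketch covers only the word-to-path lookup.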



-----Original message-----
From: Richard England [mailto:rlengland@gmail.com]
Sent: Monday, December 6, 2010 03:38
To: users@subversion.apache.org
Subject: Re: RES: Searching A SVN Repo

Sounded promising until I hit C# and asp.net.


On 12/03/2010 09:19 AM, Luiz Guilherme Kimel wrote:
> Try SVNQuery
>
> http://svnquery.tigris.org/
>
> ;-)


Re: RES: Searching A SVN Repo

Posted by Richard England <rl...@gmail.com>.
Sounded promising until I hit C# and asp.net.


On 12/03/2010 09:19 AM, Luiz Guilherme Kimel wrote:
> Try SVNQuery
>
> http://svnquery.tigris.org/
>
> ;-)

RES: Searching A SVN Repo

Posted by Luiz Guilherme Kimel <lk...@dba.com.br>.
Try SVNQuery

http://svnquery.tigris.org/

;-)

Re: Searching A SVN Repo

Posted by Les Mikesell <le...@gmail.com>.
The commercial FishEye product does this:
http://www.atlassian.com/software/fisheye/ and it has some other
features, but it's not free.

You might be able to build some kind of search with ht://Dig, perhaps
pointing it at ViewVC instead of the Subversion view.

If you are mostly interested in the HEAD revision and mostly in code, you
can point OpenGrok:
http://hub.opensolaris.org/bin/view/Project+opengrok/WebHome
at a checked-out copy of a project, and it will give you a search with
linked cross-references between code definitions and references, plus
some support for accessing the version log.

-- 
   Les Mikesell
    lesmikesell@gmail.com
