Posted to users@subversion.apache.org by "Bicking, David (HHoldings, IT)" <Da...@thehartford.com> on 2008/02/06 16:54:31 UTC

Seeking a way to do a full text search in a repository

I'm forwarding a request from a coworker who currently has read-only
access to this list.  See below, and I copied him in the post so a
reply-all will get to him directly.

Thanks,
David

---------------------------------------

Hello,

We are starting to implement Subversion in our organization.  Our users
have been asking me for a particular feature, and so far my searching
has been in vain.  The feature is to be able to do a full text search of
the content of a repository, but without having to download the
repository to disk.  Ideally, this would allow for either searching the
HEAD revision, a specified revision, or a range of revisions.  Is there
a client out there that allows for repository searching in this manner?
Note that we are on a Windows platform, so the tool would have to
support Windows.

Thanks,

~ Justin


*************************************************************************
This communication, including attachments, is
for the exclusive use of addressee and may contain proprietary,
confidential and/or privileged information.  If you are not the intended
recipient, any use, copying, disclosure, dissemination or distribution is
strictly prohibited.  If you are not the intended recipient, please notify
the sender immediately by return e-mail, delete this communication and
destroy all copies.
*************************************************************************


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org


Re: Seeking a way to do a full text search in a repository

Posted by Blair Zajac <bl...@orcaware.com>.
This isn't about your question, just a note on posting to this list: start
a fresh post rather than replying to an existing one:

   http://subversion.tigris.org/mailing-list-guidelines.html#fresh-post

Not a big deal, just letting you know for next time.

Regards,
Blair


RE: RE: Seeking a way to do a full text search in a repository

Posted by "Reedick, Andrew" <jr...@ATT.COM>.
> -----Original Message-----
> From: Reedick, Andrew
> Sent: Wednesday, February 06, 2008 1:09 PM
> To: Bicking, David (HHoldings, IT); users@subversion.tigris.org
> Cc: Kohlhepp, Justin (HHoldings, IT)
> Subject: RE: RE: Seeking a way to do a full text search in a repository
> 
 


The previous code contained a bug that generated too many 'svn cat's.
Use this instead.

import sys
import xml.etree.ElementTree as ET

# Read the output of 'svn log --xml -v' from stdin.
log = ET.parse(sys.stdin)
#log = ET.parse('data.xml')

for logentry in log.findall('logentry'):
    # Only look at the paths that belong to this log entry.
    for path in logentry.findall('paths/path'):
        # we only want to cat added or modified files.  No point cat'ing deleted ones
        if path.attrib['action'] in ('M', 'A'):
            print('svn cat -r %s "{REPOS}%s" | findstr {REGEX}' %
                  (logentry.attrib['revision'], path.text))
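
The generated batch file can also be skipped entirely. What follows is a
minimal sketch, not the script from this thread: it assumes Python 3.7 or
later and an 'svn' client on the PATH, and the function names
(changed_files, cat_command, search) are made up for illustration. It
parses the same 'svn log --xml -v' output, runs 'svn cat' for each added
or modified path, and matches lines with Python's re module instead of
findstr. The URL is pegged at the same revision (the @REV suffix) so that
a file deleted in a later revision can still be read.

```python
import re
import subprocess
import xml.etree.ElementTree as ET


def changed_files(log_xml):
    """Return (revision, path) pairs for added or modified paths
    found in the text output of 'svn log --xml -v'."""
    root = ET.fromstring(log_xml)
    pairs = []
    for logentry in root.findall('logentry'):
        for path in logentry.findall('paths/path'):
            if path.attrib['action'] in ('M', 'A'):
                pairs.append((logentry.attrib['revision'], path.text))
    return pairs


def cat_command(repos_root, revision, path):
    """Build the 'svn cat' argument list, pegging the URL at the revision."""
    url = '%s%s@%s' % (repos_root, path, revision)
    return ['svn', 'cat', '-r', revision, url]


def search(repos_root, log_xml, pattern):
    """Yield (revision, path, line) for every line matching pattern."""
    regex = re.compile(pattern)
    for revision, path in changed_files(log_xml):
        result = subprocess.run(cat_command(repos_root, revision, path),
                                capture_output=True)
        if result.returncode != 0:
            continue  # 'svn cat' failed, e.g. the path is a directory
        text = result.stdout.decode('utf-8', errors='replace')
        for line in text.splitlines():
            if regex.search(line):
                yield revision, path, line
```

For example, with the log XML held in a string log_xml,
list(search('svn://server/repos', log_xml, 'TODO')) would collect every
matching line. Paths that 'svn cat' rejects, such as directories, are
skipped silently, and this is no faster than the batch file since it still
runs one 'svn cat' per changed path.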

*****

The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential, proprietary, and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from all computers. GA625





RE: RE: Seeking a way to do a full text search in a repository

Posted by "Reedick, Andrew" <jr...@ATT.COM>.
> -----Original Message-----
> From: Reedick, Andrew
> Sent: Wednesday, February 06, 2008 12:22 PM
> To: Bicking, David (HHoldings, IT); users@subversion.tigris.org
> Cc: Kohlhepp, Justin (HHoldings, IT)
> Subject: RE: Seeking a way to do a full text search in a repository
> 
> 
> 
> Some script based on running 'svn cat -r' on each file listed by 'svn
> log -v -r' followed by grep/findstr would be my guess.
> 

Meh, since I've had to do this in the past, I might as well automate it.
Here's a crude python script (python is available from
www.activestate.com):

It simply generates a batch file containing one 'svn cat' per file
listed by 'svn log --xml -v'.  It's not smart enough to distinguish dirs
from files, so you'll see 'svn cat' complain about directories.

1.  svn log --xml -v -r 1:999 svn://server/repos/some/where | python svn_search_hack.py > search.bat
2.  Open search.bat in your favorite text editor
2.1     search & replace {REPOS} with 'svn://server/repos'
2.2     search & replace {REGEX} with your findstr search pattern and switches.
3.  search.bat > results.txt
4.  ???
5.  Profit!


svn_search_hack.py
==================
import sys
import xml.etree.ElementTree as ET

log = ET.parse(sys.stdin)
#log = ET.parse('log.xml')

for logentry in log.findall('logentry'):

    # Bug (corrected in the follow-up post): this walks every path in the whole log for each logentry.
    for path in log.findall('logentry/paths/path'):
        # we only want to cat added or modified files.  No point cat'ing deleted ones
        if path.attrib['action'] == 'M' or path.attrib['action'] == 'A':
            print('svn cat -r %s "{REPOS}%s" | findstr {REGEX}' %
                  (logentry.attrib['revision'], path.text))
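
The complaints about directories can sometimes be avoided at the source.
A minimal sketch, assuming the Subversion release in use adds a kind
attribute ("file" or "dir") to each path element of 'svn log --xml -v';
later releases do, but older clients and servers may omit it or leave it
empty, in which case nothing is filtered out. The function name
file_paths is hypothetical.

```python
import xml.etree.ElementTree as ET


def file_paths(log_xml):
    """Return (revision, path) pairs for added or modified paths,
    skipping any path the log explicitly marks as a directory."""
    root = ET.fromstring(log_xml)
    pairs = []
    for logentry in root.findall('logentry'):
        for path in logentry.findall('paths/path'):
            if path.attrib['action'] not in ('M', 'A'):
                continue
            # Assumption: 'kind' is "file", "dir", or empty/missing when unknown.
            if path.attrib.get('kind', '') == 'dir':
                continue
            pairs.append((logentry.attrib['revision'], path.text))
    return pairs
```

Paths with no kind information are kept, so 'svn cat' may still complain
about a few directories when the log comes from an older server.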






RE: Seeking a way to do a full text search in a repository

Posted by "Reedick, Andrew" <jr...@ATT.COM>.
> -----Original Message-----
> From: Bicking, David (HHoldings, IT)
> [mailto:David.Bicking@thehartford.com]
> Sent: Wednesday, February 06, 2008 11:55 AM
> To: users@subversion.tigris.org
> Cc: Kohlhepp, Justin (HHoldings, IT)
> Subject: Seeking a way to do a full text search in a repository
> 
> 


Some script based on running 'svn cat -r' on each file listed by 'svn
log -v -r' followed by grep/findstr would be my guess.
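
For the HEAD-only case from the original question, the log is not needed.
A rough sketch, assuming 'svn list -R --xml URL' reports each item as an
entry element with a kind attribute and a name child, nested as
lists/list/entry (check this layout against your client's real output);
head_files and head_cat_commands are hypothetical names. It only builds
the 'svn cat' commands, whose output would then go through grep/findstr
as described above.

```python
import xml.etree.ElementTree as ET


def head_files(list_xml):
    """Return the names of all file entries in 'svn list -R --xml' output."""
    root = ET.fromstring(list_xml)
    names = []
    # Assumed layout: <lists><list path="URL"><entry kind="file"><name>..</name></entry>
    for entry in root.findall('list/entry'):
        if entry.attrib.get('kind') == 'file':
            names.append(entry.findtext('name'))
    return names


def head_cat_commands(base_url, list_xml):
    """Build one 'svn cat' argument list per file under base_url at HEAD."""
    base = base_url.rstrip('/')
    return [['svn', 'cat', '%s/%s' % (base, name)]
            for name in head_files(list_xml)]
```

Because the listing states the kind of each entry, directories never reach
'svn cat' here.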





Re: Seeking a way to do a full text search in a repository

Posted by Jean-Claude Antonio <jc...@arcetis.com>.
Hi,

We wrote this search engine and administration tool for SVN for our
clients, and you are free to use it. You can test it here:

http://search.voilasvn.com/

Choose the repository "Search demo" and leave the username and password
blank. You can also download it if you wish.
You can now plug in your own file parsers based on file extensions.

JClaude


Re: Seeking a way to do a full text search in a repository

Posted by Stephen Armstrong <St...@nanometrics.ca>.
If using a second system for this search is an option, I'd recommend 
opengrok (http://www.opensolaris.org/os/project/opengrok/). Opengrok is 
a server-based webapp that allows you to search through any source code. 
It's fast, can work with many different programming languages, and any 
methods or variables shown are links to further searches.

It requires a computer running a servlet container (like Tomcat), and 
the computer running opengrok must have a working copy of all the code 
you want to search.

Steve


Re: Seeking a way to do a full text search in a repository

Posted by Johnathan Gifford <jg...@wernervas.com>.
This is a feature that MS-Visual Source Safe has.  So if your group is coming from VSS or had experience with VSS in the past, that is why they are asking.  But with Microsoft phasing out support for Visual Source Safe, many groups are looking for new options, as MS-Team Systems is very expensive and not very lightweight.  So you may be one of the first looking for features similar to the search capability that VSS has.  Keep in mind, Subversion aimed to replace CVS, not VSS.  Now that Subversion is in the 1.x releases, that is starting to change as features from ClearCase and Perforce are being developed.

Currently, there is no built-in option in Subversion to search the files in the repository for a certain pattern.  There are a couple of tools that could help, though.  The first is the method mentioned by Andrew Reedick in another response to your post, but this can be ugly and time-consuming.  Second, I believe there is an enhancement to Trac (trac.edgewall.org) that will allow text searches.  However, if you have a rather large repository (10,000+ revisions), it'll never get through the initial indexing process in Trac, and there is no way to invoke that process from the command line.

If this is something that you would like to see added to Subversion, please create an issue (as an enhancement) in the Subversion issue tracker.  Make sure there is not one already there though!  If you need a second person vouching for this feature, let me know the issue number; I'll be glad to do so and add additional input.  I would not expect this enhancement to come around until version 2.0 or later because the road map for 1.x is pretty much laid out and this is no lightweight enhancement.  I do know that a few of the committers to the project are aware of a need for this feature.

Don't let the lack of this feature keep you from adopting Subversion or lead you to abandon it.  While this feature may be lacking, there are plenty of features that'll make the lives of your developers much better than what they had before.

Hope this helps,

Johnathan



Re: Seeking a way to do a full text search in a repository

Posted by Troy Bull <tr...@gmail.com>.

In my last job I set up "fisheye"; I think it does just what you are
looking for (and probably more).  We don't have fisheye at my current
place of employment, so I can't really check it right now.

Thanks
Troy
