Posted to dev@allura.apache.org by Ingo Hornberger <in...@gmx.net> on 2018/05/23 18:01:44 UTC

Indexing source code

Hi folks!

I have already searched through the code quite a bit, but to be honest I don't
yet fully understand the Allura concept for managing the solr index.

My goal is to be able to search for code and markdown files in
repositories. The global search should simply also match source code.

Does anybody have an idea how to implement that?

BR,
Ingo

Re: Indexing source code

Posted by Dave Brondsema <da...@brondsema.net>.

Hi Ingo!

Here are the key parts:

SearchIndexable is a mixin class that everything indexed into solr should use.
Examples are Users, Projects, and all the Artifact child classes (wiki pages,
comments, etc.).  It has an index() method with a docstring that explains it,
and you can look at the existing index() implementations for reference.

All the things that use the SearchIndexable mixin currently are models that get
saved to mongo, via ming.  That happens automatically through ming extensions:
IndexerSessionExtension (for users & projects) and ArtifactSessionExtension (for
artifacts).  Those run automatically whenever models are saved with a change.
They end up calling add_artifacts() and similar tasks in the
allura.tasks.index_tasks module, which in turn call the index() methods and
save the results to solr.
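
As a rough illustration of that flow (save, then task, then index(), then
solr), here is a self-contained sketch.  It is not Allura's code: FakeSolr,
IndexableMixin, WikiPage, Session and add_to_index are all stand-in names made
up for this example, and the field names in the index document are assumptions.

```python
# Stand-in sketch of the save -> task -> index() -> solr flow.
# None of these classes are Allura's real implementations.


class FakeSolr:
    """Hypothetical in-memory stand-in for a solr client."""

    def __init__(self):
        self.docs = {}

    def add(self, docs):
        # Documents with the same id replace each other.
        for doc in docs:
            self.docs[doc["id"]] = doc


class IndexableMixin:
    """Plays the role of SearchIndexable: each object describes itself."""

    def index(self):
        raise NotImplementedError


class WikiPage(IndexableMixin):
    def __init__(self, page_id, title, text):
        self.page_id = page_id
        self.title = title
        self.text = text

    def index(self):
        # Field names are illustrative, not Allura's real schema.
        return {
            "id": f"WikiPage#{self.page_id}",
            "title": self.title,
            "text": self.text,
        }


def add_to_index(solr, objects):
    """Plays the role of an index task: call index() and send docs to solr."""
    solr.add([obj.index() for obj in objects])


class Session:
    """Plays the role of a session with an indexing extension attached."""

    def __init__(self, solr):
        self.solr = solr
        self.dirty = []

    def save(self, obj):
        self.dirty.append(obj)

    def flush(self):
        # A real extension would queue a background task at this point;
        # the sketch calls the indexing function directly.
        pending, self.dirty = self.dirty, []
        add_to_index(self.solr, pending)


solr = FakeSolr()
session = Session(solr)
session.save(WikiPage(1, "Home", "Welcome to the wiki"))
session.flush()
```

The point of the split is that the model only knows how to describe itself
(index()), while the hook decides when indexing happens.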

For files in code repositories, the ming extensions won't work, since the files
in code repos aren't Ming models stored in mongo.  But you can still make a
class that uses the SearchIndexable mixin, and still use tasks similar to those
in allura.tasks.index_tasks to do the indexing.  You'll just need to set up
something different to go through the repos or respond to new commits (e.g.
allura.model.repo_refresh:refresh_repo) and trigger the index task.
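
A minimal sketch of what that could look like, under stated assumptions:
RepoFileDoc, should_index, index_commit and INDEXED_EXTENSIONS are hypothetical
names, the "_s"/"_t" field suffixes follow the common solr dynamic-field
convention rather than Allura's actual schema, and add_docs stands in for
whatever task ends up sending documents to solr.

```python
import os

# Hypothetical whitelist of file types worth indexing (code and markdown).
INDEXED_EXTENSIONS = {".py", ".md", ".txt", ".js", ".c", ".h"}


class RepoFileDoc:
    """Hypothetical indexable wrapper around one file in a repository."""

    def __init__(self, repo_name, commit_id, path, content):
        self.repo_name = repo_name
        self.commit_id = commit_id
        self.path = path
        self.content = content

    def index_id(self):
        # One document per (repo, path), so a newer commit overwrites the
        # older document for the same file (assuming "id" is the unique key).
        return f"{self.repo_name}:{self.path}"

    def index(self):
        return {
            "id": self.index_id(),
            "type_s": "RepoFile",
            "repo_s": self.repo_name,
            "commit_s": self.commit_id,
            "path_s": self.path,
            "text_t": self.content,
        }


def should_index(path):
    """Return True for files whose extension is in the whitelist."""
    return os.path.splitext(path)[1].lower() in INDEXED_EXTENSIONS


def index_commit(repo_name, commit_id, changed_files, add_docs):
    """Build index documents for the changed files of one commit.

    changed_files maps path -> text content; add_docs is any callable that
    accepts a list of documents.  Returns the number of documents sent.
    """
    docs = [
        RepoFileDoc(repo_name, commit_id, path, content).index()
        for path, content in changed_files.items()
        if should_index(path)
    ]
    if docs:
        add_docs(docs)
    return len(docs)


# Usage: collect the documents in a list instead of sending them to solr.
sent = []
count = index_commit(
    "myrepo",
    "abc123",
    {"README.md": "# Hello", "logo.png": "", "src/app.py": "print('hi')"},
    sent.extend,
)
```

Something like index_commit would be called from the code that reacts to new
commits, with the commit's changed files as input.  Deleted files would need
separate handling, which the sketch leaves out.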

Hope that helps point you in a good direction.  It would be a great feature indeed!


-- 
Dave Brondsema : dave@brondsema.net
http://www.brondsema.net : personal
http://www.splike.com : programming
              <><