Posted to user@nutch.apache.org by ci...@bloglines.com on 2005/12/05 21:40:17 UTC

test/extending nutch

Apologies if this email was sent out before. I seem to be having some trouble
sending messages to the list.

I'm getting my hands (and head) around Lucene and Nutch. I recently purchased
"Lucene in Action" and worked through some of the basic examples to get a feel
for how all the parts work.

I'm more interested in using and extending Nutch, as its basic feature set
seems to include more of the functionality I'm interested in.

I want to use Nutch to index our publicly available website, and to extend it
so that it applies some basic rules when either indexing or searching, such as
weighting different types of content so that certain pieces show up more often.

I figure I could achieve some of that either by setting boost values when
reading in the page and indexing (some of the relevant info is available as
metadata), or by extending the IndexWriter to store additional field values
that I can use to weight searches after indexing.
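
To make the first idea concrete, here is a rough sketch of the kind of rule I
have in mind: look up a content type in the page metadata and turn it into a
boost factor. It is plain Java with no Lucene or Nutch dependency. The class
name, the "pageType" metadata key, and the types and weights are all
hypothetical, made up for illustration. My assumption is that at indexing time
a factor like this would be handed to Lucene's document boost
(Document.setBoost), but I don't know where Nutch would expect that to happen.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ContentTypeBoost {
    // Hypothetical metadata key; the real name depends on how the pages are tagged.
    static final String TYPE_KEY = "pageType";

    // Neutral boost used when a page has no type or an unrecognized one.
    static final float DEFAULT_BOOST = 1.0f;

    // Hypothetical content types and weights, chosen only for illustration.
    static final Map<String, Float> BOOSTS = new HashMap<>();
    static {
        BOOSTS.put("product", 2.0f);
        BOOSTS.put("news", 1.5f);
        BOOSTS.put("archive", 0.5f);
    }

    /** Returns the boost for a page, or DEFAULT_BOOST if its type is missing or unknown. */
    static float boostFor(Map<String, String> metadata) {
        String type = metadata.get(TYPE_KEY);
        if (type == null) {
            return DEFAULT_BOOST;
        }
        // Normalize so that "Product " and "product" map to the same rule.
        Float boost = BOOSTS.get(type.trim().toLowerCase(Locale.ROOT));
        return boost != null ? boost : DEFAULT_BOOST;
    }

    /** Applies the same rule at search time by scaling a raw relevance score. */
    static float weightedScore(float rawScore, Map<String, String> metadata) {
        return rawScore * boostFor(metadata);
    }

    public static void main(String[] args) {
        Map<String, String> page = new HashMap<>();
        page.put(TYPE_KEY, "product");
        System.out.println("boost = " + boostFor(page));
        System.out.println("weighted score = " + weightedScore(0.4f, page));
    }
}
```

The same lookup could serve either approach: applied once while indexing, or
applied to raw scores after a search if the content type is stored as an
additional field.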

I'm not very familiar with Nutch's basic and plugin architecture, so any
pointers on how I might achieve this would be greatly appreciated.

Thanks in advance for any help that can be provided,

-a