Posted to user@nutch.apache.org by Sathyam Y <sa...@yahoo.com> on 2008/05/06 23:09:25 UTC
Re: Fwd: Question about adding tags or attributes to indexed info
The easiest way to achieve this is to add a custom query plugin. I added a plugin that parses META tags on HTML pages and adds them to the index, where they can be used in queries.
- Enhance HtmlParseFilter to parse the meta tags and store them in the Document
- Enhance IndexingFilter to add the tags to the index
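As a rough illustration of those two steps, here is a minimal stand-alone sketch in plain Java. It does not use the Nutch plugin interfaces: the class MetaTagSketch, its methods, and the field names "category" and "region" are all hypothetical stand-ins. A real plugin would implement HtmlParseFilter and IndexingFilter instead, and would likely read the tags from the parsed DOM rather than with a regex.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone sketch: this class only mimics the two plugin stages and is
// NOT the Nutch HtmlParseFilter / IndexingFilter API.
class MetaTagSketch {

    // Simplifying assumption: the name attribute comes before content.
    private static final Pattern META = Pattern.compile(
        "<meta\\s+name=[\"']([^\"']+)[\"']\\s+content=[\"']([^\"']*)[\"'][^>]*>",
        Pattern.CASE_INSENSITIVE);

    // Stage 1 (parse time): collect META name/content pairs from the page.
    static Map<String, String> extractMetaTags(String html) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = META.matcher(html);
        while (m.find()) {
            tags.put(m.group(1).toLowerCase(Locale.ROOT), m.group(2));
        }
        return tags;
    }

    // Stage 2 (index time): copy the wanted tags into the document's fields.
    static Map<String, String> addToIndexDocument(Map<String, String> doc,
                                                  Map<String, String> tags,
                                                  String... wanted) {
        for (String name : wanted) {
            String value = tags.get(name);
            if (value != null) {
                doc.put(name, value);
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta name=\"category\" content=\"news\">"
            + "<meta name=\"region\" content=\"nz\">"
            + "</head><body>Hello</body></html>";
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("url", "http://example.com/");
        // Only the "category" tag is copied into the document here.
        addToIndexDocument(doc, extractMetaTags(html), "category");
        System.out.println(doc);
    }
}
```

Keeping extraction and indexing as separate steps mirrors the split between the parse filter and the indexing filter, so the list of tags to index can change without touching the parsing code.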
- Sathyam
Gene Campbell <ge...@gmail.com> wrote:
Hello
I'm new to Nutch.
I want each URL in the index to be optionally tagged with extra
information that a searcher can filter on.
How do I do this in Nutch/Lucene? Some indexing plugin? Which one? Do
I need to create my own plugin? Are there examples anywhere?
Re: Fwd: Question about adding tags or attributes to indexed info
Posted by Sathyam Y <sa...@yahoo.com>.
The following documentation could be very helpful: http://wiki.apache.org/nutch/WritingPluginExample-0.9
AFAIK, there is a limitation in the Nutch source file FieldQueryFilter.java: you have to create a separate plugin for each field you want to query. I was able to work around this by enhancing FieldQueryFilter.java. It is fairly straightforward to modify it to support a list of fields, although that requires a change to the Nutch base code.
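To make the "list of fields" idea concrete, here is a small stand-alone sketch in plain Java. It is not the actual FieldQueryFilter code or its API: the class MultiFieldFilterSketch, its filter method, and the "field:value" clause format are assumptions chosen only to show one filter accepting several configured fields instead of one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-alone sketch; NOT the Nutch FieldQueryFilter class or its API.
class MultiFieldFilterSketch {
    // The fields this single filter is willing to handle.
    private final Set<String> fields;

    MultiFieldFilterSketch(String... fields) {
        this.fields = new LinkedHashSet<>(Arrays.asList(fields));
    }

    // Keep only "field:value" clauses whose field is in the configured list.
    List<String> filter(String query) {
        List<String> accepted = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            int colon = clause.indexOf(':');
            if (colon > 0 && fields.contains(clause.substring(0, colon))) {
                accepted.add(clause);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        MultiFieldFilterSketch f =
            new MultiFieldFilterSketch("category", "region");
        System.out.println(f.filter("nutch category:news region:nz lang:en"));
    }
}
```

With a set of field names held by one filter, adding another searchable tag becomes a configuration change rather than a new plugin.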
- Sathyam