Posted to dev@nutch.apache.org by "Sami Siren (JIRA)" <ji...@apache.org> on 2007/01/06 11:36:27 UTC
[jira] Assigned: (NUTCH-422) index-extra plugin creates additional fields in the index, based on configurable logic
[ https://issues.apache.org/jira/browse/NUTCH-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sami Siren reassigned NUTCH-422:
--------------------------------
Assignee: Sami Siren
> index-extra plugin creates additional fields in the index, based on configurable logic
> --------------------------------------------------------------------------------------
>
> Key: NUTCH-422
> URL: https://issues.apache.org/jira/browse/NUTCH-422
> Project: Nutch
> Issue Type: New Feature
> Components: indexer
> Affects Versions: 0.8.1
> Environment: All environments
> Reporter: Alan Tanaman
> Assigned To: Sami Siren
> Attachments: index-extra-v1.0-bin-java1.5.zip, index-extra-v1.0-source.zip
>
>
> Extract from the Readme file:
> A. Introduction
> The index-extra plugin allows you to configure additional fields that you wish to be added to the index, based on one of the following sources:
> - The parsed text
> - Metadata fields
> - Previously created fields of the document to be indexed
> - A plain constant string
> - A Java expression that combines one or more of the above and resolves to a string
> A regex can also be applied to any of the above, allowing fields to be created based on patterns extracted from the source.
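As a rough illustration of the regex option described above, the following minimal sketch derives a field value from a source string with java.util.regex. The class name, method name, sample text, and pattern are all hypothetical and are not taken from the plugin's source; the plugin's actual configuration format lives in index-extra-conf.xml and is not shown here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: not the plugin's actual code.
public class RegexFieldSketch {

    // Returns the first capture group of the first match (or the whole
    // match if the regex has no groups), or null when nothing matches.
    static String extractField(String source, String regex) {
        Matcher m = Pattern.compile(regex).matcher(source);
        if (!m.find()) {
            return null;
        }
        return m.groupCount() >= 1 ? m.group(1) : m.group();
    }

    public static void main(String[] args) {
        // Assumed sample input standing in for the parsed text of a page.
        String parsedText = "Order ref: ABC-1234 shipped on 2007-01-06";
        String value = extractField(parsedText, "ref: ([A-Z]+-\\d+)");
        System.out.println(value);
    }
}
```

Under this sketch, a source that does not match the pattern yields no value, so no extra field would be created for that document.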
> B. Installation
> 1) Binaries only: Copy the 'index-extra' folder within index-extra-v1.0-bin-java1.5.zip to NUTCHDIR/build
> Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure it
> Enable the plugin by updating the nutch-site.xml file
> 2) Source code: Always refer to the Nutch wiki for detailed instructions on building Nutch. In short:
> Copy the 'index-extra' folder within index-extra-v1.0-source.zip to NUTCHDIR/src/plugin
> Update the build.xml in NUTCHDIR/src/plugin to include the plugin
> Update the NUTCHDIR/default.properties file to include the plugin
> Run ant to build
> Copy the 'index-extra-conf.xml' file to NUTCHDIR/conf, and configure it
> Enable the plugin by updating the nutch-site.xml file
> C. Known Issues
> 1) For this plugin to work correctly on any document field, the other index filters must run before it,
> so that all basic document fields have already been generated. To do this, configure the indexingfilter.order
> property. (See patch NUTCH-421, which enables the indexingfilter.order property. Without this patch,
> the plugin still works, but it cannot use document fields created by other index filter plugins.)
> 2) At this stage, field boost cannot be used, because Nutch scoring overrides the field boost with its own
> document-level boost calculation. This occurs at the end of org.apache.nutch.indexer.Indexer's reduce method.
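To make known issue 1 more concrete, here is a small sketch of why filter order matters when one filter reads a field that another creates. It does not use the Nutch IndexingFilter API; the Filter interface, the two filter classes, and the field names are hypothetical stand-ins, and the document is modelled as a plain map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: models filter ordering only, not Nutch's real classes.
public class FilterOrderSketch {

    interface Filter {
        void apply(Map<String, String> doc);
    }

    // Stands in for a basic index filter that creates a "title" field.
    static class BasicFilter implements Filter {
        public void apply(Map<String, String> doc) {
            doc.put("title", "Nutch Home");
        }
    }

    // Stands in for an extra-field filter that derives a field from "title".
    static class ExtraFilter implements Filter {
        public void apply(Map<String, String> doc) {
            String title = doc.get("title");
            if (title != null) {
                doc.put("title_lc", title.toLowerCase(Locale.ROOT));
            }
        }
    }

    // Applies the filters to an empty document in the given order.
    static Map<String, String> run(List<Filter> filters) {
        Map<String, String> doc = new LinkedHashMap<String, String>();
        for (Filter f : filters) {
            f.apply(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Basic filter first: the derived field can be created.
        System.out.println(run(Arrays.<Filter>asList(new BasicFilter(), new ExtraFilter())));
        // Extra filter first: "title" does not exist yet, so nothing is derived.
        System.out.println(run(Arrays.<Filter>asList(new ExtraFilter(), new BasicFilter())));
    }
}
```

In the second ordering the derived field is silently missing, which mirrors the behaviour described when the ordering property is unavailable.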