Posted to dev@nutch.apache.org by "Claudiu Chis (JIRA)" <ji...@apache.org> on 2013/08/01 02:31:48 UTC

[jira] [Updated] (NUTCH-1294) IndexClean job with solr implementation.

     [ https://issues.apache.org/jira/browse/NUTCH-1294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claudiu Chis updated NUTCH-1294:
--------------------------------

    Attachment: NUTCH-1294-v3.patch

- no changes to java files
- added logging for IndexCleanerJob
- the patch now applies completely (in v2, the changes to src/bin/nutch and conf/log4j.properties had to be applied manually)
                
> IndexClean job with solr implementation.
> ----------------------------------------
>
>                 Key: NUTCH-1294
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1294
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: nutchgora
>            Reporter: Dan Rosher
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: NUTCH-1294.patch, NUTCH-1294-v2.patch, NUTCH-1294-v3.patch
>
>
> I started by copying and altering the trunk version of SolrClean, though it was inadequate for our needs: we needed to mark particular pages as gone even though they might still be visible on the web. This implementation abstracts the index cleaning process, provides a Solr implementation, and adds a clean index plugin extension that allows others to tailor how pages are removed from their store.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira