Posted to dev@nutch.apache.org by "Agnieszka Zbrzezny (JIRA)" <ji...@apache.org> on 2009/04/06 14:06:12 UTC

[jira] Commented: (NUTCH-386) Plugin to index categories by url rules

    [ https://issues.apache.org/jira/browse/NUTCH-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696051#action_12696051 ] 

Agnieszka Zbrzezny commented on NUTCH-386:
------------------------------------------

Hello,
I'm trying to add your plugin to Nutch 1.0. After running bin/nutch crawl urls/ -dir crawl -depth 3, hadoop.log shows:

2009-04-06 09:09:54,128 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2009-04-06 09:09:54,145 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.more.MoreIndexingFilter
2009-04-06 09:09:54,147 INFO  indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2009-04-06 09:09:54,167 INFO  indexer.IndexingFilters - Adding org.b2b.nutch.indexer.UrlCategoryIndexFilter
2009-04-06 09:09:54,168 WARN  mapred.LocalJobRunner - job_local_0016
java.lang.AbstractMethodError
        at org.apache.nutch.indexer.IndexingFilters.<init>(IndexingFilters.java:73)
        at org.apache.nutch.indexer.IndexerMapReduce.configure(IndexerMapReduce.java:61)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)

I use org.apache.hadoop.io.UTF8, and I have included the plugin in nutch-site.xml.

Thanks in advance,
Agnieszka

> Plugin to index categories by url rules
> ---------------------------------------
>
>                 Key: NUTCH-386
>                 URL: https://issues.apache.org/jira/browse/NUTCH-386
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer, searcher
>            Reporter: Ernesto De Santis
>            Priority: Minor
>         Attachments: index-url-category-0.1.zip, index-url-category.jar
>
>
> The compressed zip has an install_notes.txt file with instructions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.