Posted to dev@nutch.apache.org by Mike Schwartz <mf...@gmail.com> on 2007/04/13 21:46:56 UTC

"WritingPluginExample-0.8" by RicardoJMendez

Hi,

I'm interested in building a Nutch plugin, but I am having trouble
getting the example "recommended" plugin to work. I followed all of
the steps in http://wiki.apache.org/nutch/WritingPluginExample-0%2e9
and, after running the top-level ant, confirmed that
build/plugins/recommended contained the plugin.xml and jar file for
the 'recommended' plugin. I then crawled a single page from a local
webserver that contains the test content (with the "recommended"
meta tag) from the example. The page was crawled and indexed, and I
can search for it, but I see no evidence of any rank boosting on the
"explain" search link, and NUTCHDIR/logs/hadoop.log gives no
indication that the recommended filter was loaded by the crawl.

If anyone has suggestions I'd appreciate hearing them.

Also, a couple of things on the example wiki page that I didn't
understand or that looked odd:

1. In the section on "Getting Ant to Compile Your Plugin", it says to
add this line to NUTCHDIR/src/plugin/build.xml:
<ant dir="reccomended" target="deploy" />

The directory name is misspelled: it has an extra "c" and is missing
an "m" (typo).  (I fixed my local copy before I ran the crawl; I'm
telling you in case you want to update the wiki. I don't want to edit
it myself until I have actually gotten it working...)

2. In the section on "Getting Nutch to Use Your Plugin" it said to 
add a regex to include the id of the plugin, using the example: 
<value>recommended|protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>

But the <description> just above this part says you need to at least 
include the nutch-extensionpoints plugin (which is not present in 
this line).  I notice from the wiki edit history you used to have the 
nutch-extensionpoints plugin in there and removed it, so I'm not sure 
which way it's supposed to be -- what's correct?

(I tried it both with and without the nutch-extensionpoints and 
neither way worked for me.)
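
To make the question in item 2 concrete, here is a minimal sketch of
what the quoted <value> would and would not select. It ASSUMES that
Nutch includes a plugin only when the regex matches the plugin's whole
id; that matching rule is an assumption, not something confirmed in
this message. The class and method names (PluginIncludesCheck,
isIncluded) are hypothetical and not part of Nutch.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {

    // The include list exactly as quoted from the wiki page.
    static final String WIKI_VALUE =
        "recommended|protocol-http|urlfilter-regex|parse-(text|html|js)"
        + "|index-basic|query-(basic|site|url)|summary-basic|scoring-opic"
        + "|urlnormalizer-(pass|regex|basic)";

    // ASSUMPTION: a plugin counts as included only when the regex
    // matches its entire id (full match, not a substring search).
    static boolean isIncluded(String includesRegex, String pluginId) {
        return Pattern.compile(includesRegex).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        String[] ids = {"recommended", "parse-html", "nutch-extensionpoints"};
        for (String id : ids) {
            System.out.println(id + " included by wiki value: "
                + isIncluded(WIKI_VALUE, id));
        }

        // Same list with nutch-extensionpoints added as one more alternative.
        String withExtensionPoints = "nutch-extensionpoints|" + WIKI_VALUE;
        System.out.println("nutch-extensionpoints included after prepending it: "
            + isIncluded(withExtensionPoints, "nutch-extensionpoints"));
    }
}
```

Under that assumption, the value as written selects "recommended" but
not "nutch-extensionpoints". The sketch only shows what the regex
matches; it does not settle whether Nutch actually requires
nutch-extensionpoints to be listed.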

Thanks
  - Mike Schwartz