Posted to user@nutch.apache.org by Joseph Naegele <jn...@grierforensics.com> on 2016/04/26 15:59:37 UTC
Plugin name significant when dependent on other plugins
Here's an odd one (Nutch 1.11):
I haven't tested this with other extension points, but if a new plugin
extends or depends on the "protocol-http" plugin, the new plugin's name
affects which plugin ProtocolFactory selects for fetching.
In other words:
Create a plugin "protocol-httpfoo" that is dependent on "protocol-http" to
do the heavy lifting. Its plugin.xml contains this section (note the
additional dependency):
<requires>
  <import plugin="nutch-extensionpoints"/>
  <import plugin="lib-http"/>
  <import plugin="protocol-http"/>
</requires>
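For context, a complete plugin.xml around that section might look roughly like the sketch below. Only the <requires> block comes from this report; the plugin metadata, the jar name, and the org.example.protocol.httpfoo.HttpFoo class are hypothetical placeholders, laid out the way the stock protocol-http descriptor is.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: ids, names, and the implementation class are hypothetical. -->
<plugin id="protocol-httpfoo"
        name="Http Foo Protocol Plug-in"
        version="1.0.0"
        provider-name="example.org">

  <runtime>
    <!-- Assumed jar name; normally matches the plugin id -->
    <library name="protocol-httpfoo.jar">
      <export name="*"/>
    </library>
  </runtime>

  <!-- From the report: note the additional dependency on protocol-http -->
  <requires>
    <import plugin="nutch-extensionpoints"/>
    <import plugin="lib-http"/>
    <import plugin="protocol-http"/>
  </requires>

  <!-- Registers the (hypothetical) class as a handler for the "http" scheme -->
  <extension id="org.example.protocol.httpfoo"
             name="HttpFooProtocol"
             point="org.apache.nutch.protocol.Protocol">
    <implementation id="org.example.protocol.httpfoo.HttpFoo"
                    class="org.example.protocol.httpfoo.HttpFoo">
      <parameter name="protocolName" value="http"/>
    </implementation>
  </extension>
</plugin>
```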
Now, in nutch-site.xml, specify *only* "protocol-httpfoo" in
plugin.includes.
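A minimal sketch of such a nutch-site.xml entry follows. The plugin.includes value is a regular expression over plugin names; the parse-html entry is an assumption added so that parsechecker has a parser to use, and a real configuration would likely list more plugins.

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- protocol-httpfoo is the only protocol plugin listed;
         parse-html is an assumed extra, not taken from the report -->
    <value>protocol-httpfoo|parse-html</value>
  </property>
</configuration>
```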
Then just run "bin/nutch parsechecker ..." to test. You'll see
"protocol-http" is used rather than the "foo" version. If you rename the
plugin to "protocol-http-foo", it'll work.
Seems like a bug to me, but also pretty obscure. I can file a ticket if
suggested.
Joe
Re: Plugin name significant when dependent on other plugins
Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi Joseph,
yes, that sounds strange. Please open a Jira ticket
and, if possible, attach the code (a dummy plugin would do)
and the config files needed to reproduce the problem.
Thanks,
Sebastian