Posted to user@nutch.apache.org by Joseph Naegele <jn...@grierforensics.com> on 2016/04/26 15:59:37 UTC

Plugin name significant when dependent on other plugins

Here's an odd one (Nutch 1.11):

I haven't tested this with other extension points, but if you extend or
depend on the "protocol-http" plugin in a new plugin, the name of the new
plugin determines which plugin ProtocolFactory loads for fetching.

In other words:

Create a plugin "protocol-httpfoo" that is dependent on "protocol-http" to
do the heavy lifting. Its plugin.xml contains this section (note the
additional dependency):

   <requires>
      <import plugin="nutch-extensionpoints"/>
      <import plugin="lib-http"/>
      <import plugin="protocol-http"/>
   </requires>

Now, in nutch-site.xml, specify *only* "protocol-httpfoo" in
plugin.includes.
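
As a rough sketch of that override: plugin.includes is a regular
expression over plugin directory names, so the entry in nutch-site.xml
might look like the fragment below. The value lists only the new plugin,
as described above; a real setup would normally join further plugins
(parsers, indexers, and so on) with "|", and those are omitted here rather
than guessed.

```xml
<!-- conf/nutch-site.xml (sketch): overrides the default plugin.includes -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Only the new plugin is named; "protocol-http" is deliberately
         absent and gets pulled in solely via the <requires> section of
         protocol-httpfoo's plugin.xml. -->
    <value>protocol-httpfoo</value>
  </property>
</configuration>
```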

Then just run "bin/nutch parsechecker ..." to test. You'll see that
"protocol-http" is used rather than the "foo" version. If you rename the
plugin to "protocol-http-foo", the "foo" version is used as expected.

Seems like a bug to me, but also pretty obscure. I can file a ticket if
suggested.

Joe


Re: Plugin name significant when dependent on other plugins

Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi Joseph,

Yes, that sounds strange. Please open a Jira ticket
and, if possible, attach the code (a dummy plugin is fine)
and the config files needed to reproduce the problem.

Thanks,
Sebastian
