Posted to dev@nutch.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/09/22 10:25:00 UTC

[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers

    [ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176237#comment-16176237 ] 

ASF GitHub Bot commented on NUTCH-2429:
---------------------------------------

HiranChaudhuri opened a new pull request #222: NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
URL: https://github.com/apache/nutch/pull/222
 
 
   This modification allows protocol plugins to bring their own URLStreamHandlers instead of depending on externally installed protocol handlers.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
> -----------------------------------------------------------------------------
>
>                 Key: NUTCH-2429
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2429
>             Project: Nutch
>          Issue Type: Improvement
>          Components: commoncrawl
>    Affects Versions: 1.14
>         Environment: Tested on both Nutch 1.13 and 1.14 in Ubuntu Linux with OpenJDK 1.8.
>            Reporter: Hiran Chaudhuri
>
> While trying to use the protocol-smb plugin (which is not part of the Nutch distribution), I realized that four steps are needed to successfully use a protocol plugin:
> 1 - put the artifact into the plugins directory
> 2 - modify the Nutch configuration files to allow smb:// URLs and add the plugin to the list of loaded plugins
> 3 - extract jcifs.jar and place it on the system classpath
> 4 - run Nutch with the correct system property
> While steps 1 and 2 seem obvious, steps 3 and 4 require knowledge of plugin internals, which does not feel right for Nutch and plugin users. Moreover, jcifs.jar would then exist twice on the classpath, which could cause further problems at runtime.
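
For readers unfamiliar with why steps 3 and 4 are needed at all, here is a minimal Java sketch of the underlying JDK behaviour. It is not the code from the pull request. It assumes the "system property" in step 4 refers to the JDK's global handler lookup (presumably java.protocol.handler.pkgs, which the issue does not name), and it uses a hypothetical DummySmbHandler class as a stand-in for the handler that a plugin such as protocol-smb would ship. The sketch contrasts the JVM-wide lookup done by new URL(String) with passing a bundled handler directly to the URL constructor.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

public final class BundledHandlerSketch {

    /** Hypothetical stand-in for a handler bundled inside a protocol plugin. */
    static final class DummySmbHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) throws IOException {
            // A real plugin would delegate to its SMB library here.
            throw new IOException("sketch only: no real SMB connection");
        }
    }

    /**
     * Returns true if the JVM-wide handler lookup accepts the URL.
     * The result depends on what is installed on the system classpath
     * and on the JVM's system properties.
     */
    static boolean parsesWithGlobalLookup(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    /**
     * Parses the URL with an explicitly supplied handler, so no globally
     * installed handler and no system property is required.
     */
    static URL parseWithBundledHandler(String spec) {
        try {
            return new URL(null, spec, new DummySmbHandler());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("cannot parse " + spec, e);
        }
    }

    public static void main(String[] args) {
        String spec = "smb://fileserver/share/doc.txt";
        System.out.println("global lookup accepts URL: " + parsesWithGlobalLookup(spec));
        URL url = parseWithBundledHandler(spec);
        System.out.println("bundled handler parsed: " + url.getProtocol()
                + " / " + url.getHost() + " / " + url.getPath());
    }
}
```

On a JVM without a globally registered smb handler, the first call would be expected to fail while the second still succeeds, which is roughly the independence from external setup that the issue asks for. How the plugin system actually wires in the bundled handlers is up to the pull request and may differ from this sketch.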



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)