Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/12/19 22:35:14 UTC

[jira] [Commented] (NUTCH-1902) Missing nekohtml.jar

    [ https://issues.apache.org/jira/browse/NUTCH-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254055#comment-14254055 ] 

Sebastian Nagel commented on NUTCH-1902:
----------------------------------------

The nekohtml jar must not be placed in the lib folder. It is a dependency managed by Ivy, which fetches it automatically and places it at runtime/local/plugins/lib-nekohtml/nekohtml-0.9.5.jar. It lives in the plugins folder because it is a dependency of the lib-nekohtml plugin, which parse-html requires.
Verified with 2.2.1 that the dependency is resolved and installed properly.
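
To repeat this check on a local build, a small script along the following lines could be used. It is only a sketch: the function name find_plugin_jars and its parameters are made up for illustration and are not part of Nutch, and it assumes the build output follows the runtime/local/plugins/<plugin>/ layout described above.

```python
from pathlib import Path
import sys


def find_plugin_jars(nutch_home, plugin="lib-nekohtml", prefix="nekohtml"):
    """Return jars named <prefix>*.jar under runtime/local/plugins/<plugin>.

    Hypothetical helper: the directory layout is taken from the comment
    above and is assumed, not read from any Nutch configuration.
    """
    plugin_dir = Path(nutch_home) / "runtime" / "local" / "plugins" / plugin
    if not plugin_dir.is_dir():
        return []
    return sorted(plugin_dir.glob(prefix + "*.jar"))


if __name__ == "__main__":
    # First argument: root of the Nutch checkout (defaults to the current directory).
    home = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_plugin_jars(home)
    if jars:
        for jar in jars:
            print("found:", jar)
    else:
        print("no nekohtml jar found under runtime/local/plugins/lib-nekohtml")
```

An empty result would suggest that the build has not been run yet or that dependency resolution failed. It would not mean that the jar should be copied into the lib folder by hand.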

> Missing nekohtml.jar
> --------------------
>
>                 Key: NUTCH-1902
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1902
>             Project: Nutch
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 2.2.1
>            Reporter: Cao Manh Dat
>              Labels: easyfix
>             Fix For: 2.3
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> The nekohtml jar is missing from the lib folder of the Nutch release, so Nutch throws an exception when it parses a row.
> Fix this issue by downloading nekohtml.jar (available at http://nekohtml.sourceforge.net/) and placing it in the lib folder.


