Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/12/19 22:35:14 UTC
[jira] [Commented] (NUTCH-1902) Missing nekohtml.jar
[ https://issues.apache.org/jira/browse/NUTCH-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254055#comment-14254055 ]
Sebastian Nagel commented on NUTCH-1902:
----------------------------------------
The nekohtml jar must not be placed in the lib folder. It is a dependency managed by Ivy: it is fetched automatically and then placed in runtime/local/plugins/lib-nekohtml/nekohtml-0.9.5.jar. It lives in the plugins folder because it is a dependency of the lib-nekohtml plugin, which parse-html requires.
Verified with 2.2.1 that the dependency is resolved and installed properly.
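
To check a local build the same way, a small script can look for the jar in the plugin folder named above. This is only a sketch: the function name find_nekohtml_jars and the nutch_home argument are made up for illustration, and it assumes nutch_home points at the root of a Nutch tree that has already been built.

```python
from pathlib import Path
import sys


def find_nekohtml_jars(nutch_home):
    """Return the nekohtml jars found in the lib-nekohtml plugin folder.

    nutch_home is assumed (hypothetically) to be the root of a built
    Nutch tree; the relative path is the one quoted in the comment above.
    """
    plugin_dir = Path(nutch_home) / "runtime" / "local" / "plugins" / "lib-nekohtml"
    if not plugin_dir.is_dir():
        # Plugin folder absent: the build has probably not been run yet.
        return []
    # Match any version, e.g. nekohtml-0.9.5.jar.
    return sorted(plugin_dir.glob("nekohtml-*.jar"))


if __name__ == "__main__":
    home = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_nekohtml_jars(home)
    if jars:
        for jar in jars:
            print("found:", jar)
    else:
        print("no nekohtml jar found under", home)
```

An empty result means the dependency was not resolved into the plugin folder; it does not mean the jar should be copied into lib by hand.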
> Missing nekohtml.jar
> --------------------
>
> Key: NUTCH-1902
> URL: https://issues.apache.org/jira/browse/NUTCH-1902
> Project: Nutch
> Issue Type: Bug
> Components: parser
> Affects Versions: 2.2.1
> Reporter: Cao Manh Dat
> Labels: easyfix
> Fix For: 2.3
>
> Original Estimate: 1m
> Remaining Estimate: 1m
>
> The nekohtml jar is missing from the lib folder of the Nutch release, so Nutch throws an exception when parsing a row.
> Fix this issue by adding nekohtml.jar (it can be downloaded from http://nekohtml.sourceforge.net/) and placing it in the lib folder.