Posted to user@nutch.apache.org by Ake Tangkananond <ia...@gmail.com> on 2012/08/03 13:49:39 UTC
Nutch 2 plugin implementation ClassNotFoundException
Hello,
I have a question about the Nutch 2 plugin implementation.
I am implementing an image parser. It used to work fine in Nutch 1.5, but
after I migrated the code to Nutch 2.0 I get an error. I have spent several
hours on it and have not been able to trace the cause yet. I would
appreciate any insight from the mailing list.
While parsing the fetched content, I got the following error in
logs/hadoop.log:
2012-08-03 18:28:25,304 ERROR parse.ParserFactory - PluginRuntimeException
org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: <my plugin class name>
    at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
    at org.apache.nutch.parse.ParserFactory.getFields(ParserFactory.java:209)
    at org.apache.nutch.parse.ParserJob.getFields(ParserJob.java:191)
    at org.apache.nutch.parse.ParserJob.run(ParserJob.java:243)
    at org.apache.nutch.parse.ParserJob.parse(ParserJob.java:257)
    at org.apache.nutch.parse.ParserJob.run(ParserJob.java:300)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.parse.ParserJob.main(ParserJob.java:304)
Caused by: java.lang.ClassNotFoundException: <my plugin class name>
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
    ... 7 more
2012-08-03 18:28:25,654 INFO crawl.SignatureFactory - Using Signature impl: org.apache.nutch.crawl.MD5Signature
What I did was copy the minimal necessary files from other plugin folders
and modify them as needed. Then I edited nutch-site.xml to include my
plugin and edited parse-plugins.xml to register the MIME type. I added
parse-image to the two packageset entries in <nutch-source>/build.xml, added
an ant target under deploy and clean in <nutch-source>/src/plugin/build.xml,
and then rebuilt everything. (This is what I did in Nutch 1.5 and it worked,
but no luck with Nutch 2.)
Could you advise what else I am missing, or what more information I should
provide? Thank you very much!
Regards,
Ake Tangkananond
Re: Nutch 2 plugin implementation ClassNotFoundException
Posted by Ake Tangkananond <ia...@gmail.com>.
Hi All,
I have now been able to fix the problem. Thank you, everyone. The summary of
the problem is as follows:
Problem:
<nutch-source>/build/plugins/<plugin-name-existing>/plugin.xml was
overwritten when I used plugin-name-existing as the id in
<nutch-source>/src/plugin/<plugin-name-new>/plugin.xml:/plugin[@id]. That
was my mistake, but after I corrected it (changing /plugin[@id] to
plugin-name-new),
<nutch-source>/build/plugins/<plugin-name-existing>/plugin.xml was never
re-copied by the build script.
I am not sure whether this is intended.
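A mismatch like this could be caught before building. The following is a minimal sketch, assuming the convention that a plugin's id equals its folder name under src/plugin. The class PluginIdCheck and its methods are hypothetical and not part of Nutch; the demo writes two throwaway plugin.xml files into a temporary directory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class PluginIdCheck {

    /** Reads the id attribute of the root element of a plugin.xml file. */
    static String readPluginId(Path pluginXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(pluginXml.toFile());
        return doc.getDocumentElement().getAttribute("id");
    }

    /** Lists plugin folders whose plugin.xml id differs from the folder name. */
    static List<String> findMismatches(Path pluginRoot) throws Exception {
        List<String> problems = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(pluginRoot)) {
            for (Path dir : dirs) {
                Path manifest = dir.resolve("plugin.xml");
                if (!Files.isRegularFile(manifest)) {
                    continue; // not a plugin folder
                }
                String id = readPluginId(manifest);
                String folder = dir.getFileName().toString();
                if (!folder.equals(id)) {
                    problems.add(folder + " declares id=" + id);
                }
            }
        }
        return problems;
    }

    /** Writes a tiny plugin.xml for the demo (hypothetical content). */
    static void write(Path root, String folder, String id) throws IOException {
        Path dir = Files.createDirectories(root.resolve(folder));
        String xml = "<plugin id=\"" + id + "\" name=\"demo\"/>";
        Files.write(dir.resolve("plugin.xml"), xml.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("src-plugin");
        write(root, "parse-html", "parse-html");
        // The new plugin reuses an existing id by mistake.
        write(root, "parse-image", "parse-html");
        System.out.println(findMismatches(root));
    }
}
```

Pointing findMismatches at <nutch-source>/src/plugin instead of the temporary directory would report any folder whose id was copied from another plugin, under the same naming assumption.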
BTW, I found a whitespace typo in PluginManifestParser.java:187
Current:
LOG.debug("plugin: id=" + id + " name=" + name + " version=" + version
+ " provider=" + providerName + "class=" + pluginClazz);
Correct: (space before class)
LOG.debug("plugin: id=" + id + " name=" + name + " version=" + version
+ " provider=" + providerName + " class=" + pluginClazz);
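To see the effect of the missing space, here is a small standalone sketch of the two concatenations. The class LogSpacingDemo and the sample values are made up for illustration; only the string expressions are taken from the lines quoted above.

```java
class LogSpacingDemo {

    /** Message as currently built: no space before "class=". */
    static String current(String id, String name, String version,
                          String providerName, String pluginClazz) {
        return "plugin: id=" + id + " name=" + name + " version=" + version
                + " provider=" + providerName + "class=" + pluginClazz;
    }

    /** Message with the space restored before "class=". */
    static String corrected(String id, String name, String version,
                            String providerName, String pluginClazz) {
        return "plugin: id=" + id + " name=" + name + " version=" + version
                + " provider=" + providerName + " class=" + pluginClazz;
    }

    public static void main(String[] args) {
        // Hypothetical values, chosen only to make the difference visible.
        System.out.println(current("parse-image", "demo", "1.0",
                "example.org", "org.example.ImageParser"));
        System.out.println(corrected("parse-image", "demo", "1.0",
                "example.org", "org.example.ImageParser"));
    }
}
```

In the first line of output the provider value runs straight into "class=", which makes the debug line harder to read or to split on whitespace.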
Regards,
Ake Tangkananond
Re: Nutch 2 plugin implementation ClassNotFoundException
Posted by Ake Tangkananond <ia...@gmail.com>.
Hello,
Thank you for the very quick reply. Yes, I run it in local mode, and my
plugin's plugin.xml and parse-image.jar are present in
runtime/local/plugins.
I have just found the root cause. Here is how I found it:
I inserted the following code at PluginDescriptor.java line 288 to print out
all the library lookup paths:
System.out.println(java.util.Arrays.toString(urls));
And I see a problem here: parse-image.jar is being looked up under the
parse-html directory:
[file:/usr/local/apache-nutch-2.0.0-source/runtime/local/plugins/parse-html/parse-image.jar]
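The same check can be reproduced outside Nutch. The following is a minimal sketch, not Nutch code: the class ClasspathProbe, the helper missingEntries, and the temporary directory layout are assumptions made for illustration. It prints a URLClassLoader's search path in the same way and flags file: entries that do not exist on disk, assuming that is what makes the class lookup fail.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClasspathProbe {

    /** Returns the file: URLs on the loader's search path that do not exist on disk. */
    static List<URL> missingEntries(URLClassLoader loader) throws Exception {
        List<URL> missing = new ArrayList<>();
        for (URL url : loader.getURLs()) {
            if ("file".equals(url.getProtocol())
                    && !Files.exists(Paths.get(url.toURI()))) {
                missing.add(url);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical layout: the jar is listed under the wrong plugin folder,
        // so the path handed to the class loader points at nothing.
        Path root = Files.createTempDirectory("plugins");
        Path wrong = root.resolve("parse-html").resolve("parse-image.jar");

        try (URLClassLoader loader =
                     new URLClassLoader(new URL[] {wrong.toUri().toURL()})) {
            // Same kind of output as the println added to PluginDescriptor.
            System.out.println(Arrays.toString(loader.getURLs()));
            System.out.println("missing: " + missingEntries(loader));
        }
    }
}
```

A URLClassLoader does not complain about nonexistent entries when it is created; the failure only surfaces later as a ClassNotFoundException, which is why printing the URLs is a useful first step.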
I am figuring out how to fix it gracefully, but if anyone knows the right
place to fix it, please shed some light. xD
BTW, I'm using IntelliJ IDEA but I don't know how to configure it with the
Ivy project. It would be great if someone could give me a hand at iamake at
gmail dot com ;-)
Regards,
Ake Tangkananond
Re: Nutch 2 plugin implementation ClassNotFoundException
Posted by Ferdy Galema <fe...@kalooga.com>.
Hi,
Some quick pointers: Do you run it in local mode? Are your plugin's
plugin.xml and parse-image.jar present in runtime/local/plugins after you
build it? Do you use external libraries?
Ferdy.