Posted to user@nutch.apache.org by Ake Tangkananond <ia...@gmail.com> on 2012/08/03 13:49:39 UTC

Nutch 2 plugin implementation ClassNotFoundException

Hello,

I have a question about implementing a plugin for Nutch 2.

I am implementing an image parser. It worked fine in Nutch 1.5, but
after I migrated the code to Nutch 2.0 I started getting errors. I have
spent several hours on them and have not yet been able to trace the cause.
I would appreciate any insight from the mailing list.

While parsing the fetched content, I got the following error in
logs/hadoop.log:
2012-08-03 18:28:25,304 ERROR parse.ParserFactory - PluginRuntimeException
org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: <my plugin class name>
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
        at org.apache.nutch.parse.ParserFactory.getFields(ParserFactory.java:209)
        at org.apache.nutch.parse.ParserJob.getFields(ParserJob.java:191)
        at org.apache.nutch.parse.ParserJob.run(ParserJob.java:243)
        at org.apache.nutch.parse.ParserJob.parse(ParserJob.java:257)
        at org.apache.nutch.parse.ParserJob.run(ParserJob.java:300)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.parse.ParserJob.main(ParserJob.java:304)
Caused by: java.lang.ClassNotFoundException: <my plugin class name>
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
        ... 7 more
2012-08-03 18:28:25,654 INFO  crawl.SignatureFactory - Using Signature impl: org.apache.nutch.crawl.MD5Signature

Here is what I did: I copied the minimal necessary files from other plugin
folders and modified them to suit my needs. Then I edited nutch-site.xml to
include my plugin and edited parse-plugins.xml to register the MIME type. I
added parse-image to the two packageset entries in <nutch-source>/build.xml,
added Ant targets under deploy and clean in
<nutch-source>/src/plugin/build.xml, and rebuilt everything. (These are the
same steps I followed in Nutch 1.5, where they worked, but no luck with
Nutch 2.)

Could you advise what else I might be missing, or what further information
I should provide? Thank you very much!


Regards,
Ake Tangkananond



Re: Nutch 2 plugin implementation ClassNotFoundException

Posted by Ake Tangkananond <ia...@gmail.com>.
Hi All,

I have now fixed the problem. Thank you, everyone. Here is a summary:

Problem:
<nutch-source>/build/plugins/<plugin-name-existing>/plugin.xml was
overwritten because I had used plugin-name-existing as the id in
<nutch-source>/src/plugin/<plugin-name-new>/plugin.xml (the id attribute
of the plugin element). That was my mistake, but after I corrected it
(changing the id to plugin-name-new), the build script never re-copied
<nutch-source>/build/plugins/<plugin-name-existing>/plugin.xml, so the
overwritten copy stayed in place.

Not sure if this is intended.
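
An id mix-up like the one described above can be caught before building with
a small standalone check. The following is only a minimal sketch and is not
part of Nutch: the class PluginIdCheck and its methods are hypothetical names,
and it assumes, based on the report above, that each plugin lives in its own
directory containing a plugin.xml whose root element carries an id attribute
that should equal the directory name.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration; not part of the Nutch code base.
class PluginIdCheck {

    /** Returns the id attribute of the root element of a plugin.xml file. */
    static String readPluginId(Path pluginXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(pluginXml.toFile());
        return doc.getDocumentElement().getAttribute("id");
    }

    /** Lists plugin directories whose plugin.xml id differs from the directory name. */
    static List<String> findMismatches(Path pluginRoot) throws Exception {
        List<String> problems = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(pluginRoot)) {
            for (Path dir : dirs) {
                Path pluginXml = dir.resolve("plugin.xml");
                if (!Files.isRegularFile(pluginXml)) {
                    continue; // not a plugin directory
                }
                String id = readPluginId(pluginXml);
                String dirName = dir.getFileName().toString();
                if (!dirName.equals(id)) {
                    problems.add(dirName + " declares id=" + id);
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default location, taken from the paths quoted in this thread.
        Path root = Paths.get(args.length > 0 ? args[0] : "src/plugin");
        for (String problem : findMismatches(root)) {
            System.out.println("MISMATCH: " + problem);
        }
    }
}
```

Passing the plugin source directory as the only argument prints one line per
directory whose declared id does not match its name. It does not clean up a
build directory that was already overwritten.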

BTW, I found a whitespace typo at PluginManifestParser.java:187.
Current:
LOG.debug("plugin: id=" + id + " name=" + name + " version=" + version
      + " provider=" + providerName + "class=" + pluginClazz);

Corrected (space before class):
LOG.debug("plugin: id=" + id + " name=" + name + " version=" + version
      + " provider=" + providerName + " class=" + pluginClazz);




Regards,
Ake Tangkananond







Re: Nutch 2 plugin implementation ClassNotFoundException

Posted by Ake Tangkananond <ia...@gmail.com>.
Hello,

Thank you for the very quick reply. Yes, I run it in local mode, and my
plugin's plugin.xml and parse-image.jar are present in
runtime/local/plugins.

I have just found the root cause. Here is how I found it:
I inserted the following code at PluginDescriptor.java line 288 to print
all library lookup paths:
	System.out.println(java.util.Arrays.toString(urls));
The output shows the problem (parse-image.jar is being looked up under the
parse-html directory):
	[file:/usr/local/apache-nutch-2.0.0-source/runtime/local/plugins/parse-html/parse-image.jar]
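
The same kind of check can be done without editing the Nutch source. The
sketch below is an illustration only, not Nutch code: ClasspathUrlCheck and
findMissing are hypothetical names, and the helper simply reports which
file: URLs in an array (such as the urls array printed above) do not exist
on disk. It is an assumption that a wrong lookup path shows up as a missing
file; a path that exists but holds the wrong jar would not be flagged.

```java
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Nutch code base.
class ClasspathUrlCheck {

    /** Returns the file: URLs that do not point at an existing file or directory. */
    static List<URL> findMissing(URL[] urls) throws Exception {
        List<URL> missing = new ArrayList<>();
        for (URL url : urls) {
            if (!"file".equals(url.getProtocol())) {
                continue; // only local files can be checked this way
            }
            File target = new File(url.toURI());
            if (!target.exists()) {
                missing.add(url);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Each argument is treated as a local path to a jar or directory.
        URL[] urls = new URL[args.length];
        for (int i = 0; i < args.length; i++) {
            urls[i] = new File(args[i]).toURI().toURL();
        }
        System.out.println("lookup paths: " + Arrays.toString(urls));
        System.out.println("missing:      " + findMissing(urls));
    }
}
```

Calling findMissing on the URL array handed to a class loader gives a quick
list of entries worth inspecting before digging into the loader itself.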

I am still figuring out how to fix it gracefully. If anyone knows the right
place to fix it, please shed some light. xD


BTW, I'm using IntelliJ IDEA, but I don't know how to configure it for this
Ivy-based project. It would be great if someone could give me a hand at
iamake at gmail dot com ;-)



Regards,
Ake Tangkananond






Re: Nutch 2 plugin implementation ClassNotFoundException

Posted by Ferdy Galema <fe...@kalooga.com>.
Hi,

Some quick pointers: Do you run it in local mode? Are your plugin's
plugin.xml and parse-image.jar present in runtime/local/plugins after you
build it? Do you use external libraries?

Ferdy.
