Posted to user@nutch.apache.org by Bipin Parmar <bi...@yahoo.com> on 2006/08/09 21:47:23 UTC
HTMLParseFilter is not called by ParseSegment (nutch parse command)
Hi,
I have written a plugin implementing the
org.apache.nutch.parse.HtmlParseFilter extension
point. When I execute "fetch", it is called as
expected.
When I execute "fetch -noParsing", it does not get
called. I think this is how it is supposed to work.
However, when I execute "parse", I expected my
HtmlParseFilter plugin to be called, but it is not,
even though the parse of the segment completes
successfully.
Shouldn't "parse" call plugins that implement
HtmlParseFilter?
I use the same nutch-default.xml for both the fetch
and parse commands. I tried changing
parse-plugins.xml by adding my plugin to the
"text/html" content type, but it did not help.
Please help!
Thank you,
Bipin
I am using the nutch-nightly build dated 08/07/2006.
Re: HTMLParseFilter is not called by ParseSegment (nutch parse command)
Posted by Bipin Parmar <bi...@yahoo.com>.
Hi,
Please ignore my earlier question regarding the parse
command and my HtmlParseFilter plugin. It was my
mistake. Plugins implementing HtmlParseFilter are
called during parse.
Thank you,
Bipin
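A note on the configuration mentioned in the original
question: parse-plugins.xml maps content types to
parser plugins, while HtmlParseFilter plugins are
normally activated through the plugin.includes
property. The fragment below is only a sketch of such
an override in nutch-site.xml. The plugin id
"my-htmlfilter" is hypothetical, and the other entries
are illustrative; keep whatever your own
nutch-default.xml already lists.

```xml
<!-- nutch-site.xml: overrides values from nutch-default.xml -->
<property>
  <name>plugin.includes</name>
  <!-- "my-htmlfilter" is a hypothetical plugin id, used here
       as an assumption; the remaining entries are examples -->
  <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|my-htmlfilter</value>
</property>
```

If the same configuration is used for both commands,
the filter should be picked up by "parse" as well as
by "fetch", which matches the resolution above.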