Posted to user@nutch.apache.org by Emmanuel <jo...@gmail.com> on 2007/09/10 17:26:17 UTC
ParseResults
I would like to implement a way to filter the menus on different
websites. I checked the code and tried to understand how the parsing is
done.
I found that we generate a ParseResult based on the parser.
What is the purpose of a ParseResult? I don't understand why we would
store many parses in one ParseResult. Is there a specific use for that?
Why do we call htmlparsefilter.filter after having created a first
ParseResult?
I may have missed something; in that case I would appreciate your help.
Re: ParseResults
Posted by Doğacan Güney <do...@gmail.com>.
On 9/10/07, Emmanuel <jo...@gmail.com> wrote:
> I would like to implement a way to filter the menus on different
> websites. I checked the code and tried to understand how the parsing is
> done.
> I found that we generate a ParseResult based on the parser.
>
> What is the purpose of a ParseResult? [...]
You can generate more than one parse for a page. See the feed plugin for
an example (it extracts the individual entries from an RSS file and
stores them separately).
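To make the idea concrete, here is a minimal sketch of one fetched feed
producing several parses, each stored under its own URL. The class and
method names (FeedParseSketch, SimpleParseResult, SimpleParse, FeedEntry,
parseFeed) are hypothetical stand-ins for illustration only; they are not
the real Nutch classes, and the actual feed plugin does more than this.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; NOT the real Nutch API.
final class FeedParseSketch {

    /** One parsed document, reduced here to a title and some text. */
    static final class SimpleParse {
        final String title;
        final String text;

        SimpleParse(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /** A container that maps each URL to its own parse. */
    static final class SimpleParseResult {
        private final Map<String, SimpleParse> parses = new LinkedHashMap<>();

        void put(String url, SimpleParse parse) {
            parses.put(url, parse);
        }

        SimpleParse get(String url) {
            return parses.get(url);
        }

        int size() {
            return parses.size();
        }

        Map<String, SimpleParse> asMap() {
            return parses;
        }
    }

    /** An already-extracted feed entry (assumed input for this sketch). */
    static final class FeedEntry {
        final String link;
        final String title;
        final String summary;

        FeedEntry(String link, String title, String summary) {
            this.link = link;
            this.title = title;
            this.summary = summary;
        }
    }

    /** Turns one feed into many parses: one per entry, keyed by entry link. */
    static SimpleParseResult parseFeed(List<FeedEntry> entries) {
        SimpleParseResult result = new SimpleParseResult();
        for (FeedEntry entry : entries) {
            result.put(entry.link, new SimpleParse(entry.title, entry.summary));
        }
        return result;
    }

    public static void main(String[] args) {
        List<FeedEntry> entries = Arrays.asList(
            new FeedEntry("http://example.org/post-1", "First post", "Hello"),
            new FeedEntry("http://example.org/post-2", "Second post", "World"));

        SimpleParseResult result = parseFeed(entries);
        for (Map.Entry<String, SimpleParse> e : result.asMap().entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue().title);
        }
    }
}
```

A single-page parser would simply put one entry in such a container, so
the same return type covers both cases.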
> [...] I don't understand why we would
> store many parses in one ParseResult. Is there a specific use for that?
> Why do we call htmlparsefilter.filter after having created a first
> ParseResult?
HtmlParseFilters are plugins that take the parse result and the DOM
object and work on them, which is why they run after the parser has
produced a first ParseResult. For example, parse-js is an HtmlParseFilter
(it is also a parse plugin) that traverses the DOM for script tags and
extracts extra outlinks from them.
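As a rough illustration of that pattern, the sketch below walks a DOM
tree, looks for script elements, and appends what it finds to a list of
outlinks produced earlier. Everything named here (ScriptFilterSketch,
DomFilter, ScriptSrcFilter, parseXhtml) is hypothetical and is not the
Nutch HtmlParseFilter interface. It also only reads the src attribute,
which is a simplification of what a real script-aware plugin would do,
and it assumes well-formed XHTML so that the JDK's XML parser can build
the DOM.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch; NOT the real Nutch HtmlParseFilter API.
final class ScriptFilterSketch {

    /** A filter receives the outlinks found so far plus the DOM root. */
    interface DomFilter {
        List<String> filter(List<String> outlinks, Node root);
    }

    /** Adds the src of every script element as an extra outlink. */
    static final class ScriptSrcFilter implements DomFilter {
        @Override
        public List<String> filter(List<String> outlinks, Node root) {
            List<String> result = new ArrayList<>(outlinks);
            collect(root, result);
            return result;
        }

        // Depth-first walk over the DOM tree.
        private static void collect(Node node, List<String> out) {
            if (node.getNodeType() == Node.ELEMENT_NODE
                    && "script".equalsIgnoreCase(node.getNodeName())) {
                // getAttribute returns "" when the attribute is absent.
                String src = ((Element) node).getAttribute("src");
                if (!src.isEmpty() && !out.contains(src)) {
                    out.add(src);
                }
            }
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                collect(children.item(i), out);
            }
        }
    }

    /** Builds a DOM from well-formed XHTML (an assumption of this sketch). */
    static Document parseXhtml(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head><script src=\"/js/menu.js\"></script></head>"
            + "<body><a href=\"/about\">About</a></body></html>";

        // Pretend the parser already found this link before filters ran.
        List<String> fromParser = Arrays.asList("/about");

        List<String> all =
            new ScriptSrcFilter().filter(fromParser, parseXhtml(page));
        System.out.println(all);
    }
}
```

The filter returns a new list instead of mutating its input, so several
filters can be chained one after another on the same DOM.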
>
> I may have missed something; in that case I would appreciate your help.
>
--
Doğacan Güney