Posted to user@nutch.apache.org by Emmanuel <jo...@gmail.com> on 2007/09/10 17:26:17 UTC

ParseResults

I would like to implement a way to filter out the menus on different
websites. I checked the code and tried to understand how the parsing is
done.
I found that a ParseResult is generated by the parser.

What is the purpose of a ParseResult? I don't understand why we would
store several ParseResults. Is there a specific use for that?
Why do we call htmlparsefilter.filter after the first ParseResult has
been created?

I may have missed something; if so, I would appreciate your help.

Re: ParseResults

Posted by Doğacan Güney <do...@gmail.com>.
On 9/10/07, Emmanuel <jo...@gmail.com> wrote:
> I would like to implement a way to filter out the menus on different
> websites. I checked the code and tried to understand how the parsing is
> done.
> I found that a ParseResult is generated by the parser.
>
> What is the purpose of a ParseResult? [...]

You can generate more than one parse for a page. See the feed plugin for
an example (it extracts the individual entries from an RSS file and
stores them separately).
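
To make the "more than one parse per page" idea concrete, here is a
minimal sketch in plain Java. It is not the Nutch ParseResult class or
the feed plugin: the class MultiParseSketch, the method parseFeed and the
toy "url|text" feed format are all made up for illustration. It only
shows one fetched document producing several parses, each stored under
its own URL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the Nutch API: a map from URL to parsed text
// plays the role of a container that holds several parses for one page.
public class MultiParseSketch {

    // Splits a toy feed (one "url|text" entry per line, an assumed format)
    // into one parse per entry, keyed by the entry's URL.
    static Map<String, String> parseFeed(String feedBody) {
        Map<String, String> parses = new LinkedHashMap<>();
        for (String line : feedBody.split("\n")) {
            int sep = line.indexOf('|');
            if (sep < 0) {
                continue; // skip lines that are not "url|text"
            }
            String url = line.substring(0, sep).trim();
            String text = line.substring(sep + 1).trim();
            parses.put(url, text); // each entry is stored separately
        }
        return parses;
    }

    public static void main(String[] args) {
        String feed = "http://example.com/a|First entry\n"
                + "http://example.com/b|Second entry";
        // One input document, two parses.
        for (Map.Entry<String, String> e : parseFeed(feed).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A page that yields a single parse is then just the special case of a
container with one entry.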

> [..] I don't understand why we would
> store several ParseResults. Is there a specific use for that?
> Why do we call htmlparsefilter.filter after the first ParseResult has
> been created?

HtmlParseFilters are plugins that take the parse result and the DOM
object and work on them. For example, parse-js is an HtmlParseFilter
(it is also a parse plugin) that traverses the DOM for script tags and
extracts extra outlinks from them.
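
As a rough illustration of that kind of DOM traversal, here is a sketch
that uses only the standard JDK DOM classes. It is not the parse-js code
and does not use the Nutch HtmlParseFilter interface: the class
ScriptLinkSketch and its methods are hypothetical, it assumes well-formed
XHTML input, and it only collects the src attribute of script elements
rather than links found inside the script text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch, NOT the Nutch API: walk a DOM tree and collect
// links from <script> elements.
public class ScriptLinkSketch {

    // Recursively visits every node and records the src of each <script>.
    static void collectScriptLinks(Node node, List<String> links) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "script".equalsIgnoreCase(node.getNodeName())) {
            // getAttribute returns an empty string when the attribute is absent.
            String src = ((Element) node).getAttribute("src");
            if (!src.isEmpty()) {
                links.add(src);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectScriptLinks(children.item(i), links);
        }
    }

    // Parses well-formed XHTML (an assumption) and returns the script links.
    static List<String> scriptLinks(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
            List<String> links = new ArrayList<>();
            collectScriptLinks(doc, links);
            return links;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><head><script src=\"/js/menu.js\"></script></head>"
                + "<body><p>text</p><script>var x = 1;</script></body></html>";
        System.out.println(scriptLinks(page)); // prints [/js/menu.js]
    }
}
```

The same recursive walk could be pointed at other elements; only the
element name check and the attribute being read would change.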

>
> I may have missed something; if so, I would appreciate your help.
>


-- 
Doğacan Güney