Posted to user@nutch.apache.org by ".: Abhishek :." <ab...@gmail.com> on 2011/02/02 06:48:22 UTC

When does parsing and application of parsing filter happen?

Hi all,

 I would like to know whether parsing and application of parsing filters
happens after the fetch of the pages or during the process of fetching
itself?

Thanks,
Abi

Re: When does parsing and application of parsing filter happen?

Posted by ".: Abhishek :." <ab...@gmail.com>.
Thanks Markus!

On Wed, Feb 2, 2011 at 6:08 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Parsing is a separate job, although depending on the configuration it can run
> together with the fetch job. Check the configuration, but I remember that by
> default all fetched pages are parsed immediately. This is OK for small batches
> but not recommended for larger batches.
>
> > Hi all,
> >
> >  I would like to know whether parsing and application of parsing filters
> > happens after the fetch of the pages or during the process of fetching
> > itself?
> >
> > Thanks,
> > Abi
>

Re: When does parsing and application of parsing filter happen?

Posted by Markus Jelsma <ma...@openindex.io>.
Parsing is a separate job, although depending on the configuration it can run
together with the fetch job. Check the configuration, but I remember that by
default all fetched pages are parsed immediately. This is OK for small batches
but not recommended for larger batches.
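
To make "check the configuration" concrete, here is a minimal sketch that reads a Hadoop-style config file and reports whether parsing is set to run inside the fetch job. It assumes the relevant switch is the fetcher.parse property, as I recall it from nutch-default.xml in Nutch 1.x; the XML content and the helper names are made up for illustration, and the default when the property is absent differs between versions, so the sketch does not guess it.

```python
import xml.etree.ElementTree as ET

# Hypothetical nutch-site.xml override. The property name fetcher.parse is an
# assumption based on Nutch 1.x's nutch-default.xml; verify it in your version.
NUTCH_SITE_XML = """
<configuration>
  <property>
    <name>fetcher.parse</name>
    <value>false</value>
  </property>
</configuration>
"""


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == name:
            return prop.findtext("value", "").strip()
    return None


def parses_during_fetch(xml_text):
    """True/False if fetcher.parse is set; None if the version default applies."""
    value = read_property(xml_text, "fetcher.parse")
    if value is None:
        return None
    return value.lower() == "true"


if __name__ == "__main__":
    setting = parses_during_fetch(NUTCH_SITE_XML)
    if setting is None:
        print("fetcher.parse not overridden; the version's default applies.")
    elif setting:
        print("Parsing runs inside the fetch job.")
    else:
        print("Parsing runs as a separate job after the fetch.")
```

With the value set to false as above, parsing would be run as its own step on the fetched segment once the fetch has finished, which is the setup recommended here for larger batches.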

> Hi all,
> 
>  I would like to know whether parsing and application of parsing filters
> happens after the fetch of the pages or during the process of fetching
> itself?
> 
> Thanks,
> Abi