Posted to user@nutch.apache.org by Prashant Dave <pd...@gwmail.gwu.edu> on 2012/08/17 17:20:43 UTC
Nutch 2.0 and Sitemap
Is there sitemap support in Nutch 2.0? That is, for a given website, can
Nutch first fetch and parse the sitemap file and then crawl only the URLs
listed in that sitemap? I read online that this functionality was on the
roadmap, but I have not been able to find out whether sitemaps are supported.
Thanks
Prashant
RE: Nutch 2.0 and Sitemap
Posted by Markus Jelsma <ma...@openindex.io>.
Hi,
There is no support for sitemaps in either 1.x or 2.x, but you're more than welcome to provide patches if you have them.
Cheers
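
Since neither branch parses sitemaps itself, one possible workaround is to extract the URLs from the sitemap outside Nutch and inject them as a seed list. The following is only a minimal sketch in Python, not Nutch code: it assumes a plain sitemaps.org <urlset> document (no sitemap index, no gzip), and the names sitemap_urls, write_seed_file and EXAMPLE, as well as the example.com URLs, are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    urls = []
    for url_el in root.iter(SITEMAP_NS + "url"):
        loc = url_el.find(SITEMAP_NS + "loc")
        if loc is not None and loc.text:
            urls.append(loc.text.strip())
    return urls


def write_seed_file(urls, path):
    """Write one URL per line, the plain-text layout of a Nutch seed file."""
    with open(path, "w", encoding="utf-8") as f:
        for u in urls:
            f.write(u + "\n")


# Hypothetical sitemap content, inlined so the sketch runs without network access.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    for u in sitemap_urls(EXAMPLE):
        print(u)
```

The resulting file can be placed in the seed directory that is passed to the inject step. Keeping the crawl limited to exactly those URLs then depends on how the crawl is driven, for example by running a single fetch round so that outlinks are not followed; that part is an assumption and is not covered by the sketch.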