Posted to user@nutch.apache.org by Matthew Vickery <vi...@gmail.com> on 2007/09/27 19:47:53 UTC
Is it possible to crawl a site that requires a log in?
Hi,
I have a pretty standard installation of MediaWiki (version 1.10) and
wish to use Nutch as the search facility rather than the built-in search
facility, just as Nutch is being used on the Mozilla Developer
Center: http://developer.mozilla.org/
However, my wiki is a company intranet and so requires a user to log in
before they can view any pages beyond the login page. Is it possible
to crawl a MediaWiki site that requires a login and uses the standard
MediaWiki authentication via cookies?
Many thanks in advance,
Matthew