Posted to dev@nutch.apache.org by Tizy Ninan <ti...@gmail.com> on 2014/12/10 11:02:18 UTC

HttpPostAuthentication

Hi,

I am developing a custom Java crawler with Nutch v1.9 to crawl websites
that require form-based authentication. I am following Nutch's
HttpPostAuthentication feature to implement it.

The login parameters required for authentication, such as the HTML form ID
and the login POST data (username, password), are specified as key-value
pairs in a configuration file. Which attribute is needed to identify the
HTML login form: its id or its name?
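
To make the question concrete, here is a minimal sketch of the kind of
lookup I have in mind. It is not Nutch code: the key names
(login.form.id, login.post.username, login.post.password), the class, and
the rule that a form matches on either its id or its name are all
assumptions for illustration, and the last of these is exactly the part I
am unsure about.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class LoginFormConfigSketch {
    // Hypothetical key names, for illustration only; not Nutch's actual keys.
    static final String SAMPLE =
        "login.form.id=loginForm\n" +
        "login.post.username=alice\n" +
        "login.post.password=secret\n";

    // Parse key-value pairs from text in java.util.Properties format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Assumption: the configured identifier may match either the form's
    // id attribute or its name attribute.
    static boolean matchesForm(Properties props, String formId, String formName) {
        String wanted = props.getProperty("login.form.id");
        return wanted != null
            && (wanted.equals(formId) || wanted.equals(formName));
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println(matchesForm(props, "loginForm", null));
        System.out.println(matchesForm(props, null, "searchForm"));
    }
}
```

If Nutch only looks at one of the two attributes, the check in
matchesForm would reduce to a single comparison.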

Thanks,
Tizy