Posted to dev@nutch.apache.org by "Steve Yao (JIRA)" <ji...@apache.org> on 2016/07/05 12:23:10 UTC
[jira] [Updated] (NUTCH-2280) HTTP Post form authentication CookiePolicy configuration
[ https://issues.apache.org/jira/browse/NUTCH-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Yao updated NUTCH-2280:
-----------------------------
Attachment: NUTCH-2280.YAO.160705.patch.txt
I just worked out a patch for this one and have tested it locally.
> HTTP Post form authentication CookiePolicy configuration
> --------------------------------------------------------
>
> Key: NUTCH-2280
> URL: https://issues.apache.org/jira/browse/NUTCH-2280
> Project: Nutch
> Issue Type: New Feature
> Components: protocol
> Affects Versions: 1.11
> Reporter: Steve Yao
> Priority: Minor
> Labels: authentication
> Attachments: NUTCH-2280.YAO.160705.patch.txt
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> The protocol-httpclient plugin supports HTTP form authentication: it posts the form values to the assigned login URL and stores the session cookie for subsequent content retrieval.
> The httpclient default CookiePolicy setting is currently in use. This default rejects any cookie whose domain starts with ".", for example domain=".domain.com", even though most web browsers accept this kind of domain value.
> I suggest adding a configurable option in conf/httpclient-auth.xml:
> {code:xml}<credentials authMethod="formMethod" ...>
> ...
> <loginCookie>
> <policy>DEFAULT | BROWSER_COMPATIBILITY | NETSCAPE | RFC_2109 | RFC_2965</policy>
> </loginCookie>
> </credentials>{code}
> The httpclient could then use this cookie policy value.
> I am working on a patch for this feature, but before I implement the configuration format change, I would like to hear any other suggestions or comments.
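
As a rough illustration of the proposal above (not the attached patch), the value read from the <policy> element could be translated into the policy identifier string that Commons HttpClient 3.x expects. The class name LoginCookiePolicy and the method resolve are hypothetical, and the identifier strings are assumed to mirror the constants in org.apache.commons.httpclient.cookie.CookiePolicy; the sketch uses only the JDK so it can be read on its own.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Nutch or of the attached patch.
final class LoginCookiePolicy {

    // Configured name -> policy identifier string. The identifiers are
    // assumed to match the CookiePolicy constants of Commons HttpClient 3.x.
    private static final Map<String, String> POLICIES = new LinkedHashMap<>();
    static {
        POLICIES.put("DEFAULT", "default");
        POLICIES.put("BROWSER_COMPATIBILITY", "compatibility");
        POLICIES.put("NETSCAPE", "netscape");
        POLICIES.put("RFC_2109", "rfc2109");
        POLICIES.put("RFC_2965", "rfc2965");
    }

    // Returns the identifier for the configured <policy> text.
    // A missing or blank value falls back to the default policy.
    static String resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return POLICIES.get("DEFAULT");
        }
        String key = configured.trim().toUpperCase(Locale.ROOT);
        String id = POLICIES.get(key);
        if (id == null) {
            throw new IllegalArgumentException(
                "Unknown cookie policy '" + configured + "', expected one of " + POLICIES.keySet());
        }
        return id;
    }

    public static void main(String[] args) {
        // Example: a site that sets cookies with domain=".domain.com"
        System.out.println(resolve("BROWSER_COMPATIBILITY"));
    }
}
```

In the plugin, the resolved string would presumably be handed to the login request before it is executed, for example through the setCookiePolicy(String) setter on the method parameters in Commons HttpClient 3.x. Rejecting unknown names early keeps a typo in httpclient-auth.xml from silently falling back to the default policy.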
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)