Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2019/07/25 15:08:00 UTC
[jira] [Commented] (NUTCH-2725) Plugin lib-http to support per-host configurable cookies
[ https://issues.apache.org/jira/browse/NUTCH-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892858#comment-16892858 ]
Sebastian Nagel commented on NUTCH-2725:
----------------------------------------
Hi [~markus17], looks good and works. A few minor points:
* converting the URL object to a String and then parsing it again to extract the host seems inefficient (getCookie could take the URL object directly):
{code}
cookie = http.getCookie(url.toString());
...
public String getCookie(String url) {
if (hostCookies != null) {
return hostCookies.get(URLUtil.getHost(url));
}
...
{code}
* comment lines in the cookies.txt file cause an exception, and the rest of the file is ignored (invalid lines should generally be reported and skipped, and reading should continue):
{noformat}
2019-07-25 16:58:24,052 WARN http.Http - Failed to read http.agent.host.cookie.file cookies.txt: java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.nutch.protocol.http.api.HttpBase.setConf(HttpBase.java:278)
{noformat}
* could add "http.agent.host.cookie.file" to nutch-default.xml
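A rough sketch of how the first two points might be handled together. This is not the code from the patch: the class name HostCookies, the load method, and the tab-separated "host<TAB>cookie" line format are assumptions made for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.net.URL;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not the actual HttpBase code from the patch. */
class HostCookies {

  private final Map<String, String> hostCookies = new HashMap<>();

  /**
   * Reads lines of the assumed form "host<TAB>cookie". Blank lines and
   * lines starting with '#' are ignored silently. Lines without a tab or
   * with an empty host are reported and skipped, so a single bad line
   * does not discard the rest of the file.
   *
   * @return the number of invalid lines that were skipped
   */
  public int load(Reader in) throws IOException {
    int skipped = 0;
    BufferedReader reader = new BufferedReader(in);
    String line;
    while ((line = reader.readLine()) != null) {
      line = line.trim();
      if (line.isEmpty() || line.startsWith("#")) {
        continue; // blank line or comment
      }
      String[] parts = line.split("\t", 2);
      if (parts.length < 2 || parts[0].trim().isEmpty()) {
        // Report and continue instead of failing on parts[1]
        System.err.println("Skipping invalid cookie line: " + line);
        skipped++;
        continue;
      }
      hostCookies.put(parts[0].trim().toLowerCase(Locale.ROOT), parts[1].trim());
    }
    return skipped;
  }

  /** Looks up the cookie by host, taking the URL object directly. */
  public String getCookie(URL url) {
    return hostCookies.get(url.getHost().toLowerCase(Locale.ROOT));
  }
}
```

The caller would then pass the URL it already holds, e.g. cookie = hostCookies.getCookie(url), without the toString() round trip. Lower-casing the host on both sides is a choice of this sketch, not something the patch is known to do.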
> Plugin lib-http to support per-host configurable cookies
> --------------------------------------------------------
>
> Key: NUTCH-2725
> URL: https://issues.apache.org/jira/browse/NUTCH-2725
> Project: Nutch
> Issue Type: Improvement
> Components: protocol
> Affects Versions: 1.15
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Major
> Fix For: 1.16
>
> Attachments: NUTCH-2725.patch
>
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)