Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2019/07/25 15:08:00 UTC

[jira] [Commented] (NUTCH-2725) Plugin lib-http to support per-host configurable cookies

    [ https://issues.apache.org/jira/browse/NUTCH-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892858#comment-16892858 ] 

Sebastian Nagel commented on NUTCH-2725:
----------------------------------------

Hi [~markus17], the patch looks good and works. A few minor points:
* converting the URL object to a String and then parsing it again is redundant (getCookie could simply take the URL object itself):
{code}
cookie = http.getCookie(url.toString());
...
public String getCookie(String url) {
   if (hostCookies != null) {
     return hostCookies.get(URLUtil.getHost(url));
   }
...
{code}
* comment lines in the cookies.txt file cause an exception, and the rest of the file is then ignored (invalid lines should generally be reported and skipped so that reading continues):
{noformat}
2019-07-25 16:58:24,052 WARN  http.Http - Failed to read http.agent.host.cookie.file cookies.txt: java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.nutch.protocol.http.api.HttpBase.setConf(HttpBase.java:278)
{noformat}
* the property "http.agent.host.cookie.file" could be added to nutch-default.xml
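
To illustrate the first two points, here is a minimal, self-contained sketch, not the code from the patch. It assumes that each line of cookies.txt has the form host<TAB>cookie (the ArrayIndexOutOfBoundsException above suggests a split on a separator, but the actual format is an assumption here). The class HostCookieSketch and the helpers readHostCookies and getCookie(Map, URL) are hypothetical names used only for illustration. Comment and blank lines are skipped, malformed lines are reported and skipped, and the lookup takes the URL object directly instead of a String.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class HostCookieSketch {

  /**
   * Hypothetical helper: reads lines of the assumed form "host<TAB>cookie".
   * Blank lines and lines starting with '#' are skipped silently;
   * malformed lines are reported and skipped, and reading continues.
   */
  public static Map<String, String> readHostCookies(Reader in) throws IOException {
    Map<String, String> cookies = new HashMap<>();
    BufferedReader reader = new BufferedReader(in);
    String line;
    int lineNo = 0;
    while ((line = reader.readLine()) != null) {
      lineNo++;
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue; // blank line or comment
      }
      String[] parts = trimmed.split("\t", 2);
      if (parts.length < 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
        // Report the bad line but keep reading the rest of the file.
        System.err.println("Skipping invalid line " + lineNo + ": " + line);
        continue;
      }
      cookies.put(parts[0].trim().toLowerCase(Locale.ROOT), parts[1].trim());
    }
    return cookies;
  }

  /** Looks up the cookie by the host of the URL object, without a String round trip. */
  public static String getCookie(Map<String, String> hostCookies, URL url) {
    if (hostCookies == null) {
      return null;
    }
    return hostCookies.get(url.getHost().toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) throws Exception {
    String file = "# per-host cookies\n"
        + "example.org\tsession=abc\n"
        + "no-separator-here\n"
        + "example.com\tlang=en; theme=dark\n";
    Map<String, String> cookies = readHostCookies(new StringReader(file));
    // The malformed third line is reported on stderr and skipped.
    System.out.println(getCookie(cookies, URI.create("https://example.org/page").toURL())); // session=abc
    System.out.println(getCookie(cookies, URI.create("https://example.com/").toURL()));     // lang=en; theme=dark
    System.out.println(getCookie(cookies, URI.create("https://unknown.example/").toURL())); // null
  }
}
```

Lower-casing the host on both insert and lookup is a design choice of this sketch, so that entries match regardless of how the host is capitalized in the file or the URL.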

> Plugin lib-http to support per-host configurable cookies
> --------------------------------------------------------
>
>                 Key: NUTCH-2725
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2725
>             Project: Nutch
>          Issue Type: Improvement
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Major
>             Fix For: 1.16
>
>         Attachments: NUTCH-2725.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)