Posted to dev@nutch.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/12/07 17:16:00 UTC

[jira] [Commented] (NUTCH-2657) Protocol-http to store HTTP response header with "\r\n"

    [ https://issues.apache.org/jira/browse/NUTCH-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713107#comment-16713107 ] 

ASF GitHub Bot commented on NUTCH-2657:
---------------------------------------

sebastian-nagel opened a new pull request #422: NUTCH-2657 Protocol-http to store HTTP response header with "\r\n"
URL: https://github.com/apache/nutch/pull/422
 
 
   - store response headers using "\r\n" and two trailing line breaks at the end of the headers

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Protocol-http to store HTTP response header with "\r\n"
> -------------------------------------------------------
>
>                 Key: NUTCH-2657
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2657
>             Project: Nutch
>          Issue Type: Improvement
>          Components: protocol
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Priority: Minor
>             Fix For: 1.16
>
>
> The plugins protocol-http and protocol-okhttp can store the HTTP request and/or response headers in the response metadata. However, they are inconsistent about which line breaks ("\r\n" or "\n") separate the header lines and whether a second trailing line break ends the headers: both plugins store request headers with "\r\n" and two trailing "\r\n", but protocol-http stores response headers with "\n" and a single trailing line break. This is difficult to handle if the headers must be stored uniformly (I've introduced a [nasty bug writing WARC files|https://github.com/commoncrawl/nutch/issues/5] because of this).
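
To illustrate the uniform format described above, here is a minimal sketch of a normalizer that rewrites a stored header block so that every line ends in "\r\n" and the block ends with an empty line (two trailing "\r\n"). The class and method names (HeaderLineBreaks, normalize) are hypothetical and are not taken from Nutch or from pull request #422; the sketch only shows the target format, not how the plugins actually build the header string.

```java
// Hypothetical helper for illustration only; not part of Nutch.
final class HeaderLineBreaks {

    private HeaderLineBreaks() {
    }

    /**
     * Returns the header block with "\r\n" after every header line and a
     * final empty line, i.e. the block always ends in "\r\n\r\n".
     */
    static String normalize(String headers) {
        // Drop all trailing line breaks, whatever their style.
        String trimmed = headers.replaceAll("[\\r\\n]+\\z", "");
        // Rewrite each remaining "\r\n", lone "\r", or lone "\n" as "\r\n".
        String crlf = trimmed.replaceAll("\\r\\n|\\r|\\n", "\r\n");
        // Terminate the last header line, then add the empty line
        // that marks the end of the header block.
        return crlf + "\r\n\r\n";
    }
}
```

With this sketch, calling HeaderLineBreaks.normalize on a block stored with "\n" and a single trailing line break and on the same block stored with "\r\n" and two trailing line breaks yields identical strings, which is the property a consumer such as a WARC writer would rely on.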



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)