Posted to dev@nutch.apache.org by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/03/14 16:16:38 UTC

[jira] [Created] (NUTCH-1310) Nutch to send HTTP-accept header

Nutch to send HTTP-accept header
--------------------------------

                 Key: NUTCH-1310
                 URL: https://issues.apache.org/jira/browse/NUTCH-1310
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.4
            Reporter: Markus Jelsma
             Fix For: 1.5


Nutch does not send an HTTP Accept header with its requests. This is usually not a problem, but some firewalls do not like it and will reject the request.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma resolved NUTCH-1310.
----------------------------------

    Resolution: Fixed

Committed for 1.5 in rev. 1301480.
Thanks Lewis.
                


        

[jira] [Updated] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1310:
---------------------------------

    Attachment: NUTCH-1310-1.5-1.patch

Patch for 1.5. A simple PHP script tells me it works as the Accept header is sent along with the rest:

{code}
["HTTP_ACCEPT_LANGUAGE"]=> string(28) "en-us,en-gb,en;q=0.7,*;q=0.3"
["HTTP_ACCEPT"]=> string(63) "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
{code}
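
For anyone without PHP at hand, the same check can be sketched in Java with the HTTP server bundled in the JDK. This is only an illustration and not part of the patch: the class name `HeaderEcho`, the helper `describe`, and port 8080 are assumptions made for the example. Point the fetcher at the server and it echoes back whichever of the two headers arrived.

```java
import com.sun.net.httpserver.Headers;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical test helper: echoes the Accept and Accept-Language request headers.
final class HeaderEcho {

    // Formats the two headers of interest; a missing header is reported as "(not sent)".
    static String describe(String accept, String acceptLanguage) {
        return "Accept: " + (accept == null ? "(not sent)" : accept) + "\n"
             + "Accept-Language: " + (acceptLanguage == null ? "(not sent)" : acceptLanguage) + "\n";
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an arbitrary choice for this sketch.
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            Headers headers = exchange.getRequestHeaders();
            byte[] body = describe(headers.getFirst("Accept"),
                                   headers.getFirst("Accept-Language"))
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```

A request that lacks the Accept header shows up as "(not sent)", which is the pre-patch behaviour described in this issue.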
                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229295#comment-13229295 ] 

Markus Jelsma commented on NUTCH-1310:
--------------------------------------

Ah, yes, that should work out just fine. Thanks for pointing me to it!
                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232302#comment-13232302 ] 

Hudson commented on NUTCH-1310:
-------------------------------

Integrated in Nutch-trunk #1789 (See [https://builds.apache.org/job/Nutch-trunk/1789/])
    NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)

     Result = SUCCESS
markus : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301480
Files : 
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/conf/nutch-default.xml
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java

                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229254#comment-13229254 ] 

Markus Jelsma commented on NUTCH-1310:
--------------------------------------

Any idea on how to resolve this? Suggestions for code location and header value?
                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231139#comment-13231139 ] 

Hudson commented on NUTCH-1310:
-------------------------------

Integrated in nutch-trunk-maven #198 (See [https://builds.apache.org/job/nutch-trunk-maven/198/])
    NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)

     Result = SUCCESS
markus : 
Files : 
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/conf/nutch-default.xml
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java

                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230400#comment-13230400 ] 

Lewis John McGibbney commented on NUTCH-1310:
---------------------------------------------

Looks good to me Markus. +1
                


        

[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229290#comment-13229290 ] 

Julien Nioche commented on NUTCH-1310:
--------------------------------------

code location - same as 

<property>
  <name>http.accept.language</name>
  <value>en-us,en-gb,en;q=0.7,*;q=0.3</value>
  <description>Value of the "Accept-Language" request header field.
  This allows selecting a non-English language as the default one to retrieve.
  It is a useful setting for search engines built for a certain national group.
  </description>
</property>

?
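
To make the suggestion concrete, here is a rough Java sketch of how an Accept value could be read from configuration and written next to Accept-Language when the request is built. It is not the committed code: the class `RequestHeaderSketch`, the method `buildRequest`, the plain `Map` standing in for the Nutch configuration, the request line, and the property name `http.accept` are all assumptions for illustration. The default values are the ones quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: both header values come from configuration with a fallback default.
final class RequestHeaderSketch {

    // Default taken from the http.accept.language property quoted above.
    static final String DEFAULT_ACCEPT_LANGUAGE = "en-us,en-gb,en;q=0.7,*;q=0.3";
    // Default taken from the Accept value reported for the patch in this thread.
    static final String DEFAULT_ACCEPT =
            "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

    // Builds the raw request text; "http.accept" is an assumed property name.
    static String buildRequest(String host, String path, Map<String, String> conf) {
        StringBuilder request = new StringBuilder();
        request.append("GET ").append(path).append(" HTTP/1.0\r\n");
        request.append("Host: ").append(host).append("\r\n");
        request.append("Accept-Language: ")
               .append(conf.getOrDefault("http.accept.language", DEFAULT_ACCEPT_LANGUAGE))
               .append("\r\n");
        request.append("Accept: ")
               .append(conf.getOrDefault("http.accept", DEFAULT_ACCEPT))
               .append("\r\n");
        request.append("\r\n"); // a blank line ends the header section
        return request.toString();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.print(buildRequest("example.org", "/", conf));
    }
}
```

Keeping both values in the same configuration file means a crawl that needs a different Accept value can override it the same way the language preference is overridden.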

                


        

[jira] [Assigned] (NUTCH-1310) Nutch to send HTTP-accept header

Posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma reassigned NUTCH-1310:
------------------------------------

    Assignee: Markus Jelsma
    
