Posted to dev@nutch.apache.org by "Markus Jelsma (Created) (JIRA)" <ji...@apache.org> on 2012/03/14 16:16:38 UTC
[jira] [Created] (NUTCH-1310) Nutch to send HTTP-accept header
Nutch to send HTTP-accept header
--------------------------------
Key: NUTCH-1310
URL: https://issues.apache.org/jira/browse/NUTCH-1310
Project: Nutch
Issue Type: Bug
Affects Versions: 1.4
Reporter: Markus Jelsma
Fix For: 1.5
Nutch does not send an HTTP Accept header with its requests. This is usually not a problem, but some firewalls do not like it and will reject the request.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Markus Jelsma (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma resolved NUTCH-1310.
----------------------------------
Resolution: Fixed
Committed for 1.5 in rev. 1301480.
Thanks Lewis.
> Nutch to send HTTP-accept header
> --------------------------------
>
> Key: NUTCH-1310
> URL: https://issues.apache.org/jira/browse/NUTCH-1310
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.5
>
> Attachments: NUTCH-1310-1.5-1.patch
>
>
> Nutch does not send an HTTP Accept header with its requests. This is usually not a problem, but some firewalls do not like it and will reject the request.
[jira] [Updated] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma updated NUTCH-1310:
---------------------------------
Attachment: NUTCH-1310-1.5-1.patch
Patch for 1.5. A simple PHP script tells me it works as the Accept header is sent along with the rest:
{code}
["HTTP_ACCEPT_LANGUAGE"]=> string(28) "en-us,en-gb,en;q=0.7,*;q=0.3"
["HTTP_ACCEPT"]=> string(63) "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
{code}
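For anyone who wants to repeat this kind of check without PHP, here is a minimal sketch in Java. It is not part of the patch and does not exercise Nutch itself: the class name AcceptEchoSketch and its methods are made up for illustration, and the client is a plain HttpURLConnection that sets the two header values shown above. The local server just echoes back what it received.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the PHP check: a local server that echoes the
// Accept and Accept-Language request headers back in the response body.
class AcceptEchoSketch {

    // Header values copied from the output quoted above.
    static final String ACCEPT =
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    static final String ACCEPT_LANGUAGE = "en-us,en-gb,en;q=0.7,*;q=0.3";

    // Starts the echo server on an ephemeral loopback port.
    static HttpServer startEchoServer() throws IOException {
        HttpServer server =
            HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String body =
                "Accept=" + exchange.getRequestHeaders().getFirst("Accept") + "\n"
                + "Accept-Language="
                + exchange.getRequestHeaders().getFirst("Accept-Language") + "\n";
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
        return server;
    }

    // Sends a GET request carrying both headers and returns the echoed body.
    static String fetch(int port) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + "/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", ACCEPT);
        conn.setRequestProperty("Accept-Language", ACCEPT_LANGUAGE);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startEchoServer();
        try {
            System.out.print(fetch(server.getAddress().getPort()));
        } finally {
            server.stop(0);
        }
    }
}
```

To check a real crawler, one would point the fetcher at the echo server instead of calling fetch() and inspect what arrives.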
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229295#comment-13229295 ]
Markus Jelsma commented on NUTCH-1310:
--------------------------------------
Ah, yes, that should work out just fine. Thanks for pointing me to it!
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232302#comment-13232302 ]
Hudson commented on NUTCH-1310:
-------------------------------
Integrated in Nutch-trunk #1789 (See [https://builds.apache.org/job/Nutch-trunk/1789/])
NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)
Result = SUCCESS
markus : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301480
Files :
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/conf/nutch-default.xml
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Markus Jelsma (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229254#comment-13229254 ]
Markus Jelsma commented on NUTCH-1310:
--------------------------------------
Any idea on how to resolve this? Suggestions for code location and header value?
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231139#comment-13231139 ]
Hudson commented on NUTCH-1310:
-------------------------------
Integrated in nutch-trunk-maven #198 (See [https://builds.apache.org/job/nutch-trunk-maven/198/])
NUTCH-1310 Nutch to send HTTP-accept header (Revision 1301480)
Result = SUCCESS
markus :
Files :
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/conf/nutch-default.xml
* /nutch/trunk/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* /nutch/trunk/src/plugin/protocol-http/src/java/org/apache/nutch/protocol/http/HttpResponse.java
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Lewis John McGibbney (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230400#comment-13230400 ]
Lewis John McGibbney commented on NUTCH-1310:
---------------------------------------------
Looks good to me Markus. +1
[jira] [Commented] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Julien Nioche (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229290#comment-13229290 ]
Julien Nioche commented on NUTCH-1310:
--------------------------------------
code location - same as
<property>
  <name>http.accept.language</name>
  <value>en-us,en-gb,en;q=0.7,*;q=0.3</value>
  <description>Value of the "Accept-Language" request header field.
  This allows selecting non-English language as default one to retrieve.
  It is a useful setting for search engines build for certain national group.
  </description>
</property>
?
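A rough sketch of what "same as http.accept.language" could look like on the Java side, offered as an illustration only and not as the committed code. Assumptions: java.util.Properties stands in for the Nutch configuration object, the property name "http.accept" is a guess modelled on "http.accept.language", the class and method names are hypothetical, and the default Accept value is the one reported elsewhere in this thread.

```java
import java.util.Properties;

// Sketch only: Properties stands in for the crawler configuration, and
// "http.accept" is an assumed property name, not confirmed by this thread.
class AcceptHeaderSketch {

    static final String DEFAULT_ACCEPT_LANGUAGE = "en-us,en-gb,en;q=0.7,*;q=0.3";
    static final String DEFAULT_ACCEPT =
        "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";

    // Builds the head of a GET request. Each header value is read from the
    // configuration and falls back to its default when the key is absent.
    static String buildRequest(String host, String path, Properties conf) {
        String acceptLanguage =
            conf.getProperty("http.accept.language", DEFAULT_ACCEPT_LANGUAGE);
        String accept = conf.getProperty("http.accept", DEFAULT_ACCEPT);

        StringBuilder req = new StringBuilder();
        req.append("GET ").append(path).append(" HTTP/1.0\r\n");
        req.append("Host: ").append(host).append("\r\n");
        req.append("Accept-Language: ").append(acceptLanguage).append("\r\n");
        req.append("Accept: ").append(accept).append("\r\n");
        req.append("\r\n"); // blank line terminates the header section
        return req.toString();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Override only the Accept value; Accept-Language keeps its default.
        conf.setProperty("http.accept", "text/html;q=1.0,*/*;q=0.1");
        System.out.print(buildRequest("example.org", "/", conf));
    }
}
```

Keeping the value in configuration rather than hard-coding it lets a deployment adjust the header for picky servers without a rebuild, which is the same reasoning behind the existing Accept-Language property.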
[jira] [Assigned] (NUTCH-1310) Nutch to send HTTP-accept header
Posted by "Markus Jelsma (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Jelsma reassigned NUTCH-1310:
------------------------------------
Assignee: Markus Jelsma