Posted to dev@nutch.apache.org by d_k <ma...@gmail.com> on 2014/05/23 12:51:05 UTC

Do these settings/behaviors really need to be maintained when porting httpclient to 4.3.3?

Hello,

I'm porting protocol-httpclient to HttpClient 4.3.3 and came across some
settings on the client that I'd like to have clarified.

Currently I'm having trouble with the makeLenient() [0] API call, because
some of the settings it applies can only be reproduced in 4.3.3 with a custom
response message parser. That leads me to believe one should not do this
unless a crawler like Nutch has some special requirement for them.

The settings I'm struggling with are:

http.protocol.unambiguous-statusline = false
http.protocol.strict-transfer-encoding = false
http.protocol.reject-head-body = false
http.protocol.warn-extra-input = false
http.protocol.status-line-garbage-limit = 0

I was wondering whether we could simply leave these at the default
httpclient values?
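
For anyone unfamiliar with the last setting: as far as I understand it,
http.protocol.status-line-garbage-limit caps how many non-status lines may
precede the status line before the response is rejected. Below is a rough
sketch of that idea in plain Java. It is NOT HttpClient code; the class
StatusLineSketch and the method findStatusLine are made-up names, and the
"starts with HTTP/" check is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not taken from HttpClient 3.x or 4.x.
public class StatusLineSketch {

    /**
     * Returns the first line that looks like an HTTP status line, skipping
     * at most garbageLimit non-matching lines before it. Returns null if
     * the limit is exceeded or no status line is found.
     */
    public static String findStatusLine(List<String> lines, int garbageLimit) {
        int skipped = 0;
        for (String line : lines) {
            if (line.startsWith("HTTP/")) {
                return line; // simplified check for a status line
            }
            if (skipped >= garbageLimit) {
                return null; // too many garbage lines: give up
            }
            skipped++;
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> response = Arrays.asList("", "junk", "HTTP/1.1 200 OK");
        // A limit of 0 tolerates no leading garbage at all.
        System.out.println(findStatusLine(response, 0));
        // A very large limit skips the garbage and finds the status line.
        System.out.println(findStatusLine(response, Integer.MAX_VALUE));
    }
}
```

With a limit of 0 the sketch rejects the sample response, while a very large
limit accepts it. Reproducing that kind of tolerance is the part that seems
to need a custom parser in 4.3.3.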

If not, could someone explain why these settings are important enough to
justify writing a custom parser for them?

If they are simply nice to have, I'll gladly post a patch without them and
come back to them some other time.

[0]
https://hc.apache.org/httpclient-3.x/apidocs/org/apache/commons/httpclient/params/HttpMethodParams.html#makeLenient%28%29