Posted to dev@commons.apache.org by ol...@apache.org on 2005/01/13 21:13:16 UTC
cvs commit: jakarta-commons/httpclient/xdocs performance.xml navigation.xml userguide.xml
olegk 2005/01/13 12:13:16
Modified: httpclient/xdocs navigation.xml userguide.xml
Added: httpclient/xdocs performance.xml
Log:
PR #28296 (Compile performance optimization guide)
Contributed by Oleg Kalnichevski, Ortwin Glueck, Michael Becke
Revision Changes Path
1.17 +2 -1 jakarta-commons/httpclient/xdocs/navigation.xml
Index: navigation.xml
===================================================================
RCS file: /home/cvs/jakarta-commons/httpclient/xdocs/navigation.xml,v
retrieving revision 1.16
retrieving revision 1.17
diff -u -r1.16 -r1.17
--- navigation.xml 8 Jan 2005 11:38:04 -0000 1.16
+++ navigation.xml 13 Jan 2005 20:13:16 -0000 1.17
@@ -24,6 +24,7 @@
<item name="Exception Handling" href="/exception-handling.html"/>
<item name="Logging Guide" href="/logging.html"/>
<item name="Methods" href="/methods.html"/>
+ <item name="Optimization Guide" href="/performance.html"/>
<item name="Preference Architecture" href="/preference-api.html"/>
<item name="Redirects Handling" href="/redirects.html"/>
<item name="Sample Code" href="http://cvs.apache.org/viewcvs.cgi/jakarta-commons/httpclient/src/examples/"/>
1.6 +5 -1 jakarta-commons/httpclient/xdocs/userguide.xml
Index: userguide.xml
===================================================================
RCS file: /home/cvs/jakarta-commons/httpclient/xdocs/userguide.xml,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -r1.5 -r1.6
--- userguide.xml 16 Sep 2004 06:24:43 -0000 1.5
+++ userguide.xml 13 Jan 2005 20:13:16 -0000 1.6
@@ -55,6 +55,10 @@
that are provided by HttpClient and how to use them.</td>
</tr>
<tr>
+ <td><a href="performance.html">Optimization Guide</a></td>
+ <td>This document outlines HttpClient performance optimization techniques.</td>
+ </tr>
+ <tr>
<td><a href="preference-api.html">Preference Architecture</a></td>
<td>This document explains the preference architecture used by HttpClient
and enumerates standard HttpClient parameters.</td>
1.1 jakarta-commons/httpclient/xdocs/performance.xml
Index: performance.xml
===================================================================
<?xml version="1.0" encoding="ISO-8859-1"?>
<document>
<properties>
<title>HttpClient Performance Optimization Guide</title>
<author email="oleg -at- ural.ru">Oleg Kalnichevski</author>
<revision>$Id: performance.xml,v 1.1 2005/01/13 20:13:16 olegk Exp $</revision>
</properties>
<body>
<section name="Introduction">
<p>
By default HttpClient is configured to provide maximum reliability and standards
compliance rather than raw performance. There are several configuration options and
optimization techniques which can significantly improve the performance of HttpClient.
This document outlines various techniques to achieve maximum HttpClient performance.
</p>
<subsection name="Contents">
<ul>
<li>
<a href="#Reuse the HttpClient instance">Reuse the HttpClient instance</a>
</li>
<li>
<a href="#Connection persistence">Connection persistence</a>
</li>
<li>
<a href="#Concurrent execution of HTTP methods">Concurrent execution of HTTP methods</a>
</li>
<li>
<a href="#Request/Response entity streaming">Request/Response entity streaming</a>
</li>
<li>
<a href="#Expect-continue handshake">Expect-continue handshake</a>
</li>
<li>
<a href="#Stale connection check">Stale connection check</a>
</li>
<li>
<a href="#Cookie processing">Cookie processing</a>
</li>
</ul>
</subsection>
</section>
<section name="Reuse the HttpClient instance">
<p>
Generally it is recommended to have a single instance of HttpClient per communication
component or even per application. However, if the application makes use of HttpClient
only very infrequently, and keeping an idle instance of HttpClient in memory is not warranted,
it is highly recommended to explicitly <a href="apidocs/org/apache/commons/httpclient/MultiThreadedHttpConnectionManager.html#shutdown()">
shut down</a> the multithreaded connection manager prior to disposing of
the HttpClient instance. This will ensure proper closure of all HTTP connections in the
connection pool.
</p>
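<p>
A minimal sketch of that shutdown sequence, assuming the application creates the connection
manager itself and keeps a reference to it. The class name and the URL are illustrative only.
<source><![CDATA[
import java.io.IOException;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.GetMethod;

// Hypothetical example class, not part of HttpClient
public class ShutdownExample {

    public static void main(String[] args) throws IOException {
        MultiThreadedHttpConnectionManager connectionManager =
            new MultiThreadedHttpConnectionManager();
        HttpClient httpclient = new HttpClient(connectionManager);
        GetMethod httpget = new GetMethod("http://www.myhost.com/");
        try {
            httpclient.executeMethod(httpget);
        } finally {
            // Return the connection to the pool first ...
            httpget.releaseConnection();
            // ... then close every pooled connection before dropping the client
            connectionManager.shutdown();
        }
    }
}
]]></source>
</p>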
</section>
<section name="Connection persistence">
<p>
HttpClient always does its best to reuse connections. Connection persistence is enabled
by default and requires no configuration. In some situations this can lead to leaked
connections and therefore lost resources. The easiest way to disable connection persistence
is to provide or extend a connection manager that force-closes connections
upon release in the <a href="apidocs/org/apache/commons/httpclient/HttpConnectionManager.html#releaseConnection(org.apache.commons.httpclient.HttpConnection)">
releaseConnection</a> method.
</p>
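<p>
One possible shape for such a connection manager is sketched below. It assumes that extending
SimpleHttpConnectionManager is acceptable, which means the client must be confined to a single
thread. The class name ForceCloseConnectionManager is made up for this example.
<source><![CDATA[
import org.apache.commons.httpclient.HttpConnection;
import org.apache.commons.httpclient.SimpleHttpConnectionManager;

// Hypothetical example class, not part of HttpClient
public class ForceCloseConnectionManager extends SimpleHttpConnectionManager {

    public void releaseConnection(HttpConnection conn) {
        // Close the underlying socket so the connection is never kept alive
        conn.close();
        super.releaseConnection(conn);
    }
}
]]></source>
The manager would then be passed to the client at construction time, for example
<code>new HttpClient(new ForceCloseConnectionManager())</code>.
</p>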
</section>
<section name="Concurrent execution of HTTP methods">
<p>
If the application logic allows for execution of multiple HTTP requests concurrently
(e.g. multiple requests against various sites, or multiple requests representing
different user identities), the use of a dedicated thread per HTTP session can result in a
significant performance gain. HttpClient is fully thread-safe when used with a thread-safe
connection manager such as <a href="apidocs/org/apache/commons/httpclient/MultiThreadedHttpConnectionManager.html">
MultiThreadedHttpConnectionManager</a>. Please note that each respective thread of execution
must have a local instance of HttpMethod and can have a local instance of HttpState and/or
HostConfiguration to represent a specific host configuration and conversational state. At the
same time the HttpClient instance and connection manager should be shared among all threads
for maximum efficiency.
</p>
<p>
For details on using multiple threads with HttpClient please refer to the <a href="threading.html">
HttpClient Threading Guide</a>.
</p>
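<p>
The sharing rules described above can be sketched as follows: one HttpClient and one connection
manager for all threads, and one HttpMethod per thread. The class name and the URLs are
placeholders, and the worker simply discards the response body.
<source><![CDATA[
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.GetMethod;

// Hypothetical example class, not part of HttpClient
public class ConcurrentGets {

    public static void main(String[] args) throws InterruptedException {
        // Shared by all threads
        MultiThreadedHttpConnectionManager connectionManager =
            new MultiThreadedHttpConnectionManager();
        final HttpClient httpclient = new HttpClient(connectionManager);

        String[] urls = { "http://www.myhost.com/a", "http://www.myhost.com/b" };
        Thread[] workers = new Thread[urls.length];
        for (int i = 0; i < urls.length; i++) {
            final String url = urls[i];
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    // Local to this thread
                    GetMethod httpget = new GetMethod(url);
                    try {
                        httpclient.executeMethod(httpget);
                        InputStream in = httpget.getResponseBodyAsStream();
                        byte[] buffer = new byte[4096];
                        while (in != null && in.read(buffer) != -1) {
                            // a real worker would process the bytes here
                        }
                    } catch (IOException e) {
                        System.err.println(url + " failed: " + e.getMessage());
                    } finally {
                        httpget.releaseConnection();
                    }
                }
            });
            workers[i].start();
        }
        for (int i = 0; i < workers.length; i++) {
            workers[i].join();
        }
        connectionManager.shutdown();
    }
}
]]></source>
</p>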
</section>
<section name="Request/Response entity streaming">
<p>
HttpClient is capable of efficient request/response body streaming. Large entities may be submitted
or received without being buffered in memory. This is especially critical if multiple HTTP
methods may be executed concurrently. While there are convenience methods to deal with entities such as
strings or byte arrays, their use is discouraged. Unless used carefully they can easily lead to
out of memory conditions, since they imply buffering of the complete entity in memory.
</p>
<p>
<strong>Response streaming:</strong> It is recommended to consume the HTTP response body as a stream of
bytes/characters using the HttpMethod#getResponseBodyAsStream method. The use of HttpMethod#getResponseBody and
HttpMethod#getResponseBodyAsString is strongly discouraged.
<source><![CDATA[
HttpClient httpclient = new HttpClient();
GetMethod httpget = new GetMethod("http://www.myhost.com/");
try {
    httpclient.executeMethod(httpget);
    Reader reader = new InputStreamReader(
        httpget.getResponseBodyAsStream(), httpget.getResponseCharSet());
    // consume the response entity
} finally {
    httpget.releaseConnection();
}]]></source>
</p>
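<p>
As an illustration of what consuming the response entity as a stream might look like, the helper below
reads any input stream in fixed-size chunks, so memory use stays constant regardless of the entity size.
It uses only java.io. The class and method names are hypothetical, and counting characters merely stands in
for real processing.
<source><![CDATA[
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical example class, not part of HttpClient
public class StreamingConsumer {

    /**
     * Decodes the stream with the given charset, one 4096-character chunk
     * at a time, and returns the total number of characters read.
     */
    public static long countChars(InputStream in, String charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        char[] buffer = new char[4096];
        long total = 0;
        int n;
        while ((n = reader.read(buffer)) != -1) {
            // buffer[0] through buffer[n - 1] hold the current chunk
            total += n;
        }
        return total;
    }
}
]]></source>
It could be called with the stream and charset of an executed method, for example
<code>StreamingConsumer.countChars(httpget.getResponseBodyAsStream(), httpget.getResponseCharSet())</code>,
before the connection is released.
</p>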
<p>
<strong>Request streaming:</strong> The main difficulty encountered when streaming request bodies is that
some entity enclosing methods need to be retried due to an authentication failure or an I/O failure.
Obviously non-buffered entities cannot be reread and resubmitted. The recommended approach is to create a custom
<a href="apidocs/org/apache/commons/httpclient/methods/RequestEntity.html">RequestEntity</a> capable of
reconstructing the underlying input stream.
<source><![CDATA[
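import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.httpclient.methods.RequestEntity;

// Repeatable entity: the file is reopened on every write,
// so the request can be resubmitted after a failure
public class FileRequestEntity implements RequestEntity {

    private File file = null;

    public FileRequestEntity(File file) {
        super();
        this.file = file;
    }

    public boolean isRepeatable() {
        return true;
    }

    public String getContentType() {
        return "text/plain; charset=UTF-8";
    }

    public void writeRequest(OutputStream out) throws IOException {
        // Copy the file to the request stream in 1 KB chunks
        InputStream in = new FileInputStream(this.file);
        try {
            int l;
            byte[] buffer = new byte[1024];
            while ((l = in.read(buffer)) != -1) {
                out.write(buffer, 0, l);
            }
        } finally {
            in.close();
        }
    }

    public long getContentLength() {
        return file.length();
    }
}

// Usage (PostMethod is org.apache.commons.httpclient.methods.PostMethod)
File myfile = new File("myfile.txt");
PostMethod httppost = new PostMethod("/stuff");
httppost.setRequestEntity(new FileRequestEntity(myfile));]]></source>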
</p>
</section>
<section name="Expect-continue handshake">
<p>
The purpose of the HTTP 100 (Continue) status is to allow a client sending a request entity to
determine if the target server is willing to accept the request (based on the
request headers) before the client sends the request entity. It is highly inefficient for the client
to send the request entity if the server will reject the request without looking at the body.
Authentication failures are the most common reason for the request to be rejected based on the request
headers alone. Therefore, use of the 'Expect-continue' handshake is especially recommended with
those target servers that require HTTP authentication. For proxied requests caution
must be taken as older HTTP/1.0 proxies may be unable to correctly handle the 'Expect-continue'
handshake.
</p>
<p>
See the <a href="preference-api.html">http.protocol.expect-continue</a> parameter documentation
for more information.
</p>
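<p>
A short sketch of activating the handshake for a single request, assuming it is set through the
method-level parameters. The class and method names are illustrative only.
<source><![CDATA[
import org.apache.commons.httpclient.methods.PostMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

// Hypothetical example class, not part of HttpClient
public class ExpectContinueExample {

    public static PostMethod newPost(String uri) {
        PostMethod httppost = new PostMethod(uri);
        // Sets the http.protocol.expect-continue parameter for this method only
        httppost.getParams().setBooleanParameter(
            HttpMethodParams.USE_EXPECT_CONTINUE, true);
        return httppost;
    }
}
]]></source>
</p>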
</section>
<section name="Stale connection check">
<p>
The HTTP specification permits both the client and the server to terminate a persistent (keep-alive)
connection at any time without notice to the counterpart, thus rendering the connection invalid
or stale. By default HttpClient performs a check, just prior to executing a request, to determine if the
active connection is stale. The cost of this operation is about 15-30 ms, depending on the JRE used.
Disabling the stale connection check may result in a slight performance improvement, especially for small
payload responses, at the risk of getting an I/O error when executing a request over a connection
that has been closed at the server side.
</p>
<p>
See the <a href="preference-api.html">http.connection.stalecheck</a> parameter documentation for more
information.
</p>
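<p>
A sketch of turning the check off for all connections handed out by a client's connection manager.
The class and method names are illustrative only. Callers that do this should be prepared to handle
the I/O error described above.
<source><![CDATA[
import org.apache.commons.httpclient.HttpClient;

// Hypothetical example class, not part of HttpClient
public class StaleCheckExample {

    public static HttpClient newClientWithoutStaleCheck() {
        HttpClient httpclient = new HttpClient();
        // Sets the http.connection.stalecheck parameter on the connection manager
        httpclient.getHttpConnectionManager().getParams()
            .setStaleCheckingEnabled(false);
        return httpclient;
    }
}
]]></source>
</p>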
</section>
<section name="Cookie processing">
<p>
If an application, such as a web spider, does not need to maintain conversational state with the target
server, a small performance gain can be made by disabling cookie processing. For details
on cookie processing please refer to the <a href="cookies.html">HttpClient Cookies Guide</a>.
</p>
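<p>
A sketch of disabling cookie processing for every method executed by a client, assuming the
ignore-cookies policy is selected at the client level. The class and method names are illustrative only.
<source><![CDATA[
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.cookie.CookiePolicy;

// Hypothetical example class, not part of HttpClient
public class NoCookiesExample {

    public static HttpClient newClientIgnoringCookies() {
        HttpClient httpclient = new HttpClient();
        // Cookies in responses are ignored and none are sent with requests
        httpclient.getParams().setCookiePolicy(CookiePolicy.IGNORE_COOKIES);
        return httpclient;
    }
}
]]></source>
</p>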
</section>
</body>
</document>