Posted to commits@hc.apache.org by jo...@apache.org on 2011/01/10 20:02:34 UTC

svn commit: r1057314 - /httpcomponents/httpclient/trunk/src/docbkx/caching.xml

Author: jonm
Date: Mon Jan 10 19:02:33 2011
New Revision: 1057314

URL: http://svn.apache.org/viewvc?rev=1057314&view=rev
Log:
Improved docbook for caching module.

Modified:
    httpcomponents/httpclient/trunk/src/docbkx/caching.xml

Modified: httpcomponents/httpclient/trunk/src/docbkx/caching.xml
URL: http://svn.apache.org/viewvc/httpcomponents/httpclient/trunk/src/docbkx/caching.xml?rev=1057314&r1=1057313&r2=1057314&view=diff
==============================================================================
--- httpcomponents/httpclient/trunk/src/docbkx/caching.xml (original)
+++ httpcomponents/httpclient/trunk/src/docbkx/caching.xml Mon Jan 10 19:02:33 2011
@@ -27,11 +27,29 @@
   <section id="generalconcepts">
     <title>General Concepts</title>
 
-    <para>HttpClient Cache provides an HTTP 1.1 compliant caching layer to be
-    used with HttpClient. It is implemented as a decorator of HttpClient. It
-    provides basic HTTP 1.1 caching capability. You can specify a limit on the
-    maximum cacheable object size to have some control over the size of your
-    cache.</para>
+    <para>HttpClient Cache provides an HTTP/1.1-compliant caching layer to be
+    used with HttpClient--the Java equivalent of a browser cache. The
+    implementation follows the Decorator design pattern, where the
+    CachingHttpClient class is a drop-in replacement for
+    a DefaultHttpClient; requests that can be satisfied entirely from the cache
+    will not result in actual origin requests. Stale cache entries are
+    automatically validated with the origin where possible, using conditional GETs
+    and the If-Modified-Since and/or If-None-Match request headers.  
+    </para>
+    
+    <para>
+    HTTP/1.1 caching in general is designed to be <emphasis>semantically
+    transparent</emphasis>; that is, a cache should not change the meaning of 
+    the request-response exchange between client and server. As such, it should
+    be safe to drop a CachingHttpClient into an existing compliant client-server
+    relationship. Although the caching module is part of the client from an
+    HTTP protocol point of view, the implementation aims to be compatible with
+    the requirements placed on a transparent caching proxy.
+    </para>
+    
+    <para>Finally, CachingHttpClient includes support for the Cache-Control
+    extensions specified by RFC 5861 (stale-if-error and stale-while-revalidate).
+    </para>
 
     <para>When CachingHttpClient executes a request, it goes through the
     following flow:</para>
@@ -77,7 +95,7 @@
 
     <orderedlist>
       <listitem>
-        <para>Examing the response for protocol compliance</para>
+        <para>Examining the response for protocol compliance</para>
       </listitem>
 
       <listitem>
@@ -105,12 +123,17 @@
   <section id="rfc2616compliance">
     <title>RFC-2616 Compliance</title>
 
-    <para>HttpClient Cache makes an effort to be at least conditionally
-    compliant with <ulink
+    <para>HttpClient Cache makes an effort to be at least <emphasis>conditionally
+    compliant</emphasis> with <ulink
     url="http://www.ietf.org/rfc/rfc2616.txt">RFC-2616</ulink>. That is,
     wherever the specification indicates MUST or MUST NOT for HTTP caches, the
     caching layer attempts to behave in a way that satisfies those
-    requirements.</para>
+    requirements. This means the caching module won't produce incorrect
+    behavior when you drop it in. At the same time, the project is continuing
+    to work on unconditional compliance, which would add compliance with all the
+    SHOULDs and SHOULD NOTs, many of which we already comply with. We just can't
+    claim fully unconditional compliance until we satisfy <emphasis>all</emphasis>
+    of them.</para>
   </section>
 
   <section>
@@ -155,4 +178,71 @@ case VALIDATED:
 ]]>
     </programlisting>
   </section>
+  
+  <section id="configuration">
+    <title>Configuration</title>
+
+    <para>As the CachingHttpClient is a decorator, much of the configuration you may
+    want to do can be done on the HttpClient that the CachingHttpClient uses as its
+    "backend" (this includes setting options like timeouts and connection pool sizes). For
+    caching-specific configuration, you can provide a CacheConfig instance to
+    customize behavior across the following areas:</para>
+    
+    <para><emphasis>Cache size.</emphasis> If the backend storage supports these limits,
+    you can specify the maximum number of cache entries as well as the maximum cacheable
+    response body size.</para>
+    
+
+    <para><emphasis>Public/private caching.</emphasis> By default, the caching module
+    considers itself to be a shared (public) cache, and will not, for example, cache
+    responses to requests with Authorization headers or responses marked with
+    "Cache-Control: private". If, however, the cache is only going to be used by one
+    logical "user" (behaving similarly to a browser cache), then you will want to turn
+    off the shared cache setting.</para>
+
+    <para><emphasis>Heuristic caching.</emphasis> Per RFC2616, a cache MAY cache
+    certain responses even if no explicit cache control headers are set by the
+    origin. This behavior is off by default, but you may want to turn it on if you
+    are working with an origin that doesn't set proper headers but where you still
+    want to cache the responses. To do so, enable heuristic caching, then
+    specify a default freshness lifetime, a fraction of the time since
+    the resource was last modified, or both. See Sections 13.2.2 and 13.2.4 of the HTTP/1.1
+    RFC for more details on heuristic caching.</para>
+
+    <para><emphasis>Background validation.</emphasis> The cache module supports the
+    stale-while-revalidate directive of RFC5861, which allows certain cache entry
+    revalidations to happen in the background. You may want to tweak the settings
+    for the minimum and maximum number of background worker threads, as well as the
+    maximum time they can be idle before being reclaimed. You can also control the
+    size of the queue used for revalidations when there aren't enough workers to
+    keep up with demand.</para>
+  </section>
+  
+  <section id="storage">
+    <title>Storage Backends</title>
+
+    <para>The default implementation of CachingHttpClient stores cache entries and
+    cached response bodies in memory in the JVM of your application. While this
+    offers high performance, it may not be appropriate for your application due to
+    the limitation on size or because the cache entries are ephemeral and don't
+    survive an application restart. The current release includes support for storing
+    cache entries using Ehcache and memcached implementations, which allow for
+    spilling cache entries to disk or storing them in an external process.</para>
+    
+    <para>If none of those options are suitable for your application, it is
+    possible to provide your own storage backend by implementing the HttpCacheStorage
+    interface and then supplying that to CachingHttpClient at construction time. In
+    this case, the cache entries will be stored using your scheme but you will get to
+    reuse all of the logic surrounding HTTP/1.1 compliance and cache handling.
+    Generally speaking, it should be possible to create an HttpCacheStorage
+    implementation out of anything that supports a key/value store (similar to the
+    Java Map interface) with the ability to apply atomic updates.</para>
+    
+    <para>Finally, because the CachingHttpClient is a decorator for HttpClient,
+    it's entirely possible to set up a multi-tier caching hierarchy; for example,
+    wrapping an in-memory CachingHttpClient around one that stores cache entries on
+    disk or remotely in memcached, following a pattern similar to virtual memory,
+    L1/L2 processor caches, etc.
+    </para>
+  </section>
 </chapter>
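
Illustrative note (not part of the commit): the new "Storage Backends" section
says a storage backend can be built on anything that behaves like a key/value
store with atomic updates. The sketch below shows that idea using only the JDK.
Every name in it (MapBackedStorageSketch, EntryStorage, InMemoryStorage and the
put/get/remove/update methods) is hypothetical and chosen for illustration; it
does not reproduce the actual HttpCacheStorage signatures, and a real backend
would store cache entry objects rather than plain strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

/**
 * Hypothetical stand-in for a cache storage backend. The names here are
 * illustrative assumptions, not the real HttpCacheStorage interface.
 */
public class MapBackedStorageSketch {

    /** Minimal key/value contract plus an atomic read-modify-write. */
    public interface EntryStorage<V> {
        void put(String key, V entry);
        V get(String key);
        void remove(String key);
        /** Applies the callback to the current value (null if absent) atomically. */
        void update(String key, UnaryOperator<V> callback);
    }

    /** In-memory variant backed by a ConcurrentHashMap. */
    public static final class InMemoryStorage<V> implements EntryStorage<V> {
        private final ConcurrentMap<String, V> entries = new ConcurrentHashMap<>();

        @Override
        public void put(String key, V entry) {
            entries.put(key, entry);
        }

        @Override
        public V get(String key) {
            return entries.get(key);
        }

        @Override
        public void remove(String key) {
            entries.remove(key);
        }

        @Override
        public void update(String key, UnaryOperator<V> callback) {
            // compute() runs the callback atomically for this key, so two
            // concurrent updates cannot overwrite each other's result.
            // If the callback returns null, the mapping is removed.
            entries.compute(key, (k, existing) -> callback.apply(existing));
        }
    }

    public static void main(String[] args) {
        EntryStorage<String> storage = new InMemoryStorage<>();
        storage.put("http://example.com/a", "v1");
        // Simulate a revalidation that rewrites the stored entry in place.
        storage.update("http://example.com/a",
                old -> old == null ? "fresh" : old + "+revalidated");
        System.out.println(storage.get("http://example.com/a"));
    }
}
```

The same contract could presumably be backed by Ehcache, memcached, or a
database, as long as the update step stays atomic for a given key; how each
of those stores achieves that is outside the scope of this sketch.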