Posted to commits@trafficserver.apache.org by jp...@apache.org on 2014/12/10 22:35:52 UTC

[1/3] trafficserver git commit: docs: update focused on architecture documentation

Repository: trafficserver
Updated Branches:
  refs/heads/master 1f1e2ae15 -> aa37d0ab5


http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/cache-data-structures.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/cache-data-structures.en.rst b/doc/arch/cache/cache-data-structures.en.rst
index 508c4da..1158051 100644
--- a/doc/arch/cache/cache-data-structures.en.rst
+++ b/doc/arch/cache/cache-data-structures.en.rst
@@ -1,28 +1,31 @@
 .. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
+
+.. _cache-data-structures:
 
 Cache Data Structures
-******************************
+*********************
 
 .. include:: common.defs
 
 .. cpp:class:: OpenDir
 
-   An open directory entry. It contains all the information of a :cpp:class:`Dir` plus additional information from the first :cpp:class:`Doc`.
+   An open directory entry. It contains all the information of a
+   :cpp:class:`Dir` plus additional information from the first :cpp:class:`Doc`.
 
 .. cpp:class:: CacheVC
 
@@ -30,15 +33,20 @@ Cache Data Structures
 
 .. cpp:function:: int CacheVC::openReadStartHead(int event, Event* e)
 
-   Do the initial read for a cached object.
+   Performs the initial read for a cached object.
 
 .. cpp:function:: int CacheVC::openReadStartEarliest(int event, Event* e)
 
-   Do the initial read for an alternate of an object.
+   Performs the initial read for an :term:`alternate` of an object.
 
 .. cpp:class:: HttpTunnel
 
-   Data transfer driver. This contains a set of *producers*. Each producer is connected to one or more *consumers*. The tunnel handles events and buffers so that data moves from producers to consumers. The data, as much as possible, is kept in reference counted buffers so that copies are done only when the data is modified or for sources (which acquire data from outside |TS|) and sinks (which move data to outside |TS|).
+   Data transfer driver. This contains a set of *producers*. Each producer is
+   connected to one or more *consumers*. The tunnel handles events and buffers
+   so that data moves from producers to consumers. The data, as much as
+   possible, is kept in reference counted buffers so that copies are done only
+   when the data is modified or for sources (which acquire data from outside
+   |TS|) and sinks (which move data to outside |TS|).
 
 .. cpp:class:: CacheControlResult
 
@@ -46,13 +54,17 @@ Cache Data Structures
 
 .. cpp:class:: CacheHTTPInfoVector
 
-   Defined in |P-CacheHttp.h|_. This is an array of :cpp:class:`HTTPInfo` objects and serves as the respository of information about alternates of an object. It is marshaled as part of the metadata for an object in the cache.
+   Defined in |P-CacheHttp.h|_. This is an array of :cpp:class:`HTTPInfo`
+   objects and serves as the repository of information about alternates of an
+   object. It is marshaled as part of the metadata for an object in the cache.
 
 .. cpp:class:: HTTPInfo
 
    Defined in |HTTP.h|_.
 
-   This class is a wrapper for :cpp:class:`HTTPCacheAlt`. It provides the external API for accessing data in the wrapped class. It contains only a pointer (possibly ``NULL``) to an instance of the wrapped class.
+   This class is a wrapper for :cpp:class:`HTTPCacheAlt`. It provides the
+   external API for accessing data in the wrapped class. It contains only a
+   pointer (possibly ``NULL``) to an instance of the wrapped class.
 
 .. cpp:class:: CacheHTTPInfo
 
@@ -62,12 +74,16 @@ Cache Data Structures
 
    Defined in |HTTP.h|_.
 
-   This is the metadata for a single alternate for a cached object. In contains among other data
+   This is the metadata for a single :term:`alternate` for a cached object. It
+   contains, among other data, the following:
 
    * The key for the earliest ``Doc`` of the alternate.
+
    * The request and response headers.
-   * The fragment offset table. [#]_
-   * Timestamps for request and response from origin server.
+
+   * The fragment offset table. [#fragment-offset-table]_
+
+   * Timestamps for request and response from :term:`origin server`.
 
 .. cpp:class:: EvacuationBlock
 
@@ -75,19 +91,25 @@ Cache Data Structures
 
 .. cpp:class:: Vol
 
-   This represents a storage unit inside a cache volume.
+   This represents a :term:`storage unit` inside a :term:`cache volume`.
 
    .. cpp:member:: off_t Vol::segments
 
-      The number of segments in the volume. This will be roughly the total number of entries divided by the number of entries in a segment. It will be rounded up to cover all entries.
+      The number of segments in the volume. This will be roughly the total
+      number of entries divided by the number of entries in a segment. It will
+      be rounded up to cover all entries.
 
    .. cpp:member:: off_t Vol::buckets
 
-      The number of buckets in the volume. This will be roughly the number of entries in a segment divided by ``DIR_DEPTH``. For currently defined values this is around 16,384 (2^16 / 4). Buckets are used as the targets of the index hash.
+      The number of buckets in the volume. This will be roughly the number of
+      entries in a segment divided by ``DIR_DEPTH``. For currently defined
+      values this is around 16,384 (2^16 / 4). Buckets are used as the targets
+      of the index hash.
 
    .. cpp:member:: DLL\<EvacuationBlock\> Vol::evacuate
 
-      Array of of :cpp:class:`EvacuationBlock` buckets. This is sized so there is one bucket for every evacuation span.
+      Array of :cpp:class:`EvacuationBlock` buckets. This is sized so there
+      is one bucket for every evacuation span.
 
    .. cpp:member:: off_t len
 
@@ -95,11 +117,13 @@ Cache Data Structures
 
 .. cpp:function:: int Vol::evac_range(off_t low, off_t high, int evac_phase)
 
-   Start an evacuation if there is any :cpp:class:`EvacuationBlock` in the range from *low* to *high*. Return 0 if no evacuation was started, non-zero otherwise.
+   Start an evacuation if there is any :cpp:class:`EvacuationBlock` in the range
+   from :arg:`low` to :arg:`high`. Return ``0`` if no evacuation was started,
+   non-zero otherwise.
 
 .. cpp:class:: CacheVol
 
-   A cache volume as described in :file:`volume.config`.
+   A :term:`cache volume` as described in :file:`volume.config`.
 
 .. cpp:class:: Doc
 
@@ -111,33 +135,48 @@ Cache Data Structures
 
    .. cpp:member:: uint32_t Doc::len
 
-      The length of this segment including the header length, fragment table, and this structure.
+      The length of this segment including the header length, fragment table,
+      and this structure.
 
    .. cpp:member:: uint64_t Doc::total_len
 
-      Total length of the entire document not including meta data but including headers.
+      Total length of the entire document not including metadata but including
+      headers.
 
    .. cpp:member:: INK_MD5 Doc::first_key
 
-      First index key in the document (the index key used to locate this object in the volume index).
+      First index key in the document (the index key used to locate this object
+      in the volume index).
 
    .. cpp:member:: INK_MD5 Doc::key
 
-      The index key for this fragment. Fragment keys are computationally chained so that the key for the next and previous fragments can be computed from this key.
+      The index key for this fragment. Fragment keys are computationally
+      chained so that the key for the next and previous fragments can be
+      computed from this key.
 
    .. cpp:member:: uint32_t Doc::hlen
 
-      Document header (metadata) length. This is not the length of the HTTP headers.
+      Document header (metadata) length. This is not the length of the HTTP
+      headers.
 
    .. cpp:member:: uint8_t Doc::ftype
 
-      Fragment type. Currently only `CACHE_FRAG_TYPE_HTTP` is used. Other types may be used for cache extensions if those are ever used / implemented.
+      Fragment type. Currently only ``CACHE_FRAG_TYPE_HTTP`` is used. Other
+      types may be used for cache extensions if those are ever implemented.
 
    .. cpp:member:: uint24_t Doc::flen
 
-      Fragment table length, if any. Only the first ``Doc`` in an object should contain a fragment table.
+      Fragment table length, if any. Only the first ``Doc`` in an object should
+      contain a fragment table.
 
-      The fragment table is a list of offsets relative to the HTTP content (not counting metadata or HTTP headers). Each offset is the byte offset of the first byte in the fragment. The first element in the table is the second fragment (what would be index 1 for an array). The offset for the first fragment is of course always zero and so not stored. The purpose of this is to enable a fast seek for range requests - given the first ``Doc`` the fragment containing the first byte in the range can be computed and loaded directly without further disk access.
+      The fragment table is a list of offsets relative to the HTTP content (not
+      counting metadata or HTTP headers). Each offset is the byte offset of the
+      first byte in the fragment. The first element in the table is the second
+      fragment (what would be index 1 for an array). The offset for the first
+      fragment is of course always zero and so not stored. The purpose of this
+      is to enable a fast seek for range requests. Given the first ``Doc`` the
+      fragment containing the first byte in the range can be computed and loaded
+      directly without further disk access.
 
       Removed as of version 3.3.0.
 
@@ -155,10 +194,14 @@ Cache Data Structures
 
    .. cpp:member:: uint32_t checksum
 
-      Unknown. (A checksum of some sort)
+      Unknown.
 
 .. cpp:class:: VolHeaderFooter
 
 .. rubric:: Footnotes
 
-.. [#] Changed in version 3.2.0. This previously resided in the first ``Doc`` but that caused different alternates to share the same fragment table.
+.. [#fragment-offset-table]
+
+   Changed in version 3.2.0. This previously resided in the first ``Doc`` but
+   that caused different alternates to share the same fragment table.
+
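
For readers of this diff, the fragment table lookup described under ``Doc::flen`` can be sketched as below. This is a minimal Python illustration, not code from the repository: the function name ``fragment_index`` and the sample offsets are hypothetical, and it assumes only what the text states, namely that the table stores the content offset of the first byte of every fragment except the first, whose offset is always zero.

```python
from bisect import bisect_right


def fragment_index(offset_table, byte_offset):
    """Return the 0-based index of the fragment containing byte_offset.

    offset_table holds the content offset of the first byte of fragments
    1..n; fragment 0 starts at offset 0 and is not stored.
    """
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    # bisect_right counts the stored offsets that are <= byte_offset,
    # which is the index of the fragment holding that byte.
    return bisect_right(offset_table, byte_offset)


# Hypothetical object whose fragments start at content offsets
# 0, 1000, 2500 and 4000; only the last three are stored.
table = [1000, 2500, 4000]
first_fragment_for_range = fragment_index(table, 2600)
```

With the table in hand, the fragment holding the first byte of a range request is found by a binary search, with no disk access needed to locate it.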

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/cache.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/cache.en.rst b/doc/arch/cache/cache.en.rst
index f103d9f..4e50649 100644
--- a/doc/arch/cache/cache.en.rst
+++ b/doc/arch/cache/cache.en.rst
@@ -1,19 +1,19 @@
 .. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
- 
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
    http://www.apache.org/licenses/LICENSE-2.0
- 
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
+
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
 
 Apache Traffic Server Cache
 ***************************
@@ -30,4 +30,4 @@ Contents:
    tier-storage.en
    ram-cache.en
 
-..   appendix
+.. appendix

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/ram-cache.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/ram-cache.en.rst b/doc/arch/cache/ram-cache.en.rst
index b0b15e1..e7174cf 100644
--- a/doc/arch/cache/ram-cache.en.rst
+++ b/doc/arch/cache/ram-cache.en.rst
@@ -5,9 +5,9 @@
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at
-   
+
    http://www.apache.org/licenses/LICENSE-2.0
-   
+
    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -21,68 +21,149 @@
 Ram Cache
 *********
 
-New Ram Cache Algorithm (CLFUS)
+New RAM Cache Algorithm (CLFUS)
 ===============================
 
-The new Ram Cache uses ideas from a number of cache replacement policies and algorithms, including LRU, LFU, CLOCK, GDFS and 2Q, called CLFUS (Clocked Least Frequently Used by Size). It avoids any patented algorithms and includes the following features:
+The new RAM Cache algorithm, called CLFUS (Clocked Least Frequently Used by
+Size), draws on ideas from a number of cache replacement policies and
+algorithms, including LRU, LFU, CLOCK, GDFS and 2Q. It avoids any patented
+algorithms and includes the following features:
 
-* Balances Recentness, Frequency and Size to maximize hit rate (not byte hit rate).
-* Is Scan Resistant and extracts robust hit rates even when the working set does not fit in the Ram Cache.
-* Supports compression at 3 levels fastlz, gzip(libz), and xz(liblzma).  Compression can be moved to another thread.
-* Has very low CPU overhead, only little more than a basic LRU.  Rather than using an O(lg n) heap, it uses a probabilistic replacement policy for O(1) cost with low C.
-* Has relatively low memory overhead of approximately 200 bytes per object in memory.
+* Balances Recentness, Frequency and Size to maximize hit rate (not byte hit
+  rate).
 
-The rational for emphasizing hit rate over byte hit rate is that the overhead of pulling more bytes from secondary storage is low compared to the cost of a request.
+* Is Scan Resistant and extracts robust hit rates even when the working set does
+  not fit in the RAM Cache.
 
-The Ram Cache consists of an object hash fronting 2 LRU/CLOCK lists and a "Seen" hash table.  The first "Cached" list contains objects in memory while the second contains a "History" of objects which have either recently been in memory or are being considered for keeping in memory.  The "Seen" hash table is used to make the algorithm scan resistant.
+* Supports compression at 3 levels: fastlz, gzip (libz), and xz (liblzma).
+  Compression can be moved to another thread.
 
-The list entries record the following information:
+* Has very low CPU overhead, only slightly more than a basic LRU. Rather than
+  using an O(lg n) heap, it uses a probabilistic replacement policy for O(1)
+  cost with low C.
 
-* key - 16 byte unique object identifier
-* auxkeys - 8 bytes worth of version number (in our system the block in the partition).  When the version of an object changes old entries are purged from the cache.
-* hits - number of hits within this clock period
-* size - the size of the object in the cache
-* len - the actual length of the object (differs from size because of compression and padding)
-* compressed_len - the compressed length of the object
-* compressed (none, fastlz, libz, liblzma)
-* uncompressible (flag)
-* copy - whether or not this object should be copied in and copied out (e.g. HTTP HDR)
-* LRU link
-* HASH link
-* IOBufferData (smart point to the data buffer)
+* Has relatively low memory overhead of approximately 200 bytes per object in
+  memory.
 
+The rationale for emphasizing hit rate over byte hit rate is that the overhead
+of pulling more bytes from secondary storage is low compared to the cost of a
+request.
+
+The RAM Cache consists of an object hash fronting 2 LRU/CLOCK lists and a *seen*
+hash table. The first cached list contains objects in memory, while the second
+contains a history of objects which have either recently been in memory or are
+being considered for keeping in memory. The *seen* hash table is used to make
+the algorithm scan resistant.
+
+The list entries record the following information:
 
-The interface to the cache is Get and Put operations.  Get operations check if an object is in the cache and are called on a read attempt.  The Put operation decides whether or not to cache the provided object in memory.  It is called after a read from secondary storage.
+============== ================================================================
+Value          Description
+============== ================================================================
+key            16 byte unique object identifier
+auxkeys        8 bytes worth of version number (in our system, the block in the
+               partition). When the version of an object changes old entries are
+               purged from the cache.
+hits           Number of hits within this clock period.
+size           Size of the object in the cache.
+len            Length of the object (differs from *size* because of
+               compression and padding).
+compressed_len Compressed length of the object.
+compressed     Compression type, or ``none`` if no compression. Possible types
+               are: *fastlz*, *libz*, and *liblzma*.
+uncompressible Flag indicating that content cannot be compressed (true), or that
+               it may be compressed (false).
+copy           Whether or not this object should be copied in and copied out
+               (e.g. HTTP HDR).
+LRU link
+HASH link
+IOBufferData   Smart pointer to the data buffer.
+============== ================================================================
+
+The interface to the cache is *Get* and *Put* operations. Get operations check
+if an object is in the cache and are called on a read attempt. The Put operation
+decides whether or not to cache the provided object in memory. It is called
+after a read from secondary storage.
 
 Seen Hash
 =========
 
-The Seen List becomes active after the Cached and History lists become full after a cold start.  The purpose is to make the cache scan resistant which means that the cache state must not be effected at all by a long sequence Get and Put operations on objects which are seen only once.  This is essential, without it not only would the cache be polluted, but it could lose critical information about the objects that it cares about.  It is therefore essential that the Cache and History lists are not effected by Get or Put operations on objects seen the first time.  The Seen Hash maintains a set of 16 bit hash tags, and requests which do not hit in the object cache (are in the Cache List or History List) and do not match the hash tag result in the hash tag begin updated but are otherwise ignored. The Seen Hash is sized to approximately the number of objects in the cache in order to match the number that are passed through it with the CLOCK rate of the Cached and History Lists. 
+The *Seen List* becomes active after the *Cached* and *History* lists become
+full following a cold start. The purpose is to make the cache scan resistant,
+which means that the cache state must not be affected at all by a long sequence
+of Get and Put operations on objects which are seen only once. This is essential,
+and without it not only would the cache be polluted, but it could lose critical
+information about the objects that it cares about. It is therefore essential
+that the Cache and History lists are not affected by Get or Put operations on
+objects seen the first time. The Seen Hash maintains a set of 16 bit hash tags,
+and requests which do not hit in the object cache (are in neither the Cached
+List nor the History List) and do not match the hash tag result in the hash
+tag being updated but are otherwise ignored. The Seen Hash is sized to
+approximately the number of objects in the cache in order to match the number
+that are passed through it with the CLOCK rate of the Cached and History Lists.
 
 Cached List
 ===========
 
-The Cached list contains objects actually in memory.  The basic operation is LRU with new entries inserted into a FIFO (queue) and hits causing objects to be reinserted.  The interesting bit comes when an object is being considered for insertion.  First we check if the Object Hash to see if the object is in the Cached List or History.  Hits result in updating the "hit" field and reinsertion.  History hits result in the "hit" field being updated and a comparison to see if this object should be kept in memory.  The comparison is against the least recently used members of the Cache List, and is based on a weighted frequency::
+The *Cached List* contains objects actually in memory. The basic operation is
+LRU with new entries inserted into a FIFO queue and hits causing objects to be
+reinserted. The interesting bit comes when an object is being considered for
+insertion. A check is first made against the Object Hash to see if the object
+is in the Cached List or History. Hits result in updating the ``hit`` field and
+reinsertion of the object. History hits result in the ``hit`` field being
+updated and a comparison to see if this object should be kept in memory. The
+comparison is against the least recently used members of the Cache List, and
+is based on a weighted frequency::
 
    CACHE_VALUE = hits / (size + overhead)
 
-A new object must beat enough bytes worth of currently cached objects to cover itself.  Each time an object is considered for replacement the CLOCK moves forward.  If the History object has a greater value then it is inserted into the Cached List and the replaced objects are removed from memory and their list entries are inserted into the History List.  If the History object has a lesser value it is reinserted into the History List.  Objects considered for replacement (at least one) but not replaced have their "hits" field set to zero and are reinserted into the Cached List.  This is the CLOCK operation on the Cached List.
+A new object must beat enough bytes worth of currently cached objects to cover
+itself. Each time an object is considered for replacement the CLOCK moves
+forward. If the History object has a greater value then it is inserted into the
+Cached List and the replaced objects are removed from memory and their list
+entries are inserted into the History List. If the History object has a lesser
+value it is reinserted into the History List. Objects considered for replacement
+(at least one) but not replaced have their ``hits`` field set to ``0`` and are
+reinserted into the Cached List. This is the CLOCK operation on the Cached List.
 
 History List
 ============
 
-Each CLOCK the least recently used entry in the History List is dequeued and if the "hits" field is not greater than 1 (it was hit at least once in the History or Cached List) it is deleted, otherwise the "hits" is set to zero and it is requeued on the History List. 
-
-Compression/Decompression
-=========================
-
-Compression is performed by a background operation (currently called as part of Put) which maintains a pointer into the Cached List and runs toward the head compressing entries.  Decompression occurs on demand during a Get.  In the case of objects tagged "copy" the compressed version is reinserted in the LRU since we need to make a copy anyway.  Those not tagged "copy" are inserted uncompressed in the hope that they can be reused in uncompressed form.  This is a compile time option and may be something we want to change.
-
-There are 3 algorithms and levels of compression (speed on 1 thread i7 920) :
-
-* fastlz: 173 MB/sec compression, 442 MB/sec decompression : basically free since disk or network will limit first, ~53% final size
-* libz: 55 MB/sec compression, 234 MB/sec decompression : almost free, particularly decompression, ~37% final size
-* liblzma: 3 MB/sec compression, 50 MB/sec decompression : expensive, ~27% final size
-
-These are ballpark numbers, and your millage will vary enormously.  JPEG for example will not compress with any of these. The RamCache does detect compression level and will declare something "incompressible" if it doesn't get below 90% of the original size. This value is cached so that the RamCache will not attempt to compress it again (at least as long as it is in the history).
+Each CLOCK, the least recently used entry in the History List is dequeued and
+if the ``hits`` field is not greater than ``1`` (it was hit at least once in
+the History or Cached List) it is deleted. Otherwise, the ``hits`` is set to
+``0`` and it is requeued on the History List.
+
+Compression and Decompression
+=============================
+
+Compression is performed by a background operation (currently called as part of
+Put) which maintains a pointer into the Cached List and runs toward the head
+compressing entries. Decompression occurs on demand during a Get. In the case
+of objects tagged ``copy``, the compressed version is reinserted in the LRU
+since we need to make a copy anyway. Those not tagged ``copy`` are inserted
+uncompressed in the hope that they can be reused in uncompressed form. This is
+a compile time option and may be something we want to change.
+
+There are 3 algorithms and levels of compression (speed on an Intel i7 920
+series processor using one thread):
+
+======= ================ ================== ====================================
+Method  Compression Rate Decompression Rate Notes
+======= ================ ================== ====================================
+fastlz  173 MB/sec       442 MB/sec         Basically free since disk or network
+                                            will limit first; ~53% final size.
+libz    55 MB/sec        234 MB/sec         Almost free, particularly
+                                            decompression; ~37% final size.
+liblzma 3 MB/sec         50 MB/sec          Expensive; ~27% final size.
+======= ================ ================== ====================================
+
+These are ballpark numbers, and your mileage will vary enormously. JPEG, for
+example, will not compress with any of these (or at least will only do so at
+such a marginal level that the cost of compression and decompression is wholly
+unjustified), and the same is true of many other media and binary file types
+which embed some form of compression. The RAM Cache does detect compression
+level and will declare something *incompressible* if it doesn't get below 90% of
+the original size. This value is cached so that the RAM Cache will not attempt
+to compress it again (at least as long as it is in the history).
 
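
The Seen Hash filter described in the new ``ram-cache.en.rst`` text can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the class ``SeenHash``, the method ``check_and_update``, and the choice of using the low bits of a key hash for the slot and the next 16 bits for the tag. The text only states that 16 bit tags are kept and that a first sighting updates the tag and is otherwise ignored.

```python
class SeenHash:
    """Toy filter that remembers one 16 bit tag per slot (hypothetical layout)."""

    def __init__(self, slots):
        # None marks a slot that has never been written.
        self.tags = [None] * slots

    def check_and_update(self, key_hash):
        """Return True if this key's tag was already recorded in its slot.

        On a mismatch the tag is overwritten and False is returned, so an
        object seen only once never reaches the Cached or History lists.
        """
        slot = key_hash % len(self.tags)
        tag = (key_hash >> 16) & 0xFFFF
        if self.tags[slot] == tag:
            return True
        self.tags[slot] = tag
        return False


seen = SeenHash(8)
first = seen.check_and_update(0x12345)   # first sighting: tag recorded
second = seen.check_and_update(0x12345)  # second sighting: tag matches
```

A long scan of one-time objects only overwrites tags in this table, which is how the Cached and History lists stay untouched.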

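The ``CACHE_VALUE`` comparison from the Cached List text can be sketched in Python as follows. This is a simplified reading, not the Traffic Server implementation: the names ``Entry``, ``cache_value`` and ``consider`` are hypothetical, the ``OVERHEAD`` constant is an assumed figure, and the rule that the candidate loses as soon as one victim has an equal or higher value is an assumption about how "beat enough bytes" is applied.

```python
from dataclasses import dataclass

OVERHEAD = 200  # assumed per-object overhead in bytes, for illustration only


@dataclass
class Entry:
    key: str
    hits: int
    size: int


def cache_value(entry, overhead=OVERHEAD):
    """Weighted frequency from the text: hits / (size + overhead)."""
    return entry.hits / (entry.size + overhead)


def consider(candidate, cached_lru_first):
    """Decide whether a History hit should displace cached entries.

    cached_lru_first lists Cached List entries, least recently used first.
    Returns (admitted, replaced_entries).
    """
    considered = []
    freed = 0
    for victim in cached_lru_first:
        considered.append(victim)
        if cache_value(victim) >= cache_value(candidate):
            break  # the candidate failed to beat this victim
        freed += victim.size
        if freed >= candidate.size:
            return True, considered  # enough bytes beaten to cover it
    # Candidate lost: entries considered but not replaced get their hits
    # reset, which is the CLOCK step described in the text.
    for victim in considered:
        victim.hits = 0
    return False, []
```

For example, a 1000 byte candidate with 4 hits beats two cold entries of 600 and 800 bytes with 1 hit each and replaces both, while a candidate with 1 hit loses to a 600 byte entry with 3 hits, whose ``hits`` field is then reset to ``0``.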
http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/tier-storage.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/tier-storage.en.rst b/doc/arch/cache/tier-storage.en.rst
index 933b02b..88c147d 100644
--- a/doc/arch/cache/tier-storage.en.rst
+++ b/doc/arch/cache/tier-storage.en.rst
@@ -15,104 +15,129 @@
    specific language governing permissions and limitations
    under the License.
 
-==============================
 Tiered Storage Design
 ==============================
 
 .. include:: common.defs
 
---------------
 Introduction
 --------------
 
-Tiered storage is an attempt to allow |TS| to take advantage of physical storage with different properties. This design
-concerns only mechanism. Policies to take advantage of these are outside of the scope of this document. Instead we will
-presume an *oracle* which implements this policy and describe the queries that must be answered by the oracle and the
-effects of the answers.
-
-Beyond avoiding question of tier policy the design is also intended to be effectively identical to current operations
-for the case where there is only one tier.
-
-The most common case for tiers is an ordered list of tiers, where higher tiers are presumed faster but more expensive
-(or more limited in capacity). This is not required. It might be that different tiers are differentiated by other
-properties (such as expected persistence). The design here is intended to handle both cases.
-
-The design presumes that if a user has multiple tiers of storage and an ordering for those tiers, they will usually want
-content stored at one tier level to also be stored at every other lower level as well, so that it does not have to be
+Tiered storage is an attempt to allow |TS| to take advantage of physical storage
+with different properties. This design concerns only mechanism. Policies to take
+advantage of these are outside of the scope of this document. Instead we will
+presume an *oracle* which implements this policy and describe the queries that
+must be answered by the oracle and the effects of the answers.
+
+Beyond avoiding the question of tier policy, the design is also intended to be
+effectively identical to current operations for the case where there is only
+one tier.
+
+The most common case for tiers is an ordered list of tiers, where higher tiers
+are presumed faster but more expensive (or more limited in capacity). This is
+not required. It might be that different tiers are differentiated by other
+properties (such as expected persistence). The design here is intended to
+handle both cases.
+
+The design presumes that if a user has multiple tiers of storage and an ordering
+for those tiers, they will usually want content stored at one tier level to also
+be stored at every other lower level as well, so that it does not have to be
 copied if evicted from a higher tier.
 
--------------
 Configuration
 -------------
 
-Each storage unit in :file:`storage.config` can be marked with a *quality* value which is 32 bit number. Storage units
-that are not marked are all assigned the same value which is guaranteed to be distinct from all explicit values. The
-quality value is arbitrary from the point of view of this design, serving as a tag rather than a numeric value. The user
-(via the oracle) can impose what ever additional meaning is useful on this value (rating, bit slicing, etc.). In such
-cases all volumes should be explicitly assigned a value, as the default / unmarked value is not guaranteed to have any
-relationship to explicit values. The unmarked value is intended to be useful in situations where the user has no
-interest in tiered storage and so wants to let Traffic Server automatically handle all volumes as a single tier.
+Each :term:`storage unit` in :file:`storage.config` can be marked with a
+*quality* value, which is a 32-bit number. Storage units that are not marked are
+all assigned the same value which is guaranteed to be distinct from all explicit
+values. The quality value is arbitrary from the point of view of this design,
+serving as a tag rather than a numeric value. The user (via the oracle) can
+impose whatever additional meaning is useful on this value (rating, bit
+slicing, etc.).
+
+In such cases, all :term:`volumes <cache volume>` should be explicitly assigned
+a value, as the default (unmarked) value is not guaranteed to have any
+relationship to explicit values. The unmarked value is intended to be useful in
+situations where the user has no interest in tiered storage and so wants to let
+|TS| automatically handle all volumes as a single tier.
 
--------------
 Operations
--------------
+----------
 
-After a client request is received and processed, volume assignment is done. This would be changed to do volume assignment across all tiers simultaneously. For each tier the oracle would return one of four values along with a volume pointer.
+After a client request is received and processed, volume assignment is done. For
+each tier, the oracle would return one of four values along with a volume
+pointer:
 
-`READ`
+``READ``
     The tier appears to have the object and can serve it.
 
-`WRITE`
-    The object is not in this tier and should be written to this tier if possible.
+``WRITE``
+    The object is not in this tier and should be written to this tier if
+    possible.
 
-`RW`
-    Treat as `READ` if possible but if the object turns out to not in the cache treat as `WRITE`.
+``RW``
+    Treat as ``READ`` if possible, but if the object turns out not to be in the
+    cache, treat as ``WRITE``.
 
-`NO_SALE`
+``NO_SALE``
     Do not interact with this tier for this object.
 
-The volume returned for the tier must be a volume with the corresponding tier quality value. In effect the current style
-of volume assignment is done for each tier, by assigning one volume out of all of the volumes of the same quality and
-returning one of `RW` or `WRITE` depending on whether the initial volume directory lookup succeeds. Note that as with
-current volume assignment it is presumed this can be done from in memory structures (no disk I/O required).
+The :term:`volume <cache volume>` returned for the tier must be a volume with
+the corresponding tier quality value. In effect, the current style of volume
+assignment is done for each tier, by assigning one volume out of all of the
+volumes of the same quality and returning one of ``RW`` or ``WRITE``, depending
+on whether the initial volume directory lookup succeeds. Note that as with
+current volume assignment, it is presumed this can be done from in-memory
+structures (no disk I/O required).
+
+If the oracle returns ``READ`` or ``RW`` for more than one tier, it must also
+return an ordering for those tiers (it may return an ordering for all tiers;
+ones that are not readable will be ignored). For each tier, in that order, a
+read of cache storage is attempted for the object. A successful read locks that
+tier as the provider of cached content. If no tier has a successful read, or no
+tier is marked ``READ`` or ``RW``, then it is a cache miss. Any tier marked
+``RW`` that fails the read test is demoted to ``WRITE``.
+
+If the object is cached, every tier that returns ``WRITE`` receives the object
+to store in the selected volume (this includes ``RW`` returns that are demoted
+to ``WRITE``). This is a cache to cache copy, not from the :term:`origin server`.
+In this case, tiers marked ``RW`` that are not tested for read will not receive
+any data and will not be further involved in the request processing.
+
+For a cache miss, all tiers marked ``WRITE`` will receive data from the origin
+server connection (if successful).
+
+This means, among other things, that if there is a tier with the object, all
+other tiers that are written will get a local copy of the object, and the origin
+server will not be used. In terms of implementation, currently a cache write to
+a volume is done via the construction of an instance of :cpp:class:`CacheVC`
+which receives the object stream. For tiered storage, the same thing is done
+for each target volume.
+
+For cache volume overrides (via :file:`hosting.config`), this same process is
+used, except with only the stripes contained within the specified cache
+volume.
 
-If the oracle returns `READ` or `RW` for more than one tier, it must also return an ordering for those tiers (it may
-return an ordering for all tiers, ones that are not readable will be ignored). For each tier, in that order, a read of
-cache storage is attempted for the object. A successful read locks that tier as the provider of cached content. If no
-tier has a successful read, or no tier is marked `READ` or `RW` then it is a cache miss. Any tier marked `RW` that fails
-the read test is demoted to `WRITE`.
-
-If the object is cached every tier that returns `WRITE` receives the object to store in the selected volume (this
-includes `RW` returns that are demoted to `WRITE`). This is a cache to cache copy, not from the origin server. In this
-case tiers marked `RW` that are not tested for read will not receive any data and will not be further involved in the
-request processing.
-
-For a cache miss, all tiers marked `WRITE` will receive data from the origin server connection (if successful).
-
-This means, among other things, that if there is a tier with the object all other tiers that are written will get a
-local copy of the object, the origin server will not be used. In terms of implementation, currently a cache write to a
-volume is done via the construction of an instance of :cpp:class:`CacheVC` which recieves the object stream. For tiered storage the
-same thing is done for each target volume.
-
-For cache volume overrides (e.g. via :file:`hosting.config`) this same process is used except with only the volumes
-stripes contained within the specified cache volume.
-
--------
 Copying
 -------
 
-It may be necessary to provide a mechanism to copy objects between tiers outside of a client originated transaction. In
-terms of implementation this is straight forward using :cpp:class:`HttpTunnel` as if in a transaction only using a :cpp:class:`CacheVC`
-instance for both the producer and consumer. The more difficult question is what event would trigger a possible copy. A
-signal could be provided whenever a volume directory entry is deleted although it should be noted that the object in
-question may have already been evicted when this event happens.
+It may be necessary to provide a mechanism to copy objects between tiers outside
+of a client originated transaction. In terms of implementation, this is
+straightforward using :cpp:class:`HttpTunnel` as if in a transaction, only using a
+:cpp:class:`CacheVC` instance for both the producer and consumer. The more
+difficult question is what event would trigger a possible copy. A signal could
+be provided whenever a volume directory entry is deleted, although it should be
+noted that the object in question may have already been evicted when this event
+happens.
 
-----------------
 Additional Notes
 ----------------
 
-As an example use, it would be possible to have only one cache volume that uses tiered storage for a particular set of
-domains using volume tagging. :file:`hosting.config` would be used to direct those domains to the selected cache volume.
-The oracle would check the URL in parallel and return `NO_SALE` for the tiers in the target cache volume for other
-domains. For the other tier (that of the unmarked storage units) the oracle would return `RW` for the tier in all cases
-as that tier would not be queried for the target domains.
+As an example use, it would be possible to have only one cache volume that uses
+tiered storage for a particular set of domains using volume tagging.
+:file:`hosting.config` would be used to direct those domains to the selected
+cache volume. The oracle would check the URL in parallel and return ``NO_SALE``
+for the tiers in the target cache volume for other domains. For the other tier
+(that of the unmarked storage units), the oracle would return ``RW`` for the
+tier in all cases as that tier would not be queried for the target domains.
+
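The tier handling described under "Operations" above can be sketched as follows.
This is a minimal illustration in C++, not code from the Traffic Server tree:
``TierAnswer``, ``TierState``, ``Resolution`` and ``resolve_tiers`` are
hypothetical names, and the ``has_object`` flag stands in for an actual read
attempt against the tier.

.. code-block:: cpp

   #include <cstddef>
   #include <vector>

   // Hypothetical stand-ins; none of these names exist in Traffic Server.
   enum class TierAnswer { READ, WRITE, RW, NO_SALE };

   struct TierState {
     TierAnswer answer; // what the oracle returned for this tier
     bool has_object;   // assumed outcome of a read attempt on this tier
   };

   struct Resolution {
     int provider = -1;                // index of the serving tier, -1 on a miss
     std::vector<std::size_t> writers; // tiers that receive a copy of the object
   };

   // Walk the tiers in the oracle's read order. The first successful read
   // locks that tier as the provider. An RW tier that fails its read is
   // demoted to WRITE; RW tiers that are never tested receive no data.
   Resolution resolve_tiers(std::vector<TierState> tiers,
                            const std::vector<std::size_t> &read_order)
   {
     Resolution r;
     for (std::size_t idx : read_order) {
       if (idx >= tiers.size()) {
         continue; // ignore orderings that name a nonexistent tier
       }
       TierState &t = tiers[idx];
       if (t.answer != TierAnswer::READ && t.answer != TierAnswer::RW) {
         continue; // tiers that are not readable are ignored
       }
       if (t.has_object) {
         r.provider = static_cast<int>(idx);
         break;
       }
       if (t.answer == TierAnswer::RW) {
         t.answer = TierAnswer::WRITE; // failed read: demote
       }
     }
     // Every tier now marked WRITE gets the object, either copied from the
     // provider tier (hit) or from the origin server connection (miss).
     for (std::size_t i = 0; i < tiers.size(); ++i) {
       if (tiers[i].answer == TierAnswer::WRITE) {
         r.writers.push_back(i);
       }
     }
     return r;
   }

With four tiers answering ``RW`` (object absent), ``READ`` (object present),
``RW`` (object present) and ``WRITE``, and a read order of 0, 1, 2, the sketch
selects tier 1 as the provider and writes to tiers 0 and 3. Tier 2 is never
tested and so receives nothing.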

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/hacking/config-var-impl.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/hacking/config-var-impl.en.rst b/doc/arch/hacking/config-var-impl.en.rst
index 2f3a584..a0afcaf 100644
--- a/doc/arch/hacking/config-var-impl.en.rst
+++ b/doc/arch/hacking/config-var-impl.en.rst
@@ -44,25 +44,25 @@
 .. _RECU_DYNAMIC: recu-dynamic_
 
 
-=====================================
 Configuration Variable Implementation
 =====================================
 
-Adding a new configuration variable in :file:`records.config` requires a number of steps which are mostly documented
-here.
+Adding a new configuration variable in :file:`records.config` requires a number
+of steps which are mostly documented here.
 
-Before adding a new configuration variable, please discuss it on the mailing list. It will commonly be the case that a
-better name will be suggested or a more general approach to the problem which solves several different issues.
+Before adding a new configuration variable, please discuss it on the mailing
+list. It will commonly be the case that a better name, or a more general
+approach to the problem which solves several different issues, may be suggested.
 
-=====================================
 Defining the Variable
-=====================================
+=====================
 
-To begin the new configuration variables must be added to |RecordsConfig.cc|_. This contains a long array of
-configuration variable records. The fields for each record are
+To begin, the new configuration variables must be added to |RecordsConfig.cc|_.
+This contains a long array of configuration variable records. The fields for
+each record are:
 
 type:``RecT``
-   Type of record. There valid values are
+   Type of record. The valid values are:
 
    ``RECT_NULL``
       Undefined record.
@@ -85,22 +85,28 @@ type:``RecT``
    ``RECT_PLUGIN``
       Plugin created statistic.
 
-   In general ``RECT_CONFIG`` should be used unless it is required that the value not be shared among members of a
-   cluster in which case ``RECT_LOCAL`` should be used. If you use ``RECT_LOCAL`` you must also start the line with ``LOCAL`` instead of ``CONFIG`` and the name should use ``.local.`` instead of ``.config.``.
+   In general, ``RECT_CONFIG`` should be used unless it is required that the
+   value not be shared among members of a cluster, in which case ``RECT_LOCAL``
+   should be used. If you use ``RECT_LOCAL``, you must also start the line with
+   ``LOCAL`` instead of ``CONFIG`` and the name should use ``.local.`` instead
+   of ``.config.``.
 
 name:``char const*``
-   The fully qualified name of the configuration variable. Although there appears to be a hierarchial naming scheme,
-   that's just a convention, it is not actually used by the code. Nonetheless new variables should adhere to the
-   hierarchial scheme.
+   The fully qualified name of the configuration variable. Although there
+   appears to be a hierarchical naming scheme, that's just a convention, and it
+   is not actually used by the code. Nonetheless, new variables should adhere
+   to the hierarchical scheme.
 
 value_type:``RecDataT``
-   The data type of the value. It should be one of ``RECD_INT``, ``RECD_STRING``, ``RECD_FLOAT`` as appropriate.
+   The data type of the value. It should be one of ``RECD_INT``,
+   ``RECD_STRING``, ``RECD_FLOAT`` as appropriate.
 
 default:``char const*``
-   The default value for the variable. This is always a string regardless of the *value_type*.
+   The default value for the variable. This is always a string regardless of
+   the *value_type*.
 
 update:``RecUpdateT``
-   Information about how the variable is updated. The valid values are
+   Information about how the variable is updated. The valid values are:
 
    ``RECU_NULL``
       Behavior is unknown or unspecified.
@@ -120,12 +126,14 @@ update:``RecUpdateT``
       The :ref:`traffic_cop` process must be restarted for a new value to take effect.
 
 required:``RecordRequiredType``
-   Effectively a boolean that specifies if the record is required to be present, with ``RR_NULL`` meaning not required
-   and ``RR_REQUIRED`` indicating that it is required. Given that using ``RR_REQUIRED`` would be a major
+   Effectively a boolean that specifies if the record is required to be present,
+   with ``RR_NULL`` meaning not required and ``RR_REQUIRED`` indicating that it
+   is required. Given that using ``RR_REQUIRED`` would be a major
    incompatibility, ``RR_NULL`` is generally the better choice.
 
 check:``RecCheckT``
-   Additional type checking. It is unclear if this is actually implemented. The valid values are
+   Additional type checking. It is unclear if this is actually implemented. The
+   valid values are:
 
    ``RECC_NULL``
       No additional checking.
@@ -139,12 +147,15 @@ check:``RecCheckT``
    ``RECC_IP``
       Verify the value is an IP address. Unknown if this checks for IPv6.
 
+.. XXX confirm RECC_IP & IPv6 behavior
+
 pattern:``char const*``
-   Even more validity checking. This provides a regular expressions (PCRE format) for validating the value. This can be
+   This provides a regular expression (PCRE format) for validating the value,
+   beyond the basic type validation performed by ``RecCheckT``. This can be
    ``NULL`` if there is no regular expression to use.
 
 access:``RecAccessT``
-   Access control. The valid values are
+   Access control. The valid values are:
 
    ``RECA_NULL``
       The value is read / write.
@@ -153,27 +164,34 @@ access:``RecAccessT``
       The value is read only.
 
    ``RECA_NO_ACCESS``
-      No access to the value - only privileged levels parts of the ATS can access the value.
+      No access to the value; only privileged level parts of ATS can access the
+      value.
 
-=====================================
 Variable Infrastructure
-=====================================
+=======================
 
-The primary effort in defining a configuration variable is handling updates, generally via :option:`traffic_line -x`. This
-is handled in a generic way, as described in the next section, or in a :ref:`more specialized way
-<http-config-var-impl>` (built on top of the generic mechanism) for HTTP related configuration variables. This is only
-needed if the variable is marked as dynamically updateable (|RECU_DYNAMIC|_) although HTTP configuration variables
-should be dynamic if possible.
+The primary effort in defining a configuration variable is handling updates,
+generally via :option:`traffic_line -x`. This is handled in a generic way, as
+described in the next section, or in a :ref:`more specialized way <http-config-var-impl>`
+(built on top of the generic mechanism) for HTTP related configuration
+variables. This is only needed if the variable is marked as dynamically
+updateable (|RECU_DYNAMIC|_) although HTTP configuration variables should be
+dynamic if possible.
 
---------------------------
 Documentation and Defaults
 --------------------------
 
-A configuration variable should be documented in :file:`records.config`. There are many examples  in the file already that can be used for guidance. The general format is to use the tag ::
+A configuration variable should be documented in :file:`records.config`. There
+are many examples in the file already that can be used for guidance. The general
+format is to use the tag ::
 
-   .. ts:cv::
+   .. ts:cv:`variable.name.here`
 
-The arguments to this are the same as for the configuration file. The documentation generator will pick out key bits and use them to decorate the entry. In particular if a value is present it will be removed and used as the default value. You can attach some additional options to the variable. These are
+The arguments to this are the same as for the configuration file. The
+documentation generator will pick out key bits and use them to decorate the
+entry. In particular if a value is present it will be removed and used as the
+default value. You can attach some additional options to the variable. These
+are:
 
 reloadable
    The variable can be reloaded via command line on a running Traffic Server.
@@ -186,98 +204,122 @@ deprecated
 
 .. topic:: Example
 
-   ::
-
+   \:ts\:cv\:\`custom.variable\`
       :reloadable:
       :metric: minutes
       :deprecated:
 
-If you need to refer to another configuration variable in the documentation, you can use the form ::
+If you need to refer to another configuration variable in the documentation, you
+can use the form ::
 
    :ts:cv:`the.full.name.of.the.variable`
 
-This will display the name as a link to the definition.
+This will display the name as a link to the full definition.
 
-In general a new configuration variable should not be present in the default :file:`records.config`. If it is added, such defaults should be added to the file ``proxy/config/records.config.default.in``. This is used to generate the default :file:`records.config`. Just add the variable to the file in an appropriate place with a proper default as this will now override whatever default you put in the code for new installs.
+In general, a new configuration variable should not be present in the default
+:file:`records.config`. If it is added, such defaults should be added to the
+file ``proxy/config/records.config.default.in``. This is used to generate the
+default :file:`records.config`. Just add the variable to the file in an
+appropriate place with a proper default as this will now override whatever
+default you put in the code for new installs.
 
-------------------------------
 Handling Updates
-------------------------------
+----------------
 
-The simplest mechanism for handling updates is the ``REC_EstablishStaticConfigXXX`` family of functions. This mechanism
-will cause the value in the indicated instance to be updated in place when an update to :file:`records.config` occurs.
-This is done asynchronously using atomic operations. Use of these variables must keep that in mind.
+The simplest mechanism for handling updates is the ``REC_EstablishStaticConfigXXX``
+family of functions. This mechanism will cause the value in the indicated
+instance to be updated in place when an update to :file:`records.config` occurs.
+This is done asynchronously using atomic operations. Use of these variables must
+keep that in mind.
 
-If a variable requires additional handling when updated a callback can be registered which is called when the variable
-is updated. This is what the ``REC_EstablishStaticConfigXXX`` calls do internally with a callback that simply reads the
-new value and writes it to storage indicated by the call parameters. The functions used are the ``link_XXX`` static
-functions in |RecCore.cc|_.
+If a variable requires additional handling when updated, a callback can be
+registered which is called when the variable is updated. This is what the
+``REC_EstablishStaticConfigXXX`` calls do internally with a callback that simply
+reads the new value and writes it to storage indicated by the call parameters.
+The functions used are the ``link_XXX`` static functions in |RecCore.cc|_.
 
-To register a configuration variable callback, call ``RecRegisterConfigUpdateCb`` with the arguments
+To register a configuration variable callback, call ``RecRegisterConfigUpdateCb``
+with the arguments:
 
 ``char const*`` *name*
    The variable name.
 
 *callback*
-   A function with the signature ``<int (char const* name, RecDataT type, RecData data, void* cookie)>``. The *name*
-   value passed is the same as the *name* passed to the registration function as is the *cookie* argument. The *type* and
-   *data* are the new value for the variable. The return value is currently ignored. For future compatibility return
-   ``REC_ERR_OKAY``.
+   A function with the signature ``<int (char const* name, RecDataT type, RecData data, void* cookie)>``.
+   The :arg:`name` value passed is the same as the :arg:`name` passed to the
+   registration function as is the :arg:`cookie` argument. The :arg:`type` and
+   :arg:`data` are the new value for the variable. The return value is currently
+   ignored. For future compatibility return ``REC_ERR_OKAY``.
 
 ``void*`` *cookie*
-   A value passed to the *callback*. This is only for the callback, the internals simply store it and pass it on.
+   A value passed to the *callback*. This is only for the callback, the
+   internals simply store it and pass it on.
 
-*callback* is called under lock so it should be quick and not block. If that is necessary a continuation should be
-scheduled to handle the required action.
+*callback* is called under lock, so it should be quick and not block. If longer
+or blocking work is necessary, a :term:`continuation` should be scheduled to
+handle the required action.
 
 .. note::
-   The callback occurs asynchronously. For HTTP variables as described in the next section, this is handled by the more
-   specialized HTTP update mechanisms. Otherwise it is the implementor's responsibility to avoid race conditions.
+
+   The callback occurs asynchronously. For HTTP variables as described in the
+   next section, this is handled by the more specialized HTTP update mechanisms.
+   Otherwise it is the implementor's responsibility to avoid race conditions.
 
 .. _http-config-var-impl:
 
-------------------------
 HTTP Configuration Values
 -------------------------
 
-Variables used for HTTP processing should be declared as members of the ``HTTPConfigParams`` structure (but :ref:`see
-<overridable-config-vars>`) and use the specialized HTTP update mechanisms which handle synchronization and
-initialization issues.
-
-The configuration logic maintains two copies of the ``HTTPConfigParams`` structure - the master copy and the current
-copy. The master copy is kept in the ``m_master`` member of the ``HttpConfig`` singleton. The current copy is kept in
-the ConfigProcessor. The goal is to provide a (somewhat) atomic update for configuration variables which are loaded
-individually in to the master copy as updates are received and then bulk copied to a new instance which is then swapped
-in as the current copy. The HTTP state machine interacts with this mechanism to avoid race conditions.
-
-For each variable a mapping between the variable name and the appropriate member in the master copy should be
-established between in the ``HTTPConfig::startup`` method. The ``HttpEstablishStaticConfigXXX`` functions should be used
-unless there is an strong, explicit reason to not do so.
-
-The ``HTTPConfig::reconfigure`` method handles the current copy of the HTTP configuration variables. Logic should be
-added here to copy the value from the master copy to the current copy. Generally this will be a simple assignment. If
-there are dependencies between variables those should be enforced / checked in this method.
+Variables used for HTTP processing should be declared as members of the
+``HTTPConfigParams`` structure (but see :ref:`overridable-config-vars` for
+further details) and use the specialized HTTP update mechanisms which handle
+synchronization and initialization issues.
+
+The configuration logic maintains two copies of the ``HTTPConfigParams``
+structure, the master copy and the current copy. The master copy is kept in the
+``m_master`` member of the ``HttpConfig`` singleton. The current copy is kept in
+the ConfigProcessor. The goal is to provide a (somewhat) atomic update for
+configuration variables which are loaded individually in to the master copy as
+updates are received and then bulk copied to a new instance which is then
+swapped in as the current copy. The HTTP state machine interacts with this
+mechanism to avoid race conditions.
+
+For each variable, a mapping between the variable name and the appropriate
+member in the master copy should be established in the ``HTTPConfig::startup``
+method. The ``HttpEstablishStaticConfigXXX`` functions should be used unless
+there is a strong, explicit reason to not do so.
+
+The ``HTTPConfig::reconfigure`` method handles the current copy of the HTTP
+configuration variables. Logic should be added here to copy the value from the
+master copy to the current copy. Generally this will be a simple assignment. If
+there are dependencies between variables, those should be checked and enforced
+in this method.
 
 .. _overridable-config-vars:
 
------------------------
 Overridable Variables
------------------------
+---------------------
 
-HTTP related variables that are changeable per transaction are stored in the ``OverridableHttpConfigParams`` structure,
-an instance of which is the ``oride`` member of ``HTTPConfigParams`` and therefore the points in the previous section
-still apply. The only difference for that is the further ``.oride`` in the structure references.
+HTTP related variables that are changeable per transaction are stored in the
+``OverridableHttpConfigParams`` structure, an instance of which is the ``oride``
+member of ``HTTPConfigParams`` and therefore the points in the previous section
+still apply. The only difference is the additional ``.oride`` in the
+structure references.
 
-In addition the variable is required to be accessible from the transaction API. In addition to any custom API functions
-used to access the value, the following items are required for generic access
+The variable is required to be accessible from the transaction API. In addition
+to any custom API functions used to access the value, the following items are
+required for generic access:
 
 #. Add a value to the ``TSOverridableConfigKey`` enumeration in |ts.h.in|_.
 
-#. Augment the ``TSHttpTxnConfigFind`` function to return this enumeration value when given the name of the configuration
-   variable. Be sure to count the charaters very carefully.
+#. Augment the ``TSHttpTxnConfigFind`` function to return this enumeration value
+   when given the name of the configuration variable. Be sure to count the
+   characters very carefully.
+
+#. Augment the ``_conf_to_memberp`` function in |InkAPI.cc|_ to return a pointer
+   to the appropriate member of ``OverridableHttpConfigParams`` and set the type
+   if not a byte value.
 
-#. Augment the ``_conf_to_memberp`` function in |InkAPI.cc|_ to return a pointer to the appropriate member of
-   ``OverridableHttpConfigParams`` and set the type if not a byte value.
+#. Update the testing logic in |InkAPITest.cc|_ by adding the string name of the
+   configuration variable to the ``SDK_Overridable_Configs`` array.
 
-#. Update the testing logic in |InkAPITest.cc|_ by adding the string name of the configuration variable to the
-   ``SDK_Overridable_Configs`` array.
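The callback signature described under "Handling Updates" above can be
illustrated with a short sketch. To keep it compilable on its own, ``RecDataT``,
``RecData`` and ``REC_ERR_OKAY`` are declared here as simplified stand-ins (an
assumption; the real definitions live in the records headers), and the callback
name ``limit_update_cb`` is hypothetical.

.. code-block:: cpp

   #include <atomic>
   #include <cstdint>

   // Simplified stand-ins so the sketch is self-contained. These are
   // assumptions for illustration, not the real Traffic Server definitions.
   enum RecDataT { RECD_NULL, RECD_INT, RECD_FLOAT, RECD_STRING };
   union RecData {
     int64_t rec_int;
     float rec_float;
     char *rec_string;
   };
   const int REC_ERR_OKAY = 0;

   // Hypothetical update callback. It runs under lock, so it only stores
   // the new value atomically and returns; it never blocks. The cookie is
   // assumed to point at the std::atomic<int64_t> that holds the setting.
   int limit_update_cb(char const *name, RecDataT type, RecData data, void *cookie)
   {
     (void)name; // same name that was passed at registration time
     if (type == RECD_INT) {
       static_cast<std::atomic<int64_t> *>(cookie)->store(data.rec_int);
     }
     return REC_ERR_OKAY; // currently ignored, returned for future compatibility
   }

Following the argument list given above, such a callback would be registered by
passing the variable name, ``limit_update_cb`` and a pointer to the atomic as
the cookie to ``RecRegisterConfigUpdateCb``. Readers of the value then load it
from the atomic, which is one way to meet the requirement that the implementor
avoid race conditions.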

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/hacking/index.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/hacking/index.en.rst b/doc/arch/hacking/index.en.rst
index 0a5b52a..2fbb17c 100644
--- a/doc/arch/hacking/index.en.rst
+++ b/doc/arch/hacking/index.en.rst
@@ -3,26 +3,27 @@ Hacking
 
 .. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
 
 Introduction
 ------------
 
-This is a documentation stub on how to hack Apache Traffic Server. Here we try to document things such as how to write
-and run unit or regression tests or how to inspect the state of the core with a debugger.
+This is a documentation stub on how to hack Apache Traffic Server. Here we try
+to document things such as how to write and run unit or regression tests or how
+to inspect the state of the core with a debugger.
 
 .. toctree::
    :maxdepth: 2

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/hacking/release-process.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/hacking/release-process.en.rst b/doc/arch/hacking/release-process.en.rst
index b74c75c..5ef784e 100644
--- a/doc/arch/hacking/release-process.en.rst
+++ b/doc/arch/hacking/release-process.en.rst
@@ -16,77 +16,94 @@
    under the License.
 
 
-==============================
 Traffic Server Release Process
 ==============================
 
-Managing release is easiest in an environment that is as clean as possible. For this reason cloning the code base in to a new directory for the release process is recommended.
+Managing a release is easiest in an environment that is as clean as possible.
+For this reason, cloning the code base in to a new directory for the release
+process is recommended.
 
-------------
 Requirements
 ------------
 
 * A system for git and building.
-* A cryptographic key that has been signed by at least two other PMC members. This should be preferentially associated with your ``apache.org`` email address but that is not required.
+
+* A cryptographic key that has been signed by at least two other PMC members.
+  This should be preferentially associated with your ``apache.org`` email
+  address but that is not required.
 
 .. _release-management-release-candidate:
 
------------------
 Release Candidate
 -----------------
 
-The first step in a release is making a release candidate. This is distributed to the community for validation before the actual release.
+The first step in a release is making a release candidate. This is distributed
+to the community for validation before the actual release.
 
 Document
 --------
 
-Gather up information about the changes for the release. The ``CHANGES`` file is a good starting point. You may also
-want to check the commits since the last release. The primary purpose of this is to generate a list of the important
+Gather up information about the changes for the release. The ``CHANGES`` file
+is a good starting point. You may also want to check the commits since the last
+release. The primary purpose of this is to generate a list of the important
 changes since the last release.
 
-Create or update a page on the Wiki for the release. If it is a major or minor release it should have its own page. Use
-the previous release page as a template. Point releases should get a section at the end of the corresponding release
-page.
+Create or update a page on the Wiki for the release. If it is a major or minor
+release it should have its own page. Use the previous release page as a
+template. Point releases should get a section at the end of the corresponding
+release page.
 
-Write an announcement for the release. This will contain much of the same information that is on the Wiki page but more
-concisely. Check the `mailing list archives <http://mail-archives.apache.org/mod_mbox/trafficserver-dev/>`_ for examples to use as a base.
+Write an announcement for the release. This will contain much of the same
+information that is on the Wiki page but more concisely. Check the
+`mailing list archives <http://mail-archives.apache.org/mod_mbox/trafficserver-dev/>`_
+for examples to use as a base.
 
 Build
 -----
 
-Go to the top level source directory.
+#. Go to the top level source directory.
+
+#. Check the version in ``configure.ac``. There are two values near the top that
+   need to be set, ``TS_VERSION_S`` and ``TS_VERSION_N``. These are the release
+   version number in different encodings.
 
-* Check the version in ``configure.ac``. There are two values near the top that need to be set, ``TS_VERSION_S`` and
-  ``TS_VERSION_N``. These are the release version number in different encodings.
-* Check the variable ``RC`` in the top level ``Makefile.am``. This should be the point release value. This needs to be changed for every release candidate. The first release candidate is 0 (zero).
+#. Check the variable ``RC`` in the top level ``Makefile.am``. This should be
+   the point release value. This needs to be changed for every release
+   candidate. The first release candidate is ``0`` (zero).
 
-Execute the following commands to make the distribution files. ::
+#. Execute the following commands to make the distribution files. ::
 
-   autoreconf -i
-   ./configure
-   make rel-candidate
+      autoreconf -i
+      ./configure
+      make rel-candidate
 
-This will create the distribution files and sign them using your key. Expect to be prompted twice for your passphrase
-unless you use an ssh key agent. If you have multiple keys you will need to set the default appropriately beforehand, as
-no option will be provided to select the signing key. The files should have names that start
-with ``trafficserver-X.Y.Z-rcA.tar.bz2`` where ``X.Y.Z`` is the version and ``A`` is the release candidate counter. There
-should be four such files, one with no extension and three others with the extensions ``asc``, ``md5``, and ``sha1``. This will also create a signed git tag of the form ``X.Y.Z-rcA``.
+These steps will create the distribution files and sign them using your key.
+Expect to be prompted twice for your passphrase unless you use an ssh key agent.
+If you have multiple keys you will need to set the default appropriately
+beforehand, as no option will be provided to select the signing key. The files
+should have names that start with ``trafficserver-X.Y.Z-rcA.tar.bz2`` where
+``X.Y.Z`` is the version and ``A`` is the release candidate counter. There
+should be four such files, one with no extension and three others with the
+extensions ``asc``, ``md5``, and ``sha1``. This will also create a signed git
+tag of the form ``X.Y.Z-rcA``.
 
 Distribute
 ----------
 
-The release candidate files should be uploaded to some public storage. Your personal storage on ``people.apach.org`` is
-a reasonable location to use.
+The release candidate files should be uploaded to some public storage. Your
+personal storage on *people.apache.org* is a reasonable location to use.
 
-Send the release candiate announcement to the ``users`` and ``dev`` mailinging lists, noting that it is a release
-*candidate* and providing a link to the distribution files you uploaded. This announcement should also call for a vote
+Send the release candidate announcement to the *users* and *dev* mailing
+lists, noting that it is a release *candidate* and providing a link to the
+distribution files you uploaded. This announcement should also call for a vote
 on the candidate, generally with a 72 hours time limit.
 
-If the voting was successful (at least three "+1" votes and no "-1" votes) proceed to :ref:`release-management-official-release`. Otherwise repeat the :ref:`release-management-release-candidate` process.
+If the voting was successful (at least three "+1" votes and no "-1" votes),
+proceed to :ref:`release-management-official-release`. Otherwise, repeat the
+:ref:`release-management-release-candidate` process.
 
 .. _release-management-official-release:
 
-----------------
 Official Release
 ----------------
 
@@ -94,30 +111,36 @@ Build the distribution files with the command ::
 
    make release
 
-Be sure to not have changed anything since the release candidate was built so the checksums are identical. This will
-create a signed git tag of the form ``X.Y.Z`` and produce the distribution files. Push the tag to the ASF repository with
-the command ::
+Be sure to not have changed anything since the release candidate was built so
+the checksums are identical. This will create a signed git tag of the form
+``X.Y.Z`` and produce the distribution files. Push the tag to the ASF repository
+with the command ::
 
    git push origin X.Y.Z
 
-This presumes ``origin`` is the name for the ASF remote repository which is correct if you originally clone from the ASF
-repository.
+This presumes ``origin`` is the name for the ASF remote repository which is
+correct if you originally cloned from the ASF repository.
 
-The distribution files must be added to an SVN repository. This can be accessed with the command::
+The distribution files must be added to an SVN repository. This can be accessed
+with the command::
 
    svn co https://dist.apache.org/repos/dist/release/trafficserver <local-directory>
 
-All four of the distribution files go here. If you are making a point release then you should also remove the distribution
-files for the previous release. Allow 24 hours for the files to be distributed through the ASF infrastructure.
+All four of the distribution files go here. If you are making a point release
+then you should also remove the distribution files for the previous release.
+Allow 24 hours for the files to be distributed through the ASF infrastructure.
 
-The Traffic Server website must be updated. This is an SVN repository which you can access with ::
+The Traffic Server website must be updated. This is an SVN repository which you
+can access with ::
 
    svn co https://svn.apache.org/repos/asf/trafficserver/site/trunk <local-directory>
 
 The files of interest are in the ``content`` directory.
 
 ``index.html``
-   This is the front page. The places to edit here are any security announcements at the top and the "News" section.
+   This is the front page. The places to edit here are any security
+   announcements at the top and the "News" section.
+
 ``downloads.en.mdtext``
    Update the downloads page to point to the new download objects.
 
@@ -125,15 +148,24 @@ After making changes, commit them and then run ::
 
    publish.pl trafficserver <apache-id>
 
-on the ``people.apache.org`` host.
+on the ``people.apache.org`` host.
+
+If needed, update the Wiki page for the release to point at the release
+distribution files.
+
+Update the announcement, if needed, to refer to the release distribution files
+and remove the comments concerning the release candidate. This announcement
+should be sent to the *users* and *dev* mailing lists. It should also be sent
+to the ASF announcement list, which must be done using an ``apache.org`` email
+address.
+
+Finally, update various files after the release:
+
+* The ``STATUS`` file for master and for the release branch to include this version.
 
-If needed, update the Wiki page for the release to point at the release distribution files.
+* The ``CHANGES`` file to have a header for the next version.
 
-Update the announcement if needed to refer to the release distribution files and remove the comments concerning the release candidate. This announcement should be sent to the ``users`` and ``dev`` mailing lists. It should also be sent to the ASF announcement list, which must be done using an ``apache.org`` email address.
+* ``configure.ac`` to be set to the next version.
 
-Finally, update various files after the release.
+* In the top level ``Makefile.am`` change ``RC`` to have the value ``0``.
 
-   * The ``STATUS`` file for master and for the release branch to include this version.
-   * The ``CHANGES`` file to have a header for the next version.
-   * ``configure.ac`` to be set to the next version.
-   * In the top level ``Makefile.am`` change ``RC`` to have the value ``0``.
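
The release-process text above says a release candidate build should leave
four files: the tarball plus ``asc``, ``md5``, and ``sha1`` companions. A
minimal Python sketch of a pre-upload sanity check follows. The helper names
are hypothetical, and it assumes each checksum file begins with the hex digest
(the layout ``md5sum`` and ``sha1sum`` write); the actual ``make rel-candidate``
output format may differ. It does not check the ``asc`` signature.

```python
import hashlib
from pathlib import Path


def candidate_files(version: str, rc: int) -> list[str]:
    """Return the four file names expected for release candidate `rc`."""
    base = f"trafficserver-{version}-rc{rc}.tar.bz2"
    return [base] + [f"{base}.{ext}" for ext in ("asc", "md5", "sha1")]


def digest_matches(tarball: Path, algorithm: str) -> bool:
    """Compare the tarball's digest with the one recorded next to it.

    Assumption: the checksum file's first whitespace-separated token
    is the hex digest.
    """
    recorded = Path(f"{tarball}.{algorithm}").read_text().split()[0].lower()
    actual = hashlib.new(algorithm, tarball.read_bytes()).hexdigest()
    return recorded == actual
```

Calling ``digest_matches(path, "md5")`` and ``digest_matches(path, "sha1")``
on the tarball before uploading, and again on the ``make release`` output,
is one way to confirm that nothing changed between candidate and release.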

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/index.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/index.en.rst b/doc/arch/index.en.rst
index 6cc9fac..2ed51dc 100644
--- a/doc/arch/index.en.rst
+++ b/doc/arch/index.en.rst
@@ -3,29 +3,32 @@ Architecture and Hacking
 
 .. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
+   Unless required by applicable law or agreed to in writing,
+   software distributed under the License is distributed on an
+   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+   KIND, either express or implied.  See the License for the
+   specific language governing permissions and limitations
+   under the License.
 
 Introduction
---------------
+------------
 
-The original architectural documents for Traffic Server were lost in the transition to an open source project. The
-documents in this section are provisional and were written based on the existing code. The purpose is to have a high
-level description of aspects of Traffic Server to better inform ongoing work.
+The original architectural documents for Traffic Server were lost in the
+transition to an open source project. The documents in this section are
+provisional and were written based on the existing code. The purpose is to have
+a high level description of aspects of Traffic Server to better inform ongoing
+work.
 
-In the final section on "hacking" we try to document our approaches to understanding and modifying the source.
+In the final section on "hacking" we try to document our approaches to
+understanding and modifying the source.
 
 Contents:
 

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/glossary.en.rst
----------------------------------------------------------------------
diff --git a/doc/glossary.en.rst b/doc/glossary.en.rst
index 94996db..758523e 100644
--- a/doc/glossary.en.rst
+++ b/doc/glossary.en.rst
@@ -99,3 +99,25 @@ Glossary
       The unit of storage in the cache. All reads from the cache always read exactly one fragment. Fragments may be
       written in groups, but every write is always an integral number of fragments. Each fragment has a corresponding
       :term:`directory entry` which describes its location in the cache storage.
+
+   object store
+      The database of :term:`cache objects <cache object>`.
+
+   fresh
+      The state of a :term:`cache object` which can be served directly from the
+      cache in response to client requests. Fresh objects have not met or
+      passed their :term:`origin server` defined expiration time, nor have they
+      reached the algorithmically determined :term:`stale` age.
+
+   stale
+      The state of a :term:`cache object` which is not yet expired, but has
+      reached an algorithmically determined age at which the :term:`origin server`
+      will be contacted to :term:`revalidate <revalidation>` the freshness of
+      the object. Contrast with :term:`fresh`.
+
+   origin server
+      An HTTP server which provides the original source of content being cached
+      by Traffic Server.
+
+   cache partition
+
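
The new glossary entries for "fresh" and "stale" imply three states for a
cache object. The Python sketch below restates those definitions as a
function. The function and parameter names are hypothetical, all values are
ages in seconds, and how Traffic Server actually computes the stale age is
not covered here.

```python
def freshness_state(age: float, stale_age: float, expires_at_age: float) -> str:
    """Classify a cache object by the glossary definitions.

    age            -- current age of the object
    stale_age      -- algorithmically determined age at which to revalidate
    expires_at_age -- age at which the origin-defined expiration is reached
    """
    if age >= expires_at_age:
        return "expired"   # met or passed the origin server's expiration
    if age >= stale_age:
        return "stale"     # not expired, but revalidation is due
    return "fresh"         # can be served directly from the cache
```

Only the "fresh" result corresponds to serving directly from the cache;
"stale" corresponds to contacting the origin server to revalidate.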

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/reference/configuration/records.config.en.rst
----------------------------------------------------------------------
diff --git a/doc/reference/configuration/records.config.en.rst b/doc/reference/configuration/records.config.en.rst
index 6397461..72bb3d3 100644
--- a/doc/reference/configuration/records.config.en.rst
+++ b/doc/reference/configuration/records.config.en.rst
@@ -1334,9 +1334,9 @@ Cache Control
 
 .. ts:cv:: CONFIG proxy.config.cache.target_fragment_size INT 1048576
 
-   Sets the target size of a contiguous fragment of a file in the disk cache. Accepts values that are powers of 2, e.g. 65536, 131072,
-   262144, 524288, 1048576, 2097152, etc. When setting this, consider that larger numbers could waste memory on slow connections,
-   but smaller numbers could increase (waste) seeks.
+   Sets the target size of a contiguous fragment of a file in the disk cache.
+   When setting this, consider that larger numbers could waste memory on slow
+   connections, but smaller numbers could increase (waste) seeks.
 
 RAM Cache
 =========


[2/3] trafficserver git commit: docs: update focused on architecture documentation

Posted by jp...@apache.org.
http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/cache-arch.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/cache-arch.en.rst b/doc/arch/cache/cache-arch.en.rst
index 78764d7..fecd0fb 100644
--- a/doc/arch/cache/cache-arch.en.rst
+++ b/doc/arch/cache/cache-arch.en.rst
@@ -23,81 +23,99 @@ Cache Architecture
 Introduction
 ~~~~~~~~~~~~
 
-In addition to an HTTP proxy, |ATS| is also an HTTP cache. |TS| can cache any octet stream although it currently
-supports only those octet streams delivered by the HTTP protocol. When such a stream is cached (along with the HTTP
-protocol headers) it is termed an :term:`object <cache object>` in the cache. Each object is identified by a globally
-unique value called a :term:`cache key`.
-
-The purpose of this document is to describe the basic structure and implementation details of the |TS| cache.
-Configuration of the cache will be discussed only to the extent needed to understand the internal mechanisms. This
-document will be useful primarily to |TS| developers working on the |TS| codebase or plugins for |TS|. It is assumed the
-reader is already familiar with the :ref:`admin-guide` and specifically with :ref:`http-proxy-caching` and
-:ref:`configuring-the-cache` along with the associated configuration files and values.
-
-Unfortunately the internal terminology is not particularly consistent so this document will frequently use terms in
-different ways than they are used in the code in an attempt to create some consistency.
+In addition to being an HTTP proxy, |ATS| is also an HTTP cache. |TS| can cache
+any octet stream, although it currently supports only those octet streams
+delivered by the HTTP protocol. When such a stream is cached (along with the
+HTTP protocol headers) it is termed an :term:`object <cache object>` in the
+cache. Each object is identified by a globally unique value called a
+:term:`cache key`.
+
+The purpose of this document is to describe the basic structure and
+implementation details of the |TS| cache. Configuration of the cache will be
+discussed only to the extent needed to understand the internal mechanisms. This
+document will be useful primarily to |TS| developers working on the |TS|
+codebase or plugins for |TS|. It is assumed the reader is already familiar with
+the :ref:`admin-guide` and specifically with :ref:`http-proxy-caching` and
+:ref:`configuring-the-cache` along with the associated configuration files and
+values.
+
+Unfortunately, the internal terminology is not particularly consistent, so this
+document will frequently use terms in different ways than they are used in the
+code in an attempt to create some consistency.
 
 Cache Layout
 ~~~~~~~~~~~~
 
-The following sections describe how persistent cache data is structured. |TS| treats its persisent storage an
-undifferentiated collection of bytes, assuming no other structure to it. In particular it does not use the file system
-of the host operating system. If a file is used it is used only to mark out the set of bytes to be used.
+The following sections describe how persistent cache data is structured. |TS|
+treats its persistent storage as an undifferentiated collection of bytes,
+assuming no other structure to it. In particular, it does not use the file
+system of the host operating system. If a file is used it is used only to mark
+out the set of bytes to be used.
 
 Cache storage
 =============
 
-The raw storage for the |TS| cache is configured in :file:`storage.config`. Each line in the file defines a :term:`cache
-span` which is treated as a uniform persistent store.
+The raw storage for the |TS| cache is configured in :file:`storage.config`. Each
+line in the file defines a :term:`cache span` which is treated as a uniform
+persistent store.
 
 .. figure:: images/cache-spans.png
    :align: center
 
    Two cache spans
 
-This storage organized in to a set of :term:`cache volume`\ s which are defined in :file:`volume.config` for the
-purposes of the administrator. These are the units that used for all other administator level configuration.
+This storage is organized into a set of :term:`cache volumes <cache volume>`
+which are defined in :file:`volume.config`. These are the units that are used
+for all other administrator level configuration.
 
-Cache volumes can be defined by a percentage of the total storage or an absolute amount of storage. By default each
-cache volume is spread across all of the cache spans for robustness. The intersection of a cache volume and a cache span
-is a :term:`cache stripe`. Each cache span is divided in to cache stripes and each cache volume is a collection of those
-stripes.
+Cache volumes can be defined by a percentage of the total storage or as an
+absolute amount of storage. By default, each cache volume is spread across all
+of the cache spans for robustness. The intersection of a cache volume and a
+cache span is a :term:`cache stripe`. Each cache span is divided into cache
+stripes and each cache volume is a collection of those stripes.
 
-If the cache volumes for the example cache spans were defined as
+If the cache volumes for the example cache spans were defined as:
 
 .. image:: images/ats-cache-volume-definition.png
    :align: center
 
-then the actual layout would look like
+Then the actual layout would look like:
 
 .. image:: images/cache-span-layout.png
    :align: center
 
-Cache stripes are the fundamental unit of cache for the implementation. A cached object is stored entirely in a single
-stripe, and therefore in a single cache span - objects are never split across cache spans or volumes. Objects are
-assigned to a stripe (and hence to a cache volume) automatically based on a hash of the URI used to retrieve the object
-from the origin server. It is possible to configure this to a limited extent in :file:`hosting.config` which supports
-content from specific host or domain to be stored on specific cache volumes. In addition, as of version 4.0.1 it is
-possible to control which cache spans (and hence, which cache stripes) are contained in a specific cache volume.
-
-The layout and structure of the cache spans, the cache volumes, and the cache stripes that compose them are derived
-entirely from the :file:`storage.config` and :file:`cache.config` and is recomputed from scratch when the
-:program:`traffic_server` is started. Therefore any change to those files can (and almost always will) invalidate the
-existing cache in its entirety.
+Cache stripes are the fundamental unit of cache for the implementation. A
+cached object is stored entirely in a single stripe, and therefore in a single
+cache span. Objects are never split across cache spans or volumes. Objects are
+assigned to a stripe (and in turn to a cache volume) automatically based on a
+hash of the URI used to retrieve the object from the :term:`origin server`. It
+is possible to configure this to a limited extent in :file:`hosting.config`,
+which allows content from specific hosts or domains to be stored on specific
+cache volumes. As of version 4.0.1 it is also possible to control which cache
+spans (and hence, which cache stripes) are contained in a specific cache volume.
+
+The layout and structure of the cache spans, the cache volumes, and the cache
+stripes that compose them are derived entirely from :file:`storage.config` and
+:file:`cache.config` and are recomputed from scratch when the
+:program:`traffic_server` is started. Therefore, any change to those files can
+(and almost always will) invalidate the existing cache in its entirety.
 
 Stripe Structure
 ================
 
-|TS| treats the storage associated with a cache stripe as an undifferentiated span of bytes. Internally each stripe is
-treated almost entirely independently. The data structures described in this section are duplicated for each stripe.
-Internally the term "volume" is used for these stripes and implemented primarily in :cpp:class:`Vol`. What a user thinks
-of as a volume (what this document calls a "cache volume") is represented by :cpp:class:`CacheVol`.
+|TS| treats the storage associated with a cache stripe as an undifferentiated
+span of bytes. Internally each stripe is treated almost entirely independently.
+The data structures described in this section are duplicated for each stripe.
+Internally the term *volume* is used for these stripes and implemented primarily
+in :cpp:class:`Vol`. What a user thinks of as a volume (and what this document
+calls a *cache volume*) is represented by :cpp:class:`CacheVol`.
 
 .. note::
 
-   Stripe assignment must be done before working with an object because the directory is local to the stripe. Any cached
-   objects for which the stripe assignment is changed are effectively lost as their directory data will not be found in
-   the new stripe.
+   Stripe assignment must be done before working with an object because the
+   directory is local to the stripe. Any cached objects for which the stripe
+   assignment is changed are effectively lost as their directory data will not
+   be found in the new stripe.
 
 .. index:: cache directory
 .. _cache-directory:
@@ -111,102 +129,129 @@ Cache Directory
 
 .. _fragment:
 
-Content in a stripe is tracked via a directory. We call each element of the directory a "directory entry" and each is
-represented by :cpp:class:`Dir`. Each entry refers to a chunk of contiguous storage in the cache. These are referred to
-variously as "fragments", "segments", "docs" / "documents", and a few other things. This document will use the term
-"fragment" as that is the most common reference in the code. The term "Doc" (for :cpp:class:`Doc`) will be used to refer
-to the header data for a fragment. Overall the directory is treated as a hash with the :term:`cache ID` as the key. See
-:ref:`directory probing <cache-directory-probe>` for how the cache ID is used to locate a directory entry. The cache ID
-is in turn computed from a :term:`cache key` which by default is the URL of the content.
-
-The directory is used as a memory resident structure which means a directory entry is as small as possible (currently 10
-bytes). This forces some compromises on the data that can be stored there. On the other hand this means that most cache
-misses do not require disk I/O which has a large performance benefit.
-
-An additional point is the directory is always fully sized. Once a stripe is initialized the directory size is
-fixed and never changed. This size is related (roughly linearly) to the size of the stripe. It is for this reason the
-memory footprint of |TS| depends strongly on the size of the disk cache. Because the directory size does not change,
-neither does this memory requirement so |TS| does not consume more memory as more content is stored in the cache. If
-there is enough memory to run |TS| with an empty cache there is enough to run it with a full cache.
+Content in a stripe is tracked via a directory. Each element of the directory
+is a :term:`directory entry` and is represented by :cpp:class:`Dir`. Each entry
+refers to a chunk of contiguous storage in the cache. These are referred to
+variously as *fragments*, *segments*, *docs*, *documents*, and a few other
+things. This document will use the term *fragment* as that is the most common
+reference in the code. The term *Doc* (for :cpp:class:`Doc`) will be used to
+refer to the header data for a fragment. Overall, the directory is treated as a
+hash with the :term:`cache ID` as the key. See
+:ref:`directory probing <cache-directory-probe>` for how the cache ID is used
+to locate a directory entry. The cache ID is in turn computed from a
+:term:`cache key` which by default is the URL of the content.
+
+The directory is used as a memory resident structure, which means a directory
+entry is as small as possible (currently 10 bytes). This forces some
+compromises on the data that can be stored there. On the other hand this means
+that most cache misses do not require disk I/O, which has a large performance
+benefit.
+
+The directory is always fully sized; once a stripe is initialized the directory
+size is fixed and never changed. This size is related (roughly linearly) to the
+size of the stripe. It is for this reason the memory footprint of |TS| depends
+strongly on the size of the disk cache. Because the directory size does not
+change, neither does this memory requirement, so |TS| does not consume more
+memory as more content is stored in the cache. If there is enough memory to run
+|TS| with an empty cache there is enough to run it with a full cache.
 
 .. figure:: images/cache-directory-structure.png
    :align: center
 
-Each entry stores an offset in the stripe and a size. The size stored in the directory entry is an :ref:`approximate
-size <dir-size>` which is at least as big as the actual data in the fragment. Exact size data is stored in the fragment
-header on disk.
+Each entry stores an offset in the stripe and a size. The size stored in the
+directory entry is an :ref:`approximate size <dir-size>` which is at least as
+big as the actual data in the fragment. Exact size data is stored in the
+fragment header on disk.
 
 .. note::
 
-   Data in HTTP headers cannot be examined without disk I/O. This includes the original URL for the object. The cache
-   key is not stored explicitly and therefore cannot be reliably retrieved.
+   Data in HTTP headers cannot be examined without disk I/O. This includes the
+   original URL for the object. The cache key is not stored explicitly and
+   therefore cannot be reliably retrieved.
 
 The directory is a hash table that uses `chaining
-<http://en.wikibooks.org/wiki/Data_Structures/Hash_Tables#Collision_resolution>`_ for collision resolution. Because each
-entry is small they are used directly as the list header of the hash bucket.
+<http://en.wikibooks.org/wiki/Data_Structures/Hash_Tables#Collision_resolution>`_
+for collision resolution. Because each entry is small they are used directly as
+the list header of the hash bucket.
 
 .. _dir-segment:
 .. _dir-bucket:
 
-Chaining is implemented by imposing grouping structures on the entries in a directory. The first level grouping is a
-:term:`directory bucket`. This is a fixed number (currently 4 - defined as ``DIR_DEPTH``) of entries. This
-serves to define the basic hash buckets with the first entry in each cache bucket serving as the root of the hash
-bucket.
+Chaining is implemented by imposing grouping structures on the entries in a
+directory. The first level grouping is a :term:`directory bucket`. This is a
+fixed number (currently 4, defined as ``DIR_DEPTH``) of entries. This serves to
+define the basic hash buckets with the first entry in each directory bucket serving
+as the root of the hash bucket.
 
 .. note::
 
-   The term "bucket" is used in the code to mean both the conceptual bucket for hashing and for a structural grouping
-   mechanism in the directory and so these will be qualified as needed to distinguish them. The unqualified term
-   "bucket" is almost always used to mean the structural grouping in the directory.
+   The term *bucket* is used in the code to mean both the conceptual bucket for
+   hashing and for a structural grouping mechanism in the directory. These will
+   be qualified as needed to distinguish them. The unqualified term *bucket* is
+   almost always used to mean the structural grouping in the directory.
 
-Directory buckets are grouped in to :term:`segments <directory segment>`. All segments in a stripe have the same number of
-buckets. The number of segments in a stripe is chosen so that each segment has as many buckets as possible without
-exceeding 65535 (2\ :sup:`16`\ -1) entries in a segment.
+Directory buckets are grouped into :term:`segments <directory segment>`. All
+segments in a stripe have the same number of buckets. The number of segments in
+a stripe is chosen so that each segment has as many buckets as possible without
+exceeding 65,535 (2\ :sup:`16`\ -1) entries in a segment.
 
 .. figure:: images/dir-segment-bucket.png
    :align: center
 
-Each directory entry has a previous and next index value which is used to link entries in the same segment. Because no
-segment has more than 65535 entries 16 bits suffices for storing the index values. The stripe header contains an array
-of entry indices which are used as the roots of entry free lists, one for each segment. Active entries are stored via
-the bucket structure. When a stripe is initialized the first entry in each bucket is zeroed (marked unused) and all
-other entries are put in the corresponding segment free list in the stripe header. This means the first entry of each
-directory bucket is used as the root of a hash bucket and is therefore marked unused rather than being put a free list.
-The other entries in the directory bucket are preferentially preferred for adding to the corresponding hash bucket but
-this is not required. The segment free lists are initialized such that the extra bucket entries are added in order - all
-the seconds, then the thirds, then the fourths. Because the free lists are FIFOs this means extra entries will be
-selected from the fourth entries across all the buckets first, then the thirds, etc. When allocating a new directory
-entry in a bucket the entries are searched from first to last, which maximizes bucket locality (that is, cache IDs that
-map to the same hash bucket will also tend to use the same directory bucket).
+Each directory entry has previous and next index values which are used to link
+entries in the same segment. Because no segment has more than 65,535 entries,
+16 bits suffice for storing the index values. The stripe header contains an
+array of entry indices which are used as the roots of entry free lists, one for
+each segment. Active entries are stored via the bucket structure. When a stripe
+is initialized the first entry in each bucket is zeroed (marked unused) and all
+other entries are put in the corresponding segment free list in the stripe
+header. This means the first entry of each :term:`directory bucket` is used as
+the root of a hash bucket and is therefore marked unused rather than being put
+on a free list. The other entries in the directory bucket are preferred for adding
+to the corresponding hash bucket but this is not required. The segment free
+lists are initialized such that the extra bucket entries are added in order:
+all the seconds, then the thirds, then the fourths. Because the free lists are
+last in, first out (LIFO), extra entries will be selected from the fourth
+entries across all the buckets first, then the thirds, etc. When allocating a
+new directory entry in a bucket, the entries are searched from first to last,
+which maximizes bucket locality (that is, :term:`cache IDs <cache ID>` that map
+to the same hash bucket will also tend to use the same directory bucket).
 
 .. figure:: images/dir-bucket-assign.png
    :align: center
 
-Entries are removed from the free list when used and returned when no longer in use. When a fragment needs to be put in
-to the directory the cache ID is used to locate a hash bucket (which also determines the segment and directory bucket).
-If the first entry in the directory bucket is marked unused, it is used. If not then the other entries in the bucket are
-searched and if any are on the free list, that entry is used. If none are available then the first entry on the segment
-free list is used. This entry is attached to the hash bucket via the same next and previous indices used for the free
-list so that it can be found when doing a lookup of a cache ID.
+Entries are removed from the free list when used and returned when no longer in
+use. When a :term:`fragment <cache fragment>` needs to be put into the
+directory, the cache ID is used to locate a hash bucket (which also determines
+the segment and directory bucket). If the first entry in the directory bucket
+is marked unused, it is used. Otherwise, the other entries in the bucket are
+searched and if any are on the free list, that entry is used. If none are
+available then the first entry on the segment free list is used. This entry is
+attached to the hash bucket via the same next and previous indices used for the
+free list so that it can be found when doing a lookup of a cache ID.
 
 Storage Layout
 --------------
 
-The storage layout is the stripe metadata followed by cached content. The metadata consists of three parts - the stripe
-header, the directory, and the stripe footer. The metadata is stored twice. The header and the footer are instances of
-:cpp:class:`VolHeaderFooter`. This is a stub structure which can have a trailing variable sized array. This array is
-used as the segment free list roots in the directory. Each contains the segment index of the first element of the free
-list for the segment. The footer is a copy of the header without the segment free lists. This makes the size of the
-header dependent on the directory but not that of the footer.
+The storage layout is the stripe metadata followed by cached content. The
+metadata consists of three parts: the stripe header, the directory, and the
+stripe footer. The metadata is stored twice. The header and the footer are
+instances of :cpp:class:`VolHeaderFooter`. This is a stub structure which can
+have a trailing variable-sized array. This array is used as the segment free
+list roots in the directory. Each contains the segment index of the first
+element of the free list for the segment. The footer is a copy of the header
+without the segment free lists. This makes the size of the header dependent on
+the directory but not that of the footer.
 
 .. figure:: images/cache-stripe-layout.png
    :align: center
 
-Each stripe has several values that describe its basic layout.
+Each stripe has several values that describe its basic layout:
 
 skip
-   The start of stripe data. This represents either space reserved at the start of a physical device to avoid problems
-   with the host operating system, or an offset representing use of space in the cache span by other stripes.
+   The start of stripe data. This represents either space reserved at the start
+   of a physical device to avoid problems with the host operating system, or an
+   offset representing use of space in the cache span by other stripes.
 
 start
    The offset for the start of the content, after the stripe metadata.
@@ -215,25 +260,35 @@ length
    Total number of bytes in the stripe. :cpp:member:`Vol::len`.
 
 data length
-   Total number of blocks in the stripe available for content storage. :cpp:member:`Vol::data_blocks`.
+   Total number of blocks in the stripe available for content storage.
+   :cpp:member:`Vol::data_blocks`.
+
+.. note::
 
-.. note:: Great care must be taken with sizes and lengths in the cache code because there are at least three different metrics (bytes, cache blocks, store blocks) used in various places.
+   Great care must be taken with sizes and lengths in the cache code because
+   there are at least three different metrics (bytes, cache blocks, store
+   blocks) used in various places.
 
-The total size of the directory (the number of entries) is computed by taking the size of the stripe and dividing by the
-average object size. The directory always consumes this amount of memory which has the effect that if cache size is
-increased so is the memory requirement for |TS|. The average object size defaults to 8000 bytes but can be configured
-using :ts:cv:`proxy.config.cache.min_average_object_size`. Increasing the average object size will reduce the memory
-footprint of the directory at the expense of reducing the number of distinct objects that can be stored in the cache
-[#]_.
+The total size of the directory (the number of :term:`entries <directory entry>`)
+is computed by taking the size of the :term:`cache stripe` and dividing by the
+average object size. The directory always consumes this amount of memory which
+has the effect that if cache size is increased so is the memory requirement for
+|TS|. The average object size defaults to 8000 bytes but can be configured using
+:ts:cv:`proxy.config.cache.min_average_object_size`. Increasing the average
+object size will reduce the memory footprint of the directory at the expense of
+reducing the number of distinct objects that can be stored in the cache.
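+
+As a rough illustration of this trade-off (a back-of-the-envelope sketch, not
+the exact computation, which also rounds to whole buckets and segments),
+consider a hypothetical 1 TB stripe and assume each directory entry occupies
+10 bytes::
+
+   entries   = 1,000,000,000,000 / 8,000 = 125,000,000
+   directory = 125,000,000 * 10 bytes    = 1,250,000,000 bytes (about 1.25 GB)
+
+Doubling the average object size in :file:`records.config` would halve both
+figures. The value below is an example, not a recommendation::
+
+   CONFIG proxy.config.cache.min_average_object_size INT 16000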
 
 .. index:: write cursor
 .. _write-cursor:
 
-The content area stores the actual objects and is used as a circular buffer where new objects overwrite the least
-recently cached objects. The location in a stripe where new cache data is written is called the *write cursor*. This
-means that objects can be de facto evicted from cache even if they have not expired if the data is overwritten by the
-write cursor. If an object is overwritten this is not detected at that time and the directory is not updated. Instead it
-will be noted if the object is accessed in the future and the disk read of the fragment fails.
+The content area stores the actual objects and is used as a circular buffer
+where new objects overwrite the least recently cached objects. The location in
+a stripe where new cache data is written is called the *write cursor*. This
+means that objects can be de facto evicted from cache even if they have not
+expired if the data is overwritten by the write cursor. If an object is
+overwritten this is not detected at that time and the directory is not updated.
+Instead it will be noted if the object is accessed in the future and the disk
+read of the fragment fails.
 
 .. figure:: images/ats-cache-write-cursor.png
    :align: center
@@ -242,91 +297,116 @@ will be noted if the object is accessed in the future and the disk read of the f
 
 .. note:: Cache data on disk is never updated.
 
-This is a key thing to keep in mind. What appear to be updates (such as doing a refresh on stale content and getting
-back a 304) are actually new copies of data being written at the write cursor. The originals are left as "dead" space
-which will be consumed when the write cursor arrives at that disk location. Once the stripe directory is updated (in
-memory!) the original fragment in the cache is effectively destroyed. This is the general space management techinque
-used in other cases as well. If an object needs to removed from cache, only the directory needs to be changed. No other
-work (and *particularly* no disk I/O) needs to be done.
+This is a key thing to keep in mind. What appear to be updates (such as doing a
+refresh on :term:`stale` content and getting back a 304) are actually new
+copies of data being written at the write cursor. The originals are left as
+"dead" space which will be consumed when the write cursor arrives at that disk
+location. Once the stripe directory is updated (in memory) the original
+fragment in the cache is effectively destroyed. This is the general space
+management technique used in other cases as well. If an object needs to be removed
+from cache, only the directory needs to be changed. No other work (and
+particularly, no disk I/O) needs to be done.
 
 Object Structure
 ================
 
-Objects are stored as two types of data, metadata and content data. Metadata is all the data about the object and the
-content and includes the HTTP headers. The content data is the content of the object, the octet stream delivered to the
-client as the object.
-
-Objects are rooted in a :cpp:class:`Doc` structure stored in the cache. :cpp:class:`Doc` serves as the header data for a
-fragment and is contained at the start of every fragment. The first fragment for an object is termed the "first ``Doc``"
-and always contains the object metadata. Any operation on the object will read this fragment first. The fragment is
-located by converting the cache key for the object to a cache ID and then doing a lookup for a directory entry with that
-key. The directory entry has the offset and approximate size of the first fragment which is then read from the disk.
-This fragment will contain the request header and response along with overall object properties (such as content
-length).
+Objects are stored as two types of data: metadata and content data. Metadata is
+all the data about the object and the content and includes the HTTP headers.
+The content data is the content of the object, the octet stream delivered to
+the client as the object.
+
+Objects are rooted in a :cpp:class:`Doc` structure stored in the cache.
+:cpp:class:`Doc` serves as the header data for a :term:`cache fragment` and is
+contained at the start of every fragment. The first fragment for an object is
+termed the *first Doc* and always contains the object metadata. Any
+operation on the object will read this fragment first. The fragment is located
+by converting the :term:`cache key` for the object to a :term:`cache ID` and
+then doing a lookup for a :term:`directory entry` with that key. The directory
+entry has the offset and approximate size of the first fragment which is then
+read from the disk. This fragment will contain the request header and response
+along with overall object properties (such as content length).
 
 .. index:: alternate
 
-|TS| supports `varying content <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44>`_ for objects. These
-are called *alternates*. All metadata for all alternates is stored in the first fragment including the set of alternates
+|TS| supports `varying content <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44>`_
+for objects. These are called :term:`alternates <alternate>`. All metadata for
+all alternates is stored in the first fragment including the set of alternates
 and the HTTP headers for them. This enables `alternate selection
-<http://trafficserver.apache.org/docs/trunk/sdk/http-hooks-and-transactions/http-alternate-selection.en.html>`_ to be
-done after the first ``Doc`` is read from disk. An object that has more than one alternate will have the alternate
-content stored separately from the first fragment. For objects with only one alternate the content may or may not be in
-the same (first) fragment as the metadata. Each separate alternate content is allocated a directory entry and the key
-for that entry is stored in the first fragment metadata.
-
-Prior to version 4.0.1 the header data was stored in the :cpp:class:`CacheHTTPInfoVector` class which was marshaled to a
-variable length area of the on disk image, followed by information about additional fragments if needed to store the
-object.
+<http://trafficserver.apache.org/docs/trunk/sdk/http-hooks-and-transactions/http-alternate-selection.en.html>`_
+to be done after the *first Doc* is read from disk. An object that has more than
+one alternate will have the alternate content stored separately from the first
+fragment. For objects with only one alternate the content may or may not be in
+the same (first) fragment as the metadata. Each separate alternate content is
+allocated a directory entry and the key for that entry is stored in the first
+fragment metadata.
+
+Prior to version 4.0.1, the header data was stored in the
+:cpp:class:`CacheHTTPInfoVector` class which was marshaled to a variable length
+area of the on disk image, followed by information about additional fragments
+if needed to store the object.
 
 .. figure:: images/cache-doc-layout-3-2-0.png
    :align: center
 
    ``Doc`` layout 3.2.0
 
-This had the problem that with only one fragment table it could not be reliable for objects with more than one alternate
-[#]_. Therefore the fragment data was moved from being a separate variable length section of the metadata to being
-directly incorporated in to the :cpp:class:`CacheHTTPInfoVector`, yielding a layout of the following form.
+This had the problem that with only one fragment table it could not be reliable
+for objects with more than one alternate [#multiple-alternates]_. Therefore, the
+fragment data was moved from being a separate variable length section of the
+metadata to being directly incorporated into the :cpp:class:`CacheHTTPInfoVector`,
+yielding a layout of the following form.
 
 .. figure:: images/cache-doc-layout-4-0-1.png
    :align: center
 
    ``Doc`` layout 4.0.1
 
-Each element in the vector contains for each alternate, in addition to the HTTP headers and the fragment table (if any),
-a cache key. This cache key identifies a directory entry that is referred to as the "earliest ``Doc``". This is the
-location where the content for the alternate begins.
+Each element in the vector contains, for its alternate, a :term:`cache key` in
+addition to the HTTP headers and the fragment table (if any). This cache key
+identifies a :term:`directory entry` that is referred to as the *earliest Doc*.
+This is the location where the content for the alternate begins.
 
-When the object is first cached, it will have a single alternate and that will be stored (if not too large) in first
-``Doc``. This is termed a *resident alternate* in the code. This can only happen on the initial store of the object. If
-the metadata is updated (such as a ``304`` response to an ``If-Modified-Since`` request) then unless the object is
-small, the object data will be left in the original fragment and a new fragment written as the first fragment, making
-the alternate non-resident. "Small" is defined as a length smaller than :ts:cv:`proxy.config.cache.alt_rewrite_max_size`.
+When the object is first cached, it will have a single alternate and that will
+be stored (if not too large) in the first ``Doc``. This is termed a *resident alternate*
+in the code. This can only happen on the initial store of the object. If the
+metadata is updated (such as a ``304`` response to an ``If-Modified-Since``
+request) then unless the object is small, the object data will be left in the
+original fragment and a new fragment written as the first fragment, making the
+alternate non-resident. *Small* is defined as a length smaller than
+:ts:cv:`proxy.config.cache.alt_rewrite_max_size`.
 
 .. note::
 
-   The :cpp:class:`CacheHTTPInfoVector` is stored only in the first ``Doc``. Subsequent ``Doc`` instances for the
-   object, including the earliest ``Doc``, should have an ``hlen`` of zero and if not, it is ignored.
-
-Large objects are split in to multiple fragments when written to the cache. This is indicated by a total document length
-that is longer than the content in first ``Doc`` or an earliest ``Doc``. In such a case a fragment offset table is
-stored. This contains the byte offset in the object content of the first byte of content data for each fragment past the
-first (as the offset for the first is always zero). This allows range requests to be serviced much more efficiently for
-large objects, as intermediate fragments that do not contain data in the range can be skipped. The last fragment in the
-sequence is detected by the fragment size and offset reaching the end of the total size of the object, there is no
-explicit end mark. Each fragment is computationally chained from the previous in that the cache key for fragment N is
-computed by::
+   The :cpp:class:`CacheHTTPInfoVector` is stored only in the first ``Doc``.
+   Subsequent ``Doc`` instances for the object, including the earliest ``Doc``,
+   should have an ``hlen`` of zero and if not, it is ignored.
+
+Large objects are split into multiple fragments when written to the cache. This
+is indicated by a total document length that is longer than the content in
+the first ``Doc`` or an earliest ``Doc``. In such a case a fragment offset table is
+stored. This contains the byte offset in the object content of the first byte
+of content data for each fragment past the first (as the offset for the first
+is always zero). This allows range requests to be serviced much more
+efficiently for large objects, as intermediate fragments that do not contain
+data in the range can be skipped. The last fragment in the sequence is detected
+by the fragment size and offset reaching the end of the total size of the
+object; there is no explicit end mark. Each fragment is computationally chained
+from the previous in that the cache key for fragment N+1 is computed from the
+key for fragment N by::
 
    key_for_N_plus_one = next_key(key_for_N);
 
-where ``next_key`` is a global function that deterministically computes a new cache key from an existing cache key.
+Here, ``next_key`` is a global function that deterministically computes a new
+cache key from an existing cache key.
 
-Objects with multiple fragments are laid out such that the data fragments (including the earliest ``Doc``) are written
-first and the first ``Doc`` is written last. When read from disk, both the first and earliest ``Doc`` are validated
-(tested to ensure that they haven't been overwritten by the write cursor) to verify that the entire document is present
-on disk (as they bookend the other fragments - the write cursor cannot overwrite them without overwriting at leastone of
-the verified ``Doc`` instances). Note that while the fragments of a single object are ordered they are not necessarily
-contiguous as data from different objects are interleaved as the data arrives in |TS|.
+Objects with multiple fragments are laid out such that the data fragments
+(including the earliest ``Doc``) are written first and the first ``Doc`` is
+written last. When read from disk, both the first and earliest ``Doc`` are
+validated (tested to ensure that they haven't been overwritten by the write
+cursor) to verify that the entire document is present on disk (as they bookend
+the other fragments, the write cursor cannot overwrite those fragments without
+overwriting at least one of the verified ``Doc`` instances). Note that while the
+fragments of a single object are ordered they are not necessarily contiguous, as
+data from different objects is interleaved as the data arrives in |TS|.
 
 .. figure:: images/cache-multi-fragment.png
    :align: center
@@ -335,18 +415,22 @@ contiguous as data from different objects are interleaved as the data arrives in
 
 .. index:: pinned
 
-Documents which are "pinned" into the cache must not be overwritten so they are "evacuated" from in front of the write
-cursor. Each fragment is read and rewritten. There is a special lookup mechanism for objects that are being evacuated so
-that they can be found in memory rather than the potentially unreliable disk regions. The cache scans ahead of the write
-cursor to discover pinned objects as there is a dead zone immediately before the write cursor from which data cannot be
-evacuated. Evacuated data is read from disk and placed in the write queue and written as its turn comes up.
+Documents which are pinned into the cache must not be overwritten so they are
+evacuated from in front of the write cursor. Each fragment is read and
+rewritten. There is a special lookup mechanism for objects that are being
+evacuated so that they can be found in memory rather than the potentially
+unreliable disk regions. The cache scans ahead of the write cursor to discover
+pinned objects as there is a dead zone immediately before the write cursor from
+which data cannot be evacuated. Evacuated data is read from disk and placed in
+the write queue and written as its turn comes up.
 
-It appears that objects can only be pinned via the :file:`cache.config` file and if the
-:ts:cv:`proxy.config.cache.permit.pinning` is set to non-zero (it is zero by default). Objects which are in use when the
-write cursor is near use the same underlying evacuation mechanism but are handled automatically and not via the explicit
-``pinned`` bit in :cpp:class:`Dir`.
+Objects can only be pinned via :file:`cache.config`, and only while
+:ts:cv:`proxy.config.cache.permit.pinning` is set to non-zero (it is zero by
+default). Objects which are in use when the write cursor is near use the same
+underlying evacuation mechanism but are handled automatically and not via the
+explicit ``pinned`` bit in :cpp:class:`Dir`.
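+
+A minimal configuration sketch for pinning follows. The domain and the pin
+duration are placeholders chosen for illustration. In :file:`records.config`::
+
+   CONFIG proxy.config.cache.permit.pinning INT 1
+
+and in :file:`cache.config`::
+
+   dest_domain=example.com pin-in-cache=12h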
 
-.. [#] It could, under certain circumstances, be accurate for none of the alternates.
+.. [#multiple-alternates] It could, under certain circumstances, be accurate for none of the alternates.
 
 Additional Notes
 ====================
@@ -356,30 +440,48 @@ Some general observations on the data structures.
 Cyclone buffer
 --------------
 
-Because the cache is a cyclone cache objects are not preserved for an indefinite time. Even if the object is not stale
-it can be overwritten as the cache cycles through its volume. Marking an object as ``pinned`` preserves the object
-through the passage of the write cursor but this is done by copying the object across the gap, in effect re-storing it
-in the cache. Pinning large objects or a large number objects can lead to a excessive disk activity. The original
-purpose of pinning seems to have been for small, frequently used objects explicitly marked by the administrator.
-
-This means the purpose of expiration data on objects is simply to prevent them from being served to clients. They are
-not in the standard sense deleted or cleaned up. The space can't be immediately reclaimed in any event because writing
-only happens at the write cursor. Deleting an object consists only of removing the directory entries in the volume
-directory which suffices to (eventually) free the space and render the document inaccessible.
-
-Historically the cache is designed this way because web content was relatively small and not particularly consistent.
-The design also provides high performance and low consistency requirements. There are no fragmentation issues for the
-storage, and both cache misses and object deletions require no disk I/O. It does not deal particularly well with long
-term storage of large objects. See the :ref:`volume tagging` appendix for details on some work in this area.
+Because the cache is a *cyclone cache*, objects are not preserved for an
+indefinite time. Even if the object is not :term:`stale` it can be overwritten
+as the cache cycles through its volume. Marking an object as *pinned* preserves
+the object through the passage of the write cursor but this is done by copying
+the object across the gap, in effect re-storing it in the cache. Pinning large
+objects or a large number of objects can lead to excessive disk activity. The
+original purpose of pinning was for small, frequently used objects explicitly
+marked by the administrator.
+
+This means the purpose of expiration data on objects is simply to prevent them
+from being served to clients. They are not in the standard sense deleted or
+cleaned up. The space can't be immediately reclaimed in any event, because
+writing only happens at the write cursor. Deleting an object consists only of
+removing the directory entries in the volume directory which suffices to
+(eventually) free the space and render the document inaccessible.
+
+Historically, the cache was designed this way because web content was relatively
+small and not particularly consistent. The design also provided high performance
+and low consistency requirements. There are no fragmentation issues for the
+storage, and both cache misses and object deletions require no disk I/O. It does
+not deal particularly well with long term storage of large objects. See the
+:ref:`volume tagging` appendix for details on some work in this area.
 
 Disk Failure
 ------------
 
-The cache is designed to be relatively resistant to disk failures. Because each storage unit in each volume is mostly independent the loss of a disk simply means that the corresponding :cpp:class:`Vol` instances (one per cache volume that uses the storage unit) becomes unusable. The primary issue is updating the volume assignment table to both preserve assignments for objects on still operational volumes while distributing the assignments from the failed disk to those operational volumes. This mostly done in::
+The cache is designed to be relatively resistant to disk failures. Because each
+:term:`storage unit` in each :term:`cache volume` is mostly independent, the
+loss of a disk simply means that the corresponding :cpp:class:`Vol` instances
+(one per cache volume that uses the storage unit) become unusable. The primary
+issue is updating the volume assignment table to both preserve assignments for
+objects on still operational volumes and distribute the assignments from the
+failed disk to those operational volumes. This is mostly done in::
 
    AIO_Callback_handler::handle_disk_failure
 
-Restoring a disk to active duty is quite a bit more difficult task. Changing the volume assignment of a cache key renders any currently cached data inaccessible. This is obviouly not a problem when a disk has failed, but is a bit trickier to decide which cached objects are to be de facto evicted if a new storage unit is added to a running system. The mechanism for this, if any, is still under investigation.
+Restoring a disk to active duty is a more difficult task. Changing the volume
+assignment of a :term:`cache key` renders any currently cached data
+inaccessible. This is not a problem when a disk has failed, but it is
+trickier to decide which cached objects are to be de facto evicted if a new
+storage unit is added to a running system. The mechanism for this, if any, is
+still under investigation.
 
 Implementation Details
 ======================
@@ -389,7 +491,7 @@ Stripe Directory
 
 .. _directory-entry:
 
-The in memory volume directory entries are defined as described below.
+The in-memory volume directory entries are described below.
 
 .. cpp:class:: Dir
 
@@ -410,126 +512,186 @@ The in memory volume directory entries are defined as described below.
    offset_high inku16              High order offset bits
    =========== =================== ===================================================
 
-   The stripe directory is an array of ``Dir`` instances. Each entry refers to a span in the volume which contains a cached object. Because every object in the cache has at least one directory entry this data has been made as small as possible.
+The stripe directory is an array of ``Dir`` instances. Each entry refers to
+a span in the volume which contains a cached object. Because every object in
+the cache has at least one directory entry this data has been made as small
+as possible.
 
-   The offset value is the starting byte of the object in the volume. It is 40 bits long split between the *offset* (lower 24 bits) and *offset_high* (upper 16 bits) members. Note that since there is a directory for every storage unit in a cache volume, this is the offset in to the slice of a storage unit attached to that volume.
+The offset value is the starting byte of the object in the volume. It is 40
+bits long, split between the *offset* (lower 24 bits) and *offset_high*
+(upper 16 bits) members. Note that since there is a directory for every
+storage unit in a cache volume, this is the offset into the slice of a
+storage unit attached to that volume.
 
 .. _dir-size:
 
-   The *size* and *big* values are used to calculate the approximate size of the fragment which contains the object. This value is used as the number of bytes to read from storage at the offset value. The exact size is contained in the object metadata in :cpp:class:`Doc` which is consulted once the read has completed. For this reason the approximate size needs to be at least as large as the actual size but can be larger, at the cost of reading the extraneous bytes.
+The *size* and *big* values are used to calculate the approximate size of
+the fragment which contains the object. This value is used as the number of
+bytes to read from storage at the offset value. The exact size is contained
+in the object metadata in :cpp:class:`Doc` which is consulted once the read
+has completed. For this reason, the approximate size needs to be at least as
+large as the actual size but can be larger, at the cost of reading the
+extraneous bytes.
 
-   The computation of the approximate size of the fragment is defined as::
+The computation of the approximate size of the fragment is defined as::
 
-      ( *size* + 1 ) * 2 ^ ( ``CACHE_BLOCK_SHIFT`` + 3 * *big* )
+  ( size + 1 ) * 2 ^ ( CACHE_BLOCK_SHIFT + 3 * big )
 
-   where ``CACHE_BLOCK_SHIFT`` is the bit width of the size of a basic cache block (9, corresponding to a sector size of 512). Therefore the value with current defines is::
+Here, ``CACHE_BLOCK_SHIFT`` is the bit width of the size of a basic cache
+block (9, corresponding to a sector size of 512). Therefore the value with
+current defines is::
 
-      ( *size* + 1 ) * 2 ^ (9 + 3 * *big*)
+  ( size + 1 ) * 2 ^ (9 + 3 * big)
 
-   Because *big* is 2 bits the values for the multiplier of *size* are
+.. _big-mult:
 
-   .. _big-mult:
+Because *big* is 2 bits, the values for the multiplier of *size* are:
 
-   ===== ===============   ========================
-   *big* Multiplier        Maximum Size
-   ===== ===============   ========================
-     0   512 (2^9)         32768 (2^15)
-     1   4096 (2^12)       262144 (2^18)
-     2   32768 (2^15)      2097152 (2^21)
-     3   262144 (2^18)     16777216 (2^24)
-   ===== ===============   ========================
+   ===== =============== ========================
+   *big* Multiplier      Maximum Size
+   ===== =============== ========================
+   0     512 (2^9)       32768 (2^15)
+   1     4096 (2^12)     262144 (2^18)
+   2     32768 (2^15)    2097152 (2^21)
+   3     262144 (2^18)   16777216 (2^24)
+   ===== =============== ========================
 
-   Note also that *size* is effectively offset by one, so a value of 0 indicates a single unit of the multiplier.
+Note also that *size* is effectively offset by one, so a value of 0 indicates
+a single unit of the multiplier.
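
As a rough illustration of the formula and table above, the approximate size can be computed from the two index entry fields as follows. This is a sketch only: the function name ``approx_size`` is hypothetical, and the 6-bit width of *size* is an assumption inferred from the maximum-size column (64 multiplier units per row), not taken from the source code.

```python
CACHE_BLOCK_SHIFT = 9  # bit width of a 512-byte basic cache block (from the text)


def approx_size(size: int, big: int) -> int:
    """Return the approximate fragment size in bytes.

    Implements ( size + 1 ) * 2 ^ ( CACHE_BLOCK_SHIFT + 3 * big ).
    """
    if not 0 <= big <= 3:
        raise ValueError("big is a 2-bit field (0..3)")
    # Assumption: size is a 6-bit field, as implied by the table's maximums.
    if not 0 <= size <= 63:
        raise ValueError("size is assumed to be a 6-bit field (0..63)")
    return (size + 1) << (CACHE_BLOCK_SHIFT + 3 * big)
```

Because *size* is offset by one, ``approx_size(0, 0)`` is a single 512-byte block rather than zero bytes.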
 
 .. _target-fragment-size:
 
 The target fragment size can be set with the :file:`records.config` value
-
-   ``proxy.config.cache.target_fragment_size``
-
-This value should be chosen so that it is a multiple of a :ref:`cache entry multiplier <big-mult>`. It is not necessary
-to make it a power of 2 [#]_. Larger fragments increase I/O efficiency but lead to more wasted space. The default size
-(1M, 2^20) is a reasonable choice in most circumstances altough in very specific cases there can be benefit from tuning
-this parameter. |TS| imposes an internal maximum of a 4194232 bytes which is 4M (2^22) less the size of a struct
-:cpp:class:`Doc`. In practice then the largest reasonable target fragment size is 4M - 262144 = 3932160.
-
-When a fragment is stored to disk the size data in the cache index entry is set to the finest granularity permitted by
-the size of the fragment. To determine this consult the :ref:`cache entry multipler <big-mult>` table, find the smallest
-maximum size that is at least as large as the fragment. That will indicate the value of *big* selected and therefore the
-granularity of the approximate size. That represents the largest possible amount of wasted disk I/O when the fragment is
-read from disk.
+:ts:cv:`proxy.config.cache.target_fragment_size`.
+
+This value should be chosen so that it is a multiple of a
+:ref:`cache entry multiplier <big-mult>`. It is not necessary to make it a
+power of two [#cache-mult-value]_. Larger fragments increase I/O efficiency but
+lead to more wasted space. The default size (1M, 2^20) is a reasonable choice
+in most circumstances, although in very specific cases there can be benefit from
+tuning this parameter. |TS| imposes an internal maximum of 4,194,232 bytes,
+which is 4M (2^22) less the size of a struct :cpp:class:`Doc`. In practice,
+the largest reasonable target fragment size is 4M - 262,144 = 3,932,160.
+
+When a fragment is stored to disk, the size data in the cache index entry is
+set to the finest granularity permitted by the size of the fragment. To
+determine this, consult the :ref:`cache entry multiplier <big-mult>` table and
+find the smallest maximum size that is at least as large as the fragment. That
+will indicate the value of *big* selected and therefore the granularity of the
+approximate size. That represents the largest possible amount of wasted disk I/O
+when the fragment is read from disk.
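
The table lookup described above can be sketched in the other direction, going from a fragment length to the stored fields. This is an illustrative sketch, not the actual implementation: ``encode_size`` is a hypothetical name, and the limit of 64 units per *big* value is an assumption read off the maximum-size column of the table.

```python
CACHE_BLOCK_SHIFT = 9   # from the text: 512-byte basic cache block
UNITS_PER_ENTRY = 64    # assumption: maximum size / multiplier in the table


def encode_size(length: int) -> tuple[int, int]:
    """Return (size, big) for a fragment of `length` bytes.

    Picks the smallest `big` whose maximum size covers the fragment, then
    rounds the length up to a whole number of multiplier units.
    """
    if length < 1:
        raise ValueError("length must be at least 1 byte")
    for big in range(4):
        multiplier = 1 << (CACHE_BLOCK_SHIFT + 3 * big)
        if length <= UNITS_PER_ENTRY * multiplier:
            units = (length + multiplier - 1) // multiplier  # round up
            return units - 1, big  # size is stored offset by one
    raise ValueError("fragment exceeds the largest representable size")
```

For a 40,000-byte fragment this selects *big* = 1 (4096-byte units) and rounds up to 10 units, so at most one multiplier's worth of extra bytes is read.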
 
 .. index:: DIR_DEPTH, index segment, index buckets
 
-The set of index entries for a volume are grouped in to *segments*. The number of segments for an index is selected so
-that there are as few segments as possible such that no segment has more than 2^16 entries. Intra-segment references can
-therefore use a 16 bit value to refer to any other entry in the segment.
+The set of index entries for a volume is grouped into :term:`segments <directory segment>`.
+The number of segments for an index is selected so that there are as few
+segments as possible such that no segment has more than 2^16 entries.
+Intra-segment references can therefore use a 16-bit value to refer to any other
+entry in the segment.
+
+Index entries in a segment are grouped into :term:`buckets <directory bucket>`, each
+of ``DIR_DEPTH`` (currently 4) entries. These are handled in the standard hash
+table manner, giving somewhat less than 2^14 buckets per segment.
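
The arithmetic behind these limits can be sketched as below. This only mirrors the constraints stated in the text (at most 2^16 entries per segment, ``DIR_DEPTH`` entries per bucket); the function name and the even split of entries across segments are assumptions, and the real computation in the source may differ.

```python
import math

DIR_DEPTH = 4                        # entries per bucket (from the text)
MAX_ENTRIES_PER_SEGMENT = 1 << 16    # so a 16-bit value can address any entry


def segment_layout(total_entries: int) -> tuple[int, int]:
    """Return (segments, buckets_per_segment) for a directory.

    Uses the fewest segments such that none holds more than 2^16 entries.
    Assumption: entries are spread evenly across the segments.
    """
    segments = max(1, math.ceil(total_entries / MAX_ENTRIES_PER_SEGMENT))
    entries_per_segment = math.ceil(total_entries / segments)
    buckets_per_segment = entries_per_segment // DIR_DEPTH
    return segments, buckets_per_segment
```

With one million entries this gives 16 segments of 15,625 buckets each, which stays under the 2^14 bound.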
 
-Index entries in a segment are grouped *buckets* each of ``DIR_DEPTH`` (currently 4) entries. These are handled in the
-standard hash table way, giving somewhat less than 2^14 buckets per segment.
+.. [#cache-mult-value]
 
-.. [#] The comment in :file:`records.config` is simply wrong.
+   The comment in earlier versions of the :file:`records.config` documentation,
+   which indicated that this value must be a power of two, was mistaken and has
+   been corrected.
 
 .. _cache-directory-probe:
 
 Directory Probing
 -----------------
 
-Directory probing is locating a specific directory entry in the stripe directory based on a cache ID. This is handled
-primarily by the function :cpp:func:`dir_probe()`. This is passed the cache ID (:arg:`key`), a stripe (:arg:`d`), and a
-last collision (:arg:`last_collision`). The last of these is an in and out parameter, updated as useful during the
-probe.
-
-Given an ID, the top half (64 bits) is used as a :ref:`segment <dir-segment>` index, taken modulo the number of segments in
-the directory. The bottom half is used as a :ref:`bucket <dir-bucket>` index, taken modulo the number of buckets per
-segment. The :arg:`last_collision` value is used to mark the last matching entry returned by :cpp:func:`dir_probe`.
-
-After computing the appropriate bucket, the entries in that bucket are searched to find a match. In this case a match is
-detected by comparison of the bottom 12 bits of the cache ID (the *cache tag*). The search starts at the base entry for
-the bucket and then proceeds via the linked list of entries from that first entry. If a tag match is found and there is
-no :arg:`collision` then that entry is returned and :arg:`last_collision` is updated to that entry. If :arg:`collision`
-is set, then if it isn't the current match the search continues down the linked list, otherwise :arg:`collision` is
-cleared and the search continues. The effect of this is that matches are skipped until the last returned match
-(:arg:`last_collision`) is found, after which the next match (if any) is returned. If the search falls off the end of
-the linked list then a miss result is returned (if no last collision), otherwise the probe is restarted after clearing
-the collision on the presumption that the entry for the collision has been removed from the bucket. This can lead to
-repeats among the returned values but guarantees that no valid entry will be skipped.
-
-Last collision can therefore be used to restart a probe at a later time. This is important because the match returned
-may not be the actual object - although the hashing of the cache ID to a bucket and the tag matching is unlikely to
-create false positives, that is possible. When a fragment is read the full cache ID is available and checked and if
-wrong, that read can be discarded and the next possible match from the directory found because the cache virtual
-connection tracks the last collision value.
+Directory probing is the locating of a specific :term:`directory entry` in the
+stripe directory based on a :term:`cache ID`. This is handled primarily by the
+function :cpp:func:`dir_probe()`. This is passed the cache ID (:arg:`key`), a
+stripe (:arg:`d`), and a last collision (:arg:`last_collision`). The last of
+these is an in and out parameter, updated as useful during the probe.
+
+Given an ID, the top half (64 bits) is used as a :ref:`segment <dir-segment>`
+index, taken modulo the number of segments in the directory. The bottom half is
+used as a :ref:`bucket <dir-bucket>` index, taken modulo the number of buckets
+per segment. The :arg:`last_collision` value is used to mark the last matching
+entry returned by :cpp:func:`dir_probe`.
+
+After computing the appropriate bucket, the entries in that bucket are searched
+to find a match. In this case a match is detected by comparison of the bottom
+12 bits of the :term:`cache ID` (the *cache tag*). The search starts at the base
+entry for the bucket and then proceeds via the linked list of entries from that
+first entry. If a tag match is found and there is no :arg:`collision` then that
+entry is returned and :arg:`last_collision` is updated to that entry. If
+:arg:`collision` is set and it isn't the current match, the search continues
+down the linked list; otherwise :arg:`collision` is cleared and the search
+continues.
+
+The effect of this is that matches are skipped until the last returned match
+(:arg:`last_collision`) is found, after which the next match (if any) is
+returned. If the search falls off the end of the linked list and there is no
+last collision, a miss result is returned; otherwise the probe is restarted
+after clearing the collision, on the presumption that the entry for the
+collision has been removed from the bucket. This can lead to repeats among the
+returned values but guarantees that no valid entry will be skipped.
+
+Last collision can therefore be used to restart a probe at a later time. This
+is important because the match returned may not be the actual object. Although
+the hashing of the :term:`cache ID` to a :term:`bucket <directory bucket>` and
+the tag matching are unlikely to create false positives, they are possible. When
+a fragment is read, the full cache ID is available and is checked. If it is
+wrong, that read can be discarded and the next possible match found in the
+directory, because the cache virtual connection tracks the last collision value.
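
The probe logic described in this section can be sketched as follows. This is a simplified model rather than the code of ``dir_probe()``: the ``Entry`` class, the list used in place of the linked list, and the function names are all hypothetical, and the cache ID is assumed to be 128 bits wide since its top half is stated to be 64 bits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # entries are compared by identity, not by value
class Entry:
    tag: int      # bottom 12 bits of the cache ID (the cache tag)
    offset: int   # stand-in for the rest of the directory entry


def locate(cache_id: int, n_segments: int, n_buckets: int) -> tuple[int, int, int]:
    """Return (segment index, bucket index, tag) for a 128-bit cache ID."""
    top = cache_id >> 64
    bottom = cache_id & ((1 << 64) - 1)
    return top % n_segments, bottom % n_buckets, cache_id & 0xFFF


def probe(bucket: list[Entry], tag: int,
          last_collision: Optional[Entry] = None) -> Optional[Entry]:
    """Return the next tag match after `last_collision`, or None on a miss."""
    collision = last_collision
    while True:
        for entry in bucket:
            if entry.tag != tag:
                continue
            if collision is None:
                return entry       # caller records this as last_collision
            if entry is collision:
                collision = None   # resume: the next match will be returned
        if collision is None:
            return None            # fell off the end with nothing pending
        collision = None           # collision entry presumed removed; restart
```

A caller that finds the wrong object after reading the fragment would call ``probe`` again, passing the previously returned entry as ``last_collision``, until it gets the right object or a miss.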
 
 ----------------
 Cache Operations
 ----------------
 
-Cache activity starts after the HTTP request header has been parsed and remapped. Tunneled transactions do not interact with the cache because the headers are never parsed.
+Cache activity starts after the HTTP request header has been parsed and
+remapped. Tunneled transactions do not interact with the cache because the
+headers are never parsed.
 
-To understand the logic we must introduce the term "cache valid" which means something that is directly related to an object that is valid to be put in the cache (e.g. a ``DELETE`` which refers to a URL that is cache valid but cannot be cached itself). This is important because |TS| computes cache validity several times during a transaction and only performs cache operations for cache valid results. The criteria used changes during the course of the transaction as well. This is done to avoid the cost of cache activity for objects that cannot be in the cache.
+To understand the logic we must introduce the term *cache valid* which means
+something that is directly related to an object that is valid to be put in the
+cache (e.g. a ``DELETE`` which refers to a URL that is cache valid but cannot
+be cached itself). This is important because |TS| computes cache validity
+several times during a transaction and only performs cache operations for cache
+valid results. The criteria used change during the course of the transaction
+as well. This is done to avoid the cost of cache activity for objects that
+cannot be in the cache.
 
-The three basic cache operations are lookup, read, and write. We will take deleting entries as a special case of writing where only the volume directory is updated.
+The three basic cache operations are: lookup, read, and write. We will take
+deleting entries as a special case of writing where only the volume directory
+is updated.
 
-After the client request header is parsed and is determined to be potentially cacheable, a `cache lookup`_ is done. If successful a `cache read`_ is attempted. If either the lookup or the read fails and the content is considered cacheable then a `cache write`_ is attempted.
+After the client request header is parsed and is determined to be potentially
+cacheable, a `cache lookup`_ is done. If successful, a `cache read`_ is
+attempted. If either the lookup or the read fails and the content is considered
+cacheable then a `cache write`_ is attempted.
 
 Cacheability
 ============
 
-The first thing done with a request with respect to cache is to determine whether it is potentially a valid object for the cache. After initial parsing and remapping this check is done primarily to detect a negative result because if so all further cache processing is skipped -- it will not be put in to the cache nor will a cache lookup be done. There are a number of prerequisites along with configuration options to change them. Additional cacheability checks are done later in the process when more is known about the transaction (such as plugin operations and the origin server response). Those checks are described as appropriate in the sections on the relevant operations.
+The first thing done with a request with respect to cache is to determine
+whether it is potentially a valid object for the cache. After initial parsing
+and remapping, this check is done primarily to detect a negative result, as it
+allows further cache processing to be skipped. It will not be put in to the
+cache, nor will a cache lookup be performed. There are a number of prerequisites
+along with configuration options to change them. Additional cacheability checks
+are done later in the process, when more is known about the transaction (such
+as plugin operations and the origin server response). Those checks are described
+as appropriate in the sections on the relevant operations.
 
-The set of things which can affect cacheability are
+The set of things which can affect cacheability are:
 
-* Built in constraints
-* Settings in :file:`records.config`
-* Settings in :file:`cache.config`
-* Plugin operations
+* Built in constraints.
+* Settings in :file:`records.config`.
+* Settings in :file:`cache.config`.
+* Plugin operations.
 
-The initial internal checks, along with their :file:`records.config` overrides[#]_, are done in::
+The initial internal checks, along with their :file:`records.config`
+overrides [#cacheability-overrides]_, are done in ``HttpTransact::is_request_cache_lookupable``.
 
-   HttpTransact::is_request_cache_lookupable
-
-The checks that are done are
+The checks that are done are:
 
    Cacheable Method
       The request must be one of ``GET``, ``HEAD``, ``POST``, ``DELETE``, ``PUT``.
@@ -537,235 +699,329 @@ The checks that are done are
       See ``HttpTransact::is_method_cache_lookupable()``.
 
    Dynamic URL
-      |TS| tries to avoid caching dynamic content because it's dynamic. A URL is considered dynamic if it
-
-      *  is not ``HTTP`` or ``HTTPS``
-      *  has query parameters
-      *  ends in ``asp``
-      *  has ``cgi`` in the path
+      |TS| tries to avoid caching dynamic content because it's dynamic. A URL is
+      considered dynamic if it:
 
-      This check can be disabled by setting a non-zero value for::
+      *  Is not ``HTTP`` or ``HTTPS``.
+      *  Has query parameters.
+      *  Ends in ``asp``.
+      *  Has ``cgi`` in the path.
 
-         proxy.config.http.cache.cache_urls_that_look_dynamic
+      This check can be disabled by setting a non-zero value for
+      :ts:cv:`proxy.config.http.cache.cache_urls_that_look_dynamic`.
 
-      In addition if a TTL is set for rule that matches in :file:`cache.config` then this check is not done.
+      In addition, if a TTL is set for a rule that matches in :file:`cache.config`,
+      then this check is not done.
 
    Range Request
       Cache valid only if :ts:cv:`proxy.config.http.cache.range.lookup` in
       :file:`records.config` is non-zero. This does not mean the range request
-	    can be cached, only that it might be satisfiable from the
-	    cache. In addition, :ts:cv:`proxy.config.http.cache.range.write`
-	    can be set to try to force a write on a range request. This
-	    probably has little value at the moment, but if for example the
-	    origin server ignores the ``Range:`` header, this option can allow
-	    for the response to be cached. It is disabled by default, for
-	    best performance.
-
-A plugin can call :c:func:`TSHttpTxnReqCacheableSet()` to force the request to be viewed as cache valid.
-
-.. [#] The code appears to check :file:`cache.config` in this logic by setting the ``does_config_permit_lookup`` in the ``cache_info.directives`` of the state machine instance but I can find no place where the value is used. The directive ``does_config_permit_storing`` is set and later checked so the directive (from the administrator point of view) is effective in preventing caching of the object.
+      can be cached, only that it might be satisfiable from the cache. In
+      addition, :ts:cv:`proxy.config.http.cache.range.write`
+      can be set to try to force a write on a range request. This
+      probably has little value at the moment, but if for example the
+      origin server ignores the ``Range:`` header, this option can allow
+      for the response to be cached. It is disabled by default, for
+      best performance.
+
+A plugin can call :c:func:`TSHttpTxnReqCacheableSet()` to force the request to
+be viewed as cache valid.
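
The dynamic URL heuristic listed above can be approximated as below. This is a sketch of the four listed criteria only, not the logic in ``HttpTransact``: the function name is hypothetical, and the case-insensitive matching is an assumption.

```python
from urllib.parse import urlsplit


def looks_dynamic(url: str) -> bool:
    """Return True if the URL matches any of the listed 'dynamic' criteria."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ("http", "https"):
        return True                # not HTTP or HTTPS
    if parts.query:
        return True                # has query parameters
    path = parts.path.lower()      # assumption: comparison ignores case
    return path.endswith("asp") or "cgi" in path
```

Under these rules ``http://example.com/index.html`` is treated as static, while the same URL with ``?id=1`` appended is treated as dynamic.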
+
+.. [#cacheability-overrides]
+
+   The code appears to check :file:`cache.config` in this logic by setting the
+   ``does_config_permit_lookup`` in the ``cache_info.directives`` of the state
+   machine instance but I can find no place where the value is used. The
+   directive ``does_config_permit_storing`` is set and later checked so the
+   directive (from the administrator point of view) is effective in preventing
+   caching of the object.
 
 Cache Lookup
 ============
 
-If the initial request is not determined to be cache invalid then a lookup is done. Cache lookup determines if an object is in the cache and if so, where it is located. In some cases the lookup proceeds to read the first ``Doc`` from disk to verify the object is still present in the cache.
+If the initial request is not determined to be cache invalid then a lookup is
+done. Cache lookup determines if an object is in the cache and if so, where it
+is located. In some cases the lookup proceeds to read the first ``Doc`` from
+disk to verify the object is still present in the cache.
 
-There are three basic steps to a cache lookup.
+The basic steps to a cache lookup are:
 
 #. The cache key is computed.
 
-   This is normally computed using the request URL but it can be overridden :ref:`by a plugin <cache-key>` . As far as I can tell the cache index string is not stored anywhere, it presumed computable from the client request header.
+   This is normally computed using the request URL but it can be overridden
+   :ref:`by a plugin <cache-key>`. The cache index string is not stored, as it
+   is presumed computable from the client request headers.
 
 #. The cache stripe is determined (based on the cache key).
 
-   The cache key is used as a hash key in to an array of :cpp:class:`Vol` instances. The construction and arrangement of this array is the essence of how volumes are assigned.
+   The :term:`cache key` is used as a hash key in to an array of
+   :cpp:class:`Vol` instances. The construction and arrangement of this array
+   is the essence of how volumes are assigned.
 
-#. The cache stripe directory :ref:`is probed <cache-directory-probe>` using the index key computed from the cache key.
+#. The cache stripe directory :ref:`is probed <cache-directory-probe>` using the
+   index key computed from the cache key.
 
-   Various other lookaside directories are checked as well, such as the :ref:`aggregation buffer <aggregation-buffer>`.
+   Various other lookaside directories are checked as well, such as the
+   :ref:`aggregation buffer <aggregation-buffer>`.
 
-#. If the directory entry is found the first ``Doc`` is read from disk and checked for validity.
+#. If the directory entry is found the first ``Doc`` is read from disk and
+   checked for validity.
 
-   This is done in :cpp:func:`CacheVC::openReadStartHead()` or :cpp:func:`CacheVC::openReadStartEarliest()` which are tightly coupled methods.
+   This is done in :cpp:func:`CacheVC::openReadStartHead()` or
+   :cpp:func:`CacheVC::openReadStartEarliest()` which are tightly coupled
+   methods.
 
-If the lookup succeeds then a more detailed directory entry (struct :cpp:class:`OpenDir`) is created. Note that the directory probe includes a check for an already extant ``OpenDir`` which if found is returned without additional work.
+If the lookup succeeds, then a more detailed directory entry (struct
+:cpp:class:`OpenDir`) is created. Note that the directory probe includes a check
+for an already extant ``OpenDir`` which, if found, is returned without
+additional work.
 
 Cache Read
 ==========
 
-Cache read starts after a successful `cache lookup`_. At this point the first ``Doc`` has been loaded in to memory and can be consulted for additional information. This will always contain the HTTP headers for all alternates of the object.
+Cache read starts after a successful `cache lookup`_. At this point the first
+``Doc`` has been loaded in to memory and can be consulted for additional
+information. This will always contain the HTTP headers for all
+:term:`alternates <alternate>` of the object.
 
 .. sidebar:: Read while write
 
-   There is provision in the code to support "read while write", that is serving an object from cache in one transaction while it is being written in another. Several settings are needed for it to be used. See :ref:`reducing-origin-server-requests-avoiding-the-thundering-herd`. It must specifically enabled in :file:`records.config` and if not, a cache read will fail if the object is currently be written or updated.
-
-At this point an alternate for the object is selected. This is done by comparing the client request to the stored response headers, but it can be controlled by a plugin using ``TS_HTTP_ALT_SELECT_HOOK``.
-
-The content can now be checked to see if it is stale by calculating the "freshness" of the object. This is essential checking how old the object is by looking at the headers and possibly other metadata (note the headers can't be checked until we've selected an alternate).
-
-Most of this work is done in::
-
-   HttpTransact::what_is_document_freshness
-
-First the TTL (time to live) value which can be set in :file:`cache.config` is checked if the request matches the configuration file line. This is done based on when the object was placed in cache, not on any data in the headers.
-
-Next an internal flag ("needs-revalidate-once") is checked if the :file:`cache.config` value "revalidate-after" is not set, and if set the object is marked "stale".
-
-After these checks the object age is calculated by::
-
-   HttpTransactHeaders::calculate_document_age
-
-and then any configured fuzzing is applied. The limits to this age based on available data is calculated by::
-
-   HttpTransact::calculate_document_freshness_limit
-
-How this age is used is determined by the :file:`records.config` value::
-
-   proxy.config.http.cache.when_to_revalidate
-
-If this is zero then the built caclulations are used which compare the freshness limits with document age, modified by any of the client supplied cache control values ``max-age``, ``min-fresh``, ``max-stale`` unless explicitly overridden in :file:`cache.config`.
-
-If the object is not stale then it is served to the client. If stale the client request may be changed to an ``If Modified Since`` request to revalidate.
-
-The request is served using a standard virtual connection tunnel (``HttpTunnel``) with the :cpp:class:`CacheVC` acting
-as the producer and the client ``NetVC`` acting as the sink. If the request is a range request this can be modified with
-a transform to select the appropriate parts of the object or, if the request contains a single range, it can use the
-range acceleration.
-
-Range acceleration is done by consulting a fragment offset table attached to the earliest ``Doc`` which contains offsets
-for all fragments past the first. This allows loading the fragment containing the first requested byte immediately
+   There is provision in the code to support *read while write*, that is,
+   serving an object from cache in one transaction while it is being written in
+   another. Several settings are needed for it to be used. See
+   :ref:`reducing-origin-server-requests-avoiding-the-thundering-herd`. It must
+   be specifically enabled in :file:`records.config`; if it is not, a cache read
+   will fail if the object is currently being written or updated.
+
+At this point an alternate for the object is selected. This is done by comparing
+the client request to the stored response headers, but it can be controlled by a
+plugin using ``TS_HTTP_ALT_SELECT_HOOK``.
+
+The content can now be checked to see if it is :term:`stale` by calculating the
+*freshness* of the object. This is essentially checking how old the object is
+by looking at the headers and possibly other metadata (note that the headers
+can't be checked until we've selected an alternate).
+
+Most of this work is done in ``HttpTransact::what_is_document_freshness``.
+
+First, the TTL (time to live) value, which can be set in :file:`cache.config`,
+is checked if the request matches the configuration file line. This is done
+based on when the object was placed in the cache, not on any data in the
+headers.
+
+Next, an internal flag (``needs-revalidate-once``) is checked if the
+:file:`cache.config` value ``revalidate-after`` is not set, and if set the
+object is marked *stale*.
+
+After these checks the object age is calculated by ``HttpTransactHeaders::calculate_document_age``
+and then any configured fuzzing is applied. The limits to this age based on
+available data are calculated by ``HttpTransact::calculate_document_freshness_limit``.
+
+How this age is used is determined by the :file:`records.config` setting for
+:ts:cv:`proxy.config.http.cache.when_to_revalidate`. If this is ``0`` then the
+built-in calculations are used, which compare the freshness limits with document
+age, modified by any of the client supplied cache control values (``max-age``,
+``min-fresh``, ``max-stale``) unless explicitly overridden in
+:file:`cache.config`.
+
+If the object is not stale then it is served to the client. If it is stale, the
+client request may be changed to an ``If-Modified-Since`` request to
+:term:`revalidate <revalidation>`.
+
+The request is served using a standard virtual connection tunnel (``HttpTunnel``)
+with the :cpp:class:`CacheVC` acting as the producer and the client ``NetVC``
+acting as the sink. If the request is a range request this can be modified with
+a transform to select the appropriate parts of the object or, if the request
+contains a single range, it can use the range acceleration.
+
+Range acceleration is done by consulting a fragment offset table attached to
+the earliest ``Doc`` which contains offsets for all fragments past the first.
+This allows loading the fragment containing the first requested byte immediately
 rather than performing reads on the intermediate fragments.
 
 Cache Write
 ===========
 
-Writing to cache is handled by an instance of the class :cpp:class:`CacheVC`. This is a virtual connection which
-receives data and writes it to cache, acting as a sink. For a standard transaction data transfers between virtual
-connections (*VConns*) are handled by :cpp:class:`HttpTunnel`. Writing to cache is done by attaching a ``CacheVC``
-instance as a tunnel consumer. It therefore operates in parallel with the virtual connection that transfers data to the
-client. The data does not flow to the cache and then to the client, it is split and goes both directions in parallel.
-This avoids any data synchronization issues between the two.
+Writing to the cache is handled by an instance of the class :cpp:class:`CacheVC`.
+This is a virtual connection which receives data and writes it to cache, acting
+as a sink. For a standard transaction, data transfers between virtual connections
+(*VConns*) are handled by :cpp:class:`HttpTunnel`. Writing to the cache is done
+by attaching a ``CacheVC`` instance as a tunnel consumer. It therefore operates
+in parallel with the virtual connection that transfers data to the client. The
+data does not flow to the cache and then to the client; it is split and goes
+both directions in parallel. This avoids any data synchronization issues between
+the two.
 
 .. sidebar:: Writing to disk
 
-   The actual write to disk is handled in a separate thread dedicated to I/O operations, the AIO threads. The cache
-   logic marshals the data and then hands the operation off to the AIO thread which signals back once the operation
+   The actual write to disk is handled in a separate thread dedicated to I/O
+   operations, the AIO threads. The cache logic marshals the data and then hands
+   the operation off to the AIO thread which signals back once the operation
    completes.
 
-While each ``CacheVC`` handles its transactions independently, they do interact at the volume level as each ``CacheVC``
-makes calls to the volume object to write its data to the volume content. The ``CacheVC`` accumulates data internally
-until either the transaction is complete or the amount of data to write exceeds the target fragment size. In the former
-case the entire object is submitted to the volume to be written. In the latter case a target fragment size amount of
-data is submitted and the ``CacheVC`` continues to operate on subsequent data. The volume in turn places these write
+While each ``CacheVC`` handles its transactions independently, they do interact
+at the :term:`volume <cache volume>` level as each ``CacheVC`` makes calls to
+the volume object to write its data to the volume content. The ``CacheVC``
+accumulates data internally until either the transaction is complete or the
+amount of data to write exceeds the target fragment size. In the former
+case the entire object is submitted to the volume to be written. In the latter
+case, a target fragment size amount of data is submitted and the ``CacheVC``
+continues to operate on subsequent data. The volume in turn places these write
 requests in a holding area called the `aggregation buffer`_.
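
The accumulate-and-submit behaviour described above can be modelled roughly as follows. This is an illustrative sketch and not the ``CacheVC`` code: the class name, its methods, and the ``submit`` callback (standing in for the call to the volume object) are all hypothetical.

```python
from typing import Callable


class FragmentAccumulator:
    """Buffers written data and submits it in target-fragment-size pieces."""

    def __init__(self, target_fragment_size: int,
                 submit: Callable[[bytes], None]) -> None:
        self.target = target_fragment_size
        self.submit = submit          # stand-in for handing data to the volume
        self.pending = bytearray()

    def write(self, data: bytes) -> None:
        """Accumulate data; submit whenever it exceeds the target size."""
        self.pending += data
        while len(self.pending) > self.target:
            self.submit(bytes(self.pending[:self.target]))
            del self.pending[:self.target]

    def finish(self) -> None:
        """Transaction complete: submit whatever remains."""
        if self.pending:
            self.submit(bytes(self.pending))
            self.pending.clear()
```

Note that no submitted piece is ever larger than the target fragment size, which matches the responsibility placed on the ``CacheVC`` in this section.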
 
-For objects under the target fragment size there is no consideration of order, the object is simply written to the
-volume content. For larger objects the earliest ``Doc`` is written first and the first ``Doc`` written last. This
-provides some detection ability should the object be overwritten. Because of the nature of the write cursor no fragment
-after the first fragment (in the earliest ``Doc``) can be overwritten without also overwriting that first fragment
-(since we know at the time the object was finalized in the cache the write cursor was at the position of the first
-``Doc``).
+For objects under the target fragment size, there is no consideration of order;
+the object is simply written to the volume content. For larger objects, the
+earliest ``Doc`` is written first and the first ``Doc`` written last. This
+provides some detection ability should the object be overwritten. Because of
+the nature of the write cursor no fragment after the first fragment (in the
+earliest ``Doc``) can be overwritten without also overwriting that first
+fragment (since we know at the time the object was finalized in the cache the
+write cursor was at the position of the first ``Doc``).
+
+.. note::
 
-.. note:: It is the responsibility of the ``CacheVC`` to not submit writes that exceed the target fragment size.
+   It is the responsibility of the ``CacheVC`` to not submit writes that exceed
+   the target fragment size.
 
-.. how does the write logic know if it's an original object write or an update to an existing object?
+.. XXX how does the write logic know if it's an original object write or an update to an existing object?
 
 Update
 ------
 
-Cache write also covers the case where an existing object in the cache is modified. This occurs when
+Cache write also covers the case where an existing object in the cache is
+modified. This occurs when:
+
+* A conditional request is made to the origin server and a ``304 - Not Modified``
+  response is received.
+
+* An alternate of the object is retrieved from an :term:`origin server` and
+  added to the object.
 
-* A conditional request is made to the origin server and a ``304 - Not Modified`` response is received.
-* An alternate of the object is retrieved from an origin server and added to the object.
 * An alternate of the object is removed (e.g., due to a ``DELETE`` request).
 
-In every case the metadata for the object must be modified. Because |TS| never updates data already in the cache this
-means the first ``Doc`` will be written to the cache again and the volume directory entry updated. Because a client
-request has already been processed the first ``Doc`` has been read from cache and is in memory. The alternate vector is
-updated as appropriate (an entry added or removed, or changed to contain the new HTTP headers), and then written to
-disk. It is possible for multiple alternates to be updated by different ``CacheVC`` instances at the same time. The only
-contention is the first ``Doc``, the rest of the data for each alternate is completely independent.
+In every case the metadata for the object must be modified. Because |TS| never
+updates data already in the cache this means the first ``Doc`` will be written
+to the cache again and the volume directory entry updated. Because a client
+request has already been processed the first ``Doc`` has been read from cache
+and is in memory. The alternate vector is updated as appropriate (an entry
+added or removed, or changed to contain the new HTTP headers), and then written
+to disk. It is possible for multiple alternates to be updated by different
+``CacheVC`` instances at the same time. The only contention is the first
+``Doc``; the rest of the data for each alternate is completely independent.
 
 .. _aggregation-buffer:
 
 Aggregation Buffer
 ------------------
 
-Disk writes to cache are handled through an *aggregation buffer*. There is one for each :cpp:class:`Vol` instance. To
-minimize the number of system calls data is written to disk in units of roughly :ref:`target fragment size
-<target-fragment-size>` bytes. The algorithm used is simple - data is piled up in the aggregation buffer until no more
-will fit without going over the target fragment size, at which point the buffer is written to disk and the volume
-directory entries for objects with data in the buffer are updated with the actual disk locations for those objects
-(which are determined by the write to disk action). After the buffer is written it is cleared and process repeats. There
-is a special lookup table for the aggregation buffer so that object lookup can find cache data in that memory.
-
-Because data in the aggregation buffer is visible to other parts of the cache, particularly `cache lookup`_, there is no
-need to push a partial filled aggregation buffer to disk. In effect any such data is effectively memory cached until
+Disk writes to cache are handled through an *aggregation buffer*. There is one
+for each :cpp:class:`Vol` instance. To minimize the number of system calls data
+is written to disk in units of roughly :ref:`target fragment size <target-fragment-size>`
+bytes. The algorithm used is simple: data is piled up in the aggregation buffer
+until no more will fit without going over the target fragment size, at which
+point the buffer is written to disk and the volume directory entries for objects
+with data in the buffer are updated with the actual disk locations for those
+objects (which are determined by the write to disk action). After the buffer is
+written it is cleared and the process repeats. There is a special lookup table for
+the aggregation buffer so that object lookup can find cache data in that memory.
+
+Because data in the aggregation buffer is visible to other parts of the cache,
+particularly `cache lookup`_, there is no need to push a partially filled
+aggregation buffer to disk. In effect, any such data is memory cached until
 enough additional cache content arrives to fill the buffer.
 
-The target fragment size has little effect on small objects because the fragment sized is used only to parcel out disk
-write operations. For larger objects the effect very significant as it causes those objects to be broken up in to
-fragments at different locations on in the volume. Each fragment write has its own entry in the volume directory which
-are computational chained (each cache key is computed from the previous one). If possible a fragment table is
-accumulated in the earliest ``Doc`` which has the offsets of the first byte for each fragment.
+The target fragment size has little effect on small objects because the fragment
+size is used only to parcel out disk write operations. For larger objects the
+effect is very significant as it causes those objects to be broken up into
+fragments at different locations in the volume. Each fragment write has its
+own entry in the volume directory, and these entries are computationally
+chained (each :term:`cache key` is computed from the previous one). If
+possible, a fragment table is accumulated in the earliest ``Doc`` which has
+the offsets of the first byte for each fragment.
+
+.. _evacuation-mechanics:
 
 Evacuation Mechanics
 --------------------
 
-By default the write cursor will overwrite (de facto evict from cache) objects as it proceeds once it has gone around
-the cache stripe at least once. In some cases this is not acceptable and the object is *evacuated* by reading it from
-the cache and then writing it back to cache which moves the physical storage of the object from in front of the write
-cursor to behind the write cursor. Objects that are evacuated are handled in this way based on data in stripe data
-structures (attached to the :cpp:class:`Vol` instance).
-
-Evacuation data structures are defined by dividing up the volume content in to a disjoint and contiguous set of regions
-of ``EVACUATION_BUCKET_SIZE`` bytes. The :cpp:member:`Vol::evacuate` member is an array with an element for each
-evacuation region. Each element is a doubly linked list of :cpp:class:`EvacuationBlock` instances. Each instance
-contains a :cpp:class:`Dir` that specifies the fragment to evacuate. It is assumed that an evacuation block is placed in
-the evacuation bucket (array element) that corresponds to the evacuation region in which the fragment is located
-although no ordering per bucket is enforced in the linked list (this sorting is handled during evacuation). Objects are
-evacuated by specifying the first or earliest fragment in the evactuation block. The evactuation operation will then
-continue the evacuation for subsequent fragments in the object by adding those fragments in evacuation blocks. Note that
-the actual evacuation of those fragments is delayed until the write cursor reaches the fragments, it is not necessarily
-done at the time the first / earliest fragment is evacuated.
-
-There are two types of evacuations, reader based and forced. The ``EvacuationBlock`` has a reader count to track this.
-If the reader count is zero, then it is a forced evacuation and the the target, if it exists, will be evacuated when the
-write cursor gets close. If the reader value is non-zero then it is a count of entities that are currently expecting to
-be able to read the object. Readers increment the count when they require read access to the object, or create the
-``EvacuationBlock`` with a count of 1. When a reader is finished with the object it decrements the count and removes the
-``EvacuationBlock`` if the count goes to zero. If the ``EvacuationBlock`` already exists with a count of zero, the count
-is not modified and the number of readers is not tracked, so the evacuation is valid as long as the object exists.
-
-Evacuation is driven by cache writes, essentially in :cpp:member:`Vol::aggWrite`. This method processes the pending
-cache virtual connections that are trying to write to the stripe. Some of these may be evacuation virtual connections.
-If so then the completion callback for that virtual connection is called as the data is put in to the aggregation
-buffer.
-
-When no more cache virtual connections can be processed (due to an empty queue or the aggregation buffer filling) then
-:cpp:member:`Vol::evac_range` is called to clear the range to be overwritten plus an additional
-:const:`EVACUA

<TRUNCATED>
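
The aggregation buffer behaviour described in the cache-arch changes above (pile data up until the next write would exceed the target fragment size, flush to disk, record the real disk locations in the directory, and keep buffered data visible to lookups) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Traffic Server code: the class name, the method names, the 1 MB default size, and the dict-based directory are all hypothetical and chosen only for illustration.

```python
class AggregationBuffer:
    """Toy model of one per-stripe aggregation buffer (all names hypothetical)."""

    def __init__(self, target_fragment_size=1 << 20):
        self.target = target_fragment_size
        self.pending = {}        # key -> bytes; doubles as the in-memory lookup table
        self.used = 0            # bytes currently held in the buffer
        self.disk = bytearray()  # stands in for the stripe content on disk
        self.directory = {}      # key -> (offset, length), filled in at flush time

    def write(self, key, data):
        # The caller is responsible for not exceeding the target fragment size.
        if len(data) > self.target:
            raise ValueError("write exceeds the target fragment size")
        # A rewrite of a still-buffered key replaces the earlier bytes.
        self.used -= len(self.pending.pop(key, b""))
        # Flush first if this write would push the buffer over the target.
        if self.used + len(data) > self.target:
            self.flush()
        self.pending[key] = bytes(data)
        self.used += len(data)

    def flush(self):
        # Disk locations are only known once the data is actually written.
        for key, data in self.pending.items():
            self.directory[key] = (len(self.disk), len(data))
            self.disk += data
        self.pending.clear()
        self.used = 0

    def lookup(self, key):
        # Buffered data is visible to readers, so a partial buffer need not be flushed.
        if key in self.pending:
            return self.pending[key]
        if key in self.directory:
            offset, length = self.directory[key]
            return bytes(self.disk[offset:offset + length])
        return None
```

With a deliberately tiny target of 10 bytes, writing 5 and then 4 bytes leaves both objects in memory with no directory entries. A third write of 3 bytes triggers the flush, after which the first two objects have directory entries and the third sits alone in the buffer.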

[3/3] trafficserver git commit: docs: update focused on architecture documentation

Posted by jp...@apache.org.
docs: update focused on architecture documentation


Project: http://git-wip-us.apache.org/repos/asf/trafficserver/repo
Commit: http://git-wip-us.apache.org/repos/asf/trafficserver/commit/aa37d0ab
Tree: http://git-wip-us.apache.org/repos/asf/trafficserver/tree/aa37d0ab
Diff: http://git-wip-us.apache.org/repos/asf/trafficserver/diff/aa37d0ab

Branch: refs/heads/master
Commit: aa37d0ab553ef2f187c4742544248695701202af
Parents: 1f1e2ae
Author: Jon Sime <js...@omniti.com>
Authored: Tue Nov 18 12:34:08 2014 -0800
Committer: James Peach <jp...@apache.org>
Committed: Wed Dec 10 13:35:40 2014 -0800

----------------------------------------------------------------------
 doc/admin/cluster-howto.en.rst                  |    2 +-
 doc/admin/configuring-cache.en.rst              |   94 +-
 doc/arch/cache/cache-api.en.rst                 |   24 +-
 doc/arch/cache/cache-appendix.en.rst            |  143 +-
 doc/arch/cache/cache-arch.en.rst                | 1284 +++++++++++-------
 doc/arch/cache/cache-data-structures.en.rst     |  117 +-
 doc/arch/cache/cache.en.rst                     |   28 +-
 doc/arch/cache/ram-cache.en.rst                 |  165 ++-
 doc/arch/cache/tier-storage.en.rst              |  165 ++-
 doc/arch/hacking/config-var-impl.en.rst         |  222 +--
 doc/arch/hacking/index.en.rst                   |   27 +-
 doc/arch/hacking/release-process.en.rst         |  132 +-
 doc/arch/index.en.rst                           |   35 +-
 doc/glossary.en.rst                             |   22 +
 .../configuration/records.config.en.rst         |    6 +-
 15 files changed, 1560 insertions(+), 906 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/admin/cluster-howto.en.rst
----------------------------------------------------------------------
diff --git a/doc/admin/cluster-howto.en.rst b/doc/admin/cluster-howto.en.rst
index 1c00616..a34ead0 100644
--- a/doc/admin/cluster-howto.en.rst
+++ b/doc/admin/cluster-howto.en.rst
@@ -140,7 +140,7 @@ cluster, for example::
     127.1.2.5:80
 
 After successfully joining a cluster, all changes of global configurations
-performed on any node in that cluster will take effect on **all** nodes, removing
+performed on any node in that cluster will take effect on all nodes, removing
 the need to manually duplicate configuration changes across each node individually.
 
 Deleting Nodes from a Cluster

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/admin/configuring-cache.en.rst
----------------------------------------------------------------------
diff --git a/doc/admin/configuring-cache.en.rst b/doc/admin/configuring-cache.en.rst
index dc009d2..22a31e0 100644
--- a/doc/admin/configuring-cache.en.rst
+++ b/doc/admin/configuring-cache.en.rst
@@ -21,8 +21,8 @@ Configuring the Cache
    under the License.
 
 The Traffic Server cache consists of a high-speed object database called
-the *object store* that indexes objects according to URLs and their
-associated headers.
+the :term:`object store` that indexes :term:`cache objects <cache object>`
+according to URLs and their associated headers.
 
 .. toctree::
    :maxdepth: 2
@@ -31,16 +31,16 @@ The Traffic Server Cache
 ========================
 
 The Traffic Server cache consists of a high-speed object database called
-the *object store*. The object store indexes objects according to URLs
-and associated headers. This enables Traffic Server to store, retrieve,
-and serve not only web pages, but also parts of web pages - which
-provides optimum bandwidth savings. Using sophisticated object
-management, the object store can cache alternate versions of the same
-object (versions may differ because of dissimilar language or encoding
-types). It can also efficiently store very small and very large
-documents, thereby minimizing wasted space. When the cache is full,
-Traffic Server removes stale data to ensure the most requested objects
-are kept readily available and fresh.
+the :term:`object store`. The object store indexes
+:term:`cache objects <cache object>` according to URLs and associated headers.
+This enables Traffic Server to store, retrieve, and serve not only web pages,
+but also parts of web pages - which provides optimum bandwidth savings. Using
+sophisticated object management, the object store can cache
+:term:`alternate` versions of the same object (versions may differ because of
+dissimilar language or encoding types). It can also efficiently store very
+small and very large documents, thereby minimizing wasted space. When the
+cache is full, Traffic Server removes :term:`stale` data to ensure the most
+requested objects are kept readily available and fresh.
 
 Traffic Server is designed to tolerate total disk failures on any of the
 cache disks. If the disk fails completely, then Traffic Server marks the
@@ -50,11 +50,15 @@ fail, then Traffic Server goes into proxy-only mode.
 
 You can perform the following cache configuration tasks:
 
--  Change the total amount of disk space allocated to the cache: refer
+-  Change the total amount of disk space allocated to the cache; refer
    to `Changing Cache Capacity`_.
+
 -  Partition the cache by reserving cache disk space for specific
-   protocols and origin servers/domains; refer to `Partitioning the Cache`_.
+   protocols and :term:`origin servers/domains <origin server>`; refer to
+   `Partitioning the Cache`_.
+
 -  Delete all data in the cache; refer to `Clearing the Cache`_.
+
 -  Override cache directives for a requested domain name, regex on a url,
    hostname or ip, with extra filters for time, port, method of the request,
    and more. ATS can be configured to never cache, always cache,
@@ -85,7 +89,7 @@ resistance against this problem.
 In addition, *CLFUS* also supports compressing in the RAM cache itself.
 This can be useful for content which is not compressed by itself (e.g.
 images). This should not be confused with ``Content-Encoding: gzip``, this
-feature is only thereto save space internally in the RAM cache itself. As
+feature is only present to save space internally in the RAM cache itself. As
 such, it is completely transparent to the User-Agent. The RAM cache
 compression is enabled with the option
 :ts:cv:`proxy.config.cache.ram_cache.compress`.
@@ -101,7 +105,6 @@ Value   Meaning
 3       *liblzma* compression
 ======= =============================
 
-
 .. _changing-the-size-of-the-ram-cache:
 
 Changing the Size of the RAM Cache
@@ -109,10 +112,10 @@ Changing the Size of the RAM Cache
 
 Traffic Server provides a dedicated RAM cache for fast retrieval of
 popular small objects. The default RAM cache size is automatically
-calculated based on the number and size of the cache partitions you have
-configured. If you've partitioned your cache according to protocol
-and/or hosts, then the size of the RAM cache for each partition is
-proportional to the size of that partition.
+calculated based on the number and size of the
+:term:`cache partitions <cache partition>` you have configured. If you've
+partitioned your cache according to protocol and/or hosts, then the size of
+the RAM cache for each partition is proportional to the size of that partition.
 
 You can increase the RAM cache size for better cache hit performance.
 However, if you increase the size of the RAM cache and observe a
@@ -124,10 +127,12 @@ its previous value.
 To change the RAM cache size:
 
 #. Stop Traffic Server.
+
 #. Set the variable :ts:cv:`proxy.config.cache.ram_cache.size`
    to specify the size of the RAM cache. The default value of ``-1`` means
    that the RAM cache is automatically sized at approximately 1MB per
    gigabyte of disk.
+
 #. Restart Traffic Server. If you increase the RAM cache to a size of
    1GB or more, then restart with the :program:`trafficserver` command
    (refer to :ref:`start-traffic-server`).
@@ -146,9 +151,12 @@ To increase the total amount of disk space allocated to the cache on
 existing disks, or to add new disks to a Traffic Server node:
 
 #. Stop Traffic Server.
+
 #. Add hardware, if necessary.
+
 #. Edit :file:`storage.config` to increase the amount of disk space allocated
    to the cache on existing disks or describe the new hardware you are adding.
+
 #. Restart Traffic Server.
 
 Reducing Cache Capacity
@@ -158,9 +166,12 @@ To reduce the total amount of disk space allocated to the cache on an
 existing disk, or to remove disks from a Traffic Server node:
 
 #. Stop Traffic Server.
+
 #. Remove hardware, if necessary.
+
 #. Edit :file:`storage.config` to reduce the amount of disk space allocated
    to the cache on existing disks or delete the reference to the hardware you're removing.
+
 #. Restart Traffic Server.
 
 .. important:: In :file:`storage.config`, a formatted or raw disk must be at least 128 MB.
@@ -171,27 +182,32 @@ Partitioning the Cache
 ======================
 
 You can manage your cache space more efficiently and restrict disk usage
-by creating cache volumes with different sizes for specific protocols.
-You can further configure these volumes to store data from specific
-origin servers and/or domains. The volume configuration must be the same
-on all nodes in a :ref:`cluster <traffic-server-cluster>`.
+by creating :term:`cache volumes <cache volume>` with different sizes for
+specific protocols. You can further configure these volumes to store data from
+specific :term:`origin servers <origin server>` and/or domains. The volume
+configuration must be the same on all nodes in a :ref:`cluster <traffic-server-cluster>`.
 
 Creating Cache Partitions for Specific Protocols
 ------------------------------------------------
 
-You can create separate volumes for your cache that vary in size to
-store content according to protocol. This ensures that a certain amount
-of disk space is always available for a particular protocol. Traffic
-Server currently supports the ``http`` partition type for HTTP objects.
-
-.. XXX: but not https?
+You can create separate :term:`volumes <cache volume>` for your cache that vary
+in size to store content according to protocol. This ensures that a certain
+amount of disk space is always available for a particular protocol. Traffic
+Server currently supports only the ``http`` partition type.
 
 To partition the cache according to protocol:
 
-#. Enter a line in the :file:`volume.config` file for
-   each volume you want to create
+#. Enter a line in :file:`volume.config` for each volume you want to create. ::
+
+    volume=1 scheme=http size=50%
+    volume=2 scheme=http size=50%
+
 #. Restart Traffic Server.
 
+.. important::
+
+    Volume definitions must be the same across all nodes in a cluster.
+
 Making Changes to Partition Sizes and Protocols
 -----------------------------------------------
 
@@ -201,13 +217,17 @@ note the following:
 
 -  You must stop Traffic Server before you change the cache volume size
    and protocol assignment.
+
 -  When you increase the size of a volume, the contents of the volume
    are *not* deleted. However, when you reduce the size of a volume, the
    contents of the volume *are* deleted.
+
 -  When you change the volume number, the volume is deleted and then
    recreated, even if the size and protocol type remain the same.
+
 -  When you add new disks to your Traffic Server node, volume sizes
    specified in percentages will increase proportionately.
+
 -  Substantial changes to volume sizes can result in disk fragmentation,
    which affects performance and cache hit rate. You should clear the cache
    before making many changes to cache volume sizes (refer to `Clearing the Cache`_).
@@ -232,11 +252,11 @@ then Traffic Server will run in proxy-only mode.
 
 .. note::
 
-    You do not need to stop Traffic Server before you assign
-    volumes to particular hosts or domains. However, this type of
-    configuration is time-consuming and can cause a spike in memory usage.
-    Therefore, it's best to configure partition assignment during periods of
-    low traffic.
+    You do not need to stop Traffic Server before you assign volumes
+    to particular hosts or domains. However, this type of configuration
+    is time-consuming and can cause a spike in memory usage.
+    Therefore, it's best to configure partition assignment during
+    periods of low traffic.
 
 To partition the cache according to hostname and domain:
 

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/cache-api.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/cache-api.en.rst b/doc/arch/cache/cache-api.en.rst
index 7a9963e..2cdd615 100644
--- a/doc/arch/cache/cache-api.en.rst
+++ b/doc/arch/cache/cache-api.en.rst
@@ -5,9 +5,9 @@
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at
-   
+
    http://www.apache.org/licenses/LICENSE-2.0
-   
+
    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -23,20 +23,28 @@ Cache Related API functions
 
 .. c:function:: void TSHttpTxnReqCacheableSet(TSHttpTxn txnp, int flag)
 
-   Set a *flag* that marks a request as cacheable. This is a positive override only, setting *flag* to 0 restores the default behavior, it does not force the request to be uncacheable.
+   Set a flag that marks a request as cacheable. This is a positive override
+   only: setting :c:arg:``flag`` to ``0`` restores the default behavior but does not
+   force the request to be uncacheable.
 
 .. c:function:: TSReturnCode TSCacheUrlSet(TSHttpTxn txnp, char const* url, int length)
 
-  Set the cache key for the transaction *txnp* as the string pointed at by *url* of *length* characters. It need not be ``null`` terminated. This should be called from ``TS_HTTP_READ_REQUEST_HDR_HOOK`` which is before cache lookup but late enough that the HTTP request header is available.
+   Set the cache key for the transaction :c:arg:``txnp`` as the string pointed at by
+   :c:arg:``url`` of :c:arg:``length`` characters. It need not be NUL-terminated. This should
+   be called from ``TS_HTTP_READ_REQUEST_HDR_HOOK`` which is before cache lookup
+   but late enough that the HTTP request header is available.
 
 ===============
 Cache Internals
 ===============
 
-.. cpp:function::    int DIR_SIZE_WITH_BLOCK(int big)
+.. cpp:function:: int DIR_SIZE_WITH_BLOCK(int big)
 
-   A preprocessor macro which computes the maximum size of a fragment based on the value of *big*. This is computed as if the argument where the value of the *big* field in a struct :cpp:class:`Dir`.
+   A preprocessor macro which computes the maximum size of a fragment based on
+   the value of :cpp:arg:``big``. This is computed as if the argument were the value of
+   the :cpp:arg:``big`` field in a struct :cpp:class:`Dir`.
 
-.. cpp:function::    int DIR_BLOCK_SIZE(int big)
+.. cpp:function:: int DIR_BLOCK_SIZE(int big)
 
-   A preprocessor macro which computes the block size multiplier for a struct :cpp:class:`Dir` where *big* is the *big* field value.
+   A preprocessor macro which computes the block size multiplier for a struct
+   :cpp:class:`Dir` where :cpp:arg:``big`` is the :cpp:arg:``big`` field value.

http://git-wip-us.apache.org/repos/asf/trafficserver/blob/aa37d0ab/doc/arch/cache/cache-appendix.en.rst
----------------------------------------------------------------------
diff --git a/doc/arch/cache/cache-appendix.en.rst b/doc/arch/cache/cache-appendix.en.rst
index 2b69adc..54eda3a 100644
--- a/doc/arch/cache/cache-appendix.en.rst
+++ b/doc/arch/cache/cache-appendix.en.rst
@@ -5,9 +5,9 @@
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at
-   
+
    http://www.apache.org/licenses/LICENSE-2.0
-   
+
    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
@@ -33,67 +33,142 @@ Topics to be done
 Cache Consistency
 ~~~~~~~~~~~~~~~~~
 
-The cache is completely consistent, up to and including kicking the power cord out, if the write buffer on consumer disk drives is disabled. You need to use::
+The cache is completely consistent, up to and including kicking the power cord
+out, if the write buffer on consumer disk drives is disabled. You need to use::
 
   hdparm -W0
 
-The cache validates that all the data for the document is available and will silently mark a partial document as a "miss" on read. There is no "gentle" shutdown for traffic server, you just kill the process, so the "recovery" code (fsck) is run every time traffic server starts up.
+The cache validates that all the data for the document is available and will
+silently mark a partial document as a miss on read. There is no gentle
+shutdown for Traffic Server. You simply kill the process and the recovery code
+(fsck) is run every time Traffic Server starts up.
 
-On startup the two versions of the index are checked, and the last valid one is read into memory. Then traffic server moves forward from the last snapped write cursor and reads all the fragments written to disk, and updates the directory (as in a log-based file system). It stops reading at the write before the last valid write header it sees (as a write is not necessarily atomic because of sector reordering). Then the new updated index is written to the invalid version (in case of a crash during startup) and the system starts.
+On startup the two versions of the index are checked, and the last valid one is
+read into memory. |TS| then moves forward from the last snapped write
+cursor and reads all the fragments written to disk and updates the directory
+(as in a log-based file system). It stops reading at the write before the last
+valid write header it sees (as a write is not necessarily atomic because of
+sector reordering). Then the new updated index is written to the invalid
+version (in case of a crash during startup) and the system starts.
 
 .. _volume tagging:
 
 Volume Tagging
 ~~~~~~~~~~~~~~
 
-Currently cache volumes are allocated somewhat arbitrarily from storage elements. `This enhancement <https://issues.apache.org/jira/browse/TS-1728>`__ allows the :file:`storage.config` file to assign storage units to specific volumes although the volumes must still be listed in :file:`volume.config` in general and in particular to map domains to specific volumes. A primary use case for this is to be able to map specific types of content to different storage elements. This could to have different storage devices for the content (SSD vs. rotational).
+Currently, :term:`cache volumes <cache volume>` are allocated somewhat
+arbitrarily from storage elements. `This enhancement <https://issues.apache.org/jira/browse/TS-1728>`__
+allows :file:`storage.config` to assign :term:`storage units <storage unit>` to
+specific :term:`volumes <cache volume>` although the volumes must still be
+listed in :file:`volume.config` in general and in particular to map domains to
+specific volumes. A primary use case for this is to be able to map specific
+types of content to different storage elements. This can be employed to have
+different storage devices for various types of content (SSD vs. rotational).
 
 ---------------
 Version Upgrade
 ---------------
 
-It is currently the case that any change to the cache format will clear the cache. This is an issue when upgrading the |TS| version and should be kept in mind.
+It is currently the case that any change to the cache format will clear the
+cache. This is an issue when upgrading the |TS| version and should be kept in mind.
 
-.. cache-key:
+.. _cache-key:
 
 -------------------------
 Controlling the cache key
 -------------------------
 
-The cache key is by default the URL of the request. There are two possible choices, the original ("pristine") URL and the remapped URL. Which of these is used is determined by the configuration value  :ts:cv:`proxy.config.url_remap.pristine_host_hdr`.
-
-This is an ``INT`` value. If set to ``0`` (disabled) then the remapped URL is used, and if it is not ``0`` (enabled) then the original URL is used. This setting also controls the value of the ``HOST`` header that is placed in the request sent to the origin server, using hostname from the original URL if non-``0`` and the host name from the remapped URL if ``0``. It has no other effects.
-
-For caching, this setting is irrelevant if no remapping is done or there is a one to one mapping between the original and remapped URLs.
-
-It becomes significant if multiple original URLs are mapped to the same remapped URL. If pristine headers are enabled requests to different original URLs will be stored as distinct objects in the cache. If disabled the remapped URL will be used and there may be collisions. This is bad if the contents different but quite useful if they are the same (e.g., the original URLs are just aliases for the same underlying server).
-
-This is also an issue if a remapping is changed because it is effectively a time axis version of the previous case. If an original URL is remapped to a different server address then the setting determines if existing cached objects will be served for new requests (enabled) or not (disabled). Similarly if the original URL mapped to a particular URL is changed then cached objects from the initial original URL will be served from the updated original URL if pristine headers is disabled.
-
-These collisions are not of themselves good or bad. An administrator needs to decide which is appropriate for their situation and set the value correspondingly.
-
-If a greater degree of control is desired a plugin must be used to invoke the API call :c:func:`TSCacheUrlSet()` to provide a specific cache key.  The :c:func:`TSCacheUrlSet()` API can be called as early as ``TS_HTTP_READ_REQUEST_HDR_HOOK``, but no later than ``TS_HTTP_POST_REMAP_HOOK``. It can be called only once per transaction; calling it multiple times has no additional effect.
-
-A plugin that changes the cache key *must* do so consistently for both cache hit and cache miss requests because two different requests that map to the same cache key will be considered equivalent by the cache. Use of the URL directly provides this and so must any substitute. This is entirely the responsibility of the plugin, there is no way for the |TS| core to detect such an occurrence.
-
-If :c:func:`TSHttpTxnCacheLookupUrlGet()` is called after new cache url set by :c:func:`TSCacheUrlSet()`, it should use a URL location created by :c:func:`TSUrlCreate()` as its 3rd input parameter instead of getting url_loc from client request.
-
-It is a requirement that the string be syntactically a URL but otherwise it is completely arbitrary and need not have any path. For instance if the company Network Geographics wanted to store certain content under its own cache key, using a document GUID as part of the key, it could use a cache key like ::
+The :term:`cache key` is by default the URL of the request. There are two
+possible choices, the original (pristine) URL and the remapped URL. Which of
+these is used is determined by the configuration value
+:ts:cv:`proxy.config.url_remap.pristine_host_hdr`.
+
+This is an ``INT`` value. If set to ``0`` (disabled) then the remapped URL is
+used, and if it is not ``0`` (enabled) then the original URL is used. This
+setting also controls the value of the ``HOST`` header that is placed in the
+request sent to the :term:`origin server`, using the hostname from the original
+URL if not ``0`` and the host name from the remapped URL if ``0``. It has no
+other effects.
+
+For caching, this setting is irrelevant if no remapping is done or there is a
+one-to-one mapping between the original and remapped URLs.
+
+It becomes significant if multiple original URLs are mapped to the same
+remapped URL. If pristine headers are enabled, requests to different original
+URLs will be stored as distinct :term:`objects <cache object>` in the cache. If
+disabled, the remapped URL will be used and there may be collisions. This is
+bad if the contents differ, but quite useful if they are the same (as in
+situations where the original URLs are just aliases for the same underlying
+server resource).
+
+This is also an issue if a remapping is changed because it is effectively a
+time axis version of the previous case. If an original URL is remapped to a
+different server address then the setting determines if existing cached objects
+will be served for new requests (enabled) or not (disabled). Similarly, if the
+original URL mapped to a particular URL is changed then objects cached under the
+initial original URL will be served for requests to the updated original URL if
+pristine headers are disabled.
+
+These collisions are not by themselves good or bad. An administrator needs to
+decide which is appropriate for their situation and set the value correspondingly.
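The effect of the setting on collisions can be sketched in a few lines of C. This is a simplified model for illustration only, not Traffic Server source code: the functions ``default_cache_key`` and ``keys_collide`` are hypothetical names, and the integer argument stands in for the value of ``proxy.config.url_remap.pristine_host_hdr``.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model, not Traffic Server source code.
 * pristine_host_hdr mirrors proxy.config.url_remap.pristine_host_hdr:
 * non-zero selects the original (pristine) URL, zero selects the remapped URL. */
static const char *
default_cache_key(int pristine_host_hdr, const char *original_url, const char *remapped_url)
{
  assert(original_url != NULL && remapped_url != NULL);
  return pristine_host_hdr != 0 ? original_url : remapped_url;
}

/* Returns 1 when two original URLs that remap to the same target would
 * share one cache object under the given setting, 0 otherwise. */
static int
keys_collide(int pristine_host_hdr, const char *original_a, const char *original_b, const char *remapped)
{
  return strcmp(default_cache_key(pristine_host_hdr, original_a, remapped),
                default_cache_key(pristine_host_hdr, original_b, remapped)) == 0;
}
```

In this model, two aliases that remap to the same URL collide when the setting is ``0`` and stay distinct when it is non-zero, which matches the behavior described above.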
+
+If a greater degree of control is desired, a plugin must be used to invoke the
+API call :c:func:`TSCacheUrlSet()` to provide a specific :term:`cache key`. The
+:c:func:`TSCacheUrlSet()` API can be called as early as
+``TS_HTTP_READ_REQUEST_HDR_HOOK`` but no later than ``TS_HTTP_POST_REMAP_HOOK``.
+It can be called only once per transaction; calling it multiple times has no
+additional effect.
+
+A plugin that changes the cache key must do so consistently for both cache hit
+and cache miss requests because two different requests that map to the same
+cache key will be considered equivalent by the cache. Use of the URL directly
+provides this and so must any substitute. This is entirely the responsibility
+of the plugin; there is no way for the |TS| core to detect such an occurrence.
+
+If :c:func:`TSHttpTxnCacheLookupUrlGet()` is called after a new cache URL has
+been set by :c:func:`TSCacheUrlSet()`, it should use a URL location created by
+:c:func:`TSUrlCreate()` as its third input parameter instead of getting
+``url_loc`` from the client request.
+
+It is a requirement that the string be syntactically a URL but otherwise it is
+completely arbitrary and need not have any path. For instance, if the company
+Network Geographics wanted to store certain content under its own
+:term:`cache key`, using a document GUID as part of the key, it could use a
+cache key like ::
 
    ngeo://W39WaGTPnvg
 
-The scheme ``ngeo`` was picked because it is *not* a valid URL scheme and so will not collide with any valid URL.
+The scheme ``ngeo`` was picked specifically because it is not a valid URL
+scheme, and so will never collide with any valid URL.
 
-This can be useful if the URL encodes both important and unimportant data. Instead of storing potentially identical content under different URLs (because they differ on the unimportant parts) a url containing only the important parts could be created and used.
+This can be useful if the URL encodes both important and unimportant data.
+Instead of storing potentially identical content under different URLs (because
+they differ on the unimportant parts), a URL containing only the important
+parts could be created and used.
 
-For example, suppose the URL for Network Geographics content encoded both the document GUID and a referral key. ::
+For example, suppose the URL for Network Geographics content encoded both the
+document GUID and a referral key. ::
 
    http://network-geographics-farm-1.com/doc/W39WaGTPnvg.2511635.UQB_zCc8B8H
 
-We don't want to the same content for every possible referrer. Instead we could use a plugin to convert this to the previous example and requests that differed only in the referrer key would all reference the same cache entry. Note that we would also map ::
+We don't want to store the same content for every possible referrer. Instead,
+we could use a plugin to convert this to the previous example, and requests that
+differed only in the referral key would all reference the same cache entry.
+Note that we would also map the following to the same cache key ::
 
    http://network-geographics-farm-56.com/doc/W39WaGTPnvg.2511635.UQB_zCc8B8H
 
-to the same cache key. This can be handy for "sharing" content between servers when that content is identical. Note also the plugin can change the cache key or not depending on any data in the request header, for instance not changing the cache key if the request is not in the ``doc`` directory. If distinguishing servers is important that can easily be pulled from the request URL and used in the synthetic cache key. The implementor is free to extract all relevant elements for use in the cache key.
-
-While there is explicit no requirement that the synthetic cache key be based on the HTTP request header, in practice it is generally necessary due to the consistency requirement. Because cache lookup happens before attempting to connect to the origin server no data from the HTTP response header is available, leaving only the request header. The most common case is the one described above where the goal is to elide elements of the URL that do not affect the content to minimize cache footprint and improve cache hit rates.
+This can be handy for sharing content between servers when that content is
+identical. Plugins can change the cache key, or not, depending on any data in
+the request header. For instance, a plugin could leave the cache key unchanged
+if the request is not in the ``doc`` directory. If distinguishing servers is
+important, the server name can easily be pulled from the request URL and used in
+the synthetic cache key. The implementor is free to extract all relevant
+elements for use in the cache key.
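The key derivation for the Network Geographics example might look roughly like the following. This is a minimal sketch and not part of the Traffic Server API: ``synthetic_cache_key`` is a hypothetical helper, and it assumes the URL layout ``scheme://host/doc/GUID.rest`` suggested by the example URLs, with the GUID being everything between ``/doc/`` and the first ``.``. It covers only the string handling; hook registration and the call to :c:func:`TSCacheUrlSet()` are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the Traffic Server API.
 * Assumes the layout scheme://host/doc/GUID.rest, as in the example URLs.
 * Writes "ngeo://GUID" into key and returns 1. Returns 0 and leaves key
 * untouched when the URL is not under /doc/, the GUID is empty, or the
 * result would not fit in key_size bytes. */
static int
synthetic_cache_key(const char *url, char *key, size_t key_size)
{
  assert(url != NULL && key != NULL);

  /* Skip the scheme and host so that only the path is inspected. */
  const char *authority = strstr(url, "://");
  if (authority == NULL)
    return 0;
  const char *path = strchr(authority + 3, '/');
  if (path == NULL || strncmp(path, "/doc/", 5) != 0)
    return 0; /* not a document request: keep the default cache key */

  /* The GUID runs up to the first separator after "/doc/". */
  const char *guid = path + 5;
  size_t guid_len  = strcspn(guid, "./?#");
  if (guid_len == 0)
    return 0;

  /* sizeof("ngeo://") counts the terminating NUL as well. */
  if (guid_len + sizeof("ngeo://") > key_size)
    return 0;

  snprintf(key, key_size, "ngeo://%.*s", (int)guid_len, guid);
  return 1;
}
```

Because the host name is ignored, the ``farm-1`` and ``farm-56`` URLs above both yield ``ngeo://W39WaGTPnvg``, while a request outside the ``doc`` directory returns ``0`` so the plugin can leave the default key alone. A plugin would pass the resulting string to :c:func:`TSCacheUrlSet()` from one of the permitted hooks.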
+
+While there is no explicit requirement that the synthetic cache key be based on
+the HTTP request header, in practice it is generally necessary due to the
+consistency requirement. Because cache lookup happens before attempting to
+connect to the :term:`origin server`, no data from the HTTP response header is
+available, leaving only the request header. The most common case is the one
+described above where the goal is to elide elements of the URL that do not
+affect the content to minimize cache footprint and improve cache hit rates.