You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@trafficserver.apache.org by am...@apache.org on 2019/02/01 23:07:50 UTC

[trafficserver] branch master updated: Doc: More HdrHeap documentation.

This is an automated email from the ASF dual-hosted git repository.

amc pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/trafficserver.git


The following commit(s) were added to refs/heads/master by this push:
     new a1ccd7a  Doc: More HdrHeap documentation.
a1ccd7a is described below

commit a1ccd7af373bf334d27814ad6afa196382f256af
Author: Alan M. Carroll <am...@apache.org>
AuthorDate: Wed Jan 30 20:18:17 2019 -0600

    Doc: More HdrHeap documentation.
---
 doc/developer-guide/core-architecture/heap.en.rst | 65 +++++++++++++++++++++--
 1 file changed, 62 insertions(+), 3 deletions(-)

diff --git a/doc/developer-guide/core-architecture/heap.en.rst b/doc/developer-guide/core-architecture/heap.en.rst
index 5c3d397..188b2c9 100644
--- a/doc/developer-guide/core-architecture/heap.en.rst
+++ b/doc/developer-guide/core-architecture/heap.en.rst
@@ -121,6 +121,15 @@ Classes
 
       :arg:`type` must be one of the values specified in :class:`HdrHeapObjImpl`.
 
+   .. function:: int marshal_length
+
+      Compute and return the size of the buffer needed to serialize :arg:`this`.
+
+   .. function:: int marshal(char * buffer, int length)
+
+      Serialize :arg:`this` to :arg:`buffer` of size :arg:`length`. It is required that
+      :arg:`length` be at least the value returned by :func:`HdrHeap::marshal_length`.
+
 .. function:: HdrHeap * new_HdrHeap(int n)
 
    Create and return a new instance of :class:`HdrHeap`. If :arg:`n` is less than ``HDR_HEAP_DEFAULT_SIZE``
@@ -148,8 +157,6 @@ collection operation on the writeable string heap in the header heap by calling
 
 *  The amount of dead space in the writable string heap exceeds ``MAX_LOST_STR_SPACE``.
 
-*  The header heap is preparing to be serialized.
-
 *  An external string heap is being added and all current read only string heap slots are used.
 
 Each heap object is responsible for providing a :code:`move_strings` method which copies its strings
@@ -188,6 +195,58 @@ try.
 Once space is found for the object, the base members of :class:`HdrHeapObjImpl` are initialized with
 the objec type and size, with the :arg:`m_obj_flags` set to 0.
 
+Serialization
+-------------
+
+Because heaps store the HTTP request / response data, a header heap needs to be serialized to be put
+in to the cache. For performance reasons, it is desirable to be able to unserialize the serialized
+data in place, rather than copying it again. That is, the data is read from disk into a block of
+memory and then that memory is converted to a live data structure. In this case the memory used by
+the heap is owned by some other object and the header heap must not do any clean up. This is
+signaled by the `m_writeable` flag. In an unserialized header heap this is set to ``false`` and such
+a header heap is not allowed to allocate any additional objects or strings - it is immutable.
+
+The primary mechanism to do this is to use swizzling on the pointers in the structure. During
+serialization pointers are converted to offsets and during unserialization these offsets are
+converted back to pointers. To make this simpler, unserialized header heaps are marked read only so
+that updating does not have to be supported. Additionally, :class:`HdrHeap` is a POD and therefore
+has no virtual function table pointer to be stored or restored [#]_.
+
+To serialize, first :func:`HdrHeap::marshal_length` is called to get a buffer size. The
+serialization buffer is created with sufficient space for the header heap and that space is passed
+to :func:`HdrHeap::marshal` to perform the actual serialization. The object heaps are serialized
+followed by the string heaps. No coalescence is done, on the presumption that because the amount
+of dead space is limited by coalescence (as needed) on every string creation.
+
+When serializing strings, each object is responsible for swizzling its own pointers. Because the
+object heaps have already been serialized and all of the header heap object types are also PODs,
+these serialized objects can have the pointer swizzling method, :code:`marshal`, called directly
+on them. This method is provided with a set of "translations" which indicate the base offset for
+each range of object and string heap memory. The object marshalling can then compute the correct
+offset to store for each live string pointer.
+
+Inheriting Strings
+------------------
+
+The string heaps are designed to be reference counted so that they can be shared as read only
+objects between heaps. This enables copying heap objects between heaps less expensive as the
+strings pointers in them can be preserved in the new heap by sharing the string heaps in which
+those strings reside.
+
+This can still be a bit complex as it is possible that the combined number of string heaps is more
+than the limit. In this case, the target header heap does string coalescence so that it is reduced to
+having a single writeable string heap with enough free space to hold all of the strings in the
+source header heap. As a result, it is required that all heap objects already be present in the
+target header heap before the strings are inherited. This means that the string coalescence will
+properly copy the strings of and update the strings pointers in the copied heap objects.
+
 .. rubric:: Footnotes.
 
-.. [#] Not that I can see any good reason for that, if virtual methods instead of :code:`switch` statements were used.
+.. [#]
+
+   Not that I can see any good reason for that, if virtual methods instead of :code:`switch`
+   statements were used.
+
+.. [#]
+
+   Which makes the initialization logic to "fixup" the virtual function pointer rather silly.