| .. Licensed to the Apache Software Foundation (ASF) under one or more contributor license |
| agreements. See the NOTICE file distributed with this work for additional information regarding |
| copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with the License. You may obtain |
| a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software distributed under the License |
| is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express |
| or implied. See the License for the specific language governing permissions and limitations |
| under the License. |
| |
| .. include:: ../../common.defs |
| .. highlight:: cpp |
| .. default-domain:: cpp |
| |
| .. _core-hdr-heap: |
| |
| Header Heap |
| *********** |
| |
| Memory for HTTP header data is kept in :term:`header heap`\s. |
| |
| Classes |
| ======= |
| |
| .. class:: HdrHeapObjImpl |
| |
| This is the abstract base class for objects allocated in a :class:`HdrHeap`. This allows updating |
| objects in a heap in a generic way, without having to locate all of the pointers to the objects. |
| |
| The type of an instance stored in a heap must be one of the following values. |
| |
| .. enumerator:: HDR_HEAP_OBJ_EMPTY = 0 |
| |
| Used to mark invalid objects, ones not yet constructed or ones that have been destroyed. |
| |
| .. enumerator:: HDR_HEAP_OBJ_RAW = 1 |
| |
| Some sort of raw object, I have no idea. |
| |
| .. enumerator:: HDR_HEAP_OBJ_URL = 2 |
| |
| A URL object. |
| |
| .. enumerator:: HDR_HEAP_OBJ_HTTP_HEADER = 3 |
| |
| The header for an HTTP request or response. |
| |
| .. enumerator:: HDR_HEAP_OBJ_MIME_HEADER = 4 |
| |
| A MIME header, containing MIME style fields with names and values. |
| |
| .. enumerator:: HDR_HEAP_OBJ_FIELD_BLOCK = 5 |
| |
| Who the heck knows? |
| |
| .. class:: HdrStrHeap |
| |
| This is a :term:`variable sized class`, therefore new instance must be created by :func:`new_HdrStrHeap` |
| and deallocated by the :code:`destroy` method. |
| |
| .. function:: HdrStrHeap * new_HdrStrHeap(int n) |
| |
| Create and return a new instance of :class:`HdrStrHeap`. If :arg:`n` is less than ``HDR_STR_HEAP_DEFAULT_SIZE`` |
| it is increased to that value. |
| |
| If the allocated size is ``HDR_STR_HEAP_DEFAULT_SIZE`` (or smaller and upsized to that value) then |
| the instance is allocated from a thread local pool via :code:`strHeapAllocator`. If larger it |
| is allocated from global memory via :code:`ats_malloc`. |
| |
| .. class:: HdrHeap |
| |
| This is a :term:`variable sized class` and therefore new instances must be created by :func:`new_HdrHeap` |
| and deallocated by the :code:`destroy` method. |
| |
| :class:`HdrHeap` manages memory for heap objects directly and memory for strings via ancillary |
| heaps (which are instances of :class:`HdrStrHeap`). For the string heaps there is at most one |
| writeable heap, and up to :code:`HDR_BUF_RONLY_HEAPS` read only heaps. |
| |
| All objects in the internal heap must be subclasses of :class:`HdrHeapObjImpl`. |
| |
| .. function:: size_t required_space_for_evacuation() |
| |
| Calculate and return the total live string space for :arg:`this`. |
| |
| .. function:: void evacuate_from_str_heaps(HdrStrHeap * new_heap) |
| |
| Copy all live strings from the heap objects in :arg:`this` to :arg:`new_heap`. |
| |
| .. function:: void coalesce_str_heaps(int incoming_size) |
| |
| This garbage collects the string heaps in a half space style, by creating a new string space |
| (string heap), copying all of the strings there, and then discarding the existing string heaps. |
| |
| The total amount of live string space is calculated by |
| :func:`HdrHeap::required_space_for_evacuation` and a new string heap is created of a size at |
| least as large as the live string space plus :arg:`incoming_size` bytes. |
| |
| All of the live strings are moved to the new string heap by |
| :func:`HdrHeap::evacuate_from_str_heaps`, the existing string heaps are deallocated, and the |
| new string heap becomes the writeable string heap for the header heap. The end result is a |
| single writeable string heap and no read only string heaps, with all live strings resident in |
| that writeable string heap. |
| |
| .. function:: char * allocate_str(int bytes) |
| |
| Allocate :arg:`nbytes` of space for a string in the writeable string heap. A pointer to the |
| first byte is returned, or ``nullptr`` if the space could not be allocated. |
| |
| .. function:: HdrHeapObjImpl * allocate_obj(int nbytes, int type) |
| |
| Allocate a :arg:`type` object that is :arg:`nbytes` in size in the heap and return a pointer |
| to it, or ``nullptr`` if the object could not be allocated. |
| |
| :arg:`nbytes` must be at most ``HDR_MAX_ALLOC_SIZE``. |
| |
| The members of :class:`HdrHeapObjImpl` are initialized. Further initialization is the |
| responsibility of the caller. |
| |
| :arg:`type` must be one of the values specified in :class:`HdrHeapObjImpl`. |
| |
| .. function:: int marshal_length() |
| |
| Compute and return the size of the buffer needed to serialize :arg:`this`. |
| |
| .. function:: int marshal(char * buffer, int length) |
| |
| Serialize :arg:`this` to :arg:`buffer` of size :arg:`length`. It is required that |
| :arg:`length` be at least the value returned by :func:`HdrHeap::marshal_length`. |
| |
| .. function:: HdrHeap * new_HdrHeap(int n) |
| |
| Create and return a new instance of :class:`HdrHeap`. If :arg:`n` is less than ``HdrHeap::DEFAULT_SIZE`` |
| it is increased to that value. |
| |
| If the allocated size is ``HdrHeap::DEFAULT_SIZE`` (or smaller and upsized to that value) then |
| the instance is allocated from a thread local pool via :code:`hdrHeapAllocator`. If larger it |
| is allocated from global memory via :code:`ats_malloc`. |
| |
| .. topic:: Header Heap Class Structure |
| |
| .. figure:: /uml/images/hdr-heap-class.svg |
| |
| |
| Implementation |
| ============== |
| |
| String Coalescence |
| ------------------ |
| |
| String heaps do not maintain lists of internal free space. Strings that are released are left in |
| place, creating dead space in the heap. For this reason it can become necessary to do a garbage |
| collection operation on the writeable string heap in the header heap by calling |
| :func:`HdrHeap::coalesce_str_heaps`. This is done when |
| |
| * The amount of dead space in the writable string heap exceeds ``MAX_LOST_STR_SPACE``. |
| |
| * An external string heap is being added and all current read only string heap slots are used. |
| |
| The mechanism is simple in design - the size of the live string data in the current string heaps is |
| calculated and a new heap is allocated sufficient to contain all existing strings, with additional |
| space for new string data. Each heap object is required to provide a :code:`strings_length` method |
| which returns the size of the live string data for that object (recursively as needed). The strings |
| are copied to the new string heap, all of the previous string heaps are discarded, and the new heap |
| becomes the writable string heap for the header heap. |
| |
| Each heap object is responsible for providing a :code:`move_strings` method which copies its strings |
| to a new string heap, passed as an argument. This is a source of pointer invalidation for other |
| parts of the core and the plugin API. For the latter, insulating from such string movement is the |
| point of the :cpp:type:`TSMLoc` type. |
| |
| String Allocation |
| ----------------- |
| |
| Storage for a string is allocated by :func:`HdrHeap::allocate_str`. If the current amount of dead |
| space is too large, this is treated as an initial allocation failure. If there is no current |
| writeable string heap, one is created that is a least as large as the space requested and the size |
| of the previous writeable string heap. Space for the string is then allocated out of the writeable |
| string heap. If this fails due to lack of space the current writeable string heap is "demoted" to a |
| read only string heap and allocation retried (which will cause a new writeable string heap). If the |
| writeable string heap cannot be demoted due to lack of read only slots, the strings heaps are |
| coalesced with an additional size request of the requested string size. This will result in a single |
| writeable string heap and not read only heaps, the former containing all of the existing strings plus |
| sufficient space to allocate the new string. |
| |
| .. topic:: Decision Diagram |
| |
| .. figure:: /uml/images/hdr-heap-str-alloc.svg |
| |
| Object Allocation |
| ----------------- |
| |
| Objects are allocated on the header heap by :func:`HdrHeap::allocate_obj`. Such objects must be one |
| of a compile time determined set of types [#]_. This method first tries to allocate the object in |
| existing free space. If that doesn't work then the allocator walks a list of :class:`HdrHeap` |
| instances looking for space. If no space is found anywhere, a new :class:`HdrHeap` instance is |
| created with twice the space of the last :class:`HdrHeap` in the list and added to the list to |
| try. |
| |
| Once space is found for the object, the base members of :class:`HdrHeapObjImpl` are initialized with |
| the object type and size, with the :arg:`m_obj_flags` set to 0. |
| |
| Serialization |
| ------------- |
| |
| Because heaps store the HTTP request / response data, a header heap needs to be serialized to be put |
| in to the cache. For performance reasons, it is desirable to be able to unserialize the serialized |
| data in place, rather than copying it again. That is, the data is read from disk into a block of |
| memory and then that memory is converted to a live data structure. In this case the memory used by |
| the heap is owned by some other object and the header heap must not do any clean up. This is |
| signaled by the `m_writeable` flag. In an unserialized header heap this is set to ``false`` and such |
| a header heap is not allowed to allocate any additional objects or strings - it is immutable. |
| |
| The primary mechanism to do this is to use swizzling on the pointers in the structure. During |
| serialization pointers are converted to offsets and during unserialization these offsets are |
| converted back to pointers. To make this simpler, unserialized header heaps are marked read only so |
| that updating does not have to be supported. Additionally, :class:`HdrHeap` is a POD and therefore |
| has no virtual function table pointer to be stored or restored [#]_. |
| |
| To serialize, first :func:`HdrHeap::marshal_length` is called to get a buffer size. The |
| serialization buffer is created with sufficient space for the header heap and that space is passed |
| to :func:`HdrHeap::marshal` to perform the actual serialization. The object heaps are serialized |
| followed by the string heaps. No coalescence is done, on the presumption that because the amount |
| of dead space is limited by coalescence (as needed) on every string creation. |
| |
| When serializing strings, each object is responsible for swizzling its own pointers. Because the |
| object heaps have already been serialized and all of the header heap object types are also PODs, |
| these serialized objects can have the pointer swizzling method, :code:`marshal`, called directly |
| on them. This method is provided with a set of "translations" which indicate the base offset for |
| each range of object and string heap memory. The object marshalling can then compute the correct |
| offset to store for each live string pointer. |
| |
| Inheriting Strings |
| ------------------ |
| |
| The string heaps are designed to be reference counted so that they can be shared as read only |
| objects between heaps. This enables copying heap objects between heaps less expensive as the |
| strings pointers in them can be preserved in the new heap by sharing the string heaps in which |
| those strings reside. |
| |
| This can still be a bit complex as it is possible that the combined number of string heaps is more |
| than the limit. In this case, the target header heap does string coalescence so that it is reduced to |
| having a single writeable string heap with enough free space to hold all of the strings in the |
| source header heap. As a result, it is required that all heap objects already be present in the |
| target header heap before the strings are inherited. This means that the string coalescence will |
| properly copy the strings of and update the strings pointers in the copied heap objects. |
| |
| .. rubric:: Footnotes. |
| |
| .. [#] |
| |
| Not that I can see any good reason for that, if virtual methods instead of :code:`switch` |
| statements were used. |
| |
| .. [#] |
| |
| Which makes the initialization logic to "fixup" the virtual function pointer rather silly. |