.. Licensed to the Apache Software Foundation (ASF) under one or more contributor license
agreements. See the NOTICE file distributed with this work for additional information regarding
copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with the License. You may obtain
a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License
is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied. See the License for the specific language governing permissions and limitations
under the License.
.. include:: ../../common.defs
.. highlight:: cpp
.. default-domain:: cpp
.. _core-hdr-heap:
Header Heap
***********
Memory for HTTP header data is kept in :term:`header heap`\s.
Classes
=======
.. class:: HdrHeapObjImpl
This is the abstract base class for objects allocated in a :class:`HdrHeap`. This allows updating
objects in a heap in a generic way, without having to locate all of the pointers to the objects.
The type of an instance stored in a heap must be one of the following values.
.. enumerator:: HDR_HEAP_OBJ_EMPTY = 0
Used to mark invalid objects, ones not yet constructed or ones that have been destroyed.
.. enumerator:: HDR_HEAP_OBJ_RAW = 1
A raw object. Its purpose is not documented here.
.. enumerator:: HDR_HEAP_OBJ_URL = 2
A URL object.
.. enumerator:: HDR_HEAP_OBJ_HTTP_HEADER = 3
The header for an HTTP request or response.
.. enumerator:: HDR_HEAP_OBJ_MIME_HEADER = 4
A MIME header, containing MIME style fields with names and values.
.. enumerator:: HDR_HEAP_OBJ_FIELD_BLOCK = 5
A field block. Its purpose is not documented here.
.. class:: HdrStrHeap
This is a :term:`variable sized class` and therefore new instances must be created by :func:`new_HdrStrHeap`
and deallocated by the :code:`destroy` method.
.. function:: HdrStrHeap * new_HdrStrHeap(int n)
Create and return a new instance of :class:`HdrStrHeap`. If :arg:`n` is less than ``HDR_STR_HEAP_DEFAULT_SIZE``
it is increased to that value.
If the allocated size is ``HDR_STR_HEAP_DEFAULT_SIZE`` (or smaller and upsized to that value) then
the instance is allocated from a thread local pool via :code:`strHeapAllocator`. If larger it
is allocated from global memory via :code:`ats_malloc`.
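The sizing rule above can be sketched as a small decision function. This is only an illustration: ``kDefaultStrHeapSize``, ``AllocSource``, ``AllocPlan`` and ``plan_str_heap_alloc`` are hypothetical names, the constant's value is made up, and no real allocator is called.

```cpp
#include <cassert>

// Hypothetical stand-in for HDR_STR_HEAP_DEFAULT_SIZE; the real value lives in the source tree.
constexpr int kDefaultStrHeapSize = 2048;

// Where the memory for a new string heap would come from.
enum class AllocSource { ThreadLocalPool, GlobalMalloc };

struct AllocPlan {
  int size;           // number of bytes that would actually be allocated
  AllocSource source; // allocator that would serve the request
};

// Models only the sizing decision: small requests are rounded up to the
// default size and served from the pool, larger ones go to the global heap.
inline AllocPlan
plan_str_heap_alloc(int n)
{
  assert(n >= 0);
  if (n <= kDefaultStrHeapSize) {
    return {kDefaultStrHeapSize, AllocSource::ThreadLocalPool};
  }
  return {n, AllocSource::GlobalMalloc};
}
```

The same shape of decision is described for :func:`new_HdrHeap`, with ``HdrHeap::DEFAULT_SIZE`` as the threshold.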
.. class:: HdrHeap
This is a :term:`variable sized class` and therefore new instances must be created by :func:`new_HdrHeap`
and deallocated by the :code:`destroy` method.
:class:`HdrHeap` manages memory for heap objects directly and memory for strings via ancillary
heaps (which are instances of :class:`HdrStrHeap`). For the string heaps there is at most one
writeable heap, and up to :code:`HDR_BUF_RONLY_HEAPS` read only heaps.
All objects in the internal heap must be subclasses of :class:`HdrHeapObjImpl`.
.. function:: size_t required_space_for_evacuation()
Calculate and return the total live string space for :arg:`this`.
.. function:: void evacuate_from_str_heaps(HdrStrHeap * new_heap)
Copy all live strings from the heap objects in :arg:`this` to :arg:`new_heap`.
.. function:: void coalesce_str_heaps(int incoming_size)
This garbage collects the string heaps in a half space style, by creating a new string space
(string heap), copying all of the strings there, and then discarding the existing string heaps.
The total amount of live string space is calculated by
:func:`HdrHeap::required_space_for_evacuation` and a new string heap is created of a size at
least as large as the live string space plus :arg:`incoming_size` bytes.
All of the live strings are moved to the new string heap by
:func:`HdrHeap::evacuate_from_str_heaps`, the existing string heaps are deallocated, and the
new string heap becomes the writeable string heap for the header heap. The end result is a
single writeable string heap and no read only string heaps, with all live strings resident in
that writeable string heap.
.. function:: char * allocate_str(int nbytes)
Allocate :arg:`nbytes` of space for a string in the writeable string heap. A pointer to the
first byte is returned, or ``nullptr`` if the space could not be allocated.
.. function:: HdrHeapObjImpl * allocate_obj(int nbytes, int type)
Allocate a :arg:`type` object that is :arg:`nbytes` in size in the heap and return a pointer
to it, or ``nullptr`` if the object could not be allocated.
:arg:`nbytes` must be at most ``HDR_MAX_ALLOC_SIZE``.
The members of :class:`HdrHeapObjImpl` are initialized. Further initialization is the
responsibility of the caller.
:arg:`type` must be one of the values specified in :class:`HdrHeapObjImpl`.
.. function:: int marshal_length()
Compute and return the size of the buffer needed to serialize :arg:`this`.
.. function:: int marshal(char * buffer, int length)
Serialize :arg:`this` to :arg:`buffer` of size :arg:`length`. It is required that
:arg:`length` be at least the value returned by :func:`HdrHeap::marshal_length`.
.. function:: HdrHeap * new_HdrHeap(int n)
Create and return a new instance of :class:`HdrHeap`. If :arg:`n` is less than ``HdrHeap::DEFAULT_SIZE``
it is increased to that value.
If the allocated size is ``HdrHeap::DEFAULT_SIZE`` (or smaller and upsized to that value) then
the instance is allocated from a thread local pool via :code:`hdrHeapAllocator`. If larger it
is allocated from global memory via :code:`ats_malloc`.
.. topic:: Header Heap Class Structure
.. figure:: /uml/images/hdr-heap-class.svg
Implementation
==============
String Coalescence
------------------
String heaps do not maintain lists of internal free space. Strings that are released are left in
place, creating dead space in the heap. For this reason it can become necessary to do a garbage
collection operation on the writeable string heap in the header heap by calling
:func:`HdrHeap::coalesce_str_heaps`. This is done when:
* The amount of dead space in the writeable string heap exceeds ``MAX_LOST_STR_SPACE``.
* An external string heap is being added and all current read only string heap slots are used.
The mechanism is simple in design. The size of the live string data in the current string heaps is
calculated and a new heap is allocated that is large enough to contain all existing strings, with
additional space for new string data. Each heap object is required to provide a :code:`strings_length` method
which returns the size of the live string data for that object (recursively as needed). The strings
are copied to the new string heap, all of the previous string heaps are discarded, and the new heap
becomes the writeable string heap for the header heap.
Each heap object is responsible for providing a :code:`move_strings` method which copies its strings
to a new string heap, passed as an argument. This is a source of pointer invalidation for other
parts of the core and the plugin API. For the latter, insulating from such string movement is the
point of the :cpp:type:`TSMLoc` type.
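The following is a minimal sketch of this half space style collection, not the Traffic Server implementation. ``ToyStrHeap``, ``ToyObj`` and ``ToyHdrHeap`` are hypothetical types. Only the method names ``strings_length`` and ``move_strings`` are taken from the text, and their signatures here are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Toy string heap: a bump allocator with no free list, so released
// strings simply become dead space.
struct ToyStrHeap {
  std::vector<char> space;
  std::size_t used = 0;

  explicit ToyStrHeap(std::size_t n) : space(n) {}

  char *
  allocate(std::size_t n)
  {
    if (used + n > space.size()) {
      return nullptr;
    }
    char *p = space.data() + used;
    used += n;
    return p;
  }
};

// Toy heap object that refers to a single live string.
struct ToyObj {
  const char *str = nullptr;
  std::size_t len = 0;

  std::size_t strings_length() const { return len; }

  // Copy the string into dst and repoint this object at the copy.
  void
  move_strings(ToyStrHeap &dst)
  {
    if (len == 0) {
      return;
    }
    char *p = dst.allocate(len);
    assert(p != nullptr); // dst was sized to hold every live string
    std::memcpy(p, str, len);
    str = p;
  }
};

struct ToyHdrHeap {
  std::vector<ToyObj> objs;
  std::vector<std::unique_ptr<ToyStrHeap>> str_heaps;

  void
  coalesce(std::size_t incoming_size)
  {
    std::size_t live = 0;
    for (const ToyObj &o : objs) {
      live += o.strings_length();
    }
    auto fresh = std::make_unique<ToyStrHeap>(live + incoming_size);
    for (ToyObj &o : objs) {
      o.move_strings(*fresh);
    }
    str_heaps.clear(); // old heaps, and their dead space, are discarded
    str_heaps.push_back(std::move(fresh));
  }
};
```

Any pointer into the old heaps that is held outside ``objs`` dangles after ``coalesce`` returns, which is the invalidation hazard described above.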
String Allocation
-----------------
Storage for a string is allocated by :func:`HdrHeap::allocate_str`. If the current amount of dead
space is too large, this is treated as an initial allocation failure. If there is no current
writeable string heap, one is created that is at least as large as both the space requested and the size
of the previous writeable string heap. Space for the string is then allocated out of the writeable
string heap. If this fails due to lack of space the current writeable string heap is "demoted" to a
read only string heap and allocation retried (which will cause a new writeable string heap to be
created). If the writeable string heap cannot be demoted due to lack of read only slots, the string heaps are
coalesced with an additional size request of the requested string size. This will result in a single
writeable string heap and no read only heaps, the former containing all of the existing strings plus
sufficient space to allocate the new string.
.. topic:: Decision Diagram
.. figure:: /uml/images/hdr-heap-str-alloc.svg
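The decision flow can be modelled with sizes alone. The sketch below tracks only byte counts, not string data. All names are hypothetical, and the two constants are made-up stand-ins for ``MAX_LOST_STR_SPACE`` and ``HDR_BUF_RONLY_HEAPS``.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kMaxLostStrSpace = 1024; // hypothetical stand-in for MAX_LOST_STR_SPACE
constexpr std::size_t kReadOnlySlots   = 3;    // hypothetical stand-in for HDR_BUF_RONLY_HEAPS

struct ToyStrHeap {
  std::size_t size;
  std::size_t used = 0;
  explicit ToyStrHeap(std::size_t n) : size(n) {}
  bool
  allocate(std::size_t n)
  {
    if (used + n > size) {
      return false;
    }
    used += n;
    return true;
  }
};

struct ToyHdrHeap {
  std::unique_ptr<ToyStrHeap> rw;              // the writeable string heap, if any
  std::vector<std::unique_ptr<ToyStrHeap>> ro; // read only string heaps
  std::size_t live = 0;                        // bytes of live strings in all heaps
  std::size_t lost = 0;                        // bytes of dead space

  // Replace every heap with one heap holding the live bytes plus `incoming`.
  void
  coalesce(std::size_t incoming)
  {
    rw       = std::make_unique<ToyStrHeap>(live + incoming);
    rw->used = live; // stands in for copying the live strings
    ro.clear();
    lost = 0;
  }

  bool
  demote_rw()
  {
    if (ro.size() >= kReadOnlySlots) {
      return false;
    }
    ro.push_back(std::move(rw));
    return true;
  }

  void
  allocate_str(std::size_t n)
  {
    std::size_t last_size = 0;
    if (lost > kMaxLostStrSpace) {
      coalesce(n); // too much dead space counts as an initial failure
    }
    for (;;) {
      if (!rw) {
        rw = std::make_unique<ToyStrHeap>(std::max(n, last_size));
      }
      if (rw->allocate(n)) {
        live += n;
        return;
      }
      last_size = rw->size;
      if (!demote_rw()) {
        coalesce(n); // no free read only slot
      }
    }
  }

  void
  release_str(std::size_t n)
  {
    assert(n <= live);
    live -= n;
    lost += n;
  }
};
```

The loop terminates because both recovery paths leave a writeable heap with at least ``n`` free bytes: a demotion is followed by a fresh heap of at least ``n`` bytes, and a coalesce sizes the new heap as the live bytes plus ``n``.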
Object Allocation
-----------------
Objects are allocated on the header heap by :func:`HdrHeap::allocate_obj`. Such objects must be one
of a compile time determined set of types [#]_. This method first tries to allocate the object in
existing free space. If that doesn't work then the allocator walks a list of :class:`HdrHeap`
instances looking for space. If no space is found anywhere, a new :class:`HdrHeap` instance is
created with twice the space of the last :class:`HdrHeap` in the list and added to the list to
try.
Once space is found for the object, the base members of :class:`HdrHeapObjImpl` are initialized with
the object type and size, with the :arg:`m_obj_flags` set to 0.
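A minimal sketch of this walk and grow strategy follows. ``ToyHeap``, ``ToyObjHeader`` and ``kMaxAllocSize`` are hypothetical, and the header layout is an assumption chosen for readability rather than the real layout of :class:`HdrHeapObjImpl`. Only the member name ``m_obj_flags`` is taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <vector>

constexpr std::size_t kMaxAllocSize = 4096; // hypothetical stand-in for HDR_MAX_ALLOC_SIZE

// Assumed layout for the common object header.
struct ToyObjHeader {
  std::uint32_t m_type;
  std::uint32_t m_length;
  std::uint32_t m_obj_flags;
};

struct ToyHeap {
  std::vector<unsigned char> data;
  std::size_t free_start = 0;
  std::unique_ptr<ToyHeap> next; // next heap in the list, if any

  explicit ToyHeap(std::size_t n) : data(n) { assert(n > 0); }
  std::size_t free_size() const { return data.size() - free_start; }
};

ToyObjHeader *
allocate_obj(ToyHeap &head, std::size_t nbytes, std::uint32_t type)
{
  assert(nbytes >= sizeof(ToyObjHeader));
  nbytes = (nbytes + 7) & ~std::size_t{7}; // keep every object 8 byte aligned
  if (nbytes > kMaxAllocSize) {
    return nullptr;
  }
  for (ToyHeap *h = &head;; h = h->next.get()) {
    if (nbytes <= h->free_size()) {
      void *p = h->data.data() + h->free_start;
      h->free_start += nbytes;
      // Only the base members are set; the caller finishes initialization.
      return new (p) ToyObjHeader{type, static_cast<std::uint32_t>(nbytes), 0};
    }
    if (!h->next) {
      // Doubling keeps the number of heaps logarithmic in the total size.
      h->next = std::make_unique<ToyHeap>(h->data.size() * 2);
    }
  }
}
```

Because each new heap is twice the size of the one before it, a request within the size limit always finds room after a bounded number of steps.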
Serialization
-------------
Because heaps store the HTTP request / response data, a header heap needs to be serialized to be put
into the cache. For performance reasons, it is desirable to be able to unserialize the serialized
data in place, rather than copying it again. That is, the data is read from disk into a block of
memory and then that memory is converted to a live data structure. In this case the memory used by
the heap is owned by some other object and the header heap must not do any clean up. This is
signaled by the :code:`m_writeable` flag. In an unserialized header heap this is set to ``false`` and such
a header heap is not allowed to allocate any additional objects or strings - it is immutable.
The primary mechanism to do this is to use swizzling on the pointers in the structure. During
serialization pointers are converted to offsets and during unserialization these offsets are
converted back to pointers. To make this simpler, unserialized header heaps are marked read only so
that updating does not have to be supported. Additionally, :class:`HdrHeap` is a POD and therefore
has no virtual function table pointer to be stored or restored [#]_.
To serialize, first :func:`HdrHeap::marshal_length` is called to get a buffer size. The
serialization buffer is created with sufficient space for the header heap and that space is passed
to :func:`HdrHeap::marshal` to perform the actual serialization. The object heaps are serialized
followed by the string heaps. No coalescence is done, on the presumption that the amount
of dead space is already limited by coalescence (as needed) on every string creation.
When serializing strings, each object is responsible for swizzling its own pointers. Because the
object heaps have already been serialized and all of the header heap object types are also PODs,
these serialized objects can have the pointer swizzling method, :code:`marshal`, called directly
on them. This method is provided with a set of "translations" which indicate the base offset for
each range of object and string heap memory. The object marshalling can then compute the correct
offset to store for each live string pointer.
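Pointer swizzling against a translation table can be sketched as follows. ``ToyXlate`` and ``ToyField`` are hypothetical types, and the single ``str`` member that holds either pointer bits or an offset is an assumption made to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One translation: a range of live string heap memory and the offset at
// which that range begins in the serialized buffer.
struct ToyXlate {
  const char *start; // first byte of the range
  const char *end;   // one past the last byte of the range
  std::size_t base;  // offset of the range in the serialized buffer
};

struct ToyField {
  std::uintptr_t str; // pointer bits while live, byte offset while serialized
  std::uint32_t len;
};

// Swizzle: replace the live pointer with an offset into the buffer.
// Returns false if the pointer is not inside any known range.
bool
marshal_field(ToyField &f, const std::vector<ToyXlate> &table)
{
  for (const ToyXlate &x : table) {
    auto lo = reinterpret_cast<std::uintptr_t>(x.start);
    auto hi = reinterpret_cast<std::uintptr_t>(x.end);
    if (lo <= f.str && f.str < hi) {
      f.str = x.base + (f.str - lo);
      return true;
    }
  }
  return false;
}

// Unswizzle: turn the stored offset back into a pointer into `buffer`.
void
unmarshal_field(ToyField &f, const char *buffer, std::size_t buffer_len)
{
  assert(f.str + f.len <= buffer_len); // the string must lie inside the buffer
  f.str = reinterpret_cast<std::uintptr_t>(buffer + f.str);
}
```

Unswizzling needs only the address of the buffer, which is what makes it possible to bring the data back to life in the memory it was read into.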
Inheriting Strings
------------------
The string heaps are designed to be reference counted so that they can be shared as read only
objects between heaps. This makes copying heap objects between heaps less expensive, as the
string pointers in them can be preserved in the new heap by sharing the string heaps in which
those strings reside.
This can still be a bit complex as it is possible that the combined number of string heaps is more
than the limit. In this case, the target header heap does string coalescence so that it is reduced to
having a single writeable string heap with enough free space to hold all of the strings in the
source header heap. As a result, it is required that all heap objects already be present in the
target header heap before the strings are inherited. This ensures that the string coalescence will
properly copy the strings of, and update the string pointers in, the copied heap objects.
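The share or coalesce choice can be illustrated with ``std::shared_ptr`` standing in for the reference counting. This is a sketch under assumptions: ``ToyStrHeap``, ``ToyHdrHeap``, ``inherit_string_heaps`` and ``kReadOnlySlots`` are hypothetical, string data is reduced to a byte count, and the coalesce branch only records the size the new heap would need.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kReadOnlySlots = 3; // hypothetical stand-in for HDR_BUF_RONLY_HEAPS

struct ToyStrHeap {
  std::size_t live_bytes; // bytes of live strings held by this heap
};

struct ToyHdrHeap {
  std::shared_ptr<ToyStrHeap> rw;              // writeable string heap, if any
  std::vector<std::shared_ptr<ToyStrHeap>> ro; // read only string heaps

  std::size_t heap_count() const { return ro.size() + (rw ? 1 : 0); }

  std::size_t
  live() const
  {
    std::size_t n = rw ? rw->live_bytes : 0;
    for (const auto &h : ro) {
      n += h->live_bytes;
    }
    return n;
  }

  void
  inherit_string_heaps(const ToyHdrHeap &src)
  {
    assert(this != &src);
    if (ro.size() + src.heap_count() <= kReadOnlySlots) {
      // Enough slots: share the source heaps as read only heaps.
      if (src.rw) {
        ro.push_back(src.rw);
      }
      for (const auto &h : src.ro) {
        ro.push_back(h);
      }
      return;
    }
    // Too many heaps: coalesce into one writeable heap that holds the
    // target's strings plus the strings of the copied objects.
    std::size_t total = live() + src.live();
    rw = std::make_shared<ToyStrHeap>(ToyStrHeap{total});
    ro.clear();
  }
};
```

In the sharing branch no string bytes are copied and only reference counts change, which is where the saving comes from.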
.. rubric:: Footnotes.
.. [#]
Not that I can see any good reason for that, if virtual methods instead of :code:`switch`
statements were used.
.. [#]
Which makes the initialization logic to "fixup" the virtual function pointer rather silly.