DOC: Update the HTX API documentation

Missing functions have been added. And because the EOM block was removed, some parts have been adapted to better explain how the end of the message may be detected.
2021-02-24 11:33:21 +01:00 · 2021-02-24 11:33:21 +01:00 · 9a2cec4953
commit 9a2cec4953
parent e071f0e6a4
1 changed files with 146 additions and 76 deletions
--- a/doc/internals/htx-api.txt
+++ b/doc/internals/htx-api.txt
@@ -1,7 +1,7 @@
                -----------------------------------------------
                                   HTX API
-                                  Version 1.0
+                                  Version 1.1
-                          ( Last update: 2020-12-02 )
+                          ( Last update: 2021-02-24 )
                -----------------------------------------------
                          Author : Christopher Faulet
                      Contact : cfaulet at haproxy dot com
@@ -62,8 +62,8 @@ area. When an HTX message is stored in a buffer, this one appears as full.
              (htx->data)
-The blocks part remains linear and sorted. You may think about it as an array
+The blocks part remains linear and sorted. It may be seen as an array with
-with negative indexes. But, instead of using negative indexes, we use positive
+negative indexes. But, instead of using negative indexes, we use positive
 positions to identify a block. This position is then converted to an address
 relative to the beginning of the blocks array.
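
The conversion from a position to an address may be sketched as follows. This
is only a minimal model, not the HAProxy code: the names blk_model and
pos_to_offset() are hypothetical, and it assumes that the blocks array sits at
the end of the area and grows backward, with 8-byte block descriptors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an HTX block descriptor (assumed to be 8 bytes). */
struct blk_model {
	uint32_t addr; /* offset of the payload in the payloads part */
	uint32_t info; /* type and sizes, left opaque in this sketch */
};

_Static_assert(sizeof(struct blk_model) == 8, "sketch assumes 8-byte blocks");

/* Converts a positive block position into a byte offset inside an area of
 * <size> bytes. Position 0 is the last slot of the area, position 1 is the
 * slot just before it, and so on: the "negative index" is never exposed.
 */
static inline uint32_t pos_to_offset(uint32_t size, uint32_t pos)
{
	/* the slot must fit inside the area */
	assert(((uint64_t)pos + 1) * sizeof(struct blk_model) <= size);
	return size - (pos + 1) * (uint32_t)sizeof(struct blk_model);
}
```

With a 1024-byte area, position 0 maps to offset 1016 and position 1 to offset
1008, so increasing positions walk backward through the area.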
@@ -97,9 +97,9 @@ array), we move back all blocks.
    ...+--------------+---------+    =====>  ...----------+--------------+
-The payloads part is a raw space that may wrap. You never access to a block's
+The payloads part is a raw space that may wrap. A block's payload must never be
-payload directly. Instead you get a block to retrieve the address of its
+accessed directly. Instead a block must be selected to retrieve the address of
-payload.
+its payload.
          +------------------------( B0.addr )--------------------------+
@@ -212,9 +212,11 @@ responses are part of the same HTX message.
 When the end of the message is reached a special flag is set on the message
 (HTX_FL_EOM). It means no more data are expected for this message, except
-tunneled data. But tunneled data will never be mixed with message data. Thus
+tunneled data. But tunneled data will never be mixed with message data to avoid
-once the flag marking the end of the message is set, it is easy to know the
+ambiguities. Thus once the flag marking the end of the message is set, it is
-message ends.
+easy to know when the message ends. The end is reached if the HTX message is
 empty or when its tail HTX block is reached. Once all blocks of the HTX message
 are consumed, tunneled data, if any, may be transferred.
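
The rule above may be sketched as follows. It is a simplified model under
stated assumptions, not the HAProxy implementation: msg_model, MODEL_FL_EOM
and end_reached() are hypothetical names, and the flag value is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value standing in for HTX_FL_EOM. */
#define MODEL_FL_EOM 0x00000001u

/* Hypothetical, reduced view of an HTX message. */
struct msg_model {
	uint32_t flags; /* message flags */
	int32_t  head;  /* position of the oldest block, -1 if empty */
	int32_t  tail;  /* position of the newest block, -1 if empty */
};

/* Returns true if the end of the message is reached while processing the
 * block at position <pos>: the EOM flag must be set, and the message must be
 * either empty or processed up to its tail block.
 */
static inline bool end_reached(const struct msg_model *m, int32_t pos)
{
	if (!(m->flags & MODEL_FL_EOM))
		return false;
	return m->head == -1 || pos == m->tail;
}
```

Without the flag, reaching the tail block only means that more blocks may still
be received for this message.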
 3.1. The start-line
@@ -295,15 +297,14 @@ buffer. There are 2 functions to do so, the second one relying on the first:
      that it appears as full.
 Both functions return a "zero-sized" HTX message if the buffer is null. This
-way, you are sure to always have a valid HTX message. The first function is the
+way, the HTX message is always valid. The first function is the default function
-default function to use. The second one is only useful when some content will be
+to use. The second one is only useful when some content will be added. For
-added. For instance, it used by the HTX analyzers when HAproxy generates a
+instance, it is used by the HTX analyzers when HAProxy generates a response. Thus,
-response. This way, the buffer is in a right state and you don't need to take
+the buffer is in the right state and there is no need to take
 care of it anymore outside the possible error paths.
 Once the processing is done, if the HTX message has been modified, the underlying
-buffer must be also updated, except you uses htx_from_buf() and you only add
+buffer must also be updated, unless htx_from_buf() was used _AND_ data was only
-data. For all other cases, the function htx_to_buf() must be called.
+added. For all other cases, the function htx_to_buf() must be called.
 Finally, the function htx_reset() may be called at any time to reset an HTX
 message. And the function buf_room_for_htx_data() may be called to know if a raw
@@ -313,7 +314,7 @@ the HTX.
 4.2. Helpers to deal with free space in an HTX message
-Once you have an HTX message, following functions may help you to process it :
+Once an HTX message is available, the following functions may help to process it :
    - htx_used_space() and htx_meta_space() return, respectively, the total
      space used in an HTX message and the space used by block's metadata only.
@@ -335,10 +336,9 @@ Once you have an HTX message, following functions may help you to process it :
 4.3. HTX Blocks manipulations
-Once you know how much space is available in an HTX message, the next step is to
+Once the available space in an HTX message is known, the next step is to add HTX
-add HTX blocks. First of all the function htx_nbblks() returns the number of
+blocks. First of all the function htx_nbblks() returns the number of blocks
-blocks allocated in an HTX message. Then, there is an add function per block's
+allocated in an HTX message. Then, there is an add function per block's type :
-type:
    - htx_add_stline() adds a start-line. The type (request or response) and the
      flags of the start-line must be provided, as well as its three parts
@@ -349,7 +349,7 @@ type:
      NULL if an error occurred.
    - htx_add_endof() must be used to add any end-of marker. The block's type
-      (EOH, EOT or EOM) must be specified. The inserted HTX block is returned on
+      (EOH or EOT) must be specified. The inserted HTX block is returned on
      success or NULL if an error occurred.
    - htx_add_all_headers() and htx_add_all_trailers() add, respectively, a list
@@ -361,21 +361,22 @@ type:
    - htx_add_data() must be used to add a DATA block. Unlike previous
      functions, this one returns the number of bytes copied or 0 if nothing was
-      copied. If possible, the data are appended to the last DATA block, if
+      copied. If possible, the data are appended to the tail block if it is a
-      any. Only a part of the payload may be copied because this function will
+      DATA block. Only a part of the payload may be copied because this function
-      try to limit the message defragmentation and the wrapping of blocks as far
+      will try to limit the message defragmentation and the wrapping of blocks
-      as possible. If you really need to add all data or nothing, the function
+      as far as possible.
-      htx_add_data_atonce() must be used instead. Because it tries to insert all
-      the payload, this function returns the inserted block on success.
-      Otherwise it returns NULL.
-When an HTX block is added, it is always the last one (the tail). But, if you
+    - htx_add_data_atonce() must be used if all data must be added or nothing.
-need to add a block at a specific place, it is not really handy. 2 functions may
+      It tries to insert all the payload and returns the inserted
-help you (others could be added) :
+      block on success. Otherwise it returns NULL.
 When an HTX block is added, it is always the last one (the tail). But, if a
 block must be added at a specific place, it is not really handy. 2 functions may
 help (others could be added) :
    - htx_add_last_data() adds a DATA block just after all other DATA blocks and
-      before any trailers and EOT or EOM markers. It relies on
+      before any trailers and EOT marker. It relies on htx_add_data_atonce(), so
-      htx_add_data_atonce(), so a defragmentation may be performed.
+      a defragmentation may be performed.
    - htx_move_blk_before() moves a specific block just before another one. Both
      blocks must already be in the HTX message and the block to move must
@@ -400,7 +401,29 @@ Once added, there are three functions to update the block's payload :
      be smaller or larger than the old one. This function returns the new HTX
      block on success, or NULL if an error occurred.
-Finally, You may remove a block using the function htx_remove_blk(). This
+    - htx_change_blk_value_len() changes the size of the value. It is the caller's
      responsibility to change the value itself, make sure there is enough space
      and update allocated value. This function updates the HTX message
      accordingly.
    - htx_set_blk_value_len() changes the size of the value. It is the caller's
      responsibility to change the value itself, make sure there is enough space
      and update allocated value. Unlike the function
      htx_change_blk_value_len(), this one does not update the HTX message. So
      it should be used with caution.
    - htx_cut_data_blk() removes <n> bytes from the beginning of a DATA
      block. The block's start address and its length are adjusted, and the
      htx's total data count is updated. This is used to mark that part of the
      data was transferred from a DATA block without removing this DATA
      block. No sanity check is performed, the caller is responsible for doing
      this exclusively on DATA blocks, and never removing more than the block's
      size.
    - htx_remove_blk() removes a block from an HTX message. It returns the
      following block or NULL if it is the tail block.
 Finally, a block may be removed using the function htx_remove_blk(). This
 function returns the block following the one removed or NULL if it is the tail
 block.
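
The bookkeeping described for htx_cut_data_blk() may be sketched as follows.
It is a minimal model, not the HAProxy code: cut_blk, cut_msg and cut_data()
are hypothetical names, and only the three counters mentioned above are kept.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a DATA block. */
struct cut_blk {
	uint32_t addr; /* start offset of the payload */
	uint32_t len;  /* payload length */
};

/* Hypothetical, reduced view of an HTX message. */
struct cut_msg {
	uint32_t data; /* total payload bytes held by the message */
};

/* Removes <n> bytes from the beginning of a DATA block. As stated above, no
 * sanity check is performed: the caller must guarantee that <blk> is a DATA
 * block and that <n> is not larger than its length.
 */
static inline void cut_data(struct cut_msg *msg, struct cut_blk *blk, uint32_t n)
{
	blk->addr += n; /* the payload now starts <n> bytes further */
	blk->len  -= n; /* and is <n> bytes shorter */
	msg->data -= n; /* the message accounts for the removed bytes */
}
```

The block itself stays in place, which is the difference with a removal: only
its payload shrinks from the front.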
@@ -445,56 +468,103 @@ To iterate on an HTX message, the first thing to do is to get the HTX block to
 start the loop. There are three special blocks in an HTX message that may be
 good candidates to start a loop :
-  * the head block. It is the oldest inserted block. Multiplexers always start
+    - the head block. It is the oldest inserted block. Multiplexers always start
      to consume an HTX message from this block. The function htx_get_head()
      returns its position and htx_get_head_blk() returns the block itself. In
      addition, the function htx_get_head_type() returns its block's type.
-  * the tail block. It is the newest inserted block. The function htx_get_tail()
+    - the tail block. It is the newest inserted block. The function
-    returns its position and htx_get_tail_blk() returns the blocks itself. In
+      htx_get_tail() returns its position and htx_get_tail_blk() returns the
-    addition, the function htx_get_tail_type() returns its block's type.
+      block itself. In addition, the function htx_get_tail_type() returns its
      block's type.
-  * the first block. It is the block where to (re)start the analyse. It is used
+    - the first block. It is the block where the analysis (re)starts. It is
-    as start point by HTX analyzers. The function htx_get_first() returns its
+      used as start point by HTX analyzers. The function htx_get_first() returns
-    position and htx_get_first_blk() returns the blocks itself. In addition, the
+      its position and htx_get_first_blk() returns the block itself. In
-    function htx_get_first_type() returns its block's type.
+      addition, the function htx_get_first_type() returns its block's type.
 For all these functions, if the HTX message is empty, -1 is returned for the
 block's position, NULL instead of a block and HTX_BLK_UNUSED for its type.
-Then to iterate on blocks, you may move foreword or backward :
+Then, the following functions may be used to iterate on blocks, forward or backward :
-  * htx_get_prev() and htx_get_next() return, respectively, the position of the
+    - htx_get_prev() and htx_get_next() return, respectively, the position of
-    previous block or the next block, given a specific position. Or -1 if an edge
+      the previous block or the next block, given a specific position. Or -1 if
      an edge is reached.
    - htx_get_prev_blk() and htx_get_next_blk() return, respectively, the
      previous block or the next one, given a specific block. Or NULL if an edge
      is reached.
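
An iteration based on positions may be sketched as follows. It is a simplified
model, not the HAProxy code: iter_msg, iter_next(), iter_prev() and
count_forward() are hypothetical names, and it assumes that positions are
contiguous between the head and the tail.

```c
#include <stdint.h>

/* Hypothetical, reduced view of an HTX message. */
struct iter_msg {
	int32_t head; /* position of the oldest block, -1 if empty */
	int32_t tail; /* position of the newest block, -1 if empty */
};

/* Returns the position following <pos>, or -1 if <pos> is the tail. */
static inline int32_t iter_next(const struct iter_msg *m, int32_t pos)
{
	return (pos == m->tail) ? -1 : pos + 1;
}

/* Returns the position preceding <pos>, or -1 if <pos> is the head. */
static inline int32_t iter_prev(const struct iter_msg *m, int32_t pos)
{
	return (pos == m->head) ? -1 : pos - 1;
}

/* Walks the message from the head to the tail and counts the positions
 * visited. An empty message (head == -1) is never entered.
 */
static inline int count_forward(const struct iter_msg *m)
{
	int n = 0;

	for (int32_t pos = m->head; pos != -1; pos = iter_next(m, pos))
		n++;
	return n;
}
```

The same loop shape works backward by starting from the tail and calling
iter_prev() instead.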
-  * htx_get_prev_blk() and htx_get_next_blk() return, respectively, the previous
+4.6. Access block content and info
-    block or the next one, given a specific block. Or NULL if an edge is
-    reached.
 The following functions may be used to retrieve information about a specific HTX
 block :
-4.6. Advanced functions
+    - htx_get_blk_pos() returns the position of a block. It must be in the HTX
      message.
    - htx_get_blk_ptr() returns a pointer to the payload of a block.
    - htx_get_blk_type() returns the type of a block.
    - htx_get_blksz() returns the payload size of a block.
    - htx_get_blk_name() returns the name of a block, only if it is a header or
      a trailer. Otherwise, it returns an empty string.
    - htx_get_blk_value() returns the value of a block, depending on its
      type. For header and trailer blocks, it is the value field. For markers
      (EOH or EOT), an empty string is returned. For other blocks an ist
      pointing to the block payload is returned.
    - htx_is_unique_blk() may be used to know if a block is the only one
      remaining inside an HTX message, excluding unused blocks. This function is
      pretty useful to determine the end of an HTX message, in conjunction with
      the HTX_FL_EOM flag.
 4.7. Advanced functions
 Some more advanced functions may be used to do complex processing on the HTX
 message. These functions are used by HTX analyzers or by multiplexers.
-  * htx_truncate() removes all blocks after the one containing a specific offset
+    - htx_truncate() removes all blocks after the one containing a specific
-    relatively to the head block of the HTX message. If the offset is inside a
+      offset relative to the head block of the HTX message. If the offset is
-    DATA block, it is truncated. For all other blocks, the removal starts to the
+      inside a DATA block, it is truncated. For all other blocks, the removal
-    next block.
+      starts at the next block.
-  * htx_drain() tries to remove a specific amount of bytes of payload. If the
+    - htx_drain() tries to remove a specific amount of bytes of payload. If the
-    last block is a DATA block, it may be truncated if necessary. All other
+      tail block is a DATA block, it may be truncated if necessary. All other
-    block are removed at once or kept. This function returns a mixed value, with
+      blocks are removed at once or kept. This function returns a mixed value,
-    the first block not removed, or NULL if everything was removed, and the
+      with the first block not removed, or NULL if everything was removed, and
-    amount of data drained.
+      the amount of data drained.
-  * htx_xfer_blks() transfers HTX blocks from an HTX message to another,
+    - htx_xfer_blks() transfers HTX blocks from an HTX message to another,
-    stopping on the first block of a specified type or when a specific amount of
+      stopping on the first block of a specified type or when a specific amount
-    bytes, including meta-data, was moved. If the last block is a DATA block, it
+      of bytes, including meta-data, was moved. If the tail block is a DATA
-    may be partially moved. All other block are transferred at once or
+      block, it may be partially moved. All other blocks are transferred at once
-    kept. This function returns a mixed value, with the last block moved, or
+      or kept. This function returns a mixed value, with the last block moved,
-    NULL if nothing was moved, and the amount of data transferred. When HEADERS
+      or NULL if nothing was moved, and the amount of data transferred. When
-    or TRAILERS blocks must be transferred, this function transfers all of
+      HEADERS or TRAILERS blocks must be transferred, this function transfers
-    them. Otherwise, if it is not possible, it triggers an error. It is the
+      all of them. Otherwise, if it is not possible, it triggers an error. It is
-    caller responsibility to transfer all headers or trailers at once.
+      the caller's responsibility to transfer all headers or trailers at once.
    - htx_append_msg() appends an HTX message to another one. The whole message
      is copied, or nothing. So, if an error occurs, a rollback is performed.
      This function returns 1 on success and 0 on error.
    - htx_reserve_max_data() reserves the maximum possible size for an HTX data
      block, by extending an existing one or by creating a new one. It returns a
      compound result with the HTX block and the position where new data must be
      inserted (0 for a new block). If an error occurs or if there is no space
      left, NULL is returned instead of a pointer on an HTX block.
    - htx_find_offset() looks for the HTX block containing a specific offset,
      starting at the HTX message's head. The function returns the found HTX
      block and the position inside this block where the offset is. If the
      offset is outside of the HTX message, NULL is returned.
    - htx_defrag() defragments an HTX message. It removes unused blocks and
      unwraps the payloads part. A temporary buffer is used to do so. This
      function never fails. A referenced block may be provided. If so, the
      corresponding new block is returned. Otherwise, NULL is returned.