NEW_TOKEN frame is never emitted by a client, hence parsing was not
tested on frontend side.
On backend side, an issue can occur, as expected token length is static,
based on the token length used internally by haproxy. This is not
sufficient for most server implementation which uses larger token. This
causes a parsing error, which may cause skipping of following frames in
the same packet. This issue was detected using ngtcp2 as server.
As for now tokens are unused by haproxy, simply discard test on token
length during NEW_TOKEN frame parsing. The token itself is merely
skipped without being stored. This is sufficient for now to continue on
experimenting with QUIC backend implementation.
This does not need to be backported.
Report an error during server configuration if QUIC is used by SSL is
not activiated via 'ssl' keyword. This is done in _srv_parse_finalize(),
which is both used by static and dynamic servers.
Note that contrary to listeners, an error is reported instead of a
warning, and SSL is not automatically activated if missing. This is
mainly due to the complex server configuration : _srv_parse_finalize()
is ideal to affect every servers, including dynamic entries. However, it
is executed after server SSL context allocation performed via
<prepare_srv> XPRT operation. A proper fix would be to move SSL ctx
alloc in _srv_parse_finalize(), but this may have unknown impact. Thus,
for now a simpler solution has been chosen.
QUIC traces in ssl_quic_srv_new_ssl_ctx() are problematic as this
function is called early during startup. If activating traces via -dt
command-line argument, a crash occurs due to stderr sink not yet
available.
Thus, traces from ssl_quic_srv_new_ssl_ctx() are simply removed.
No backport needed.
This commit added a "err" C label reachable only with USE_QUIC_OPENSSL_COMPAT:
MINOR: quic-be: Missing callbacks initializations (USE_QUIC_OPENSSL_COMPAT)
leading coverity to warn this:
*** CID 1611481: Control flow issues (UNREACHABLE)
/src/quic_ssl.c: 802 in ssl_quic_srv_new_ssl_ctx()
796 goto err;
797 #endif
798
799 leave:
800 TRACE_LEAVE(QUIC_EV_CONN_NEW);
801 return ctx;
>>> CID 1611481: Control flow issues (UNREACHABLE)
>>> This code cannot be reached: "err:
SSL_CTX_free(ctx);".
802 err:
803 SSL_CTX_free(ctx);
804 ctx = NULL;
805 TRACE_DEVEL("leaving on error", QUIC_EV_CONN_NEW);
806 goto leave;
807 }
The less intrusive (without #ifdef) way to fix this it to add a "goto err"
statement from the code part which is reachable without USE_QUIC_OPENSSL_COMPAT.
Thank you to @chipitsine for having reported this issue in GH #3003.
This issue may occur when qc_new_conn() fails after having allocated
and attached <conn_cid> to its tree. This is the case when compiling
haproxy against WolfSSL for an unknown reason at this time. In this
case the <conn_cid> is freed by pool_head_quic_connection_id(), then
freed again by quic_conn_release().
This bug arrived with this commit:
MINOR: quic-be: QUIC connection allocation adaptation (qc_new_conn())
So, the aim of this patch is to free <conn_cid> only for QUIC backends
and if it is not attached to its tree. This is the case when <conn_id>
local variable passed with NULL value to qc_new_conn() is then intialized
to the same <conn_cid> value.
This patch should have come with this last commit for the last qc_new_conn()
modifications for QUIC backends:
MINOR: quic-be: get rid of ->li quic_conn member
qc_new_conn() must be passed NULL pointers for several variables as mentioned
by the comment. Some of these local variables are used to avoid too much
code modifications.
Implement transcoding of a HTX request into HTTP/0.9. This protocol is a
simplified version of HTTP. Request only supports GET method without any
header. As such, only a request line is written during snd_buf
operation.
Implement transcoding of a HTTP/0.9 response into a HTX message.
HTTP/0.9 is a really simple substract of HTTP spec. The response does
not have any status line and is contains only the payload body. Response
is finished when the underlying connection/stream is closed.
A status line is generated to be compliant with HTX. This is performed
on the first invokation of rcv_buf for the current stream. Status code
is set to 200. Payload body if present is then copied using
htx_add_data().
This commit is the second and final step to initiate QUIC MUX on the
backend side. On handshake completion, MUX is woken up just after its
creation. This step is necessary to notify the stream layer, via the QCS
instance pre-initialized on MUX init, so that the transfer can be
resumed.
This mode of operation is similar to TCP stack when TLS+ALPN are used,
which forces MUX initialization to be delayed after handshake
completion.
Adjust qmux_init() to handle frontend and backend sides differently.
Most notably, on backend side, the first bidirectional stream is created
preemptively. This step is necessary as MUX layer will be woken up just
after handshake completion.
Stream data layer is notified that data is expected when FIN is
received, which marks the end of the HTTP request. This prepares data
layer to be able to handle the expected HTTP response.
Thus, this step is only relevant on frontend side. On backend side, FIN
marks the end of the HTTP response. No further content is expected, thus
expect data should not be set in this case.
Note that se_expect_data() invokation via qcs_attach_sc() is not
protected. This is because this function will only be called during
request headers parsing which is performed on the frontend side.
Mux connection is flagged with new QC_CF_IS_BACK if used on the backend
side. For now the only change is during traces, to be able to
differentiate frontend and backend usage.
Complete document for rcv_buf/snd_buf operations. In particular, return
value is now explicitely defined. For H3 layer, associated functions
documentation is also extended.
Use conn_ctrl_init() on the connection when quic_connect_server()
succeeds. This is necessary so that the connection is considered as
completely initialized. Without this, connect operation will be call
again if connection is reused.
On backend side, multiplexer layer is initialized during
connect_server(). However, this step is not performed if ALPN is used,
as the negotiated protocol may be unknown. Multiplexer initialization is
delayed after TLS handshake completion.
There are still exceptions though that forces the MUX to be initialized
even if ALPN is used. One of them was if <mux_proto> server field was
already set at this stage, which is the case when an explicit proto is
selected on the server line configuration. Remove this condition so that
now MUX init is delayed with ALPN even if proto is forced.
The scope of this change should be minimal. In fact, the only impact
concerns server config with both proto and ALPN set, which is pretty
unlikely as it is contradictory.
The main objective of this patch is to prepare QUIC support on the
backend side. Indeed, QUIC proto will be forced on the server if a QUIC
address is used, similarly to bind configuration. However, we still want
to delay MUX initialization after QUIC handshake completion. This is
mandatory to know the selected application protocol, required during
QUIC MUX init.
Change wake callback behavior for QUIC MUX. This operation loops over
each QCS and notify their stream data layer on certain events via
internal helper qcc_wake_some_streams().
Previously, streams were notified only if an error occured on the
connection. Change this to notify streams data layer everytime wake
callback is used. This behavior is now identical to H2 MUX.
qcc_wake_some_streams() is also renamed to qcc_wake_streams(), as it
better reflect its true behavior.
This change should not have performance impact as wake mux ops should
not be called frequently. Note that qcc_wake_streams() can also be
called directly via qcc_io_process() to ensure a new error is correctly
propagated. As wake callback first uses qcc_io_process(), it will only
call qcc_wake_streams() if no error is present.
No known issue is associated with this commit. However, it could prevent
freezing transfer under certain condition. As such, it is considered as
a bug fix worthy of backporting.
This should be backported after a period of observation.
It was still failing on Ubuntu-24.04 with GCC+ASAN. So, instead of
understand the code path the compiler followed to report uninitialized
variables, let's init them now.
No backport needed.
commit 16eb0fab3 ("MAJOR: counters: dispatch counters over thread groups")
introduced a build regression on some compilers:
src/listener.c: In function 'listener_accept':
src/listener.c:1095:3: error: 'for' loop initial declarations are only allowed in C99 mode
for (int it = 0; it < global.nbtgroups; it++)
^
src/listener.c:1095:3: note: use option -std=c99 or -std=gnu99 to compile your code
src/listener.c:1101:4: error: 'for' loop initial declarations are only allowed in C99 mode
for (int it = 0; it < global.nbtgroups; it++) {
^
make: *** [src/listener.o] Error 1
make: *** Waiting for unfinished jobs....
Let's fix that.
No backport needed
In hlua_applet_tcp_recv_try() and hlua_applet_tcp_getline_yield(), GCC 14.2
reports warnings about 'blk2' variable that may be used uninitialized. It is
a bit strange because the code is pretty similar than before. But to make it
happy and to avoid bugs if the API change in future, 'blk2' is now used only
when its length is greater than 0.
No need to backport.
In hlua_applet_tcp_getline_yield(), the function may yield if there is no
data available. However we must take care to add a return statement just
after the call to hlua_yieldk(). I don't know the details of the LUA API,
but at least, this return statement fix a build error about uninitialized
variables that may be used.
It is a 3.3-specific issue. No backport needed.
On backend side, MUX is instantiated after QUIC handshake completion.
This step is performed via qc_ssl_provide_quic_data(). First, connection
flags for handshake completion are resetted. Then, MUX is instantiated
via conn_create_mux() function.
Force QUIC as <mux_proto> for server if a QUIC address is used. This is
similarly to what is already done for bind instances on the frontend
side. This step ensures that conn_create_mux() will select the proper
protocol.
Replace ->li quic_conn pointer to struct listener member by ->target which is
an object type enum and adapt the code.
Use __objt_(listener|server)() where the object type is known. Typically
this is were the code which is specific to one connection type (frontend/backend).
Remove <server> parameter passed to qc_new_conn(). It is redundant with the
<target> parameter.
GSO is not supported at this time for QUIC backend. qc_prep_pkts() is modified
to prevent it from building more than an MTU. This has as consequence to prevent
qc_send_ppkts() to use GSO.
ssl_clienthello.c code is run only by listeners. This is why __objt_listener()
is used in place of ->li.
Disable the code around SSL_get_peer_quic_transport_params() as this was done
for USE_QUIC_OPENSSL_COMPAT because SSL_get_peer_quic_transport_params() is not
defined by OpenSSL 3.5 QUIC API.
quic_tls_compat_keylog_callback() is the callback used by the QUIC OpenSSL
compatibility module to derive the TLS secrets from other secrets provided
by keylog. The <write> local variable to this function is initialized to denote
the direction (write to send, read to receive) the secret is supposed to be used
for. That said, as the QUIC cryptographic algorithms are symmetrical, the
direction is inversed between the peer: a secret which is used to write/send/cipher
data from a peer point of view is also the secret which is used to
read/receive/decipher data. This was confirmed by the fact that without this
patch, the TLS stack first provides the peer with Handshake to send/cipher
data. The client could not use such secret to decipher the Handshake packets
received from the server. This patch simply reverse the direction stored by
<write> variable to make the secrets derivation works for the QUIC client.
quic_tls_compat_init() function is called from OpenSSL QUIC compatibility module
(USE_QUIC_OPENSSL_COMPAT) to initialize the keylog callback and the callback
which stores the QUIC transport parameters as a TLS extensions into the stack.
These callbacks must also be initialized for QUIC backends.
This is done from TLS secrets derivation callback at Application level (the last
encryption level) calling SSL_get_peer_quic_transport_params() to have an access
to the TLS transport paremeters extension embedded into the Server Hello TLS message.
Then, quic_transport_params_store() is called to store a decoded version of
these transport parameters.
For connection to QUIC servers, this patch modifies the moment where the I/O
handler callback is switched to quic_conn_app_io_cb(). This is no more
done as for listener just after the handshake has completed but just after
it has been confirmed.
Discard the Initial packet number space as soon as possible. This is done
during handshakes in quic_conn_io_cb() as soon as an Handshake packet could
be successfully sent.
The initialization of <ssl_app_data_index> SSL user data index is required
to make all the SSL sessions to QUIC servers work as this is done for TCP
servers. The conn object notably retrieve for SSL callback which are
server specific (e.g. ssl_sess_new_srv_cb()).
Store the peer connection ID (SCID) as the connection DCID as soon as an Initial
packet is received.
Stop comparing the packet to QUIC_PACKET_TYPE_0RTT is already match as
QUIC_PACKET_TYPE_INITIAL.
A QUIC server must not send too short datagram with ack-eliciting packets inside.
This cannot be done from quic_rx_pkt_parse() because one does not know if
there is ack-eliciting frame into the Initial packets. If the packet must be
dropped, this is after having parsed it!
Modify quic_dgram_parse() to stop passing it a listener as third parameter.
In place the object type address of the connection socket owner is passed
to support the haproxy servers with QUIC as transport protocol.
qc_owner_obj_type() is implemented to return this address.
qc_counters() is also implemented to return the QUIC specific counters of
the proxy of owner of the connection.
quic_rx_pkt_parse() called by quic_dgram_parse() is also modify to use
the object type address used by this latter as last parameter. It is
also modified to send Retry packet only from listeners. A QUIC client
(connection to haproxy QUIC servers) must drop the Initial packets with
non null token length. It is also not supposed to receive O-RTT packets
which are dropped.
The QUIC datagram redispatch is there to counter the race condition which
exists only for QUIC connections to listener where datagrams may arrive
on the wrong socket between the bind() and connect() calls.
Run this code part only for listeners.
Allocate a connection to connect to QUIC servers from qc_conn_init() which is the
->init() QUIC xprt callback.
Also initialize ->prepare_srv and ->destroy_srv callback as this done for TCP
servers.
For haproxy QUIC servers (or QUIC clients), the peer is considered as validated.
This is a property which is more specific to QUIC servers (haproxy QUIC listeners).
No <odcid> is used for the QUIC client connection. It is used only on the QUIC server side.
The <token_odcid> is also not used on the QUIC client side. It must be embedded into
the transport parameters only on the QUIC server side.
The quic_conn is created before the socket allocation. So, the local address is
zeroed.
Initilize the transport parameter with qc_srv_params_init().
Stop hardcoding the <server> parameter passed value to qc_new_isecs() to correctly
initialize the Initial secrets.
Modify quic_connect_server() which is the ->connect() callback for QUIC protocol:
- add a BUG_ON() run when entering this funtion: the <fd> socket must equal -1
- conn->handle is a union. conn->handle.qc is use for QUIC connection,
conn->handle.fd must not be used to store the fd.
- code alignment fix for setsockopt(fd, SOL_SOCKET, (SO_SNDBUF|SO_RCVBUF))
statements
- remove the section of code which was duplicated from ->connect() TCP callback
- fd_insert() the new socket file decriptor created to connect to the QUIC
server with quic_conn_sock_fd_iocb() as callback for read event.
This patch only adds <proto_type> new proto_type enum parameter and <sock_type>
socket type parameter to sock_create_server_socket() and adapts its callers.
This is to prepare the use of this function by QUIC servers/backends.
Modify qc_alloc_ssl_sock_ctx() to pass the connection object as parameter. It is
NULL for a QUIC listener, not NULL for a QUIC server. This connection object is
set as value for ->conn quic_conn struct member. Initialise the SSL session object from
this function for QUIC servers.
qc_ssl_set_quic_transport_params() is also modified to pass the SSL object as parameter.
This is the unique parameter this function needs. <qc> parameter is used only for
the trace.
SSL_do_handshake() must be calle as soon as the SSL object is initialized for
the QUIC backend connection. This triggers the TLS CRYPTO data delivery.
tasklet_wakeup() is also called to send asap these CRYPTO data.
Modify the QUIC_EV_CONN_NEW event trace to dump the potential errors returned by
SSL_do_handshake().
Implement ssl_sock_new_ssl_ctx() to allocate a SSL server context as this is currently
done for TCP servers and also for QUIC servers depending on the <is_quic> boolean value
passed as new parameter. For QUIC servers, this function calls ssl_quic_srv_new_ssl_ctx()
which is specific to QUIC.
From connect_server(), QUIC protocol could not be retreived by protocol_lookup()
because of the PROTO_TYPE_STREAM default passed as argument. In place to support
QUIC srv->addr_type.proto_type may be safely passed.
The QUIC servers xprts have already been set at server line parsing time.
This patch prevents the QUIC servers xprts to be reset to <ssl_sock> value which is
the value used for SSL/TCP connections.
Add ->quic_params new member to server struct.
Also set the ->xprt member of the server being initialized and initialize asap its
transport parameters from _srv_parse_init().
This XPRT callback is called from check_config_validity() after the configuration
has been parsed to initialize all the SSL server contexts.
This patch implements the same thing for the QUIC servers.
Add a little check to verify that the version chosen by the server matches
with the client one. Initiliazes local transport parameters ->negotiated_version
value with this version if this is the case. If not, return 0;
According to the RFC, a QUIC client must encode the QUIC version it supports
into the "Available Versions" of "Version Information" transport parameter
order by descending preference.
This is done defining <quic_version_2> and <quic_version_draft_29> new variables
pointers to the corresponding version of <quic_versions> array elements.
A client announces its available versions as follows: v1, v2, draft29.
Activate QUIC protocol support for MUX-QUIC on the backend side,
additionally to current frontend support. This change is mandatory to be
able to implement QUIC on the backend side.
Without this modification, it is impossible to activate explicitely QUIC
protocol on a server line, hence an error is reported :
config : proxy 'xxxx' : MUX protocol 'quic' is not usable for server 'yyyy'
Mark QUIC address support for servers as experimental on the backend
side. Previously, it was allowed but wouldn't function as expected. As
QUIC backend support requires several changes, it is better to declare
it as experimental first.
QUIC is not implemented on the backend side. To prevent any issue, it is
better to reject any server configured which uses it. This is done via
_srv_parse_init() which is used both for static and dynamic servers.
This should be backported up to all stable versions.
Released version 3.3-dev1 with the following main changes :
- BUILD: tools: properly define ha_dump_backtrace() to avoid a build warning
- DOC: config: Fix a typo in 2.7 (Name format for maps and ACLs)
- REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+ (5)
- REGTESTS: Remove REQUIRE_VERSION=2.3 from all tests
- REGTESTS: Remove REQUIRE_VERSION=2.4 from all tests
- REGTESTS: Remove tests with REQUIRE_VERSION_BELOW=2.4
- REGTESTS: Remove support for REQUIRE_VERSION and REQUIRE_VERSION_BELOW
- MINOR: server: group postinit server tasks under _srv_postparse()
- MINOR: stats: add stat_col flags
- MINOR: stats: add ME_NEW_COMMON() helper
- MINOR: proxy: collect per-capability stat in proxy_cond_disable()
- MINOR: proxy: add a true list containing all proxies
- MINOR: log: only run postcheck_log_backend() checks on backend
- MEDIUM: proxy: use global proxy list for REGISTER_POST_PROXY_CHECK() hook
- MEDIUM: server: automatically add server to proxy list in new_server()
- MEDIUM: server: add and use srv_init() function
- BUG/MAJOR: leastconn: Protect tree_elt with the lbprm lock
- BUG/MEDIUM: check: Requeue healthchecks on I/O events to handle check timeout
- CLEANUP: applet: Update comment for applet_put* functions
- DEBUG: check: Add the healthcheck's expiration date in the trace messags
- BUG/MINOR: mux-spop: Fix null-pointer deref on SPOP stream allocation failure
- CLEANUP: sink: remove useless cleanup in sink_new_from_logger()
- MAJOR: counters: add shared counters base infrastructure
- MINOR: counters: add shared counters helpers to get and drop shared pointers
- MINOR: counters: add common struct and flags to {fe,be}_counters_shared
- MEDIUM: counters: manage shared counters using dedicated helpers
- CLEANUP: counters: merge some common counters between {fe,be}_counters_shared
- MINOR: counters: add local-only internal rates to compute some maxes
- MAJOR: counters: dispatch counters over thread groups
- BUG/MEDIUM: cli: Properly parse empty lines and avoid crashed
- BUG/MINOR: config: emit warning for empty args only in discovery mode
- BUG/MINOR: config: fix arg number reported on empty arg warning
- BUG/MINOR: quic: Missing SSL session object freeing
- MINOR: applet: Add API functions to manipulate input and output buffers
- MINOR: applet: Add API functions to get data from the input buffer
- CLEANUP: applet: Simplify a bit comments for applet_put* functions
- MEDIUM: hlua: Update TCP applet functions to use the new applet API
- BUG/MEDIUM: fd: Use the provided tgid in fd_insert() to get tgroup_info
- BUG/MINIR: h1: Fix doc of 'accept-unsafe-...-request' about URI parsing
The description of tests performed on the URI in H1 when
'accept-unsafe-violations-in-http-request' option is wrong. It states that
only characters below 32 and 127 are blocked when this option is set,
suggesting that otherwise, when it is not set, all invalid characters in the
URI, according to the RFC3986, are blocked.
But in fact, it is not true. By default all character below 32 and above 127
are blocked. And when 'accept-unsafe-violations-in-http-request' option is
set, characters above 127 (excluded) are accepted. But characters in
(33..126) are never checked, independently of this option.
This patch should fix the issue #2906. It should be backported as far as
3.0. For older versions, the docuementation could also be clarified because
this part is not really clear.
Note the request URI validation is still under discution because invalid
characters in (33.126) are never checked and some users request a stricter
parsing.
In fd_insert(), use the provided tgid to ghet the thread group info,
instead of using the one of the current thread, as we may call
fd_insert() from a thread of another thread group, that will happen at
least when binding the listeners. Otherwise we'd end up accessing the
thread mask containing enabled thread of the wrong thread group, which
can lead to crashes if we're binding on threads not present in the
thread group.
This should fix Github issue #2991.
This should be backported up to 2.8.
The functions responsible to extract data from the applet input buffer or to
push data into the applet output buffer are now relying on the newly added
functions in the applet API. This simplifies a bit the code.
There was already functions to pushed data from the applet to the stream by
inserting them in the right buffer, depending the applet was using or not
the legacy API. Here, functions to retreive data pushed to the applet by the
stream were added:
* applet_getchar : Gets one character
* applet_getblk : Copies a full block of data
* applet_getword : Copies one text block representing a word using a
custom separator as delimiter
* applet_getline : Copies one text line
* applet_getblk_nc : Get one or two blocks of data
* applet_getword_nc: Gets one or two blocks of text representing a word
using a custom separator as delimiter
* applet_getline_nc: Gets one or two blocks of text representing a line
In this patch, some functions were added to ease input and output buffers
manipulation, regardless the corresponding applet is using its own buffers
or it is relying on channels buffers. Following functions were added:
* applet_get_inbuf : Get the buffer containing data pushed to the applet
by the stream
* applet_get_outbuf : Get the buffer containing data pushed by the applet
to the stream
* applet_input_data : Return the amount of data in the input buffer
* applet_skip_input : Skips <len> bytes from the input buffer
* applet_reset_input: Skips all bytes from the input buffer
* applet_output_room: Returns the amout of space available at the output
buffer
* applet_need_room : Indicates that the applet have more data to deliver
and it needs more room in the output buffer to do
so
qc_alloc_ssl_sock_ctx() allocates an SSL_CTX object for each connection. It also
allocates an SSL object. When this function failed, it freed only the SSL_CTX object.
The correct way to free both of them is to call qc_free_ssl_sock_ctx().
Must be backported as far as 2.6.
If an empty argument is used in configuration, for example due to an
undefined environment variable, the rest of the line is not parsed. As
such, a warning is emitted to report this.
The warning was not totally correct as it reported the wrong argument
index. Fix this by this patch. Note that there is still an issue with
the "^" indicator, but this is not as easy to fix yet.
This is related to github issue #2995.
This should be backported up to 3.2.
Hide warning about empty argument outside of discovery mode. This is
necessary, else the message will be displayed twice, which hampers
haproxy output lisibility.
This should fix github isue #2995.
This should be backported up to 3.2.
Empty lines was not properly parsed and could lead to crashes because the
last argument was parsed outside of the cmdline buffer. Indeed, the last
argument is parsed to look for an eventual payload pattern. It is started
one character after the newline at the end of the command line. But it is
only valid for an non-empty command line.
So, now, this case is properly detected when we leave if an empty line is
detected.
This patch must be backported to 3.2.
Most fe and be counters are good candidates for being shared between
processes. They are now grouped inside "shared" struct sub member under
be_counters and fe_counters.
Now they are properly identified, they would greatly benefit from being
shared over thread groups to reduce the cost of atomic operations when
updating them. For this, we take the current tgid into account so each
thread group only updates its own counters. For this to work, it is
mandatory that the "shared" member from {fe,be}_counters is initialized
AFTER global.nbtgroups is known, because each shared counter causes the stat
to be allocated lobal.nbtgroups times. When updating a counter without
concurrency, the first counter from the array may be updated.
To consult the shared counters (which requires aggregation of per-tgid
individual counters), some helper functions were added to counter.h to
ease code maintenance and avoid computing errors.
cps_max (max new connections received per second), sps_max (max new
sessions per second) and http.rps_max (maximum new http requests per
second) all rely on shared counters (namely conn_per_sec, sess_per_sec and
http.req_per_sec). The problem is that shared counters are about to be
distributed over thread groups, and we cannot afford to compute the
total (for all thread groups) each time we update the max counters.
Instead, since such max counters (relying on shared counters) are a very
few exceptions, let's add internal (sess,conn,req) per sec freq counters
that are dedicated to cps_max, sps_max and http.rps_max computing.
Thanks to that, related *_max counters shouldn't be negatively impacted
by the thread-group distribution, yet they will not benefit from it
either. Related internal freq counters are prefixed with "_" to emphasize
the fact that they should not be used for other purpose (the shared ones,
which are about to be distributed over thread groups in upcoming commits
are still available and must be used instead). The internal ones could
eventually be removed at any time if we find another way to compute the
{cps,sps,http.rps)_max counters.
Now that we have a common struct between fe and be shared counters struct
let's perform some cleanup to merge duplicate members into the common
struct part. This will ease code maintenance.
proxies, listeners and server shared counters are now managed via helpers
added in one of the previous commits.
When guid is not set (ie: when not yet assigned), shared counters pointer
is allocated using calloc() (local memory) and a flag is set on the shared
counters struct to know how to manipulate (and free it). Else if guid is
set, then it means that the counters may be shared so while for now we
don't actually use a shared memory location the API is ready for that.
The way it works, for proxies and servers (for which guid is not known
during creation), we first call counters_{fe,be}_shared_get with guid not
set, which results in local pointer being retrieved (as if we just
manually called calloc() to retrieve a pointer). Later (during postparsing)
if guid is set we try to upgrade the pointer from local to shared.
Lastly, since the memory location for some objects (proxies and servers
counters) may change from creation to postparsing, let's update
counters->last_change member directly under counters_{fe,be}_shared_get()
so we don't miss it.
No change of behavior is expected, this is only preparation work.
fe_counters_shared and be_counters_shared may share some common members
since they are quite similar, so we add a common struct part shared
between the two. struct counters_shared is added for convenience as
a generic pointer to manipulate common members from fe or be shared
counters pointer.
Also, the first common member is added: shared fe and be counters now
have a flags member.
create include/haproxy/counters.h and src/counters.c files to anticipate
for further helpers as some counters specific tasks needs to be carried
out and since counters are shared between multiple object types (ie:
listener, proxy, server..) we need generic helpers.
Add some shared counters helper which are not yet used but will be updated
in upcoming commits.
Shareable counters are not tagged as shared counters and are dynamically
allocated in separate memory area as a prerequisite for being stored
in shared memory area. For now, GUID and threads groups are not taken into
account, this is only a first step.
also we ensure all counters are now manipulated using atomic operations,
namely, "last_change" counter is now read from and written to using atomic
ops.
Despite the numerous changes caused by the counters being moved away from
counters struct, no change of behavior should be expected.
As reported by Ilya in GH #2994, some cleanup parts in
sink_new_from_logger() function are not used.
We can actually simplify the cleanup logic to remove dead code, let's
do that by renaming "error_final" label to "error" and only making use
of the "error" label, because sink_free() already takes care of proper
cleanup for all sink members.
When we try to allocate a new SPOP stream, if an error is encountered,
spop_strm_destroy() is called to released the eventually allocated
stream. But, it must only be called if a stream was allocated. If the
reported error is an SPOP stream allocation failure, we must just leave to
avoid null-pointer dereference.
This patch should fix point 1 of the issue #2993. It must be backported as
far as 3.1.
These functions were copied from the channel API and modified to work with
applets using the new API or the legacy one. However, the comments were
updated accordingly. It is the purpose of this patch.
When a healthchecks is processed, once the first wakeup passed to start the
check, and as long as the expiration timer is not reached, only I/O events
are able to wake it up. It is an issue when there is a check timeout
defined. Especially if the connect timeout is high and the check timeout is
low. In that case, the healthcheck's task is never requeue to handle any
timeout update. When the connection is established, the check timeout is set
to replace the connect timeout. It is thus possible to report a success
while a timeout should be reported.
So, now, when an I/O event is handled, the healthcheck is requeue, except if
an success or an abort is reported.
Thanks to Thierry Fournier for report and the reproducer.
This patch must be backported to all stable versions.
In fwlc_srv_reposition(), set the server's tree_elt while we still hold
the lbprm read lock. While it was protected from concurrent
fwlc_srv_reposition() calls by the server's lb_lock, it was not from
dequeuing/requeuing that could occur if the server gets down/up or its
weight is changed, and that would lead to inconsistencies, and the
watchdog killing the process because it is stuck in an infinite loop in
fwlc_get_next_server().
This hopefully fixes github issue #2990.
This should be backported to 3.2.
rename _srv_postparse() internal function to srv_init() function and group
srv_init_per_thr() plus idle conns list init inside it. This way we can
perform some simplifications as srv_init() performs multiple server
init steps after parsing.
SRV_F_CHECKED flag was added, it is automatically set when srv_init()
runs successfully. If the flag is already set and srv_init() is called
again, nothing is done. This permis to manually call srv_init() earlier
than the default POST_CHECK hook when needed without risking to do things
twice.
while new_server() takes the parent proxy as argument and even assigns
srv->proxy to the parent proxy, it didn't actually inserted the server
to the parent proxy server list on success.
The result is that sometimes we add the server to the list after
new_server() is called, and sometimes we don't.
This is really error-prone and because of that hooks such as
REGISTER_POST_SERVER_CHECK() which as run for all servers listed in
all proxies may not be relied upon for servers which are not actually
inserted in their parent proxy server list. Plus it feels very strange
to have a server that points to a proxy, but then the proxy doesn't know
about it because it cannot find it in its server list.
To prevent errors and make proxy->srv list reliable, we move the insertion
logic directly under new_server(). This requires to know if we are called
during parsing or during runtime to either insert or append the server to
the parent proxy list. For that we use PR_FL_CHECKED flag from the parent
proxy (if the flag is set, then the proxy was checked so we are past the
init phase, thus we assume we are called during runtime)
This implies that during startup if new_server() has to be cancelled on
error paths we need to call srv_detach() (which is now exposed in server.h)
before srv_drop().
The consequence of this commit is that REGISTER_POST_SERVER_CHECK() should
not run reliably on all servers created using new_server() (without having
to manually loop on global servers_list)
REGISTER_POST_PROXY_CHECK() used to iterate over "main" proxies to run
registered callbacks. This means hidden proxies (and their servers) did
not get a chance to get post-checked and could cause issues if some post-
checks are expected to be executed on all proxies no matter their type.
Instead we now rely on the global proxies list. Another side effect is that
the REGISTER_POST_SERVER_CHECK() now runs as well for servers from proxies
that are not part of the main proxies list.
postcheck_log_backend() checks are executed no matter if the proxy
actually has the backend capability while the checks actually depend
on this.
Let's fix that by adding an extra condition to ensure that the BE
capability is set.
This issue is not tagged as a bug because for now it remains impossible
to have a syslog proxy without BE capability in the main proxy list, but
this may change in the future.
We have global proxies_list pointer which is announced as the list of
"all existing proxies", but in fact it only represents regular proxies
declared on the config file through "listen, frontend or backend" keywords
It is ambiguous, and we currently don't have a straightforwrd method to
iterate over all proxies (either public or internal ones) within haproxy
Instead we still have to manually iterate over multiple lists (main
proxies, log-forward proxies, peer proxies..) which is error-prone.
In this patch we add a struct list member (8 bytes) inside struct proxy
in order to store every proxy (except default ones) within a global
"proxies" list which is actually representative for all proxies existing
under haproxy process, like we already have for servers.
proxy_cond_disable() collects and prints cumulated connections for be and
fe proxies no matter their type. With shared stats it may cause issues
because depending on the proxy capabilities only fe or be counters may
be allocated.
In this patch we add some checks to ensure we only try to read from
valid memory locations, else we rely on default values (0).
init_srv_requeue() and init_srv_slowstart() functions are called after
initial server parsing via REGISTER_POST_SERVER_CHECK() hook, and they
are also manually called for dynamic server after the server is
initialized.
This may conflict with _srv_postparse() which is also registered via
REGISTER_POST_SERVER_CHECK() and called during dynamic server creation
To ensure functions don't conflict with each other, let's ensure they
are executed in proper order by calling init_srv_requeue and
init_srv_slowstart() from _srv_postparse() which now becomes the parent
function for server related postparsing stuff. No change of behavior is
expected.
This is no longer used since the migration to the native `haproxy -cc
'version_atleast(X)'` functionality.
see 8727614dc4046e91997ecce421bcb6a5537cac93
see 5efc48dcf1b133dd415c759e83b21d52dc303786
Introduced in:
25bcdb1d9 BUG/MAJOR: h1: Be stricter on request target validation during message parsing
see also:
fbbbc33df REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+
In resolve_sym_name() we declare a few symbols that we want to be able
to resolve. ha_dump_backtrace() was declared with a struct buffer instead
of a pointer to such a struct, which has no effect since we only want to
get the function's pointer, but produces a build warning with LTO, so
let's fix it.
This can be backported to 3.0.
Released version 3.2.0 with the following main changes :
- MINOR: promex: Add agent check status/code/duration metrics
- MINOR: ssl: support strict-sni in ssl-default-bind-options
- MINOR: ssl: also provide the "tls-tickets" bind option
- MINOR: server: define CLI I/O handler for "add server"
- MINOR: server: implement "add server help"
- MINOR: server: use stress mode for "add server help"
- BUG/MEDIUM: server: fix crash after duplicate GUID insertion
- BUG/MEDIUM: server: fix potential null-deref after previous fix
- MINOR: config: list recently added sections with -dKcfg
- BUG/MAJOR: cache: Crash because of wrong cache entry deleted
- DOC: configuration: fix the example in crt-store
- DOC: config: clarify the wording around single/double quotes
- DOC: config: clarify the legacy cookie and header captures
- DOC: config: fix alphabetical ordering of layer 7 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 6 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 5 sample fetch functions
- DOC: config: fix alphabetical ordering of layer 4 sample fetch functions
- DOC: config: fix alphabetical ordering of internal sample fetch functions
- BUG/MINOR: h3: Set HTX flags corresponding to the scheme found in the request
- BUG/MEDIUM: h3: Declare absolute URI as normalized when a :authority is found
- DOC: config: mention in bytes_in and bytes_out that they're read on input
- DOC: config: clarify the basics of ACLs (call point, multi-valued etc)
- REGTESTS: Make the script testing conditional set-var compatible with Vtest2
- REGTESTS: Explicitly allow failing shell commands in some scripts
- MINOR: listeners: Add support for a label on bind line
- BUG/MEDIUM: cli/ring: Properly handle shutdown in "show event" I/O handler
- BUG/MEDIUM: hlua: Properly detect shudowns for TCP applets based on the new API
- BUG/MEDIUM: hlua: Fix getline() for TCP applets to work with applet's buffers
- BUG/MEDIUM: hlua: Fix receive API for TCP applets to properly handle shutdowns
- CI: vtest: Rely on VTest2 to run regression tests
- CI: vtest: Fix the build script to properly work on MaOS
- CI: combine AWS-LC and AWS-LC-FIPS by template
- BUG/MEDIUM: httpclient: Throw an error if an lua httpclient instance is reused
- DOC: hlua: Add a note to warn user about httpclient object reuse
- DOC: hlua: fix a few typos in HTTPMessage.set_body_len() documentation
- DEV: patchbot: prepare for new version 3.3-dev
- MINOR: version: mention that it's 3.2 LTS now.
A few typos were noticed while gathering info for the 3.2 announce
messages, this fixes them, and will probably constitute the last
commit of this release. There's no need to backport it unless commit
94055a5e7 ("MEDIUM: hlua: Add function to change the body length of
an HTTP Message") is backported.
It is not supported to reuse an lua httpclient instance to process several
requests. A new object must be created for each request. Thanks to the
previous patch ("BUG/MEDIUM: httpclient: Throw an error if an lua httpclient
instance is reused"), an error is now reported if this happens. But it is
not obvious for users. So the lua-api docuementation was updated accordingly.
This patch is related to issue #2986. It should be backported with the
commit above.
It is not expected/supported to reuse an httpclient instance to process
several requests. A new instance must be created for each request. However,
in lua, there is nothing to prevent a user to create an httpclient object
and use it in a loop to process requests.
That's unfortunate because this will apparently work, the requests will be
sent and a response will be received and processed. However internally some
ressources will be allocated and never released. When the next response is
processed, the ressources allocated for the previous one are definitively
lost.
In this patch we take care to check that the httpclient object was never
used when a request is sent from a lua script by checking
HTTPCLIENT_FS_STARTED flags. This flag is set when a httpclient applet is
spawned to process a request and never removed after that. In lua, the
httpclient applet is created when the request is sent. So, it is the right
place to do this test.
This patch should fix the issue #2986. It should be backported as far as
2.6.
VTest2 (https://github.com/vtest/VTest2) was released and is a remplacement
for VTest. VTest was archived. So let's use the new version now.
If this commit is backported, the 2 following commits must also be
backported:
* 2808e3577 ("REGTESTS: Explicitly allow failing shell commands in some scripts")
* 82c291124 ("REGTESTS: Make the script testing conditional set-var compatible with Vtest2")
An optional timeout was added to AppletTCP.receive() to interrupt calls after a
delay. It was mandatory to be able to implement interactive applets (like
trisdemo). However, this broke the API and it made impossible to differentiate
the shutdowns from the delays expirations. Indeed, in both cases, an empty
string was returned.
Because historically an empty string was used to notify a connection shutdown,
it should not be changed. So now, 'nil' value is returned when no data was
available before the delay expiration.
The new AppletTCP:try_receive() function was also affected. To fix it, instead
of stating there is no delay when a receive is tried, an expired delay is
set. Concretely TICK_ETERNITY was replaced by now_ms.
Finally, AppletTCP:getline() function is not concerned for now because there
is no way to interrupt it after some delay.
The documentation and trisdemo lua script were updated accordingly.
This patch depends on "BUG/MEDIUM: hlua: Properly detect shudowns for TCP
applets based on the new API". However, it is a 3.2-specific issue, so no
backport is needed.
The commit e5e36ce09 ("BUG/MEDIUM: hlua/cli: Fix lua CLI commands to work
with applet's buffers") fixed the TCP applets API to work with applets using
its own buffers. Howver the getline() function was not updated. It could be
an issue for anyone registering a CLI commands reading lines.
This patch should be backported as far as 3.0.
The internal function responsible to receive data for TCP applets with
internal buffers is buggy. Indeed, for these applets, the buffer API is used
to get data. So there is no tests on the SE to properly detect connection
shutdowns. So, it must be performed by hand after the call to b_getblk_nc().
This patch must be backported as far as 3.0.
The commit 03dc54d802 ("BUG/MINOR: ring: Fix I/O handler of "show event"
command to not rely on the SC") introduced a regression. By removing
dependencies on the SC, a test to detect client shutdowns was removed. So
now, the CLI applet is no longer released when the client shut the
connection during a "show event -w".
So of course, we should not use the SC to detect the shutdowns. But the SE
must be used insteead.
It is a 3.2-specific issue, so no backport needed.
It is now possile to set a label on a bind line. All sockets attached to
this bind line inherits from this label. The idea is to be able to groud of
sockets. For now, there is no mechanism to create these groups, this must be
done by hand.
Vtest2, that should replaced Vtest in few months, will reject any failing
commands in shell blocks. However, some scripts are executing some commands,
expecting an error to be able to parse the error output. So, now use "set
+e" in those scripts to explicitly state failing commads are expected.
It is just used for non-final commands. At the end, the shell block must
still report a success.
VTest2 will replaced VTest in few months. There is not so much change
expected. One of them is that a User-Agent header is added by default in all
requests, except if an custom one is already set or if "-nouseragent" option
is used. To still be compatible with VTest, it is not possible to use the
option to avoid the header addition. So, a custom user-agent is added in the
last test of "sample_fetches/cond_set_var.vtc" to be sure it will pass with
Vtest and Vtest2. It is mandatory because the request length is tested.
This is essentially in order to address the concerns expressed in
issue #2226 where it is mentioned that the moment they are called is
not clear enough. Admittedly, re-reading the paragraph doesn't make
it obvious on a quick read that they behave like functions. This patch
adds an extra paragraph that makes the parallel with programming
languages' boolean functions and explains the fact that they can be
multi-valued. Hoping this is clearer now.
Issue #2267 suggests that it's unclear what exactly the byte counts mean
(particularly when compression is involved). Let's clarify that the counts
are read on data input and that they also cover headers and a bit of
internal overhead.
Since commit 2c3d656f8 ("MEDIUM: h3: use absolute URI form with
:authority"), the absolute URI form is used when a ':authority'
pseudo-header is found. However, this URI was not declared as normalized
internally. So, when the request is reformated to be sent to an h1 server,
the absolute-form is used instead of the origin-form. It is unexpected and
may be an issue for some servers that could reject the request.
So, now, we take care to set HTX_SL_F_HAS_AUTHORITY flag on the HTX message
when an authority was found and HTX_SL_F_NORMALIZED_URI flag is set for
"http" or "https" schemes.
No backport needed because the commit above must not be backported. It
should fix a regression reported on the 3.2-dev17 in issue #2977.
This commit depends on "BUG/MINOR: h3: Set HTX flags corresponding to the
scheme found in the request".
When a ":scheme" pseudo-header is found in a h3 request, the
HTX_SL_F_HAS_SCHM flag must be set on the HTX message. And if the scheme is
'http' or 'https', the corresponding HTX flag must also be set. So,
respectively, HTX_SL_F_SCHM_HTTP or HTX_SL_F_SCHM_HTTPS.
It is mainly used to send the right ":scheme" pseudo-header value to H2
server on backend side.
This patch could be backported as far as 2.6.
As reported in issue #2195, cookie captures and header captures are no
longer the recommended way to proceed. Let's mention that this is the
legacy way and provide a few pointers to the recommended functions and
actions to use the modern methods.
As reported in issue #2327, the wording used in the section about quoting
can be read two ways due to the use of the two types of quotes to protect
each other quote. Better only use the quoting without mixing the two when
mentioning them.
Fix a bad example in the crt-store section. site1 does not use the "web"
crt-store but the global one.
Must be backported as far as 3.0 however the section was 3.12 in
previous version.
When "vary" is enabled, we can have multiple entries for a given primary
key in the cache tree. There is a limit to how many secondary entries
can be inserted for a given key. When we try to insert a new secondary
entry, if the limit is already reached, we can try to find expired
entries with the same primary key, and if the limit is still reached we
want to abort the current insertion and to remove the node that was just
inserted.
In commit "a29b073: MEDIUM: cache: Add refcount on cache_entry" though,
a regression was introduced. Instead of removing the entry just inserted
as the comments suggested, we removed the second to last entry and
returned NULL. We then reset the eb.key of the cache_entry in the caller
because we assumed that the entry was already removed from the tree.
This means that some entries with an empty key were wrongly kept in the
tree and the last secondary entry, which keeps the number of secondary
entries of a given key was removed.
This ended up causing some crashes later on when we tried to iterate
over the elements of this given key. The crash could occur in multiple
places, either when trying to retrieve an entry or to add some new ones.
This crash was raised in GitHub issue #2950.
The fix should be backported up to 3.0.
A valid build warning was reported in the CI with latest commit b40ce97ecc
("BUG/MEDIUM: server: fix crash after duplicate GUID insertion"). Indeed,
if the first test in the function fails, we branch to the err label
with guid==NULL and will crash there. Let's just test guid before
dereferencing it for freeing.
This needs to be backported to 3.0 as well since the commit above was
meant to go there.
On "add server", if a GUID is defined, guid_insert() is used to add the
entry into the global GUID tree. If a similar entry already exists, GUID
insertion fails and the server creation is eventually aborted.
A crash could occur in this case because of an invalid memory access via
guid_remove(). The latter is caused via free_server() as the server
insertion is rejected. The invalid occurs on GUID key.
The issue occurs because of guid_insert(). The function properly
deallocates the GUID key on duplicate insertion, but it failed to reset
<guid.node.key> to NULL. This caused the invalid memory access on
guid_remove(). To fix this, ensure that key member is properly resetted
on guid_insert() error path.
This must be backported up to 3.0.
Implement stress mode on "add server help". This ensures that the
command is fully reentrant on full output buffer.
For testing, it requires compilation with USE_STRESS and global setting
"stress-level 1".
Implement "help" as a sub-command for "add server" CLI. The objective is
to list all the keywords that are supported for dynamic servers. CLI IO
handler and add_srv_ctx are used to support reentrancy on full output
buffer.
Now that this command is implemented, the outdated keyword list on "add
server" from management documentation can be removed.
Extend "add server" to support an IO handler function named
cli_io_handler_add_server(). A context object is also defined whose
usage will depend on IO handler capabilities.
IO handler is skipped when "add server" is run in default mode, i.e. on
a dynamic server creation. Thus, currently IO handler is unneeded.
However, it will become useful to support sub-commands for "add server".
Note that return value of "add server" parser has been changed on server
creation success. Previously, it was used incorrectly to report if
server was inserted or not. In fact, parser return value is used by CLI
generic code to detect if command processing has been completed, or
should continue to the IO handler. Now, "add server" always returns 1 to
signal that CLI processing is completed. This is necessary to preserve
CLI output emitted by parser, even now that IO handler is defined for
the command. Previously, output was emitted in every situations due to
IO handler not defined. See below code snippet from cli.c for a better
overview :
if (kw->parse && kw->parse(args, payload, appctx, kw->private) != 0) {
ret = 1;
goto fail;
}
/* kw->parse could set its own io_handler or io_release handler */
if (!appctx->cli_ctx.io_handler) {
ret = 1;
goto fail;
}
appctx->st0 = CLI_ST_CALLBACK;
ret = 1;
goto end;
Currently there is "no-tls-tickets" that is also supported in the
ssl-default-bind-options directive, but there's no way to re-enable
them on a specific "bind" line. This patch simply provides the option
to re-enable them. Note that the flag is inverted because tickets are
enabled by default and the no-tls-ticket option sets the flag to
disable them.
Several users already reported that it would be nice to support
strict-sni in ssl-default-bind-options. However, in order to support
it, we also need an option to disable it.
This patch moves the setting of the option from the strict_sni field
to a flag in the ssl_options field so that it can be inherited from
the default bind options, and adds a new "no-strict-sni" directive to
allow to disable it on a specific "bind" line.
The test file "del_ssl_crt-list.vtc" which already tests both options
was updated to make use of the default option and the no- variant to
confirm everything continues to work.
In the Prometheus exporter, the last health check status is already exposed,
with its code and duration in seconds. The server status is also exposed.
But the information about the agent check are not available. It is not
really handy because when a server status is changed because of the agent,
it is not obvious by looking to the Prometheus metrics. Indeed, the server
may reported as DOWN for instance, while the health check status still
reports a success. Being able to get the agent status in that case could be
valuable.
So now, the last agent check status is exposed, with its code and duration
in seconds. Following metrics can be grabbe now:
* haproxy_server_agent_status
* haproxy_server_agent_code
* haproxy_server_agent_duration_seconds
Note that unlike the other metrics, no per-backend aggregated metric is
exposed.
This patch is related to issue #2983.
Released version 3.2-dev17 with the following main changes :
- DOC: configuration: explicit multi-choice on bind shards option
- BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers
- BUG/MEDIUM: peers: also limit the number of incoming updates
- MEDIUM: hlua: Add function to change the body length of an HTTP Message
- BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload
- BUG/MINOR: h3: don't insert more than one Host header
- BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field
- DOC: config: properly index "table and "stick-table" in their section
- DOC: management: change reference to configuration manual
- BUILD: debug: mark ha_crash_now() as attribute(noreturn)
- IMPORT: slz: avoid multiple shifts on 64-bits
- IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested
- IMPORT: slz: use a better hash for machines with a fast multiply
- IMPORT: slz: fix header used for empty zlib message
- IMPORT: slz: silence a build warning on non-x86 non-arm
- BUG/MAJOR: leastconn: do not loop forever when facing saturated servers
- BUG/MAJOR: queue: properly keep count of the queue length
- BUG/MINOR: quic: fix crash on quic_conn alloc failure
- BUG/MAJOR: leastconn: never reuse the node after dropping the lock
- MINOR: acme: renewal notification over the dpapi sink
- CLEANUP: quic: Useless BIO_METHOD initialization
- MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures
- MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed)
- MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API
- MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
- MINOR: quic: OpenSSL 3.5 trick to support 0-RTT
- DOC: update INSTALL for QUIC with OpenSSL 3.5 usages
- DOC: management: update 'acme status'
- BUG/MEDIUM: wdt: always ignore the first watchdog wakeup
- CLEANUP: wdt: clarify the comments on the common exit path
- BUILD: ssl: avoid possible printf format warning in traces
- BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t
- DOC: management: precise some of the fields of "show servers conn"
- BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error
- DOC: watchdog: update the doc to reflect the recent changes
- BUG/MEDIUM: acme: check if acme domains are configured
- BUG/MINOR: acme: fix formatting issue in error and logs
- EXAMPLES: lua: avoid screen refresh effect in "trisdemo"
- CLEANUP: quic: remove unused cbuf module
- MINOR: quic: move function to check stream type in utils
- MINOR: quic: refactor handling of streams after MUX release
- MINOR: quic: add some missing includes
- MINOR: quic: adjust quic_conn-t.h include list
- CLEANUP: cfgparse: alphabetically sort the global keywords
- MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"
It was mentioned during the development of glitches that it would be
nice to support not killing misbehaving connections below a certain
CPU usage so that poor implementations that routinely misbehave without
impact are not killed. This is now possible by setting a CPU usage
threshold under which we don't kill them via this parameter. It defaults
to zero so that we continue to kill them by default.
Adjust include list in quic_conn-t.h. This file is included in many QUIC
source, so it is useful to keep as lightweight as possible. Note that
connection/QUIC MUX are transformed into forward declaration for better
layer separation.
Insert some missing includes statement in QUIC source files. This was
detected after the next commit which adjust the include list used in
quic_conn-t.h file.
quic-conn layer has to handle itself STREAM frames after MUX release. If
the stream was already seen, it is probably only a retransmitted frame
which can be safely ignored. For other streams, an active closure may be
needed.
Thus it's necessary that quic-conn layer knows the highest stream ID
already handled by the MUX after its release. Previously, this was done
via <nb_streams> member array in quic-conn structure.
Refactor this by replacing <nb_streams> by two members called
<stream_max_uni>/<stream_max_bidi>. Indeed, it is unnecessary for
quic-conn layer to monitor locally opened uni streams, as the peer
cannot by definition emit a STREAM frame on it. Also, bidirectional
streams are always opened by the remote side.
Previously, <nb_streams> were set by quic-stream layer. Now,
<stream_max_uni>/<stream_max_bidi> members are only set one time, just
prior to QUIC MUX release. This is sufficient as quic-conn do not use
them if the MUX is available.
Note that previously, IDs were used relatively to their type, thus
incremented by 1, after shifting the original value. For simplification,
use the plain stream ID, which is incremented by 4.
Move general function to check if a stream is uni or bidirectional from
QUIC MUX to quic_utils module. This should prevent unnecessary include
of QUIC MUX header file in other sources.
In current version of the game, there is a "screen refresh" effect: the
screen is cleared before being re-drawn.
I moved the clear right after the connection is opened and removed it
from rendering time.
Stop emitting \n in errmsg for intermediate error messages, this was
emitting multiline logs and was returning to a new line in the middle of
sentences.
We don't need to emit them in acme_start_task() since the errmsg is
ouput in a send_log which already contains a \n or on the CLI which
also emits it.
When starting the ACME task with a ckch_conf which does not contain the
domains, the ACME task would segfault because it will try to dereference
a NULL in this case.
The patch fix the issue by emitting a warning when no domains are
configured. It's not done at configuration parsing because it is not
easy to emit the warning because there are is no callback system which
give access to the whole ckch_conf once a line is parsed.
No backport needed.
RX buffer allocation has been reworked in current dev tree. The
objective is to support multiple buffers per QCS to improve upload
throughput.
RX buffer allocation failure is handled simply : the whole connection is
closed. This is done via qcc_set_error(), with INTERNAL_ERROR as error
code. This function contains a BUG_ON() to ensure it is called only one
time per connection instance.
On RX buffer alloc failure, the aformentioned BUG_ON() crashes due to a
double invokation of qcc_set_error(). First by qcs_get_rxbuf(), and
immediately after it by qcc_recv(), which is the caller of the previous
one. This regression was introduced by the following commit.
60f64449fbba7bb6e351e8343741bb3c960a2e6d
MAJOR: mux-quic: support multiple QCS RX buffers
To fix this, simply remove qcc_set_error() invocation in
qcs_get_rxbuf(). On buffer alloc failture, qcc_recv() is responsible to
set the error.
This does not need to be backported.
As reported in issue #2970, the output of "show servers conn" is not
clear. It was essentially meant as a debugging tool during some changes
to idle connections management, but if some users want to monitor or
graph them, more info is needed. The doc mentions the currently known
list of fields, and reminds that this output is not meant to be stable
over time, but as long as it does not change, it can provide some useful
metrics to some users.
The build failed on mips32 with a 64-bit time_t here:
https://github.com/haproxy/haproxy/actions/runs/15150389164/job/42595310111
Let's just turn the "remain" variable used to show the remaining time
into a more portable ullong and use %llu for all format specifiers,
since long remains limited to 32-bit on 32-bit archs.
No backport needed.
When building on MIPS-32 with gcc-9.5 and glibc-2.31, I got this:
src/ssl_trace.c: In function 'ssl_trace':
src/ssl_trace.c:118:42: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'ssize_t' {aka 'const int'} [-Wformat=]
118 | chunk_appendf(&trace_buf, " : size=%ld", *size);
| ~~^ ~~~~~
| | |
| | ssize_t {aka const int}
| long int
| %d
Let's just cast the type. No backport needed.
With commit a06c215f08 ("MEDIUM: wdt: always make the faulty thread
report its own warnings"), when the TH_FL_STUCK flag was flipped on,
we'd then go to the panic code instead of giving a second chance like
before the commit. This can trigger rare cases that only happen with
moderate loads like was addressed by commit 24ce001771 ("BUG/MEDIUM:
wdt: fix the stuck detection for warnings"). This is in fact due to
the loss of the common "goto update_and_leave" that used to serve
both the warning code and the flag setting for probation, and it's
apparently what hit Christian in issue #2980.
Let's make sure we exit naturally when turning the bit on for the
first time. Let's also update the confusing comment at the end of
the check that was left over by latest change.
Since the first commit was backported to 3.1, this commit should be
backported there as well.
For an unidentified reason, SSL_do_hanshake() succeeds at its first call when 0-RTT
is enabled for the connection. This behavior looks very similar by the one encountered
by AWS-LC stack. That said, it was documented by AWS-LC. This issue leads the
connection to stop sending handshake packets after having release the handshake
encryption level. In fact, no handshake packets could even been sent leading
the handshake to always fail.
To fix this, this patch simulates a "handshake in progress" state waiting
for the application level read secret to be established by the TLS stack.
This may happen only after the QUIC listener has completed/confirmed the handshake
upon handshake CRYPTO data receipt from the peer.
A QUIC must sent its transport parameter using a TLS custom extention. This
extension is reset by SSL_set_SSL_CTX(). It can be restored calling
quic_ssl_set_tls_cbs() (which calls SSL_set_quic_tls_cbs()).
The quic_conn struct is modified for two reasons. The first one is to store
the encoded version of the local tranport parameter as this is done for
USE_QUIC_OPENSSL_COMPAT. Indeed, the local transport parameter "should remain
valid until after the parameters have been sent" as mentionned by
SSL_set_quic_tls_cbs(3) manual. In our case, the buffer is a static buffer
attached to the quic_conn object. qc_ssl_set_quic_transport_params() function
whose role is to call SSL_set_tls_quic_transport_params() (aliased by
SSL_set_quic_transport_params() to set these local tranport parameter into
the TLS stack from the buffer attached to the quic_conn struct.
The second quic_conn struct modification is the addition of the new ->prot_level
(SSL protection level) member added to the quic_conn struct to store "the most
recent write encryption level set via the OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn
callback (if it has been called)" as mentionned by SSL_set_quic_tls_cbs(3) manual.
This patches finally implements the five remaining callacks to make the haproxy
QUIC implementation work.
OSSL_FUNC_SSL_QUIC_TLS_crypto_send_fn() (ha_quic_ossl_crypto_send) is easy to
implement. It calls ha_quic_add_handshake_data() after having converted
qc->prot_level TLS protection level value to the correct ssl_encryption_level_t
(boringSSL API/quictls) value.
OSSL_FUNC_SSL_QUIC_TLS_crypto_recv_rcd_fn() (ha_quic_ossl_crypto_recv_rcd())
provide the non-contiguous addresses to the TLS stack, without releasing
them.
OSSL_FUNC_SSL_QUIC_TLS_crypto_release_rcd_fn() (ha_quic_ossl_crypto_release_rcd())
release these non-contiguous buffer relying on the fact that the list of
encryption level (qc->qel_list) is correctly ordered by SSL protection level
secret establishements order (by the TLS stack).
OSSL_FUNC_SSL_QUIC_TLS_yield_secret_fn() (ha_quic_ossl_got_transport_params())
is a simple wrapping function over ha_quic_set_encryption_secrets() which is used
by boringSSL/quictls API.
OSSL_FUNC_SSL_QUIC_TLS_got_transport_params_fn() (ha_quic_ossl_got_transport_params())
role is to store the peer received transport parameters. It simply calls
quic_transport_params_store() and set them into the TLS stack calling
qc_ssl_set_quic_transport_params().
Also add some comments for all the OpenSSL 3.5 QUIC API callbacks.
This patch have no impact on the other use of QUIC API provided by the others TLS
stacks.
This patch allows the use of the new OpenSSL 3.5.0 QUIC TLS API when it is
available and detected at compilation time. The detection relies on the presence of the
OSSL_FUNC_SSL_QUIC_TLS_CRYPTO_SEND macro from openssl-compat.h. Indeed this
macro is defined by OpenSSL since 3.5.0 version. It is not defined by quictls.
This helps in distinguishing these two TLS stacks. When the detection succeeds,
HAVE_OPENSSL_QUIC is also defined by openssl-compat.h. Then, this is this new macro
which is used to detect the availability of the new OpenSSL 3.5.0 QUIC TLS API.
Note that this detection is done only if USE_QUIC_OPENSSL_COMPAT is not asked.
So, USE_QUIC_OPENSSL_COMPAT and HAVE_OPENSSL_QUIC are exclusive.
At the same location, from openssl-compat.h, ssl_encryption_level_t enum is
defined. This enum was defined by quictls and expansively used by the haproxy
QUIC implementation. SSL_set_quic_transport_params() is replaced by
SSL_set_quic_tls_transport_params. SSL_set_quic_early_data_enabled() (quictls) is also replaced
by SSL_set_quic_tls_early_data_enabled() (OpenSSL). SSL_quic_read_level() (quictls)
is not defined by OpenSSL. It is only used by the traces to log the current
TLS stack decryption level (read). A macro makes it return -1 which is an
usused values.
The most of the differences between quictls and OpenSSL QUI APIs are in quic_ssl.c
where some callbacks must be defined for these two APIs. This is why this
patch modifies quic_ssl.c to define an array of OSSL_DISPATCH structs: <ha_quic_dispatch>.
Each element of this arry defines a callback. So, this patch implements these
six callabcks:
- ha_quic_ossl_crypto_send()
- ha_quic_ossl_crypto_recv_rcd()
- ha_quic_ossl_crypto_release_rcd()
- ha_quic_ossl_yield_secret()
- ha_quic_ossl_got_transport_params() and
- ha_quic_ossl_alert().
But at this time, these implementations which must return an int return 0 interpreted
as a failure by the OpenSSL QUIC API, except for ha_quic_ossl_alert() which
is implemented the same was as for quictls. The five remaining functions above
will be implemented by the next patches to come.
ha_quic_set_encryption_secrets() and ha_quic_add_handshake_data() have been moved
to be defined for both quictls and OpenSSL QUIC API.
These callbacks are attached to the SSL objects (sessions) calling qc_ssl_set_cbs()
new function. This latter callback the correct function to attached the correct
callbacks to the SSL objects (defined by <ha_quic_method> for quictls, and
<ha_quic_dispatch> for OpenSSL).
The calls to SSL_provide_quic_data() and SSL_process_quic_post_handshake()
have been also disabled. These functions are not defined by OpenSSL QUIC API.
At this time, the functions which call them are still defined when HAVE_OPENSSL_QUIC
is defined.
There were no traces to diagnose qc_ssl_sess_init() failures from QUIC traces.
This patch add calls to TRACE_DEVEL() into qc_ssl_sess_init() and its caller
(qc_alloc_ssl_sock_ctx()). This was useful at least to diagnose SSL context
initialization failures when porting QUIC to the new OpenSSL 3.5 QUIC API.
Should be easily backported as far as 2.6.
This code is there from QUIC implementation start. It was supposed to
initialize <ha_quic_meth> as a BIO_METHOD static object. But this
BIO_METHOD is not used at all!
Should be backported as far as 2.6 to help integrate the next patches to come.
2025-05-20 15:00:06 +02:00
167 changed files with 4303 additions and 2622 deletions
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.