MINOR: http: add a function to validate characters of :authority
As discussed here: https://github.com/httpwg/http2-spec/pull/936 https://github.com/haproxy/haproxy/issues/2941 It's important to take care of some special characters in the :authority pseudo header before reassembling a complete URI, because after assembly it's too late (e.g. the '/'). This patch adds a specific function which was checks all such characters and their ranges on an ist, and benefits from modern compilers optimizations that arrange the comparisons into an evaluation tree for faster match. That's the version that gave the most consistent performance across various compilers, though some hand-crafted versions using bitmaps stored in register could be slightly faster but super sensitive to code ordering, suggesting that the results might vary with future compilers. This one takes on average 1.2ns per character at 3 GHz (3.6 cycles per char on avg). The resulting impact on H2 request processing time (small requests) was measured around 0.3%, from 6.60 to 6.618us per request, which is a bit high but remains acceptable given that the test only focused on req rate. The code was made usable both for H2 and H3.
This commit is contained in:
parent
c40d6ac840
commit
ebab479cdf
@ -232,6 +232,52 @@ static inline int http_path_has_forbidden_char(const struct ist ist, const char
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Checks whether the :authority pseudo header contains dangerous chars that
|
||||
* might affect its reassembly. We want to catch anything below 0x21, above
|
||||
* 0x7e, as well as '@', '[', ']', '/','?', '#', '\', CR, LF, NUL. Then we
|
||||
* fall back to the slow path and decide. Brackets are used for IP-literal and
|
||||
* deserve special case, that is better handled in the slow path. The function
|
||||
* returns 0 if no forbidden char is presnet, non-zero otherwise.
|
||||
*/
|
||||
static inline int http_authority_has_forbidden_char(const struct ist ist)
|
||||
{
|
||||
size_t ofs, len = istlen(ist);
|
||||
const char *p = istptr(ist);
|
||||
int brackets = 0;
|
||||
uchar c;
|
||||
|
||||
/* Many attempts with various methods have shown that moderately recent
|
||||
* compilers (gcc >= 9, clang >= 13) will arrange the code below as an
|
||||
* evaluation tree that remains efficient at -O2 and above (~1.2ns per
|
||||
* char). The immediate next efficient one is the bitmap from 64-bit
|
||||
* registers but it's extremely sensitive to code arrangements and
|
||||
* optimization.
|
||||
*/
|
||||
for (ofs = 0; ofs < len; ofs++) {
|
||||
c = p[ofs];
|
||||
|
||||
if (unlikely(c < 0x21 || c > 0x7e ||
|
||||
c == '#' || c == '/' || c == '?' || c == '@' ||
|
||||
c == '[' || c == '\\' || c == ']')) {
|
||||
/* all of them must be rejected, except '[' which may
|
||||
* only appear at the beginning, and ']' which may
|
||||
* only appear at the end or before a colon.
|
||||
*/
|
||||
if ((c == '[' && ofs == 0) ||
|
||||
(c == ']' && (ofs == len - 1 || p[ofs + 1] == ':'))) {
|
||||
/* that's an IP-literal (see RFC3986#3.2), it's
|
||||
* OK for now.
|
||||
*/
|
||||
brackets ^= 1;
|
||||
} else {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
/* there must be no opening bracket left nor lone closing one */
|
||||
return brackets;
|
||||
}
|
||||
|
||||
/* Checks status code array <array> for the presence of status code <status>.
|
||||
* Returns non-zero if the code is present, zero otherwise. Any status code is
|
||||
* permitted.
|
||||
|
Loading…
x
Reference in New Issue
Block a user