[ruby/prism] Fix use of uninitialized value when parsing regexp

Parsing the regexp /\A{/ uses an uninitialized value because prism
tries to parse the curly bracket as the start of a range quantifier, so
it reads the character after the curly bracket. That read is past the
end of the pattern, and it is uninitialized because prism strings are
not null-terminated. This can be seen in the Valgrind output:

    ==834710== Conditional jump or move depends on uninitialised value(s)
    ==834710==    at 0x5DA010: pm_regexp_parse_range_quantifier (regexp.c:163)
    ==834710==    by 0x5DA010: pm_regexp_parse_quantifier (regexp.c:243)
    ==834710==    by 0x5DAD69: pm_regexp_parse_expression (regexp.c:738)
    ==834710==    by 0x5DAD69: pm_regexp_parse_pattern (regexp.c:761)
    ==834710==    by 0x5DAD69: pm_regexp_parse (regexp.c:773)
    ==834710==    by 0x5A2EE7: parse_regular_expression_named_captures (prism.c:20886)
    ==834710==    by 0x5A2EE7: parse_expression_infix (prism.c:21388)
    ==834710==    by 0x5A5FA5: parse_expression (prism.c:21804)
    ==834710==    by 0x5A64F3: parse_statements (prism.c:13858)
    ==834710==    by 0x5A9730: parse_program (prism.c:22011)
    ==834710==    by 0x576F0D: parse_input_success_p (extension.c:1062)
    ==834710==    by 0x576F0D: parse_success_p (extension.c:1084)

This commit adds checks for the end of the string to
pm_regexp_parse_range_quantifier.
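
As a rough illustration of the check described above, here is a minimal, self-contained sketch of a cursor/end parser that tests for the end of the buffer before every read. It is not prism's code: the `sketch_*` names, the simplified quantifier grammar, and the boolean "was a quantifier consumed" return value are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser over a length-delimited buffer. */
typedef struct {
    const char *cursor; /* next byte to read */
    const char *end;    /* one past the last valid byte */
} sketch_parser_t;

/*
 * Try to consume the body of a range quantifier (digits, at most one comma,
 * then '}'), starting just after the opening '{'. Returns true if one was
 * consumed. Otherwise the cursor is restored to the savepoint so the caller
 * can treat the '{' as a literal character.
 */
static bool sketch_parse_range_quantifier(sketch_parser_t *parser) {
    const char *savepoint = parser->cursor;
    bool seen_digit = false;
    bool seen_comma = false;

    /* The bounds check comes before every dereference of the cursor. */
    while (parser->cursor < parser->end) {
        char c = *parser->cursor;

        if (c >= '0' && c <= '9') {
            seen_digit = true;
            parser->cursor++;
        } else if (c == ',' && !seen_comma) {
            seen_comma = true;
            parser->cursor++;
        } else if (c == '}' && seen_digit) {
            parser->cursor++;
            return true;
        } else {
            break; /* not a quantifier after all */
        }
    }

    /* Ran out of input or hit an unexpected byte: rewind. */
    parser->cursor = savepoint;
    return false;
}

/* Helper for experimenting: how many bytes after '{' were consumed? */
static size_t sketch_consumed(const char *bytes, size_t length) {
    assert(bytes != NULL);
    sketch_parser_t parser = { bytes, bytes + length };
    const char *start = parser.cursor;
    sketch_parse_range_quantifier(&parser);
    return (size_t) (parser.cursor - start);
}
```

With this shape, an unclosed bracket at the end of the pattern (zero bytes left after the bracket) rewinds without touching memory at or beyond `end`, which is the situation the Valgrind trace reports.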

https://github.com/ruby/prism/commit/be6cbc23ef
Peter Zhu 2024-11-11 17:03:41 -05:00 committed by git
parent fee706d9dd
commit eca3680c27
2 changed files with 9 additions and 0 deletions

@@ -158,6 +158,11 @@ pm_regexp_parse_range_quantifier(pm_regexp_parser_t *parser) {
     } state = PM_REGEXP_RANGE_QUANTIFIER_STATE_START;
 
     while (1) {
+        if (parser->cursor >= parser->end) {
+            parser->cursor = savepoint;
+            return true;
+        }
+
         switch (state) {
             case PM_REGEXP_RANGE_QUANTIFIER_STATE_START:
                 switch (*parser->cursor) {

@@ -186,6 +186,10 @@ module Prism
       assert_valid_regexp("foo{1, 2}")
     end
 
+    def test_fake_range_quantifier_because_unclosed
+      assert_valid_regexp("\\A{")
+    end
+
     ############################################################################
     # These test that flag values are correct.
     ############################################################################