Copy encoding flags when copying a regex [Bug #20039]

* 🐛 Fixes [Bug #20039](https://bugs.ruby-lang.org/issues/20039)

When a Regexp is initialized with another Regexp, we simply copy the
properties from the original. However, the encoding flags on the
original were not being copied. This caused an issue when the original
contained multibyte characters and was matched against a US-ASCII
string: without the fixed-encoding flag (`KCODE_FIXED`) carried over to
the new Regexp, the match would raise instead of returning a result.
See the included test for an example.
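
The scenario above can be sketched in plain Ruby. This is a minimal illustration rather than part of the patch: `try_match` is a helper made up for this sketch, and it rescues `StandardError` broadly because the exact exception class raised on an unpatched build is not stated here. On a build with the fix, both regexps should report a fixed encoding and neither match should raise.

```ruby
# A regexp whose source contains multibyte characters; Ruby marks it as
# fixed-encoding (KCODE_FIXED at the C level).
original = Regexp.new("\u2018hello\u2019".encode("UTF-8"))

# Initializing a Regexp from another Regexp goes through the copy path.
copy = Regexp.new(original)

ascii = "".encode("US-ASCII")

# Helper for this sketch only: return :error instead of letting the match raise.
def try_match(str, re)
  str.match?(re)
rescue StandardError
  :error
end

puts "original fixed_encoding?: #{original.fixed_encoding?}"
puts "copy     fixed_encoding?: #{copy.fixed_encoding?}"
puts "original match: #{try_match(ascii, original).inspect}"
puts "copy     match: #{try_match(ascii, copy).inspect}"
```

If the copy reports `fixed_encoding?` as false or its match prints `:error`, the interpreter most likely predates this change.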

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
Dustin Brown 2023-12-06 19:25:29 -08:00 committed by GitHub
parent 1ace218690
commit d89280e8bf
2 changed files with 12 additions and 0 deletions

re.c

@@ -3853,6 +3853,8 @@ reg_copy(VALUE copy, VALUE orig)
     RB_OBJ_WRITE(copy, &RREGEXP(copy)->src, RREGEXP(orig)->src);
     RREGEXP_PTR(copy)->timelimit = RREGEXP_PTR(orig)->timelimit;
     rb_enc_copy(copy, orig);
+    FL_SET_RAW(copy, FL_TEST_RAW(orig, KCODE_FIXED|REG_ENCODING_NONE));
     return copy;
 }
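
The added line in `reg_copy` reads two flag bits from `orig` and sets the same bits on `copy`. The masking pattern can be sketched in Ruby with plain integers. The bit positions below are hypothetical and chosen only for illustration, `OTHER_FLAG` and `copy_encoding_flags` are names invented for this sketch, and the real flag values live in Ruby's C sources.

```ruby
# Hypothetical bit positions, for illustration only.
KCODE_FIXED       = 1 << 0
REG_ENCODING_NONE = 1 << 1
OTHER_FLAG        = 1 << 2 # stands in for any flag that must not be copied

ENCODING_FLAG_MASK = KCODE_FIXED | REG_ENCODING_NONE

# Same shape as FL_SET_RAW(copy, FL_TEST_RAW(orig, mask)):
# keep only the masked bits of orig, then OR them into copy.
def copy_encoding_flags(copy_flags, orig_flags)
  copy_flags | (orig_flags & ENCODING_FLAG_MASK)
end

orig_flags = KCODE_FIXED | OTHER_FLAG
copy_flags = copy_encoding_flags(0, orig_flags)

puts "KCODE_FIXED copied: #{(copy_flags & KCODE_FIXED) != 0}"
puts "OTHER_FLAG copied:  #{(copy_flags & OTHER_FLAG) != 0}"
```

Masking before the OR means only the two encoding-related bits travel with the copy, and any other bits set on the original are left behind.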


@@ -1936,6 +1936,16 @@ class TestRegexp < Test::Unit::TestCase
     assert_equal("123456789".match(/(?:x?\dx?){2,}/)[0], "123456789")
   end
+  def test_encoding_flags_are_preserved_when_initialized_with_another_regexp
+    re = Regexp.new("\u2018hello\u2019".encode("UTF-8"))
+    str = "".encode("US-ASCII")
+    assert_nothing_raised do
+      str.match?(re)
+      str.match?(Regexp.new(re))
+    end
+  end
   def test_bug_19537 # [Bug #19537]
     str = 'aac'
     re = '^([ab]{1,3})(a?)*$'