Merge branch 'merge-pcre' into 10.0
commit
f8736063de
@ -4,6 +4,53 @@ ChangeLog for PCRE
|
||||
Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
|
||||
development is happening in the PCRE2 10.xx series.
|
||||
|
||||
Version 8.41 05-July-2017
|
||||
-------------------------
|
||||
|
||||
1. Fixed typo in CMakeLists.txt (wrong number of arguments for
|
||||
PCRE_STATIC_RUNTIME (affects MSVC only).
|
||||
|
||||
2. Issue 1 for 8.40 below was not correctly fixed. If pcregrep in multiline
|
||||
mode with --only-matching matched several lines, it restarted scanning at the
|
||||
next line instead of moving on to the end of the matched string, which can be
|
||||
several lines after the start.
|
||||
|
||||
3. Fix a missing else in the JIT compiler reported by 'idaifish'.
|
||||
|
||||
4. A (?# style comment is now ignored between a basic quantifier and a
|
||||
following '+' or '?' (example: /X+(?#comment)?Y/.
|
||||
|
||||
5. Avoid use of a potentially overflowing buffer in pcregrep (patch by Petr
|
||||
Pisar).
|
||||
|
||||
6. Fuzzers have reported issues in pcretest. These are NOT serious (it is,
|
||||
after all, just a test program). However, to stop the reports, some easy ones
|
||||
are fixed:
|
||||
|
||||
(a) Check for values < 256 when calling isprint() in pcretest.
|
||||
(b) Give an error for too big a number after \O.
|
||||
|
||||
7. In the 32-bit library in non-UTF mode, an attempt to find a Unicode
|
||||
property for a character with a code point greater than 0x10ffff (the Unicode
|
||||
maximum) caused a crash.
|
||||
|
||||
8. The alternative matching function, pcre_dfa_exec() misbehaved if it
|
||||
encountered a character class with a possessive repeat, for example [a-f]{3}+.
|
||||
|
||||
9. When pcretest called pcre_copy_substring() in 32-bit mode, it set the buffer
|
||||
length incorrectly, which could result in buffer overflow.
|
||||
|
||||
10. Remove redundant line of code (accidentally left in ages ago).
|
||||
|
||||
11. Applied C++ patch from Irfan Adilovic to guard 'using std::' directives
|
||||
with namespace pcrecpp (Bugzilla #2084).
|
||||
|
||||
12. Remove a duplication typo in pcre_tables.c.
|
||||
|
||||
13. Fix returned offsets from regexec() when REG_STARTEND is used with a
|
||||
starting offset greater than zero.
|
||||
|
||||
|
||||
Version 8.40 11-January-2017
|
||||
----------------------------
|
||||
|
||||
|
@ -1,6 +1,12 @@
|
||||
News about PCRE releases
|
||||
------------------------
|
||||
|
||||
Release 8.41 13-June-2017
|
||||
-------------------------
|
||||
|
||||
This is a bug-fix release.
|
||||
|
||||
|
||||
Release 8.40 11-January-2017
|
||||
----------------------------
|
||||
|
||||
|
@ -9,18 +9,18 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
|
||||
dnl be defined as -RC2, for example. For real releases, it should be empty.
|
||||
|
||||
m4_define(pcre_major, [8])
|
||||
m4_define(pcre_minor, [40])
|
||||
m4_define(pcre_minor, [41])
|
||||
m4_define(pcre_prerelease, [])
|
||||
m4_define(pcre_date, [2017-01-11])
|
||||
m4_define(pcre_date, [2017-07-05])
|
||||
|
||||
# NOTE: The CMakeLists.txt file searches for the above variables in the first
|
||||
# 50 lines of this file. Please update that if the variables above are moved.
|
||||
|
||||
# Libtool shared library interface versions (current:revision:age)
|
||||
m4_define(libpcre_version, [3:8:2])
|
||||
m4_define(libpcre16_version, [2:8:2])
|
||||
m4_define(libpcre32_version, [0:8:0])
|
||||
m4_define(libpcreposix_version, [0:4:0])
|
||||
m4_define(libpcre_version, [3:9:2])
|
||||
m4_define(libpcre16_version, [2:9:2])
|
||||
m4_define(libpcre32_version, [0:9:0])
|
||||
m4_define(libpcreposix_version, [0:5:0])
|
||||
m4_define(libpcrecpp_version, [0:1:0])
|
||||
|
||||
AC_PREREQ(2.57)
|
||||
|
@ -79,9 +79,12 @@ API that is JIT-specific.
|
||||
</P>
|
||||
<P>
|
||||
If your program may sometimes be linked with versions of PCRE that are older
|
||||
than 8.20, but you want to use JIT when it is available, you can test
|
||||
the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
|
||||
as PCRE_CONFIG_JIT, for compile-time control of your code.
|
||||
than 8.20, but you want to use JIT when it is available, you can test the
|
||||
values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such as
|
||||
PCRE_CONFIG_JIT, for compile-time control of your code. Also beware that the
|
||||
<b>pcre_jit_exec()</b> function was not available at all before 8.32,
|
||||
and may not be available at all if PCRE isn't compiled with
|
||||
--enable-jit. See the "JIT FAST PATH API" section below for details.
|
||||
</P>
|
||||
<br><a name="SEC4" href="#TOC1">SIMPLE USE OF JIT</a><br>
|
||||
<P>
|
||||
@ -119,6 +122,20 @@ when you call <b>pcre_study()</b>:
|
||||
PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE
|
||||
PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
|
||||
</pre>
|
||||
If using <b>pcre_jit_exec()</b> and supporting a pre-8.32 version of
|
||||
PCRE, you can insert:
|
||||
<pre>
|
||||
#if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
|
||||
pcre_jit_exec(...);
|
||||
#else
|
||||
pcre_exec(...)
|
||||
#endif
|
||||
</pre>
|
||||
but as described in the "JIT FAST PATH API" section below this assumes
|
||||
version 8.32 and later are compiled with --enable-jit, which may
|
||||
break.
|
||||
<br>
|
||||
<br>
|
||||
The JIT compiler generates different optimized code for each of the three
|
||||
modes (normal, soft partial, hard partial). When <b>pcre_exec()</b> is called,
|
||||
the appropriate code is run if it is available. Otherwise, the pattern is
|
||||
@ -428,6 +445,36 @@ fast path, and if invalid data is passed, the result is undefined.
|
||||
Bypassing the sanity checks and the <b>pcre_exec()</b> wrapping can give
|
||||
speedups of more than 10%.
|
||||
</P>
|
||||
<P>
|
||||
Note that the <b>pcre_jit_exec()</b> function is not available in versions of
|
||||
PCRE before 8.32 (released in November 2012). If you need to support versions
|
||||
that old you must either use the slower <b>pcre_exec()</b>, or switch between
|
||||
the two codepaths by checking the values of PCRE_MAJOR and PCRE_MINOR.
|
||||
</P>
|
||||
<P>
|
||||
Due to an unfortunate implementation oversight, even in versions 8.32
|
||||
and later there will be no <b>pcre_jit_exec()</b> stub function defined
|
||||
when PCRE is compiled with --disable-jit, which is the default, and
|
||||
there's no way to detect whether PCRE was compiled with --enable-jit
|
||||
via a macro.
|
||||
</P>
|
||||
<P>
|
||||
If you need to support versions older than 8.32, or versions that may
|
||||
not build with --enable-jit, you must either use the slower
|
||||
<b>pcre_exec()</b>, or switch between the two codepaths by checking the
|
||||
values of PCRE_MAJOR and PCRE_MINOR.
|
||||
</P>
|
||||
<P>
|
||||
Switching between the two by checking the version assumes that all the
|
||||
versions being targeted are built with --enable-jit. To also support
|
||||
builds that may use --disable-jit either <b>pcre_exec()</b> must be
|
||||
used, or a compile-time check for JIT via <b>pcre_config()</b> (which
|
||||
assumes the runtime environment will be the same), or as the Git
|
||||
project decided to do, simply assume that <b>pcre_jit_exec()</b> is
|
||||
present in 8.32 or later unless a compile-time flag is provided, see
|
||||
the "grep: un-break building with PCRE >= 8.32 without --enable-jit"
|
||||
commit in git.git for an example of that.
|
||||
</P>
|
||||
<br><a name="SEC12" href="#TOC1">SEE ALSO</a><br>
|
||||
<P>
|
||||
<b>pcreapi</b>(3)
|
||||
@ -443,9 +490,9 @@ Cambridge CB2 3QH, England.
|
||||
</P>
|
||||
<br><a name="SEC14" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 17 March 2013
|
||||
Last updated: 05 July 2017
|
||||
<br>
|
||||
Copyright © 1997-2013 University of Cambridge.
|
||||
Copyright © 1997-2017 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE index page</a>.
|
||||
|
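Note on the version guard quoted in the pcrejit text above: the documented test `#if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32` checks the two numbers independently, so it would evaluate to false for a hypothetical 9.0. The following is a minimal sketch of a lexicographic comparison instead. The names `version_at_least` and `VERSION_AT_LEAST` are illustrative assumptions and not part of the PCRE API. As the documentation itself points out, a version test alone still cannot reveal whether the library was built with --enable-jit.

```c
#include <assert.h>

/* Hypothetical helper (not a PCRE function): returns nonzero when the
   version major.minor is at least want_major.want_minor. */
int version_at_least(int major, int minor, int want_major, int want_minor)
{
  assert(major >= 0 && minor >= 0);
  return major > want_major ||
         (major == want_major && minor >= want_minor);
}

/* The same comparison as a macro, so it can appear in an #if together
   with PCRE_MAJOR and PCRE_MINOR from pcre.h, for example:
     #if VERSION_AT_LEAST(PCRE_MAJOR, PCRE_MINOR, 8, 32)            */
#define VERSION_AT_LEAST(major, minor, want_major, want_minor) \
  ((major) > (want_major) || \
   ((major) == (want_major) && (minor) >= (want_minor)))
```

With this form, 8.31 is rejected, 8.32 and 8.41 are accepted, and any later major version is accepted regardless of its minor number.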
@ -74,6 +74,11 @@ newline as data characters. However, in some Windows environments character 26
|
||||
maximum portability, therefore, it is safest to use only ASCII characters in
|
||||
<b>pcretest</b> input files.
|
||||
</P>
|
||||
<P>
|
||||
The input is processed using using C's string functions, so must not
|
||||
contain binary zeroes, even though in Unix-like environments, <b>fgets()</b>
|
||||
treats any bytes other than newline as data characters.
|
||||
</P>
|
||||
<br><a name="SEC3" href="#TOC1">PCRE's 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
|
||||
<P>
|
||||
From release 8.30, two separate PCRE libraries can be built. The original one
|
||||
@ -1149,9 +1154,9 @@ Cambridge CB2 3QH, England.
|
||||
</P>
|
||||
<br><a name="SEC17" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 09 February 2014
|
||||
Last updated: 23 February 2017
|
||||
<br>
|
||||
Copyright © 1997-2014 University of Cambridge.
|
||||
Copyright © 1997-2017 University of Cambridge.
|
||||
<br>
|
||||
<p>
|
||||
Return to the <a href="index.html">PCRE index page</a>.
|
||||
|
@ -8366,6 +8366,10 @@ AVAILABILITY OF JIT SUPPORT
|
||||
older than 8.20, but you want to use JIT when it is available, you can
|
||||
test the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT
|
||||
macro such as PCRE_CONFIG_JIT, for compile-time control of your code.
|
||||
Also beware that the pcre_jit_exec() function was not available at all
|
||||
before 8.32, and may not be available at all if PCRE isn't compiled
|
||||
with --enable-jit. See the "JIT FAST PATH API" section below for
|
||||
details.
|
||||
|
||||
|
||||
SIMPLE USE OF JIT
|
||||
@ -8407,6 +8411,18 @@ SIMPLE USE OF JIT
|
||||
PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE
|
||||
PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
|
||||
|
||||
If using pcre_jit_exec() and supporting a pre-8.32 version of PCRE, you
|
||||
can insert:
|
||||
|
||||
#if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
|
||||
pcre_jit_exec(...);
|
||||
#else
|
||||
pcre_exec(...)
|
||||
#endif
|
||||
|
||||
but as described in the "JIT FAST PATH API" section below this assumes
|
||||
version 8.32 and later are compiled with --enable-jit, which may break.
|
||||
|
||||
The JIT compiler generates different optimized code for each of the
|
||||
three modes (normal, soft partial, hard partial). When pcre_exec() is
|
||||
called, the appropriate code is run if it is available. Otherwise, the
|
||||
@ -8696,6 +8712,33 @@ JIT FAST PATH API
|
||||
Bypassing the sanity checks and the pcre_exec() wrapping can give
|
||||
speedups of more than 10%.
|
||||
|
||||
Note that the pcre_jit_exec() function is not available in versions of
|
||||
PCRE before 8.32 (released in November 2012). If you need to support
|
||||
versions that old you must either use the slower pcre_exec(), or switch
|
||||
between the two codepaths by checking the values of PCRE_MAJOR and
|
||||
PCRE_MINOR.
|
||||
|
||||
Due to an unfortunate implementation oversight, even in versions 8.32
|
||||
and later there will be no pcre_jit_exec() stub function defined when
|
||||
PCRE is compiled with --disable-jit, which is the default, and there's
|
||||
no way to detect whether PCRE was compiled with --enable-jit via a
|
||||
macro.
|
||||
|
||||
If you need to support versions older than 8.32, or versions that may
|
||||
not build with --enable-jit, you must either use the slower
|
||||
pcre_exec(), or switch between the two codepaths by checking the values
|
||||
of PCRE_MAJOR and PCRE_MINOR.
|
||||
|
||||
Switching between the two by checking the version assumes that all the
|
||||
versions being targeted are built with --enable-jit. To also support
|
||||
builds that may use --disable-jit either pcre_exec() must be used, or a
|
||||
compile-time check for JIT via pcre_config() (which assumes the runtime
|
||||
environment will be the same), or as the Git project decided to do,
|
||||
simply assume that pcre_jit_exec() is present in 8.32 or later unless a
|
||||
compile-time flag is provided, see the "grep: un-break building with
|
||||
PCRE >= 8.32 without --enable-jit" commit in git.git for an example of
|
||||
that.
|
||||
|
||||
|
||||
SEE ALSO
|
||||
|
||||
@ -8711,8 +8754,8 @@ AUTHOR
|
||||
|
||||
REVISION
|
||||
|
||||
Last updated: 17 March 2013
|
||||
Copyright (c) 1997-2013 University of Cambridge.
|
||||
Last updated: 05 July 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
.TH PCREJIT 3 "17 March 2013" "PCRE 8.33"
|
||||
.TH PCREJIT 3 "05 July 2017" "PCRE 8.41"
|
||||
.SH NAME
|
||||
PCRE - Perl-compatible regular expressions
|
||||
.SH "PCRE JUST-IN-TIME COMPILER SUPPORT"
|
||||
@ -54,9 +54,12 @@ programs that need the best possible performance, there is also a "fast path"
|
||||
API that is JIT-specific.
|
||||
.P
|
||||
If your program may sometimes be linked with versions of PCRE that are older
|
||||
than 8.20, but you want to use JIT when it is available, you can test
|
||||
the values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such
|
||||
as PCRE_CONFIG_JIT, for compile-time control of your code.
|
||||
than 8.20, but you want to use JIT when it is available, you can test the
|
||||
values of PCRE_MAJOR and PCRE_MINOR, or the existence of a JIT macro such as
|
||||
PCRE_CONFIG_JIT, for compile-time control of your code. Also beware that the
|
||||
\fBpcre_jit_exec()\fP function was not available at all before 8.32,
|
||||
and may not be available at all if PCRE isn't compiled with
|
||||
--enable-jit. See the "JIT FAST PATH API" section below for details.
|
||||
.
|
||||
.
|
||||
.SH "SIMPLE USE OF JIT"
|
||||
@ -96,6 +99,19 @@ when you call \fBpcre_study()\fP:
|
||||
PCRE_STUDY_JIT_PARTIAL_HARD_COMPILE
|
||||
PCRE_STUDY_JIT_PARTIAL_SOFT_COMPILE
|
||||
.sp
|
||||
If using \fBpcre_jit_exec()\fP and supporting a pre-8.32 version of
|
||||
PCRE, you can insert:
|
||||
.sp
|
||||
#if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
|
||||
pcre_jit_exec(...);
|
||||
#else
|
||||
pcre_exec(...)
|
||||
#endif
|
||||
.sp
|
||||
but as described in the "JIT FAST PATH API" section below this assumes
|
||||
version 8.32 and later are compiled with --enable-jit, which may
|
||||
break.
|
||||
.sp
|
||||
The JIT compiler generates different optimized code for each of the three
|
||||
modes (normal, soft partial, hard partial). When \fBpcre_exec()\fP is called,
|
||||
the appropriate code is run if it is available. Otherwise, the pattern is
|
||||
@ -404,6 +420,32 @@ fast path, and if invalid data is passed, the result is undefined.
|
||||
.P
|
||||
Bypassing the sanity checks and the \fBpcre_exec()\fP wrapping can give
|
||||
speedups of more than 10%.
|
||||
.P
|
||||
Note that the \fBpcre_jit_exec()\fP function is not available in versions of
|
||||
PCRE before 8.32 (released in November 2012). If you need to support versions
|
||||
that old you must either use the slower \fBpcre_exec()\fP, or switch between
|
||||
the two codepaths by checking the values of PCRE_MAJOR and PCRE_MINOR.
|
||||
.P
|
||||
Due to an unfortunate implementation oversight, even in versions 8.32
|
||||
and later there will be no \fBpcre_jit_exec()\fP stub function defined
|
||||
when PCRE is compiled with --disable-jit, which is the default, and
|
||||
there's no way to detect whether PCRE was compiled with --enable-jit
|
||||
via a macro.
|
||||
.P
|
||||
If you need to support versions older than 8.32, or versions that may
|
||||
not build with --enable-jit, you must either use the slower
|
||||
\fBpcre_exec()\fP, or switch between the two codepaths by checking the
|
||||
values of PCRE_MAJOR and PCRE_MINOR.
|
||||
.P
|
||||
Switching between the two by checking the version assumes that all the
|
||||
versions being targeted are built with --enable-jit. To also support
|
||||
builds that may use --disable-jit either \fBpcre_exec()\fP must be
|
||||
used, or a compile-time check for JIT via \fBpcre_config()\fP (which
|
||||
assumes the runtime environment will be the same), or as the Git
|
||||
project decided to do, simply assume that \fBpcre_jit_exec()\fP is
|
||||
present in 8.32 or later unless a compile-time flag is provided, see
|
||||
the "grep: un-break building with PCRE >= 8.32 without --enable-jit"
|
||||
commit in git.git for an example of that.
|
||||
.
|
||||
.
|
||||
.SH "SEE ALSO"
|
||||
@ -426,6 +468,6 @@ Cambridge CB2 3QH, England.
|
||||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 17 March 2013
|
||||
Copyright (c) 1997-2013 University of Cambridge.
|
||||
Last updated: 05 July 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
.fi
|
||||
|
@ -1,4 +1,4 @@
|
||||
.TH PCRETEST 1 "09 February 2014" "PCRE 8.35"
|
||||
.TH PCRETEST 1 "23 February 2017" "PCRE 8.41"
|
||||
.SH NAME
|
||||
pcretest - a program for testing Perl-compatible regular expressions.
|
||||
.SH SYNOPSIS
|
||||
@ -50,6 +50,10 @@ newline as data characters. However, in some Windows environments character 26
|
||||
(hex 1A) causes an immediate end of file, and no further data is read. For
|
||||
maximum portability, therefore, it is safest to use only ASCII characters in
|
||||
\fBpcretest\fP input files.
|
||||
.P
|
||||
The input is processed using using C's string functions, so must not
|
||||
contain binary zeroes, even though in Unix-like environments, \fBfgets()\fP
|
||||
treats any bytes other than newline as data characters.
|
||||
.
|
||||
.
|
||||
.SH "PCRE's 8-BIT, 16-BIT AND 32-BIT LIBRARIES"
|
||||
@ -1151,6 +1155,6 @@ Cambridge CB2 3QH, England.
|
||||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 09 February 2014
|
||||
Copyright (c) 1997-2014 University of Cambridge.
|
||||
Last updated: 23 February 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
.fi
|
||||
|
@ -39,6 +39,10 @@ INPUT DATA FORMAT
|
||||
For maximum portability, therefore, it is safest to use only ASCII
|
||||
characters in pcretest input files.
|
||||
|
||||
The input is processed using using C's string functions, so must not
|
||||
contain binary zeroes, even though in Unix-like environments, fgets()
|
||||
treats any bytes other than newline as data characters.
|
||||
|
||||
|
||||
PCRE's 8-BIT, 16-BIT AND 32-BIT LIBRARIES
|
||||
|
||||
@ -1083,5 +1087,5 @@ AUTHOR
|
||||
|
||||
REVISION
|
||||
|
||||
Last updated: 09 February 2014
|
||||
Copyright (c) 1997-2014 University of Cambridge.
|
||||
Last updated: 23 February 2017
|
||||
Copyright (c) 1997-2017 University of Cambridge.
|
||||
|
@ -5739,6 +5739,21 @@ for (;; ptr++)
|
||||
ptr = p - 1; /* Character before the next significant one. */
|
||||
}
|
||||
|
||||
/* We also need to skip over (?# comments, which are not dependent on
|
||||
extended mode. */
|
||||
|
||||
if (ptr[1] == CHAR_LEFT_PARENTHESIS && ptr[2] == CHAR_QUESTION_MARK &&
|
||||
ptr[3] == CHAR_NUMBER_SIGN)
|
||||
{
|
||||
ptr += 4;
|
||||
while (*ptr != CHAR_NULL && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
|
||||
if (*ptr == CHAR_NULL)
|
||||
{
|
||||
*errorcodeptr = ERR18;
|
||||
goto FAILED;
|
||||
}
|
||||
}
|
||||
|
||||
/* If the next character is '+', we have a possessive quantifier. This
|
||||
implies greediness, whatever the setting of the PCRE_UNGREEDY option.
|
||||
If the next character is '?' this is a minimizing repeat, by default,
|
||||
@ -8210,7 +8225,6 @@ for (;; ptr++)
|
||||
|
||||
if (mclength == 1 || req_caseopt == 0)
|
||||
{
|
||||
firstchar = mcbuffer[0] | req_caseopt;
|
||||
firstchar = mcbuffer[0];
|
||||
firstcharflags = req_caseopt;
|
||||
|
||||
|
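The comment-skipping hunk in the compile loop above can be illustrated in isolation. This is a simplified sketch, not the code from the commit: `skip_comment` is a hypothetical helper that works on a plain NUL-terminated `char` string, whereas the real code works on PCRE's own character type and reports ERR18 through `errorcodeptr`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: if p points at the start of a "(?#" comment,
   return a pointer to its closing ')'. If no comment starts at p,
   return p unchanged. If the comment is unterminated, return NULL
   (the real code raises ERR18 in that case). */
const char *skip_comment(const char *p)
{
  assert(p != NULL);
  if (p[0] != '(' || p[1] != '?' || p[2] != '#') return p;
  p += 3;                                   /* First comment character */
  while (*p != '\0' && *p != ')') p++;
  return (*p == '\0') ? NULL : p;
}
```

For a pattern tail such as `(?#comment)?Y`, the returned pointer is at the `)`, so the character after it (`?`) can then be examined as a possible minimizing or possessive suffix, which is the behaviour ChangeLog item 4 describes.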
@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language (but see
|
||||
below for why this module is different).
|
||||
|
||||
Written by Philip Hazel
|
||||
Copyright (c) 1997-2014 University of Cambridge
|
||||
Copyright (c) 1997-2017 University of Cambridge
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
@ -2625,7 +2625,7 @@ for (;;)
|
||||
if (isinclass)
|
||||
{
|
||||
int max = (int)GET2(ecode, 1 + IMM2_SIZE);
|
||||
if (*ecode == OP_CRPOSRANGE)
|
||||
if (*ecode == OP_CRPOSRANGE && count >= (int)GET2(ecode, 1))
|
||||
{
|
||||
active_count--; /* Remove non-match possibility */
|
||||
next_active_state--;
|
||||
|
@ -669,7 +669,7 @@ if (ecode == NULL)
|
||||
return match((PCRE_PUCHAR)&rdepth, NULL, NULL, 0, NULL, NULL, 1);
|
||||
else
|
||||
{
|
||||
int len = (char *)&rdepth - (char *)eptr;
|
||||
int len = (int)((char *)&rdepth - (char *)eptr);
|
||||
return (len > 0)? -len : len;
|
||||
}
|
||||
}
|
||||
|
@ -2772,6 +2772,9 @@ extern const pcre_uint8 PRIV(ucd_stage1)[];
|
||||
extern const pcre_uint16 PRIV(ucd_stage2)[];
|
||||
extern const pcre_uint32 PRIV(ucp_gentype)[];
|
||||
extern const pcre_uint32 PRIV(ucp_gbtable)[];
|
||||
#ifdef COMPILE_PCRE32
|
||||
extern const ucd_record PRIV(dummy_ucd_record)[];
|
||||
#endif
|
||||
#ifdef SUPPORT_JIT
|
||||
extern const int PRIV(ucp_typerange)[];
|
||||
#endif
|
||||
@ -2780,10 +2783,16 @@ extern const int PRIV(ucp_typerange)[];
|
||||
/* UCD access macros */
|
||||
|
||||
#define UCD_BLOCK_SIZE 128
|
||||
#define GET_UCD(ch) (PRIV(ucd_records) + \
|
||||
#define REAL_GET_UCD(ch) (PRIV(ucd_records) + \
|
||||
PRIV(ucd_stage2)[PRIV(ucd_stage1)[(int)(ch) / UCD_BLOCK_SIZE] * \
|
||||
UCD_BLOCK_SIZE + (int)(ch) % UCD_BLOCK_SIZE])
|
||||
|
||||
#ifdef COMPILE_PCRE32
|
||||
#define GET_UCD(ch) ((ch > 0x10ffff)? PRIV(dummy_ucd_record) : REAL_GET_UCD(ch))
|
||||
#else
|
||||
#define GET_UCD(ch) REAL_GET_UCD(ch)
|
||||
#endif
|
||||
|
||||
#define UCD_CHARTYPE(ch) GET_UCD(ch)->chartype
|
||||
#define UCD_SCRIPT(ch) GET_UCD(ch)->script
|
||||
#define UCD_CATEGORY(ch) PRIV(ucp_gentype)[UCD_CHARTYPE(ch)]
|
||||
|
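The guarded `GET_UCD` macro above can be pictured with a small stand-alone model. This is a sketch under stated assumptions: the `demo_*` names, the one-field record and the all-zero tables are invented for illustration and do not reflect the real Unicode tables. It only shows why an unguarded two-stage lookup indexes past the end of the first-stage table for values above 0x10ffff, and how returning a dummy record avoids that.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BLOCK_SIZE 128
#define DEMO_MAX_CODE_POINT 0x10ffffu

enum { DEMO_OTHER, DEMO_UNASSIGNED };
typedef struct { int chartype; } demo_record;   /* illustrative only */

static const demo_record demo_records[] = { { DEMO_OTHER } };
static const demo_record demo_dummy_record = { DEMO_UNASSIGNED };

/* One first-stage entry per block of 128 code points (8704 entries). */
static const uint8_t demo_stage1[(DEMO_MAX_CODE_POINT + 1) / DEMO_BLOCK_SIZE] = {0};
static const uint16_t demo_stage2[DEMO_BLOCK_SIZE] = {0};

const demo_record *demo_get_ucd(uint32_t ch)
{
  /* Without this test, ch / DEMO_BLOCK_SIZE would exceed the bounds of
     demo_stage1 for any value above the Unicode maximum. */
  if (ch > DEMO_MAX_CODE_POINT) return &demo_dummy_record;
  assert(ch / DEMO_BLOCK_SIZE < sizeof demo_stage1);
  return &demo_records[demo_stage2[demo_stage1[ch / DEMO_BLOCK_SIZE] *
    DEMO_BLOCK_SIZE + ch % DEMO_BLOCK_SIZE]];
}
```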
File diff suppressed because it is too large
@ -57,6 +57,7 @@
|
||||
} while (0)
|
||||
|
||||
using std::vector;
|
||||
using std::string;
|
||||
using pcrecpp::StringPiece;
|
||||
using pcrecpp::Scanner;
|
||||
|
||||
|
@ -52,12 +52,12 @@
|
||||
|
||||
#include <pcre.h>
|
||||
|
||||
namespace pcrecpp {
|
||||
|
||||
using std::memcmp;
|
||||
using std::strlen;
|
||||
using std::string;
|
||||
|
||||
namespace pcrecpp {
|
||||
|
||||
class PCRECPP_EXP_DEFN StringPiece {
|
||||
private:
|
||||
const char* ptr_;
|
||||
|
@ -24,6 +24,7 @@
|
||||
} \
|
||||
} while (0)
|
||||
|
||||
using std::string;
|
||||
using pcrecpp::StringPiece;
|
||||
|
||||
static void CheckSTLComparator() {
|
||||
|
@ -6,7 +6,7 @@
|
||||
and semantics are as close as possible to those of the Perl 5 language.
|
||||
|
||||
Written by Philip Hazel
|
||||
Copyright (c) 1997-2012 University of Cambridge
|
||||
Copyright (c) 1997-2017 University of Cambridge
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
@ -161,7 +161,7 @@ const pcre_uint32 PRIV(ucp_gbtable[]) = {
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark), /* 5 SpacingMark */
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbL)| /* 6 L */
|
||||
(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT),
|
||||
(1<<ucp_gbV)|(1<<ucp_gbLV)|(1<<ucp_gbLVT),
|
||||
|
||||
(1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbV)| /* 7 V */
|
||||
(1<<ucp_gbT),
|
||||
|
@ -38,6 +38,20 @@ const pcre_uint16 PRIV(ucd_stage2)[] = {0};
|
||||
const pcre_uint32 PRIV(ucd_caseless_sets)[] = {0};
|
||||
#else
|
||||
|
||||
/* If the 32-bit library is run in non-32-bit mode, character values
|
||||
greater than 0x10ffff may be encountered. For these we set up a
|
||||
special record. */
|
||||
|
||||
#ifdef COMPILE_PCRE32
|
||||
const ucd_record PRIV(dummy_ucd_record)[] = {{
|
||||
ucp_Common, /* script */
|
||||
ucp_Cn, /* type unassigned */
|
||||
ucp_gbOther, /* grapheme break property */
|
||||
0, /* case set */
|
||||
0, /* other case */
|
||||
}};
|
||||
#endif
|
||||
|
||||
/* When recompiling tables with a new Unicode version, please check the
|
||||
types in this structure definition from pcre_internal.h (the actual
|
||||
field names will be different):
|
||||
|
@ -43,6 +43,7 @@
|
||||
#include <vector>
|
||||
#include "pcrecpp.h"
|
||||
|
||||
using std::string;
|
||||
using pcrecpp::StringPiece;
|
||||
using pcrecpp::RE;
|
||||
using pcrecpp::RE_Options;
|
||||
|
@ -1804,11 +1804,6 @@ while (ptr < endptr)
|
||||
if (line_buffered) fflush(stdout);
|
||||
rc = 0; /* Had some success */
|
||||
|
||||
/* If the current match ended past the end of the line (only possible
|
||||
in multiline mode), we are done with this line. */
|
||||
|
||||
if ((unsigned int)offsets[1] > linelength) goto END_ONE_MATCH;
|
||||
|
||||
startoffset = offsets[1]; /* Restart after the match */
|
||||
if (startoffset <= oldstartoffset)
|
||||
{
|
||||
@ -1818,6 +1813,22 @@ while (ptr < endptr)
|
||||
if (utf8)
|
||||
while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
|
||||
}
|
||||
|
||||
/* If the current match ended past the end of the line (only possible
|
||||
in multiline mode), we must move on to the line in which it did end
|
||||
before searching for more matches. */
|
||||
|
||||
while (startoffset > (int)linelength)
|
||||
{
|
||||
matchptr = ptr += linelength + endlinelength;
|
||||
filepos += (int)(linelength + endlinelength);
|
||||
linenumber++;
|
||||
startoffset -= (int)(linelength + endlinelength);
|
||||
t = end_of_line(ptr, endptr, &endlinelength);
|
||||
linelength = t - ptr - endlinelength;
|
||||
length = (size_t)(endptr - ptr);
|
||||
}
|
||||
|
||||
goto ONLY_MATCHING_RESTART;
|
||||
}
|
||||
}
|
||||
@ -3179,9 +3190,11 @@ for (j = 1, cp = patterns; cp != NULL; j++, cp = cp->next)
|
||||
cp->hint = pcre_study(cp->compiled, study_options, &error);
|
||||
if (error != NULL)
|
||||
{
|
||||
char s[16];
|
||||
if (patterns->next == NULL) s[0] = 0; else sprintf(s, " number %d", j);
|
||||
fprintf(stderr, "pcregrep: Error while studying regex%s: %s\n", s, error);
|
||||
if (patterns->next == NULL)
|
||||
fprintf(stderr, "pcregrep: Error while studying regex: %s\n", error);
|
||||
else
|
||||
fprintf(stderr, "pcregrep: Error while studying regex number %d: %s\n",
|
||||
j, error);
|
||||
goto EXIT2;
|
||||
}
|
||||
#ifdef SUPPORT_PCREGREP_JIT
|
||||
|
@ -6,7 +6,7 @@
|
||||
and semantics are as close as possible to those of the Perl 5 language.
|
||||
|
||||
Written by Philip Hazel
|
||||
Copyright (c) 1997-2016 University of Cambridge
|
||||
Copyright (c) 1997-2017 University of Cambridge
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
@ -389,8 +389,8 @@ if (rc >= 0)
|
||||
{
|
||||
for (i = 0; i < (size_t)rc; i++)
|
||||
{
|
||||
pmatch[i].rm_so = ovector[i*2];
|
||||
pmatch[i].rm_eo = ovector[i*2+1];
|
||||
pmatch[i].rm_so = ovector[i*2] + so;
|
||||
pmatch[i].rm_eo = ovector[i*2+1] + so;
|
||||
}
|
||||
if (allocated_ovector) free(ovector);
|
||||
for (; i < nmatch; i++) pmatch[i].rm_so = pmatch[i].rm_eo = -1;
|
||||
|
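The REG_STARTEND fix above adds the starting offset `so` to each returned offset, because the match is run on the substring beginning at `so` while callers expect offsets relative to the whole string. A minimal stand-alone sketch follows. `demo_regmatch` and `fill_matches` are hypothetical names standing in for `regmatch_t` and the body of `regexec()`. The sketch also leaves unset groups (negative offsets) at -1, which is an assumption of this example and not something the hunk above does.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int rm_so; int rm_eo; } demo_regmatch;  /* stand-in type */

/* Copy rc offset pairs from ovector into pmatch, shifting them by the
   starting offset so, and mark the remaining entries as unset. */
void fill_matches(demo_regmatch *pmatch, size_t nmatch,
                  const int *ovector, int rc, int so)
{
  size_t i;
  assert(rc >= 0 && so >= 0);
  for (i = 0; i < (size_t)rc && i < nmatch; i++)
  {
    /* Assumption of this sketch: unset groups stay at -1. */
    pmatch[i].rm_so = (ovector[i*2] < 0) ? -1 : ovector[i*2] + so;
    pmatch[i].rm_eo = (ovector[i*2+1] < 0) ? -1 : ovector[i*2+1] + so;
  }
  for (; i < nmatch; i++) pmatch[i].rm_so = pmatch[i].rm_eo = -1;
}
```

For example, a match at offsets 2..5 within a substring that starts at offset 10 is reported as 12..15.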
@ -177,7 +177,7 @@ that differ in their output from isprint() even in the "C" locale. */
|
||||
#define PRINTABLE(c) ((c) >= 32 && (c) < 127)
|
||||
#endif
|
||||
|
||||
#define PRINTOK(c) (locale_set? isprint(c) : PRINTABLE(c))
|
||||
#define PRINTOK(c) (locale_set? (((c) < 256) && isprint(c)) : PRINTABLE(c))
|
||||
|
||||
/* Posix support is disabled in 16 or 32 bit only mode. */
|
||||
#if !defined SUPPORT_PCRE8 && !defined NOPOSIX
|
||||
@ -426,11 +426,11 @@ argument, the casting might be incorrectly applied. */
|
||||
#define PCRE_COPY_NAMED_SUBSTRING32(rc, re, bptr, offsets, count, \
|
||||
namesptr, cbuffer, size) \
|
||||
rc = pcre32_copy_named_substring((pcre32 *)re, (PCRE_SPTR32)bptr, offsets, \
|
||||
count, (PCRE_SPTR32)namesptr, (PCRE_UCHAR32 *)cbuffer, size/2)
|
||||
count, (PCRE_SPTR32)namesptr, (PCRE_UCHAR32 *)cbuffer, size/4)
|
||||
|
||||
#define PCRE_COPY_SUBSTRING32(rc, bptr, offsets, count, i, cbuffer, size) \
|
||||
rc = pcre32_copy_substring((PCRE_SPTR32)bptr, offsets, count, i, \
|
||||
(PCRE_UCHAR32 *)cbuffer, size/2)
|
||||
(PCRE_UCHAR32 *)cbuffer, size/4)
|
||||
|
||||
#define PCRE_DFA_EXEC32(count, re, extra, bptr, len, start_offset, options, \
|
||||
offsets, size_offsets, workspace, size_workspace) \
|
||||
@ -4834,7 +4834,16 @@ while (!done)
|
||||
continue;
|
||||
|
||||
case 'O':
|
||||
while(isdigit(*p)) n = n * 10 + *p++ - '0';
|
||||
while(isdigit(*p))
|
||||
{
|
||||
if (n > (INT_MAX-10)/10) /* Hack to stop fuzzers */
|
||||
{
|
||||
printf("** \\O argument is too big\n");
|
||||
yield = 1;
|
||||
goto EXIT;
|
||||
}
|
||||
n = n * 10 + *p++ - '0';
|
||||
}
|
||||
if (n > size_offsets_max)
|
||||
{
|
||||
size_offsets_max = n;
|
||||
|
3
pcre/testdata/testinput1
vendored
@ -5739,4 +5739,7 @@ AbcdCBefgBhiBqz
|
||||
/(?=.*X)X$/
|
||||
\ X
|
||||
|
||||
/X+(?#comment)?/
|
||||
>XXX<
|
||||
|
||||
/-- End of testinput1 --/
|
||||
|
2
pcre/testdata/testinput12
vendored
@ -104,4 +104,6 @@ and a couple of things that are different with JIT. --/
|
||||
/(.|.)*?bx/
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
|
||||
/((?(?!))x)(?'name')(?1)/S++
|
||||
|
||||
/-- End of testinput12 --/
|
||||
|
3
pcre/testdata/testinput15
vendored
@ -363,4 +363,7 @@ correctly, but that messes up comparisons). --/
|
||||
|
||||
/abc/89
|
||||
|
||||
//8+L
|
||||
\xf1\xad\xae\xae
|
||||
|
||||
/-- End of testinput15 --/
|
||||
|
3
pcre/testdata/testinput8
vendored
@ -4845,4 +4845,7 @@
|
||||
aaa\D
|
||||
a\D
|
||||
|
||||
/(02-)?[0-9]{3}-[0-9]{3}/
|
||||
02-123-123
|
||||
|
||||
/-- End of testinput8 --/
|
||||
|
4
pcre/testdata/testoutput1
vendored
@ -9442,4 +9442,8 @@ No match
|
||||
\ X
|
||||
0: X
|
||||
|
||||
/X+(?#comment)?/
|
||||
>XXX<
|
||||
0: X
|
||||
|
||||
/-- End of testinput1 --/
|
||||
|
2
pcre/testdata/testoutput12
vendored
@ -201,4 +201,6 @@ No match, mark = m (JIT)
|
||||
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabax
|
||||
Error -8 (match limit exceeded)
|
||||
|
||||
/((?(?!))x)(?'name')(?1)/S++
|
||||
|
||||
/-- End of testinput12 --/
|
||||
|
5
pcre/testdata/testoutput15
vendored
@ -1136,4 +1136,9 @@ Failed: setting UTF is disabled by the application at offset 0
|
||||
/abc/89
|
||||
Failed: setting UTF is disabled by the application at offset 0
|
||||
|
||||
//8+L
|
||||
\xf1\xad\xae\xae
|
||||
0:
|
||||
0+ \x{6dbae}
|
||||
|
||||
/-- End of testinput15 --/
|
||||
|
4
pcre/testdata/testoutput8
vendored
@ -7801,4 +7801,8 @@ No match
|
||||
** Show all captures ignored after DFA matching
|
||||
0: a
|
||||
|
||||
/(02-)?[0-9]{3}-[0-9]{3}/
|
||||
02-123-123
|
||||
0: 02-123-123
|
||||
|
||||
/-- End of testinput8 --/
|
||||
|