101 Commits

Author SHA1 Message Date
yui-knk
e816ab0b0c Remove rb_imemo_tmpbuf_t from parser
No parser semantic value type is `VALUE` anymore, so there is no need to
use imemo to manage the semantic value stack.
2024-04-02 19:37:27 +09:00
yui-knk
799e854897 [Feature #20331] Simplify parser warnings for hash keys duplication and when clause duplication
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when strings are the same as others.
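
As a rough illustration of the kind of literals these warnings concern, the sketch below evaluates a hash literal with a repeated string key and a `case` with a repeated `when` string. It only shows which value or clause wins. Whether a warning is printed, and how it is worded, depends on the Ruby version and is not asserted here. The variable names are illustrative.

```ruby
# Parse the literals at run time via eval so this file itself loads cleanly.
# The same string key appears twice; the later value overwrites the earlier one.
dup_hash = eval('{"a" => 1, "a" => 2}')

# The same string appears in two when clauses; only the first one can match.
dup_when = eval(<<~'RUBY')
  case "x"
  when "x" then :first
  when "x" then :second
  end
RUBY

p dup_hash
p dup_when
```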
2024-04-02 08:26:58 +09:00
S-H-GAMELINKS
0774232bf3 Remove unnecessary macros and functions for Universal Parser 2024-04-01 12:05:16 +09:00
S-H-GAMELINKS
060a71d4e7 Fix Ripper memory allocation size when enabled Universal Parser
The size of `struct parser_params` differs by 8 bytes between `ripper_s_allocate` and `rb_ruby_parser_allocate` when the universal parser is enabled.
As a result, `*r->p` is not fully initialized in `ripper_s_allocate`, as shown below.

```console
(gdb) p *r->p
$2 = {heap = 0x0, lval = 0x0, yylloc = 0x0, lex = {strterm = 0x0, gets = 0x0, input = 0, string_buffer = {head = 0x0, last = 0x0}, lastline = 0x0,
    nextline = 0x0, pbeg = 0x0, pcur = 0x0, pend = 0x0, ptok = 0x0, gets_ = {ptr = 0, call = 0x0}, state = EXPR_NONE, paren_nest = 0, lpar_seen = 0,
    debug = 0, has_shebang = 0, token_seen = 0, token_info_enabled = 0, error_p = 0, cr_seen = 0, value = 0, result = 0, parsing_thread = 0, s_value = 0,
    s_lvalue = 0, s_value_stack = 2097}
```

This seems to cause `double free or corruption (!prev)` and SEGV.
This commit fixes it by introducing the `rb_ripper_parser_params_allocate` and `rb_ruby_parser_config` functions for Ripper, so that a `struct parser_params` of the same size is returned.
2024-03-21 18:10:02 +09:00
HASUMI Hitoshi
9a19cfd4cd [Universal Parser] Reduce dependence on RArray in parse.y
- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y
  - In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *`
  - Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t`
  - Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c

- Implement `rb_parser_str_escape()`
  - This is a port of the `rb_str_escape()` function in string.c
  - `rb_parser_str_escape()` does not depend on `VALUE` (RString)
  - Instead, it uses `rb_parser_string_t *`
  - This function is used when the `--dump=y` option is passed

- Because the universal parser is still a work in progress, similar functions such as `rb_parser_tokens_free()` exist in both node.c and parse.y. They may need to be refactored in the future

- Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
2024-03-12 17:17:52 +09:00
Peter Zhu
01f9b2ae41 Use rb_str_to_interned_str in parse.y
This commit changes rb_fstring to rb_str_to_interned_str in parse.y.
rb_fstring is private so it shouldn't be used by ripper.
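
For readers less familiar with interned strings, the following is a Ruby-level analogy only: it uses `String#-@`, which returns a frozen, deduplicated string, and it does not call the C functions named in this commit. The variable names are illustrative.

```ruby
# Two distinct, mutable String objects with equal content.
s1 = String.new("parser token")
s2 = String.new("parser token")

# String#-@ returns a frozen, deduplicated (interned) string.
a = -s1
b = -s2

p s1.equal?(s2)  # are the originals the same object?
p a.frozen?      # is the interned string frozen?
p a.equal?(b)    # do both interned results share one object?
```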
2024-02-23 13:33:46 -05:00
Peter Zhu
330830dd1a Add IMEMO_NEW
Rather than exposing that an imemo has a flag and four fields, this
changes the implementation to only expose one field (the klass) and
fills the rest with 0. The type will have to fill in the values themselves.
2024-02-21 11:33:05 -05:00
yui-knk
91cb303531 Remove not used universal parser macros and functions 2024-02-21 13:36:45 +09:00
yui-knk
e7ab5d891c Introduce NODE_REGX to manage regexp literal 2024-02-21 08:06:48 +09:00
Peter Zhu
c184aa8740 Use rb_gc_mark_and_move for imemo 2024-02-20 10:39:30 -05:00
S-H-GAMELINKS
fba647087b Remove unneeded Universal Parser properties 2024-02-20 19:02:24 +09:00
Nobuyoshi Nakada
b1d70e4264 [Bug #20280] Check by rb_parser_enc_str_coderange
Co-authored-by: Yuichiro Kaneko <spiketeika@gmail.com>
2024-02-19 16:33:26 +09:00
Nobuyoshi Nakada
fcc55dc226 [Bug #20280] Raise SyntaxError on invalid encoding symbol 2024-02-19 16:33:26 +09:00
Peter Zhu
a71d1ed838 Fix memory leak when parsing invalid hash symbol
For example:

    10.times do
      100_000.times do
        eval('{"\xC3": 1}')
      rescue EncodingError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    32032
    48464
    66112
    84192
    100592
    117520
    134096
    150656
    167168
    183760

After:

    17120
    17120
    17120
    17120
    18560
    18560
    18560
    18560
    18560
    18560
2024-02-13 11:05:56 -05:00
yui-knk
ea91ab696e Fix constant name of Ractor::IsolationError message
`dest` of `const_decl_path` is `NODE_COLON2` or `NODE_COLON3` in some cases.
For example, `B::C ||= ["Not " + "shareable"]` passes `NODE_COLON2`
and `::C ||= ["Not " + "shareable"]` passes `NODE_COLON3`.
This commit fixes the `Ractor::IsolationError` message for such cases.

```
# shareable_constant_value: literal
::C ||= ["Not " + "shareable"]

# Before
# => cannot assign unshareable object to C (Ractor::IsolationError)

# After
# => cannot assign unshareable object to ::C (Ractor::IsolationError)
```
2024-02-10 09:23:17 +09:00
yui-knk
bf72cb84ca Include the first constant name into Ractor::IsolationError message
If the lhs of an assignment is a top-level constant reference, the first
constant name is omitted from the error message.
This commit fixes it.

```
# shareable_constant_value: literal
::C = ["Not " + "shareable"]

# Before
# => cannot assign unshareable object to  (Ractor::IsolationError)

# After
# => cannot assign unshareable object to ::C (Ractor::IsolationError)
```
2024-02-10 09:23:17 +09:00
yui-knk
33c1e082d0 Remove ruby object from string nodes
String nodes hold a Ruby string object in `VALUE nd_lit`.
This commit changes it to `struct rb_parser_string *string`
to reduce dependency on Ruby objects.
Sometimes these strings are concatenated with other strings,
so string concatenation functions are needed.
2024-02-09 14:20:17 +09:00
S.H
f3df218f48 Introduced rb_node_const_decl_val function
Introduce `rb_node_const_decl_val` function to allow `rb_ary_join` and
`rb_ary_reverse` functions to be removed from Universal Parser.
2024-01-31 13:31:38 +09:00
S.H
9b40f42c22 Introduce NODE_ENCODING
`__ENCODING__` was managed by `NODE_LIT` with an Encoding object.

Introduce `NODE_ENCODING` so that
1. `__ENCODING__` is detectable from AST Node
2. Dependency on Ruby objects is reduced in parse.y
2024-01-27 08:11:10 +00:00
yui-knk
ee7f63ebba Make lastline and nextline to be rb_parser_string
This commit changes `struct parser_params` lastline and nextline
from `VALUE` (String object) to `rb_parser_string_t *` so that
dependency on Ruby Object is reduced.
`parser_string_buffer_t string_buffer` is added to `struct parser_params`
to manage `rb_parser_string_t` pointers of each line. All allocated line
strings are freed in `rb_ruby_parser_free`.
2024-01-23 08:58:16 +09:00
Nobuyoshi Nakada
0610f555ea Constify rb_global_parser_config 2024-01-14 17:55:11 +09:00
yui-knk
517e0d87bd Move node value functions closer to other similar functions 2024-01-12 22:10:53 +09:00
yui-knk
631eb2a110 Rename node value functions
They don't compile nodes, so remove the `compile_` prefix.
`compile_numeric_literal` always returns an integer, so
use integer instead of numeric in its name.
2024-01-12 22:10:53 +09:00
yui-knk
5a471784ca Restore unknown case
This existed before 1b8d01136c3ff6c60325c7609d61e19ac42acd9f.
2024-01-12 22:10:53 +09:00
yui-knk
731fee04c2 Use BUILTIN_TYPE because SPECIAL_CONST or not is already checked 2024-01-12 22:10:53 +09:00
yui-knk
b35e21b388 Remove reference counter from rb_parser_config
It's allocated outside of the parser, so there is no need to track
the reference count in rb_parser_config.
2024-01-12 21:17:41 +09:00
yui-knk
52d9e55903 Statically allocate parser config 2024-01-12 21:17:41 +09:00
Nobuyoshi Nakada
48fd311721 Constify 2024-01-10 13:49:00 +09:00
yui-knk
db476cc71c Introduce NODE_SYM to manage symbol literal
`:sym` was managed by `NODE_LIT` with `Symbol` object.
This commit introduces `NODE_SYM` so that

1. Symbol literal is detectable from AST Node
2. Reduce dependency on ruby object
2024-01-09 16:07:19 +09:00
yui-knk
7ffff3e043 Change numeric node value functions argument to NODE *
Change the argument to align with other node value functions
like `rb_node_line_lineno_val`.
2024-01-08 14:02:48 +09:00
Nobuyoshi Nakada
c30b8ae947 Adjust styles and indents [ci skip] 2024-01-08 00:50:41 +09:00
S-H-GAMELINKS
ad7aee35e4 Remove unneeded rb_parser_config_struct struct properties for Universal Parser 2024-01-07 21:16:31 +09:00
S-H-GAMELINKS
1b8d01136c Introduce Numeric Node's 2024-01-07 09:24:34 +09:00
yui-knk
7a050638b1 Introduce NODE_FILE
`__FILE__` was managed by `NODE_STR` with `String` object.
This commit introduces `NODE_FILE` and `struct rb_parser_string` so that

1. `__FILE__` is detectable from AST Node
2. Reduce dependency on ruby object
2024-01-02 14:19:42 +09:00
yui-knk
1ade170a6c Introduce NODE_LINE
`__LINE__` was managed by `NODE_LIT` with `Integer` object.
This commit introduces `NODE_LINE` so that

1. `__LINE__` is detectable from AST Node
2. Reduce dependency on ruby object
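
One way to observe what "detectable from AST Node" means is to ask `RubyVM::AbstractSyntaxTree` for the node type of a few literals. This is a minimal sketch, assuming CRuby (the API is CRuby-specific and experimental). The helper name `literal_node_type` is hypothetical, and the reported type names differ between Ruby versions, so no particular output is claimed.

```ruby
# Return the node type reported for a source string holding one expression.
# Assumes the root is a SCOPE node whose last child is the body expression.
def literal_node_type(source)
  root = RubyVM::AbstractSyntaxTree.parse(source)
  body = root.children.last
  body.type
end

# On versions that include the dedicated literal nodes, each source should
# report its own node type instead of a generic literal type.
%w[__LINE__ __FILE__ __ENCODING__ :sym].each do |src|
  puts "#{src} => #{literal_node_type(src)}"
end
```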
2023-12-29 18:32:27 +09:00
yui-knk
4374236e95 Add errno_ptr property for Universal Parser 2023-12-28 13:17:36 +09:00
yui-knk
73fa322497 Add ary_modify property for Universal Parser 2023-12-28 09:00:44 +09:00
Nobuyoshi Nakada
5bbb6fd6c3 Add printf format attributes to rb_parser_config_t 2023-10-20 07:15:24 +09:00
Nobuyoshi Nakada
a405b28e85 Delete heredoc line mark references 2023-10-14 11:08:43 +09:00
Nobuyoshi Nakada
a075c55d0c Manage rb_strterm_t without imemo 2023-10-14 11:08:43 +09:00
yui-knk
74c6781153 Change RNode structure from union to struct
All kinds of AST nodes use the same struct RNode, which has u1, u2, u3 union members
for holding different kinds of data.
This has two problems.

1. Low flexibility of data structure

Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However, because they share the same
structure definition, three union members must be allocated for NODE_TRUE and
NODE_OP_ASGN2 must be split into another node.
This change removes the restriction and makes it possible to
change the data structure for each node type.

2. No compile time check for union member access

With a union, it’s the developer’s responsibility to use the correct member for each node type.
This change clarifies which node has which type of fields and enables compile-time checks.

This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
2023-09-28 11:58:10 +09:00
yui-knk
fb7a2ddb4b Directly free structure managed by imemo tmpbuf
NODE_ARGS, NODE_ARYPTN, and NODE_FNDPTN manage the memory of their
structures with an imemo tmpbuf object.
However, rb_ast_struct has a reference to NODE, so this
memory can be freed directly when rb_ast_struct is freed.

This commit reduces parser's dependency on CRuby functions.
2023-09-22 11:25:53 +09:00
Nobuyoshi Nakada
fe73f9f24b Replace only use of snprintf in parser 2023-08-25 23:34:02 +09:00
Nobuyoshi Nakada
503f98ebd3 Remove SCRIPT_LINES__ related member functions 2023-08-25 18:23:05 +09:00
Nobuyoshi Nakada
6aa16f9ec1 Move SCRIPT_LINES__ away from parse.y 2023-08-25 18:23:05 +09:00
S-H-GAMELINKS
a792890e9b Remove unneeded fix2int and rational_raw property for Universal Parser 2023-08-11 13:50:00 +09:00
S-H-GAMELINKS
4e7e972841 Remove unneeded int2big property for Universal Parser 2023-08-05 11:39:38 +09:00
S-H-GAMELINKS
acd9c208d5 Move some macro for universal parser 2023-07-09 15:00:52 +09:00
S-H-GAMELINKS
8b2a0ec8df Move ISASCII definition to parse.y 2023-07-08 15:26:55 +09:00
Nobuyoshi Nakada
3443e43b62 Remove st_functions_t 2023-06-24 19:17:37 +09:00