This patch suggests relocating the code dealing with `SCRIPT_LINES__` from ast.c to ruby_parser.c.
## Background
- I guess `AbstractSyntaxTree.of` method used to use `SCRIPT_LINES__` internally for some reason before
- However, now it appears `SCRIPT_LINES__` is no longer used meaningfully by the method
- As evidence of this, (and as my patch shows,) removing the function call of `rb_script_lines_for()` from `ast_s_of()` does not affect the result of `test/ruby/test_ast.rb`
Given the above, I think two possibilities can be considered:
- (A) `AbstractSyntaxTree.of` has not needed `SCRIPT_LINES__` already (I pick this)
- (B) We lack a test case of `AbstractSyntaxTree.of` that needs to use `SCRIPT_LINES__`
## Besides,
The current implementation causes strange behavior:
```console
ruby -e"SCRIPT_LINES__ = {__FILE__ => []}; puts RubyVM::AbstractSyntaxTree.of(->{ 1 + 2 }, keep_script_lines: true).script_lines"
=> `-e:1:in '<main>': undefined method 'script_lines' for nil (NoMethodError)`
```
I think this is a bug because `AbstractSyntaxTree.of` is not supposed to return `nil` even in this case.
This happens due to the ast.c's dependence on `SCRIPT_LINES__`.
And at the end of the `ast_s_of()`, `node_find()` can not find the target child node obviously because it doesn't make sense to look for a corresponding node made from the parameter of `AbstractSyntaxTree.of` in the AST tree made from the value of `{__FILE__ => []}`
## Solution
Since I think it's good enough `SCRIPT_LINES__` to be only referred by ruby.c, I chose the possibility "(A)" and wrote this patch which moves `rb_script_lines_for()` from ast.c to ruby_parser.c.
So as the result:
- `ast_s_of()` function no longer look up `SCRIPT_LINES__`
- Even so, this patched code passes the existing tests
- The strange behavior above no longer happens (I also added a test for it)
Please correct me if I miss something🙏
This commit simplifies warnings for hash keys duplication and when clause duplication,
based on the discussion of https://bugs.ruby-lang.org/issues/20331.
Warnings are reported only when strings are same to ohters.
- Introduce `rb_parser_ary_t` structure to partly eliminate RArray from parse.y
- In this patch, `parser_params->tokens` and `parser_params->ast->node_buffer->tokens` are now `rb_parser_ary_t *`
- Instead, `ast_node_all_tokens()` internally creates a Ruby Array object from the `rb_parser_ary_t`
- Also, delete `rb_ast_tokens()` and `rb_ast_set_tokens()` in node.c
- Implement `rb_parser_str_escape()`
- This is a port of the `rb_str_escape()` function in string.c
- `rb_parser_str_escape()` does not depend on `VALUE` (RString)
- Instead, it uses `rb_parser_stirng_t *`
- This function works when --dump=y option passed
- Because WIP of the universal parser, similar functions like `rb_parser_tokens_free()` exist in both node.c and parse.y. Refactoring them may be needed in some way in the future
- Although we considered redesigning the structure: `ast->node_buffer->tokens` into `ast->tokens`, we leave it as it is because `rb_ast_t` is an imemo. (We will address it in the future)
Rather than exposing that an imemo has a flag and four fields, this
changes the implementation to only expose one field (the klass) and
fills the rest with 0. The type will have to fill in the values themselves.
`dest` of `const_decl_path` is `NODE_COLON2` or `NODE_COLON3` in some cases.
For example, `B::C ||= [“Not ” + “shareable”]` passes `NODE_COLON2`
and `::C ||= [“Not ” + “shareable”]` passes `NODE_COLON3`.
This commit fixes `Ractor::IsolationError` message for such case.
```
# shareable_constant_value: literal
::C ||= ["Not " + "shareable"]
# Before
# => cannot assign unshareable object to C (Ractor::IsolationError)
# After
# => cannot assign unshareable object to ::C (Ractor::IsolationError)
```
If lhs of assignment is top-level constant reference, the first
constant name is omitted from error message.
This commit fixes it.
```
# shareable_constant_value: literal
::C = ["Not " + "shareable"]
# Before
# => cannot assign unshareable object to (Ractor::IsolationError)
# After
# => cannot assign unshareable object to ::C (Ractor::IsolationError)
```
String nodes holds ruby string object on `VALUE nd_lit`.
This commit changes it to `struct rb_parser_string *string`
to reduce dependency on ruby object.
Sometimes these strings are concatenated with other string
therefore string concatenate functions are needed.
`__ENCODING__ `was managed by `NODE_LIT` with Encoding object.
Introduce `NODE_ENCODING` for
1. `__ENCODING__` is detectable from AST Node.
2. Reduce dependency Ruby object for parse.y
This commit changes `struct parser_params` lastline and nextline
from `VALUE` (String object) to `rb_parser_string_t *` so that
dependency on Ruby Object is reduced.
`parser_string_buffer_t string_buffer` is added to `struct parser_params`
to manage `rb_parser_string_t` pointers of each line. All allocated line
strings are freed in `rb_ruby_parser_free`.
`:sym` was managed by `NODE_LIT` with `Symbol` object.
This commit introduces `NODE_SYM` so that
1. Symbol literal is detectable from AST Node
2. Reduce dependency on ruby object
`__FILE__` was managed by `NODE_STR` with `String` object.
This commit introduces `NODE_FILE` and `struct rb_parser_string` so that
1. `__FILE__` is detectable from AST Node
2. Reduce dependency ruby object
`__LINE__` was managed by `NODE_LIT` with `Integer` object.
This commit introduces `NODE_LINE` so that
1. `__LINE__` is detectable from AST Node
2. Reduce dependency ruby object
All kind of AST nodes use same struct RNode, which has u1, u2, u3 union members
for holding different kind of data.
This has two problems.
1. Low flexibility of data structure
Some nodes, for example NODE_TRUE, don’t use u1, u2, u3. On the other hand,
NODE_OP_ASGN2 needs more than three union members. However they use same
structure definition, need to allocate three union members for NODE_TRUE and
need to separate NODE_OP_ASGN2 into another node.
This change removes the restriction so make it possible to
change data structure by each node type.
2. No compile time check for union member access
It’s developer’s responsibility for using correct member for each node type when it’s union.
This change clarifies which node has which type of fields and enables compile time check.
This commit also changes node_buffer_elem_struct buf management to handle
different size data with alignment.
NODE_ARGS, NODE_ARYPTN, NODE_FNDPTN manage memory of their
structure by imemo tmpbuf Object.
However rb_ast_struct has reference to NODE. Then these
memory can be freed directly when rb_ast_struct is freed.
This commit reduces parser's dependency on CRuby functions.