877 Commits

Author SHA1 Message Date
Nobuyoshi Nakada
453f88f7f1 Make ASAN default option string built-in libruby
The content depends on Ruby internals and is not the responsibility of the
caller.  Revive the `RUBY_GLOBAL_SETUP` macro to define the hook function.
2025-03-16 17:33:58 +09:00
Peter Zhu
6bad47ac6d RUBY_FREE_AT_EXIT does not work when error in -r
[Bug #21173]

When loading a file using the command line -r, it is processed before
RUBY_FREE_AT_EXIT is checked. So if the loaded file raises an error, it
will cause memory to not be freed with RUBY_FREE_AT_EXIT.

For example `ruby -rtest.rb -e ""` will report a large amount of memory
leaks if `test.rb` raises.
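
A minimal Ruby sketch of how this scenario can be reproduced, assuming a throwaway file (the name `raiser.rb` is hypothetical) that raises while being loaded. It only drives the invocation described above; actually observing the leak requires an external leak checker, which is not shown here.

```ruby
require "open3"
require "rbconfig"
require "tmpdir"

# Run `ruby -r<file> -e ""` with RUBY_FREE_AT_EXIT set, where the required
# file raises while loading. Returns [stderr, exit status].
def run_with_failing_require
  Dir.mktmpdir do |dir|
    path = File.join(dir, "raiser.rb") # hypothetical file name
    File.write(path, %(raise "boom"\n))
    _out, err, status = Open3.capture3(
      { "RUBY_FREE_AT_EXIT" => "1" },
      RbConfig.ruby, "-r#{path}", "-e", ""
    )
    [err, status]
  end
end

err, status = run_with_failing_require
puts "success: #{status.success?}"
puts err
```

The child process is expected to fail because of the raised error; the point of the fix is that memory is still freed on that path.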
2025-03-06 11:58:54 -05:00
Nobuyoshi Nakada
4a67ef09cc
[Feature #21116] Extract RJIT as a third-party gem 2025-02-13 18:01:03 +09:00
Xavier Noria
f6e259da87 Improve docs of -I ruby option 2025-01-24 11:25:26 +01:00
Nobuyoshi Nakada
dfe6b7c02e
[Bug #21018] Show invalid command line option more properly 2025-01-09 19:26:20 +09:00
Nobuyoshi Nakada
1b0c46daed
[Bug #20979] [DOC] Add a proviso to +comment option 2024-12-24 13:27:05 +09:00
John Hawthorn
a8ebc596d6 Free parse result under -c 2024-11-22 19:25:01 -08:00
Peter Zhu
51ffef2819 Fix memory leak in prism when syntax error in iseq compilation
If there's a syntax error during iseq compilation then prism would leak
memory because it would not free the pm_parse_result_t.

This commit changes pm_iseq_new_with_opt to have a rb_protect to catch
when an error is raised, and return NULL and set error_state to a value
that can be raised by calling rb_jump_tag after memory has been freed.

For example:

    10.times do
      10_000.times do
        eval("/[/=~s")
      rescue SyntaxError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    39280
    68736
    99232
    128864
    158896
    188208
    217344
    246304
    275376
    304592

After:

    12192
    13200
    14256
    14848
    16000
    16000
    16000
    16064
    17232
    17952
2024-11-08 15:43:41 -05:00
Koichi Sasada
ab7ab9e450 Warning[:strict_unused_block]
to show unused block warning strictly.

```ruby
class C
  def f = nil
end

class D
  def f = yield
end

[C.new, D.new].each{|obj| obj.f{}}
```

In this case, `D#f` accepts a block, but `C#f` does not.
A call such as `obj.f{}` may be made where `obj` is either
a `C` or a `D`. To avoid warnings in such cases, the
"unused block warning" is emitted only if no method with
the same name accepts a block.
In the above example, `C.new.f{}` doesn't show any warning
because there is a method with the same name, `D#f`, which accepts a block.

We call this default behavior "relax mode".

The new `strict_unused_block` warning category switches from
"relax mode" to "strict mode": same-name methods are not
checked, and `C.new.f{}` will be warned about.
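
A minimal sketch of opting in to strict mode, assuming the category is toggled through `Warning[]=` like other warning categories. The helper `strict_unused_block_supported?` is a hypothetical name used only for this illustration; it relies on `Warning[]` raising `ArgumentError` for an unknown category, so the snippet also runs on Rubies without this feature.

```ruby
# Returns true when this Ruby knows the :strict_unused_block category.
# Warning[] raises ArgumentError for unknown categories.
def strict_unused_block_supported?
  Warning[:strict_unused_block]
  true
rescue ArgumentError
  false
end

class C
  def f = nil    # never uses a block
end

class D
  def f = yield  # uses a block
end

# Switch from relax mode to strict mode where the category exists.
Warning[:strict_unused_block] = true if strict_unused_block_supported?

# Under strict mode this call is expected to warn even though D#f
# accepts a block; under relax mode it stays silent.
C.new.f {}
```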

[Feature #15554]
2024-11-06 11:06:18 +09:00
Takashi Kokubun
478e0fc710
YJIT: Replace Array#each only when YJIT is enabled (#11955)
* YJIT: Replace Array#each only when YJIT is enabled

* Add comments about BUILTIN_ATTR_C_TRACE

* Make Ruby Array#each available with --yjit as well

* Fix all paths that expect a C location

* Use method_basic_definition_p to detect patches

* Copy a comment about C_TRACE flag to compilers

* Rephrase a comment about add_yjit_hook

* Give METHOD_ENTRY_BASIC flag to Array#each

* Add --yjit-c-builtin option

* Allow inconsistent source_location in test-spec

* Refactor a check of BUILTIN_ATTR_C_TRACE

* Set METHOD_ENTRY_BASIC without touching vm->running
2024-11-04 11:14:28 -05:00
Nobuyoshi Nakada
3e1021b144 Make default parser enum and define getter/setter 2024-10-02 20:43:40 +09:00
Lars Kanis
9b4a497456 Fix loading of nonascii script name on Windows
Since the prism parser was enabled by default, loading scripts with nonascii characters somewhere in the script path has stopped working.
It only works when the codepage is switched to 65001 (UTF-8).

This patch doesn't change the encoding of __FILE__. It is still in locale encoding.
That's why pm_load_file() is called with UTF-8 script name and pm_parse_file() with locale encoding.

The loading of nonascii script names is part of the test-all, but it doesn't trigger the failure on GHA, since it is using cp 65001.
On other codepages it fails with:

[53/71] TestRubyOptions#test_command_line_progname_nonascii = 0.04 s
  1) Failure:
TestRubyOptions#test_command_line_progname_nonascii [C:/Users/Administrator/ruby/test/ruby/test_rubyoptions.rb:1086]:
[ruby-dev:48752] [Bug #10555]
pid 1736 exit 1
| C:\Users\Administrator\ruby\ruby.exe: No such file or directory -- �.rb (LoadError)
.

1. [1/2] Assertion for "stdout"
   | <["\xFF.rb"]> expected but was
   | <[]>.

2. [2/2] Assertion for "stderr"
   | <[]> expected but was
   | <["C:\\Users\\Administrator\\ruby\\ruby.exe: No such file or directory -- \xFF.rb (LoadError)"]>.
2024-09-29 19:01:18 -04:00
Kevin Newton
9afc6a981d [PRISM] Only parse shebang on main script
Fixes [Bug #20730]
2024-09-13 12:51:53 -04:00
Kevin Newton
371432b2d7 [PRISM] Handle RubyVM.keep_script_lines 2024-08-29 20:27:01 -04:00
Alan Wu
554098303d [PRISM] For stdin scripts, use locale encoding
For example:

    $ echo 'p __ENCODING__' | LANG=C ruby
    #<Encoding:US-ASCII>

But, allow -K to override the source encoding.
Found by running spec/ruby/language/magic_comment_spec.rb with LANG=C.
2024-08-29 20:20:26 -04:00
Nobuyoshi Nakada
d33e3d47b8
[Bug #20704] Win32: Fix chdir to non-ASCII path
On Windows, `chdir` in compilers' runtime libraries uses the active
code page, but command line arguments in ruby are always UTF-8, since
commit:33ea2646b98adb49ae2e1781753bf22d33729ac0.
2024-08-29 19:41:53 +09:00
Alexander Momchilov
f93c27d86b Set encoding index correctly 2024-08-28 08:47:43 -04:00
Alan Wu
f2ac013009
Add RB_DEFAULT_PARSER preprocessor macro
This way there is one place to change for switching the default.
This also allows for building the same commit with different cppflags.
2024-08-27 23:15:37 +00:00
Kevin Newton
465cf8d80b [PRISM] Potentially enable coverage on the main script 2024-08-21 16:32:05 -04:00
Kevin Newton
de28ef7db4 [PRISM] Use src encoding not ext encoding 2024-08-15 13:34:25 -04:00
Kevin Newton
09bf3c9d6a [PRISM] Trigger moreswitches off shebang 2024-08-14 15:39:03 -04:00
Peter Zhu
f69ba5716f Move RUBY_FREE_AT_EXIT check earlier
Things that exit early, like `ruby -v`, could not use RUBY_FREE_AT_EXIT
because the check for RUBY_FREE_AT_EXIT was not executed.
2024-07-24 08:36:40 -04:00
Kevin Newton
49cf042cd2 [PRISM] Define DATA constant when parsing stdin and __END__ 2024-07-19 10:17:50 -04:00
Kevin Newton
b1608fc6bc [PRISM] Do not respect xflag when eflag is set 2024-07-18 13:03:33 -04:00
Peter Zhu
8fd2df529b Revert "Load external GC using command line argument"
This reverts commit 8ddb1110c283c5cb59b6582383f36fdbcc43ab19.
2024-07-05 14:05:58 -04:00
Jean Boussier
95ffcd3f9f Fix --debug-frozen-string-literal to not apply --disable-frozen-string-literal
[Feature #20205]

This was an undesired side effect. Now that this value is a tri-state, we can't
assume it's disabled by default.
2024-06-24 12:43:39 +02:00
Peter Zhu
90763e04ba Load external GC using command line argument
This commit changes the external GC to be loaded with the `--gc-library`
command line argument instead of the RUBY_GC_LIBRARY_PATH environment
variable because @nobu pointed out that loading binaries using environment
variables can pose a security risk.
2024-06-21 11:49:01 -04:00
Nobuyoshi Nakada
01b13886dc [Bug #20562] Categorize RUBY_FREE_AT_EXIT warning as experimental 2024-06-12 15:36:10 +09:00
Kevin Newton
792e9c46a4 Remove prism compiler warning 2024-06-07 12:24:05 -04:00
Jean Boussier
33f92b3c88 Don't add +YJIT to RUBY_DESCRIPTION until it's actually enabled
If you start Ruby with `--yjit-disable`, the `+YJIT` shouldn't be
added until `RubyVM::YJIT.enable` is actually called. Otherwise
it's confusing in crash reports etc.
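
A small sketch for checking this from Ruby, assuming `RubyVM::YJIT.enabled?` is available on builds that include YJIT. The helper name `yjit_active?` is hypothetical.

```ruby
# True only when YJIT is compiled in and currently enabled.
def yjit_active?
  return false unless defined?(RubyVM::YJIT)
  RubyVM::YJIT.respond_to?(:enabled?) && RubyVM::YJIT.enabled?
end

puts "YJIT active:     #{yjit_active?}"
puts "+YJIT in banner: #{RUBY_DESCRIPTION.include?('+YJIT')}"
```

With this change, the two values are expected to agree when Ruby is started with `--yjit-disable` and YJIT has not been enabled yet.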
2024-06-05 20:53:49 +02:00
Kevin Newton
a708b6aa65 [PRISM] Respect eval coverage setting 2024-05-20 12:28:47 -04:00
yui-knk
899d9f79dd Rename vast to ast_value
There is an English word "vast".
This commit changes the name to a clearer one to avoid confusion.
2024-05-03 12:40:35 +09:00
HASUMI Hitoshi
2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` in `ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all changes are replacements of `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and `script_lines`
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, consider it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Kevin Newton
af24ba4034 [PRISM] Raise LoadError when file cannot be read 2024-04-25 14:59:48 -04:00
Jean Boussier
f06670c5a2 Eliminate usage of OBJ_FREEZE_RAW
Previously it would bypass the `FL_ABLE` check, but
since the introduction of shapes, it started behaving
differently from `OBJ_FREEZE`, as it would only set the `FL_FREEZE`
flag, but not update the shape.

I have no indication of this causing a bug yet, but it seems
like a trap waiting to happen.
2024-04-16 17:20:35 +02:00
HASUMI Hitoshi
9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
Nobuyoshi Nakada
b88e0d6653
Merge push_include and ruby_push_include 2024-04-07 17:29:24 +09:00
Nobuyoshi Nakada
0d93fd0f69
Merge push_include_cygwin into push_include 2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada
0620f006c2
Remove translit_char
It has been used only for DOSISH other than Windows.
2024-04-07 17:29:23 +09:00
Nobuyoshi Nakada
df8f1f78f0
[Feature #20329] Separate additional flags from main dump options
Additional flags are a comma-separated list, each preceded by `-` or `+`.

Before:
```sh
$ ruby --dump=insns+without_opt
```

After:
```sh
$ ruby --dump=insns-opt,-optimize
```

At the same time, `parsetree_with_comment` is split to `parsetree`
option and additional `comment` flag.

Before:
```sh
$ ruby --dump=parsetree_with_comment
```

After:
```sh
$ ruby --dump=parsetree,+comment
```

Flags can also be given in separate `--dump` options.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```

Ineffective flags are ignored silently.
```sh
$ ruby --dump=parsetree --dump=+comment --dump=+error_tolerant
```
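
The option syntax described above can be modeled in a few lines of Ruby. This is an illustrative sketch only, not the parser in ruby.c: the function name `parse_dump_values` and the returned hash layout are assumptions made for this example.

```ruby
# Illustrative model only; not the parser in ruby.c.
# Splits the values of one or more --dump options into main dump targets
# and additional flags (a leading "+" enables, a leading "-" disables).
def parse_dump_values(values)
  targets = []
  flags = {}
  values.each do |value|
    value.split(",").each do |item|
      case item
      when /\A\+(.+)\z/ then flags[$1] = true
      when /\A-(.+)\z/  then flags[$1] = false
      else targets << item
      end
    end
  end
  { targets: targets, flags: flags }
end

# One combined option, and the same flags spread over several options.
p parse_dump_values(["parsetree,+comment"])
p parse_dump_values(["parsetree", "+comment", "+error_tolerant"])
```

In this model a flag is simply recorded; whether it has any effect on a given dump target is decided later, which mirrors the statement that ineffective flags are ignored silently.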
2024-04-06 20:27:02 +09:00
Nobuyoshi Nakada
9b5d4274a2
[Feature #20329] Clean up dump sub-options
Restructure `insns_without_opt` and `parsetree_with_comment` as
`insns+without_opt` and `parsetree+with_comment` respectively, like
`+error-tolerant`.
2024-04-06 20:27:01 +09:00
HASUMI Hitoshi
f5e387a300 Separate SCRIPT_LINES__ from ast.c
This patch suggests relocating the code dealing with `SCRIPT_LINES__` from ast.c to ruby_parser.c.

## Background

- I guess `AbstractSyntaxTree.of` method used to use `SCRIPT_LINES__` internally for some reason before
- However, now it appears `SCRIPT_LINES__` is no longer used meaningfully by the method
- As evidence of this, (and as my patch shows,) removing the function call of `rb_script_lines_for()` from `ast_s_of()` does not affect the result of `test/ruby/test_ast.rb`

Given the above, I think two possibilities can be considered:

- (A) `AbstractSyntaxTree.of` no longer needs `SCRIPT_LINES__` (I pick this)
- (B) We lack a test case of `AbstractSyntaxTree.of` that needs to use `SCRIPT_LINES__`

## Besides,

The current implementation causes strange behavior:

```console
ruby -e"SCRIPT_LINES__ = {__FILE__ => []}; puts RubyVM::AbstractSyntaxTree.of(->{ 1 + 2 }, keep_script_lines: true).script_lines"
=> `-e:1:in '<main>': undefined method 'script_lines' for nil (NoMethodError)`
```

I think this is a bug because `AbstractSyntaxTree.of` is not supposed to return `nil` even in this case.
This happens because ast.c depends on `SCRIPT_LINES__`.
At the end of `ast_s_of()`, `node_find()` obviously cannot find the target child node, because it makes no sense to look for the node corresponding to the argument of `AbstractSyntaxTree.of` in an AST built from the value of `{__FILE__ => []}`.

## Solution

Since I think it is good enough for `SCRIPT_LINES__` to be referenced only by ruby.c, I chose possibility (A) and wrote this patch, which moves `rb_script_lines_for()` from ast.c to ruby_parser.c.

So as the result:

- The `ast_s_of()` function no longer looks up `SCRIPT_LINES__`
- Even so, this patched code passes the existing tests
- The strange behavior above no longer happens (I also added a test for it)

Please correct me if I missed something 🙏
2024-04-04 18:29:16 +09:00
Kevin Newton
42d1cd8f7f [PRISM] Pass --enable-frozen-string-literal through to evals 2024-03-27 08:34:42 -04:00
Étienne Barrié
12be40ae6b Implement chilled strings
[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduces "chilled strings". From a user perspective, chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than raising a
`FrozenError`.

Implementation-wise, `rb_compile_option_struct.frozen_string_literal` is
no longer a boolean but a tri-state of `enabled/disabled/unset`.

When code is compiled with frozen string literals neither explicitly enabled
nor disabled, string literals are compiled with a new `putchilledstring`
instruction. This instruction is identical to `putstring` except it marks
the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags.

Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:
  - `String#freeze`: clears the chilled flag.
  - `String#-@`: acts as if the string was mutable.
  - `String#+@`: acts as if the string was mutable.
  - `String#clone`: copies the chilled flag.
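
A minimal sketch of the user-visible behavior, assuming the file has no `frozen_string_literal` magic comment and Ruby is not started with `--enable-frozen-string-literal` or `--disable-frozen-string-literal`. It also assumes the chilled-string warning is reported under the deprecation category; on a Ruby without chilled strings the mutation simply succeeds without a warning.

```ruby
# No frozen_string_literal magic comment here, so the literals below are
# compiled with the default ("unset") setting.
Warning[:deprecated] = true # assumption: the warning is a deprecation warning

s = "foo"
s << "bar" # allowed; on a Ruby with chilled strings this may emit a warning
puts s

t = "baz".freeze # an explicit freeze makes the string really frozen
begin
  t << "!"
rescue FrozenError => e
  puts "#{e.class} raised for the explicitly frozen string"
end
```

The first mutation goes through because a chilled string only pretends to be frozen, while the explicitly frozen string still raises `FrozenError`.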

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2024-03-19 09:26:49 +01:00
Kevin Newton
97810cbbf2 [PRISM] Process encoding on CLI for -K 2024-03-18 11:55:43 -04:00
Jean Boussier
91bf7eb274 Refactor frozen_string_literal check during compilation
In preparation for https://bugs.ruby-lang.org/issues/20205.

The `frozen_string_literal` compilation option will no longer
be a boolean but a tri-state: `on/off/default`.
2024-03-15 15:52:33 +01:00
Nobuyoshi Nakada
c843afbf6f Chomp last punctuations from descriptions for -h
The parts that follow are not shown for the `-h` option, and the lines
should not reach 80 columns.  Some terminal emulators (Windows command prompt at
least) wrap the cursor to the next line when reaching the rightmost
column, before exceeding it.
2024-03-14 01:19:57 +09:00
Nobuyoshi Nakada
dec2a8191c
--dump=prism_parsetree is no longer provided
It did not make sense without the `--parser=prism` option and was just a
duplication.  Now it is `--parser=prism --dump=parsetree`.
2024-03-13 11:28:50 +09:00
Takashi Kokubun
22708be0d7 Revisions for #10198
This fixes some inconsistencies introduced by that PR.
2024-03-12 13:44:48 -07:00
Burdette Lamar
19da3b4ecf
Revisions for help text (#10198) 2024-03-12 15:14:56 -04:00