22 Commits

Author SHA1 Message Date
Jean Boussier
a74c385208 Make setting and accessing class ivars lock-free
Now that class fields have been deletated to a T_IMEMO/class_fields
when we're in multi-ractor mode, we can read and write class instance
variable in an atomic way using Read-Copy-Update (RCU).

Note when in multi-ractor mode, we always use RCU. In theory
we don't need to, instead if we ensured the field is written
before the shape is updated it would be safe.

Benchmark:

```ruby
Warning[:experimental] = false

class Foo
  @foo = 1
  @bar = 2
  @baz = 3
  @egg = 4
  @spam = 5

  class << self
    attr_reader :foo, :bar, :baz, :egg, :spam
  end
end

ractors = 8.times.map do
  Ractor.new do
    1_000_000.times do
      Foo.bar + Foo.baz * Foo.egg - Foo.spam
    end
  end
end

if Ractor.method_defined?(:value)
  ractors.each(&:value)
else
  ractors.each(&:take)
end
```

This branch vs Ruby 3.4:

```bash
$ hyperfine -w 1 'ruby --disable-all ../test.rb' './miniruby ../test.rb'

Benchmark 1: ruby --disable-all ../test.rb
  Time (mean ± σ):      3.162 s ±  0.071 s    [User: 2.783 s, System: 10.809 s]
  Range (min … max):    3.093 s …  3.337 s    10 runs

Benchmark 2: ./miniruby ../test.rb
  Time (mean ± σ):     208.7 ms ±   4.6 ms    [User: 889.7 ms, System: 6.9 ms]
  Range (min … max):   202.8 ms … 222.0 ms    14 runs

Summary
  ./miniruby ../test.rb ran
   15.15 ± 0.47 times faster than ruby --disable-all ../test.rb
```
2025-06-12 14:55:13 +02:00
Jean Boussier
8b5ac5abf2 Fix class instance variable inside namespaces
Now that classes fields are delegated to an object with its own
shape_id, we no longer need to mark all classes as TOO_COMPLEX.
2025-06-12 13:43:29 +02:00
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behave almost exactly as a T_OBJECT, the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namspaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.

Second, it open the door to later make class instance variable
writes atomics, hence be able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespace we should likely mark classes that have
    multiple namespace with a specific shape flag.
2025-06-12 07:58:16 +02:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Alan Wu
3e04f7b69f
Only mark cc->cme_ on valid imemo_callcache
We observed T_NONE on `cc->cme_` on a --repeat-count=50 run a compaction
test on CI:
http://ci.rvm.jp/results/trunk-repeat50@ruby-sp2-noble-docker/5654900

During reference updating for imemo_callcache in
rb_imemo_mark_and_move(), if `cc->klass` is not live, but `cc->_cme` is
live and moved, we go to the vm_cc_invalidate() path which
leaves `cc->_cme` not updated and stale. In the next marking run after
compaction, CME would've become a T_NONE.

So to quote the comment above "... cc is invalidated by
`vm_cc_invalidate()` and cc->cme is not be accessed."
2025-03-16 16:00:08 -04:00
Peter Zhu
62a1528020 Pass allocation size to rb_imemo_new
This would allow imemo to take advantage of VWA and allocate sizes larger
than RVALUE (40 bytes).
2025-01-08 09:11:59 -05:00
Peter Zhu
d0f9f3e2c6 Remove IMEMO_DEBUG
The code path hasn't compiled for almost a year, since 330830dd1a44b6e497250a14d93efae6fa363f82,
so probably nobody uses it.
2025-01-07 11:01:08 -05:00
Peter Zhu
33f95d632d Don't unpoison the CC in vm_ccs_free
The poison status is maintained by the GC, so don't unpoison it in vm_ccs_free.
If the object is not a garbage object, then it should not be poisoned.
2024-12-19 16:25:23 -05:00
Alan Wu
5978f2f114
Fix use-after-free in vm_ccs_free()
`struct rb_callcache *` point to an imemo object on the GC heap when
pushed into `struct rb_class_cc_entries`, but by the time vm_ccs_free()
runs, the entire GC page the imemo was on could already be deallocated.
With the right conditions, vm_ccs_free() wrote to freed memory.
rb_objspace_garbage_object_p() by itself is not enough to determine
liveness.

I conjectured this situation to be possible in
<https://github.com/ruby/ruby/pull/11995> using hints from crashes
in the wild. With c37bdfa5311be0aa8503b995299fb9547cede0a6 ("Make
asan_poison_object poison the whole slot"), the in-tree test suite
now recreates this scenario[^1][^2][^3].

Use rb_gc_pointer_to_heap_p(). Other uses of
rb_objspace_garbage_object_p() could be making the same mistake, but
correcting them might introduce serious performance regressions, so
leave them alone for now.

[^1]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477412
[^2]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477445
[^3]: http://ci.rvm.jp/results/trunk_asan@ruby-sp1/5477448
2024-12-19 12:28:21 -05:00
Peter Zhu
a58675386c Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
Matt Valentine-House
551be8219e Place all non-default GC API behind USE_SHARED_GC
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Peter Zhu
22f12b0a62 Use rb_id_table_foreach_values for marking CC table
We don't use the key, so we can speed it up by not needing to convert the
key to ID in the iterator.
2024-09-10 10:09:50 -04:00
Peter Zhu
51bd816517 [Feature #20470] Split GC into gc_impl.c
This commit splits gc.c into two files:

- gc.c now only contains code not specific to Ruby GC. This includes
  code to mark objects (which the GC implementation may choose not to
  use) and wrappers for internal APIs that the implementation may need
  to use (e.g. locking the VM).

- gc_impl.c now contains the implementation of Ruby's GC. This includes
  marking, sweeping, compaction, and statistics. Most importantly,
  gc_impl.c only uses public APIs in Ruby and a limited set of functions
  exposed in gc.c. This allows us to build gc_impl.c independently of
  Ruby and plug Ruby's GC into itself.
2024-07-03 09:03:40 -04:00
Aaron Patterson
e5160a9c60 Mark the class on orphan call caches
"super" CC's are "orphans", meaning there is no class CC table that
points at them.  Since they are orphans, we should mark the class
reference so that if the cache happens to be used, the class will still
be alive
2024-06-18 09:28:25 -07:00
John Hawthorn
520ab22725 Avoid unnecessary writes to imemo_env during GC
Similar to the previous commit, to avoid unnecessary Copy-on-Write
memory use we should only set this flag when it has not previously been
set.
2024-06-03 11:02:49 -07:00
HASUMI Hitoshi
2244c58b00 [Universal parser] Decouple IMEMO from rb_ast_t
This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby.
To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it
    - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, considers it a bug

Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>
2024-04-26 11:21:08 +09:00
Aaron Patterson
147ca9585e Implement equality for CI comparison when CC searching
When we're searching for CCs, compare the argc and flags for CI rather
than comparing pointers.  This means we don't need to store a reference
to the CI, and it also naturally "de-duplicates" CC objects.

We can observe the effect with the following code:

```ruby
require "objspace"

hash = {}

p ObjectSpace.memsize_of(Hash)

eval ("a".."zzz").map { |key|
  "hash.merge(:#{key} => 1)"
}.join("; ")

p ObjectSpace.memsize_of(Hash)
```

On master:

```
$ ruby -v test.rb
ruby 3.4.0dev (2024-04-15T16:21:41Z master d019b3baec) [arm64-darwin23]
test.rb:3: warning: assigned but unused variable - hash
3424
527736
```

On this branch:

```
$ make runruby
compiling vm.c
linking miniruby
builtin_binary.inc updated
compiling builtin.c
linking static-library libruby.3.4-static.a
ln -sf ../../rbconfig.rb .ext/arm64-darwin23/rbconfig.rb
linking ruby
ld: warning: ignoring duplicate libraries: '-ldl', '-lobjc', '-lpthread'
RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems  ./test.rb
2240
2368
```

Co-authored-by: John Hawthorn <jhawthorn@github.com>
2024-04-18 09:06:33 -07:00
HASUMI Hitoshi
9b1e97b211 [Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines
This patch is part of universal parser work.

## Summary
- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
  - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
  - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details
- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called
  - With regard to this, please see *Future tasks* below

## Future tasks
- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15 20:51:54 +09:00
Peter Zhu
162e13c884 Remove pointer check in vm_ccs_free
We don't need to check that the object is pointer to the GC heap in
vm_ccs_free because it is called during sweeping, which does not free
pages so it can never point to an object that is not on the GC heap.
2024-03-01 10:39:51 -05:00
Peter Zhu
edc7b73fc4 Remove pointer check in moved_or_living_object_strictly_p
We don't need to check that the object is pointer to the GC heap in
moved_or_living_object_strictly_p because it is called during reference
updating, which does not free pages so it can never point to an object
that is not on the GC heap.
2024-02-27 15:49:04 -05:00
Peter Zhu
9d8d029e32 Remove unused variable in imemo.c
The variable klass is only used in debug builds and generates a warning
on non-debug builds.
2024-02-22 15:52:57 -05:00
Peter Zhu
e65315a725 Extract imemo functions from gc.c into imemo.c 2024-02-22 11:35:09 -05:00