2668 Commits

Author SHA1 Message Date
Jean Boussier
0350290262 Ractor: Fix moving embedded objects
[Bug #20271]
[Bug #20267]
[Bug #20255]

`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.

Instead we can use the same logic as ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.

We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:

```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but didn't.
```

If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
2025-03-31 12:01:55 +02:00
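As a rough illustration of the behavior commit 0350290262 is protecting, the sketch below moves an array into another Ractor and then calls a method on the stale reference. It assumes a Ruby build where `Ractor` is available (it is experimental and prints a warning); the method name `moved_error_after_send` is made up for this example and is not part of the commit.

```ruby
# Sketch only: shows that a moved object rejects further method calls.
# `moved_error_after_send` is a hypothetical helper name.
def moved_error_after_send
  receiver = Ractor.new { Ractor.receive } # waits for exactly one message
  a = [1, 2, 3]
  receiver.send(a, move: true)             # `a` now belongs to `receiver`

  begin
    a.size                                 # any call on the moved reference
    nil                                    # reached only if nothing was raised
  rescue Ractor::MovedError => e
    e.class
  end
end

MOVED_RESULT = moved_error_after_send
```

The receiving Ractor blocks on `Ractor.receive` so that its incoming port is still open when `send` runs.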
Peter Zhu
2183899fd1 Re-use objspace variable instead of calling rb_gc_get_objspace() 2025-03-26 09:35:51 -04:00
Peter Zhu
319fcca656 Move rb_gc_impl_ractor_cache_free to shutdown section 2025-03-24 08:49:30 -04:00
Peter Zhu
a572ec1ba0 Move rb_gc_impl_objspace_free to shutdown section 2025-03-24 08:49:30 -04:00
Peter Zhu
7b6e07ea93 Add rb_gc_object_metadata API
This function replaces the internal rb_obj_gc_flags API. rb_gc_object_metadata
returns an array of name and value pairs, with the last element having
0 for the name.
2025-02-19 09:47:28 -05:00
Peter Zhu
0597cbcb1d Fix crash for special constants in too complex generic ivars
We should skip reference updating for entries in too complex generic ivars
that are special constants. This fixes the following crash:

    MAX_SHAPES = 0x80000

    MAX_SHAPES.times do |i|
      o = []
      o.instance_variable_set(:"@foo#{i}", 1)
    end

    o = []

    o.instance_variable_set(:"@a", 123)

    GC.compact
2025-02-18 17:09:28 -05:00
Nobuyoshi Nakada
4a67ef09cc [Feature #21116] Extract RJIT as a third-party gem 2025-02-13 18:01:03 +09:00
Peter Zhu
3fb455adab Move global symbol reference updating to rb_sym_global_symbols_update_references 2025-02-10 08:47:44 -05:00
Peter Zhu
8d0416ae0b Make ruby_global_symbols movable
The `ids` array and `dsymbol_fstr_hash` were pinned because they were
kept alive by rb_vm_register_global_object. This prevented the GC from
moving them even though there was reference updating code.

This commit changes it to be marked movable by marking it as a root object.
2025-02-10 08:47:44 -05:00
Peter Zhu
a084fef9af [Bug #21099] Fix GC when Ractor list not initialized
When the Ractor list is not initialized and a GC is run at boot, then it
would crash because the newobj_cache of the main Ractor is not cleared.
This commit changes it to use ruby_single_main_ractor when it's available
and iterate over the Ractor list when we have multiple Ractors.
2025-01-30 10:10:48 -05:00
Peter Zhu
98b36f6f36 Use rb_gc_vm_weak_table_foreach for reference updating
We can use rb_gc_vm_weak_table_foreach for reference updating of weak tables
in the default GC.
2025-01-27 10:28:36 -05:00
Peter Zhu
9e5ff79c5b Optionally traverse non-weak references in rb_gc_vm_weak_table_foreach
For moving garbage collectors, we may want to combine liveness checking
with reference updating for performance. This commit allows for non-weak
references to be passed into the callback function when weak_only is false.
2025-01-27 10:28:36 -05:00
Peter Zhu
7ed08c4fd3 Fix memory leak in rb_gc_vm_weak_table_foreach
When deleting from the generic ivar table, we need to free the gen_ivtbl;
otherwise we will have a memory leak.
2025-01-23 10:24:35 -05:00
Peter Zhu
89240eb2fb Add generic ivar reference updating step
Previously, generic ivars worked differently than the other global tables
during compaction. The other global tables had their references updated
through iteration during rb_gc_update_vm_references. Generic ivars updated
the keys when the object moved and updated the values while reference
updating the object. This is inefficient as this required one lookup for
every moved object and one lookup for every object with generic ivars.

Instead, this commit changes it to iterate over the generic ivar table to
update both the keys and values.
2025-01-22 08:54:52 -05:00
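A Ruby-level way to exercise the path touched by 89240eb2fb is to attach instance variables to objects that are not plain `T_OBJECT`s (arrays here), compact the heap, and check that the values are still reachable. This is a minimal sketch, assuming that such ivars are stored in the generic ivar table as the commit message describes; the method name `generic_ivars_survive_compaction?` is hypothetical.

```ruby
# Sketch only: checks that ivars set on arrays are intact after compaction.
def generic_ivars_survive_compaction?(count = 1_000)
  objs = Array.new(count) do |i|
    ary = []
    ary.instance_variable_set(:@tag, "tag-#{i}")
    ary
  end

  # GC.compact is not implemented on every platform, so guard the call.
  GC.compact if GC.respond_to?(:compact)

  # Every array should still report the tag it was given.
  objs.each_with_index.all? do |ary, i|
    ary.instance_variable_get(:@tag) == "tag-#{i}"
  end
end
```

The check passes trivially on platforms without compaction, so it is only a smoke test, not proof that keys and values were updated.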
Peter Zhu
d78aef5e3f Add not null checks to rb_gc_vm_weak_table_foreach
If the tables are null (which happens when a GC is run at boot), it will
segfault when trying to iterate.
2025-01-16 10:31:47 -05:00
Nobuyoshi Nakada
5df20ab0b4 Un-constify mark_current_machine_context on wasm
As `SET_STACK_END` updates `ec->machine.stack_end`, it cannot be
const.
2025-01-16 22:22:43 +09:00
Peter Zhu
67744879a1 Use existing vm variable for frozen strings in rb_gc_vm_weak_table_foreach 2025-01-15 15:57:52 -05:00
Peter Zhu
99ff0224a5 Move rbimpl_size_add_overflow from gc.c to memory.h 2025-01-02 11:03:04 -05:00
Nobuyoshi Nakada
7df5d65eac [Bug #20981] Bring back rb_undefine_finalizer 2024-12-25 22:21:37 +09:00
Peter Zhu
f4476f0d07 Disable GC during RUBY_INTERNAL_EVENT_NEWOBJ
We must disable GC when running RUBY_INTERNAL_EVENT_NEWOBJ hooks because
the callback could call xmalloc which could potentially trigger a GC,
and a lot of code is unsafe to trigger a GC right after an object has
been allocated because they perform initialization for the object and
assume that the GC does not trigger before then.
2024-12-23 09:03:32 -05:00
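For context on f4476f0d07, one Ruby-visible consumer of allocation hooks is `ObjectSpace.trace_object_allocations` from the `objspace` extension, which, as far as I know, is built on `RUBY_INTERNAL_EVENT_NEWOBJ`. The sketch below only shows that the hook runs for each new object and records where it was allocated. It does not reproduce the GC-during-hook problem, and the method name `allocation_line_of_fresh_object` is made up.

```ruby
require "objspace"

# Sketch only: records the source line at which a fresh object was allocated.
def allocation_line_of_fresh_object
  line = nil
  ObjectSpace.trace_object_allocations do
    obj = Object.new                               # allocation hook fires here
    line = ObjectSpace.allocation_sourceline(obj)  # line recorded by the hook
  end
  line
end
```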
Nobuyoshi Nakada
2f2530b195 Allow variables in modular_gc_dir
Such as `$(ruby_version)`, `$(arch)` and so on.
2024-12-22 22:10:26 +09:00
Nobuyoshi Nakada
626037e143 Support RUBY_MODULAR_GC with LOAD_RELATIVE 2024-12-22 22:10:26 +09:00
Peter Zhu
97f5546676 Don't print bug report in asan_death_callback when no VM
If we don't have the VM (e.g. printing memory leaks in LSAN after shutdown)
then we will crash when we try to print the bug report. This issue was
reported in: https://github.com/ruby/ruby/pull/12309#issuecomment-2555766525
2024-12-20 15:04:08 -05:00
Matt Valentine-House
2f6c694977 Memerror is fatal if VM cannot be unlocked.
[Bug #20942]

If we've raised a memerror while the VM is locked, and the tag we're
jumping to has been locked at a different level to the current lock (i.e.
we've locked the VM again since the tag we're jumping to) then we should
consider this memerror fatal and exit, since the tag cannot unlock the
VM.

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-12-20 07:49:30 +00:00
Peter Zhu
a58675386c Prefix asan_poison_object with rb 2024-12-19 09:14:34 -05:00
Peter Zhu
c37bdfa531 Make asan_poison_object poison the whole slot
This change poisons the whole slot of the object rather than just the flags.
This allows ASAN to find any reads/writes into the slot after it has been
freed.
2024-12-19 09:14:34 -05:00
Peter Zhu
9733304d61 Assert Ruby object in rb_gc_location
rb_gc_location doesn't check that the object is actually a Ruby object
and only checks if the object looks like a T_MOVED. This may have unexpected
outcomes if the object is not a Ruby object (e.g. a piece of malloc memory
may be corrupted).
2024-12-17 11:03:38 -05:00
Peter Zhu
80b8feb929 Don't directly use rb_gc_impl_location in gc.c
Use the wrapper gc_location_internal instead that checks for special
constants.
2024-12-16 13:32:35 -05:00
Peter Zhu
d28368d27f Move special constant check in rb_gc_location to gc.c 2024-12-16 13:32:35 -05:00
Peter Zhu
516a6cd1ad Check whether object is valid in allocation_info_tracer_compact
When reference updating ObjectSpace.trace_object_allocations, we need to
check whether the object is valid or not because it does not mark the
object so the object may be dead. This can cause a segmentation fault
if the object is on a free heap page.

For example, the following script crashes:

    require "objspace"

    objs = []
    ObjectSpace.trace_object_allocations do
      1_000_000.times do
        objs << Object.new
      end
    end

    objs = nil

    # Free pages that the objs were on
    GC.start

    # Run compaction and check that it doesn't crash
    GC.compact
2024-12-16 12:24:24 -05:00
Peter Zhu
79d90e7351 Call rb_bug_without_die when ASAN error reported
This will give us the Ruby stack trace when an ASAN error is reported.
2024-12-12 14:07:56 -05:00
Nobuyoshi Nakada
f243733564 [Bug #20941] Bail out when recursing no memory 2024-12-11 16:12:04 +09:00
Peter Zhu
c45503f957 Add rb_gc_impl_active_gc_name to gc/gc_impl.h 2024-12-06 10:22:03 -05:00
Peter Zhu
ce1ad1b816 Standardize on the name "modular GC"
We have name fragmentation for this feature, including "shared GC",
"modular GC", and "external GC". This commit standardizes the feature
name to "modular GC" and the implementation to "GC library".
2024-12-05 10:33:26 -05:00
Peter Zhu
62b51d9ad7 Use BUILDING_SHARED_GC instead of RB_AMALGAMATED_DEFAULT_GC
We can use the BUILDING_SHARED_GC flag to check if we're building gc_impl.h
as a shared GC or building the default GC.
2024-12-04 10:25:43 -05:00
Nobuyoshi Nakada
86c01b6aa0 [Bug #20928] Fix build when malloc_usable_size is available
Copy from gc/default/default.c and revert the part of 51bd81651794.
2024-12-04 17:49:55 +09:00
Peter Zhu
3a90663776 Move external_gc_loaded_p to gc_functions 2024-12-03 16:16:13 -05:00
John Hawthorn
a505cd32fb RUBY_DEBUG: Verify PC correctness every alloc 2024-11-29 20:37:27 -08:00
卜部昌平
705714be3e prefer ruby_memerror instead
This could run outside the GVL
2024-11-29 23:19:05 +09:00
卜部昌平
25ad7e8e6c rb_gc_impl_malloc can return NULL
Leave room for each GC implementation to decide how to handle
multi-threaded situations.  They can be totally reentrant, or can have
their own mutex, or can rely on rb_thread_call_with_gvl.

In any case, the allocator is (as it has been, but now officially)
expected to run properly without a GVL.  This means there needs to be
a way for them to inform the interpreter about their allocation
failures, without relying on raising exceptions.

Let them do so by returning NULL.
2024-11-29 23:19:05 +09:00
Matt Valentine-House
f127bcb829 define rb_current_ec_set in all cases 2024-11-25 13:05:23 +00:00
Matt Valentine-House
551be8219e Place all non-default GC API behind USE_SHARED_GC
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules

Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Matt Valentine-House
d61933e503 Use extconf to build external GC modules
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
2024-11-25 13:05:23 +00:00
Earlopain
3826019f31 Fix a build failure with musl
```
compiling gc.c
In file included from gc.c:80:
/usr/include/sys/prctl.h:88:8: error: redefinition of 'struct prctl_mm_map'
   88 | struct prctl_mm_map {
      |        ^~~~~~~~~~~~
In file included from gc.c:79:
/usr/include/linux/prctl.h:134:8: note: originally defined here
  134 | struct prctl_mm_map {
      |        ^~~~~~~~~~~~
```

The first include is not needed and is what causes this issue.
Two other places in ruby exclusively use the sys import.

See https://github.com/seccomp/libseccomp/issues/19 for a similar problem.
2024-11-24 17:47:06 +09:00
Kunshan Wang
8ae7c22972 Annotate anonymous mmap
Use PR_SET_VMA_ANON_NAME to set human-readable names for anonymous
virtual memory areas mapped by `mmap()` when compiled and run on Linux
5.17 or higher.  This makes it convenient for developers to debug mmap.
2024-11-21 13:48:05 -05:00
Nobuyoshi Nakada
36d02dc33e Fix format modifier for size_t
Also fix the message; a name exactly `RB_GC_MAX_NAME_LEN` chars long is OK.
2024-11-17 22:45:07 +09:00
Nobuyoshi Nakada
b4d8e90c2a rb_bug prints a newline after the given message [ci skip] 2024-11-15 14:52:31 +09:00
Matt Valentine-House
6795fc4981 rb_bug if rb_gc_impl_active_gc_name is too long
This avoids the need to malloc, and reduces the complexity of truncating
the long string for display in RUBY_DESCRIPTION.

The developer of a GC implementation should be responsible for giving it
a succinct name.
2024-11-14 10:46:36 +00:00
Matt Valentine-House
ee290c94a3 Include the currently active GC in RUBY_DESCRIPTION
This will add +MOD_GC to the version string and Ruby description when
Ruby is compiled with shared gc support.

When shared GC support is compiled in and a GC module has been loaded
using RUBY_GC_LIBRARY, the version string will include the name of
the currently active GC as reported by the rb_gc_active_gc_name function
in the form

+MOD_GC[gc_name]

[Feature #20794]
2024-11-14 10:46:36 +00:00
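The `+MOD_GC` / `+MOD_GC[gc_name]` marker described in ee290c94a3 can be detected from Ruby by pattern-matching the description string. This is a minimal sketch under the assumption that the marker appears exactly in that form; `MOD_GC_PATTERN` and `modular_gc_info` are hypothetical names, not part of Ruby.

```ruby
# Sketch only: extracts modular GC information from a description string.
# Matches "+MOD_GC" with an optional "[name]" directly after it.
MOD_GC_PATTERN = /\+MOD_GC(?:\[(?<name>[^\]]+)\])?/

def modular_gc_info(description = RUBY_DESCRIPTION)
  match = MOD_GC_PATTERN.match(description)
  return { modular: false, name: nil } unless match

  # `name` is nil when modular GC support is compiled in but no library is loaded.
  { modular: true, name: match[:name] }
end
```

Calling `modular_gc_info` with no argument inspects the running interpreter. Passing a string makes it easy to try the pattern against a description copied from another build.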
Matt Valentine-House
fa10441981 Expose GC.config[:implementation], to query the currently active GC
Add a default and readonly key to the GC.config hash that names the
current GC implementation.

This is provided by each implementation by the API function
rb_gc_impl_active_gc_name
2024-11-14 10:46:36 +00:00
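Querying the key added in fa10441981 might look like the sketch below. It assumes a Ruby new enough to provide `GC.config` and falls back to `nil` otherwise; the method name `active_gc_implementation` is hypothetical.

```ruby
# Sketch only: returns the active GC implementation name, or nil when
# this Ruby does not provide GC.config.
def active_gc_implementation
  return nil unless GC.respond_to?(:config)

  GC.config[:implementation]
end
```

The exact string returned depends on the build and on any GC library that was loaded, so callers should treat it as an opaque label.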