[Bug #20271]
[Bug #20267]
[Bug #20255]
`rb_obj_alloc(RBASIC_CLASS(obj))` will always allocate from the basic
40B pool, so if `obj` is larger than `40B`, we'll create a corrupted
object when we later copy the shape_id.
Instead we can use the same logic as Ractor copy, which is
to use `rb_obj_clone`, and later ask the GC to free the original
object.
We then must turn it into a `T_OBJECT`, because otherwise
just changing its class to `RactorMoved` leaves a lot of
ways to keep using the object, e.g.:
```
a = [1, 2, 3]
Ractor.new{}.send(a, move: true)
[].concat(a) # Should raise, but didn't.
```
If it turns out that `rb_obj_clone` isn't performant enough
for some uses, we can always have carefully crafted specialized
paths for the types that would benefit from it.
rb_gc_object_metadata replaces the internal rb_obj_gc_flags API. It
returns an array of name and value pairs, with the last element having
0 for the name.
We should skip reference updating for entries in too complex generic ivars
that are special constants. This fixes the following crash:
    MAX_SHAPES = 0x80000
    MAX_SHAPES.times do |i|
      o = []
      o.instance_variable_set(:"@foo#{i}", 1)
    end
    o = []
    o.instance_variable_set(:"@a", 123)
    GC.compact
The `ids` array and `dsymbol_fstr_hash` were pinned because they were
kept alive by rb_vm_register_global_object. This prevented the GC from
moving them even though there was reference updating code.
This commit changes them to be movable by marking them as root objects.
When the Ractor list is not initialized and a GC is run at boot, then it
would crash because the newobj_cache of the main Ractor is not cleared.
This commit changes it to use ruby_single_main_ractor when it's available
and iterate over the Ractor list when we have multiple Ractors.
For moving garbage collectors, we may want to combine liveness checking
with reference updating for performance. This commit allows for non-weak
references to be passed into the callback function when weak_only is false.
Previously, generic ivars worked differently than the other global tables
during compaction. The other global tables had their references updated
through iteration during rb_gc_update_vm_references. Generic ivars updated
the keys when the object moved and updated the values while reference
updating the object. This was inefficient because it required one lookup
for every moved object and one lookup for every object with generic ivars.
Instead, this commit changes it to iterate over the generic ivar table to
update both the keys and values.
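As a rough illustration of the single-pass approach, here is a toy model in Ruby. It is not the VM's C code: objects are stood in for by integer "addresses", and `update_generic_ivar_table`, `table`, and `forwarding` are hypothetical names invented for this sketch. The point is only that one iteration over the table can rewrite both the owner keys and the ivar values.

```ruby
# Toy model only: integers stand in for object addresses.
# `forwarding` maps an old address to its new address after compaction;
# addresses missing from it did not move.
def update_generic_ivar_table(table, forwarding)
  relocate = ->(ref) { forwarding.fetch(ref, ref) }

  # One pass over the table updates both the key (the owning object)
  # and every value (the objects referenced by its ivars).
  table.each_with_object({}) do |(owner, ivars), updated|
    updated[relocate.call(owner)] = ivars.transform_values { |v| relocate.call(v) }
  end
end

table = {
  0x1000 => { :@a => 0x2000, :@b => 0x3000 },
  0x4000 => { :@c => 0x1000 },
}
forwarding = { 0x1000 => 0x1100, 0x3000 => 0x3300 }

updated = update_generic_ivar_table(table, forwarding)
```

No per-object lookup into the table is needed here, since every entry is visited exactly once.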
We must disable GC when running RUBY_INTERNAL_EVENT_NEWOBJ hooks because
the callback could call xmalloc, which could potentially trigger a GC.
A lot of code is not safe if a GC runs right after an object has been
allocated, because it performs initialization for the object and
assumes that the GC does not trigger before then.
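The actual change lives in the VM's C code, but the same "disable, run the callback, restore" shape can be sketched at the Ruby level. This is only an analogy: `with_gc_disabled` is a hypothetical helper, and it relies on the documented behavior that `GC.disable` returns true when GC was already disabled.

```ruby
# Hypothetical helper: run a block with GC disabled, then restore
# whatever state was in effect before the call.
def with_gc_disabled
  was_disabled = GC.disable # true if GC was already disabled
  yield
ensure
  # Only re-enable if this call was the one that disabled GC.
  GC.enable unless was_disabled
end

result = with_gc_disabled do
  # Allocation inside the block cannot trigger a collection.
  Array.new(1_000) { Object.new }.size
end
```

Restoring the previous state instead of unconditionally enabling GC keeps the helper safe to nest.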
[Bug #20942]
If we've raised a memerror while the VM is locked, and the tag we're
jumping to has been locked at a different level to the current lock (i.e.
we've locked the VM again since the tag we're jumping to), then we should
consider this memerror fatal and exit, since the tag cannot unlock the
VM.
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
This change poisons the whole slot of the object rather than just the flags.
This allows ASAN to find any reads/writes into the slot after it has been
freed.
rb_gc_location doesn't check that the object is actually a Ruby object
and only checks if the object looks like a T_MOVED. This may have unexpected
outcomes if the object is not a Ruby object (e.g. a piece of malloc memory
may be corrupted).
When reference updating ObjectSpace.trace_object_allocations, we need to
check whether the object is valid or not because it does not mark the
object so the object may be dead. This can cause a segmentation fault
if the object is on a free heap page.
For example, the following script crashes:
    require "objspace"
    objs = []
    ObjectSpace.trace_object_allocations do
      1_000_000.times do
        objs << Object.new
      end
    end
    objs = nil
    # Free pages that the objs were on
    GC.start
    # Run compaction and check that it doesn't crash
    GC.compact
We have name fragmentation for this feature, including "shared GC",
"modular GC", and "external GC". This commit standardizes the feature
name to "modular GC" and the implementation to "GC library".
Leave room for each GC implementation to decide how to handle
multi-threaded situations. They can be totally reentrant, or can have
their own mutex, or can rely on rb_thread_call_with_gvl.
In any case the allocator (has been, but now officially is)
expected to run properly without a GVL. This means there needs to be
a way for them to inform the interpreter about their allocation
failures, without relying on raising exceptions.
Let them do so by returning NULL.
So that it doesn't get included in the generated binaries for builds
that don't support loading shared GC modules
Co-Authored-By: Peter Zhu <peter@peterzhu.ca>
```
compiling gc.c
In file included from gc.c:80:
/usr/include/sys/prctl.h:88:8: error: redefinition of 'struct prctl_mm_map'
88 | struct prctl_mm_map {
| ^~~~~~~~~~~~
In file included from gc.c:79:
/usr/include/linux/prctl.h:134:8: note: originally defined here
134 | struct prctl_mm_map {
| ^~~~~~~~~~~~
```
The first include is not needed and is what causes this issue.
Two other places in Ruby use only the sys include.
See https://github.com/seccomp/libseccomp/issues/19 for a similar problem.
Use PR_SET_VMA_ANON_NAME to set human-readable names for anonymous
virtual memory areas mapped by `mmap()` when compiled and run on Linux
5.17 or higher. This makes it convenient for developers to debug mmap.
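On kernels that support it, a named anonymous mapping shows up in `/proc/<pid>/maps` with an `[anon:<name>]` suffix, so the names can be listed from a script. The sketch below assumes that format; `named_anon_regions` is a hypothetical helper and the sample text, including the region name, is made up for illustration.

```ruby
# Hypothetical helper: extract the names of named anonymous mappings
# from text in the /proc/<pid>/maps format.
def named_anon_regions(maps_text)
  maps_text.each_line.filter_map do |line|
    line[/\[anon:([^\]]+)\]/, 1]
  end.uniq
end

# Made-up sample; "example_region" is not a name Ruby is known to use.
sample = <<~MAPS
  7f1c2a000000-7f1c2a200000 rw-p 00000000 00:00 0    [anon:example_region]
  7f1c2a200000-7f1c2a400000 rw-p 00000000 00:00 0
  7ffd5c1e0000-7ffd5c201000 rw-p 00000000 00:00 0    [stack]
MAPS

names = named_anon_regions(sample)

# On Linux, the same helper can inspect the current process.
if File.readable?("/proc/self/maps")
  live_names = named_anon_regions(File.read("/proc/self/maps"))
end
```

Unnamed anonymous mappings and special regions such as `[stack]` are skipped because they do not carry the `anon:` prefix.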
This avoids the need to malloc, and reduces the complexity of truncating
the long string for display in RUBY_DESCRIPTION.
The developer of a GC implementation should be responsible for giving it
a succinct name.
This will add +MOD_GC to the version string and Ruby description when
Ruby is compiled with shared GC support.
When shared GC support is compiled in and a GC module has been loaded
using RUBY_GC_LIBRARY, the version string will include the name of
the currently active GC as reported by the rb_gc_active_gc_name function
in the form
+MOD_GC[gc_name]
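A script that wants to know whether it is running on such a build could look for this marker in the description string. The following is a minimal sketch, not an official API: `modular_gc_info` is a hypothetical helper, and the sample strings (including `example_gc`) are invented for illustration.

```ruby
# Hypothetical helper: returns the GC name from "+MOD_GC[name]",
# :compiled_in for a bare "+MOD_GC", or nil when the marker is absent.
def modular_gc_info(description)
  match = description.match(/\+MOD_GC(?:\[([^\]]+)\])?/)
  return nil unless match

  match[1] || :compiled_in
end

# Invented description strings, for illustration only.
without_support = "ruby X.Y.Z [x86_64-linux]"
support_only    = "ruby X.Y.Z +MOD_GC [x86_64-linux]"
with_library    = "ruby X.Y.Z +MOD_GC[example_gc] [x86_64-linux]"

info = [without_support, support_only, with_library].map { |d| modular_gc_info(d) }

# The running interpreter can be checked the same way.
current = modular_gc_info(RUBY_DESCRIPTION)
```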
[Feature #20794]
Add a default and read-only key to the GC.config hash that names the
current GC implementation.
This is provided by each implementation by the API function
rb_gc_impl_active_gc_name
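From Ruby code, the value could be read roughly as follows. The key name is not stated above, so `:implementation` here is an assumption, and `active_gc_name` is a hypothetical helper that returns nil on interpreters without `GC.config`.

```ruby
# Hypothetical helper: name of the active GC implementation, or nil
# if this interpreter has no GC.config or the key is missing.
def active_gc_name
  return nil unless GC.respond_to?(:config)

  GC.config[:implementation] # assumed key name
end

gc_name = active_gc_name
```

Because the key is described as read-only, this sketch only reads it and never passes it back to `GC.config`.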