This behave almost exactly as a T_OBJECT, the layout is entirely
compatible.
This aims to solve two problems.
First, it solves the problem of namspaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.
Second, it open the door to later make class instance variable
writes atomics, hence be able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.
Considerations:
- Right now the `RClass` shape_id is always synchronized,
but with namespace we should likely mark classes that have
multiple namespace with a specific shape flag.
The type isn't opaque because Ruby isn't often compiled with LTO,
so for optimization purpose it's better to allow as much inlining
as possible.
However ideally only `shape.c` and `shape.h` should deal with
the actual struct, and everything else should just deal with opaque
`shape_id_t`.
Now that there no longer multiple shape roots, all we need to do
when moving an object from one slot to the other is to update the
`heap_index` part of the shape_id.
Since this never need to create a shape transition, it will always
work and never result in a complex shape.
Freezing an object changes its `shape_id` This is necessary
so that `setivar` routines can use the `shape_id` as a cache key
and save on checking the frozen status every time.
However for `getivar` routines, this causes needless cache misses.
By clearing that bit we increase hit rate in codepaths that see
both frozen and mutable objects.
Instead `shape_id_t` higher bits contain flags, and the first one
tells whether the shape is frozen.
This has multiple benefits:
- Can check if a shape is frozen with a single bit check instead of
dereferencing a pointer.
- Guarantees it is always possible to transition to frozen.
- This allow reclaiming `FL_FREEZE` (not done yet).
The downside is you have to be careful to preserve these flags
when transitioning.
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353]
Even thought `shape_id_t` has been make 32bits, we were still limited
to use only the lower 16 bits because they had to fit alongside `attr_index_t`
inside a `uintptr_t` in inline caches.
By enlarging inline caches we can unlock the full 32bits on all
platforms, allowing to use these extra bits for tagging.
Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
lookup inside `shape->edges`.
To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.
This increases memory allocations, however we expect that
creating new transitions becomes increasingly rare over time.
```ruby
class A
def initialize(bool)
@a = 1
if bool
@b = 2
else
@c = 3
end
end
def test
@d = 4
end
end
def bench(iterations)
i = iterations
while i > 0
A.new(true).test
A.new(false).test
i -= 1
end
end
if ARGV.first == "ractor"
ractors = 8.times.map do
Ractor.new do
bench(20_000_000 / 8)
end
end
ractors.each(&:take)
else
bench(20_000_000)
end
```
The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.
Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.
[Feature #21353]
Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.
I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.