126 Commits

Author SHA1 Message Date
Jean Boussier
7c22330cd2 Allocate rb_shape_tree statically
There is no point allocating it during init; it adds
a useless indirection.
2025-06-12 17:08:22 +02:00
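For context, a minimal sketch of the change described above, using the `rb_shape_tree_ptr` name that appears later in this series; the init function name is hypothetical. Allocating the tree statically removes the pointer load on every access.

```c
/* Before (sketch): the tree is heap-allocated during init and every
 * access goes through rb_shape_tree_ptr, an extra indirection. */
rb_shape_tree_t *rb_shape_tree_ptr;

void
Init_shape_tree(void) /* hypothetical init function name */
{
    rb_shape_tree_ptr = xcalloc(1, sizeof(rb_shape_tree_t));
}

/* After (sketch): the tree itself is a statically allocated global,
 * so accesses reach it directly without chasing a pointer. */
rb_shape_tree_t rb_shape_tree;
```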
Jean Boussier
de4b910381 Get rid of GET_SHAPE_TREE()
It's a useless indirection.
2025-06-12 17:08:22 +02:00
Jean Boussier
e070d93573 Get rid of rb_shape_lookup 2025-06-12 17:08:22 +02:00
Jean Boussier
0292b702c4 shape.h: make RSHAPE static inline
Since the shape_tree_ptr is `extern` it should be possible to
fully inline `RSHAPE`.
2025-06-12 17:08:22 +02:00
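A minimal sketch of what the fully inlined lookup could look like in `shape.h`; the `shape_list` field and the absence of any tag masking are assumptions made for illustration.

```c
/* shape.h (sketch): with rb_shape_tree_ptr declared extern, RSHAPE can
 * be a static inline function, so callers get the array lookup inlined
 * instead of paying for an out-of-line call. */
RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;

static inline rb_shape_t *
RSHAPE(shape_id_t shape_id)
{
    /* In the real code any flag bits in the id would need to be masked
     * off first; this sketch assumes a plain index into the shape list. */
    return &rb_shape_tree_ptr->shape_list[shape_id];
}
```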
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behaves almost exactly like a T_OBJECT; the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namespaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace-specific
shape.

Second, it opens the door to later making class instance variable
writes atomic, hence being able to read class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespaces we should likely mark classes that have
    multiple namespaces with a specific shape flag.
2025-06-12 07:58:16 +02:00
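A minimal sketch of the copy-and-swap idea mentioned above for future multi-ractor writes; the helper names and the `fields_obj` accessor are hypothetical.

```c
/* Sketch: writers duplicate the class's fields object, mutate the copy,
 * and publish it with a single atomic swap, so readers can load
 * fields_obj without taking the VM lock. Helper names are hypothetical. */
static void
class_fields_write(rb_classext_t *ext, ID field_name, VALUE val)
{
    for (;;) {
        VALUE old_fields = ext->fields_obj;
        VALUE new_fields = class_fields_dup(old_fields); /* hypothetical */

        class_fields_set(new_fields, field_name, val);   /* hypothetical */

        /* Publish atomically; if another ractor swapped first, retry
         * against the fields object it installed. */
        if (RUBY_ATOMIC_VALUE_CAS(ext->fields_obj, old_fields, new_fields) == old_fields) {
            return;
        }
    }
}
```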
Jean Boussier
95201299fd Refactor the last references to rb_shape_t
The type isn't opaque because Ruby isn't often compiled with LTO,
so for optimization purposes it's better to allow as much inlining
as possible.

However, ideally only `shape.c` and `shape.h` should deal with
the actual struct, and everything else should just deal with the
opaque `shape_id_t`.
2025-06-11 16:38:38 +02:00
Jean Boussier
4463ac264d shape.h: remove YJIT workaround
The YJIT x86 backend would crash if the shape_id top bit was set.
This should be fixed now.
2025-06-11 14:21:43 +02:00
Jean Boussier
a640723d31 Simplify rb_gc_rebuild_shape
Now that there are no longer multiple shape roots, all we need to do
when moving an object from one slot to another is to update the
`heap_index` part of the shape_id.

Since this never needs to create a shape transition, it will always
work and never result in a complex shape.
2025-06-07 18:30:44 +02:00
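A self-contained sketch of the idea, with assumed bit positions (the real masks and shift live in `shape.h`): moving an object between size pools only has to rewrite the `heap_index` bits of its shape_id, never create a new shape.

```c
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Assumed layout, for illustration only: a few high bits of the
 * shape_id hold the heap (size pool) index, the rest identify the shape. */
#define SHAPE_ID_HEAP_INDEX_SHIFT 24
#define SHAPE_ID_HEAP_INDEX_MASK  ((shape_id_t)0x7 << SHAPE_ID_HEAP_INDEX_SHIFT)

static shape_id_t
rebuild_shape_id(shape_id_t src, uint32_t dst_heap_index)
{
    /* Clear the old heap_index bits and install the destination pool's
     * index; no shape transition is needed, so this can never fail or
     * fall back to a too-complex shape. */
    return (src & ~SHAPE_ID_HEAP_INDEX_MASK)
         | ((shape_id_t)dst_heap_index << SHAPE_ID_HEAP_INDEX_SHIFT);
}
```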
Jean Boussier
191f6e3b87 Get rid of rb_shape_t.heap_id 2025-06-07 18:30:44 +02:00
Jean Boussier
6eb0cd8df7 Get rid of SHAPE_T_OBJECT
Now that we have the `heap_index` in shape flags we no longer
need `T_OBJECT` shapes.
2025-06-07 18:30:44 +02:00
Jean Boussier
1c96aed6ee Remove EMBEDDED shape_id flags 2025-06-07 18:30:44 +02:00
Jean Boussier
54edc930f9 Leave the shape_id_t highest bit unused to avoid crashing YJIT 2025-06-07 18:30:44 +02:00
Jean Boussier
689ec51146 Replicate heap_index in shape_id flags.
This is preparation for getting rid of `T_OBJECT` transitions.
By first only replicating the information, it's easier to ensure
consistency.
2025-06-07 18:30:44 +02:00
Jean Boussier
4e39580992 Refactor raw accesses to rb_shape_t.capacity 2025-06-05 22:06:15 +02:00
Jean Boussier
772fc1f187 Get rid of rb_shape_t.flags
Now all flags are only in the `shape_id_t`, and can all be checked
without needing to dereference a pointer.
2025-06-05 07:44:44 +02:00
Jean Boussier
675f33508c Get rid of TOO_COMPLEX shape type
Instead it's now a `shape_id` flag.

This allows checking whether an object is complex without having
to chase the `rb_shape_t` pointer.
2025-06-04 13:13:50 +02:00
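A self-contained sketch with an assumed flag bit: the too-complex check becomes a single bit test on the id, with no pointer chase into the shape tree.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Assumed bit position, for illustration only. */
#define SHAPE_ID_FL_TOO_COMPLEX ((shape_id_t)1 << 29)

static bool
shape_id_too_complex_p(shape_id_t shape_id)
{
    /* Previously this required dereferencing the shape pointer to look
     * at its type; with the flag in the id itself, it is a plain bit test. */
    return shape_id & SHAPE_ID_FL_TOO_COMPLEX;
}
```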
Jean Boussier
bbd5a5a81d vm_getivar: normalize shape_id to ignore frozen state
Freezing an object changes its `shape_id`. This is necessary
so that `setivar` routines can use the `shape_id` as a cache key
and save on checking the frozen status every time.

However, for `getivar` routines this causes needless cache misses.
By clearing that bit we increase the hit rate in code paths that see
both frozen and mutable objects.
2025-06-04 07:59:20 +02:00
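A self-contained sketch of the normalization, with an assumed frozen bit: `getivar` caches compare against the id with the frozen bit cleared, so a frozen and a mutable instance of the same shape hit the same cache entry.

```c
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Assumed bit position, for illustration only. */
#define SHAPE_ID_FL_FROZEN ((shape_id_t)1 << 28)

static shape_id_t
canonical_shape_id_for_getivar(shape_id_t shape_id)
{
    /* setivar must distinguish frozen objects (the write has to raise),
     * but reads behave the same either way, so the frozen bit is masked
     * off before comparing with the inline cache key. */
    return shape_id & ~SHAPE_ID_FL_FROZEN;
}
```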
Jean Boussier
625d6a9cbb Get rid of frozen shapes.
Instead, the higher bits of `shape_id_t` contain flags, and the first one
tells whether the shape is frozen.

This has multiple benefits:
  - Can check if a shape is frozen with a single bit check instead of
    dereferencing a pointer.
  - Guarantees it is always possible to transition to frozen.
  - This allows reclaiming `FL_FREEZE` (not done yet).

The downside is you have to be careful to preserve these flags
when transitioning.
2025-06-04 07:59:20 +02:00
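A self-contained sketch of the first two benefits, with the same assumed frozen bit as above: transitioning to frozen is just setting a bit, so it cannot allocate or fail, and checking it back is a single bit test.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Assumed bit position, for illustration only. */
#define SHAPE_ID_FL_FROZEN ((shape_id_t)1 << 28)

/* Frozen is a flag on the id itself: "transitioning" to frozen is a bit
 * set (always possible, no new shape needed), and the check is a bit
 * test instead of a pointer dereference. */
static shape_id_t shape_id_freeze(shape_id_t shape_id)   { return shape_id | SHAPE_ID_FL_FROZEN; }
static bool       shape_id_frozen_p(shape_id_t shape_id) { return shape_id & SHAPE_ID_FL_FROZEN; }
```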
Jean Boussier
e27404af9e Use all 32bits of shape_id_t on all platforms
Followup: https://github.com/ruby/ruby/pull/13341 / [Feature #21353]

Even though `shape_id_t` has been made 32 bits, we were still limited
to using only the lower 16 bits because the id had to fit alongside
`attr_index_t` inside a `uintptr_t` in inline caches.

By enlarging inline caches we can unlock the full 32 bits on all
platforms, allowing us to use these extra bits for tagging.
2025-06-03 21:15:41 +02:00
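A self-contained sketch of the cache layout change described above (field and type names assumed): previously the shape id shared a `uintptr_t` with `attr_index_t` in inline caches, which limited the usable id bits to 16; giving it its own 32-bit field frees the high bits for tagging.

```c
#include <stdint.h>

typedef uint32_t shape_id_t;
typedef uint16_t attr_index_t; /* width assumed for this sketch */

/* Before (sketch): both values packed into a single uintptr_t, so the
 * shape id could only use its low 16 bits on 32-bit platforms. */
typedef struct {
    uintptr_t value; /* (shape_id << 16) | attr_index */
} packed_ivar_cache;

/* After (sketch): the cache is enlarged so the shape id keeps its own
 * 32-bit field, leaving the high bits free for flags/tagging. */
typedef struct {
    shape_id_t   shape_id;
    attr_index_t attr_index;
} ivar_cache;
```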
Jean Boussier
e9fd44dd72 shape.c: Implement a lock-free version of get_next_shape_internal
Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
look up inside `shape->edges`.

To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.

This increases memory allocations; however, we expect that
creating new transitions becomes increasingly rare over time.

```ruby
class A
  def initialize(bool)
    @a = 1
    if bool
      @b = 2
    else
      @c = 3
    end
  end

  def test
    @d = 4
  end
end

def bench(iterations)
  i = iterations
  while i > 0
    A.new(true).test
    A.new(false).test
    i -= 1
  end
end

if ARGV.first == "ractor"
  ractors = 8.times.map do
    Ractor.new do
      bench(20_000_000 / 8)
    end
  end
  ractors.each(&:take)
else
  bench(20_000_000)
end
```

The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
2025-06-02 17:49:53 +02:00
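A sketch of the copy-then-CAS pattern described above; the table helpers are hypothetical and the edges table is treated as a plain `VALUE` here for simplicity.

```c
/* Sketch of the lock-free insertion path: in multi-ractor mode the
 * edges table is treated as immutable, so adding an edge copies the
 * table, inserts into the copy, and publishes it with a CAS.
 * edges_lookup / edges_dup / edges_insert are hypothetical helpers. */
static rb_shape_t *
get_next_shape_lockfree(rb_shape_t *shape, ID edge_name, rb_shape_t *next_shape)
{
    for (;;) {
        VALUE old_edges = shape->edges;
        rb_shape_t *existing;

        if (edges_lookup(old_edges, edge_name, &existing)) {
            return existing;            /* another ractor already inserted it */
        }

        VALUE new_edges = edges_dup(old_edges);          /* copy ...       */
        edges_insert(new_edges, edge_name, next_shape);  /* ... insert ... */

        /* ... and publish. On failure another ractor won the race, so
         * retry against the table it installed. */
        if (RUBY_ATOMIC_VALUE_CAS(shape->edges, old_edges, new_edges) == old_edges) {
            return next_shape;
        }
    }
}
```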
Jean Boussier
749bda96e5 Refactor attr_index_t caches
Ensure the same helpers are used for packing and unpacking.
2025-05-28 12:39:21 +02:00
Jean Boussier
326c120aa7 Rename rb_shape_id_canonical_p -> rb_shape_canonical_p 2025-05-27 15:34:02 +02:00
Jean Boussier
925dec8d70 Rename rb_shape_set_shape_id in rb_obj_set_shape_id 2025-05-27 15:34:02 +02:00
Jean Boussier
ccf2b7c5b8 Refactor rb_shape_too_complex_p to take a shape_id_t. 2025-05-27 15:34:02 +02:00
Jean Boussier
a1f72d23a9 Refactor rb_shape_has_object_id
It now takes a `shape_id_t`, and the version that takes a `rb_shape_t *`
is private.
2025-05-27 15:34:02 +02:00
Jean Boussier
a80a5000ab Refactor rb_obj_shape out.
It still exists but only in `shape.c`.
2025-05-27 15:34:02 +02:00
Jean Boussier
97f44ac54e Get rid of rb_shape_set_shape 2025-05-27 15:34:02 +02:00
Jean Boussier
a59835e1d5 Refactor rb_shape_get_iv_index to take a shape_id_t
Further reduce exposure of `rb_shape_t`.
2025-05-27 15:34:02 +02:00
Jean Boussier
e535f8248b Get rid of rb_shape_id(rb_shape_t *)
We should avoid conversions from `rb_shape_t *` into `shape_id_t`
outside of `shape.c`, as the short-term goal is to have `shape_id_t`
contain tags.
2025-05-27 12:45:24 +02:00
Jean Boussier
cc48fcdff2 Get rid of rb_shape_canonical_p 2025-05-27 12:45:24 +02:00
Jean Boussier
8b0868cbb1 Refactor rb_shape_rebuild_shape to stop exposing rb_shape_t 2025-05-27 12:45:24 +02:00
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBasic` 4B larger on 32-bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
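A sketch of the layout change (field arrangement assumed): on 64-bit platforms the shape id lives in the upper bits of `flags`, while on 32-bit platforms `RBasic` grows a dedicated 4-byte field.

```c
/* Sketch only; the real struct lives in include/ruby/internal/core/rbasic.h. */
struct RBasic {
    VALUE flags;            /* on 64-bit builds the shape_id shares the
                             * upper bits of flags, so no extra space    */
    const VALUE klass;
#if SIZEOF_VALUE == 4       /* 32-bit: no spare bits in flags, so the    */
    shape_id_t shape_id;    /* id gets its own word, growing RBasic 4B   */
#endif
};
```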
Jean Boussier
52da5f8bbc Refactor rb_shape_transition_remove_ivar
Move the fields management logic into `rb_ivar_delete`, and keep
the shape management logic in `rb_shape_transition_remove_ivar`.
2025-05-23 17:33:17 +02:00
Jean Boussier
60ffb714d2 Ensure shape_id is never used on T_IMEMO
It doesn't make sense to set ivars or anything shape-related
on a T_IMEMO.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2025-05-15 16:06:52 +02:00
Jean Boussier
a6435befa7 variable.c: Refactor rb_obj_field_* to take shape_id_t 2025-05-13 10:35:34 +02:00
Jean Boussier
3135eddb4e Refactor FIRST_T_OBJECT_SHAPE_ID to not be used outside shape.c 2025-05-09 20:45:48 +02:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`;
`RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
0b81359b3f Stop exposing rb_shape_frozen_shape_p 2025-05-09 10:22:51 +02:00
Jean Boussier
a970d35de2 Get rid of rb_shape_get_parent. 2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00
Jean Boussier
a007575497 Remove unused rb_shape_object_id_index 2025-05-09 10:22:51 +02:00
Jean Boussier
c9b08882b7 Refactor rb_shape_get_next to return an ID
Also rename it, and change parameters to be consistent with
other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
e0200cfba0 Refactor rb_shape_transition_shape_remove_ivar to not take a shape pointer
It's more consistent with other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
677d075c29 Refactor rb_shape_transition_too_complex to return an ID. 2025-05-09 10:22:51 +02:00
Jean Boussier
f82523f14b Refactor rb_shape_transition_frozen to return a shape_id. 2025-05-09 10:22:51 +02:00
Jean Boussier
31d0a5815c Get rid of useless SHAPE_MASK 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
9966de11fb Refactor rb_shape_get_next_iv_shape to take and return ids. 2025-05-09 10:22:51 +02:00
Jean Boussier
df7d25bb3e Stop exposing rb_shape_get_root_shape 2025-05-09 10:22:51 +02:00