95 Commits

Author SHA1 Message Date
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Jean Boussier
52da5f8bbc Refactor rb_shape_transition_remove_ivar
Move the fields management logic in `rb_ivar_delete`, and keep
shape management logic in `rb_shape_transition_remove_ivar`.
2025-05-23 17:33:17 +02:00
Jean Boussier
60ffb714d2 Ensure shape_id is never used on T_IMEMO
It doesn't make sense to set ivars or anything shape
related on a T_IMEMO.

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2025-05-15 16:06:52 +02:00
Jean Boussier
a6435befa7 variable.c: Refactor rb_obj_field_* to take shape_id_t 2025-05-13 10:35:34 +02:00
Jean Boussier
3135eddb4e Refactor FIRST_T_OBJECT_SHAPE_ID to not be used outside shape.c 2025-05-09 20:45:48 +02:00
Jean Boussier
ea77250847 Rename RB_OBJ_SHAPE -> rb_obj_shape
As well as `RB_OBJ_SHAPE_ID` -> `rb_obj_shape_id`
and `RSHAPE` is now a simple alias for `rb_shape_lookup`.

I tried to turn all these into `static inline` but I'm having
trouble with `RUBY_EXTERN rb_shape_tree_t *rb_shape_tree_ptr;`
not being exposed as I'd expect.
2025-05-09 10:22:51 +02:00
Jean Boussier
0b81359b3f Stop exposing rb_shape_frozen_shape_p 2025-05-09 10:22:51 +02:00
Jean Boussier
a970d35de2 Get rid of rb_shape_get_parent. 2025-05-09 10:22:51 +02:00
Jean Boussier
5782561fc1 Rename rb_shape_get_shape_id -> RB_OBJ_SHAPE_ID
And `rb_shape_get_shape` -> `RB_OBJ_SHAPE`.
2025-05-09 10:22:51 +02:00
Jean Boussier
a007575497 Remove unused rb_shape_object_id_index 2025-05-09 10:22:51 +02:00
Jean Boussier
c9b08882b7 Refactor rb_shape_get_next to return an ID
Also rename it, and change parameters to be consistent with
other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
e0200cfba0 Refactor rb_shape_transition_shape_remove_ivar to not take a shape pointer
It's more consistent with other transition functions.
2025-05-09 10:22:51 +02:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
677d075c29 Refactor rb_shape_transition_too_complex to return an ID. 2025-05-09 10:22:51 +02:00
Jean Boussier
f82523f14b Refactor rb_shape_transition_frozen to return a shape_id. 2025-05-09 10:22:51 +02:00
Jean Boussier
31d0a5815c Get rid of useless SHAPE_MASK 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
9966de11fb Refactor rb_shape_get_next_iv_shape to take and return ids. 2025-05-09 10:22:51 +02:00
Jean Boussier
df7d25bb3e Stop exposing rb_shape_get_root_shape 2025-05-09 10:22:51 +02:00
Jean Boussier
62eb2007f6 Remove unused rb_obj_debug_shape 2025-05-09 10:22:51 +02:00
Jean Boussier
e4f97ce387 Refactor rb_shape_depth to take an ID rather than a pointer.
As well as `rb_shape_edges_count` and `rb_shape_memsize`.
2025-05-09 10:22:51 +02:00
Jean Boussier
f8b3fc520f Refactor rb_shape_traverse_from_new_root to not expose rb_shape_t 2025-05-09 10:22:51 +02:00
Jean Boussier
7116b0a7f1 Extract rb_shape_free_all 2025-05-09 10:22:51 +02:00
Jean Boussier
f48e45d1e9 Move object_id in object fields.
And get rid of the `obj_to_id_tbl`

It's no longer needed: the `object_id` is now stored inline
in the object alongside instance variables.

We still need the inverse table in case `_id2ref` is invoked, but
we lazily build it by walking the heap if that happens.

The `object_id` concern is also no longer a GC implementation
concern, but a generic implementation.
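
As a rough illustration of the idea in this commit message (the id stored inline with the object's fields, and the inverse table only built when a reverse lookup is requested), here is a toy Ruby model. `ToyHeap`, `ToyObject`, `object_id_of` and `id2ref` are hypothetical names invented for this sketch; they are not the actual C implementation.

```ruby
# Toy model only: ToyHeap and ToyObject are hypothetical, not Ruby internals.
class ToyObject
  attr_reader :fields

  def initialize
    @fields = {} # stands in for the object's inline fields
  end
end

class ToyHeap
  def initialize
    @objects = []
    @next_id = 0
    @id_to_obj = nil # inverse table, built only on demand
  end

  def allocate
    obj = ToyObject.new
    @objects << obj
    obj
  end

  # Assign the id on first use and store it alongside the object's fields.
  def object_id_of(obj)
    obj.fields[:object_id] ||= begin
      id = (@next_id += 1)
      @id_to_obj[id] = obj if @id_to_obj # keep an already-built table current
      id
    end
  end

  # Reverse lookup: build the inverse table lazily by walking every object.
  def id2ref(id)
    @id_to_obj ||= @objects.each_with_object({}) do |obj, table|
      oid = obj.fields[:object_id]
      table[oid] = obj if oid
    end
    @id_to_obj.fetch(id) { raise RangeError, "#{id} is not an id value" }
  end

  def inverse_table_built?
    !@id_to_obj.nil?
  end
end
```

In this sketch, programs that never perform a reverse lookup never pay for the inverse table.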

Co-Authored-By: Matt Valentine-House <matt@eightbitraptor.com>
2025-05-08 07:58:05 +02:00
Jean Boussier
d34c150547 shape.c: refactor frozen shape to no longer be final
This opens the door to storing more information in shapes, such
as the `object_id` or object address in case it has been observed
and the object has to be moved.
2025-05-08 07:58:05 +02:00
Jean Boussier
6c9b3ac232 Refactor OBJ_TOO_COMPLEX_SHAPE_ID to not be referenced outside shape.h
Also refactor checks for `->type == SHAPE_OBJ_TOO_COMPLEX`.
2025-05-08 07:58:05 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Instance variables will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

`field` encompasses anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Jean Boussier
a3af4e905f Make rb_shape.capacity an attr_index_t 2025-05-05 14:44:49 +02:00
Jean Boussier
18dac125cb Improve syntax style consistency in shape.c and shape.h
Most of this code uses the `type * name` style, while the
overwhelming majority of the rest of ruby uses the `type *name`
style.

This is a cosmetic change, but helps with readability.
2025-04-30 08:10:55 +02:00
Matt Valentine-House
8e7df4b7c6 Rename size_pool -> heap
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap, so that Ruby contains multiple heaps, with
different sized objects.

The term heap as a collection of memory pages is more in keeping with memory
management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.

The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.

Since then various changes have happened in Ruby's memory layout:

* The concept of tomb heaps has been replaced by a global free pages list,
  with each page having its slot size reconfigured at the point when it
  is resurrected
* The eden heap has been inlined into the size pool itself, so that now
  the size pool directly controls the free_pages list, the sweeping
  page, the compaction cursor and the other state that was previously
  being managed by the eden heap.

Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool.
2024-10-03 21:20:09 +01:00
Jean Boussier
f7b53a75b6 Do not emit shape transition warnings when YJIT is compiling
[Bug #20522]

If `Warning.warn` is redefined in Ruby, emitting a warning would invoke
Ruby code, which can't safely be done when YJIT is compiling.
2024-06-04 19:21:01 +02:00
Peter Zhu
3896f9940e Make special const and too complex shapes before T_OBJECT shapes 2024-03-13 09:55:52 -04:00
Peter Zhu
6b0434c0f7 Don't create per size pool shapes for non-T_OBJECT 2024-03-13 09:55:52 -04:00
Peter Zhu
df5b8ea4db Remove unneeded RUBY_FUNC_EXPORTED 2024-02-23 10:24:21 -05:00
Peter Zhu
71babe5536 Use 32-bit integers for redblack_id_t
On 32-bit systems, the shape cache size is 1048576 (value of
REDBLACK_CACHE_SIZE), but a 16-bit unsigned integer can only go up to
65535. This means that the redblack_id_t can overflow and lead to a
corrupted red-black tree.

The following script crashes on 32-bit systems:

    o = Object.new
    1_000_000.times do |i|
      o.instance_variable_set(:"@i#{i}", i)
    end
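
The overflow itself can be illustrated without touching Ruby's internals. The sketch below emulates a 16-bit unsigned field by masking to the low 16 bits; `store_as_u16` and `U16_MASK` are hypothetical names for this illustration, and only the `REDBLACK_CACHE_SIZE` value comes from the commit message.

```ruby
REDBLACK_CACHE_SIZE = 1_048_576 # value quoted in the commit message
U16_MASK = 0xFFFF               # a 16-bit unsigned integer holds 0..65535

# Emulate storing an id in a 16-bit unsigned field: only the low 16 bits survive.
def store_as_u16(id)
  id & U16_MASK
end

# Ids that fit are stored unchanged; larger ids silently alias smaller ones.
fits    = store_as_u16(65_535) # => 65535
wrapped = store_as_u16(65_536) # => 0, collides with id 0
aliased = store_as_u16(70_000) == store_as_u16(70_000 - 65_536) # => true
```

Once two distinct ids collapse onto the same stored value, any structure indexed by that value (here, the nodes of a tree) can no longer tell them apart.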
2023-12-04 13:57:12 -05:00
Aaron Patterson
6fce8c7980 Don't try compacting ivars on Classes that are "too complex"
Too complex classes use a hash table to store ivs, and should always pin
their IVs.  We shouldn't touch those classes in compaction.
2023-11-20 16:09:48 -08:00
Jean Boussier
94c9f16663 Refactor rb_obj_evacuate_ivs_to_hash_table
That function is a bit too low level to be called from multiple
places. It's always used in tandem with `rb_shape_set_too_complex`
and both have to know how the object is laid out to update the
`iv_ptr`.

So instead we can provide two higher level functions:

  - `rb_obj_copy_ivs_to_hash_table` to prepare a `st_table` from an
    arbitrary object.
  - `rb_obj_convert_to_too_complex` to assign the new `st_table`
    to the old object, and safely free the old `iv_ptr`.

Unfortunately both can't be combined into one, because `rb_obj_copy_ivar`
needs `rb_obj_copy_ivs_to_hash_table` to copy from one object
to another.
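
The reason for keeping two steps can be sketched with a toy model, in which objects are plain hashes and all three function names are hypothetical stand-ins, not the C functions themselves. Because the copy step reads from one object and the convert step writes to another, copying between objects can reuse both.

```ruby
# Toy model: an "object" is a Hash with :names and :fields arrays, or a :table.

# Step 1: build a name => value table from any object, without modifying it.
def toy_copy_ivs_to_hash_table(obj)
  return obj[:table].dup if obj[:too_complex]
  obj[:names].zip(obj[:fields]).to_h
end

# Step 2: install the table on an object and drop its old inline storage.
def toy_convert_to_too_complex(obj, table)
  obj[:names] = nil  # stands in for freeing the old buffer
  obj[:fields] = nil
  obj[:table] = table
  obj[:too_complex] = true
  obj
end

# Copying between objects needs a different source and destination,
# which is why the two steps are not merged into one in this sketch.
def toy_copy_ivars(dest, src)
  toy_convert_to_too_complex(dest, toy_copy_ivs_to_hash_table(src))
end
```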
2023-11-17 09:19:21 +01:00
Peter Zhu
68869e9bd9 Revert "Revert "Remove SHAPE_CAPACITY_CHANGE shapes""
This reverts commit 5f3fb4f4e397735783743fe52a7899b614bece20.
2023-11-13 18:26:36 -05:00
Peter Zhu
5f3fb4f4e3 Revert "Remove SHAPE_CAPACITY_CHANGE shapes"
This reverts commit f6910a61122931e4193bcc0fad18d839c319b720.

We're seeing crashes in the test suite of Shopify's core monolith after
this change.
2023-11-10 11:27:49 -05:00
Peter Zhu
f6910a6112 Remove SHAPE_CAPACITY_CHANGE shapes
We don't need to create a shape to transition capacity as we can
transition the capacity when the capacity of the SHAPE_IVAR changes.
2023-11-09 09:25:02 -05:00
Jean Boussier
d898e8d6f8 Refactor rb_shape_transition_shape_capa out
Right now the caller of `rb_shape_get_next` needs to
first check if there is capacity left, and if not call
`rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`.

And on each of these it needs to check if we got a TOO_COMPLEX
back.

All this logic is duplicated in the interpreter, YJIT and RJIT.

Instead we can have `rb_shape_get_next` do the capacity transition
when needed. The caller can compare the old and new shapes' capacity
to know if resizing is needed. It also can check for TOO_COMPLEX
only once.
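
A toy Ruby sketch of the resulting caller pattern follows. `ToyShape`, `toy_next_shape`, `toy_add_ivar` and `TOY_MAX_IVARS` are hypothetical names and the doubling growth policy is an assumption made for illustration; the point is only that the caller makes one call, one TOO_COMPLEX check and one capacity comparison.

```ruby
# Toy model; none of these names exist in Ruby's C sources.
ToyShape = Struct.new(:ivars, :capacity, :too_complex)

TOY_MAX_IVARS = 8 # arbitrary limit standing in for "ran out of shapes"

# Return the shape reached by adding one ivar, growing capacity when needed.
def toy_next_shape(shape, name)
  if shape.too_complex || shape.ivars.size >= TOY_MAX_IVARS
    return ToyShape.new([], 0, true)
  end
  capacity = shape.capacity
  capacity = [capacity * 2, 1].max if shape.ivars.size >= capacity
  ToyShape.new(shape.ivars + [name], capacity, false)
end

# Caller side: a single transition call covers both concerns.
def toy_add_ivar(obj, name, value)
  old_shape = obj[:shape]
  new_shape = toy_next_shape(old_shape, name)
  return :too_complex if new_shape.too_complex # checked only once

  # Comparing capacities tells the caller whether the buffer must grow.
  obj[:resizes] += 1 if new_shape.capacity != old_shape.capacity
  obj[:fields] << value
  obj[:shape] = new_shape
  :ok
end
```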
2023-11-08 11:02:55 +01:00
Jean Boussier
b92b9e1e9e vm_getivar: assume the cached shape_id likely has a common ancestor
When an inline cache misses, it is very likely that the stale shape_id
and the current instance shape_id have a close common ancestor.

For example, if the instance variable is sometimes frozen, sometimes
not, one of the two shapes will be the direct parent of the other.

Another pattern that commonly causes IC misses is "memoization";
in such cases the object will have a "base common shape" and then
a number of close descendants.

In addition, when we find a common ancestor, we store it in the
inline cache instead of the current shape. This helps prevent the
cache from flip-flopping, ensuring the next lookup will be marginally
faster and more generally avoid writing in memory too much.

However, now that shapes have an ancestors index, we only check
for a few ancestors before falling back to use the index.

So overall this change speeds up what is assumed to be the more common
case, but makes what is assumed to be the less common case a bit slower.

```
compare-ruby: ruby 3.3.0dev (2023-10-26T05:30:17Z master 701ca070b4) [arm64-darwin22]
built-ruby: ruby 3.3.0dev (2023-10-26T09:25:09Z shapes_double_sear.. a723a85235) [arm64-darwin22]
warming up......

|                                     |compare-ruby|built-ruby|
|:------------------------------------|-----------:|---------:|
|vm_ivar_stable_shape                 |     11.672M|   11.679M|
|                                     |           -|     1.00x|
|vm_ivar_memoize_unstable_shape       |      7.551M|   10.506M|
|                                     |           -|     1.39x|
|vm_ivar_memoize_unstable_shape_miss  |     11.591M|   11.624M|
|                                     |           -|     1.00x|
|vm_ivar_unstable_undef               |      9.037M|    7.981M|
|                                     |       1.13x|         -|
|vm_ivar_divergent_shape              |      8.034M|    6.657M|
|                                     |       1.21x|         -|
|vm_ivar_divergent_shape_imbalanced   |     10.471M|    9.231M|
|                                     |       1.13x|         -|
```
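
The bounded ancestor check described in this message can be sketched with a toy shape tree. This is an illustration under assumptions, not the `vm_getivar` code: `TreeShape`, `near_ancestor?`, `shape_to_cache` and the bound of two steps are all hypothetical.

```ruby
# Toy shape tree: each shape only knows its parent.
TreeShape = Struct.new(:name, :parent)

MAX_STEPS = 2 # arbitrary small bound chosen for this sketch

# Check whether `ancestor` is among the first `max_steps` parents of `descendant`.
def near_ancestor?(descendant, ancestor, max_steps = MAX_STEPS)
  current = descendant.parent
  max_steps.times do
    return false if current.nil?
    return true if current.equal?(ancestor)
    current = current.parent
  end
  false
end

# Decide which shape to keep in the inline cache after a miss.
def shape_to_cache(cached, current)
  return cached if near_ancestor?(current, cached)  # cached is the ancestor: keep it
  return current if near_ancestor?(cached, current) # current is the ancestor: store it
  current # no close relation; real code would fall back to a full lookup
end
```

Storing the ancestor rather than the most recent shape means that objects alternating between a base shape and its direct child keep hitting the same cache entry in this model.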

Co-Authored-By: John Hawthorn <john@hawthorn.email>
2023-11-03 12:47:43 +01:00
Peter Zhu
38ba040d8b Make every initial size pool shape a root shape
This commit makes every initial size pool shape a root shape and assigns
it a capacity of 0.
2023-11-02 13:42:11 -04:00
Jean Boussier
b77148ae9f remove_instance_variable: Handle running out of shapes
`remove_shape_recursive` wasn't considering that if we run out of
shapes, it might have to transition to SHAPE_TOO_COMPLEX.

When this happens, we now return with an error and the caller
initiates the evacuation.
2023-11-01 15:21:55 +01:00
Jean Boussier
8e62596e38 Move some defines from shape.h to shape.c
If they are only used there, we might as well not expose them.
2023-10-26 13:07:08 -07:00
Aaron Patterson
d8cb827f39 Remove SHAPE_MAX_NUM_IVS
There is no longer a limit on the number of IVs you can store.
SHAPE_MAX_NUM_IVS was used to work around the IV10K problem (the well
known problem where setting 10k instance variables in a row would be too
slow).  The redblack tree works well at any shape depth, even depths
greater than 80, and solves the IV10K problem.
2023-10-24 14:23:17 -07:00
Aaron Patterson
afae8df373 get_next_shape_internal should always return a shape
If it runs out of shapes, or new variations aren't allowed, it will
return "too complex"
2023-10-24 14:23:17 -07:00
Aaron Patterson
a3f66e09f6 geniv objects can become too complex 2023-10-24 10:52:06 -07:00
Aaron Patterson
caf6a72348 remove IV limit / support complex shapes on classes 2023-10-24 10:52:06 -07:00
Aaron Patterson
27c7531939 increase the maximum number of ivs 2023-10-24 10:52:06 -07:00