98 Commits

Author SHA1 Message Date
Jean Boussier
3abdd4241f Turn rb_classext_t.fields into a T_IMEMO/class_fields
This behaves almost exactly like a T_OBJECT; the layout is entirely
compatible.

This aims to solve two problems.

First, it solves the problem of namespaced classes having
a single `shape_id`. Now each namespaced classext
has an object that can hold the namespace specific
shape.

Second, it opens the door to later making class instance variable
writes atomic, and hence to reading class variables
without locking the VM.
In the future, in multi-ractor mode, we can do the write
on a copy of the `fields_obj` and then atomically swap it.

Considerations:

  - Right now the `RClass` shape_id is always synchronized,
    but with namespaces we should likely mark classes that have
    multiple namespaces with a specific shape flag.
2025-06-12 07:58:16 +02:00
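A minimal sketch of the copy-then-swap idea mentioned in 3abdd4241f, written with C11 atomics rather than the VM's own primitives. `fields_obj_t`, `classext_sketch_t` and both helper functions are hypothetical stand-ins, not the actual CRuby types, and the replaced object is handed back to the caller here where the VM would presumably leave reclamation to the GC:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a class's fields object. */
typedef struct {
    size_t len;
    long values[8];
} fields_obj_t;

/* Hypothetical stand-in for the classext slot pointing at the fields object. */
typedef struct {
    _Atomic(fields_obj_t *) fields_obj;
} classext_sketch_t;

/* Readers do one atomic load, then read from a snapshot that is never
 * modified after publication, so no lock is needed. */
static long
sketch_read_field(classext_sketch_t *ext, size_t index)
{
    fields_obj_t *snapshot =
        atomic_load_explicit(&ext->fields_obj, memory_order_acquire);
    assert(index < snapshot->len);
    return snapshot->values[index];
}

/* Writers copy the current object, modify the copy, and publish it with a
 * compare-and-swap, retrying if another writer got there first.
 * Returns the object that was replaced. */
static fields_obj_t *
sketch_write_field(classext_sketch_t *ext, size_t index, long value)
{
    fields_obj_t *copy = malloc(sizeof(*copy));
    assert(copy != NULL);

    fields_obj_t *old =
        atomic_load_explicit(&ext->fields_obj, memory_order_acquire);
    do {
        memcpy(copy, old, sizeof(*copy));
        assert(index < copy->len);
        copy->values[index] = value;
    } while (!atomic_compare_exchange_weak_explicit(
                 &ext->fields_obj, &old, copy,
                 memory_order_acq_rel, memory_order_acquire));
    return old;
}
```

A reader that loaded the old pointer before the swap keeps seeing a consistent old snapshot, which is why the old object cannot be freed immediately in a concurrent setting.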
Jean Boussier
95201299fd Refactor the last references to rb_shape_t
The type isn't opaque because Ruby isn't often compiled with LTO,
so for optimization purposes it's better to allow as much inlining
as possible.

However ideally only `shape.c` and `shape.h` should deal with
the actual struct, and everything else should just deal with opaque
`shape_id_t`.
2025-06-11 16:38:38 +02:00
Jean Boussier
625d6a9cbb Get rid of frozen shapes.
Instead `shape_id_t` higher bits contain flags, and the first one
tells whether the shape is frozen.

This has multiple benefits:
  - Can check if a shape is frozen with a single bit check instead of
    dereferencing a pointer.
  - Guarantees it is always possible to transition to frozen.
  - This allows reclaiming `FL_FREEZE` (not done yet).

The downside is you have to be careful to preserve these flags
when transitioning.
2025-06-04 07:59:20 +02:00
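As an illustration of 625d6a9cbb, the sketch below keeps a frozen flag in the upper bits of a shape ID and preserves it across a transition. The 19-bit split, the `SKETCH_` macros and the helper names are assumptions made for the example; the real layout lives in `shape.h`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t shape_id_t;

/* Assumed layout: the low 19 bits index the shape, higher bits are flags. */
#define SKETCH_SHAPE_ID_OFFSET_BITS 19
#define SKETCH_SHAPE_ID_OFFSET_MASK ((shape_id_t)((1u << SKETCH_SHAPE_ID_OFFSET_BITS) - 1))
#define SKETCH_SHAPE_ID_FLAGS_MASK  ((shape_id_t)~SKETCH_SHAPE_ID_OFFSET_MASK)
#define SKETCH_SHAPE_ID_FL_FROZEN   ((shape_id_t)(1u << SKETCH_SHAPE_ID_OFFSET_BITS))

/* A single bit test, no pointer dereference. */
static bool
sketch_shape_frozen_p(shape_id_t id)
{
    return (id & SKETCH_SHAPE_ID_FL_FROZEN) != 0;
}

/* Freezing never needs a new shape, so it can always succeed. */
static shape_id_t
sketch_shape_freeze(shape_id_t id)
{
    return id | SKETCH_SHAPE_ID_FL_FROZEN;
}

/* Moving to another shape must carry the flag bits over. */
static shape_id_t
sketch_shape_transition(shape_id_t current, shape_id_t next_offset)
{
    assert((next_offset & SKETCH_SHAPE_ID_FLAGS_MASK) == 0);
    return (current & SKETCH_SHAPE_ID_FLAGS_MASK) | next_offset;
}
```

Forgetting the `current & SKETCH_SHAPE_ID_FLAGS_MASK` term in the last function is the kind of mistake the commit message warns about: the object would silently become unfrozen.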
John Hawthorn
6a62a46c3c Read {max_iv,variation}_count from prime classext
MAX_IV_COUNT is a hint which determines the size of variable width
allocation we should use for a given class. We don't need to scope this
by namespace: if we end up with larger builtin objects in some
namespaces, that isn't a user-visible problem, just extra memory use.

Similarly variation_count is used to track if a given object has had too
many branches in shapes it has used, and to use too_complex when that
happens. That's also just a hint, so we can use the same value across
namespaces without it being visible to users.

Previously variation_count was being incremented (written to) on the
RCLASS_EXT_READABLE ext, which seems incorrect if we wanted it to be
different across namespaces.
2025-05-29 16:02:07 -04:00
John Hawthorn
d1343e12d2 Use flag for RCLASS_IS_INITIALIZED
Previously we used a flag to set whether a module was uninitialized.
When checking whether a class was initialized, we first had to check that
it had a non-zero superclass, as well as that it wasn't BasicObject.

With the advent of namespaces, RCLASS_SUPER is now an expensive
operation, and though we could just check for the prime superclass, we
might as well take this opportunity to use a flag so that we can perform
the initialized check with as few instructions as possible.

It's possible in the future that we could prevent uninitialized classes
from being available to the user, but currently there are a few ways to
obtain one.
2025-05-28 11:44:07 -04:00
John Hawthorn
f483befd90 Add shape_id to RBasic under 32 bit
This makes `RBobject` `4B` larger on 32 bit systems
but simplifies the implementation a lot.

[Feature #21353]

Co-authored-by: Jean Boussier <byroot@ruby-lang.org>
2025-05-26 10:31:54 +02:00
Nobuyoshi Nakada
aad9fa2853 Use RB_VM_LOCKING 2025-05-25 15:22:43 +09:00
John Hawthorn
4f9f2243e9 Stricter assert for RCLASS_ALLOCATOR
I'd like to make this only valid to T_CLASS also, but currently it is
called in some places for T_ICLASS and expected to return 0.
2025-05-23 10:33:48 -07:00
John Hawthorn
05cdcfcefd Only call RCLASS_SET_ALLOCATOR on T_CLASS objects
It's invalid to set an allocator on a T_ICLASS or T_MODULE, as those use
the other fields from the union.
2025-05-23 10:33:48 -07:00
John Hawthorn
11ad7f5f47 Don't use namespaced classext for superclasses
Superclasses can't be modified by user code, so do not need namespace
indirection. For example Object.superclass is always BasicObject, no
matter what modules are included onto it.
2025-05-23 10:22:24 -07:00
Jean Boussier
9400119702 Fix object_id for classes and modules in namespace context
Given classes and modules have a different set of fields in every
namespace, we can't store the object_id in fields for them.

Given that some space was freed in `RClass` we can store it there
instead.
2025-05-14 10:26:48 +02:00
Jean Boussier
130d6aaef2 Reclaim one VALUE from rb_classext_t by shrinking super_classdepth
By making `super_classdepth` `uint16_t`, classes and modules can
now fit in 160B slots again.

The downside of course is that previously `super_classdepth` was large
enough that we never had to care about overflow, as you couldn't
realistically create enough classes to ever go over it.

With this change, while it is stupid, you could realistically
create an ancestor chain containing 65k classes and modules.
2025-05-14 10:17:03 +02:00
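One way the overflow concern in 130d6aaef2 could be handled is to check before incrementing, so that a 16-bit depth reports failure instead of wrapping to zero. `sketch_next_classdepth` is a hypothetical helper for illustration, not the function CRuby uses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The "65k" figure quoted in the commit message. */
static_assert(UINT16_MAX == 65535u, "uint16_t tops out at 65535");

/* Compute a subclass's depth from its superclass's depth.
 * Returns false, leaving *out_depth untouched, when one more
 * ancestor would not fit in 16 bits. */
static bool
sketch_next_classdepth(uint16_t super_depth, uint16_t *out_depth)
{
    if (super_depth == UINT16_MAX) {
        return false;
    }
    *out_depth = (uint16_t)(super_depth + 1);
    return true;
}
```

A caller would turn the `false` case into whatever error the VM chooses to raise; the sketch deliberately leaves that decision out.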
Jean Boussier
2ca8769443 Reclaim one VALUE from rb_classext_t
The `includer` field is only used for `T_ICLASS`, so by moving
it into the existing union we can save one `VALUE` per class
and module.
2025-05-13 14:55:39 +02:00
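The union sharing described in 2ca8769443 (and the type restriction enforced by 05cdcfcefd) can be sketched as follows. The `as.class.allocator` path mirrors the one visible in the GCC warning quoted further down this log; everything prefixed `sketch_` or `SKETCH_`, including the `type` tag stored in the struct, is an assumption for illustration only:

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t VALUE;                          /* stand-in for Ruby's VALUE */
typedef VALUE (*sketch_alloc_func_t)(VALUE klass);

enum sketch_type { SKETCH_T_CLASS, SKETCH_T_MODULE, SKETCH_T_ICLASS };

typedef struct {
    enum sketch_type type;
    /* Only one member is meaningful for a given type, so the two
     * fields can share storage instead of each taking a slot. */
    union {
        struct { sketch_alloc_func_t allocator; } class;
        struct { VALUE includer; } iclass;
    } as;
} sketch_classext_t;

/* Trivial allocator used only to have a function pointer to store. */
static VALUE
sketch_dummy_alloc(VALUE klass)
{
    return klass;
}

/* Setting an allocator is only valid on a T_CLASS. */
static void
sketch_set_allocator(sketch_classext_t *ext, sketch_alloc_func_t allocator)
{
    assert(ext->type == SKETCH_T_CLASS);
    ext->as.class.allocator = allocator;
}

/* The includer only exists on a T_ICLASS. */
static void
sketch_set_includer(sketch_classext_t *ext, VALUE includer)
{
    assert(ext->type == SKETCH_T_ICLASS);
    ext->as.iclass.includer = includer;
}
```

Writing the allocator on an iclass would overwrite its includer, which is why the setters assert on the type before touching the union.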
Satoshi Tagomori
f0b41ef669 Describe the basic documents of Namespace 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
8ecc04dc04 Delete code for debugging namespace 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
90e5ce6132 Rename RCLASS_EXT() macro to RCLASS_EXT_PRIME() to prevent using it wrongly
The macro RCLASS_EXT() accesses the prime classext directly, but it can be
valid only in a limited situation when namespace is enabled.
So, to prevent using RCLASS_EXT() in the wrong way, rename the macro and
let the developer check it is ok to access the prime classext or not.
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
ff790c759e Compact prime classext readable/writable flags
To make RClass size smaller, move flags of prime classext readable/writable to:
 readable - use ns_classext_tbl is NULL or not (if NULL, it's readable)
 writable - use FL_USER2 of RBasic flags
2025-05-11 23:32:50 +09:00
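A rough sketch of the encoding described in ff790c759e: readability is derived from whether the namespace classext table is NULL, and writability from one user flag bit. `sketch_rclass_t`, the helper names and the assumption that user flags start at bit 12 are illustrative, not taken from the actual headers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FL_USHIFT 12   /* assumed position of the first user flag */
#define SKETCH_FL_USER2  ((uintptr_t)1 << (SKETCH_FL_USHIFT + 2))

/* Hypothetical, heavily reduced stand-in for RClass. */
typedef struct {
    uintptr_t flags;
    void *ns_classext_tbl;    /* NULL until a namespace-specific classext exists */
} sketch_rclass_t;

/* Readable: no per-namespace table has been created yet. */
static bool
sketch_prime_classext_readable_p(const sketch_rclass_t *klass)
{
    assert(klass != NULL);
    return klass->ns_classext_tbl == NULL;
}

/* Writable: stored in a flag bit instead of a dedicated struct field. */
static bool
sketch_prime_classext_writable_p(const sketch_rclass_t *klass)
{
    assert(klass != NULL);
    return (klass->flags & SKETCH_FL_USER2) != 0;
}

static void
sketch_set_prime_classext_writable(sketch_rclass_t *klass, bool writable)
{
    assert(klass != NULL);
    if (writable) {
        klass->flags |= SKETCH_FL_USER2;
    }
    else {
        klass->flags &= ~SKETCH_FL_USER2;
    }
}
```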
Satoshi Tagomori
5ee1ec313a initialize method tables before any GC chance 2025-05-11 23:32:50 +09:00
Satoshi Tagomori
f24ba27d6d avoid calling ZALLOC after NEWOBJ_OF for RClass: need to return RClass not promoted 2025-05-11 23:32:50 +09:00
Yusuke Endoh
8d3cd4301a Remove unnecessary prototype declarations
```
internal/class.h:158:20: warning: ‘RCLASS_SET_CLASSEXT_TABLE’ declared ‘static’ but never defined [-Wunused-function]
  158 | static inline void RCLASS_SET_CLASSEXT_TABLE(VALUE obj, st_table *tbl);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~
internal/class.h:271:20: warning: ‘RCLASS_WRITE_SUBCLASSES’ declared ‘static’ but never defined [-Wunused-function]
  271 | static inline void RCLASS_WRITE_SUBCLASSES(VALUE klass, rb_subclass_anchor_t *anchor);
      |                    ^~~~~~~~~~~~~~~~~~~~~~~
```
2025-05-11 23:32:50 +09:00
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jean Boussier
3f7c0af051 Rename rb_shape_obj_too_complex -> rb_shape_obj_too_complex_p 2025-05-09 10:22:51 +02:00
Jean Boussier
334ebba221 Rename rb_shape_get_shape_by_id -> RSHAPE 2025-05-09 10:22:51 +02:00
Jean Boussier
0ea210d1ea Rename ivptr -> fields, next_iv_index -> next_field_index
Instance variables will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.

`field` encompasses anything that can be stored in a VALUE array.

Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
2025-05-08 07:58:05 +02:00
Matt Valentine-House
8e7df4b7c6 Rename size_pool -> heap
Now that we've inlined the eden_heap into the size_pool, we should
rename the size_pool to heap, so that Ruby contains multiple heaps with
different sized objects.

The term heap as a collection of memory pages is more in line with
memory management nomenclature, whereas size_pool was a name chosen out of
necessity during the development of the Variable Width Allocation
features of Ruby.

The concept of size pools was introduced in order to facilitate
different sized objects (other than the default 40 bytes). They wrapped
the eden heap and the tomb heap, and some related state, and provided a
reasonably simple way of duplicating all related concerns, to provide
multiple pools that all shared the same structure but held different
objects.

Since then various changes have happened in Ruby's memory layout:

* The concept of tomb heaps has been replaced by a global free pages list,
  with each page having its slot size reconfigured at the point when it
  is resurrected
* The eden heap has been inlined into the size pool itself, so that now
  the size pool directly controls the free_pages list, the sweeping
  page, the compaction cursor and the other state that was previously
  being managed by the eden heap.

Now that there is no need for a heap wrapper, we should refer to the
collection of pages containing Ruby objects as a heap again rather than
a size pool.
2024-10-03 21:20:09 +01:00
Jean Boussier
d4f3dcf4df Refactor VM root modules
This `st_table` is used to both mark and pin classes
defined from the C API. But `vm->mark_object_ary` already
does both much more efficiently.

Currently a Ruby process starts with 252 rooted classes,
which uses `7224B` in an `st_table` or `2016B` in an `RArray`.

So a baseline of 5kB saved, but since `mark_object_ary` is
preallocated with `1024` slots but only uses `405` of them,
it's a net `7kB` save.

`vm->mark_object_ary` is also being refactored.

Prior to this change, `mark_object_ary` was a regular `RArray`, but
since this allows for references to be moved, it was marked a second
time from `rb_vm_mark()` to pin these objects.

This has the detrimental effect of marking these references on every
minor GC even though it's a mostly append only list.

But using a custom TypedData we can avoid having to mark
all the references on minor GC runs.

Additionally, immediate values are now ignored and not appended
to `vm->mark_object_ary` as it's just wasted space.
2024-03-06 15:33:43 -05:00
Jean Boussier
b4a69351ec Move FL_SINGLETON to FL_USER1
This frees FL_USER0 on both T_MODULE and T_CLASS.

Note: prior to this, FL_SINGLETON was never set on T_MODULE,
so checking for `FL_SINGLETON` without first checking that
`FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-03-06 13:11:41 -05:00
Jean Boussier
e626da82ea Don't pin named structs defined in Ruby
[Bug #20311]

`rb_define_class_under` assumes it's called from C and that the
reference might be held in a C global variable, so it adds the
class to the VM root.

In the case of `Struct.new('Name')` it's wasteful and makes
the struct immortal.
2024-03-01 08:23:38 +01:00
John Hawthorn
1c97abaaba De-dup identical callinfo objects
Previously every call to vm_ci_new (when the CI was not packable) would
result in a different callinfo being returned. This meant that every
kwarg callsite had its own CI.

When calling, different CIs result in different CCs. These CIs and CCs
both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop
this resulted in a memory leak of both types of object. This also likely
resulted in extra memory used, and extra time searching, in non-eval
cases.

For simplicity in this commit I always allocate a CI object inside
rb_vm_ci_lookup, but ideally we would lazily allocate it only when
needed. I hope to do that as a follow up in the future.
2024-02-20 18:55:00 -08:00
Peter Zhu
28a6e4ea9d Set m_tbl right after allocation
We should set the m_tbl right after allocation before anything that can
trigger GC to prevent clone_p from becoming old and needing to fire write
barriers.

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2023-12-19 13:09:36 -08:00
Aaron Patterson
6fce8c7980 Don't try compacting ivars on Classes that are "too complex"
Too complex classes use a hash table to store ivs, and should always pin
their IVs.  We shouldn't touch those classes in compaction.
2023-11-20 16:09:48 -08:00
Yusuke Endoh
591336a0f2 Avoid the pointer hack in RCLASS_EXT
... because GCC 13 warns about it.

```
In file included from class.c:24:
In function ‘RCLASS_SET_ALLOCATOR’,
    inlined from ‘class_alloc’ at class.c:251:5,
    inlined from ‘rb_module_s_alloc’ at class.c:1045:17:
internal/class.h:159:43: warning: array subscript 0 is outside array bounds of ‘rb_classext_t[0]’ {aka ‘struct rb_classext_struct[]’} [-Warray-bounds=]
  159 |     RCLASS_EXT(klass)->as.class.allocator = allocator;
      |                                           ^
```
https://rubyci.s3.amazonaws.com/arch/ruby-master/log/20231015T030003Z.log.html.gz
2023-10-15 15:35:45 +09:00
Nobuyoshi Nakada
4634405f7c Stop exposing FrozenCore in headers
Revert commit "Directly allocate FrozenCore as an ICLASS",
813a5f4fc46a24ca1695d23c159250b9e1080ac7.
2023-09-19 14:08:05 +09:00
Nobuyoshi Nakada
b934976024 Prefer 0 over NULL as function pointers
SunC warns about using `NULL`, a pointer to data, as a function pointer.
2023-06-23 03:15:55 +09:00
Peter Zhu
813a5f4fc4 Directly allocate FrozenCore as an ICLASS
It's a bad idea to overwrite the flags as the garbage collector may have
set other flags.
2023-06-14 10:42:40 -04:00
eileencodes
40f090f433 Revert "Revert "Fix cvar caching when class is cloned""
This reverts commit 10621f7cb9a0c70e568f89cce47a02e878af6778.

This was reverted because the gc integrity build started failing. We
have figured out a fix so I'm reopening the PR.

Original commit message:

Fix cvar caching when class is cloned

The class variable cache that was added in
ruby#4544 changed the behavior of class
variables on cloned classes. As reported when a class is cloned AND a
class variable was set, and the class variable was read from the
original class, reading a class variable from the cloned class would
return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ
which is shared between the original and cloned class, therefore they
share the cache too.

To fix this we are now storing the `cref` in the cache so that we can
check if it's equal to the current `cref`. If it's different we don't
want to read from the cache. If it's the same we do. Cloned classes
don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug
exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2023-06-05 11:11:12 -07:00
Aaron Patterson
10621f7cb9 Revert "Fix cvar caching when class is cloned"
This reverts commit 77d1b082470790c17c24a2f406b4fec5d522636b.
2023-06-01 14:55:36 -07:00
eileencodes
77d1b08247 Fix cvar caching when class is cloned
The class variable cache that was added in
https://github.com/ruby/ruby/pull/4544 changed the behavior of class
variables on cloned classes. As reported when a class is cloned AND a
class variable was set, and the class variable was read from the
original class, reading a class variable from the cloned class would
return the value from the original class.

This was happening because the IC (inline cache) is stored on the ISEQ
which is shared between the original and cloned class, therefore they
share the cache too.

To fix this we are now storing the `cref` in the cache so that we can
check if it's equal to the current `cref`. If it's different we don't
want to read from the cache. If it's the same we do. Cloned classes
don't share the same cref with their original class.

This will need to be backported to 3.1 in addition to 3.2 since the bug
exists in both versions.

We also added a marking function which was missing.

Fixes [Bug #19379]

Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
2023-06-01 08:52:48 -07:00
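The fix in 77d1b08247 amounts to keying the cache on the `cref` as well as on the call site. The sketch below shows only that comparison; `sketch_cref_t`, `sketch_cvar_cache_t` and the helpers are hypothetical simplifications and do not reflect the actual inline cache structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a cref; only its identity (address) matters here. */
typedef struct { int id; } sketch_cref_t;

/* One cache entry attached to an instruction shared by both classes. */
typedef struct {
    const sketch_cref_t *cref;   /* cref the entry was filled under, NULL if empty */
    long value;
} sketch_cvar_cache_t;

/* A hit requires the cached cref to be the current one, so a cloned
 * class running the same instruction sequence misses and falls back
 * to the slow lookup. */
static bool
sketch_cvar_cache_lookup(const sketch_cvar_cache_t *cache,
                         const sketch_cref_t *current_cref, long *out)
{
    assert(current_cref != NULL);
    if (cache->cref != current_cref) {
        return false;
    }
    *out = cache->value;
    return true;
}

static void
sketch_cvar_cache_fill(sketch_cvar_cache_t *cache,
                       const sketch_cref_t *current_cref, long value)
{
    assert(current_cref != NULL);
    cache->cref = current_cref;
    cache->value = value;
}
```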
Peter Zhu
a0d1069e03 Make classes embedded on 32 bit
Classes are now exactly 80 bytes when embedded, which perfectly fits the
3rd size pool on 32 bit systems.
2023-04-16 11:06:31 -04:00
Peter Zhu
24b137336b Move shape ID to flags for classes on 32 bit
Moves the shape ID into bits FL_USER4 through FL_USER19 of the flags on
32 bit systems. This makes the rb_classext_struct smaller so that it can be
embedded.
2023-04-16 11:06:31 -04:00
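To make the bit layout in 24b137336b concrete, here is a sketch of packing a 16-bit shape ID into flag bits FL_USER4 through FL_USER19 of a 32-bit flags word. It assumes user flags start at bit 12, so the ID occupies bits 16 to 31; the macro and function names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FL_USHIFT        12                       /* assumed first user flag bit */
#define SKETCH_SHAPE_FLAG_SHIFT (SKETCH_FL_USHIFT + 4)   /* FL_USER4 */
#define SKETCH_SHAPE_FLAG_MASK  ((uint32_t)0xFFFFu << SKETCH_SHAPE_FLAG_SHIFT)

/* Sixteen bits starting at FL_USER4 end exactly at FL_USER19 (bit 31). */
static_assert(SKETCH_SHAPE_FLAG_SHIFT + 16 == 32,
              "shape ID must fit in the top 16 flag bits");

/* Replace the shape ID while leaving the lower flag bits untouched. */
static uint32_t
sketch_flags_set_shape_id(uint32_t flags, uint16_t shape_id)
{
    return (flags & ~SKETCH_SHAPE_FLAG_MASK)
         | ((uint32_t)shape_id << SKETCH_SHAPE_FLAG_SHIFT);
}

static uint16_t
sketch_flags_get_shape_id(uint32_t flags)
{
    return (uint16_t)((flags & SKETCH_SHAPE_FLAG_MASK) >> SKETCH_SHAPE_FLAG_SHIFT);
}
```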
Peter Zhu
ad3d4e87d7 Move RCLASS_CLONED to rb_classext_struct
This commit moves RCLASS_CLONED from the flags to the
rb_classext_struct. This frees the FL_USER1 bit.
2023-04-16 11:06:31 -04:00
Peter Zhu
91dcce5ed1 Change max_iv_count to type attr_index_t
max_iv_count is calculated from next_iv_index of the shape, which is of
type attr_index_t, so we can also make max_iv_count of type
attr_index_t.
2023-04-11 15:02:44 -04:00
Aaron Patterson
365fed6369 Revert "Allow classes and modules to become too complex"
This reverts commit 69465df4242f3b2d8e55fbe18d7c45b47b40a626.
2023-03-10 08:50:43 -08:00
HParker
69465df424 Allow classes and modules to become too complex
This makes the behavior of classes and modules when there are too many instance variables match the behavior of objects with too many instance variables.
2023-03-09 15:34:49 -08:00
Takashi Kokubun
233ddfac54 Stop exporting symbols for MJIT 2023-03-06 21:59:23 -08:00
Jean Boussier
1a4b4cd7f8 Move attached_object into rb_classext_struct
Given that singleton classes don't have an allocator,
we can re-use these bytes to store the attached object
in `rb_classext_struct` without making it larger.
2023-02-16 08:14:44 +01:00
Jean Boussier
7413079dae Encapsulate RCLASS_ATTACHED_OBJECT
Right now the attached object is stored as an instance variable
and all the call sites that either get or set it have to know how it's
stored.

It's preferable to hide this implementation detail behind accessors
so that it is easier to change how it's stored.
2023-02-15 15:24:22 +01:00
Jean Boussier
bac4d2eefa Check !RCLASS_EXT_EMBEDDED instead of SIZE_POOL_COUNT == 1
It's much more self documenting and consistent
2023-02-15 10:47:22 +01:00
Peter Zhu
4fa7d38324 Don't redefine RB_OBJ_WRITE
RB_OBJ_WRITE already exists in rgengc.h, so we shouldn't redefine it in
gc.h.
2023-01-18 08:49:32 -05:00
Peter Zhu
abff5f6203 Move classpath to rb_classext_t
This commit moves the classpath (and tmp_classpath) from instance
variables to the rb_classext_t. This improves performance as we no
longer need to set an instance variable when assigning a classpath to
a class.

I benchmarked with the following script:

```ruby
name = :MyClass

puts(Benchmark.measure do
  10_000_000.times do |i|
    Object.const_set(name, Class.new)
    Object.send(:remove_const, name)
  end
end)
```

Before this patch:

```
  5.440119   0.025264   5.465383 (  5.467105)
```

After this patch:

```
  4.889646   0.028325   4.917971 (  4.942678)
```
2023-01-11 11:06:58 -05:00