118 Commits

Author SHA1 Message Date
Satoshi Tagomori
382645d440 namespace on read 2025-05-11 23:32:50 +09:00
Jeremy Evans
e4f85bfc31 Implement Set as a core class
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`.
  Other than `s/st_/set_/`, the main difference is that the stored records
  do not have values, making them 1/3 smaller. `st_table_entry` stores
  `hash`, `key`, and `record` (value), while `set_table_entry` only
  stores `hash` and `key`.  This results in large sets using ~33% less
  memory compared to stdlib Set.  For small sets, core Set uses 12% more
  memory (160 byte object slot and 64 malloc bytes, while stdlib set
  uses 40 for Set and 160 for Hash).  More memory is used because
  the set_table is embedded and 72 bytes in the object slot are
  currently wasted. Hopefully we can make this more efficient and have
  it stored in an 80 byte object slot in the future.

* All methods are implemented as cfuncs, except the pretty_print
  methods, which were moved to `lib/pp.rb` (which is where the
  pretty_print methods for other core classes are defined).  As is
  typical for core classes, internal calls call C functions and
  not Ruby methods.  For example, to check if something is a Set,
  `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
  related object.

* Almost all methods use the same algorithm that the pure-Ruby
  implementation used.  The exception is when calling `Set#divide` with a
  block with 2-arity.  The pure-Ruby method used tsort to implement this.
  I developed an algorithm that only allocates a single intermediate
  hash and does not need tsort.

* The `flatten_merge` protected method is no longer necessary, so it
  is not implemented (it could be).

* Similar to Hash/Array, subclasses of Set are no longer reflected in
  `inspect` output.

* RDoc from stdlib Set was moved to core Set, with minor updates.
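
The size figures in the first bullet can be checked with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes a 64-bit build in which `hash`, `key`, and `record` each occupy one 8-byte word. The `WORD` constant and the variable names are illustrative and are not taken from the implementation.

```ruby
WORD = 8 # assumption: 64-bit build, one machine word per field

# st_table_entry stores hash, key, and record; set_table_entry drops the record.
st_entry_bytes  = 3 * WORD
set_entry_bytes = 2 * WORD
entry_saving = 1 - set_entry_bytes.fdiv(st_entry_bytes)

# Small sets, using the byte counts quoted in the commit message.
core_small   = 160 + 64 # object slot + malloc bytes
stdlib_small = 40 + 160 # Set object + Hash object
small_overhead = core_small.fdiv(stdlib_small) - 1

puts format("per-entry saving: %.0f%%", entry_saving * 100)       # prints "per-entry saving: 33%"
puts format("small-set overhead: %.0f%%", small_overhead * 100)   # prints "small-set overhead: 12%"
```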

This includes a comprehensive benchmark suite for all public Set
methods.  As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call.  I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).

The rbs and repl_type_completor bundled gems will need updates to
support core Set.  The pull request marks them as allowed failures.

This passes all set tests with no changes.  The following specs
needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
  object as both the first and second argument (this seems like an issue
  with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns
  `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
   function).
* `Set#flatten_merge` protected method is not implemented.
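
For reference, a small usage sketch of two of the methods mentioned above. It only exercises the public behaviour (grouping by a 2-arity block and joining elements), which is expected to be the same before and after this change; it does not show the internal algorithm.

```ruby
require "set" # harmless where Set is already a core class

numbers = Set[1, 3, 4, 6, 9, 10, 11]

# With a 2-arity block, elements are grouped when the block is true for a pair.
groups = numbers.divide { |i, j| (i - j).abs == 1 }
expected = Set[Set[1], Set[3, 4], Set[6], Set[9, 10, 11]]
puts groups == expected # => true

# Set#join joins the elements in insertion order.
joined = Set[1, 2, 3].join("-")
puts joined # => 1-2-3
```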

Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`.  This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting the switch to a core class.  This does not move the spec
files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.

For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table.  To do that, include
internal/set_table.h.  To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>
2025-04-26 10:31:11 +09:00
Nobuyoshi Nakada
b4417ff665 Add Encoding::UNICODE_VERSION constant 2025-04-23 14:14:36 +09:00
Takashi Kokubun
33a052486b Assert everything is compiled in test_zjit (https://github.com/Shopify/zjit/pull/40)
* Assert everything is compiled in test_zjit

* Update a comment on rb_zjit_assert_compiles

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>

* Add a comment about assert_compiles

* Actually use pipe_fd

---------

Co-authored-by: Maxime Chevalier-Boisvert <maxime.chevalierboisvert@shopify.com>
2025-04-18 21:52:59 +09:00
Nobuyoshi Nakada
4a67ef09cc
[Feature #21116] Extract RJIT as a third-party gem 2025-02-13 18:01:03 +09:00
Takashi Kokubun
478e0fc710
YJIT: Replace Array#each only when YJIT is enabled (#11955)
* YJIT: Replace Array#each only when YJIT is enabled

* Add comments about BUILTIN_ATTR_C_TRACE

* Make Ruby Array#each available with --yjit as well

* Fix all paths that expect a C location

* Use method_basic_definition_p to detect patches

* Copy a comment about C_TRACE flag to compilers

* Rephrase a comment about add_yjit_hook

* Give METHOD_ENTRY_BASIC flag to Array#each

* Add --yjit-c-builtin option

* Allow inconsistent source_location in test-spec

* Refactor a check of BUILTIN_ATTR_C_TRACE

* Set METHOD_ENTRY_BASIC without touching vm->running
2024-11-04 11:14:28 -05:00
Takashi Kokubun
9838c443c4 Make builtin init ifdefs consistent 2024-10-25 17:46:49 -07:00
Jean Boussier
9594db0cf2 Implement Hash.new(capacity:)
[Feature #19236]

When building a large hash, pre-allocating it with enough
capacity can save many re-hashes and significantly improve
performance.
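
A minimal usage sketch, assuming the keyword is available from Ruby 3.4 onwards (the version check and the fallback to a plain `{}` are additions for illustration so the snippet also runs on older Rubies):

```ruby
require "rubygems" # for Gem::Version; normally already loaded

n = 100_000

# Assumption: Hash.new(capacity:) exists on Ruby >= 3.4; older Rubies fall back to {}.
supports_capacity = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.4")
squares = supports_capacity ? Hash.new(capacity: n) : {}

# Filling the hash behaves exactly as usual; only the up-front allocation differs.
n.times { |i| squares[i] = i * i }

puts squares.size # => 100000
puts squares[12]  # => 144
```

The capacity is only a sizing hint: it does not limit how many entries the hash can hold and does not set a default value.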

```
/opt/rubies/3.3.0/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \
	            --executables="compare-ruby::../miniruby-master -I.ext/common --disable-gem" \
	            --executables="built-ruby::./miniruby --disable-gem" \
	            --output=markdown --output-compare -v $(find ./benchmark -maxdepth 1 -name 'hash_new' -o -name '*hash_new*.yml' -o -name '*hash_new*.rb' | sort)
compare-ruby: ruby 3.4.0dev (2024-03-25T11:48:11Z master f53209f023) +YJIT dev [arm64-darwin23]
last_commit=[ruby/irb] Cache RDoc::RI::Driver.new (https://github.com/ruby/irb/pull/911)
built-ruby: ruby 3.4.0dev (2024-03-25T15:29:40Z hash-new-rb 77652b08a2) +YJIT dev [arm64-darwin23]
warming up...

|                    |compare-ruby|built-ruby|
|:-------------------|-----------:|---------:|
|new                 |      7.614M|    5.976M|
|                    |       1.27x|         -|
|new_with_capa_1k    |     13.931k|   15.698k|
|                    |           -|     1.13x|
|new_with_capa_100k  |     124.746|   148.283|
|                    |           -|     1.19x|
```
2024-07-08 12:24:33 +02:00
Peter Zhu
9cf754b648 Fix --debug=gc_stress flag
ruby_env_debug_option gets called after Init_gc_stress, so the
--debug=gc_stress flag never works.
2024-03-25 13:07:39 -04:00
Takashi Kokubun
9f0065a077
Initialize interrupt queue before signal handlers (#9196) 2023-12-11 21:12:08 -08:00
KJ Tsanaktsidis
f8effa209a Change the semantics of rb_postponed_job_register
Our current implementation of rb_postponed_job_register suffers from
some safety issues that can lead to interpreter crashes (see bug #1991).
Essentially, the issue is that jobs can be called with the wrong
arguments.

We made two attempts to fix this whilst keeping the promised semantics,
but:
  * The first one involved masking/unmasking when flushing jobs, which
    was believed to be too expensive
  * The second one involved a lock-free, multi-producer, single-consumer
    ringbuffer, which was too complex

The critical insight behind this third solution is that essentially the
only users of these APIs are a) internal, or b) profiling gems.

For a), none of the usages actually require variable data; they will
work just fine with the preregistration interface.

For b), generally profiling gems only call a single callback with a
single piece of data (which is actually usually just zero) for the life
of the program. The ringbuffer is complex because it needs to support
multi-word inserts of job & data (which can't be atomic); but nobody
actually even needs that functionality, really.

So, this commit:
  * Introduces a pre-registration API for jobs, with a GVL-requiring
    rb_postponed_job_preregister, which returns a handle which can be
    used with an async-signal-safe rb_postponed_job_trigger.
  * Deprecates rb_postponed_job_register (and re-implements it on top of
    the preregister function for compatibility)
  * Moves all the internal usages of postponed job register to
    pre-registration
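
The idea can be modelled in a few lines of Ruby. This is a conceptual sketch only, not the C implementation: the class name `PostponedJobTable`, the table size, and the use of a single integer as a set of "triggered" flags are all assumptions made for illustration.

```ruby
# Conceptual model: jobs are registered once up front, and triggering one
# only flips a flag, so there is no multi-word insert to get wrong.
class PostponedJobTable
  TABLE_SIZE = 32 # assumption: small fixed table, one flag bit per slot

  def initialize
    @jobs = []      # [callable, data] pairs, indexed by handle
    @triggered = 0  # bit i set => job i should run at the next flush
  end

  # Stands in for the GVL-requiring pre-registration step; returns a handle.
  def preregister(data = nil, &func)
    raise ArgumentError, "job table is full" if @jobs.size >= TABLE_SIZE
    @jobs << [func, data]
    @jobs.size - 1
  end

  # Stands in for the async-signal-safe trigger: a single flag update.
  def trigger(handle)
    @triggered |= 1 << handle
  end

  # Runs every triggered job once and clears the flags.
  def flush
    pending = @triggered
    @triggered = 0
    @jobs.each_with_index do |(func, data), handle|
      func.call(data) if pending[handle] == 1
    end
  end
end

jobs = PostponedJobTable.new
samples = []
handle = jobs.preregister(:profiler) { |data| samples << data }

jobs.trigger(handle)
jobs.trigger(handle) # repeated triggers before a flush coalesce into one run
jobs.flush
jobs.flush           # nothing pending, so nothing runs
p samples # => [:profiler]
```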
2023-12-10 15:00:37 +09:00
Kevin Newton
3d0a46796b Rename YARP symbols to prism 2023-09-27 13:57:38 -04:00
Peter Zhu
901d0b4125 Remove dead function Init_Method
Init_Method no longer has any code, so we can remove it.
2023-09-19 14:25:01 -04:00
Peter Zhu
1e7b67f733 [Feature #19730] Remove transient heap 2023-07-13 09:27:33 -04:00
Jemma Issroff
d53e1f42ff [Feature #19741] Add yarp to builds
Add yarp to common.mk and windows builds to enable us to run yarp
correctly with CI.
2023-06-21 11:25:39 -07:00
Peter Zhu
f98a7fd28d Move WeakMap and WeakKeyMap code to weakmap.c
These classes don't belong in gc.c as they're not actually part of the
GC. This commit refactors the code by moving all the code into a
weakmap.c file.
2023-03-10 09:32:10 -05:00
Takashi Kokubun
23ec248e48 s/mjit/rjit/ 2023-03-06 23:44:01 -08:00
Takashi Kokubun
2e875549a9 s/MJIT/RJIT/ 2023-03-06 23:44:01 -08:00
Takashi Kokubun
9f8f1afba2 Implement --mjit-stats 2023-03-05 22:11:20 -08:00
Takashi Kokubun
b2dcde839d MJIT: Merge mjit_compiler.rb into mjit.rb
There are too many mjit_compiler.* files. It was hard to find files.
2022-11-26 15:31:38 -08:00
Takashi Kokubun
e7443dbbca
Rewrite Symbol#to_sym and #intern in Ruby (#6683) 2022-11-15 21:34:30 -08:00
Jemma Issroff
5246f4027e Transition shape when object's capacity changes
This commit adds a `capacity` field to shapes, and adds shape
transitions whenever an object's capacity changes. Objects which are
allocated out of a bigger size pool will also make a transition from the
root shape to the shape with the correct capacity for their size pool
when they are allocated.

This commit will allow us to remove numiv from objects completely, and
will also mean we can guarantee that if two objects share shapes, their
IVs are in the same positions (an embedded and extended object cannot
share shapes). This will enable us to implement ivar sets in YJIT using
object shapes.

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
2022-11-10 10:11:34 -05:00
Jemma Issroff
ad63b668e2
Revert "Revert "This commit implements the Object Shapes technique in CRuby.""
This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
2022-10-11 08:40:56 -07:00
Aaron Patterson
9a6803c90b
Revert "This commit implements the Object Shapes technique in CRuby."
This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-30 16:01:50 -07:00
Jemma Issroff
d594a5a8bd
This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.
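
The shape tree itself can be sketched in plain Ruby. This is a toy model under stated assumptions, not the CRuby data structure: the `Shape` and `ShapeTree` classes and their methods are hypothetical, and the real shape IDs depend on what else the process has already allocated.

```ruby
# Toy model: each shape records the ivar that led to it and caches its
# children, so the same sequence of ivar sets always ends at the same shape.
class Shape
  attr_reader :id, :ivar, :parent

  def initialize(id, ivar = nil, parent = nil)
    @id = id
    @ivar = ivar
    @parent = parent
    @children = {}
  end

  # Returns the existing child for ivar, or stores the one built by the block.
  def child(ivar)
    @children[ivar] ||= yield
  end
end

class ShapeTree
  attr_reader :root

  def initialize
    @next_id = 0
    @root = allocate_shape
  end

  # Setting a new ivar moves an object from shape to one of its children.
  def transition(shape, ivar)
    shape.child(ivar) { allocate_shape(ivar, shape) }
  end

  # Shape reached by setting the given ivars in order, starting from the root.
  def shape_for(*ivars)
    ivars.reduce(@root) { |shape, ivar| transition(shape, ivar) }
  end

  private

  def allocate_shape(ivar = nil, parent = nil)
    shape = Shape.new(@next_id, ivar, parent)
    @next_id += 1
    shape
  end
end

tree = ShapeTree.new
foo_shape = tree.shape_for(:@a, :@b)
bar_shape = tree.shape_for(:@a, :@b) # same names, same order
baz_shape = tree.shape_for(:@b, :@a) # same names, different order

puts foo_shape.id                # => 2
puts foo_shape.equal?(bar_shape) # => true
puts baz_shape.id                # => 4
```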

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-28 08:26:21 -07:00
Aaron Patterson
06abfa5be6
Revert this until we can figure out WB issues or remove shapes from GC
Revert "* expand tabs. [ci skip]"

This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275.

Revert "This commit implements the Object Shapes technique in CRuby."

This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
2022-09-26 16:10:11 -07:00
Jemma Issroff
9ddfd2ca00 This commit implements the Object Shapes technique in CRuby.
Object Shapes is used for accessing instance variables and representing the
"frozenness" of objects.  Object instances have a "shape" and the shape
represents some attributes of the object (currently which instance variables are
set and the "frozenness").  Shapes form a tree data structure, and when a new
instance variable is set on an object, that object "transitions" to a new shape
in the shape tree.  Each shape has an ID that is used for caching. The shape
structure is independent of class, so objects of different types can have the
same shape.

For example:

```ruby
class Foo
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

class Bar
  def initialize
    # Starts with shape id 0
    @a = 1 # transitions to shape id 1
    @b = 1 # transitions to shape id 2
  end
end

foo = Foo.new # `foo` has shape id 2
bar = Bar.new # `bar` has shape id 2
```

Both `foo` and `bar` instances have the same shape because they both set
instance variables of the same name in the same order.

This technique can help to improve inline cache hits as well as generate more
efficient machine code in JIT compilers.

This commit also adds some methods for debugging shapes on objects.  See
`RubyVM::Shape` for more details.

For more context on Object Shapes, see [Feature: #18776]

Co-Authored-By: Aaron Patterson <tenderlove@ruby-lang.org>
Co-Authored-By: Eileen M. Uchitelle <eileencodes@gmail.com>
Co-Authored-By: John Hawthorn <john@hawthorn.email>
2022-09-26 09:21:30 -07:00
Takashi Kokubun
f2bea691cd Builtin RubyVM::MJIT::C 2022-09-23 06:44:28 +09:00
Takashi Kokubun
0e816e6d30
Demote mjit_instruction.rb from builtin to stdlib 2022-09-18 14:04:20 +09:00
Takashi Kokubun
3767c6a90d
Ruby MJIT (#6028) 2022-09-04 21:53:46 -07:00
Jean Boussier
e3aabe93aa Implement Queue#pop(timeout: sec)
[Feature #18774]

As well as `SizedQueue#pop(timeout: sec)`

If both `non_block=true` and `timeout:` are supplied, ArgumentError
is raised.
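
A short usage sketch, assuming a Ruby version that includes this feature (3.2 or later); the 50 ms timeout is arbitrary:

```ruby
queue = Queue.new
queue << :job

first  = queue.pop(timeout: 0.05) # an item is available, returned immediately
second = queue.pop(timeout: 0.05) # queue is empty, returns nil once the timeout elapses

# Combining non_block=true with timeout: is rejected.
error = begin
  queue.pop(true, timeout: 0.05)
  nil
rescue ArgumentError => e
  e
end

p first       # => :job
p second      # => nil
p error.class # => ArgumentError
```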
2022-08-02 11:04:28 +02:00
Takashi Kokubun
23459e4dbb
Move RubyVM::MJIT to builtin Ruby
just less C code to maintain
2022-06-15 10:52:37 -07:00
Samuel Williams
4b89034218 IO::Buffer for scheduler interface. 2021-11-10 19:21:05 +13:00
Noah Gibbs
da305dd23e Match the main-branch location of yjit in inits.c 2021-10-20 18:19:43 -04:00
Jose Narvaez
4e2eb7695e Yet Another Ruby JIT!
Renaming uJIT to YJIT. AKA s/ujit/yjit/g.
2021-10-20 18:19:31 -04:00
Aaron Patterson
e427fdff0a Directly link libcapstone for easier development
This lets us use libcapstone directly from miniruby so we don't need a
Ruby Gem to do dev work.

Example usage:

```ruby
def foo(x)
  if x < 1
    "wow"
  else
    "neat"
  end
end

iseq = RubyVM::InstructionSequence.of(method(:foo))
puts UJIT.disasm(iseq)
100.times { foo 1 }
puts UJIT.disasm(iseq)
```

Then in the terminal

```
$ ./miniruby test.rb

== disasm: #<ISeq:foo@test.rb:1 (1,0)-(7,3)> (catch: FALSE)
local table (size: 1, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0<Arg>
0000 getlocal_WC_0                          x@0                       (   2)[LiCa]
0002 putobject_INT2FIX_1_
0003 opt_lt                                 <calldata!mid:<, argc:1, ARGS_SIMPLE>
0005 branchunless                           10
0007 putstring                              "wow"                     (   3)[Li]
0009 leave                                                            (   7)[Re]
0010 putstring                              "neat"                    (   5)[Li]
0012 leave                                                            (   7)[Re]

== ISEQ RANGE: 10 -> 10 ========================================================
        0x0:    movabs  rax, 0x7fe816e2d1a0
        0xa:    mov     qword ptr [rdi], rax
        0xd:    mov     r8, rax
        0x10:   mov     r9, rax
        0x13:   mov     r11, r12
        0x16:   jmp     qword ptr [rax]
== ISEQ RANGE: 0 -> 7 ==========================================================
        0x0:    mov     rax, qword ptr [rdi + 0x20]
        0x4:    mov     rax, qword ptr [rax - 0x18]
        0x8:    mov     qword ptr [rdx], rax
        0xb:    mov     qword ptr [rdx + 8], 3
        0x13:   movabs  rax, 0x7fe817808200
        0x1d:   test    byte ptr [rax + 0x3e6], 1
        0x24:   jne     0x3ffff7b
        0x2a:   test    byte ptr [rdx], 1
        0x2d:   je      0x3ffff7b
        0x33:   test    byte ptr [rdx + 8], 1
        0x37:   je      0x3ffff7b
        0x3d:   mov     rax, qword ptr [rdx]
        0x40:   cmp     rax, qword ptr [rdx + 8]
        0x44:   movabs  rax, 0
        0x4e:   movabs  rcx, 0x14
        0x58:   cmovl   rax, rcx
        0x5c:   mov     qword ptr [rdx], rax
        0x5f:   test    qword ptr [rdx], -9
        0x66:   jne     0x3ffffd5
```

Make sure to `brew install pkg-config capstone`
2021-10-20 18:19:27 -04:00
Jean Boussier
afcbb501ac marshal.c Marshal.load accepts a freeze: true option.
Fixes [Feature #18148]

When set, all the loaded objects are returned as frozen.

If a proc is provided, it is called with the objects already frozen.
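
A minimal usage sketch, assuming a Ruby version that includes this option (3.1 or later); the payload is made up for illustration:

```ruby
payload = Marshal.dump(["config", { "name" => "ruby" }])

# Without the option, the loaded objects are mutable as before.
thawed = Marshal.load(payload)

# With freeze: true, the whole loaded object graph comes back frozen.
loaded = Marshal.load(payload, freeze: true)
all_frozen = loaded.frozen? &&
             loaded[0].frozen? &&
             loaded[1].frozen? &&
             loaded[1]["name"].frozen?

# A proc can still be passed; this one records whether each object
# was frozen at the moment the proc saw it, then returns it unchanged.
seen = []
Marshal.load(payload, ->(obj) { seen << obj.frozen?; obj }, freeze: true)

puts thawed.frozen? # => false
puts all_frozen     # => true
```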
2021-10-05 18:34:56 +02:00
S.H
28b481938b
Implemented some NilClass method in Ruby code is faster [Feature #17054] (#3366) 2021-06-02 20:04:56 -07:00
Samuel Williams
5f69a7f604
Expose scheduler as public interface & bug fixes. (#3945)
* Rename `rb_scheduler` to `rb_fiber_scheduler`.

* Use public interface if available.

* Use `rb_check_funcall` where possible.

* Don't use `unblock` unless the fiber was non-blocking.
2021-02-09 19:39:56 +13:00
S.H
daec5f9edc
Improve performance some Float methods [Feature #17498] (#4018) 2021-01-01 18:39:07 -08:00
Nobuyoshi Nakada
93735f8fc0
Moved time.rb to timev.rb 2020-12-31 17:23:37 +09:00
Nobuyoshi Nakada
d5fb51d2d3
Add time.rb as builtin 2020-12-31 15:19:06 +09:00
Kenta Murata
890bc2cdde
Buffer protocol proposal (#3261)
* Add buffer protocol

* Modify for some review comments

* Per-object buffer availability

* Rename to MemoryView from Buffer and make compilable

* Support integral repeat count in memory view format

* Support 'x' for padding bytes

* Add rb_memory_view_parse_item_format

* Check type in rb_memory_view_register

* Update dependencies in common.mk

* Add test of MemoryView

* Add test of rb_memory_view_init_as_byte_array

* Add native size format test

* Add MemoryView test utilities

* Add test of rb_memory_view_fill_contiguous_strides

* Skip spaces in format string

* Support endianness specifiers

* Update documentation

* Support alignment

* Use RUBY_ALIGNOF

* Fix format parser to follow the pack format

* Support the _ modifier

* Parse count specifiers in get_format_size function.

* Use STRUCT_ALIGNOF

* Fix test

* Fix test

* Fix total size for the case with tail padding

* Fix rb_memory_view_get_item_pointer

* Fix rb_memory_view_parse_item_format again
2020-09-25 20:32:02 +09:00
Samuel Williams
d387029f39 Standardised scheduler interface. 2020-09-14 16:44:09 +12:00
Matt Valentine-House
ef22af4db0 If the GC runs before the Mutex's are initialised then we get a crash in pthread_mutex_lock.
It is possible for GC to run during initialisation due to objects being allocated
2020-09-10 08:48:51 -07:00
Koichi Sasada
79df14c04b Introduce Ractor mechanism for parallel execution
This commit introduces the Ractor mechanism to run Ruby programs in
parallel. See doc/ractor.md for more details about Ractor.
See ticket [Feature #17100] to see the implementation details
and discussions.

[Feature #17100]

This commit does not complete the implementation. You may find
many bugs when using Ractor. The specification is also subject to
change, so this feature is experimental. You will see a warning when
you make the first Ractor with `Ractor.new`.

I hope this feature can help programmers avoid thread-safety issues.
2020-09-03 21:11:06 +09:00
Takashi Kokubun
95b0fed371
Make Integer#zero? a separated method and builtin (#3226)
A prerequisite to fix https://bugs.ruby-lang.org/issues/15589 with JIT.
This commit alone doesn't make a significant difference yet, but I thought
this commit should be committed independently.

This method override was discussed in [Misc #16961].
2020-06-20 14:55:09 -07:00
Nobuyoshi Nakada
2c3c6c96cf
Defer initialization
Defer initialization of extension libraries, loading prelude files
and requiring files, and skip if dump options are given.
2020-05-16 17:37:28 +09:00
Nobuyoshi Nakada
310054b240 Moved Dir.open and Dir#initialize to dir.rb 2020-04-06 22:22:25 +09:00
S.H
290d608637
support builtin for Kernel#clone 2020-03-17 19:37:07 +09:00