1berry/ruby - ruby - Gitea : Git Mirror

Author	SHA1	Message	Date
Yusuke Endoh	993fd96ce6	reject numbered parameters from Binding#local_variables Also, Binding#local_variable_get and #local_variable_set rejects an access to numbered parameters. [Bug #20965] [Bug #21049]	2025-02-18 16:23:24 +09:00
Nobuyoshi Nakada	4a67ef09cc	[Feature #21116 ] Extract RJIT as a third-party gem	2025-02-13 18:01:03 +09:00
Nobuyoshi Nakada	f7059af50a	Use no-inline version `rb_current_ec` on Arm64 The TLS across .so issue seems related to Arm64, but not Darwin.	2025-01-17 22:48:10 +09:00
Peter Zhu	f65a6c090c	Fix use-after-free in constant cache [Bug #20921] When we create a cache entry for a constant, the following sequence of events could happen: - vm_track_constant_cache is called to insert a constant cache. - In vm_track_constant_cache, we first look up the ST table for the ID of the constant. Assume the ST table exists because another iseq also holds a cache entry for this ID. - We then insert into this ST table with the iseq_inline_constant_cache. - However, while inserting into this ST table, it allocates memory, which could trigger a GC. Assume that it does trigger a GC. - The GC frees the one and only other iseq that holds a cache entry for this ID. - In remove_from_constant_cache, it will appear that the ST table is now empty because there are no more iseq with cache entries for this ID, so we free the ST table. - We complete GC and continue our st_insert. However, this ST table has been freed so we now have a use-after-free. This issue is very hard to reproduce, because it requires that the GC runs at a very specific time. However, we can make it show up by applying this patch which runs GC right before the st_insert to mimic the st_insert triggering a GC: diff --git a/vm_insnhelper.c b/vm_insnhelper.c index 3cb23f06f0..a93998136a 100644 --- a/vm_insnhelper.c +++ b/vm_insnhelper.c @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic) rb_id_table_insert(const_cache, id, (VALUE)ics); } + if (id == rb_intern("MyConstant")) rb_gc(); + st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue); } And if we run this script: Object.const_set("MyConstant", "Hello!") my_proc = eval("-> { MyConstant }") my_proc.call my_proc = eval("-> { MyConstant }") my_proc.call We can see that ASAN outputs a use-after-free error: ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68 READ of size 8 at 0x606000049528 thread T0 #0 0x102f3cea8 in do_hash st.c:321 #1 0x102f3ddd0 in rb_st_insert st.c:1132 #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345 #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #5 0x1030bc1e0 in vm_exec_core insns.def:263 #6 0x1030b55fc in rb_vm_exec vm.c:2585 #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #8 0x102a82588 in rb_ec_exec_node eval.c:281 #9 0x102a81fe0 in ruby_run_node eval.c:319 #10 0x1027f3db4 in rb_main main.c:43 #11 0x1027f3bd4 in main main.c:68 #12 0x183900270 (<unknown module>) 0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558) freed by thread T0 here: #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40) #1 0x102ada89c in rb_gc_impl_free default.c:8183 #2 0x102ada7dc in ruby_sized_xfree gc.c:4507 #3 0x102ac4d34 in ruby_xfree gc.c:4518 #4 0x102f3cb34 in rb_st_free_table st.c:663 #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119 #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153 #7 0x102bbd2a0 in rb_iseq_free iseq.c:166 #8 0x102b32ed0 in rb_imemo_free imemo.c:564 #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407 #10 0x102af4290 in gc_sweep_plane default.c:3546 #11 0x102af3bdc in gc_sweep_page default.c:3634 #12 0x102aeb140 in gc_sweep_step default.c:3906 #13 0x102aeadf0 in gc_sweep_rest default.c:3978 #14 0x102ae4714 in gc_sweep default.c:4155 #15 0x102af8474 in gc_start default.c:6484 #16 0x102afbe30 in garbage_collect default.c:6363 #17 0x102ad37f0 in rb_gc_impl_start default.c:6816 #18 0x102ad3634 in rb_gc gc.c:3624 #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342 #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #22 0x1030bc1e0 in vm_exec_core insns.def:263 #23 0x1030b55fc in rb_vm_exec vm.c:2585 #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #25 0x102a82588 in rb_ec_exec_node eval.c:281 #26 0x102a81fe0 in ruby_run_node eval.c:319 #27 0x1027f3db4 in rb_main main.c:43 #28 0x1027f3bd4 in main main.c:68 #29 0x183900270 (<unknown module>) previously allocated by thread T0 here: #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04) #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198 #2 0x102acee44 in ruby_xmalloc gc.c:4438 #3 0x102f3c85c in rb_st_init_table_with_size st.c:571 #4 0x102f3c900 in rb_st_init_table st.c:600 #5 0x102f3c920 in rb_st_init_numtable st.c:608 #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337 #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #9 0x1030bc1e0 in vm_exec_core insns.def:263 #10 0x1030b55fc in rb_vm_exec vm.c:2585 #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #12 0x102a82588 in rb_ec_exec_node eval.c:281 #13 0x102a81fe0 in ruby_run_node eval.c:319 #14 0x1027f3db4 in rb_main main.c:43 #15 0x1027f3bd4 in main main.c:68 #16 0x183900270 (<unknown module>) This commit fixes this bug by adding a inserting_constant_cache_id field to the VM, which stores the ID that is currently being inserted and, in remove_from_constant_cache, we don't free the ST table for ID equal to this one. Co-Authored-By: Alan Wu <alanwu@ruby-lang.org>	2024-11-29 10:46:43 -05:00
Randy Stauner	1dd40ec18a	Optimize instructions when creating an array just to call `include?` (#12123 ) * Add opt_duparray_send insn to skip the allocation on `#include?` If the method isn't going to modify the array we don't need to copy it. This avoids the allocation / array copy for things like `[:a, :b].include?(x)`. This adds a BOP for include? and tracks redefinition for it on Array. Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com> * YJIT: Implement opt_duparray_send include_p Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com> * Update opt_newarray_send to support simple forms of include?(arg) Similar to opt_duparray_send but for non-static arrays. * YJIT: Implement opt_newarray_send include_p --------- Co-authored-by: Andrew Novoselac <andrew.novoselac@shopify.com>	2024-11-26 14:31:08 -05:00
Koichi Sasada	aa63699d10	support `require` in non-main Ractors Many libraries should be loaded on the main ractor because of setting constants with unshareable objects and so on. This patch allows to call `requore` on non-main Ractors by asking the main ractor to call `require` on it. The calling ractor waits for the result of `require` from the main ractor. If the `require` call failed with some reasons, an exception objects will be deliverred from the main ractor to the calling ractor if it is copy-able. Same on `require_relative` and `require` by `autoload`. Now `Ractor.new{pp obj}` works well (the first call of `pp` requires `pp` library implicitly). [Feature #20627]	2024-11-08 18:02:46 +09:00
Koichi Sasada	c8297c3eed	`interrupt_exec` introduce - rb_threadptr_interrupt_exec - rb_ractor_interrupt_exec to intercept the thread/ractor execution.	2024-11-08 18:02:46 +09:00
Koichi Sasada	ab7ab9e450	`Warning[:strict_unused_block]` to show unused block warning strictly. ```ruby class C def f = nil end class D def f = yield end [C.new, D.new].each{\|obj\| obj.f{}} ``` In this case, `D#f` accepts a block. However `C#f` doesn't accept a block. There are some cases passing a block with `obj.f{}` where `obj` is `C` or `D`. To avoid warnings on such cases, "unused block warning" will be warned only if there is not same name which accepts a block. On the above example, `C.new.f{}` doesn't show any warnings because there is a same name `D#f` which accepts a block. We call this default behavior as "relax mode". `strict_unused_block` new warning category changes from "relax mode" to "strict mode", we don't check same name methods and `C.new.f{}` will be warned. [Feature #15554]	2024-11-06 11:06:18 +09:00
Takashi Kokubun	478e0fc710	YJIT: Replace Array#each only when YJIT is enabled (#11955 ) * YJIT: Replace Array#each only when YJIT is enabled * Add comments about BUILTIN_ATTR_C_TRACE * Make Ruby Array#each available with --yjit as well * Fix all paths that expect a C location * Use method_basic_definition_p to detect patches * Copy a comment about C_TRACE flag to compilers * Rephrase a comment about add_yjit_hook * Give METHOD_ENTRY_BASIC flag to Array#each * Add --yjit-c-builtin option * Allow inconsistent source_location in test-spec * Refactor a check of BUILTIN_ATTR_C_TRACE * Set METHOD_ENTRY_BASIC without touching vm->running	2024-11-04 11:14:28 -05:00
Samuel Williams	4031beb083	Add documentation for `RUBY_ASSERT_CRITICAL_SECTION`. (#11982 )	2024-11-02 14:44:11 +13:00
Peter Zhu	645a0c9ea7	Remove vm_assert_env	2024-10-31 13:52:24 -04:00
Peter Zhu	843b4f49ee	Fix assertion when envval of proc is Qundef The following code crashes with assertions enabled because envval could be Qundef: {}.to_proc.dup	2024-10-31 13:52:24 -04:00
John Hawthorn	7be9a333ca	YJIT: Allow shareable consts in multi-ractor mode (#11917 ) * Update yjit-bindgen deps * YJIT: Allow shareable consts in multi-ractor mode * Update yjit/src/codegen.rs Co-authored-by: Alan Wu <XrXr@users.noreply.github.com> --------- Co-authored-by: Alan Wu <XrXr@users.noreply.github.com>	2024-10-18 15:01:45 -04:00
Peter Zhu	3c63a01295	Move responsibility of heap walking into Ruby This commit removes the need for the GC implementation to implement heap walking and instead Ruby will implement it.	2024-09-03 10:05:38 -04:00
Nobuyoshi Nakada	21dfe34aae	Stringize VM_ASSERT expression before expansion	2024-08-16 16:55:51 +09:00
Yusuke Endoh	ac5ac48a36	Revert 28a1c4f33e3349a98c04b8e068d9c674eb936064 28a1c4f33e3349a98c04b8e068d9c674eb936064 seems to call an improper ensure clause. [Bug #20655] Than fixing it properly, I bet it would be much better to simply revert that commit. It reduces the unneeded complexity. Jumping into a block called by a C function like Hash#each with callcc is user's fault. It does not need serious support.	2024-07-30 15:31:24 +09:00
Randy Stauner	acbb8d4fb5	Expand opt_newarray_send to support Array#pack with buffer keyword arg Use an enum for the method arg instead of needing to add an id that doesn't map to an actual method name. $ ruby --dump=insns -e 'b = "x"; [v].pack("E", buffer: b)' before: ``` == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] b@0 0000 putchilledstring "x" ( 1)[Li] 0002 setlocal_WC_0 b@0 0004 putself 0005 opt_send_without_block <calldata!mid:v, argc:0, FCALL\|VCALL\|ARGS_SIMPLE> 0007 newarray 1 0009 putchilledstring "E" 0011 getlocal_WC_0 b@0 0013 opt_send_without_block <calldata!mid:pack, argc:2, kw:[#<Symbol:0x000000000023110c>], KWARG> 0015 leave ``` after: ``` == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] b@0 0000 putchilledstring "x" ( 1)[Li] 0002 setlocal_WC_0 b@0 0004 putself 0005 opt_send_without_block <calldata!mid:v, argc:0, FCALL\|VCALL\|ARGS_SIMPLE> 0007 putchilledstring "E*" 0009 getlocal b@0, 0 0012 opt_newarray_send 3, 5 0015 leave ```	2024-07-29 16:26:58 -04:00
Peter Zhu	ae8ef06580	[Feature #20470 ] Implement support for USE_SHARED_GC This commit implements support to load Ruby's current GC as a DSO.	2024-07-03 09:03:40 -04:00
Peter Zhu	51bd816517	[Feature #20470 ] Split GC into gc_impl.c This commit splits gc.c into two files: - gc.c now only contains code not specific to Ruby GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM). - gc_impl.c now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, gc_impl.c only uses public APIs in Ruby and a limited set of functions exposed in gc.c. This allows us to build gc_impl.c independently of Ruby and plug Ruby's GC into itself.	2024-07-03 09:03:40 -04:00
yui-knk	9d76a0ab4a	Add RB_GC_GUARD for ast_value I think this change fixes the following assertion failure: ``` [BUG] unexpected rb_parser_ary_data_type (2114076960) for script lines ``` It seems that `ast_value` is collected then `rb_parser_build_script_lines_from` touches invalid memory address. This change prevents `ast_value` from being collected by RB_GC_GUARD.	2024-06-30 09:20:38 +09:00
Peter Zhu	90763e04ba	Load external GC using command line argument This commit changes the external GC to be loaded with the `--gc-library` command line argument instead of the RUBY_GC_LIBRARY_PATH environment variable because @nobu pointed out that loading binaries using environment variables can pose a security risk.	2024-06-21 11:49:01 -04:00
Aaron Patterson	cdf33ed5f3	Optimized forwarding callers and callees This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls. Calls it optimizes look like this: ```ruby def bar(a) = a def foo(...) = bar(...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) = bar(1, 2, ...) # optimized foo(123) ``` ```ruby def bar(a) = a def foo(...) list = [1, 2] bar(list, ...) # optimized end foo(123) ``` All variants of the above but using `super` are also optimized, including a bare super like this: ```ruby def foo(...) super end ``` This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this: ```ruby def m x = GC.stat(:total_allocated_objects) yield GC.stat(:total_allocated_objects) - x end def bar(a) = a def foo(...) = bar(...) def test m { foo(123) } end test p test # allocates 1 object on master, but 0 objects with this patch ``` ```ruby def bar(a, b:) = a + b def foo(...) = bar(...) def test m { foo(1, b: 2) } end test p test # allocates 2 objects on master, but 0 objects with this patch ``` How does it work? ----------------- This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI. I think this description is kind of confusing, so let's walk through an example with code. ```ruby def delegatee(a, b) = a + b def delegator(...) delegatee(...) # CI2 (FORWARDING) end def caller delegator(1, 2) # CI1 (argc: 2) end ``` Before we call the delegator method, the stack looks like this: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| 5\| delegatee(...) # CI2 (FORWARDING) \| 6\| end \| 7\| \| 8\| def caller \| -> 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object. Here is the ISeq disasm fo `delegator`: ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` The local called `...` will contain the caller's CI: CI1. Here is the stack when we enter `delegator`: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 -> 4\| # \| CI1 (argc: 2) 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| 9\| delegator(1, 2) # CI1 (argc: 2) \| 10\| end \| ``` The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2). Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped): ``` == disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)> local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1]) [ 1] "..."@0 0000 putself ( 1)[LiCa] 0001 getlocal_WC_0 "..."@0 0003 send <calldata!mid:delegatee, argc:0, FCALL\|FORWARDING>, nil 0006 leave [Re] ``` Instruction 001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the callers stack to this stack: ``` Executing Line \| Code \| Stack ---------------+---------------------------------------+-------- 1\| def delegatee(a, b) = a + b \| self 2\| \| 1 3\| def delegator(...) \| 2 4\| # \| CI1 (argc: 2) -> 5\| delegatee(...) # CI2 (FORWARDING) \| cref_or_me 6\| end \| specval 7\| \| type 8\| def caller \| self 9\| delegator(1, 2) # CI1 (argc: 2) \| 1 10\| end \| 2 ``` The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc. Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects. I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`. I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`. For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs: ```ruby SomeObject.new(foo: 1) ``` If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation. Co-Authored-By: John Hawthorn <john@hawthorn.email> Co-Authored-By: Alan Wu <XrXr@users.noreply.github.com>	2024-06-18 09:28:25 -07:00
yui-knk	899d9f79dd	Rename `vast` to `ast_value` There is an English word "vast". This commit changes the name to be more clear name to avoid confusion.	2024-05-03 12:40:35 +09:00
HASUMI Hitoshi	2244c58b00	[Universal parser] Decouple IMEMO from rb_ast_t This patch removes the `VALUE flags` member from the `rb_ast_t` structure making `rb_ast_t` no longer an IMEMO object. ## Background We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE. ## Summary (file by file) - `rubyparser.h` - Remove the `VALUE flags` member from `rb_ast_t` - `ruby_parser.c` and `internal/ruby_parser.h` - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` `in ast_alloc()` so that GC can manage it - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()` - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t ` to `VALUE` - rb_ruby_ast_new() which internally `calls ast_alloc()` is to create VALUE vast outside ruby_parser.c - `iseq.c` and `vm_core.h` - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t ` to `VALUE` - This keeps the VALUE of AST on the machine stack to prevent being removed by GC - `ast.c` - Almost all change is replacement `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff) - Fix `node_memsize()` - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines - `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c` - Follow-up due to the above changes - `imemo.{c\|h}` - If an object with `imemo_ast` appears, considers it a bug Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>	2024-04-26 11:21:08 +09:00
Peter Zhu	f248e1008a	Embed rb_gc_function_map_t in rb_vm_t Avoids a pointer indirection and memory allocation.	2024-04-25 09:25:33 -04:00
Peter Zhu	480287d140	Add macro load_external_gc_func for loading functions from external GC	2024-04-24 10:31:03 -04:00
Koichi Sasada	662ce928a7	`RUBY_TRY_UNUSED_BLOCK_WARNING_STRICT` `RUBY_TRY_UNUSED_BLOCK_WARNING_STRICT=1 ruby ...` will enable strict check for unused block warning. This option is only for trial to compare the results so the envname is not considered well. Should be removed before Ruby 3.4.0 release.	2024-04-19 14:28:54 +09:00
Koichi Sasada	e9d7478ded	relax unused block warning for duck typing if a method `foo` uses a block, other (unrelated) method `foo` can receives a block. So try to relax the unused block warning condition. ```ruby class C0 def f = yield end class C1 < C0 def f = nil end [C0, C1].f{ block } # do not warn ```	2024-04-17 20:26:49 +09:00
Matt Valentine-House	065710c0f5	Initialize external GC Library Co-Authored-By: Peter Zhu <peter@peterzhu.ca>	2024-04-15 19:50:47 +01:00
HASUMI Hitoshi	9b1e97b211	[Universal parser] DeVALUE of p->debug_lines and ast->body.script_lines This patch is part of universal parser work. ## Summary - Decouple VALUE from members below: - `(struct parser_params )->debug_lines` - `(rb_ast_t )->body.script_lines` - Instead, they are now `rb_parser_ary_t ` - They can also be a `(VALUE)FIXNUM` as before to hold line count - `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE - In order to do this, - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()` - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t ` into `VALUE` ## Other details - Extend `rb_parser_ary_t `. It previously could only store `rb_parser_ast_token `, now can store script_lines, too - Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()` - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]` - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines` - Remove the second parameter of `rb_parser_set_script_lines()` to make it simple - Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines - Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` called - With regard to this, please see Future tasks below ## Future tasks - Decouple IMEMO from `rb_ast_t *` - This lifts the five-members-restriction of Ruby object, - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple	2024-04-15 20:51:54 +09:00
Koichi Sasada	9180e33ca3	show warning for unused block With verbopse mode (-w), the interpreter shows a warning if a block is passed to a method which does not use the given block. Warning on: * the invoked method is written in C * the invoked method is not `initialize` * not invoked with `super` * the first time on the call-site with the invoked method (`obj.foo{}` will be warned once if `foo` is same method) [Feature #15554] `Primitive.attr! :use_block` is introduced to declare that primitive functions (written in C) will use passed block. For minitest, test needs some tweak, so use `ea9caafc07` for `test-bundled-gems`.	2024-04-15 12:08:07 +09:00
KJ Tsanaktsidis	2535a09e85	Check ASAN fake stacks when marking non-current threads Currently, we check the values on the machine stack & register state to see if they're actually a pointer to an ASAN fake stack, and mark the values on the fake stack too if required. However, we are only doing that for the _current_ thread (the one actually running the GC), not for any other thread in the program. Make rb_gc_mark_machine_context (which is called for marking non-current threads) perform the same ASAN fake stack handling that mark_current_machine_context performs. [Bug #20310]	2024-03-25 14:57:04 +11:00
KJ Tsanaktsidis	48d3bdddba	Move asan_fake_stack_handle to EC, not thread It's really a property of the EC; each fiber (which has its own EC) also has its own asan_fake_stack_handle. [Bug #20310]	2024-03-25 14:57:04 +11:00
Peter Zhu	c2170e5c2b	Fix typo from gloabl_object_list to global_object_list	2024-03-14 13:52:20 -04:00
Peter Zhu	4559a161af	Move gloabl_object_list from objspace to VM This is to be consistent with the mark_object_ary that is in the VM.	2024-03-14 13:29:59 -04:00
Jean Boussier	d4f3dcf4df	Refactor VM root modules This `st_table` is used to both mark and pin classes defined from the C API. But `vm->mark_object_ary` already does both much more efficiently. Currently a Ruby process starts with 252 rooted classes, which uses `7224B` in an `st_table` or `2016B` in an `RArray`. So a baseline of 5kB saved, but since `mark_object_ary` is preallocated with `1024` slots but only use `405` of them, it's a net `7kB` save. `vm->mark_object_ary` is also being refactored. Prior to this changes, `mark_object_ary` was a regular `RArray`, but since this allows for references to be moved, it was marked a second time from `rb_vm_mark()` to pin these objects. This has the detrimental effect of marking these references on every minors even though it's a mostly append only list. But using a custom TypedData we can save from having to mark all the references on minor GC runs. Addtionally, immediate values are now ignored and not appended to `vm->mark_object_ary` as it's just wasted space.	2024-03-06 15:33:43 -05:00
Kevin Newton	0f1ca9492c	[PRISM] Provide runtime flag for prism in iseq	2024-02-21 11:44:40 -05:00
Peter Zhu	330830dd1a	Add IMEMO_NEW Rather than exposing that an imemo has a flag and four fields, this changes the implementation to only expose one field (the klass) and fills the rest with 0. The type will have to fill in the values themselves.	2024-02-21 11:33:05 -05:00
John Hawthorn	1c97abaaba	De-dup identical callinfo objects Previously every call to vm_ci_new (when the CI was not packable) would result in a different callinfo being returned this meant that every kwarg callsite had its own CI. When calling, different CIs result in different CCs. These CIs and CCs both end up persisted on the T_CLASS inside cc_tbl. So in an eval loop this resulted in a memory leak of both types of object. This also likely resulted in extra memory used, and extra time searching, in non-eval cases. For simplicity in this commit I always allocate a CI object inside rb_vm_ci_lookup, but ideally we would lazily allocate it only when needed. I hope to do that as a follow up in the future.	2024-02-20 18:55:00 -08:00
Nobuyoshi Nakada	f3cc1f9a70	Show actual imemo type when unexpected type	2024-02-08 18:08:42 +09:00
Jeremy Evans	0f90a24a81	Introduce Allocationless Anonymous Splat Forwarding Ruby makes it easy to delegate all arguments from one method to another: ```ruby def f(args, kw) g(args, *kw) end ``` Unfortunately, this indirection decreases performance. One reason it decreases performance is that this allocates an array and a hash per call to `f`, even if `args` and `kw` are not modified. Due to Ruby's ability to modify almost anything at runtime, it's difficult to avoid the array allocation in the general case. For example, it's not safe to avoid the allocation in a case like this: ```ruby def f(args, *kw) foo(bar) g(args, *kw) end ``` Because `foo` may be `eval` and `bar` may be a string referencing `args` or `kw`. To fix this correctly, you need to perform something similar to escape analysis on the variables. However, there is a case where you can avoid the allocation without doing escape analysis, and that is when the splat variables are anonymous: ```ruby def f(, *) g(, *) end ``` When splat variables are anonymous, it is not possible to reference them directly, it is only possible to use them as splats to other methods. Since that is the case, if `f` is called with a regular splat and a keyword splat, it can pass the arguments directly to `g` without copying them, avoiding allocation. For example: ```ruby def g(a, b:) a + b end def f(, *) g(, *) end a = [1] kw = {b: 2} f(a, **kw) ``` I call this technique: Allocationless Anonymous Splat Forwarding. This is implemented using a couple additional iseq param flags, anon_rest and anon_kwrest. If anon_rest is set, and an array splat is passed when calling the method when the array splat can be used without modification, `setup_parameters_complex` does not duplicate it. Similarly, if anon_kwest is set, and a keyword splat is passed when calling the method, `setup_parameters_complex` does not duplicate it.	2024-01-24 18:25:55 -08:00
Takashi Kokubun	27c1dd8634	YJIT: Allow inlining ISEQ calls with a block (#9622 ) * YJIT: Allow inlining ISEQ calls with a block * Leave a TODO comment about u16 inline_block	2024-01-23 19:36:23 +00:00
Nobuyoshi Nakada	127b19ab56	Use line numbers as builtin-index The order of iseq may differ from the order of tokens, typically `while`/`until` conditions are put after the body. These orders can match by using line numbers as builtin-indexes, but at the same time, it introduces the restriction that multiple `cexpr!` and `cstmt!` cannot appear in the same line. Another possible idea is to use `RubyVM::AbstractSyntaxTree` and `node_id` instead of ripper, with making BASERUBY 3.1 or later.	2024-01-22 19:39:34 +09:00
KJ Tsanaktsidis	61da90c1b8	Mark asan fake stacks during machine stack marking ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]	2024-01-19 09:55:12 +11:00
KJ Tsanaktsidis	807714447e	Pass down "stack start" variables from closer to the top of the stack This commit changes how stack extents are calculated for both the main thread and other threads. Ruby uses the address of a local variable as part of the calculation for machine stack extents: * pthreads uses it as a lower-bound on the start of the stack, because glibc (and maybe other libcs) can store its own data on the stack before calling into user code on thread creation. * win32 uses it as an argument to VirtualQuery, which gets the extent of the memory mapping which contains the variable However, the local being used for this is actually too low (too close to the leaf function call) in both the main thread case and the new thread case. In the main thread case, we have the `INIT_STACK` macro, which is used for pthreads to set the `native_main_thread->stack_start` value. This value is correctly captured at the very top level of the program (in main.c). However, this is _not_ what's used to set the execution context machine stack (`th->ec->machine_stack.stack_start`); that gets set as part of a call to `ruby_thread_init_stack` in `Init_BareVM`, using the address of a local variable allocated _inside_ `Init_BareVM`. This is too low; we need to use a local allocated closer to the top of the program. In the new thread case, the lolcal is allocated inside `native_thread_init_stack`, which is, again, too low. In both cases, this means that we might have VALUEs lying outside the bounds of `th->ec->machine.stack_{start,end}`, which won't be marked correctly by the GC machinery. To fix this, * In the main thread case: We already have `INIT_STACK` at the right level, so just pass that local var to `ruby_thread_init_stack`. * In the new thread case: Allocate the local one level above the call to `native_thread_init_stack` in `call_thread_start_func2`. [Bug #20001] fix	2024-01-19 09:55:12 +11:00
Takashi Kokubun	8642a573e6	Rename BUILTIN_ATTR_SINGLE_NOARG_INLINE to BUILTIN_ATTR_SINGLE_NOARG_LEAF The attribute was created when the other attribute was called BUILTIN_ATTR_INLINE. Now that the original attribute is renamed to BUILTIN_ATTR_LEAF, it's only confusing that we call it "_INLINE".	2024-01-16 17:31:27 -08:00
Takashi Kokubun	e37a37e696	Drop obsoleted BUILTIN_ATTR_NO_GC attribute The thing that has used this in the past was very buggy, and we've never revisied it. Let's remove it until we need it again.	2024-01-16 17:27:53 -08:00
KJ Tsanaktsidis	396e94666b	Revert "Pass down "stack start" variables from closer to the top of the stack" This reverts commit 4ba8f0dc993953d3ddda6328e3ef17a2fc2cbde5.	2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis	688a6ff510	Revert "Mark asan fake stacks during machine stack marking" This reverts commit d10bc3a2b8300cffc383e10c3730871e851be24c.	2024-01-12 17:58:54 +11:00
KJ Tsanaktsidis	d10bc3a2b8	Mark asan fake stacks during machine stack marking ASAN leaves a pointer to the fake frame on the stack; we can use the __asan_addr_is_in_fake_stack API to work out the extent of the fake stack and thus mark any VALUEs contained therein. [Bug #20001]	2024-01-12 17:29:48 +11:00

1 2 3 4 5 ...

900 Commits