137 Commits

Author SHA1 Message Date
Jeremy Evans
e4f85bfc31 Implement Set as a core class
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.

Implementation details:

* Core Set uses a modified version of `st_table`, named `set_table`.
  Other than `s/st_/set_/`, the main difference is that the stored records
  do not have values, making them 1/3 smaller. `st_table_entry` stores
  `hash`, `key`, and `record` (value), while `set_table_entry` only
  stores `hash` and `key`.  This results in large sets using ~33% less
  memory compared to stdlib Set.  For small sets, core Set uses 12% more
  memory (160 byte object slot and 64 malloc bytes, while stdlib set
  uses 40 for Set and 160 for Hash).  More memory is used because
  the set_table is embedded and 72 bytes in the object slot are
  currently wasted. Hopefully we can make this more efficient and have
  it stored in an 80 byte object slot in the future.

* All methods are implemented as cfuncs, except the pretty_print
  methods, which were moved to `lib/pp.rb` (which is where the
  pretty_print methods for other core classes are defined).  As is
  typical for core classes, internal calls call C functions and
  not Ruby methods.  For example, to check if something is a Set,
  `rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
  related object.

* Almost all methods use the same algorithm that the pure-Ruby
  implementation used.  The exception is when calling `Set#divide` with a
  block with 2-arity.  The pure-Ruby method used tsort to implement this.
  I developed an algorithm that only allocates a single intermediate
  hash and does not need tsort.

* The `flatten_merge` protected method is no longer necessary, so it
  is not implemented (it could be).

* Similar to Hash/Array, subclasses of Set are no longer reflected in
  `inspect` output.

* RDoc from stdlib Set was moved to core Set, with minor updates.
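
As a rough illustration of the 2-arity `Set#divide` case mentioned in the list above, the sketch below groups integers that are linked by a difference of 1. The sample data is only an assumption for demonstration; it shows the observable behavior, not how the C implementation or the single intermediate hash works internally.

```ruby
require "set" # harmless where Set is already core or autoloaded

numbers = Set[1, 3, 4, 6, 9, 10, 11]

# With a 2-arity block, elements a and b end up in the same subset when they
# are connected through a chain of pairs for which the block returns true.
groups = numbers.divide { |a, b| (a - b).abs == 1 }

groups.each { |group| p group.to_a.sort }
```

The block should be symmetric (the same answer for `a, b` and `b, a`), as it is here.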

This includes a comprehensive benchmark suite for all public Set
methods.  As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:

* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)

These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call.  I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).
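
A minimal sketch of how the small-set cases listed above could be timed by hand. This is not the benchmark suite from the pull request; the helper name `time_per_call`, the iteration count, and the labels are all assumptions, and the absolute numbers will vary by machine and Ruby version.

```ruby
require "set" # harmless where Set is already core or autoloaded

# Hypothetical helper: average wall-clock seconds per block invocation.
def time_per_call(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  elapsed / iterations
end

small = Set[1, 2, 3]
iterations = 100_000

results = {
  "Set.new"   => time_per_call(iterations) { Set.new },
  "Set#dup"   => time_per_call(iterations) { small.dup },
  "Set#clone" => time_per_call(iterations) { small.clone },
  # Includes the cost of the dup, since compare_by_identity mutates the set.
  "Set#dup + compare_by_identity" => time_per_call(iterations) { small.dup.compare_by_identity },
}

results.each do |label, seconds|
  puts format("%-32s %10.1f ns/call", label, seconds * 1e9)
end
```

Running the same script under a Ruby with stdlib Set and one with core Set gives a rough comparison for these four cases.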

The rbs and repl_type_completor bundled gems will need updates to
support core Set.  The pull request marks them as allowed failures.

This passes all set tests with no changes.  The following specs
needed modification:

* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
  object as both the first and second argument (this seems like an issue
  with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns
  `true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
   function).
* `Set#flatten_merge` protected method is not implemented.

Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`.  This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.

This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting the switch to a core class.  This does not move the spec
files, as I'm not sure how they should be handled.

Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.

For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table.  To do that, include
internal/set_table.h.  To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.

The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.

Implements [Feature #21216]

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Co-authored-by: Oliver Nutter <mrnoname1000@riseup.net>
2025-04-26 10:31:11 +09:00
Jeremy Evans
00b1a9cde6 [ruby/pp] Rename EMPTY_HASH to EMPTY_KWHASH
https://github.com/ruby/pp/commit/efe5bc878f
2025-04-22 15:21:07 +00:00
Jeremy Evans
3ce5d89c20 [ruby/pp] Avoid an array allocation per element in list passed to seplist
The array allocation was because the keyword splat expression is
not recognized as safe by the compiler.  Also avoid unnecessary
>= method call per element.  This uses a private constant to
avoid unnecessary work at runtime.

I assume the only reason this code is needed is because v may
end with a ruby2_keywords hash that we do not want to treat as
keywords.

This issue was found by the performance warning in Ruby feature
21274.

https://github.com/ruby/pp/commit/3bf6df0e5c
2025-04-22 15:21:07 +00:00
Samuel Williams
021ccbf7e8 [ruby/pp] Ensure the thread local state is always set up.
(https://github.com/ruby/pp/pull/38)

https://github.com/ruby/pp/commit/5b5d483ac2
2025-02-25 03:38:04 +00:00
Hiroshi SHIBATA
e34163d7fe [ruby/pp] Bump up 0.6.2
https://github.com/ruby/pp/commit/979f9d972d
2024-12-03 04:52:19 +00:00
tomoya ishida
cd7c6c66b4 [ruby/pp] Simplify range nil check
https://github.com/ruby/pp/commit/3e4b7c03b0

Co-authored-by: Nobuyoshi Nakada <nobu.nakada@gmail.com>
2024-11-19 14:52:01 +00:00
tompng
7b51b3c75b [ruby/pp] Fix pretty printing range begin/end with false or nil
https://github.com/ruby/pp/commit/6d9c0f255a
2024-11-19 14:52:01 +00:00
Nobuyoshi Nakada
0de7e6ccb0 [ruby/pp] [DOC] Mark up the method name
https://github.com/ruby/pp/commit/e787cd9139
2024-11-19 14:43:33 +00:00
Nobuyoshi Nakada
492b379b52 [ruby/pp] [DOC] Add documents
https://github.com/ruby/pp/commit/dbf177d0fc
2024-11-19 12:34:47 +00:00
Hiroshi SHIBATA
43285f543b [ruby/pp] Bump up v0.6.1
https://github.com/ruby/pp/commit/812933668d
2024-11-14 02:22:14 +00:00
Hiroshi SHIBATA
9a55375df7 Removed unused variable
2024-11-12 11:31:40 +09:00
Hiroshi SHIBATA
400f78939c [ruby/pp] Bump up v0.6.0
https://github.com/ruby/pp/commit/af2229e8e6
2024-11-12 02:18:17 +00:00
Jean Boussier
83702f7157 [ruby/pp] Handle BasicObject
Right now, attempting to pretty print a BasicObject or any other
object lacking a few core Object methods will result in an error:

```
Error: test_basic_object(PPTestModule::PPInspectTest): NoMethodError: undefined method `is_a?' for an instance of BasicObject
lib/pp.rb:192:in `pp'
lib/pp.rb:97:in `block in pp'
lib/pp.rb:158:in `guard_inspect_key'
lib/pp.rb:97:in `pp'
test/test_pp.rb:131:in `test_basic_object'
     128:
     129:   def test_basic_object
     130:     a = BasicObject.new
  => 131:     assert_match(/\A#<BasicObject:0x[\da-f]+>\n\z/, PP.pp(a, ''.dup))
     132:   end
     133: end
     134:
```

With some fairly small changes we can fall back to `Object#inspect`,
which is better than an error.
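
One way such a fallback can be sketched, assuming nothing about the exact change in pp: borrow the `Kernel` implementations of the missing methods and bind them to the object. The variable names here are illustrative only.

```ruby
obj = BasicObject.new

# BasicObject has no is_a? or inspect, so calling them directly would raise
# NoMethodError. Kernel's implementations can still be bound to the object.
kernel_is_a    = ::Kernel.instance_method(:is_a?)
kernel_inspect = ::Kernel.instance_method(:inspect)

is_basic = kernel_is_a.bind_call(obj, ::BasicObject)
text     = kernel_inspect.bind_call(obj)

puts is_basic
puts text
```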

https://github.com/ruby/pp/commit/4e9f6c2de0
2024-11-12 02:13:15 +00:00
Jean Boussier
107a4da122 [ruby/pp] Data#pretty_print handle privated or removed members
[Bug #20808]

The previous implementation assumed all members are accessible,
but it's possible for users to change the visibility of members or
to entirely remove the accessor.
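
A small sketch of the situation described above, assuming a Ruby with `Data` (3.2+). The `Reading` class and its members are hypothetical, and the filtering shown is just one possible approach, not necessarily what pp does.

```ruby
# Hypothetical Data class with one private and one removed accessor.
Reading = Data.define(:sensor, :value, :secret) do
  private :value        # accessor still exists, but is no longer public
  undef_method :secret  # accessor removed entirely
end

reading = Reading.new(sensor: "t1", value: 21.5, secret: "x")

# Keep only members whose accessor still exists (public or private), then
# read them with __send__, which ignores visibility.
readable = reading.class.members.select { |m| reading.respond_to?(m, true) }
pairs = readable.to_h { |m| [m, reading.__send__(m)] }

p pairs
```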

https://github.com/ruby/pp/commit/fb19501434
2024-11-12 02:11:43 +00:00
tompng
f7343b636f prettyprint hash with colon style
2024-10-03 18:47:09 +09:00
Nobuyoshi Nakada
6e704311bb [ruby/pp] Extract pp_hash_pair
Extract the method that prints a single pair of a hash, to make extending
Hash pretty printing easier, separately from the Hash construct itself.

https://github.com/ruby/pp/commit/3fcf2d1142
2024-02-21 16:45:01 +00:00
Nobuyoshi Nakada
37b8fc7477 [ruby/pp] Get rid of hardcoded class name
So that the `pp` method can work in inherited classes with that
class.

https://github.com/ruby/pp/commit/f204df3aad
2024-02-21 16:45:00 +00:00
Samuel Giddins
e0312f90bb [ruby/pp] Print beginless ranges properly
Instead of displaying the start of the range as nil

https://github.com/ruby/pp/commit/1df210d903
2024-01-15 14:04:14 +00:00
Benoit Daloze
1ed3b60375 [ruby/pp] Fix pretty printing a Data subclass instance when the subclass is anonymous
* It would be "#<data  a=42>" (double space) instead of "#<data a=42>" (like #inspect).

https://github.com/ruby/pp/commit/bed72bfcb8
2024-01-11 13:44:01 +00:00
Benoit Daloze
62382a4345 [ruby/pp] Use .class.members for pretty printing Data
* Data#members might not be defined, instead it might be defined
  on Data subclasses or a module included there. This is notably the
  case on TruffleRuby which defines it there for optimization purposes.
  In fact the mere presence of Data#members implies a megamorphic call
  inside, so it seems best to avoid relying on its existence.

https://github.com/ruby/pp/commit/6a97d36fbb
2024-01-11 13:44:00 +00:00
Benoit Daloze
3b9cc22536 [ruby/pp] Use a proper feature check to check if Data is defined
https://github.com/ruby/pp/commit/ed602b9f2b
2024-01-11 13:44:00 +00:00
Hiroshi SHIBATA
0ac39f226d [ruby/pp] Bump up 0.5.0
https://github.com/ruby/pp/commit/6e086e6df9
2023-11-07 01:00:08 +00:00
OKURA Masafumi
bf1362306e [Doc] Improve documentation of PP
* Remove mention of `require 'pp'` for `pretty_inspect`
* Mention the need to add `require 'pp'` to customize
  `#pretty_print(pp)` method
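
For context, a minimal sketch of customizing `#pretty_print`. The `Point` class is hypothetical; `group`, `breakable`, `comma_breakable`, `text`, and `pp` are the standard PrettyPrint/PP building blocks.

```ruby
require "pp" # needed so the PP machinery is loaded before customizing

# Hypothetical class with a custom pretty printer.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  # q is the PP instance doing the printing.
  def pretty_print(q)
    q.group(1, "#<Point", ">") do
      q.breakable          # a space, or a newline if the group must wrap
      q.text "x="
      q.pp @x
      q.comma_breakable    # "," followed by a breakable
      q.text "y="
      q.pp @y
    end
  end
end

output = Point.new(1, 2).pretty_inspect
puts output # "#<Point x=1, y=2>"
```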
2023-10-25 16:49:09 +09:00
Hiroshi SHIBATA
3f8756484f [ruby/pp] Expose PP::VERSION
https://github.com/ruby/pp/commit/3d0e65e79f
2023-04-14 01:49:51 +00:00
manga_osyo
7b7e5153e8 [ruby/pp] [Feature #19045] Add support for Data#pretty_print
https://github.com/ruby/pp/commit/343a20d721
2022-10-14 21:31:24 +09:00
Nobuyoshi Nakada
c6cf19340a [ruby/pp] [DOC] Update for PP.width_for [Feature #12913]
https://github.com/ruby/pp/commit/cad3cc762c
2021-12-23 18:00:56 +09:00
Charles Oliver Nutter
73da1c5ea3 [ruby/pp] Use etc instead of .so for broader compatibility
The use of `etc.so` here requires that etc is always implemented
as a C extension on-disk. However at least one impl – JRuby –
currently implements it as an internal extension, loaded via a
Ruby script. This require should simply use the base name of the
library, `etc`, to allow Ruby-based implementations to load as
well.

https://github.com/ruby/pp/commit/2061f994e0
2021-12-18 08:38:58 +09:00
Charles Oliver Nutter
5a6baaba38 [ruby/pp] Only do RubyVM patches if class exists
This class does not exist in any implementation except CRuby.

I would recommend moving this code somewhere else, like a separate
file loaded only on CRuby or into CRuby itself. For now this
change is sufficient to load the library on other implementations.

https://github.com/ruby/pp/commit/7d5a220f64
2021-12-18 08:38:58 +09:00
Yusuke Endoh
3288f0d09e lib/pp.rb (width_for): Ignore all syscall errors
According to nobu, Errno::EBADF is raised on Windows.
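
A hedged sketch of the general idea (query the terminal width, and fall back to a default on any system call failure). The helper name `terminal_width` and the default of 80 columns are assumptions, not the actual `PP.width_for` code.

```ruby
require "stringio"

# Hypothetical helper: width of the terminal behind io, or a default.
def terminal_width(io, default: 80)
  require "io/console"         # adds IO#winsize where supported
  _rows, columns = io.winsize  # raises on non-terminals or unsupported platforms
  columns.positive? ? columns : default
rescue LoadError, NoMethodError, SystemCallError
  # Errno::* classes inherit from SystemCallError, so one rescue covers
  # EINVAL, EBADF, ENOTTY, and any other platform-specific failure.
  default
end

puts terminal_width($stdout)
puts terminal_width(StringIO.new) # StringIO has no winsize, so the default applies
```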
2021-11-30 13:46:08 +09:00
Yusuke Endoh
20065eabdb lib/pp.rb (width_for): ignore Errno::EINVAL
The error is raised on Solaris
http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20211130T030003Z.fail.html.gz
```
  1) Failure:
TestRubyOptions#test_require [/export/home/users/chkbuild/cb-gcc/tmp/build/20211130T030003Z/ruby/test/ruby/test_rubyoptions.rb:265]:
pid 7386 exit 1
| /export/home/users/chkbuild/cb-gcc/tmp/build/20211130T030003Z/ruby/lib/pp.rb:67:in `winsize': Invalid argument - <STDOUT> (Errno::EINVAL)
```
2021-11-30 13:17:54 +09:00
Yusuke Endoh
eac347fdb0 lib/pp.rb (PP.pp): Use io/console's winsize by default
[Feature #12913]
2021-11-30 11:43:54 +09:00
Hiroshi SHIBATA
17441a6b1b [ruby/pp] Support < Ruby 3.0
https://github.com/ruby/pp/commit/3ee131ae92
2021-04-21 20:44:55 +09:00
Kazuhiro NISHIYAMA
082114da05 [DOC] Add doc to sharing_detection= [ci skip]
Before:
```
$ ri sharing_detection=
= .sharing_detection=
(from ruby core)
=== Implementation from PP
------------------------------------------------------------------------
  sharing_detection=(b)
------------------------------------------------------------------------
Returns the sharing detection flag as a boolean value. It is false by
default.
```

After:
```
$ ri sharing_detection=
= .sharing_detection=

(from ruby core)
=== Implementation from PP
------------------------------------------------------------------------
  sharing_detection=(b)

------------------------------------------------------------------------

Sets the sharing detection flag to b.
```
2020-12-23 11:13:50 +09:00
Koichi Sasada
cae8bbfe62 pp is ractor-ready.
`@sharing_detection` is the only obstruction to supporting pp on
non-main ractors, so make it ractor-local.
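
A minimal sketch of ractor-local storage for such a flag, assuming Ruby 3.0 or later. The `PrinterSettings` module and its key are hypothetical names, not pp's internals.

```ruby
# Hypothetical settings module that keeps its flag per ractor.
module PrinterSettings
  KEY = :printer_settings_sharing_detection

  # Each ractor sees its own value; unset means false.
  def self.sharing_detection
    Ractor.current[KEY] || false
  end

  def self.sharing_detection=(flag)
    Ractor.current[KEY] = flag
  end
end

PrinterSettings.sharing_detection = true
p PrinterSettings.sharing_detection
```

Unlike a module-level instance variable, ractor-local storage can be read and written from non-main ractors.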
2020-12-22 23:32:18 +09:00
Jeremy Evans
28d31ead34 Fix pp when passed an empty ruby2_keywords-flagged hash as array element
This causes problems because the hash is passed to a block not
accepting keywords.  Because the hash is empty and keyword flagged,
it is removed before calling the block.  This doesn't cause an
ArgumentError because it is a block and not a lambda.  Just like
any other block not passed required arguments, arguments not
passed are set to nil.
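
The behavior described above can be sketched outside of pp as follows, assuming Ruby 3.0 or later. The helper names `relay_plain` and `relay_guarded` and the constant `NO_KEYWORDS` are hypothetical; the guarded variant shows one way to keep a trailing flagged hash positional by passing an explicit empty keyword splat.

```ruby
NO_KEYWORDS = {}.freeze # hypothetical constant used as an explicit empty keyword splat

# Splatting v lets a trailing keyword-flagged hash be treated as keywords.
def relay_plain(list)
  list.each { |*v| yield(*v) }
end

# With an explicit keyword splat, the trailing hash stays a positional argument.
def relay_guarded(list)
  list.each { |*v| yield(*v, **NO_KEYWORDS) }
end

flagged = Hash.ruby2_keywords_hash({}) # empty hash carrying the keyword flag

plain = []
relay_plain([flagged]) { |element| plain << element }

guarded = []
relay_guarded([flagged]) { |element| guarded << element }

p plain   # the empty flagged hash is dropped, so the block sees nil
p guarded # the hash arrives as an ordinary element
```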

Issues like this are a strong reason not to have ruby2_keywords
by default.

Fixes [Bug #16519]
2020-01-22 10:27:02 -08:00
Richard Viney
6a75a46053 Make prettyprint’s cycle detection aware of Delegator instances
Fixes [Bug #13144]

Co-Authored-By: Nobuyoshi Nakada <nobu@ruby-lang.org>
2019-12-16 23:43:49 +09:00
Jeremy Evans
ffd0820ab3 Deprecate taint/trust and related methods, and make the methods no-ops
This removes the related tests, and puts the related specs behind
version guards.  This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
2019-11-18 01:00:25 +02:00
John Hawthorn
ebbe396d3c Use ident hash for top-level recursion check
We track recursion in order to not infinite loop in ==, inspect, and
similar methods by keeping a thread-local 1 or 2 level hash. This allows
us to track when we have seen the same object (ex. using inspect) or
same pair of objects (ex. using ==) in this stack before and to treat
that differently.

Previously both levels of this Hash used the object's memory_id as a key
(using object_id would be slow and wasteful). Unfortunately, prettyprint
(pp.rb) uses this thread local variable to "pretend" to be inspect and
inherit its same recursion behaviour.

This commit changes the top-level hash to be an identity hash and to use
objects as keys instead of their object_ids.

I'd like to have also converted the 2nd level hash to an ident hash, but
it would have prevented an optimization which avoids allocating a 2nd
level hash for only a single element, which we want to keep because it's
by far the most common case.

So the new format of this hash is:

{ object => true } (not paired)
{ lhs_object => rhs_object_memory_id } (paired, single object)
{ lhs_object => { rhs_object_memory_id => true, ... } } (paired, many objects)

We must also update pp.rb to match this (using identity hashes).
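
To illustrate the identity-hash idea in plain Ruby (this is not the VM's recursion guard, only a sketch under that assumption), the hypothetical `cyclic?` helper below tracks objects currently being visited in a hash that compares keys by identity.

```ruby
# Equal but distinct objects are different keys in an identity hash.
seen = {}.compare_by_identity
a = [1, 2]
b = [1, 2]
seen[a] = true
same_object_found  = seen.key?(a)
equal_object_found = seen.key?(b)

# Hypothetical walker: does a nested Array contain itself somewhere?
def cyclic?(value, in_progress = {}.compare_by_identity)
  return false unless value.is_a?(Array)
  return true if in_progress.key?(value) # same object is already on the stack

  in_progress[value] = true
  found = value.any? { |child| cyclic?(child, in_progress) }
  in_progress.delete(value)
  found
end

loop_back = [1]
loop_back << loop_back # the array now contains itself

p same_object_found
p equal_object_found
p cyclic?([a, b])
p cyclic?(loop_back)
```

Keying on the object itself avoids calling `hash` or `eql?` on the tracked objects.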
2019-11-04 15:27:15 -08:00
Yusuke Endoh
c9fc82983c lib/pp.rb: Use UnboundMethod#bind_call instead of .bind(obj).call(...)
Related to [Feature #15955].
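
A short sketch of the difference, assuming Ruby 2.7 or later. The `Masked` class is hypothetical and exists only to show why an unbound method is useful when an object overrides a core method.

```ruby
# Hypothetical class that hides its real class by overriding #class.
class Masked
  def class
    String
  end
end

obj = Masked.new
real_class = Object.instance_method(:class) # Kernel#class, looked up through Object

via_bind      = real_class.bind(obj).call # allocates an intermediate Method object
via_bind_call = real_class.bind_call(obj) # same result without the Method object

p obj.class
p via_bind_call
```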
2019-08-30 11:13:00 +09:00
ktsj
9738f96fcf Introduce pattern matching [EXPERIMENTAL]
[ruby-core:87945] [Feature #14912]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67586 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-04-17 06:48:03 +00:00
mame
2c840bbfef lib/pp.rb (Range#pretty_print): support endless range
`pp(1..)` should print `"(1..)"` instead of `"(1..nil)"`.
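
A quick way to check this, assuming a Ruby with endless ranges (2.6+). The sketch only verifies that no `nil` end appears; the exact surrounding punctuation is not asserted here.

```ruby
require "pp"

endless = (1..)
output = endless.pretty_inspect # same text that pp would print, as a String

puts output
```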

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66143 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-03 01:39:45 +00:00
nobu
afa685398e Refine RubyVM::AbstractSyntaxTree::Node#type
* ast.c (rb_ast_node_type): simplified to return a Symbol without
  "NODE_" prefix.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-03 01:06:34 +00:00
nobu
87e1dd2982 Add RubyVM::AST#pretty_print
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-03 00:24:38 +00:00
nobu
a531c579f8 Requiring pp is not required now [ci skip]
- Followup of https://bugs.ruby-lang.org/issues/14123

From: Prathamesh Sonpatki <csonpatki@gmail.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-18 01:51:53 +00:00
mame
612af3b7cb lib/pp.rb: remove alias for suppressing a redefinition warning.
Because there is now the same guard in prelude.rb (alias pp pp).

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61111 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-11 04:46:57 +00:00
nobu
6d8f47fde1 lib/pp.rb: no rdoc of alias to suppress a warning
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61082 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-08 08:38:57 +00:00
nobu
60771d1315 pp.rb: rdoc
* lib/pp.rb (pp): move pp alias before its rdoc, not to prevent
  parsing.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61080 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-08 07:17:34 +00:00
akr
add309c496 Replace Kernel#pp after PP class is defined.
Avoid a race condition in which a context switch
occurs after replacing Kernel#pp but before
defining the PP class.

The following patch, which inserts a sleep, makes
this problem reproducible.

```
Index: lib/pp.rb
===================================================================
--- lib/pp.rb	(revision 60960)
+++ lib/pp.rb	(working copy)
@@ -26,6 +26,7 @@ module Kernel
   end
   undef __pp_backup__ if method_defined?(:__pp_backup__)
   module_function :pp
+  sleep 1 # thread context switch
 end
 
 ##
```

With the above patch, "uninitialized constant Kernel::PP" can
happen as follows.

```
% ./ruby -w -Ilib -e '
t1 = Thread.new {
  Thread.current.report_on_exception = true
  pp :foo1
}
t2 = Thread.new {
  Thread.current.report_on_exception = true
  sleep 0.5
  pp :foo2
}
t1.join rescue nil
t2.join rescue nil
'
#<Thread:0x000055dbf926eaa0@-e:6 run> terminated with exception:
Traceback (most recent call last):
	3: from -e:9:in `block in <main>'
	2: from /home/ruby/tst2/ruby/lib/pp.rb:22:in `pp'
	1: from /home/ruby/tst2/ruby/lib/pp.rb:22:in `each'
/home/ruby/tst2/ruby/lib/pp.rb:23:in `block in pp': uninitialized constant Kernel::PP (NameError)
:foo1
```



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-01 10:48:29 +00:00
mame
4ae87f0ad6 lib/pp.rb (Kernel#pp): Fix a race condition
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60948 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-12-01 00:41:17 +00:00
mame
23c1fccf83 prelude.rb: Add Kernel#pp, a trigger for lib/pp.rb
[Feature #14123]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60944 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2017-11-30 01:31:00 +00:00