1494 Commits

Author SHA1 Message Date
nobu
fb84b86be0 * string.c (chopped_length): early return for empty strings
[Bug #11391]

From: Franck Verrot <franck@verrot.fr>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67018 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-02-07 07:39:47 +00:00
kazu
bdbc8a8f12 Add more example of String#dump
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66906 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-22 12:43:57 +00:00
samuel
502f159421 Improvements to documentation.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66897 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 10:56:29 +00:00
mame
6891a1cd5d string.c (rb_str_dump): Fix the rdoc
* Officially states that String#dump is intended for round-trip.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66894 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-21 08:51:51 +00:00
nobu
d7976d1451 Use & instead of modulo
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66830 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-15 12:05:46 +00:00
shyouhei
d154bec0d5 setbyte / ungetbyte allow out-of-range integers
* string.c: String#setbyte to accept arbitrary integers [Bug #15460]

* io.c: ditto for IO#ungetbyte

* ext/strringio/stringio.c: ditto for StringIO#ungetbyte



git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66824 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-15 06:41:58 +00:00
nobu
50784a0a44 Defer escaping control char in error messages
* eval_error.c (print_errinfo): defer escaping control char in
  error messages until writing to stderr, instead of quoting at
  building the message.  [ruby-core:90853] [Bug #15497]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66753 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2019-01-08 09:08:31 +00:00
mame
94bdc4edf0 string.c: remove the deprecation warnings of String#bytes with block
And its friends: lines, chars, grapheme_clusters, and codepoints.
[Feature #6670] [ruby-core:90728]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66579 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-26 14:43:25 +00:00
mame
0df1de8b32 Revert "string.c: remove the deprecation warnings of String#bytes with block"
Forgot to write the ticket number in the commit log...

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66578 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-26 14:42:07 +00:00
mame
2b21744efa string.c: remove the deprecation warnings of String#bytes with block
And its friends: lines, chars, grapheme_clusters, and codepoints.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-26 08:52:19 +00:00
stomar
31dc65b275 string.c: [DOC] fix typos
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66375 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-12 22:04:48 +00:00
duerst
3628eae2e7 implement special behavior for Georgian for String#capitalize
The modern Georgian script is special in that it has an 'uppercase'
variant called MTAVRULI which can be used for emphasis of whole words,
for screamy headlines, and so on. However, in contrast to all other
bicameral scripts, there is no usage of capitalizing the first letter
in a word or a sentence. Words with mixed capitalization are not used
at all.

We therefore implement special behavior for String#capitalize. Formally,
we define String#capitalize as first applying String#downcase for the
whole string, then using titlecase on the first letter. Because Georgian
defines titlecase as the identity function both for MTAVRULI ('uppercase')
and Mkhedruli (lowercase), this results in String#capitalize being
equivalent to String#downcase for Georgian. This avoids undesirable
mixed case.

* enc/unicode.c: Actual implementation

* string.c: Add mention of this special case for documentation

* test/ruby/enc/test_case_mapping.rb: Add two tests, a general one
  that uses String#capitalize on some (including nonsensical)
  combinations of MTAVRULI and Mkhedruli, and a canary test to
  detect the potential assignment of characters to the currently
  open slots (holes) at U+1CBB and U+1CBC.

* test/ruby/enc/test_case_comprehensive.rb: Tweak generation of
  expectation data.

Together with r65933, this closes issue #14839.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66300 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-09 23:14:29 +00:00
naruse
e39a83a150 suppress warning: unused variable 'vbits'
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-06 10:42:35 +00:00
nobu
98e65d9d92 Prefer rb_check_arity when 0 or 1 arguments
Especially over checking argc then calling rb_scan_args just to
raise an ArgumentError.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66238 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-06 07:49:24 +00:00
shyouhei
37c22bd945 string.c: [DOC] deprecate String#crypt [ci skip] [Feature #14915]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66154 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-12-03 05:46:46 +00:00
svn
64148e66b6 * expand tabs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65957 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-24 12:26:11 +00:00
naruse
1f9731654e fix r65954; Keep tainty
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65956 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-24 12:26:07 +00:00
naruse
7850586af4 Don't use single byte optimization on grapheme clusters
Unicode Text Segmentation considers CRLF as a character. [Bug #15337]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65954 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-24 11:53:19 +00:00
shyouhei
953091a4b1 char is not unsigned
It seems that decades ago, ruby was written under assumption that
char is unsigned.  Which is of course a false assumption.  We
need to explicitly store a numeric value into an unsigned char
variable to tell we expect 0..255 value.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65900 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-21 08:51:39 +00:00
shyouhei
7213568733 string.c: setbyte silently ignores upper bits
The behaviour of String#setbyte has been depending on the width
of int, which is not portable.  Must check explicitly.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65804 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-19 09:52:46 +00:00
shyouhei
74fe1cc3d9 string.c: this assumption is false [ci skip]
Looking at the lines right above, it is clear than a blue sky
that we cannot assume `p` to be aligned at all when
UNALIGNED_WORD_ACCESS is true.  It is a wrong idea to use
__builtin_assume_aligned for that situation.

See also: https://travis-ci.org/ruby/ruby/jobs/451710732#L2007


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65592 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-07 05:23:03 +00:00
shyouhei
4a80c0540f adopt sanitizer API
These APIs are much like <valgrind/memcheck.h>. Use them to
fine-grain annotate the usage of our memory.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-11-06 10:06:07 +00:00
ko1
870363886f fix type.
* string.c (rb_str_format_m): should pass `int`.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65456 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 22:16:26 +00:00
ko1
312b105d0e introduce TransientHeap. [Bug #14858]
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
  theap is designed for Ruby's object system. theap is like Eden heap
  on generational GC terminology. theap allocation is very fast because
  it only needs to bump up pointer and deallocation is also fast because
  we don't do anything. However we need to evacuate (Copy GC terminology)
  if theap memory is long-lived. Evacuation logic is needed for each type.

  See [Bug #14858] for details.

* array.c: Now, theap for T_ARRAY is supported.

  ary_heap_alloc() tries to allocate memory area from theap. If this trial
  sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
  We don't need to free theap ptr.

* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
  if ary is allocated at theap, force evacuation to malloc'ed memory.
  It makes programs slow, but very compatible with current code because
  theap memory can be evacuated (theap memory will be recycled).

  If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
  instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
  will occur, use RARRAY_CONST_PTR().

(re-commit of r65444)


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65449 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 21:53:56 +00:00
svn
69b8ffcd5b * expand tabs.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65448 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 21:02:12 +00:00
ko1
7d359f9b69 revert r65444 and r65446 because of commit miss
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65447 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 21:01:55 +00:00
ko1
90ac549fa6 introduce TransientHeap. [Bug #14858]
* transient_heap.c, transient_heap.h: implement TransientHeap (theap).
  theap is designed for Ruby's object system. theap is like Eden heap
  on generational GC terminology. theap allocation is very fast because
  it only needs to bump up pointer and deallocation is also fast because
  we don't do anything. However we need to evacuate (Copy GC terminology)
  if theap memory is long-lived. Evacuation logic is needed for each type.

  See [Bug #14858] for details.

* array.c: Now, theap for T_ARRAY is supported.

  ary_heap_alloc() tries to allocate memory area from theap. If this trial
  sccesses, this array has theap ptr and RARRAY_TRANSIENT_FLAG is turned on.
  We don't need to free theap ptr.

* ruby.h: RARRAY_CONST_PTR() returns malloc'ed memory area. It menas that
  if ary is allocated at theap, force evacuation to malloc'ed memory.
  It makes programs slow, but very compatible with current code because
  theap memory can be evacuated (theap memory will be recycled).

  If you want to get transient heap ptr, use RARRAY_CONST_PTR_TRANSIENT()
  instead of RARRAY_CONST_PTR(). If you can't understand when evacuation
  will occur, use RARRAY_CONST_PTR().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65444 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-30 20:46:24 +00:00
stomar
8f0eb44d93 string.c: improve docs for String#strip and related
* string.c: [DOC] improve docs for String#{strip,lstrip,rstrip}{,!}:
  small clarification, avoid referring to the receiver as `str'
  (does not appear in the call-seq of the generated HTML docs),
  enable links for cross-references, simplify rdoc.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65382 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-26 20:30:09 +00:00
stomar
af7f9de4b9 array.c, file.c, string.c: [DOC] fix typos
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65185 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-19 21:35:51 +00:00
nobu
569fe2922f string.c: grapheme cluster regexp failure
* string.c (get_reg_grapheme_cluster): show error info and relax
  to rb_fatal from rb_bug.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-16 09:11:12 +00:00
stomar
41a486e966 string.c: [DOC] add example code for String#strip!
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65068 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-13 19:04:02 +00:00
stomar
ee49d04540 string.c: small doc improvement
* string.c: [DOC] move unaltered case for String#strip to the end,
  similar to other strip methods.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65067 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-13 19:02:51 +00:00
nobu
fa8b08b424 Prefer rb_fstring_lit over rb_fstring_cstr
The former states explicitly that the argument must be a literal,
and can optimize away `strlen` on all compilers.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65059 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-13 09:59:22 +00:00
nobu
83a01e6f52 Added comments to rb_setup_fake_str and rb_fstring_new [ci skip]
`ptr` for these functions must refer constant string literals.
Otherwise, the result string's content can be modified/discarded
unexpectedly.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65058 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-10-13 09:23:56 +00:00
marcandre
2521b079fa [DOC] Improve String#strip documentation.
Patch by Josh Goldberg. [Fix GH-1933] [ci skip]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64757 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-09-16 02:49:44 +00:00
shyouhei
22444ae9b1 move function declarations from insns.def to internal.h
Just avoid being loose.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63755 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-27 00:57:16 +00:00
stomar
7215cecfb5 string.c: [DOC] grammar fixes
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63632 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-11 20:16:27 +00:00
nobu
46d7dc1162 [Docs] Improve documentation of String#lines
* Document about optional getline arguments
* Add examples, especially for the demonstration of `chomp: true`
[Fix GH-1886]

From: Koki Takahashi <hakatasiloving@gmail.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63610 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-08 10:45:01 +00:00
normal
256411b47f String#uminus dedupes unconditionally
[Feature #14478] [ruby-core:85669]

Thanks-to: Sam Saffron <sam.saffron@gmail.com>

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63566 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-04 23:26:03 +00:00
nobu
ce2f4f8526 string.c: trivial optimizations
* string.c (rb_str_aset): prefer BUILTIN_TYPE over TYPE after
  SPECIAL_CONST_P check.

* string.c (rb_str_start_with): prefer RB_TYPE_P over switch by
  TYPE.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63543 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-01 06:53:26 +00:00
nobu
87ccf7e50a string.c: doc for [Feature #13712]
* string.c (rb_str_start_with): [DOC] start_with? example with
  regexp.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63541 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-06-01 06:37:14 +00:00
normal
2fd1525b0f string.c: MAYBE_UNUSED to suppress warnings for old
Building with HAVE_MALLOC_USABLE_SIZE currently makes
SIZED_REALLOC_N ignore the old size arg.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63487 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-05-22 01:58:47 +00:00
normal
0a4be5beda string.c: size hints for free and realloc calls
Another part of the plan to reduce dependencies on malloc_usable_size:
https://bugs.ruby-lang.org/issues/10238

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63485 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-05-22 01:13:08 +00:00
nobu
703a5dd3e0 string.c: adjust to rb_str_upto_each
* range.c (range_each_func): adjust the signature of the callback
  function to rb_str_upto_each, and exit the loop if the callback
  returned non-zero.

* string.c (rb_str_upto_endless_each): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63290 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-28 11:16:54 +00:00
nobu
2c8f16e6c0 string.c: fix scanned substring with \K
* string.c (scan_once): fix the matched substring with `\K`, the
  beginning of that string may differ from the matched position.
  [ruby-core:86663] [Bug #14707]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63252 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-24 12:25:46 +00:00
mame
7f95eed19e Introduce endless range [Feature#12912]
Typical usages:
```
p ary[1..]          # drop the first element; identical to ary[1..-1]
(1..).each {|n|...} # iterate forever from 1; identical to 1.step{...}
```

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63192 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-19 15:18:50 +00:00
nobu
7c35618c53 string.c: suppress warning
* string.c (str_undump): get rid of warning C4129 by VC.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63170 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-17 04:12:57 +00:00
nobu
cea438b0ca string.c: fix dumped suffix
* string.c (rb_str_dump): get rid of an error on evaling with
  frozen-string-literal enabled.  [ruby-core:86539] [Bug #14687]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63164 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-16 07:12:06 +00:00
nobu
5b2b1130cf string.c: fix checking order
* string.c (str_undump): check for suffix before if Unicode escape
  conflicts with it.  the message "but used force_encoding" sounds
  strange when it is not used.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63162 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-16 06:37:42 +00:00
stomar
5e99863393 string.c: [DOC] fix typo
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63160 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2018-04-14 16:50:42 +00:00