1823 Commits

Author SHA1 Message Date
nobu
bd6fe32691 string.c: $; name in error message
* string.c (rb_str_split_m): show $; name in error message when it
  is a wrong object.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55986 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-08-22 17:10:00 +00:00
duerst
31040a307e * string.c (String#downcase), NEWS: Mentioned that case mapping for all
of ISO-8859-1~16 is now supported. [ci skip]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55777 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-30 03:13:28 +00:00
nobu
c463366dfd rb_funcallv
* *.c: rename rb_funcall2 to rb_funcallv, except for extensions
  which are/will be/may be gems.  [Fix GH-1406]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55773 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-29 11:57:14 +00:00
ko1
9f60791a04 * vm_core.h: revisit the structure of frame, block and env.
[Bug #12628]

  This patch introduces many changes.

  * Introduce concept of "Block Handler (BH)" to represent
    passed blocks.

  * move rb_control_frame_t::flag to ep[0] (as a special local
    variable). These flags represent not only the frame type, but also
    env flags such as escaped.

  * rename `rb_block_t` to `struct rb_block`.

  * Make Proc, Binding and RubyVM::Env objects wb-protected.

  Check [Bug #12628] for more details.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55766 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-28 11:02:30 +00:00
nobu
a325876ad3 Fix Issues reported by PVS-Studio static analyzer
* vm.c (vm_set_main_stack): remove unnecessary check.  toplevel
  binding must be initialized.  [Bug #12611] (N1)
* win32/win32.c (w32_symlink): fix return type.  [Bug #12611] (N3)
* string.c (rb_str_split_m): simplify the condition.
  [Bug #12611] (N4)

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55729 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-22 10:55:22 +00:00
duerst
c6692d9410 * string.c (String#dump): Change escaping of non-ASCII characters in
UTF-8 to use upper-case four-digit hexadecimal escapes without braces
  where possible [Feature #12419].
* test/ruby/test_string.rb (test_dump): Add tests for above.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55728 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-22 08:13:38 +00:00
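The escaping rule in the String#dump entry above can be sketched in C as follows. This is an illustration only, not the code in string.c: the helper name `format_unicode_escape` is hypothetical, and it assumes that "where possible" means code points up to U+FFFF, with the braced form used for anything larger.

```c
#include <stdio.h>

/* Hypothetical helper, not part of string.c.
 * Writes an escape for code point `cp` into `buf` and returns the number
 * of characters written (excluding the terminating NUL).
 * Assumption: code points up to U+FFFF use the brace-less form with four
 * upper-case hex digits; larger code points fall back to braces. */
int format_unicode_escape(char *buf, size_t size, unsigned int cp)
{
    if (cp <= 0xFFFF)
        return snprintf(buf, size, "\\u%04X", cp);
    return snprintf(buf, size, "\\u{%X}", cp);
}
```

Under these assumptions, U+00E9 is written as a backslash followed by `u00E9`, while U+1F600 keeps the braced form.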
ngoto
20c4461d86 * string.c (str_buf_cat): Fix potential integer overflow of capa.
In addition, termlen is used instead of +1.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55692 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-15 13:08:54 +00:00
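The kind of overflow guard that the str_buf_cat entry above refers to might look like the sketch below. It is not the actual patch: `capa_fits` is a hypothetical name, and the check assumes lengths are held in a `long` and that the terminator length is added to the capacity instead of a fixed +1.

```c
#include <limits.h>

/* Hypothetical check, not the code in string.c.
 * Returns 1 when len + extra + termlen can be computed without
 * overflowing a long, and 0 otherwise. Each comparison is arranged
 * so that the subtraction on the right-hand side cannot overflow. */
int capa_fits(long len, long extra, long termlen)
{
    if (len < 0 || extra < 0 || termlen < 0) return 0;
    if (extra > LONG_MAX - len) return 0;
    if (termlen > LONG_MAX - len - extra) return 0;
    return 1;
}
```

A caller would test `capa_fits` before computing the new capacity and raise an error instead of allocating when it returns 0.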
ngoto
2bb292fccf * string.c (str_buf_cat): Fix capa size for embed string.
Fix bug in r55547. [Bug #12536]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55691 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-15 12:35:52 +00:00
normal
ed5401a696 string.c: reduce malloc overhead for default buffer size
* string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127
  [ruby-core:76371] [Feature #12025]
* string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans <code@jeremyevans.net>:

This changes the minimum buffer size for string buffers from 128 to
127.  The underlying C buffer is always 1 more than the ruby buffer,
so this changes the actual amount of memory used for the minimum
string buffer from 129 to 128.  This makes it much easier on the
malloc implementation, as evidenced by the following code (note that
time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
        3.10 real         2.19 user         0.46 sys
    289080  maximum resident set size
     72673  minor page faults
        13  block output operations
        29  voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
        2.64 real         2.09 user         0.27 sys
    162720  maximum resident set size
     40966  minor page faults
         2  block output operations
         4  voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one.  This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers.  Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage.  As measured above, memory usage is 44% less,
and performance is 17% better.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-14 23:30:29 +00:00
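The size arithmetic in the STR_BUF_MIN_SIZE entry above can be made concrete with a small sketch. Both helpers are hypothetical and assume a one-byte terminator, so the C allocation is always the Ruby capacity plus one.

```c
#include <stddef.h>

/* Hypothetical helper: bytes requested from malloc for a buffer whose
 * Ruby-level capacity is `capa`, assuming a one-byte terminator. */
size_t alloc_bytes(size_t capa)
{
    return capa + 1;
}

/* Hypothetical helper: the growth step described in the message,
 * doubling the capacity and adding one so that it stays odd. */
size_t grow_capa(size_t capa)
{
    return capa * 2 + 1;
}
```

With a minimum capacity of 127 the allocation is 128 bytes, and one growth step gives a capacity of 255 and an allocation of 256 bytes, so the sizes stay on powers of two. With the old minimum of 128 the allocation is 129 bytes.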
ngoto
5eff15d1bd * string.c (rb_str_change_terminator_length): New function to change
termlen and resize heap for the terminator. This is split from
  rb_str_fill_terminator (str_fill_term) because filling terminator
  and changing terminator length are different things. [Bug #12536]

* internal.h: declaration for rb_str_change_terminator_length.

* string.c (str_fill_term): Simplify only to zero-fill the terminator.
  For non-shared strings, it assumes that (capa + termlen) bytes of
  heap is allocated. This partially reverts r55557.

* encoding.c (rb_enc_associate_index): rb_str_change_terminator_length
  is used, and it should be called whenever the termlen is changed.

* string.c (str_capacity): New static function to return capacity
  of a string with the given termlen, because the termlen may
  sometimes be different from TERM_LEN(str) especially during
  changing termlen or filling terminator with specific termlen.

* string.c (rb_str_capacity): Use str_capacity.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-05 10:45:23 +00:00
ngoto
3418a277d8 * string.c: Partially reverts r55547 and r55555.
ChangeLog entries about the reverted changes are also deleted in this file.
  [Bug #12536] [ruby-dev:49699] [ruby-dev:49702]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55559 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 18:11:11 +00:00
ngoto
61f2ee0d90 * string.c (str_fill_term): When termlen increases, re-allocation
of memory for termlen should always be needed.
  In this fix, if possible, decrease capa instead of realloc.
  [Bug #12536] [ruby-dev:49699]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55557 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 17:32:21 +00:00
ngoto
a92a537bf4 * string.c: Specify termlen as far as possible.
Additional fix for [Bug #12536] [ruby-dev:49699].

* string.c (rb_usascii_str_new, rb_utf8_str_new): Specify termlen
  which is apparently 1 for the encodings.

* string.c (str_new0_cstr): New static function to create a String
  object from a C string with specifying termlen.

* string.c (rb_usascii_str_new_cstr, rb_utf8_str_new_cstr): Specify
  termlen by using new str_new0_cstr().

* string.c (str_new_static): Specify termlen from the given encoding
  when creating a new String object is needed.

* string.c (rb_tainted_str_new_with_enc): New function to create a
  tainted String object with the given encoding. This means that
  the termlen is correctly specified. Currently a static function.
  The function name might be renamed to rb_tainted_enc_str_new
  or rb_enc_tainted_str_new.

* string.c (rb_external_str_new_with_enc): Use encoding by using the
  above rb_tainted_str_new_with_enc().


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55555 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 11:24:11 +00:00
ngoto
10e28726a1 * string.c (rb_str_subseq, str_substr): When RSTRING_EMBED_LEN_MAX
is used, TERM_LEN(str) should be considered with it because
  embedded strings are also processed by TERM_FILL.
  Additional fix for [Bug #12536] [ruby-dev:49699].


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55552 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 04:50:38 +00:00
ngoto
6734a0c3d9 string.c: Add parentheses to avoid C source code ambiguity. [Bug #12536]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55551 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-07-01 03:58:51 +00:00
ngoto
f2ee22371b * string.c: Fix memory corruptions when using UTF-16/32 strings.
[Bug #12536] [ruby-dev:49699]

* string.c (TERM_LEN_MAX): Macro for the longest TERM_FILL length,
  the same as largest value of rb_enc_mbminlen(enc) among encodings.

* string.c (str_new, rb_str_buf_new, str_shared_replace): Allocate
  +TERM_LEN_MAX bytes instead of +1. This change may increase memory
  usage.

* string.c (rb_str_new_with_class): Use TERM_LEN of the "obj".

* string.c (rb_str_plus, rb_str_justify): Use str_new0 which is aware
  of termlen.

* string.c (str_shared_replace): Copy +termlen bytes instead of +1.

* string.c (rb_str_times): termlen should not be included in capa.

* string.c (RESIZE_CAPA_TERM): When using RSTRING_EMBED_LEN_MAX,
  termlen should be counted with it because embedded strings are
  also processed by TERM_FILL.

* string.c (rb_str_capacity, str_shared_replace, str_buf_cat): ditto.

* string.c (rb_str_drop_bytes, rb_str_setbyte, str_byte_substr): ditto.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55547 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-30 10:20:23 +00:00
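The idea behind the termlen changes above is that encodings with a minimum character width above one byte, such as UTF-16 and UTF-32, need a terminator wider than a single NUL. A minimal sketch of an allocation that reserves and zero-fills `termlen` bytes follows. The function `toy_alloc` is hypothetical and much simpler than str_new0 in string.c; in particular it does not guard `len + termlen` against overflow.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of string.c.
 * Copies `len` bytes from `src` into a fresh buffer and appends a
 * terminator of `termlen` zero bytes (for example 1 for UTF-8,
 * 2 for UTF-16, 4 for UTF-32). Returns NULL if malloc fails.
 * The caller must ensure len + termlen does not overflow. */
char *toy_alloc(const char *src, size_t len, size_t termlen)
{
    char *p = malloc(len + termlen);
    if (!p) return NULL;
    memcpy(p, src, len);
    memset(p + len, 0, termlen);
    return p;
}
```

Allocating only `len + 1` bytes and then zero-filling a two- or four-byte terminator would write past the end of the buffer, which is the class of memory corruption the entry above describes.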
nobu
bcf0a198f1 CASEMAP_DEBUG [ci skip]
* string.c (rb_str_casemap, rb_str_ascii_casemap): move
  debug/tuning messages under a preprocessor condition,
  CASEMAP_DEBUG.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55483 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-21 08:19:59 +00:00
nobu
3a6bb56029 Fix garbage allocation
* string.c (rb_str_casemap): do not put code with side effects
  inside RSTRING_PTR() macro which evaluates the argument multiple
  times.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55481 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-21 07:38:16 +00:00
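The hazard named in the rb_str_casemap entry above, a macro that evaluates its argument more than once, can be reproduced with a toy example. Everything here (`struct toy_str`, `TOY_PTR`, `make`) is hypothetical and only mimics the shape of a pointer-selecting macro; it is not the definition of RSTRING_PTR().

```c
/* Toy string with an embedded buffer or a heap pointer. */
struct toy_str { int embedded; char inline_buf[8]; char *heap_ptr; };

/* Hypothetical macro: `s` appears in the condition and in both branches,
 * so an argument expression is evaluated twice per use. */
#define TOY_PTR(s) ((s)->embedded ? (s)->inline_buf : (s)->heap_ptr)

static int calls;

/* Stand-in for an expression with a side effect, such as an allocation. */
struct toy_str *make(struct toy_str *s) { calls++; return s; }

/* Side effect inside the macro argument: make() runs twice. */
int count_evals_unsafe(struct toy_str *s)
{
    calls = 0;
    (void)TOY_PTR(make(s));
    return calls;
}

/* Side effect hoisted into a temporary first: make() runs once. */
int count_evals_safe(struct toy_str *s)
{
    struct toy_str *t;
    calls = 0;
    t = make(s);
    (void)TOY_PTR(t);
    return calls;
}
```

If `make` allocated an object, the unsafe form would allocate two and discard one, which matches the "garbage allocation" in the commit title.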
naruse
8272729977 * string.c (rb_str_casemap): fix memory leak.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55480 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-21 07:14:05 +00:00
naruse
9d291c82e5 * string.c (rb_str_casemap): int is too small for string size.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55479 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-21 07:14:04 +00:00
nobu
1cbc622ea7 string.c: adjust buffer size
* string.c (tr_trans): adjust buffer size by processed and rest
  lengths, instead of doubling repeatedly.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55428 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-16 03:17:54 +00:00
nobu
cc9f1e9195 string.c: fix terminator
* string.c (tr_trans): consider terminator length and fix heap
  overflow.  reported by Guido Vranken <guido AT guidovranken.nl>.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55427 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-16 02:15:27 +00:00
nobu
aaf8c09900 Fix typo in string.c [ci skip]
* string.c (rb_str_oct): [DOC] fix typo, hornored -> honored.
  [Fix GH-1379]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55378 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-11 06:02:46 +00:00
duerst
02f7ad6237 * enc/iso_8859_1.c: Implement non-ASCII case mapping.
* test/ruby/enc/test_case_comprehensive.rb: Tests for above.
* string.c: Add iso-8859-1 to supported encodings.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55373 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-11 00:46:21 +00:00
duerst
10174c295b * string.c: Special-case :ascii option in rb_str_capitalize_bang and
rb_str_swapcase_bang.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55361 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-10 08:35:17 +00:00
duerst
13f576d6b9 * string.c: Special-case :ascii option in rb_str_upcase_bang (retry).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55359 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-10 08:12:28 +00:00
nobu
2667d1b38f hash.c: ensure NUL-terminated for ENV
* hash.c (get_env_cstr): ensure NUL-terminated.
  [ruby-dev:49655] [Bug #12475]
* string.c (rb_str_fill_terminator): return the pointer to the
  NUL-terminated content.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55345 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-10 05:48:38 +00:00
kazu
075cf3d2e8 string.c (rb_str_ascii_casemap): fix compile error.
error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55332 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-08 14:11:17 +00:00
duerst
872f9a498f * string.c: Revert previous commit (possibility of endless loop).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55331 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-08 13:22:28 +00:00
duerst
5eb73eeda8 * string.c: Special-case :ascii option in rb_str_upcase_bang.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55330 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-08 12:57:44 +00:00
duerst
f0fc6ec872 * string.c: New static function rb_str_ascii_casemap; special-casing
:ascii option in rb_str_upcase_bang and rb_str_downcase_bang.
* regenc.c: Fix a bug (wrong use of unnecessary slack at end of string).
* regenc.h -> include/ruby/oniguruma.h: Move declaration of
  onigenc_ascii_only_case_map so that it is visible in string.c.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55329 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-08 12:28:42 +00:00
duerst
8743f010c6 * string.c (rb_str_upcase_bang, rb_str_capitalize_bang,
rb_str_swapcase_bang): Switch to use primitive.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-07 08:18:42 +00:00
duerst
53a3e3ddd9 * string.c (rb_str_downcase_bang): Switch to use primitive except if
conversion can be done ASCII-only.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55308 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-07 07:44:19 +00:00
duerst
ab5f23f26c * string.c: Added UTF-16BE/LE and UTF-32BE/LE to supported encodings
for Unicode case mapping.
* test/ruby/enc/test_case_comprehensive.rb: Tests for above
  functionality; fixed an encoding issue in assertion error message.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55296 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-06 09:36:36 +00:00
duerst
2f49aa8f62 * string.c: Change rb_str_casemap to use encoding primitive
case_map instead of directly calling onigenc_unicode_case_map.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55293 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-06 04:37:10 +00:00
duerst
c5ea268264 * string.c: Remove :lithuanian guard for Unicode case mapping.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55277 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-05 05:46:37 +00:00
nobu
40c3c3ec6c crypt.h: remove initialized
* missing/crypt.h (struct crypt_data): remove unnecessary member
  "initialized".
* missing/crypt.c (des_setkey_r): nothing to be initialized in
  crypt_data.
* configure.in (struct crypt_data): check for "initialized" in
  struct crypt_data, which may be only in glibc, and isn't on AIX
  at least.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55272 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-04 01:54:54 +00:00
duerst
3dd98b2446 * string.c: Raise ArgumentError when invalid string is detected in
case mapping methods.
* enc/unicode.c: Check for invalid string and signal with negative
  length value.
* test/ruby/enc/test_case_mapping.rb: Add tests for above.
* test/ruby/test_m17n_comb.rb: Add a message to clarify test failure.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55253 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-02 01:24:52 +00:00
nobu
a94201243e string.c: fallback to crypt_r
* string.c: prefer crypt_r to crypt iff neither system crypt nor
  crypt_r is provided.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55250 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-01 13:17:31 +00:00
nobu
a8bfa9bdf1 use system crypt
* configure.in: revert r55237.  replace crypt, not crypt_r, and
  check if crypt is broken more.
* missing/crypt.c: move crypt_r.c
* string.c (rb_str_crypt): use crypt_r if provided by the system.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55245 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-01 06:58:21 +00:00
nobu
3c31685e11 use crypt_r
* string.c (rb_str_crypt): use reentrant crypt_r.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55237 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-06-01 00:48:08 +00:00
naruse
e6ff652ce8 Revert r55225
Run test-all before large commit:
"* string.c: Activate full Unicode case mapping for UTF-8 by removing"

This reverts commit 3fb0fcd1e881c1f6dd74db73a64e8623208acb77.
http://rubyci.s3.amazonaws.com/centos5-64/ruby-trunk/log/20160531T013303Z.fail.html.gz

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55226 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-31 02:56:09 +00:00
duerst
3fb0fcd1e8 * string.c: Activate full Unicode case mapping for UTF-8 by removing
the protective check for the presence of an option.
  Update documentation.
* test/ruby/enc/test_case_comprehensive.rb: Adjust tests for above change.


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55225 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-31 01:10:06 +00:00
duerst
ae4fba3167 * string.c: Document current behavior for other case mapping methods
on String. [ci skip]


git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55217 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 12:15:41 +00:00
duerst
85950c5257 * string.c: Document current situation for String#downcase. [ci skip]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55215 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 11:00:26 +00:00
nobu
79a85b18cc string.c: return reallocated pointer
* string.c (str_fill_term): return new pointer reallocated by
  filling terminator.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 07:20:28 +00:00
nobu
9ac5f9135a string.c: get rid of unnecessary empty string
* string.c (str_substr, rb_str_aref): refactor not to create
  unnecessary empty string.
* string.c (str_byte_substr, str_byte_aref): ditto.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55209 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 05:50:27 +00:00
nobu
e3e8cae9be string.c: check in the order
* string.c (rb_str_aref_m, rb_str_byteslice): check arguments in
  the left-to-right order.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55208 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-30 05:41:02 +00:00
nobu
4fad63da01 transcode.c: scrub in the given encoding
* transcode.c (str_transcode0): scrub in the given encoding when
  the source encoding is given, not in the encoding of the
  receiver.  [ruby-core:75732] [Bug #12431]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55181 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-27 08:09:46 +00:00
nobu
b493d156de string.c: integer overflow
* string.c (rb_str_modify_expand): check integer overflow.
  [ruby-core:75592] [Bug #12390]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55054 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
2016-05-18 05:52:40 +00:00