[DOC] document line continuation.

Document details of escape sequences including line continuation.

[Bug #20518]
Tanaka Akira 2024-06-07 21:53:11 +09:00
parent 547233fb6e
commit 5e1001f754

@ -138,19 +138,18 @@ Also \Rational numbers may be imaginary numbers.
== Strings == Strings
=== \String Literals === Escape Sequences
The most common way of writing strings is using <tt>"</tt>: Some characters can be represented as escape sequences in
double-quoted strings,
character literals,
here document literals (non-quoted, double-quoted, and with backticks),
double-quoted symbols,
double-quoted symbol keys in Hash literals,
Regexp literals, and
several percent literals (<tt>%</tt>, <tt>%Q</tt>, <tt>%W</tt>, <tt>%I</tt>, <tt>%r</tt>, <tt>%x</tt>).
"This is a string." They allow escape sequences such as <tt>\n</tt> for
The string may be many lines long.
Any internal <tt>"</tt> must be escaped:
"This string has a quote: \". As you can see, it is escaped"
Double-quote strings allow escaped characters such as <tt>\n</tt> for
newline, <tt>\t</tt> for tab, etc. The full list of supported escape newline, <tt>\t</tt> for tab, etc. The full list of supported escape
sequences are as follows: sequences are as follows:
@ -174,11 +173,31 @@ sequences are as follows:
\M-\cx same as above \M-\cx same as above
\c\M-x same as above \c\M-x same as above
\c? or \C-? delete, ASCII 7Fh (DEL) \c? or \C-? delete, ASCII 7Fh (DEL)
\<newline> continuation line (empty string)
Any other character following a backslash is interpreted as the The last one, <tt>\<newline></tt>, represents an empty string instead of a character.
It is used to fold a line in a string.
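As a minimal sketch of line folding, assuming a double-quoted string and a non-quoted here document (the variable names are illustrative only):

```ruby
# A backslash immediately before the newline folds the line:
# neither the backslash nor the newline ends up in the string.
folded = "abc\
def"

# Without the backslash, the newline is kept.
unfolded = "abc
def"

# The same folding applies in a non-quoted here document.
heredoc = <<EOS
one \
two
EOS

p folded    # "abcdef"
p unfolded  # "abc\ndef"
p heredoc   # "one two\n"
```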
=== Double-quoted \String Literals
The most common way of writing strings is using <tt>"</tt>:
"This is a string."
The string may be many lines long.
Any internal <tt>"</tt> must be escaped:
"This string has a quote: \". As you can see, it is escaped"
Double-quoted strings allow escape sequences described in
{Escape Sequences}[#label-Escape+Sequences].
In a double-quoted string,
any other character following a backslash is interpreted as the
character itself. character itself.
Double-quote strings allow interpolation of other values using Double-quoted strings allow interpolation of other values using
<tt>#{...}</tt>: <tt>#{...}</tt>:
"One plus one is two: #{1 + 1}" "One plus one is two: #{1 + 1}"
@ -190,8 +209,14 @@ You can also use <tt>#@foo</tt>, <tt>#@@foo</tt> and <tt>#$foo</tt> as a
shorthand for, respectively, <tt>#{ @foo }</tt>, <tt>#{ @@foo }</tt> and shorthand for, respectively, <tt>#{ @foo }</tt>, <tt>#{ @@foo }</tt> and
<tt>#{ $foo }</tt>. <tt>#{ $foo }</tt>.
See also:
* {% and %Q: Interpolable String Literals}[#label-25+and+-25Q-3A+Interpolable+String+Literals]
=== Single-quoted \String Literals
Interpolation may be disabled by escaping the "#" character or using Interpolation may be disabled by escaping the "#" character or using
single-quote strings: single-quoted strings:
'#{1 + 1}' #=> "\#{1 + 1}" '#{1 + 1}' #=> "\#{1 + 1}"
@ -199,6 +224,16 @@ In addition to disabling interpolation, single-quoted strings also disable all
escape sequences except for the single-quote (<tt>\'</tt>) and backslash escape sequences except for the single-quote (<tt>\'</tt>) and backslash
(<tt>\\\\</tt>). (<tt>\\\\</tt>).
In a single-quoted string,
any other character following a backslash is interpreted as is:
a backslash and the character itself.
See also:
* {%q: Non-Interpolable String Literals}[#label-25q-3A+Non-Interpolable+String+Literals]
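The difference between the two quoting styles can be sketched as follows; <tt>\q</tt> is used here only as an example of a backslash pair that is not a defined escape sequence:

```ruby
# Double quotes: an unknown escape yields the character itself.
double = "a\qb"
# Single quotes: the backslash is kept together with the character.
single = 'a\qb'

# Only \' and \\ are recognized inside single quotes.
apostrophe = 'it\'s'
backslash  = 'a\\b'
not_newline = '\n'

p double             # "aqb"
p single.length      # 4 (a, backslash, q, b)
p apostrophe         # "it's"
p backslash.length   # 3
p not_newline.length # 2 (backslash and n)
```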
=== Literal String Concatenation
Adjacent string literals are automatically concatenated by the interpreter: Adjacent string literals are automatically concatenated by the interpreter:
"con" "cat" "en" "at" "ion" #=> "concatenation" "con" "cat" "en" "at" "ion" #=> "concatenation"
@ -211,10 +246,12 @@ be concatenated as long as a percent-string is not last.
%q{a} 'b' "c" #=> "abc" %q{a} 'b' "c" #=> "abc"
"a" 'b' %q{c} #=> NameError: uninitialized constant q "a" 'b' %q{c} #=> NameError: uninitialized constant q
=== Character Literal
There is also a character literal notation to represent single There is also a character literal notation to represent single
character strings, which syntax is a question mark (<tt>?</tt>) character strings, which syntax is a question mark (<tt>?</tt>)
followed by a single character or escape sequence that corresponds to followed by a single character or escape sequence (except continuation line)
a single codepoint in the script encoding: that corresponds to a single codepoint in the script encoding:
?a #=> "a" ?a #=> "a"
?abc #=> SyntaxError ?abc #=> SyntaxError
@ -228,11 +265,6 @@ a single codepoint in the script encoding:
?\C-\M-a #=> "\x81", same as above ?\C-\M-a #=> "\x81", same as above
?あ #=> "あ" ?あ #=> "あ"
See also:
* {%q: Non-Interpolable String Literals}[#label-25q-3A+Non-Interpolable+String+Literals]
* {% and %Q: Interpolable String Literals}[#label-25+and+-25Q-3A+Interpolable+String+Literals]
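A short sketch of character literals combined with escape sequences; the variable names are illustrative and the results are ordinary one-character strings:

```ruby
letter  = ?a        # a plain character
newline = ?\n       # an escape sequence for a single character
cap_a   = ?\u0041   # a Unicode escape for one codepoint
space   = ?\s

p letter         # "a"
p newline        # "\n"
p cap_a          # "A"
p space          # " "
p letter.class   # String
```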
=== Here Document Literals === Here Document Literals
If you are writing a large block of text you may use a "here document" or If you are writing a large block of text you may use a "here document" or
@ -283,9 +315,10 @@ its end is a multiple of eight. The amount to be removed is counted in terms
of the number of spaces. If the boundary appears in the middle of a tab, that of the number of spaces. If the boundary appears in the middle of a tab, that
tab is not removed. tab is not removed.
A heredoc allows interpolation and escaped characters. You may disable A heredoc allows interpolation and the escape sequences described in
interpolation and escaping by surrounding the opening identifier with single {Escape Sequences}[#label-Escape+Sequences].
quotes: You may disable interpolation and the escaping by surrounding the opening
identifier with single quotes:
expected_result = <<-'EXPECTED' expected_result = <<-'EXPECTED'
One plus one is #{1 + 1} One plus one is #{1 + 1}
@ -326,12 +359,15 @@ details on what symbols are and when ruby creates them internally.
You may reference a symbol using a colon: <tt>:my_symbol</tt>. You may reference a symbol using a colon: <tt>:my_symbol</tt>.
You may also create symbols by interpolation: You may also create symbols by interpolation and escape sequences described in
{Escape Sequences}[#label-Escape+Sequences] with double-quotes:
:"my_symbol1" :"my_symbol1"
:"my_symbol#{1 + 1}" :"my_symbol#{1 + 1}"
:"foo\sbar"
Like strings, a single-quote may be used to disable interpolation: Like strings, a single-quote may be used to disable interpolation and
escape sequences:
:'my_symbol#{1 + 1}' #=> :"my_symbol\#{1 + 1}" :'my_symbol#{1 + 1}' #=> :"my_symbol\#{1 + 1}"
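As a rough illustration of the difference between the two quoting styles for symbols (variable names are illustrative):

```ruby
spaced  = :"foo\sbar"            # \s is the escape sequence for a space
dynamic = :"my_symbol#{1 + 1}"   # interpolation is evaluated
literal = :'foo\sbar'            # single quotes: backslash and s are kept

p spaced               # :"foo bar"
p dynamic              # :my_symbol2
p literal.to_s.length  # 8
```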
@ -451,9 +487,12 @@ may use these paired delimiters:
* <tt>(</tt> and <tt>)</tt>. * <tt>(</tt> and <tt>)</tt>.
* <tt>{</tt> and <tt>}</tt>. * <tt>{</tt> and <tt>}</tt>.
* <tt><</tt> and <tt>></tt>. * <tt><</tt> and <tt>></tt>.
* Any other character, as both beginning and ending delimiters. * Non-alphanumeric ASCII character except above, as both beginning and ending delimiters.
The first four pairs (brackets, parenthesis, braces, and angle brackets) can be nested. The delimiters can be escaped with a backslash.
However, the first four pairs (brackets, parentheses, braces, and
angle brackets) are allowed without a backslash as long as they are correctly
paired.
These are demonstrated in the next section. These are demonstrated in the next section.
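A brief sketch of the delimiter rules, using <tt>%</tt> and <tt>%q</tt> only as examples of percent literals:

```ruby
# Correctly paired delimiters may appear inside without a backslash.
nested  = %(a (nested) pair)
# An unpaired delimiter has to be escaped.
escaped = %(an unpaired \) paren)
# Any other non-alphanumeric ASCII character serves as both delimiters.
bang    = %!hello world!
pipe    = %q|a\|b|

p nested   # "a (nested) pair"
p escaped  # "an unpaired ) paren"
p bang     # "hello world"
p pipe     # "a|b"
```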
@ -462,18 +501,21 @@ These are demonstrated in the next section.
You can write a non-interpolable string with <tt>%q</tt>. You can write a non-interpolable string with <tt>%q</tt>.
The created string is the same as if you created it with single quotes: The created string is the same as if you created it with single quotes:
%[foo bar baz] # => "foo bar baz" # Using []. %q[foo bar baz] # => "foo bar baz" # Using [].
%(foo bar baz) # => "foo bar baz" # Using (). %q(foo bar baz) # => "foo bar baz" # Using ().
%{foo bar baz} # => "foo bar baz" # Using {}. %q{foo bar baz} # => "foo bar baz" # Using {}.
%<foo bar baz> # => "foo bar baz" # Using <>. %q<foo bar baz> # => "foo bar baz" # Using <>.
%|foo bar baz| # => "foo bar baz" # Using two |. %q|foo bar baz| # => "foo bar baz" # Using two |.
%:foo bar baz: # => "foo bar baz" # Using two :. %q:foo bar baz: # => "foo bar baz" # Using two :.
%q(1 + 1 is #{1 + 1}) # => "1 + 1 is \#{1 + 1}" # No interpolation. %q(1 + 1 is #{1 + 1}) # => "1 + 1 is \#{1 + 1}" # No interpolation.
%q[foo[bar]baz] # => "foo[bar]baz" # brackets can be nested. %q[foo[bar]baz] # => "foo[bar]baz" # brackets can be nested.
%q(foo(bar)baz) # => "foo(bar)baz" # parenthesis can be nested. %q(foo(bar)baz) # => "foo(bar)baz" # parenthesis can be nested.
%q{foo{bar}baz} # => "foo{bar}baz" # braces can be nested. %q{foo{bar}baz} # => "foo{bar}baz" # braces can be nested.
%q<foo<bar>baz> # => "foo<bar>baz" # angle brackets can be nested. %q<foo<bar>baz> # => "foo<bar>baz" # angle brackets can be nested.
This is similar to a single-quoted string, but only backslashes and
the specified delimiters can be escaped with a backslash.
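The escaping rules of <tt>%q</tt> can be sketched like this, assuming parentheses as the delimiters:

```ruby
quote     = %q(it's)   # a single quote needs no escape here
backslash = %q(a\\b)   # \\ becomes one backslash
newline   = %q(a\nb)   # \n is not an escape: backslash and n are kept
delimiter = %q(a\)b)   # an escaped delimiter becomes the delimiter

p quote             # "it's"
p backslash.length  # 3
p newline.length    # 4
p delimiter         # "a)b"
```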
=== <tt>% and %Q</tt>: Interpolable String Literals === <tt>% and %Q</tt>: Interpolable String Literals
You can write an interpolable string with <tt>%Q</tt> You can write an interpolable string with <tt>%Q</tt>
@ -482,15 +524,22 @@ or with its alias <tt>%</tt>:
%[foo bar baz] # => "foo bar baz" %[foo bar baz] # => "foo bar baz"
%(1 + 1 is #{1 + 1}) # => "1 + 1 is 2" # Interpolation. %(1 + 1 is #{1 + 1}) # => "1 + 1 is 2" # Interpolation.
This is similar to a double-quoted string.
It allows the escape sequences described in
{Escape Sequences}[#label-Escape+Sequences].
Other escaped characters (a backslash followed by a character) are
interpreted as the character itself.
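For illustration, a small sketch of escapes and interpolation in <tt>%Q</tt> and <tt>%</tt>; <tt>\q</tt> stands in for any backslash pair that is not a defined escape sequence:

```ruby
tabbed  = %Q(a\tb)                # \t is the tab escape sequence
sum     = %(1 + 1 is #{1 + 1})    # interpolation is evaluated
unknown = %Q(\q)                  # not an escape sequence: just the character

p tabbed.length  # 3 (a, tab, b)
p sum            # "1 + 1 is 2"
p unknown        # "q"
```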
=== <tt>%w and %W</tt>: String-Array Literals === <tt>%w and %W</tt>: String-Array Literals
You can write an array of strings with <tt>%w</tt> (non-interpolable) You can write an array of strings as whitespace-separated words
or <tt>%W</tt> (interpolable): with <tt>%w</tt> (non-interpolable) or <tt>%W</tt> (interpolable):
%w[foo bar baz] # => ["foo", "bar", "baz"] %w[foo bar baz] # => ["foo", "bar", "baz"]
%w[1 % *] # => ["1", "%", "*"] %w[1 % *] # => ["1", "%", "*"]
# Use backslash to embed spaces in the strings. # Use backslash to embed spaces in the strings.
%w[foo\ bar baz\ bat] # => ["foo bar", "baz bat"] %w[foo\ bar baz\ bat] # => ["foo bar", "baz bat"]
%W[foo\ bar baz\ bat] # => ["foo bar", "baz bat"]
%w(#{1 + 1}) # => ["\#{1", "+", "1}"] %w(#{1 + 1}) # => ["\#{1", "+", "1}"]
%W(#{1 + 1}) # => ["2"] %W(#{1 + 1}) # => ["2"]
@ -498,18 +547,40 @@ or <tt>%W</tt> (interpolable):
# (not nested array). # (not nested array).
%w[foo[bar baz]qux] # => ["foo[bar", "baz]qux"] %w[foo[bar baz]qux] # => ["foo[bar", "baz]qux"]
The following characters are treated as whitespace that separates words:
* space, ASCII 20h (SPC)
* form feed, ASCII 0Ch (FF)
* newline (line feed), ASCII 0Ah (LF)
* carriage return, ASCII 0Dh (CR)
* horizontal tab, ASCII 09h (TAB)
* vertical tab, ASCII 0Bh (VT)
The white space characters can be escaped with a backslash to make them
part of a word.
<tt>%W</tt> allows the escape sequences described in
{Escape Sequences}[#label-Escape+Sequences].
However, the continuation line <tt>\<newline></tt> cannot be used because
it is interpreted as the escaped newline described above.
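A small sketch of word separation and escaping in string-array literals (variable names are illustrative). The last literal shows the escaped newline case; going by the description above, the newline should remain inside a single word rather than fold the line:

```ruby
# A newline separates words just like a space does.
lines  = %w[foo bar
baz]
# An escaped space becomes part of the word.
spaced = %w[foo\ bar baz]
# In %W, \t is an escape sequence and produces a tab inside the word.
tabbed = %W[a\tb c]
# Backslash followed by a newline: an escaped newline, not a continuation.
kept = %W[a\
b]

p lines   # ["foo", "bar", "baz"]
p spaced  # ["foo bar", "baz"]
p tabbed  # ["a\tb", "c"]
p kept
```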
=== <tt>%i and %I</tt>: Symbol-Array Literals === <tt>%i and %I</tt>: Symbol-Array Literals
You can write an array of symbols with <tt>%i</tt> (non-interpolable) You can write an array of symbols as whitespace-separated words
or <tt>%I</tt> (interpolable): with <tt>%i</tt> (non-interpolable) or <tt>%I</tt> (interpolable):
%i[foo bar baz] # => [:foo, :bar, :baz] %i[foo bar baz] # => [:foo, :bar, :baz]
%i[1 % *] # => [:"1", :%, :*] %i[1 % *] # => [:"1", :%, :*]
# Use backslash to embed spaces in the symbols. # Use backslash to embed spaces in the symbols.
%i[foo\ bar baz\ bat] # => [:"foo bar", :"baz bat"] %i[foo\ bar baz\ bat] # => [:"foo bar", :"baz bat"]
%I[foo\ bar baz\ bat] # => [:"foo bar", :"baz bat"]
%i(#{1 + 1}) # => [:"\#{1", :+, :"1}"] %i(#{1 + 1}) # => [:"\#{1", :+, :"1}"]
%I(#{1 + 1}) # => [:"2"] %I(#{1 + 1}) # => [:"2"]
The white space characters and their escapes are interpreted in the same way
as in the string-array literals described in
{%w and %W: String-Array Literals}[#label-25w+and+-25W-3A+String-Array+Literals].
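The same rules can be sketched for symbol arrays (variable names are illustrative):

```ruby
plain  = %i[foo\ bar baz]       # escaped space becomes part of the symbol
interp = %I[n#{1 + 1} x\ y]     # %I also evaluates interpolation

p plain   # [:"foo bar", :baz]
p interp  # [:n2, :"x y"]
```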
=== <tt>%s</tt>: Symbol Literals === <tt>%s</tt>: Symbol Literals
You can write a symbol with <tt>%s</tt>: You can write a symbol with <tt>%s</tt>:
@ -517,6 +588,10 @@ You can write a symbol with <tt>%s</tt>:
%s[foo] # => :foo %s[foo] # => :foo
%s[foo bar] # => :"foo bar" %s[foo bar] # => :"foo bar"
This is non-interpolable: no interpolation is allowed.
Only backslashes and the specified delimiters can be escaped with a backslash.
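A minimal sketch of these rules for <tt>%s</tt> (variable names are illustrative):

```ruby
plain     = %s[foo bar]     # whitespace is part of the symbol
no_interp = %s(a#{1 + 1})   # the interpolation syntax is kept literally
delim     = %s[a\]b]        # an escaped delimiter becomes the delimiter

p plain           # :"foo bar"
p no_interp.to_s  # "a\#{1 + 1}"
p delim           # :"a]b"
```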
=== <tt>%r</tt>: Regexp Literals === <tt>%r</tt>: Regexp Literals
You can write a regular expression with <tt>%r</tt>; You can write a regular expression with <tt>%r</tt>;
@ -541,4 +616,10 @@ See {Regexp modes}[rdoc-ref:Regexp@Modes] for details.
You can write and execute a shell command with <tt>%x</tt>: You can write and execute a shell command with <tt>%x</tt>:
%x(echo 1) # => "1\n" %x(echo 1) # => "1\n"
%x[echo #{1 + 2}] # => "3\n"
%x[echo \u0030] # => "0\n"
This is interpolable.
<tt>%x</tt> allows the escape sequences described in
{Escape Sequences}[#label-Escape+Sequences].