A regular expression can be a simple pattern, such as one requiring that a string contain the sequence of letters "cat", or it can be complex. There are many ways of creating a regular expression pattern. By far the most common is to write it between forward slashes. To match one of the special characters literally in a pattern, precede it with a backslash, like so: /\*/.
Because pattern matching returns nil when it fails, and because nil is equivalent to false in a boolean context, you can use the result of a pattern match as a condition in statements such as if and while:
str = "cat and dog"
if str =~ /cat/
  puts "There's a cat here somewhere"
end
You can test whether a pattern does not match a string using !~:
File.foreach("testfile").with_index do |line, index|
  # Print only the lines that do not contain "on"
  puts "#{index}: #{line}" if line !~ /on/
end
The sub method changes only the first match it finds. To replace all matches, use gsub.
str = "Dog and Cat"
new_str1 = str.sub(/a/, "*")
new_str2 = str.gsub(/a/, "*")
puts "Using sub: #{new_str1}" # => Using sub: Dog *nd Cat
puts "Using gsub: #{new_str2}" # => Using gsub: Dog *nd C*t
Unlike sub and gsub, sub! and gsub! return the string only if the pattern was matched. If no match for the pattern is found in the string, they return nil instead.
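The difference in return values can be sketched as follows. The string and variable names here are illustrative and not taken from the original text.

```ruby
# Illustrative values; the variable names are hypothetical.
greeting = "hello world"

plain_result = greeting.sub(/z/, "*")   # no match: sub returns an unchanged copy
bang_result  = greeting.sub!(/z/, "*")  # no match: sub! returns nil

puts plain_result.inspect  # => "hello world"
puts bang_result.inspect   # => nil

# When the pattern matches, gsub! modifies the receiver in place and returns it.
changed = greeting.gsub!(/o/, "0")
puts changed   # => hell0 w0rld
puts greeting  # => hell0 w0rld
```

Because the bang methods may return nil, chaining another string method directly onto their result is risky unless a match is guaranteed.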
You can also create regular expression objects by calling the Regexp class’s new method or by using the %r{...} syntax. The %r syntax is particularly useful when creating patterns that contain forward slashes:
/mm\/dd/ # => /mm\/dd/
Regexp.new("mm/dd") # => /mm\/dd/
%r{mm/dd} # => /mm\/dd/
Regular Expression Options
A regular expression may include one or more options that modify the way the pattern matches strings.
Option | Name | Description |
---|---|---|
i | Case insensitive | The pattern match will ignore the case of letters in the pattern and string. |
m | Multiline mode | The dot (.) matches any character, including newline. |
x | Extended mode | Allows one to insert spaces and newlines in the pattern to make it more readable. |
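A brief sketch of each option in use. The sample strings and variable names are assumptions chosen for illustration.

```ruby
# Option i: ignore case.
text = "Dog and Cat"
case_sensitive   = text =~ /cat/
case_insensitive = text =~ /cat/i
p case_sensitive    # => nil
p case_insensitive  # => 8

# Option m: let the dot match a newline as well.
two_lines = "first\nsecond"
without_m = two_lines =~ /first.second/
with_m    = two_lines =~ /first.second/m
p without_m  # => nil
p with_m     # => 0

# Option x: spaces and newlines inside the pattern are ignored.
date_pattern = /
  (\d{4}) - (\d{2}) - (\d{2})
/x
year = date_pattern.match("2024-01-15")[1]
p year  # => "2024"
```

Options can be combined by writing them one after another, for example /cat/im.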
Matching Against Patterns
After a successful match, Ruby sets a whole bunch of magic variables. For example, $& receives the part of the string that was matched by the pattern, $` receives the part of the string that preceded the match, and $' receives the string after the match.
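For instance, the three variables named above can be read right after a successful match. This is a minimal sketch; the sample string and the local variable names are illustrative.

```ruby
sentence = "very interesting"

if sentence =~ /t/
  before  = $`  # text preceding the match
  matched = $&  # text matched by the pattern
  after   = $'  # text following the match

  puts before   # => very in
  puts matched  # => t
  puts after    # => eresting
end
```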
The =~ operator returns the character position at which the match occurred, while the match method returns a MatchData object. Given a MatchData object, you can call pre_match to return the part of the string before the match, post_match for the part after the match, and index it with [0] to get the matched portion.
def show_regexp(string, pattern)
  match = pattern.match(string)
  if match
    "#{match.pre_match}->#{match[0]}<-#{match.post_match}"
  else
    "no match"
  end
end
show_regexp('very interesting', /t/) # => very in->t<-eresting
show_regexp('Fats Waller', /lle/) # => Fats Wa->lle<-r
show_regexp('Fats Waller', /z/) # => no match
Anchors
The patterns ^ and $ match the beginning and end of a line, respectively. The sequence \A matches the beginning of a string, and \z and \Z match the end of a string. More precisely, \Z matches the end of a string unless the string ends with \n, in which case it matches just before the \n.
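The difference between line anchors and string anchors shows up with a multi-line string. The following is a small sketch; the sample text and variable names are illustrative.

```ruby
text = "first line\nsecond line\n"

line_start   = text =~ /^second/   # ^ matches after the embedded newline
string_start = text =~ /\Asecond/  # \A matches only at the very start
line_end     = text =~ /line$/     # $ matches before the first newline
strict_end   = text =~ /line\z/    # \z fails: the string ends with "\n"
lenient_end  = text =~ /line\Z/    # \Z matches just before the final "\n"

p line_start    # => 11
p string_start  # => nil
p line_end      # => 6
p strict_end    # => nil
p lenient_end   # => 18
```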