What's a Regular Expression? Ruby Regular Expressions 101
If you think "regex" are people who just didn't make the grade, then you need our quick Regular Expressions 101 review to learn how to use them. A regular expression (or regex, plural regexen) in Ruby, as in other languages, is a pattern written in a special language designed to match text. More simply, regular expressions are a way to describe the format and structure of strings. They are used by a variety of methods to find, extract, or replace portions of a string.
Regular Expression Literals
Ruby has regular expression literals to make declaring and using regular expressions as easy as possible. This makes regexen part of the language itself, and their use is encouraged throughout. Though they have special syntax, regular expressions are just normal objects. The special syntax simply creates Regexp objects, much like the quotation marks around a string literal create String objects.
Regular expression literals start and end with a slash character. For example, /abc/ will create a Regexp object that will match the string abc. Additionally, a number of single-character options may be appended to the regex literal. The regex /abc/i, which uses the i or "case insensitive" option, will create a Regexp object that matches both abc and ABC (or any combination of capital and lowercase letters).
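As a minimal sketch of the two literals described above, the following checks them against a few strings. The variable names are illustrative and not part of the original article, and match? assumes Ruby 2.4 or later.

```ruby
# Variable names are illustrative assumptions, not from the article.
plain       = /abc/   # the literal syntax creates a Regexp object
insensitive = /abc/i  # the i option makes the match case insensitive

puts plain.class                # Regexp
puts plain.match?("abc")        # true
puts plain.match?("ABC")        # false
puts insensitive.match?("ABC")  # true
puts insensitive.match?("aBc")  # true

# The literal is shorthand for constructing the object explicitly.
explicit = Regexp.new("abc", Regexp::IGNORECASE)
puts explicit.match?("ABC")     # true
```

Because the literals are ordinary objects, they can be stored in variables, passed to methods, and reused like any other value.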
Methods That Use Regular Expressions
The String class is full of methods that use regular expressions. Some of the most obvious are things like match and sub, but there are many more.
- "test" =~ /es/ - The =~ operator returns the position of the first Regexp match in the string. In this case, it returns 1. If there is no match, it returns nil.
- "test"[/e.*/] - The index operator of strings accepts regular expressions. Unlike the =~ operator, the index operator returns the sequence of characters that matches the regular expression. In this example, the index operator returns "est": the first e and every character following it.
- "this is a test".gsub( /is/, "XX" ) - The gsub method replaces all matches of the Regexp passed to it with the second argument. In this example, since the string "is" matches twice, both instances of the string "is" are replaced with "XX". This example statement returns "thXX XX a test".
- "this is a test".scan( /t./ ) {|m| puts m} - The scan method finds every match of the Regexp in the string and passes it to the block provided. In this example, scan will find every instance of the letter t with the letter directly following it. It will call the block with the strings "th" and "te".
- "this,is,a,test".split( /,/ ) - One of the more common methods to use Regexp objects, the split method splits a string into an array of strings. Each time the Regexp matches, another portion of the string is split off. This example returns the array ["this", "is", "a", "test"].
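The examples in the list above can be run together as a short script. This is only a sketch: the variable names are illustrative, and the scan matches are collected into an array rather than printed directly.

```ruby
# Variable names are illustrative assumptions, not from the article.
text = "this is a test"

position = "test" =~ /es/         # index of the first match
missing  = "test" =~ /xyz/        # no match
slice    = "test"[/e.*/]          # the matched text itself
replaced = text.gsub(/is/, "XX")  # replace every match

pairs = []
text.scan(/t./) { |m| pairs << m }  # each t plus the character after it

parts = "this,is,a,test".split(/,/)  # split on every comma

p position  # 1
p missing   # nil
p slice     # "est"
p replaced  # "thXX XX a test"
p pairs     # ["th", "te"]
p parts     # ["this", "is", "a", "test"]
```

Note that the final t in "test" is not reported by scan, because the pattern /t./ requires a character after the t.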