
- Perl - Home
- Perl - Introduction
- Perl - Environment
- Perl - Syntax Overview
- Perl - Data Types
- Perl - Variables
- Perl - Scalars
- Perl - Arrays
- Perl - Hashes
- Perl - IF...ELSE
- Perl - Loops
- Perl - Operators
- Perl - Date & Time
- Perl - Subroutines
- Perl - References
- Perl - Formats
- Perl - File I/O
- Perl - Directories
- Perl - Error Handling
- Perl - Special Variables
- Perl - Coding Standard
- Perl - Regular Expressions
- Perl - Sending Email
- Perl - Socket Programming
- Perl - Object Oriented
- Perl - Database Access
- Perl - CGI Programming
- Perl - Packages & Modules
- Perl - Process Management
- Perl - Embedded Documentation
- Perl - Functions References
- Perl Useful Resources
- Perl - Questions and Answers
- Perl - Quick Guide
- Perl - Cheatsheet
- Perl - Useful Resources
- Perl - Discussion
PERL Regular Expressions



A regular expression is a string of characters that define the pattern or patterns you are viewing. The syntax of regular expressions in Perl is very similar to what you will find within other regular expression.supporting programs, such as sed, grep, and awk. The basic method for applying a regular expression is to use the pattern binding operators =~ and !~. The first operator is a test and assignment operator. There are three regular expression operators within Perl
The forward slashes in each case act as delimiters for the regular expression (regex) that you are specifying. If you are comfortable with any other delimiter then you can use in place of forward slash. The Match OperatorThe match operator, m//, is used to match a string or statement to a regular expression. For example, to match the character sequence "foo" against the scalar $bar, you might use a statement like this:
The m// actually works in the same fashion as the q// operator series.you can use any combination of naturally matching characters to act as delimiters for the expression. For example, m{}, m(), and m>< are all valid. You can omit the m from m// if the delimiters are forward slashes, but for all other delimiters you must use the m prefix. Note that the entire match expression.that is the expression on the left of =~ or !~ and the match operator, returns true (in a scalar context) if the expression matches. Therefore the statement:
Will set $true to 1 if $foo matches the regex, or 0 if the match fails. In a list context, the match returns the contents of any grouped expressions. For example, when extracting the hours, minutes, and seconds from a time string, we can use:
Match Operator ModifiersThe match operator supports its own set of modifiers. The /g modifier allows for global matching. The /i modifier will make the match case insensitive. Here is the complete list of modifiers
Matching Only OnceThere is also a simpler version of the match operator - the ?PATTERN? operator. This is basically identical to the m// operator except that it only matches once within the string you are searching between each call to reset. For example, you can use this to get the first and last elements within a list:
The Substitution OperatorThe substitution operator, s///, is really just an extension of the match operator that allows you to replace the text matched with some new text. The basic form of the operator is:
The PATTERN is the regular expression for the text that we are looking for. The REPLACEMENT is a specification for the text or regular expression that we want to use to replace the found text with. For example, we can replace all occurrences of .dog. with .cat. using
Another example:
Substitution Operator ModifiersHere is the list of all modifiers used with substitution operator
TranslationTranslation is similar, but not identical, to the principles of substitution, but unlike substitution, translation (or transliteration) does not use regular expressions for its search on replacement values. The translation operators are:
The translation replaces all occurrences of the characters in SEARCHLIST with the corresponding characters in REPLACEMENTLIST. For example, using the "The cat sat on the mat." string we have been using in this chapter:
Translation Operator ModifiersFollowing is the list of operators related to translation
The /d modifier deletes the characters matching SEARCHLIST that do not have a corresponding entry in REPLACEMENTLIST. For example:
The last modifier, /s, removes the duplicate sequences of characters that were replaced, so:
More complex regular expressionsYou don't just have to match on fixed strings. In fact, you can match on just about anything you could dream of by using more complex regular expressions. Here's a quick cheat sheet:
Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the metacharacters listed above, or a group of characters or metacharacters in parentheses.
The ^ metacharacter matches the beginning of the string and the $ metasymbol matches the end of the string. Here are some brief examples
Lets have alook at another example
Matching BoundariesThe \b matches at any word boundary, as defined by the difference between the \w class and the \W class. Because \w includes the characters for a word, and \W the opposite, this normally means the termination of a word. The \B assertion matches any position that is not a word boundary. For example:
Selecting AlternativesThe | character is just like the standard or bitwise OR within Perl. It specifies alternate matches within a regular expression or group. For example, to match "cat" or "dog" in an expression, you might use this:
You can group individual elements of an expression together in order to support complex matches. Searching for two people.s names could be achieved with two separate tests, like this:
Grouping MatchingFrom a regular-expression point of view, there is no difference between except, perhaps, that the former is slightly clearer.
However, the benefit of grouping is that it allows us to extract a sequence from a regular expression. Groupings are returned as a list in the order in which they appear in the original. For example, in the following fragment we have pulled out the hours, minutes, and seconds from a string.
As well as this direct method, matched groups are also available within the special $x variables, where x is the number of the group within the regular expression. We could therefore rewrite the preceding example as follows:
When groups are used in substitution expressions, the $x syntax can be used in the replacement text. Thus, we could reformat a date string using this:
Using the \G AssertionThe \G assertion allows you to continue searching from the point where the last match occurred. For example, in the following code we have used \G so that we can search to the correct position and then extract some information, without having to create a more complex, single regular expression:
The \G assertion is actually just the metasymbol equivalent of the pos function, so between regular expression calls you can continue to use pos, and even modify the value of pos (and therefore \G) by using pos as an lvalue subroutine: Regular Expression VariablesRegular expression variables include $, which contains whatever the last grouping match matched; $&, which contains the entire matched string; $`, which contains everything before the matched string; and $', which contains everything after the matched string. The following code demonstrates the result:
|



Advertisements |