Source: Free On-Line Dictionary of Computing
regular expression
1. (regexp, RE) One of the {wild
card} patterns used by {Unix} utilities such as {grep}, {sed}
and {awk} and editors such as {vi} and {Emacs}. These use
conventions similar to but more elaborate than those described
under {glob}. A regular expression is a sequence of
characters with the following meanings:
An ordinary character (not one of the special characters
discussed below) matches that character.
A backslash (\) followed by any special character matches the
special character itself. The special characters are:
"." matches any character except NEWLINE; "RE*" (where
the "*" is called the "{Kleene star}") matches zero
or more occurrences of RE. If there is any choice, the
longest leftmost matching string is chosen, in most
regexp {flavour}s.
"^" at the beginning of an RE matches the start of a line and
"$" at the end of an RE matches the end of a line.
[string] matches any one character in that string. If the
first character of the string is a "^" it matches any
character except the remaining characters in the string (and,
in most regexp {flavour}s, except NEWLINE). "-" may be used
to indicate a range of consecutive ASCII characters.
\( RE \) matches whatever RE matches and \n, where n is a
digit, matches whatever was matched by the RE between the nth
\( and its corresponding \) earlier in the same RE. In
many flavours ( RE ) is used instead of \( RE \).
The concatenation of REs is a RE that matches the
concatenation of the strings matched by each RE.
\< matches the beginning of a word and \> matches the end of a
word. In many flavours of regexp, \> and \< are replaced by
"\b", the special character for "word boundary".
RE\{m\} matches m occurrences of RE. RE\{m,\} matches m or
more occurrences of RE. RE\{m,n\} matches between m and n
occurrences.
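The constructs listed above can be tried out in a short sketch. It uses
Python's "re" module, which is a choice made for illustration only and
is not one of the tools named in this entry. Python's flavour writes
groups as ( RE ) and intervals as RE{m,n}, and it uses \b rather than
\< and \>. The variable names are hypothetical.

```python
import re

# "." matches any character except a newline; "*" means zero or more.
dot_ok = re.fullmatch(r"a.c", "abc") is not None
dot_newline = re.fullmatch(r"a.c", "a\nc") is not None
star_zero = re.fullmatch(r"ab*c", "ac") is not None
star_many = re.fullmatch(r"ab*c", "abbbc") is not None

# A backslash makes a special character literal.
escaped_dot = re.fullmatch(r"a\.c", "abc") is not None

# "^" and "$" anchor to line starts and ends (per line with re.MULTILINE).
text = "foo\nbar foo\nfoo bar"
line_starts = re.findall(r"^foo", text, flags=re.MULTILINE)
line_ends = re.findall(r"foo$", text, flags=re.MULTILINE)

# Character classes, ranges and negation.
in_class = re.findall(r"[a-c]", "abcdef")
not_in_class = re.findall(r"[^a-c]", "abcdef")

# A backreference \1 matches the same text that group 1 matched.
doubled = re.search(r"(ab)\1", "xababx").group()

# \b is the word-boundary form of \< and \>.
words = re.findall(r"\bcat\b", "cat concat category")

# Interval expressions: exactly m, m or more, between m and n.
exactly_three = re.fullmatch(r"a{3}", "aaa") is not None
two_or_more = re.fullmatch(r"a{2,}", "aaaa") is not None
two_to_three = re.fullmatch(r"a{2,3}", "aaaa") is not None
```

Here "a.c" fails on "a\nc" because "." stops at a newline, and
"a{2,3}" fails on "aaaa" because fullmatch requires the whole string
to match.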
The exact details of how a regexp works in a given
application vary greatly from flavour to flavour. A comprehensive
survey of regexp flavours is found in Friedl 1997 (see below).
[Jeffrey E.F. Friedl, "{Mastering Regular
Expressions (http://enterprise.ic.gc.ca/~jfriedl/regex/index.html)}",
O'Reilly, 1997.]
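As one illustration of how flavours differ, the sketch below shows a
few places where Python's "re" module departs from the rules given in
sense 1. Python is an assumption made for the example, not a flavour
this entry discusses, and the variable names are hypothetical.

```python
import re

# Alternation in Python is leftmost-first, not leftmost-longest:
# the first alternative that succeeds wins.
first_alternative = re.match(r"a|ab", "ab").group()

# "*" is greedy, so here the longest run of "a" is returned.
greedy = re.match(r"a*", "aaab").group()

# Unlike the rule quoted above for most flavours, a negated class
# in Python does match a newline.
negated_matches_newline = re.fullmatch(r"[^a]", "\n") is not None

# \( and \< are not operators in Python; the backslash just makes
# the following punctuation character literal.
literal_paren = re.fullmatch(r"\(a\)", "(a)") is not None
literal_angle = re.search(r"\<", "a<b").group()
```

A leftmost-longest matcher would return "ab" for the first pattern,
where Python returns "a".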
2. Any description of a {pattern} composed from combinations
of {symbols} and the three {operators}:
Concatenation - pattern A concatenated with B matches a match
for A followed by a match for B.
Or - pattern A-or-B matches either a match for A or a match
for B.
Closure - zero or more matches for a pattern.
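The three operators are enough to build a working matcher. The
following is a minimal sketch in Python, not a standard or efficient
algorithm: patterns are nested tuples, and the names "ends" and
"matches" and the tuple tags "sym", "cat", "or" and "star" are
hypothetical, chosen only for this example.

```python
def ends(pattern, text, start):
    """Return every position where a match of pattern starting at start can end."""
    kind = pattern[0]
    if kind == "sym":
        # A symbol matches itself and consumes one character.
        if start < len(text) and text[start] == pattern[1]:
            return {start + 1}
        return set()
    if kind == "cat":
        # Concatenation: a match for A followed by a match for B.
        return {j
                for i in ends(pattern[1], text, start)
                for j in ends(pattern[2], text, i)}
    if kind == "or":
        # Or: either a match for A or a match for B.
        return ends(pattern[1], text, start) | ends(pattern[2], text, start)
    if kind == "star":
        # Closure: zero or more matches, so start itself is always reachable.
        reached = {start}
        frontier = {start}
        while frontier:
            step = set()
            for i in frontier:
                step |= ends(pattern[1], text, i)
            frontier = step - reached  # only explore new positions
            reached |= frontier
        return reached
    raise ValueError("unknown pattern kind: %r" % (kind,))


def matches(pattern, text):
    """True if pattern matches the whole of text."""
    return len(text) in ends(pattern, text, 0)


# (a|b)*c written with the three operators.
a_or_b = ("or", ("sym", "a"), ("sym", "b"))
pattern = ("cat", ("star", a_or_b), ("sym", "c"))
```

With this pattern, matches(pattern, "abbac") and matches(pattern, "c")
are true, while matches(pattern, "abca") and matches(pattern, "") are
false. Tracking sets of end positions, and expanding only newly reached
positions in the closure case, keeps the loop finite even when the
inner pattern can match the empty string.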
The earliest form of regular expressions (and the term itself)
was invented by mathematician {Stephen Cole Kleene} in the
mid-1950s, as a notation to easily manipulate "regular sets",
formal descriptions of the behaviour of {finite state
machines}, in {regular algebra}.
[S.C. Kleene, "Representation of events in nerve nets and
finite automata", 1956, Automata Studies. Princeton].
[J.H. Conway, "Regular algebra and finite machines", 1971, Eds
Chapman & Hall].
[Sedgewick, "Algorithms in C", page 294].
(1997-08-03)