Linux Crash Course: Appendix A

Appendix A:
Regular Expressions
It’s All Greek to Me

Regular Expressions
• A pattern that matches a set of one or more strings
• May be a simple string, or contain wildcard
characters or modifiers
• Used by programs such as vim, grep, awk,
and sed
• Not the same as shell expansion
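
To illustrate the last point, here is a minimal sketch that assumes Python's fnmatch and re modules as stand-ins for the shell and for the tools listed above. The file names are made up for illustration. The same text, a*.txt, selects different names depending on whether it is read as a shell pattern or as a regular expression.

```python
import fnmatch
import re

names = ["a.txt", "ab.txt", ".txt"]  # hypothetical file names

# Shell-style reading: '*' means "any run of characters", '.' is literal.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "a*.txt")]

# Regular-expression reading: 'a*' is zero or more a's, '.' is any one character.
regex_hits = [n for n in names if re.fullmatch(r"a*.txt", n)]

print(glob_hits)   # ['a.txt', 'ab.txt']
print(regex_hits)  # ['a.txt', '.txt']
```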

• Characters
– Literals
– Special Characters

• Delimiters

– Mark the beginning and end of regular expressions
– Usually /
– ' (but not really; the quotes belong to the shell, not to the expression)

Simple Strings
• Contain no special characters
• Matches only the string
• Ex: /foo/ matches:
– foo
– tomfoolery
– bar.foo.com
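
A minimal sketch of the same example, assuming Python's re module in place of vim or grep (Python has no / delimiters, so the pattern is just the string foo). re.search is used because it looks for the pattern anywhere in the string, the way grep looks anywhere in a line.

```python
import re

# A simple string has no special characters, so it matches only itself,
# but the match may sit anywhere inside a longer string.
pattern = re.compile(r"foo")
candidates = ["foo", "tomfoolery", "bar.foo.com", "bar"]
matched = [s for s in candidates if pattern.search(s)]
print(matched)  # ['foo', 'tomfoolery', 'bar.foo.com']
```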

Special Characters
• Can match multiple strings
• Represent zero or more characters
• Always match the longest possible string
(we’ll see examples in a bit)

Periods
• Matches any single character
• Ex: /.ing/
– I was talking
– bling
– he called ingred

• Ex: /spar.ing/
– sparring
– sparking
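
The examples above can be checked with a short sketch, again assuming Python's re module. The extra test strings "ing" and "sparing" are additions for illustration; they fail because the period must consume exactly one character.

```python
import re

lines = ["I was talking", "bling", "he called ingred", "ing"]
hits = {}
for s in lines:
    m = re.search(r".ing", s)
    if m:
        # Record which four characters matched in each line.
        hits[s] = m.group(0)

# "sparing" has no character to spare between "spar" and "ing".
spar = [w for w in ["sparring", "sparking", "sparing"] if re.search(r"spar.ing", w)]

print(hits)  # {'I was talking': 'king', 'bling': 'ling', 'he called ingred': ' ing'}
print(spar)  # ['sparring', 'sparking']
```

Note that in "he called ingred" the period matches the space before "ing".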

Brackets
• Define a character class
• Match any one character in the class
• If a caret (^) is the first character in the class, the
character class matches any character not in the class
• Other special characters in the class lose
their special meaning


Brackets, cont’d

Ex. /[jJ]ustin/ matches justin and Justin
Ex. /[A-Za-z]/ matches any single letter
Ex. /[0-9]/ matches any single digit
Ex. /[^a-z]/ matches any character that is not a
lowercase letter
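
A small sketch of character classes, assuming Python's re module; the sample strings are invented for illustration. re.findall lists every match, which makes it easy to see that a class matches one character at a time.

```python
import re

# [jJ] accepts either case of the first letter only.
names = [s for s in ["justin", "Justin", "JUSTIN"] if re.search(r"[jJ]ustin", s)]

# [0-9] matches a single digit, so "42" yields two separate matches.
digits = re.findall(r"[0-9]", "room 42b")

# A leading ^ negates the class: every character that is not a lowercase letter.
non_lower = re.findall(r"[^a-z]", "aB3 c")

# Inside a class, '.' and '*' lose their special meaning.
literal = re.findall(r"[.*]", "a.b*c")

print(names, digits, non_lower, literal)
```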

Asterisks
• Zero or more occurrences of the previous
character
• So the pattern to match any number of characters would
be /.*/
• Ex. /t.*ing/
– thing
– this is really annoying
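
As a sketch (assuming Python's re module), the asterisk examples behave as follows. The strings "ting" and "" are extra cases added here to show that "zero occurrences" is a valid match.

```python
import re

# '.*' stretches from the first 't' to the last "ing" on the line.
short = re.search(r"t.*ing", "thing").group(0)
long_ = re.search(r"t.*ing", "this is really annoying").group(0)

# Zero characters between 't' and "ing" is still a match.
zero = re.search(r"t.*ing", "ting").group(0)

# '.*' on its own matches anything, including the empty string.
empty = re.search(r".*", "").group(0)

print(short)  # thing
print(long_)  # this is really annoying
```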

Plus Signs and Question Marks
• Very similar to asterisks; depend on the previous
character
• + matches one or more occurrences (not 0)
• ? matches zero or one occurrence (no more)
• Ex. /2+4?/ matches one or more 2’s
followed by either zero or one 4
– 22224, 2 match
– 4 does not; in 244 only the leading 24 matches

• Part of the class of extended R.E.
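
A sketch of /2+4?/, assuming Python's re module, where + and ? are always available. It separates two questions: does the pattern occur somewhere in the string (search), and does it account for the whole string (fullmatch)?

```python
import re

pattern = re.compile(r"2+4?")
found = {}
for s in ["22224", "2", "4", "244"]:
    m = pattern.search(s)
    # Store the matched text, or None when nothing in the string matches.
    found[s] = m.group(0) if m else None

# Requiring the whole string to match rejects 244: only one 4 is allowed.
whole_244 = pattern.fullmatch("244")

print(found)      # {'22224': '22224', '2': '2', '4': None, '244': '24'}
print(whole_244)  # None
```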

Carets & Dollar Signs
• If a regular expression starts with a ^, the
string must be at the beginning of a line
• If a regular expression ends with a $, the
string must be at the end of a line
• ^ and $ are referred to as anchors
• Ex. /^T.*T$/ matches any line that starts
and ends with T
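
A sketch of the anchors, assuming Python's re module. One Python-specific detail: ^ and $ only apply to every line of a multi-line string when the re.MULTILINE flag is set. The sample text is invented.

```python
import re

text = "TOAST\nTent\nToasT\nsTarT"

# With MULTILINE, ^ and $ anchor at the start and end of each line.
anchored = re.findall(r"^T.*T$", text, flags=re.MULTILINE)

print(anchored)  # ['TOAST', 'ToasT']
```

"Tent" fails because the anchors do not ignore case, and "sTarT" fails because the line does not start with T.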

Quoting Special Characters
• If you want to use a special character
literally, put a backslash in front of it
• Ex. /and\/or/ matches and/or
• Ex. /\\/ matches \
• Ex. /\**/ matches any number of asterisks
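
A sketch of quoting, assuming Python's re module. Because Python has no / delimiter, the slash in and/or needs no backslash there; the backslash and asterisk cases carry over unchanged. re.escape is a standard-library helper that adds the backslashes for you.

```python
import re

# No delimiter in Python, so '/' is an ordinary character.
slash = re.search(r"and/or", "yes and/or no").group(0)

# A quoted backslash matches one literal backslash.
backslash = re.search(r"\\", "C:\\temp").group(0)

# A quoted asterisk followed by '*' means zero or more literal asterisks.
stars = re.search(r"\**", "***abc").group(0)

# re.escape quotes every special character in a string.
plus = re.search(re.escape("1+1"), "1+1=2").group(0)

print(slash, backslash, stars, plus)
```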

Longest Match
• Regular expressions match the longest string
possible in a line
• Ex. I (Justin) like coffee (lots).
• /(.*)/
– Matches (Justin) like coffee (lots)

• /([^)]*)/
– Matches (Justin)
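
A sketch of the longest-match rule, assuming Python's re module. One difference to keep in mind: in Python, literal parentheses must be quoted as \( and \), whereas the expressions above use the basic syntax in which bare parentheses are literal.

```python
import re

line = "I (Justin) like coffee (lots)."

# '.*' runs to the last ')' on the line.
greedy = re.search(r"\(.*\)", line).group(0)

# '[^)]*' cannot cross a ')', so the match stops at the first one.
bounded = re.search(r"\([^)]*\)", line).group(0)

print(greedy)   # (Justin) like coffee (lots)
print(bounded)  # (Justin)
```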

Boolean OR
• You can pattern match for two distinct strings
using OR (the pipe)
• Ex. /CAT|DOG/
– Matches either CAT or DOG

• Simpler expressions can be written just
using a character class
– e.g. /a[bc]/ instead of /ab|ac/

• Also part of extended R.E.
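
A sketch of alternation, assuming Python's re module (which always supports the pipe). It also checks that the character-class form finds the same matches as the longer alternation. The sample strings are invented.

```python
import re

# The pipe matches either alternative; case still matters.
animals = re.findall(r"CAT|DOG", "CAT chases DOG, not cat")

# Two ways of writing the same thing.
with_pipe = re.findall(r"ab|ac", "ab ac ad")
with_class = re.findall(r"a[bc]", "ab ac ad")

print(animals)                  # ['CAT', 'DOG']
print(with_pipe == with_class)  # True
```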

Grouping
• You can apply special characters to groups
of characters in parentheses
• Also called bracketing
• Matches same as unbracketed expression
• But can use modifiers
• Ex. /\(duck\)*|\(goose\)/
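
A sketch of grouping, assuming Python's re module. Python writes groups with bare parentheses, so the example above becomes (duck)*|(goose). re.fullmatch is used here so that the empty match allowed by * does not hide the difference between the cases.

```python
import re

grouped = re.compile(r"(duck)*|(goose)")

# The '*' applies to the whole group: any number of "duck", or one "goose".
results = {s: bool(grouped.fullmatch(s))
           for s in ["duckduck", "", "goose", "duckgoose"]}

# Without the group, '*' applies only to the final 'k'.
ungrouped = bool(re.fullmatch(r"duck*", "duckkk"))
grouped_kkk = bool(re.fullmatch(r"(duck)*", "duckkk"))

print(results)
print(ungrouped, grouped_kkk)  # True False
```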

Using with vim
• Use regular expressions for searching and
substituting
• Searching:
– /string or ?string

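• Substituting:
– :[address]s/string/replace[/g]
– string can be an R.E.; replace is literal text, apart from & and quoted digits
– /g : global; replace all occurrences in the line
– Use % as the address to substitute on all lines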

Using with vim, cont’d
• [address]

n : line number
n[+/-]x : line number plus x lines before or after
n1,n2 : from line n1 to n2
. : alias for current line
$ : alias for last line in work buffer
% : alias for entire work buffer

vim examples
• /^if(
• /end\.$
• :%s/[Jj]ustin/Mr\. Awesome/g

Using with vim, cont’d
• Ampersand (&)
– Alias for matched string when substituting
– Ex: :s/[A-Z][0-9]/_&_/

• Quoted digit (\n)
– Stands for the nth quoted part, \( \), of an R.E. with multiple quoted parts
– Can be used to rearrange columns
– Ex: :s/\([^,]*\), \(.*\)/\2 \1/
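
The two substitutions can be sketched with Python's re.sub, under a few stated differences from vim: Python writes the whole match as \g<0> instead of &, writes groups with bare parentheses, and replaces every occurrence unless count is given. The input strings are invented for illustration.

```python
import re

# Wrap a capital letter followed by a digit in underscores.
# count=1 mimics a vim substitute without the /g flag.
first_only = re.sub(r"[A-Z][0-9]", r"_\g<0>_", "cell B7 and C3", count=1)
every_one = re.sub(r"[A-Z][0-9]", r"_\g<0>_", "cell B7 and C3")

# Swap two comma-separated fields: \1 is the first group, \2 the second.
swapped = re.sub(r"([^,]*), (.*)", r"\2 \1", "Smith, Alice", count=1)

print(first_only)  # cell _B7_ and C3
print(every_one)   # cell _B7_ and _C3_
print(swapped)     # Alice Smith
```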

Using with grep
• To take advantage of extended regular
expressions, use egrep or grep -E instead
• Use single quotes as delimiters
• Ex:
– egrep '^T.*T$' myfile
Lists all lines in myfile that begin and end with T
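
For readers who want to experiment without a file on disk, here is a rough grep-like sketch in Python. The function name grep_lines and the sample text are hypothetical; this only imitates the "print each matching line" behavior and is not how grep itself is implemented.

```python
import re

def grep_lines(pattern, text):
    """Return the lines of text that contain a match for pattern."""
    regex = re.compile(pattern)
    # Test each line separately, so ^ and $ anchor to that line.
    return [line for line in text.splitlines() if regex.search(line)]

sample = "TOAST\nTent\nThe cat saT\nnot This\n"
print(grep_lines(r"^T.*T$", sample))  # ['TOAST', 'The cat saT']
```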
