
Regular Expressions

Where can I use it?

Powerful in
- Search, search and replace, text processing

Text editors: vi, EditPlus
Programming languages: Perl, Java
Command-line tools: grep, awk

07/09/15Friday, January 28,

Before RegEx
Wildcard
*.txt
My_report*.doc
Here the * indicates any number of any
characters.
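
As a rough illustration of wildcard matching, the sketch below uses Python's standard fnmatch module. The file names are made up for the example, and Python itself is an assumption here, since the slides do not name a tool for wildcards.

```python
from fnmatch import fnmatch

# Hypothetical file names, chosen only for illustration.
names = ["notes.txt", "My_report_final.doc", "My_report.doc", "image.png"]

# '*' stands for any number of any characters (including none).
txt_files = [n for n in names if fnmatch(n, "*.txt")]
reports = [n for n in names if fnmatch(n, "My_report*.doc")]

print(txt_files)
print(reports)
```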


Regular expressions (RegEx) tend to be easier to write than they are to read.


What is RegEx
Regular expression: a pattern that describes or matches a set of strings.
Matched text: the chunk of text that matches the regular expression.
ca[trn]
Matches car, can, cat
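
The ca[trn] example can be tried outside an editor as well. The sketch below assumes Python's re module rather than EditPlus, and the word list is invented for the demonstration.

```python
import re

# Candidate words; only those ending in t, r or n after "ca" should match.
words = ["car", "can", "cat", "cab", "cap"]

pattern = re.compile(r"ca[trn]")
matched = [w for w in words if pattern.fullmatch(w)]

print(matched)
```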

EditPlus is used throughout this presentation as the tool to demonstrate regular expressions.

Structure of RegEx
Made up of normal characters and metacharacters.
Metacharacters have a special function:
$ ^ . \ [] \( \) + ?
$ means end of line
^ means start of line


Literal match
RegEx: cat will match the word cat.
It will also match words like concatenation, delicate, located, modification.
This is sometimes not desired. Solution?
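
A small sketch of the over-matching problem, assuming Python's re module (the slides themselves use EditPlus). The word dog is added only as a non-matching control.

```python
import re

words = ["cat", "concatenation", "delicate", "located", "modification", "dog"]

# A literal pattern matches anywhere inside a string, not only whole words.
hits = [w for w in words if re.search(r"cat", w)]

print(hits)
```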


Matching
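Match the space before and after cat:
RegEx: " cat " (the word with one space on each side)
Still a problem? It misses cat at the start or end of a line, or next to punctuation.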
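
To see why surrounding the word with spaces is still not enough, the following sketch (Python's re module is an assumption, and the sample lines are invented) checks a few lines against the pattern with a space on each side of cat.

```python
import re

# The pattern is: space, c, a, t, space.
pattern = re.compile(r" cat ")

lines = ["the cat sat", "cat sat here", "I saw a cat.", "concatenation"]

# Map each line to whether the space-delimited pattern is found in it.
results = {line: bool(pattern.search(line)) for line in lines}

for line, found in results.items():
    print(repr(line), found)
```

Only the first line matches: the second has no space before cat, and the third has a period after it.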


Character class
Want to search for in or on.
RegEx: [io]n will match both in and on.
[ ] : used to specify a set of characters to select from.
[a-h] : the set of all characters from a to h
[4-9A-T] : the digits 4 to 9 and the uppercase letters A to T
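
The three character classes on this slide can be checked with a short sketch. It assumes Python's re module, and the sample strings are made up.

```python
import re

# [io]n: an i or an o, followed by n.
in_on = re.findall(r"[io]n", "in on an un")

# [a-h]: any single lowercase letter from a to h.
a_to_h = re.findall(r"[a-h]", "badge xyz")

# [4-9A-T]: a digit 4..9 or an uppercase letter A..T.
mixed = re.findall(r"[4-9A-T]", "3 4 9 A T U a")

print(in_on, a_to_h, mixed)
```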


Character class
It can also contain individual characters, e.g. [acV5y0]
[0-9] : ?
[0-9][0-9] : ?
18[0-9][0-9] : ?
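
One way to explore the three questions above is to run the patterns on sample text. This is a sketch under the assumption that Python's re module is available; the input strings are invented.

```python
import re

# [0-9] matches one digit.
single = re.findall(r"[0-9]", "room 7")

# [0-9][0-9] matches two digits in a row.
pairs = re.findall(r"[0-9][0-9]", "1857")

# 18[0-9][0-9] matches 18 followed by two digits, i.e. 1800 through 1899.
years = re.findall(r"18[0-9][0-9]", "1857 1902 1899 180")

print(single, pairs, years)
```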


Example
Set of vowels
[aeiou]

Set of consonants
[bcdfghjklmnpqrstvwxyz]

Consider matching words which start with 2 vowels and end with a consonant:
[aeiou][aeiou][bcdfghjklmnpqrstvwxyz] ?
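
The pattern above describes exactly three characters: vowel, vowel, consonant. The sketch below assumes Python's re module and an invented word list, and uses fullmatch so that the whole word has to fit the pattern.

```python
import re

pattern = re.compile(r"[aeiou][aeiou][bcdfghjklmnpqrstvwxyz]")

words = ["eat", "air", "out", "cat", "ouch"]

# fullmatch requires the entire word to be vowel, vowel, consonant.
three_letter_hits = [w for w in words if pattern.fullmatch(w)]

print(three_letter_hits)
```

A longer word such as ouch is rejected because nothing in the pattern allows extra letters between the two vowels and the final consonant.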


Negation
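The absence of a character or set of characters can be shown using the ^ symbol as the first character inside [ ].

[^ab^8] : means any character other than a, b, ^ or 8 (a ^ that is not first in the set is a literal ^)
[^c-p] : means any character other than c..p
[^t]ion : selects ion preceded by any character other than t, e.g. words ending with ion but with no t before it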
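
The negated sets can be checked with a short sketch. It assumes Python's re module, whose handling of ^ inside a set may differ in detail from EditPlus; the sample strings are invented.

```python
import re

# Inside [ ], a leading ^ negates the set; a later ^ is just a literal character.
not_ab_caret_8 = re.findall(r"[^ab^8]", "ab^8xz")

# Any character outside the range c..p.
not_c_to_p = re.findall(r"[^c-p]", "apple")

# "ion" preceded by any one character other than t.
words = ["region", "nation", "lion", "ion"]
non_t_ion = [w for w in words if re.search(r"[^t]ion", w)]

print(not_ab_caret_8, not_c_to_p, non_t_ion)
```

Note that ion on its own does not match [^t]ion, because the negated set still has to consume one character.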


Start/End of line
^ : indicates start of line
$ : indicates end of line
Example:
Search for lines starting with I
Use RegEx : ^I
Search for lines ending with is
Use RegEx : is$
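
A minimal sketch of the two anchors, assuming Python's re module and an invented four-line text. Each line is tested separately so that ^ and $ refer to the start and end of that line.

```python
import re

text = "I like regex\nThat is\nIt is what it is\nHe said I"
lines = text.splitlines()

# ^I : the line must begin with a capital I.
starts_with_I = [l for l in lines if re.search(r"^I", l)]

# is$ : the line must end with "is".
ends_with_is = [l for l in lines if re.search(r"is$", l)]

print(starts_with_I)
print(ends_with_is)
```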


Any character match
. : matches any single character
e.e : matches any three-character string whose first character is e and last character is e
Try e.e

If you want only words to be searched, change the query to
e[a-z]e
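
The difference between e.e and e[a-z]e shows up when the middle character is not a letter. The sketch below assumes Python's re module; the candidate strings are made up.

```python
import re

candidates = ["eye", "eve", "e-e", "e e", "ee"]

# '.' accepts any single character in the middle, including '-' and space.
any_middle = [c for c in candidates if re.fullmatch(r"e.e", c)]

# [a-z] restricts the middle character to a lowercase letter.
letter_middle = [c for c in candidates if re.fullmatch(r"e[a-z]e", c)]

print(any_middle)
print(letter_middle)
```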


Repeated match
* : matches the previous character or character class zero or more times
be* : matches b followed by zero or more e (b, be, bee, ...)
+ : similar to *; the only difference is that it matches a sequence of one or more.
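
The difference between * and + can be seen on a few short strings. This is a sketch assuming Python's re module; the samples are invented.

```python
import re

samples = ["b", "be", "bee", "a"]

# be* : b followed by zero or more e.
star_hits = [s for s in samples if re.fullmatch(r"be*", s)]

# be+ : b followed by one or more e, so a bare "b" no longer matches.
plus_hits = [s for s in samples if re.fullmatch(r"be+", s)]

print(star_hits)
print(plus_hits)
```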


Selecting a number
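Single digit : [0-9]
When a single digit is repeated it is a number:
(digit)repeat
[0-9]*
Note that * also allows zero digits, so [0-9]* can match nothing at all; [0-9]+ requires at least one digit.
$[0-9]* : ?
\$[0-9]* : $ is a metacharacter (end of line), so a literal dollar sign is written \$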
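
The effect of escaping the dollar sign can be sketched as follows, assuming Python's re module (EditPlus may report matches differently). The price text is invented.

```python
import re

# [0-9]* accepts a run of digits, but also the empty string.
matches_number = re.fullmatch(r"[0-9]*", "2015") is not None
matches_empty = re.fullmatch(r"[0-9]*", "") is not None

# Escaped: \$ is a literal dollar sign followed by digits.
prices = re.findall(r"\$[0-9]*", "costs $45 and $7")

# Unescaped: $ means end of line, so only an empty match at the end is found.
unescaped = re.findall(r"$[0-9]*", "costs $45")

print(matches_number, matches_empty, prices, unescaped)
```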


Selecting a word

A word is composed of letters
A word is : [a-z]*
A word in all capital letters : ??
A word starting with a capital letter : [ ][ ]*
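
One possible way to fill in the blanks is sketched below. The two answers, [A-Z]* and [A-Z][a-z]*, are an assumption and are not stated on the slide; Python's re module and the word list are also assumptions.

```python
import re

words = ["regex", "REGEX", "Regex", "regEx"]

# Lowercase word, as given on the slide.
lower = [w for w in words if re.fullmatch(r"[a-z]*", w)]

# Assumed answer: a word in all capital letters.
upper = [w for w in words if re.fullmatch(r"[A-Z]*", w)]

# Assumed answer: one capital letter followed by lowercase letters.
capitalised = [w for w in words if re.fullmatch(r"[A-Z][a-z]*", w)]

print(lower, upper, capitalised)
```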


Alternate match
| : used to specify an alternate match (either pattern may match)
Search: (above)|(below)
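
A short sketch of the alternate match, assuming Python's re module and an invented sentence. finditer is used so that the whole matched text is collected rather than the individual groups.

```python
import re

text = "see above and below, not beside"

# Either alternative may match at each position in the text.
found = [m.group(0) for m in re.finditer(r"(above)|(below)", text)]

print(found)
```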


Search
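Day words
[a-z]*day : zero or more lowercase letters before day, so day on its own also matches
[a-z]+day : at least one lowercase letter before day
[A-Z][a-z]+day : a capital letter, then lowercase letters, then day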
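
The three day-word patterns behave differently on the same text. The sketch below assumes Python's re module; the sample text is invented.

```python
import re

text = "Monday day today Sunday"

# Zero or more lowercase letters before "day": picks up "day" itself too.
star = re.findall(r"[a-z]*day", text)

# One or more lowercase letters before "day": a bare "day" is skipped.
plus = re.findall(r"[a-z]+day", text)

# Capital letter first: only the capitalised day names match in full.
capitalised = re.findall(r"[A-Z][a-z]+day", text)

print(star)
print(plus)
print(capitalised)
```

The lowercase-only patterns return onday and unday because the capital first letter is outside [a-z].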


Escaping Special meaning


How to match (, ) or *
To match characters that are used as metacharacters, \ is added before them as an escape character,
e.g. to match ( write \( and to match a period . write \.
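
Escaping can be sketched as follows. This assumes Python's re module, where ( ) * and . are metacharacters and a backslash makes them literal; the sample strings are invented.

```python
import re

# Escaped period: matches only the literal '.' characters.
dots = re.findall(r"\.", "a.b.c")

# Unescaped period: matches every character.
any_chars = re.findall(r".", "a.b")

# Escaped parentheses and asterisk are treated as ordinary characters.
paren = re.search(r"\(x\)", "f(x) = x").group(0)
stars = re.findall(r"\*", "2*3*4")

print(dots, any_chars, paren, stars)
```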


Search patterns
has, have, had
not, n't

((have)|(had)|(has))
(( )|(n't)|( not ))*
((have)|(had)|(has))(( )|(n't)|( not ))*
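
The combined pattern can be tried on a few sentences. The sketch assumes Python's re module; the helper name first_match and the sentences are hypothetical, introduced only for this illustration.

```python
import re

pattern = re.compile(r"((have)|(had)|(has))(( )|(n't)|( not ))*")

def first_match(sentence):
    """Return the first chunk of text the pattern matches, or None."""
    m = pattern.search(sentence)
    return m.group(0) if m else None

print(repr(first_match("They haven't left")))
print(repr(first_match("She had gone")))
print(repr(first_match("It has not changed")))
print(repr(first_match("No verb here")))
```

In Python the alternatives are tried from left to right, so for "has not" the single-space branch matches first and the result is just "has " with its trailing space; other engines may choose differently.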


References
EditPlus help pages
http://gnosis.cx/publish/programming/regular_expressions.html
O'Reilly - Mastering Regular Expressions
Google regular expression tutorial


Thank you !
