Your Unix/Linux Chapter 10

Dr. Deborah Whitfield

Chapter 10: Filters using REs

Regular expressions:

Syntax Description

zx 'z' followed by 'x'

zy$ ends in zy

^b starts with b

^b.*zy starts with b (^b) followed by any number of characters (.*) and containing zy

[aeiou] contains one of a, e, i, o, u

[a-z] a lowercase letter

[A-Z] an uppercase letter
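
The patterns in the table above can be tried on a handful of words before turning to a full dictionary. The following is a minimal sketch; the word list is made up for illustration, and LC_ALL=C is set on the assumption that bracket ranges such as [A-Z] should follow plain ASCII order.

```shell
# Use plain ASCII ordering so [A-Z] matches uppercase letters only.
LC_ALL=C; export LC_ALL

# Hypothetical sample words, one per line (a stand-in for a real dictionary).
words=$(printf '%s\n' breezy buzz lazy Zebra box)

# Quote each pattern so the shell does not expand *, $ or [ ] itself.
ends_zy=$(printf '%s\n' "$words" | grep 'zy$')        # breezy, lazy
b_then_zy=$(printf '%s\n' "$words" | grep '^b.*zy')   # breezy
upper=$(printf '%s\n' "$words" | grep '[A-Z]')        # Zebra

echo "ends in zy:         $ends_zy"
echo "b ... zy:           $b_then_zy"
echo "has uppercase char: $upper"
```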

expr:

Command Description

expr perform arithmetic operations

expr value operation value
expr 3 + 4 prints 7

man expr Check out all of the expr possibilities!
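
As a small illustration of the expr form shown above, the sketch below captures a few results in shell variables. The variable names are chosen here for illustration only. Each operand and operator must be a separate argument, and * has to be escaped so the shell does not treat it as a file name wildcard.

```shell
sum=$(expr 3 + 4)          # 7
product=$(expr 3 \* 4)     # 12; the backslash keeps the shell from expanding *
remainder=$(expr 17 % 5)   # 2

echo "$sum $product $remainder"   # prints: 7 12 2
```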

see the example

Basic Grep:

Command Description

grep zx /usr/share/dict/words find anything containing 'z' followed by 'x'

grep zy /usr/share/dict/words | more find anything containing 'z' followed by 'y' (note: piping to more is often a good idea when using grep, since output from grep may be very large)

grep zy$ /usr/share/dict/words | more find anything ending in 'zy'

grep ^b.*zy /usr/share/dict/words find anything starting with b (^b) followed by any number of characters (.*) and containing zy

grep ^[aeious].*zy /usr/share/dict/words find anything starting with a, e, i, o, u, s (^[aeious]) followed by any number of characters (.*) and containing zy

vowel=[aeiou] define vowels

grep ^$vowel.*zy /usr/share/dict/words find anything beginning with a vowel and containing 'zy'

constant=[a-z]
grep ^$constant.*zy /usr/share/dict/words find anything beginning with a lowercase letter and containing 'zy'

grep ^z.*$vowel$ /usr/share/dict/words find anything beginning with z and ending with a vowel

w=/usr/share/dict/words
echo $w
/usr/share/dict/words
echo w
w
grep ^z.*z$ $w
grep ^y.*y$ $w defining a symbol to refer to a file name (so as to be able to avoid typing it again and again)
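
The variable tricks above can be tested without the system dictionary. This sketch assumes a small temporary word file (a hypothetical stand-in for /usr/share/dict/words) and quotes the patterns, which is safer than the unquoted forms when a pattern contains *, [ or $.

```shell
LC_ALL=C; export LC_ALL

# Hypothetical word file standing in for /usr/share/dict/words.
w=$(mktemp)
printf '%s\n' breezy cozy enzyme lazy zebra zanzibar zulu > "$w"

vowel='[aeiou]'   # a pattern fragment stored in a shell variable

# Inside double quotes $vowel is expanded; \$ is a literal end-of-line anchor.
starts_vowel_has_zy=$(grep "^$vowel.*zy" "$w")   # enzyme
z_to_vowel=$(grep "^z.*$vowel\$" "$w")           # zebra, zulu

echo "$starts_vowel_has_zy"
echo "$z_to_vowel"

rm -f "$w"
```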

see the example

Grep Flags:

-c (count), -v (negation), and -i (ignore case) flags

Command	Description
w=/usr/share/dict/words
wc -l $w	How many words (lines) are in the whole dictionary?
grep -c ^e $w	-c flag counts occurrences. How many start with an 'e'?
grep -c ^[^e] $w	Using ^ inside the brackets negates the character class, so this counts the words that do not start with an e.
grep -c e $w	How many contain an 'e'?
grep -c ^[^e]*$ $w	How many contain only non-e's from start (^) to end ($)? Note: these last two counts should add up to the total number in the dictionary.
grep -cv e $w	the same thing using the -v flag (look for non-matches) on grep.
grep -cv ^e $w	How many do not start with an e

grep -c d $w	how many words have a d?
grep -ic d $w	how many words have d ignoring case?
grep -c t $w	how many words have a t?
grep -ic t $w	how many words have a t ignoring case?
grep -i d $w | grep -ic t	how many have both?
grep -ic [dt] $w	how many have either?
grep -icv [dt] $w	how many have neither?
grep -ic ^[^dt]*$ $w	Another way to determine how many have neither?
see the example
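
The counting claims in this table can be checked on a tiny file. The sketch below assumes a six-word temporary file invented for illustration; it confirms that the "contains e" and "contains no e" counts add up to the total, and that the two ways of counting "neither" agree.

```shell
LC_ALL=C; export LC_ALL

# Hypothetical six-word file standing in for the dictionary.
w=$(mktemp)
printf '%s\n' Dog cat bird eel tree dart > "$w"

total=$(wc -l < "$w" | tr -d ' ')        # 6
with_e=$(grep -c 'e' "$w")               # 2 (eel, tree)
without_e=$(grep -c '^[^e]*$' "$w")      # 4
without_e_v=$(grep -cv 'e' "$w")         # 4, same count using -v

either=$(grep -ic '[dt]' "$w")           # 5 (all but eel)
neither=$(grep -icv '[dt]' "$w")         # 1 (eel)
neither_alt=$(grep -ic '^[^dt]*$' "$w")  # 1, same count without -v
both=$(grep -i 'd' "$w" | grep -ic 't')  # 1 (dart)

echo "total=$total with_e=$with_e without_e=$without_e"
echo "either=$either neither=$neither both=$both"

rm -f "$w"
```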

complex regular expressions -- counting and remembering

Command	Description
w=/usr/share/dict/words
grep "[aeiou]\{5\}" $w	Find words having five vowels in a row.
grep "a[b-df-hj-np-tv-xz]\{5\}" $w	Find words having five consonants in a row, directly following an 'a'.
grep "\(.\)\(.\)\(.\)\(.\).*\1\2\3\4" $w	Find words having a sequence of four letters repeated twice. Each parenthesized string is remembered and recalled through \n.
grep "\(....\).*\1" $w	A shorter way to do the same thing, using only one remembered string.
grep "\(..\)\(.*\1\)\{2\}" $w	Find a sequence of two characters and remember them. Then find any string followed by those same two characters and repeated twice.
see the example
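
Counting with \{n\} and remembering with \( \) can be seen on a few sample words. This is a sketch under the assumption that the words below (chosen here for illustration) stand in for the dictionary; it also checks that the long and short forms of the four-letter repeat find the same word.

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' queueing banana murmur beriberi cat)

# Five vowels in a row: "ueuei" in queueing.
five_vowels=$(printf '%s\n' "$words" | grep '[aeiou]\{5\}')

# Four letters that occur again later: "beri" ... "beri".
four_long=$(printf '%s\n' "$words" | grep '\(.\)\(.\)\(.\)\(.\).*\1\2\3\4')
four_short=$(printf '%s\n' "$words" | grep '\(....\).*\1')

echo "$five_vowels"   # queueing
echo "$four_short"    # beriberi
```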

sed info:

FORMAT

sed -f script filename

sed -n suppress automatic printing; lines are printed only when told to, through 'p'

sed -e 'instruction;instruction' file // for more than one instruction

INSTRUCTIONS

Delete

[address]d

address

/RE/ only lines containing string
line number
line addressing symbol used in RE

^ start of line

$ end of line

! can be used after an address so the command applies only to lines not matching it
mixture of above

1,$d -- deletes 1st to last line

1,/^$/d -- deletes 1st through blank line
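
The address forms above can be tried on a short piece of text. The sketch below assumes a made-up five-line input resembling a mail message (two header lines, a blank line, two body lines).

```shell
# Hypothetical input: two header lines, a blank line, two body lines.
text=$(printf '%s\n' 'Subject: test' 'From: me' '' 'body line 1' 'body line 2')

# Delete from line 1 through the first blank line.
no_header=$(printf '%s\n' "$text" | sed '1,/^$/d')

# ! inverts the address: delete every line that does NOT contain "body".
only_body=$(printf '%s\n' "$text" | sed '/body/!d')

# Delete from the first line to the last line ($), leaving nothing.
nothing=$(printf '%s\n' "$text" | sed '1,$d')

echo "$no_header"
```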

Substitute

[address]s/pattern/replacement/flags

replacement

& Replaced by the string matched by the RE

\n n is a single digit; matches the nth substring previously specified in the RE using "\(" and "\)"

\ Escape the & and \

flags

n -- any number from 1 to 512 -- replace only nth occurrence

g -- global

p -- prints

w file -- write to the file
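
A few substitutions on one line show how the replacement characters and flags behave. The input line and variable names here are assumptions made for illustration.

```shell
line='hello world hello world'

# & stands for whatever the pattern matched.
wrapped=$(printf '%s\n' "$line" | sed 's/hello/[&]/')
# \1 and \2 recall the first and second remembered substrings.
swapped=$(printf '%s\n' "$line" | sed 's/\(hello\) \(world\)/\2 \1/')
# A numeric flag replaces only that occurrence; g replaces all of them.
second=$(printf '%s\n' "$line" | sed 's/hello/bye/2')
all=$(printf '%s\n' "$line" | sed 's/hello/bye/g')
# \& in the replacement is a literal ampersand.
literal_amp=$(printf '%s\n' 'a and b' | sed 's/and/\&/')

echo "$wrapped"      # [hello] world hello world
echo "$swapped"      # world hello hello world
echo "$second"       # hello world bye world
echo "$all"          # bye world bye world
echo "$literal_amp"  # a & b
```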

Append

[line-address]a\

text

for multiple lines of text, end each line with a \

Insert

[line-address]i\

text

Change

[line-address]c\

text
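
Append, insert, and change are easiest to write in a script file run with sed -f, since each needs its text on the following line. The sketch below assumes a temporary script file and a three-line input, both invented for illustration.

```shell
# Hypothetical temporary file holding the sed script.
script=$(mktemp)
cat > "$script" <<'EOF'
1i\
HEADER
2c\
TWO
$a\
FOOTER
EOF

# Insert before line 1, change line 2, append after the last line ($).
result=$(printf '%s\n' one two three | sed -f "$script")
echo "$result"

rm -f "$script"
```

The output should be five lines: HEADER, one, TWO, three, FOOTER.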

Transform

[address]y/abc/xyz/ transforms each a to an x, b to a y, and c to z
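
A one-line check of the transform instruction, using a made-up input string:

```shell
# Each a becomes x, each b becomes y, each c becomes z; other characters pass through.
transformed=$(printf '%s\n' 'abc cab' | sed 'y/abc/xyz/')
echo "$transformed"   # xyz zxy
```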

Next
Read
Write

sed Examples:

Command Description

sed s/he/she/ file string substitution - replace the first occurrence of he on each line with she

sed s/he/she/g file string substitution - replace all occurrences of he on each line with she

sed 1s/he/she/g file string substitution - replace all occurrences of he on the first line with she

sed s/word.*/word/ file Replace word and all that follows it with word

sed s/word.*/word./ file Replace word and all that follows it with word adding a '.' at the end.

sed 2d file delete the second line from our display of the file.

sed 1,2d file deleting a range of lines (lines 1 through 2)

sed /more/d file deleting any line that matches a regular expression.

sed /more/p file duplicating any line that matches a regular expression

sed -n /more/p file Produce only the lines that match a regular expression.
The -n flag keeps sed "quiet" unless overtly told to print a line, through the 'p' command.

sed /test/s/is/at/ file Substitute the string 'is' by 'at' but only in lines containing a particular regular expression. Note that 'is' in the first line is unaffected.

sed s/i/x/2 file modify the 2nd occurrence of 'i' in each line

sed = file the equal sign places line numbers into the output.

see the example
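
Several of the commands above can be run against a three-line sample instead of a file. The text below is an assumption made for illustration, so the lines affected differ from those in the course's own example file.

```shell
# Hypothetical three-line input.
text=$(printf '%s\n' 'this is a test' 'one more line' 'no more tests')

# -n with p: print only the lines that match.
only_more=$(printf '%s\n' "$text" | sed -n '/more/p')

# Substitute only on lines containing "test"; the first "is" is inside "this".
guarded_first=$(printf '%s\n' "$text" | sed '/test/s/is/at/' | sed -n '1p')

# Replace only the second i on each line.
second_i=$(printf '%s\n' "$text" | sed 's/i/x/2' | sed -n '1p')

# = prints each line number on its own line before the line itself.
numbered=$(printf '%s\n' "$text" | sed = | sed -n '1,2p')

echo "$only_more"
echo "$guarded_first"   # that is a test
echo "$second_i"        # this xs a test
echo "$numbered"
```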

more sed

Command	Description
sed -n /^[a-e].*xy$/p $w	find the words beginning with a through e and ending in 'xy'
grep ^[a-e].*xy$ $w | sed s/p/m/	find those words and substitute p by m (first occurrence only)
grep ^[a-e].*xy$ $w | sed s/p/m/g	find those words and substitute all p's by m's ('g' means "global")
sed -n /^[a-e].*xy$/s/p/m/gp $w	similar to above, but showing only those lines affected by the substitution of p by m.
see the example
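
The difference between piping grep into sed and doing everything in sed shows up when a matching word contains no p. This sketch assumes a four-word temporary file made up for illustration.

```shell
LC_ALL=C; export LC_ALL

# Hypothetical word file standing in for the dictionary.
w=$(mktemp)
printf '%s\n' apoplexy boxy epoxy pixy > "$w"

# grep selects the words; sed edits them. boxy is kept even though it has no p.
piped=$(grep '^[a-e].*xy$' "$w" | sed 's/p/m/g')

# With -n and the p flag, only lines where a substitution happened are printed.
changed_only=$(sed -n '/^[a-e].*xy$/s/p/m/gp' "$w")

echo "$piped"          # amomlexy, boxy, emoxy
echo "$changed_only"   # amomlexy, emoxy

rm -f "$w"
```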

All work herein is subject to copyright. Original content to Dr. Deborah Whitfield, text content (Your UNIX/Linux) to Prentice Hall publishing.

Command	Description
history	lists commands you issued
history | wc	count lines in history
history | grep ls | wc	count number of times ls issued
history | grep cd | wc	count number of times cd issued
history | grep grep | wc	count number of times grep issued
see the example
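
Because history only has content in an interactive shell, the sketch below uses a made-up list of numbered commands in its place. It also shows that grep -c gives the same line count as piping grep into wc -l.

```shell
# Hypothetical history listing; in an interactive shell this would come from `history`.
hist=$(printf '%s\n' '1  ls' '2  cd /tmp' '3  ls -l' '4  grep zy words' '5  history | grep grep')

ls_count=$(printf '%s\n' "$hist" | grep ls | wc -l | tr -d ' ')   # 2
cd_count=$(printf '%s\n' "$hist" | grep -c cd)                    # 1
grep_count=$(printf '%s\n' "$hist" | grep -c grep)                # 2

echo "ls=$ls_count cd=$cd_count grep=$grep_count"
```

Note that these counts are of lines containing the string, so a command such as "false" would also be counted as an ls.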
