LevSelector.com
sed tutorial
intro
sed - Stream EDitor - works as a
filter, processing input line by line.
Sed reads one line at a time, chops off the terminating newline, and puts
what is left into the pattern space (buffer), where the sed script can process
it (often using regular expressions). Then, if there is anything to print,
sed appends a newline and prints out the result (usually to stdout).
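One consequence of the cycle described above is that a script never sees the trailing newline of the current line. A minimal sketch of this, using made-up two-line input (the variable names are only for illustration):

```shell
# The pattern space holds the line WITHOUT its trailing newline,
# so a substitution on \n finds nothing to replace.
unchanged=$(printf 'a\nb\n' | sed -e 's/\n/,/')
printf '%s\n' "$unchanged"

# $ anchors at the end of the pattern space; sed adds the newline back on output.
marked=$(printf 'a\nb\n' | sed -e 's/$/!/')
printf '%s\n' "$marked"
```

The first command prints the two lines unchanged; the second appends "!" to each line.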
• www.engin.umich.edu/htbin/mangate?manpage=sed - sed man page
There are many tutorials and FAQs - search Google for sed tutorial or sed faq
- www.faqs.org/faqs/editor-faq/sed/ - FAQ (also www.ptug.org/sed/sedfaq.html)
- www.dreamwvr.com/sed-info/sed-faq.html - FAQ
- http://spacsun.rice.edu/FAQ/sed.html - short 1-page intro - examples of usage
- http://spazioweb.inwind.it/seders/tutorials/ - list of links to sed tutorials
- http://spazioweb.inwind.it/seders/tutorials/sedtut_4.txt - original manual (1978 by Lee E. McMahon with the classic "Kubla Khan" example)
- www.math.fu-berlin.de/~leitner/sed/tutorial.html - tutorial by Felix von Leitner
Print out and read the following:
- www.dbnet.ece.ntua.gr/~george/sed/sedtut_1.html - good tutorial by Carlos Duarte
- www.dbnet.ece.ntua.gr/~george/sed/1liners.html - one-liners (compiled by Eric Pement) (see also www.cornerstonemag.com/sed/sed1line.txt)
Books:
- Sed & Awk, 2d edition, by Dale Dougherty & Arnold Robbins (O'Reilly, 1997)
- Mastering Regular Expressions, by Jeffrey E. F. Friedl (O'Reilly, 1997)
Several more sites: Yao-Jen Chang, Sven Guckes, Felix von Leitner, Yiorgos Adamopoulos, Eric Pement
sed examples #1
Simple commands = pattern + action. If no pattern is given, the action
is applied to all lines; otherwise it is applied only to lines matching
the pattern. Regular expressions are similar to those in Perl. Here is
a typical example of usage:
Example:
>cat file
I have three dogs and two cats
>sed -e 's/dog/cat/g' -e 's/cat/elephant/g' file
I have three elephants and two elephants
>
The way you usually use sed is as follows:
>sed -e 'command1' -e 'command2' -e 'command3' file
>{shell command}|sed -e 'command1' -e 'command2'
>sed -f sedscript.sed file
>{shell command}|sed -f sedscript.sed
so sed can read from a file or STDIN, and the commands can be
specified in a file or on the command line.
Note: trailing whitespace
in the sed script file can cause scripts to fail. Use an editor that can
show trailing spaces and lets you remove them (vim is a good choice).
Example:
To delete the first 10 lines of stdin and echo the rest to stdout:
sed -e '1,10d'
The -e tells sed to execute the next command-line argument as a sed program.
1,10 - pattern (here a range of line numbers)
d - action (delete - general syntax is [address1[,address2]]d )
Note that since sed programs often contain regular expressions, they often contain characters that your shell interprets. Get used to putting all sed programs in single quotes so that the shell leaves them alone.
Example: show only lines which match the pattern /line/:
sed -n -e '/line/p' test.txt
The -n suppresses the default printing of every line
the p activates printing for matched lines
test.txt is the input file to which this sed command is applied
Example: To print only the first ten lines, we delete
all the lines starting with line 11:
sed -e '11,$d'
Note that $ is the last line. Because sed(1) processes the input line by line, it does not keep the whole input in memory. This makes sed(1) very useful for processing large files, but it has its drawbacks, too. For example, we can't use sed -e '$-10,$d': sed does no address arithmetic, and it doesn't know where $ is before it reaches the end of the file, so it can't know where $-10 is. This is a real limitation, but sed(1) still has a large number of applications.
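One common way around the missing '$-10' address is to make two passes: count the lines first, then hand sed an ordinary numeric address. The sketch below is only an illustration; the temporary file, the 15-line sample, and the variable names are assumptions, and it expects the input to have more than 10 lines.

```shell
# Hypothetical sample file: 15 numbered lines.
tmp=$(mktemp)
seq 1 15 > "$tmp"

# Pass 1: count the lines. Pass 2: delete from line (total - 9) to the end,
# which removes exactly the last 10 lines.
# Assumes the file has more than 10 lines; otherwise the address would be < 1.
total=$(wc -l < "$tmp")
first_deleted=$((total - 9))
kept=$(sed -e "${first_deleted},\$d" "$tmp")
printf '%s\n' "$kept"

rm -f "$tmp"
```

This needs a real file (or a second read of the data), so it does not work on a pipe that can be read only once.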
Example: Another way to get only the first 10 lines
is to use the -n option:
sed -n -e '1,10p'
If we want to delete only one line, the pattern can be '10,10' or simply
'10'.
Example: More Than One Command (separated by new
lines):
sed -e '1,4d
6,9d'
This would delete the lines 1 to 4 and 6 to 9.
Example: use the -e option more than once:
sed -e '1,4d' -e '6,9d'
Note: you can omit the -e option if you have only one command in your
program. But you should get used to the -e option, so you won't have to
add it when you extend your program later on.
sed commands
General syntax for a command is:
[address1[,address2]] function [arguments]
if no address is given, a command is applied to all lines
if 1 address is given, then it is applied to all pattern spaces that
match that address
if 2 addresses are given, then it is applied to all from addr1 to addr2
(including addr1 and addr2 themselves).
Note: Addresses may be expressed as line numbers or as patterns. With patterns, the command is applied to each group of lines from a match of address1 through the next match of address2. If there are several such groups in one file, they will all be affected.
Command example:
1,2s/line/LINE/
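A small sketch of a pattern-to-pattern address range, showing that every matching group is affected. The BEGIN/END markers and the sample text are invented for the illustration:

```shell
# Made-up input with two BEGIN..END groups.
sample='header
BEGIN
a
END
between
BEGIN
b
END
footer'

# Print only the lines inside each /BEGIN/,/END/ group (both ends included).
blocks=$(printf '%s\n' "$sample" | sed -n -e '/BEGIN/,/END/p')
printf '%s\n' "$blocks"
```

Both groups are printed; "header", "between" and "footer" are not.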
Table of commands (the number in parentheses is the maximum number of addresses the command accepts):
(2)!cmd | exclamation sign means "Don't apply to specified addresses" |
(0)# | comment |
(0):label | place a label |
(1)= | display line number |
(2)D | delete first part of the pattern space (up to the first newline) |
(2)G | append contents of hold area to the pattern space |
(2)H | append pattern space to the hold buffer |
(2)N | append next line to the pattern space |
(2)P | print first part of the pattern space (up to the first newline) |
(1)a | append text |
(2)blabel | branch to label |
(2)c | change lines |
(2)d | delete lines |
(2)g | get contents of hold area (replacing the pattern space) |
(2)h | hold pattern space (in a hold buffer) |
(1)i | insert lines |
(2)l | list lines |
(2)n | next line |
(2)p | print the pattern space |
(1)q | quit |
(1)r file | read the contents of file |
(2)tlabel | test substitutions and branch on successful substitution |
(2)w file | write to file |
(2)x | exchange buffer space with pattern space |
(2){ | group commands |
(2)s/RE/replacement/[flags] | substitute |
(2)y/list1/list2/ | translates list1 into list2 |
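To see how several of these commands combine, here is a well-known one-liner that reverses the order of lines using the hold buffer. It is offered as an illustration of !, G, h and d working together, on made-up three-line input:

```shell
# 1!G : on every line except the first, append the hold buffer to the pattern space
# h   : copy the (growing) pattern space back into the hold buffer
# $!d : on every line except the last, delete the pattern space (print nothing)
reversed=$(printf 'a\nb\nc\n' | sed -e '1!G' -e 'h' -e '$!d')
printf '%s\n' "$reversed"
```

Only the last cycle prints, and by then the pattern space holds all lines in reverse order. Note that this keeps the whole input in memory, so it is not suited to very large files.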
regular expressions
The sed regular expressions are essentially the same as the grep regular
expressions. They are summarized below.
Note that several operators have to be escaped with backslashes:
curlies \{ \} , round brackets \( \), vertical bar \| , plus \+,
question mark \? . The star * is written unescaped (\* matches a literal asterisk).
^ | matches the beginning of the line |
$ | matches the end of the line |
. | dot matches any single character |
... * | match zero or more occurrences of (char or something) |
... \+ | match one or more occurrences of (char or something). GNU sed extension. |
... \? | match 0 or 1 instance of (character). GNU sed extension. |
[abcdef] | Match any character enclosed in [] (in this instance, a b c d e or f). Ranges of characters such as [a-z] are permitted. The behaviour of this deserves more description; see the page on grep for more details about the syntax of lists. To include `]' in the list, make it the first char; to include `-' in the list, make it the first or last. |
[^abcdef] | Match any character NOT enclosed in [] (in this instance, any character other than a b c d e or f) |
(character)\{m,n\} | Match m to n repetitions of (character) |
(character)\{m,\} | Match m or more repetitions of (character) |
(character)\{,n\} | Match n or fewer (possibly 0) repetitions of (character) |
(character)\{n\} | Match exactly n repetitions of (character) |
\(expression\) | Group operator. Also memorizes the match into numbered variables for use as backreferences \1 \2 .. \9 |
\1 \2 ... \9 | Backreference - matches the text of the i-th memorized \(..\) group |
expression1\|expression2 | Matches expression1 or expression2. Works with GNU sed, but this feature might not work with other forms of sed. |
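A short sketch of groups, backreferences and \{n\} counts in action. The sample strings and variable names are invented for the illustration:

```shell
# Swap two words: \1 is the first \(...\) group, \2 the second.
swapped=$(echo 'John Smith' | sed -e 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')
printf '%s\n' "$swapped"

# Reorder a yyyy-mm-dd date using exact repetition counts.
date_eu=$(echo '2024-01-15' | sed -e 's/\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)/\3.\2.\1/')
printf '%s\n' "$date_eu"
```

The first command prints "Smith, John" and the second prints "15.01.2024".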
sed examples
Example: delete all the lines that contain the word
``debug'' from the log file:
sed -e '/debug/d' < log
This works just like grep -v debug.
Example: delete lines with the word debug, and of the rest
keep only lines that contain ``foo''. The traditional way to handle this
would be:
grep 'foo' < log | grep -v debug
Note that this spawns two grep processes. The sed equivalent would
be:
sed -n -e '/debug/d' -e '/foo/p'
Here the -n option inhibits printing, the first command deletes all the lines
matching /debug/, and the second command forces printing of those remaining
lines that match /foo/.
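A quick way to convince yourself that the two pipelines agree is to run both on the same input. The four-line log below is made up for the illustration:

```shell
# Hypothetical log contents.
log='foo started
debug: foo value is 3
bar finished
foo finished'

# Two grep processes versus one sed process.
with_grep=$(printf '%s\n' "$log" | grep 'foo' | grep -v 'debug')
with_sed=$(printf '%s\n' "$log" | sed -n -e '/debug/d' -e '/foo/p')

printf '%s\n' "$with_sed"
```

Both variables end up holding the lines "foo started" and "foo finished".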
Example: Calling sed program from a file:
sed -f program.sed
To set the -n option from within your sed program, use ``#n'' as the
first line in your program file.
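A minimal sketch of a program file that starts with #n. The temporary script file and the sample input are assumptions made for the illustration:

```shell
# Write a two-line sed program to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#n
/foo/p
EOF

# No -n on the command line: the #n first line suppresses default printing.
printed=$(printf 'foo 1\nbar 2\nfoo 3\n' | sed -f "$script")
printf '%s\n' "$printed"

rm -f "$script"
```

Only the two lines containing "foo" are printed. The #n must be exactly the first two characters of the file, followed directly by a newline.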
Example: Inserting Text with 'a' (append) or 'i'
(insert) actions:
To insert a string just before line 10.
10i\
I am a string
To append a string after the last line:
$a\
I am a string
Example: Replacing the current line:
10c\
new contents for line 10
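The same i, c and a commands can also be given on the command line, one per -e, with the text on the line after the backslash. A sketch on made-up three-line input:

```shell
# Each -e argument contains the command, a backslash-newline, and the text.
edited=$(printf 'one\ntwo\nthree\n' | sed -e '1i\
inserted before line 1' -e '2c\
replacement for line 2' -e '$a\
appended after the last line')
printf '%s\n' "$edited"
```

The result has five lines: the inserted line, "one", the replacement for line 2, "three", and the appended line.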
Example: the command 'l' (lowercase L, as in 'life') makes sed show non-printable characters visibly and wrap long lines using '\' at the end. Normal backslashes in the text are escaped, too; tabs are shown as \t, other non-printable characters are printed as escaped three-digit octal numbers, and the end of each line is marked with $.
sed -n -e 'l' <test.txt
a\tb\tc$
d\te\tf$
Example: use 'q' action to end processing. So, yet another
way of printing the first 10 lines would have been:
sed -e '10q'
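The three ways of printing the first ten lines shown so far can be compared side by side. The 25-line numbered input is an assumption for the illustration:

```shell
# Three equivalent ways to keep lines 1-10 of a 25-line input.
by_delete=$(seq 1 25 | sed -e '11,$d')
by_print=$(seq 1 25 | sed -n -e '1,10p')
by_quit=$(seq 1 25 | sed -e '10q')

printf '%s\n' "$by_quit"
```

All three produce the same output. The 'q' variant stops reading after line 10, which matters on large inputs; the other two read the input to the end.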
Example: substitutions using regular expressions:
's/pattern/replacement/[flags]' - this is the most often used sed command.
sed -e 's/foo/bar/'
which would just change the string ``foo'' to ``bar''.
The format for the substitute command is as follows:
[address1[,address2]]s/pattern/replacement/[flags]
The flags can be any of the following:
n | replace only the nth instance of pattern with replacement (n is a number) |
g | replace all instances of pattern with replacement |
p | write pattern space to STDOUT if a successful substitution takes place |
w file | write the pattern space to file if a successful substitution takes place |
Note: we can use different delimiters (for example one of these: @%,;:) instead of '/'.
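A short sketch of an alternative delimiter and of the numeric and g flags. The sample strings are made up for the illustration:

```shell
# With @ as the delimiter, the slashes in a path need no escaping.
path=$(echo '/usr/local/bin' | sed -e 's@/usr/local@/opt@')
printf '%s\n' "$path"

# A numeric flag replaces only that occurrence; g replaces all of them.
second=$(echo 'a a a' | sed -e 's/a/b/2')
all=$(echo 'a a a' | sed -e 's/a/b/g')
printf '%s\n%s\n' "$second" "$all"
```

This prints "/opt/bin", then "a b a", then "b b b".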
Example:
>cat file
the black cat was chased by the brown dog
>sed -e 's/black/white/g' file
the white cat was chased by the brown dog
Example: do substitution only in lines which match some pattern. In this example, the substitution is only applied to lines matching the regular expression /often/.
>cat file
the black cat was chased by the brown dog.
the black cat was often chased by the brown dog.
>sed -e '/often/s/black/white/g' file
the black cat was chased by the brown dog.
the white cat was often chased by the brown dog.
Example:
>cat file
line 1 (one)
line 2 (two)
line 3 (three)
>sed -e '1,2s/line/LINE/' file
LINE 1 (one)
LINE 2 (two)
line 3 (three)
>sed -e '/^line.*one/s/line/LINE/' -e '/line/d' file
LINE 1 (one)
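Example: Find First Word From a List in a File
This example uses backreferences ( \1, etc.) and command
grouping with curly braces. It relies on \| , so it needs GNU sed.
#!/bin/sh
# Alternation of the words to look for
X='word1\|word2\|word3\|word4\|word5'
sed -e "
# skip lines that contain none of the words
/$X/!d
/$X/{
# cut everything after the first word found ...
s/\($X\).*/\1/
# ... and everything before it
s/.*\($X\)/\1/
q
}" "$1"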
Note: the * operator is greedy.
Pattern matching across several lines - use N command to append the next line to a pattern space (or better use Perl for this task).
Example:
/Microsoft[ \t]*$/{
N
}
/Microsoft[ \t\n]*Windows[ \t]*$/{
N
}
s/Microsoft[ \t\n]*Windows[ \t\n]*95/Linux/g
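A reduced version of the same idea, small enough to run directly: when a line ends in "Microsoft", N pulls in the next line so that the substitution can match across the line break. The two-line input is made up for the illustration:

```shell
# /Microsoft$/N : if the line ends with "Microsoft", append the next line
# s/...\n.../   : \n in the pattern matches the newline that N embedded
joined=$(printf 'Microsoft\nWindows 95 is here\n' |
  sed -e '/Microsoft$/N' -e 's/Microsoft\nWindows 95/Linux/')
printf '%s\n' "$joined"
```

This prints the single line "Linux is here".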
Example: remove html tags (they may span several lines and they can be nested)
:top
/<.*>/{
s/<[^<>]*>//g
t top
}
/</{
N
b top
}
A fine point: why didn't we replace the third line of the script with
s/<[^>]*>//g
and remove the t command that follows? Well, consider this sample
file:
<<hello>
hello>
The desired output is empty, since everything is enclosed
in angled brackets. However, the output will be an empty line followed by:
hello>
since the whole first line matches the expression <[^>]*> and the second line is left without its opening bracket. So the point
is that we have set up the script to recursively remove the contents of
the innermost matching pair of delimiters.
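The difference can be checked on the sample file above. This sketch assumes GNU sed; the temporary script file and the variable names are only for illustration:

```shell
# The tag-removal script, written to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
:top
/<.*>/{
s/<[^<>]*>//g
t top
}
/</{
N
b top
}
EOF

# Innermost-pair removal versus the naive single substitution.
nested=$(printf '<<hello>\nhello>\n' | sed -f "$script")
naive=$(printf '<<hello>\nhello>\n' | sed -e 's/<[^>]*>//g')

printf 'nested: [%s]\n' "$nested"
printf 'naive:  [%s]\n' "$naive"

rm -f "$script"
```

With the full script nothing but an empty line is left, while the naive substitution leaves "hello>" behind on the second line.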
----------------------------------------------