Grep
g/re/p
: globally search for a regular expression and print matching lines
Usage
grep [OPTION]... PATTERNS [FILE]...
instructions from running grep --help
PATTERNS can contain multiple patterns separated by newlines. When FILE is '-', read standard input. With no FILE, read '.' if recursive, '-' otherwise. With fewer than two FILEs, assume -h. Exit status is 0 if any line is selected, 1 otherwise; if any error occurs and -q is not given, the exit status is 2.
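The three exit statuses described above can be observed directly. This is a minimal sketch assuming GNU grep; the temporary file and its contents are made up for illustration.

```shell
# Sample data in a throwaway directory (hypothetical file names).
tmp=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$tmp/words.txt"

# 0: at least one line was selected.
found=0;   grep -q 'beta'  "$tmp/words.txt" || found=$?
# 1: no line was selected.
missing=0; grep -q 'delta' "$tmp/words.txt" || missing=$?
# 2: an error occurred (the file does not exist) and -q was not given.
failed=0;  grep 'beta' "$tmp/no-such-file" 2>/dev/null || failed=$?

echo "$found $missing $failed"   # 0 1 2
rm -rf "$tmp"
```

Because the status is 0 only on a match, `grep -q` fits naturally into `if` conditions.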
Grep supports basic regular expressions by default.
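What "basic by default" means in practice is easiest to see with a pattern that reads differently in each mode. The sketch below assumes GNU grep and uses two made-up input lines.

```shell
sample=$(printf '%s\n' 'a+b' 'aab')

# BRE (default): "+" is an ordinary character, so only the literal "a+b" matches.
bre=$(printf '%s\n' "$sample" | grep 'a+b')

# ERE (-E): "a+" means one or more "a", so "aab" matches and "a+b" does not.
ere=$(printf '%s\n' "$sample" | grep -E 'a+b')

# Fixed strings (-F): no character has a special meaning.
fixed=$(printf '%s\n' "$sample" | grep -F 'a+b')

echo "BRE: $bre, ERE: $ere, fixed: $fixed"   # BRE: a+b, ERE: aab, fixed: a+b
```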
most used switches
--color={always|never|auto}
: highlight the matching strings with color markers
--exclude-dir=GLOB
: skip directories that match GLOB
--exclude=GLOB
: skip files that match GLOB
--include=GLOB
: search only files that match GLOB (a file pattern)
-C, --context=NUM
: print NUM lines of output context
-E, --extended-regexp
: PATTERNS are extended regular expressions
-F, --fixed-strings
: PATTERNS are strings
-L, --files-without-match
: print only names of FILEs with no selected lines
-c, --count
: print only a count of selected lines per FILE
-i, --ignore-case
: ignore case distinctions in patterns and data
-l, --files-with-matches
: print only names of FILEs with selected lines
-n, --line-number
: print line number with output lines
-q, --quiet, --silent
: suppress all normal output
-r, --recursive
: search directories recursively (the current directory if no FILE is given)
-v, --invert-match
: select non-matching lines
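A few of these switches combined, as a rough sketch assuming GNU grep. The directory layout and file contents are invented for the example.

```shell
# Hypothetical project tree in a throwaway directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/.git"
printf '%s\n' 'int main(void) { return 0; }' '// TODO: parse args' > "$tmp/src/main.c"
printf '%s\n' '# todo: add tests' > "$tmp/src/util.py"
printf '%s\n' 'TODO inside VCS metadata' > "$tmp/.git/notes"

# -r recurses, -n adds line numbers, --include keeps only matching file names.
with_numbers=$(cd "$tmp" && grep -rn --include='*.c' 'TODO' .)

# -i ignores case, -l lists file names only, --exclude-dir skips .git.
files=$(cd "$tmp" && grep -ril --exclude-dir=.git 'todo' . | sort)

# -c counts selected lines; with -v the selected lines are the non-matching ones.
non_todo=$(grep -cv 'TODO' "$tmp/src/main.c")

echo "$with_numbers"   # ./src/main.c:2:// TODO: parse args
echo "$files"          # ./src/main.c and ./src/util.py, one per line
echo "$non_todo"       # 1
rm -rf "$tmp"
```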
a nice alias for grep
alias grep="grep --color=auto --exclude-dir={.bzr,CVS,.git,.hg,.svn,.idea,.tox}"
Alternatives
Sed
stream editor
Usage
sed [OPTION]... {script-only-if-no-other-script} [input-file]...
instructions from running sed --help
If no -e, --expression, -f, or --file option is given, then the first non-option argument is taken as the sed script to interpret. All remaining arguments are names of input files; if no input files are specified, then the standard input is read.
Sed supports basic regular expressions by default.
most used switches
--debug
: annotate program execution
-E, -r, --regexp-extended
: use extended regular expressions in the script (for portability use POSIX -E)
-e script, --expression=script
: add the script to the commands to be executed
-i[SUFFIX], --in-place[=SUFFIX]
: edit files in place (makes backup if SUFFIX supplied)
-n, --quiet, --silent
: suppress automatic printing of pattern space
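The switches above in action, as a small sketch assuming GNU sed. The file name and its contents are hypothetical.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'color=red' 'size=10' > "$tmp/conf.txt"

# -n suppresses automatic printing; the p command prints only selected lines.
only_size=$(sed -n '/^size=/p' "$tmp/conf.txt")                         # size=10

# -E enables extended regular expressions: groups need no backslashes.
swapped=$(printf 'key=value\n' | sed -E 's/^(.+)=(.+)$/\2=\1/')         # value=key

# Several -e scripts are applied in order to each line.
chained=$(printf 'abc\n' | sed -e 's/a/1/' -e 's/b/2/')                 # 12c

# -i.bak edits the file in place and keeps the original as conf.txt.bak.
sed -i.bak 's/red/blue/' "$tmp/conf.txt"
edited=$(cat "$tmp/conf.txt")
backup=$(cat "$tmp/conf.txt.bak")

echo "$only_size | $swapped | $chained"
rm -rf "$tmp"
```

Supplying a SUFFIX with -i is a cheap safety net when the script has not been tested on the file yet.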
most used commands
d
DELETE pattern space
<address>d
seq 10 | sed 1,5d # 6 7 8 9 10
printf 'hello\n\nworld\n' | sed '/^$/d' # hello world
p
PRINT pattern space to stdout. Usually used with -n
<address>p
seq 10 | sed -n 1~3p # 1 4 7 10
q
QUIT
<address>q
seq 10 | sed 3q # 1 2 3
s
SUBSTITUTE
<address>s/regexp/replacement/flags
command | sed -E 's/(apple)/\U\1/g'
The delimiter / may be replaced by any other single character. This is particularly useful if regexp itself contains /.
sed 's#/#_#g' <<< "path/to/file" # path_to_file
{ Commands }
<address>{ command1; command2; command3 }
seq 10 | sed '{3,7s/[[:digit:]]/x/; /x/d}' # 1 2 8 9 10
Special sequences
\E
: stop case conversion
\L
: lowercase all characters after it
\U
: uppercase all characters after it
\l
: lowercase next character
\u
: uppercase next character
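A short sketch of each sequence inside the replacement part of s. These case-conversion sequences are GNU sed extensions, so the example assumes GNU sed; the input strings are arbitrary.

```shell
# \U ... \E: uppercase the first group only, then stop converting.
upper=$(printf 'hello world\n' | sed -E 's/(hello) (world)/\U\1\E \2/')    # HELLO world

# \L: lowercase everything that follows in the replacement.
lower=$(printf 'HELLO World\n' | sed 's/.*/\L&/')                           # hello world

# \u and \l: change the case of just the next character.
title=$(printf 'hello world\n' | sed -E 's/([a-z]+) ([a-z]+)/\u\1 \u\2/')   # Hello World
first_lower=$(printf 'HELLO\n' | sed 's/.*/\l&/')                           # hELLO

echo "$upper | $lower | $title | $first_lower"
```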
Flags
- number: replace only the number-th match
- g: replace all matches (only the first match is replaced by default)
- I: case-insensitive mode
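The flags compared on the same input, as a quick sketch assuming GNU sed (the I flag and the number-plus-g combination are GNU extensions).

```shell
second=$(printf 'a-a-a\n' | sed 's/a/X/2')         # a-X-a  (second match only)
all=$(printf 'a-a-a\n' | sed 's/a/X/g')            # X-X-X  (every match)
nocase=$(printf 'A-a-a\n' | sed 's/a/X/I')         # X-a-a  (first match, ignoring case)
from_second=$(printf 'a-a-a\n' | sed 's/a/X/2g')   # a-X-X  (second match and all after it)

echo "$second $all $nocase $from_second"
```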
Address
- number: the number-th line
- /regexp/: lines matching a pattern
- /regexp/I: lines matching a pattern in case-insensitive mode
- start,end: lines within a range
- start~step: every step-th line, starting from line start
- $: the last line
printf "%s\n" a b c | sed '1d' # b c
printf "%s\n" a B c | sed '/b/Id' # a c
printf "%s\n" a b c | sed '$d' # a b
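The range and step forms of an address can be sketched the same way. This assumes GNU sed (start~step is a GNU extension) and uses `xargs` only to join the output onto one line.

```shell
range=$(seq 10 | sed -n '3,5p' | xargs)            # 3 4 5
step=$(seq 10 | sed -n '2~4p' | xargs)             # 2 6 10
regex_range=$(seq 10 | sed -n '/3/,/5/p' | xargs)  # 3 4 5
to_end=$(seq 10 | sed -n '8,$p' | xargs)           # 8 9 10

echo "$range | $step | $regex_range | $to_end"
```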
Appending ! after an address negates the selection.
printf "%s\n" a b B c | sed '/b/I!d' # b B
Snippet
- capitalize
function capitalize {
  sed -E "s/^(.)/\U\1/"
}
- dequote
function dequote {
  sed -e "s/^'//" -e "s/'$//"
}
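Both snippets read standard input, so they are meant to sit at the end of a pipeline. A usage sketch follows, with the functions restated in the portable `name() { ...; }` form (an assumption made so the block also runs under plain sh) and GNU sed assumed for \U.

```shell
capitalize() { sed -E 's/^(.)/\U\1/'; }
dequote() { sed -e "s/^'//" -e "s/'$//"; }

cap=$(printf 'hello world\n' | capitalize)             # Hello world
deq=$(printf "'quoted text'\n" | dequote)              # quoted text
both=$(printf "'hello'\n" | dequote | capitalize)      # Hello

echo "$cap | $deq | $both"
```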
Awk
[Designers] Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan @ AT&T Bell Laboratories
Usage
awk [POSIX or GNU style options] [--] 'program' file ...
Awk supports most extended regular expressions (ERE) by default.
most used switches
-F fs, --field-separator=fs
: use fs as the input field separator
-v var=val
: assign val to the variable var before the program starts
Program structure
<pattern> { <action1>; <action2> }; <pattern> { <action1>; <action2> }
Patterns
- BEGIN, END: BEGIN and END are two special kinds of patterns which are not tested against the input. The action parts of all BEGIN patterns are merged as if all the statements had been written in a single BEGIN rule. They are executed before any of the input is read. Similarly, all the END rules are merged, and executed when all the input is exhausted (or when an exit statement is executed). BEGIN and END patterns cannot be combined with other patterns in pattern expressions. BEGIN and END patterns cannot have missing action parts.
- <test_expression>: tests whether certain fields match certain regular expressions
- /<regexp>/: see the regular expressions section in man awk
- &&, ||, !: AND, OR, NOT
- /<regexp1>/, /<regexp2>/: range, from a record matching regexp1 through the next record matching regexp2 (inclusive)
- null (empty pattern): matches all records
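Each kind of pattern in a one-liner, as a minimal sketch that should work with any POSIX awk. The names and ages are made-up sample records.

```shell
people=$(printf '%s\n' 'alice 30' 'bob 17' 'carol 45')

# BEGIN runs before any input, END after all of it; the empty pattern
# in the middle matches every record.
count=$(printf '%s\n' "$people" | awk 'BEGIN { n = 0 } { n++ } END { print n }')   # 3

# Test expression on a field.
adults=$(printf '%s\n' "$people" | awk '$2 >= 18 { print $1 }' | xargs)            # alice carol

# Regular expressions combined with && and !.
young_o=$(printf '%s\n' "$people" | awk '/o/ && $2 < 40 { print $1 }')             # bob
no_o=$(printf '%s\n' "$people" | awk '!/o/ { print $1 }')                          # alice

# Range: from the record matching the first regexp through the one matching the second.
block=$(printf '%s\n' x START y STOP z | awk '/START/,/STOP/' | xargs)             # START y STOP

echo "$count | $adults | $young_o | $no_o | $block"
```

A rule with a pattern and no action, as in the range example, prints the matching records.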
Actions
Action statements are enclosed in braces, { and }. Action statements consist of the usual assignment, conditional, and looping statements found in most languages. The operators, control statements, and input/output statements available are patterned after those in C.
For a detailed list, see the actions section in man awk.
most used control statement
if (<cond>) <stmt> else <stmt>
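A small sketch of the if/else form inside an action; the threshold of 5 is arbitrary.

```shell
# Label each input number; xargs joins the output lines for display.
labels=$(printf '%s\n' 3 10 7 |
  awk '{ if ($1 >= 5) print "big"; else print "small" }' | xargs)

echo "$labels"   # small big big
```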
most used i/o statements
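- next: stop processing the current record and move on to the next one
- print <expr_list>: print an expression list; expressions separated by commas are joined with OFS, expressions separated only by spaces are concatenated
- printf <fmt>, <expr_list>: formatted print, similar to printf in bash
Both print and printf can be followed by the redirections > and >>.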
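The statements above working together, as a sketch for any POSIX awk. The variable `log_file` and the file name `names.txt` are hypothetical and only serve to show redirection.

```shell
tmp=$(mktemp -d)
out=$(printf '%s\n' 'skip me' 'alice 30' 'bob 17' |
  awk -v log_file="$tmp/names.txt" '
    /^skip/ { next }                    # drop this record and read the next one
    { print $1, $2 }                    # comma: fields are joined with OFS
    { printf "%-6s|%3d\n", $1, $2 }     # formatted output, newline must be explicit
    { print $1 > log_file }             # redirect print output to a file
  ')
logged=$(cat "$tmp/names.txt")

echo "$out"
echo "$logged"
rm -rf "$tmp"
```

Unlike in the shell, > inside awk truncates the file only when it is first opened; later writes during the same run append to it.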
Builtins
Variable
- FS: Field separator
- OFS: Output field separator
Print username and default shell
awk 'BEGIN {FS=":"; OFS="-";} {print $1, $NF}' /etc/passwd
Same as
awk -F':' -v OFS='-' '{print $1, $NF}' /etc/passwd
- NR: Record number
Print file with line number
awk '{print NR, $0}' <filename>
Count lines
awk 'END {print NR}' <filename>
- NF: Number of fields in the current record
Print the last field of each record
awk '{print $NF}' <filename>
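The same variables on inline input, so the results are visible without a file. A sketch for any POSIX awk; the passwd-style line is a made-up sample.

```shell
# NR is the record number, NF the number of fields, $NF the last field.
fields=$(printf '%s\n' 'a b c' 'd e' | awk '{ print NR, NF, $NF }')

# FS splits the input on ":", OFS joins the printed fields with "-".
shell_of=$(printf 'root:x:0:0:root:/root:/bin/bash\n' |
  awk -F: -v OFS=- '{ print $1, $NF }')

echo "$fields"     # "1 3 c" then "2 2 e"
echo "$shell_of"   # root-/bin/bash
```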
Function
gsub(re, sub [, str])
: Replace every substring matching re with sub in str (default $0)
sub(re, sub [, str])
: Same as gsub, but only replace the first match
length(str)
: Length of str (default $0)
substr(str, idx [, len])
: Substring of str at index idx of length len; uses the rest of str if len is not provided
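A brief sketch of the four functions, for any POSIX awk. Note that string indices in awk start at 1.

```shell
# gsub replaces every match in $0, sub only the first one.
every=$(printf 'a-b-c\n' | awk '{ gsub(/-/, "+"); print }')         # a+b+c
first=$(printf 'a-b-c\n' | awk '{ sub(/-/, "+"); print }')          # a+b-c

# A third argument makes the function work on that variable instead of $0.
field=$(printf 'a-b c-d\n' | awk '{ gsub(/-/, "", $2); print }')    # a-b cd

# length and substr, with and without the len argument.
parts=$(printf 'hello world\n' |
  awk '{ print length($0), substr($0, 7), substr($0, 1, 5) }')      # 11 world hello

echo "$every | $first | $field | $parts"
```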