.SH NAME
awk \- pattern-directed scanning and processing language
.SH DESCRIPTION
.I Awk
scans each input
.I file
for lines that match any of a set of patterns specified literally in
.I prog
or in one or more files
specified as
.B \-f
.IR progfile .
With each pattern
there can be an associated action that will be performed
when a line of a
.I file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
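As a minimal sketch of this matching rule, the shell fragment below feeds three made-up lines to a two-statement program.
The input text and the shell variable name out are illustrative assumptions, not part of awk.
A line that matches both patterns triggers both actions, in program order.

```shell
# Hypothetical input: "apple pie" matches both patterns,
# so both actions run for it, in program order.
out=$(printf 'apple pie\nbanana\ncherry pie\n' |
	awk '/apple/ { print "fruit: " $0 }
	     /pie/   { print "dessert: " $0 }')
printf '%s\n' "$out"
```

The line banana matches neither pattern and produces no output.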
The file name
.B \-
means the standard input.
Any
.I file
of the form
.I var=value
is treated as an assignment, not a file name,
and is executed at the time it would have been opened if it were a file name.
The option
.B \-v
followed by
.I var=value
is an assignment to be done before the program
is executed;
any number of
.B \-v
options may be present.
The
.B \-F
.I fs
option defines the input field separator to be the regular expression
.IR fs .
.PP
An input line is normally made up of fields separated by white space,
or by regular expression
.BR FS .
The fields are denoted
.BR $1 ,
.BR $2 ,
\&..., while
.B $0
refers to the entire line.
If
.B FS
is null, the input line is split into one field per character.
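The following sketch illustrates field splitting, assuming a POSIX shell and made-up input records.
The shell variable names (rec, by_colon, by_space, whole) are hypothetical.
It shows a separator set with the F option, the default white-space splitting, and the whole line.

```shell
# Hypothetical colon-separated record; -F: makes ":" the field separator.
rec='glenda:x:1000:users:/usr/glenda:/bin/rc'
by_colon=$(printf '%s\n' "$rec" | awk -F: '{ print $1, $6, NF }')

# Default splitting: runs of blanks and tabs separate fields,
# and leading white space is ignored.
by_space=$(printf '  alpha   beta\tgamma\n' | awk '{ print NF, $2 }')

# $0 is the whole input line, unchanged.
whole=$(printf 'a b c\n' | awk '{ print $0 }')

printf '%s\n%s\n%s\n' "$by_colon" "$by_space" "$whole"
```

NF, the number of fields in the current record, is printed here only to make the split visible.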
.PP
To compensate for inadequate implementation of storage management,
the
.B \-mr
option can be used to set the maximum size of the input record,
and the
.B \-mf
option to set the maximum number of fields.
.PP
The
.B \-safe
option causes
.I awk
to run in ``safe mode,''
in which it is not allowed to
run shell commands or open files
and the environment is not made available
in the
.B ENVIRON
variable.
.PP
A pattern-action statement has the form
.IP
.IB pattern " { " action " }
.PP
A missing
.BI { " action " }
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
.PP
An action is a sequence of statements.
A statement can be one of the following:
.PP
.EX
.ta \w'\fLdelete array[expression]'u
if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
while(\fI expression \fP)\fI statement\fP
for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
for(\fI var \fPin\fI array \fP)\fI statement\fP
do\fI statement \fPwhile(\fI expression \fP)
break
continue
{\fR [\fP\fI statement ... \fP\fR] \fP}
\fIexpression\fP	#\fR commonly\fP\fI var = expression\fP
print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
return\fR [ \fP\fIexpression \fP\fR]\fP
next	#\fR skip remaining patterns on this input line\fP
nextfile	#\fR skip rest of this file, open next, start at top\fP
delete\fI array\fP[\fI expression \fP]	#\fR delete an array element\fP
delete\fI array\fP	#\fR delete all elements of array\fP
exit\fR [ \fP\fIexpression \fP\fR]\fP	#\fR exit immediately; status is \fP\fIexpression\fP
.EE
.DT
.PP
Statements are terminated by
semicolons, newlines or right braces.
An empty
.I expression-list
stands for
.BR $0 .
String constants are quoted \&\fL"\ "\fR,
with the usual C escapes recognized within.
Expressions take on string or numeric values as appropriate,
and are built using the operators
.B + \- * / % ^
(exponentiation), and concatenation (indicated by white space).
The operators
.B
! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.IB x [ i ] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
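A small sketch of string subscripts used as associative memory follows.
The array name count, the colour words and the shell variable out are made up for illustration.
Each input line is used directly as a subscript, and an element that was never assigned reads as the null string (0 in a numeric context).

```shell
# count is a hypothetical array name; its subscripts are the input lines.
out=$(printf 'red\nblue\nred\ngreen\nred\n' |
	awk '{ count[$0]++ }
	END {
		print "red", count["red"]
		print "blue", count["blue"]
		# an element never assigned is null; adding 0 makes it numeric
		print "mauve", count["mauve"] + 0
	}')
printf '%s\n' "$out"
```

Specific subscripts are printed rather than looping over the array, because the order in which a for-in loop visits subscripts is not defined.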
Multiple subscripts such as
.B [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.BR SUBSEP .
.PP
The
.B print
statement prints its arguments on the standard output
(or on a file if
.BI > file
or
.BI >> file
is present or on a pipe if
.BI | cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.I file
and
.I cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.B printf
statement formats its expression list according to the format.
The built-in function
.BI close( expr )
closes the file or pipe
.IR expr .
The built-in function
.BI fflush( expr )
flushes any buffered output for the file or pipe
.IR expr .
If
.I expr
is omitted or is a null string, all open files are flushed.
.PP
The mathematical functions
.BR exp ,
.BR log ,
.BR sqrt ,
.BR sin ,
.BR cos ,
and
.B atan2
are built in.
Other built-in functions:
.TP
.B length
If its argument is a string, the string's length is returned.
If its argument is an array, the number of subscripts in the array is returned.
If no argument, the length of
.B $0
is returned.
.TP
.B rand
random number on (0,1)
.TP
.B srand
sets seed for
.B rand
and returns the previous seed.
.TP
.B int
truncates to an integer value
.TP
.B utf
converts its numerical argument, a character number, to a
.SM UTF
string
.TP
.BI substr( s , " m")
the maximum length substring of
.I s
that begins at position
.I m
counted from 1.
.TP
.BI substr( s , " m" , " n\fL)
the
.IR n -character
substring of
.I s
that begins at position
.I m
counted from 1.
.TP
.BI index( s , " t" )
the position in
.I s
where the string
.I t
occurs, or 0 if it does not.
.TP
.BI match( s , " r" )
the position in
.I s
where the regular expression
.I r
occurs, or 0 if it does not.
The variables
.B RSTART
and
.B RLENGTH
are set to the position and length of the matched string.
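.TP
.BI split( s , " a" , " fs\fL)
splits the string
.I s
into array elements
.IB a [1]\fR,
.IB a [2]\fR,
\&...,
.IB a [ n ]\fR,
and returns
.IR n .
The separation is done with the regular expression
.I fs
or with the field separator
.B FS
if
.I fs
is not given.
An empty string as field separator splits the string
into one array element per character.
.TP
.BI sub( r , " t" , " s\fL)
substitutes
.I t
for the first occurrence of the regular expression
.I r
in the string
.IR s .
If
.I s
is not given,
.B $0
is used.
An
.B &
in
.I t
is replaced by the match.
.TP
.B gsub
same as
.B sub
except that all occurrences of the regular expression
are replaced;
.B sub
and
.B gsub
return the number of replacements.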
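.TP
.BI sprintf( fmt , " expr" , " ...\fL)
the string resulting from formatting
.I expr ...
according to the format
.IR fmt .
.TP
.BI system( cmd )
executes
.I cmd
and returns its exit status
.TP
.BI tolower( str )
returns a copy of
.I str
with all upper-case characters translated to their
corresponding lower-case equivalents.
.TP
.BI toupper( str )
returns a copy of
.I str
with all lower-case characters translated to their
corresponding upper-case equivalents.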
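The string functions listed above can be exercised with a short sketch like the one below.
The test string and the names s, t, n, part and out are arbitrary choices for illustration.
Each call that modifies a string or sets variables is made in its own statement, so the printed values do not depend on the order in which print evaluates its arguments.

```shell
out=$(awk 'BEGIN {
	s = "hello, world"
	print substr(s, 8)        # from position 8 to the end
	print substr(s, 1, 5)     # 5 characters starting at position 1
	print index(s, "wor")     # position of the string, or 0
	match(s, /o+/)            # sets RSTART and RLENGTH
	print RSTART, RLENGTH
	n = split("a:b:c", part, ":")
	print n, part[1], part[3]
	t = s
	n = gsub(/o/, "0", t)     # replace every match in t
	print n, t
	n = sub(/l+/, "[&]", t)   # & stands for the matched text
	print n, t
	print toupper(s)
}')
printf '%s\n' "$out"
```

Positions are counted from 1, so the w of world is at position 8.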
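.PP
The ``function''
.B getline
sets
.B $0
to the next input record from the current input file;
.B getline
.BI < file
sets
.B $0
to the next record from
.IR file .
.B getline
.I x
sets variable
.I x
instead.
Finally,
.IB cmd " | getline"
pipes the output of
.I cmd
into
.BR getline ;
each call of
.B getline
returns the next line of output from
.IR cmd .
In all cases,
.B getline
returns 1 for a successful input,
0 for end of file, and \-1 for an error.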
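The return values of getline can be seen in a sketch such as the following.
The command string, the variable names (cmd, line, n, r, x, out) and the file path are made up for illustration, and the path is assumed not to exist.
The getline expressions are parenthesized so that the comparison and the redirection are not confused with one another.

```shell
out=$(awk 'BEGIN {
	# cmd | getline: each call reads the next line of output from cmd.
	cmd = "echo one; echo two"
	while ((cmd | getline line) > 0)   # 1 per line read, 0 at end of input
		n++
	close(cmd)
	print n, line                      # line still holds the last line read

	# getline < file: a file that cannot be opened yields -1.
	# The path is a hypothetical name assumed not to exist.
	r = (getline x < "/nonexistent/awk-demo-input")
	print r
}')
printf '%s\n' "$out"
```

Testing for a result greater than 0, rather than for a non-zero result, keeps the loop from spinning forever when getline reports an error.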
.PP
Patterns are arbitrary Boolean combinations
(with
.BR "! || &&" )
of regular expressions and
relational expressions.
Regular expressions are as in
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.B ~
and
.BR !~ .
.BI / re /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression, except in the position of an isolated regular expression
in a pattern.
.PP
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
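A brief sketch of a two-pattern range follows; the input lines and the shell variable out are invented for illustration.
With no action given, every line in a range is printed.
A range that is opened but never closed runs to the end of the input.

```shell
# Lines from each "start" through the next "stop" are printed.
# The second range has no "stop", so it extends to the last line.
out=$(printf 'a\nstart\nb\nstop\nc\nstart\nd\n' | awk '/start/, /stop/')
printf '%s\n' "$out"
```

The lines a and c fall outside both ranges and are not printed.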
.PP
A relational expression is one of the following:
.IP
.I expression matchop regular-expression
.br
.I expression relop expression
.br
.IB expression " in " array-name
.br
.BI ( expr , expr,... ") in " array-name
.PP
where a
.I relop
is any of the six relational operators in C,
and a
.I matchop
is either
.B ~
(matches)
or
.B !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
.PP
The special patterns
.B BEGIN
and
.B END
may be used to capture control before the first input line is read
and after the last.
.B BEGIN
and
.B END
do not combine with other patterns.
.PP
Variable names with special meanings:
.TP
.B CONVFMT
conversion format used when converting numbers
(default
.BR "%.6g" )
.TP
.B FS
regular expression used to separate fields; also settable
by option
.B \-F
.IR fs .
.TP
.B NF
number of fields in the current record
.TP
.B NR
ordinal number of the current record
.TP
.B FNR
ordinal number of the current record in the current file
.TP
.B FILENAME
the name of the current input file
.TP
.B RS
input record separator (default newline)
.TP
.B OFS
output field separator (default blank)
.TP
.B ORS
output record separator (default newline)
.TP
.B OFMT
output format for numbers (default
.BR "%.6g" )
.TP
.B SUBSEP
separates multiple subscripts (default 034)
.TP
.B ARGC
argument count, assignable
.TP
.B ARGV
argument array, assignable;
non-null members are taken as file names
.TP
.B ENVIRON
array of environment variables; subscripts are names.
.PP
Functions may be defined (at the position of a pattern-action statement) thus:
.IP
.B
function foo(a, b, c) { ...; return x }
.PP
Parameters are passed by value if scalar and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
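The parameter rules can be illustrated with a sketch like the one below.
The function names fact and fill, the array sq and the other variable names are hypothetical.
It shows a recursive call, a scalar passed by value, an array passed by reference, and an excess parameter used as a local variable.

```shell
out=$(awk '
function fact(n) { return n <= 1 ? 1 : n * fact(n - 1) }

# arr is passed by reference, x by value;
# i is an excess parameter, so it is local to fill.
function fill(arr, x,    i) {
	x = 99                   # changes only the local copy
	for (i = 1; i <= 3; i++)
		arr[i] = i * i       # visible to the caller
}

BEGIN {
	i = "global"; x = 1
	fill(sq, x)
	print fact(5), x, sq[3], i
}')
printf '%s\n' "$out"
```

By convention the extra parameters are set off with additional spaces in the definition, and callers simply omit them.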
.SH EXAMPLES
.TP
.B
length($0) > 72
Print lines longer than 72 characters.
.TP
.B
{ print $2, $1 }
Print first two fields in opposite order.
.PP
.EX
BEGIN { FS = ",[ \et]*|[ \et]+" }
      { print $2, $1 }
.EE
.ns
.IP
Same, with input fields separated by comma and/or blanks and tabs.
.PP
.EX
	{ s += $1 }
END	{ print "sum is", s, " average is", s/NR }
.EE
.ns
.IP
Add up first column, print sum and average.
.TP
.B
/start/, /stop/
Print all lines between start/stop pairs.
.PP
.EX
BEGIN {	# Simulate echo(1)
	for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
	printf "\en"
	exit }
.EE
.SH "SEE ALSO"
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.I
The AWK Programming Language,
Addison-Wesley, 1988.  ISBN 0-201-07981-X
.SH BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
.B \&""
to it.
.br
The scope rules for variables in functions are a botch;
the syntax is worse.
.br
UTF is not always dealt with correctly,
though
.I awk
does make an attempt to do so.
The
.I split
function with an empty string as final argument now copes
with UTF in the string being split.