10.3 Functions

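awk comes with a set of built-in numeric functions, including sqrt(x) (square root of x), log(x) (natural logarithm of x), exp(x) (exponential of x), cos(x), sin(x) and atan2(y,x) (arctangent of y/x). The trigonometric functions work in radians. Note that there is no built-in tan(x).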

$ echo "10 100" | awk '{print log($1), sqrt($2)}'
2.30259 10
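Functions missing from the built-in set can usually be composed from the ones that exist. The following is a minimal sketch, not an exhaustive list: it derives a tangent from sin() and cos(), obtains pi from atan2(), and uses int() to truncate toward zero. The shell variable names (tangent, pi, truncated) are only illustrative.

```shell
# tan(x) built from sin and cos; x = 1 radian
tangent=$(awk 'BEGIN { x = 1; print sin(x) / cos(x) }')

# pi obtained as the angle of the point (-1, 0)
pi=$(awk 'BEGIN { print atan2(0, -1) }')

# int() drops the fractional part (truncation toward zero)
truncated=$(echo "3.9 -3.9" | awk '{ print int($1), int($2) }')

echo "$tangent"    # 1.55741
echo "$pi"         # 3.14159
echo "$truncated"  # 3 -3
```

The values are printed with six significant digits because that is awk's default output format for non-integer numbers.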

It also offers built-in functions for string manipulation, including length(), substr(), sub(), tolower() and toupper():

length(): returns the number of characters in a string, here the whole line including spaces.

$ echo "How long am I?" | awk '{print length($0)}'
14

substr(): the function substr(a,b,c) takes a string a, starts at position b inside that string (the first character is position 1), and returns at most c characters from there. If c is not given, everything from position b to the end of the string is returned:

$ echo "first second third" | awk '{print substr($1,1,2)}'
$ echo "first second third" | awk '{print substr($1,1,3)}'
$ echo "first second third" | awk '{print substr($1,1,4)}'
$ echo "first second third" | awk '{print substr($1,1)}'
$ echo "first second third" | awk '{print substr($2,1)}'
$ echo "first second third" | awk '{print substr($2,3)}'
fi
fir
firs
first
second
cond
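length() and substr() combine naturally when the start position depends on the size of the string. As a small sketch (the variable names last_two and middle are just placeholders), the following extracts the last two characters of a field and a three-character slice from the middle of another:

```shell
# last two characters of the third field: start at position length - 1
last_two=$(echo "first second third" | awk '{ print substr($3, length($3) - 1) }')

# three characters of the first field, starting at position 2
middle=$(echo "first second third" | awk '{ print substr($1, 2, 3) }')

echo "$last_two"  # rd
echo "$middle"    # irs
```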

sub(): the function sub(a,b,c) replaces the first match of the pattern a (a regular expression) with the string b inside the string c:

$ echo "This is awkward!" | awk '{sub("ward", "", $0); print "No,", $0}'
No, This is awk!
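Since sub() only touches the first match, it helps to compare it with its sibling gsub(), which replaces every match and returns the number of replacements made. A minimal sketch, where the input word and the variable names are chosen only for illustration (with the third argument omitted, both functions operate on $0):

```shell
# sub() replaces only the first "a"
first_only=$(echo "banana" | awk '{ sub("a", "o"); print }')

# gsub() replaces all of them and returns how many it changed
all=$(echo "banana" | awk '{ n = gsub("a", "o"); print n, $0 }')

echo "$first_only"  # bonana
echo "$all"         # 3 bonono
```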

tolower() / toupper(): convert a string to lower or upper case:

$ echo "This is awkward!" | awk '{sub("ward", "", $0); print "No,", tolower($1), $2, toupper($3)}'
No, this is AWK!
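One common use of tolower() is matching text regardless of case. The sketch below counts the lines that contain "awk" in any capitalisation; the three input lines are made up for the example:

```shell
# normalise each line to lower case before matching, then count the hits
matches=$(printf 'AWK rocks\nsed too\nThis is awkward\n' |
    awk 'tolower($0) ~ /awk/ { n++ } END { print n }')

echo "$matches"  # 2
```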

Of course, there are many more functions within awk. All available functions are explained in the (very long) man page!

10.3.1 Random numbers with awk

Random number generators calculate each new random number deterministically from the current one. To get a different sequence, they need a different starting point, known as the seed. By default, the rand() function starts from the same seed every time awk is started, so if you repeat the following line multiple times, you will always get the same number:

$ echo -e "1" | awk '{print rand()}'
$ echo -e "1" | awk '{print rand()}'
0.924046
0.924046

Note: There are different types of awk, namely mawk, gawk and base awk. mawk will always use a different seed, while gawk will always use the same seed. For awk, it depends on the specific version that you are using. Try out the different commands in your command line and see how they behave.

$ echo -e "1" | gawk '{print rand()}'
$ echo -e "1" | gawk '{print rand()}'
0.924046
0.924046

If you want to be sure to use a different seed every time, call the srand() function before the first rand(), for example in a BEGIN block. Without an argument, srand() uses the current time as the seed:

$ echo -e "1" | awk 'BEGIN{srand()}{print rand()}'
$ echo -e "1" | awk 'BEGIN{srand()}{print rand()}'
0.285272
0.047439
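srand() also accepts an explicit seed, which is useful when a "random" result should be reproducible. The following sketch assumes an arbitrary seed of 42 and illustrative variable names; it also shows one common way to turn rand(), which returns a value from 0 up to but not including 1, into a whole number between 1 and 6:

```shell
# same explicit seed, same number on every run
run1=$(awk 'BEGIN { srand(42); print rand() }')
run2=$(awk 'BEGIN { srand(42); print rand() }')

# time-based seed, scaled to an integer from 1 to 6 (a die roll)
die=$(awk 'BEGIN { srand(); print int(rand() * 6) + 1 }')

echo "$run1 $run2"  # two identical numbers
echo "$die"         # a number from 1 to 6
```

Because the time-based seed typically has a resolution of one second, two commands started within the same second may still produce the same number.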