Awk One-Liners Explained (Preview Copy)


by Peteris Krumins (@pkrumins)
peter@catonmat.net
http://www.catonmat.net
good coders code, great reuse

Preview copy (first 11 pages) Get full e-book at www.catonmat.net/blog/awk-book/

Contents

Preface

1 Introduction
  1.1 Awk One-Liners

2 Line Spacing
  2.1 Double-space a file
  2.2 Another way to double-space a file
  2.3 Double-space a file so that no more than one blank line appears between lines of text
  2.4 Triple-space a file
  2.5 Join all lines

3 Numbering and Calculations
  3.1 Number lines in each file separately
  3.2 Number lines for all files together
  3.3 Number lines in a fancy manner
  3.4 Number only non-blank lines in files
  3.5 Count lines in files
  3.6 Print the sum of fields in every line
  3.7 Print the sum of fields in all lines
  3.8 Replace every field by its absolute value
  3.9 Count the total number of fields (words) in a file
  3.10 Print the total number of lines containing word "Beth"
  3.11 Find the line containing the largest (numeric) first field
  3.12 Print the number of fields in each line, followed by the line
  3.13 Print the last field of each line
  3.14 Print the last field of the last line
  3.15 Print every line with more than 4 fields
  3.16 Print every line where the value of the last field is greater than 4

4 Text Conversion and Substitution
  4.1 Convert Windows/DOS newlines (CRLF) to Unix newlines (LF) from Unix
  4.2 Convert Unix newlines (LF) to Windows/DOS newlines (CRLF) from Unix
  4.3 Convert Unix newlines (LF) to Windows/DOS newlines (CRLF) from Windows/DOS
  4.4 Convert Windows/DOS newlines (CRLF) to Unix newlines (LF) from Windows/DOS
  4.5 Delete leading whitespace (spaces and tabs) from the beginning of each line (ltrim)
  4.6 Delete trailing whitespace (spaces and tabs) from the end of each line (rtrim)
  4.7 Delete both leading and trailing whitespaces from each line (trim)
  4.8 Insert 5 blank spaces at beginning of each line
  4.9 Align all text to the right on a 79-column width
  4.10 Center all text on a 79-character width
  4.11 Substitute (find and replace) "foo" with "bar" on each line
  4.12 Substitute "foo" with "bar" only on lines that contain "baz"
  4.13 Substitute "foo" with "bar" only on lines that don't contain "baz"
  4.14 Change "scarlet" or "ruby" or "puce" to "red"
  4.15 Reverse order of lines (emulate "tac")
  4.16 Join a line ending with a backslash with the next line
  4.17 Print and sort the login names of all users
  4.18 Print the first two fields in reverse order on each line
  4.19 Swap first field with second on every line
  4.20 Delete the second field on each line
  4.21 Print the fields in reverse order on every line
  4.22 Remove duplicate, consecutive lines (emulate "uniq")
  4.23 Remove duplicate, nonconsecutive lines
  4.24 Concatenate every 5 lines of input with a comma

5 Selective Printing and Deleting of Certain Lines
  5.1 Print the first 10 lines of a file (emulates "head -10")
  5.2 Print the first line of a file (emulates "head -1")
  5.3 Print the last 2 lines of a file (emulates "tail -2")
  5.4 Print the last line of a file (emulates "tail -1")
  5.5 Print only the lines that match a regular expression "/regex/" (emulates "grep")
  5.6 Print only the lines that do not match a regular expression "/regex/" (emulates "grep -v")
  5.7 Print the line immediately before a line that matches "/regex/"
  5.8 Print the line immediately after a line that matches "/regex/" (but not the line that matches itself)
  5.9 Print lines that match any of "AAA" or "BBB", or "CCC"
  5.10 Print lines that contain "AAA", "BBB", and "CCC" in this order
  5.11 Print only the lines that are 65 characters in length or longer
  5.12 Print only the lines that are less than 64 characters in length
  5.13 Print a section of file from regular expression to end of file
  5.14 Print lines 8 to 12 (inclusive)
  5.15 Print line number 52
  5.16 Print section of a file between two regular expressions (inclusive)
  5.17 Print all lines where 5th field is equal to "abc123"
  5.18 Print any line where field #5 is not equal to "abc123"
  5.19 Print all lines whose 7th field matches a regular expression
  5.20 Print all lines whose 7th field doesn't match a regular expression
  5.21 Delete all blank lines from a file

6 String and Array Creation
  6.1 Create a string of a specific length (generate a string of x's of length 513)
  6.2 Insert a string of specific length at a certain character position (insert 49 x's after 6th char)
  6.3 Create an array from string
  6.4 Create an array named "mdigit", indexed by strings

A Awk Special Variables
  A.1 FS - Input Field Separator
  A.2 OFS - Output Field Separator
  A.3 NF - Number of Fields on the current line
  A.4 NR - Number of records seen so far (current line number)
  A.5 RS - Input Record Separator
  A.6 ORS - Output Record Separator

B Idiomatic Awk

Index


Preface
Thanks!
Thank you for purchasing my "Awk One-Liners Explained" e-book! This is the first e-book I have ever written, and I based it on the article series "Famous Awk One-Liners Explained" that I wrote on my www.catonmat.net blog. I went through all the one-liners in the articles, improved them, fixed a lot of mistakes, and added an introduction to Awk one-liners and two new chapters. The two new chapters are "Awk Special Variables", which summarizes some of the most commonly used Awk variables, and "Idiomatic Awk", which explains what idiomatic Awk is.

You might wonder why I called the article series "famous". Well, because I based the articles on the famous awk1line.txt file by Eric Pement. This file has been circulating around Unix newsgroups and forums for years and it's very popular among Unix programmers. That's how I actually learned the Awk language myself. I went through all the one-liners in this file, tried them out and understood exactly how they work. Then I thought it would be a good idea to explain them on my blog, which I did, and after that I thought, why not turn it into a book? That's how I ended up writing this book.

I am also planning to write two more books called "Sed One-Liners Explained" and "Perl One-Liners Explained". The sed book will be based on Eric Pement's sed1line.txt file and my "Famous Sed One-Liners Explained" article series, and the Perl book will be based on my "Famous Perl One-Liners Explained" article series. I am also going to create a perl1line.txt file of my own. If you're interested, subscribe to my blog and follow me on Twitter. That way you'll know when I publish all of this!


Credits
I'd like to thank Eric Pement, who made the famous awk1line.txt file that I learned Awk from and that I based this book on. I'd also like to thank waldner and pgas from the #awk channel on the FreeNode IRC network for always helping me with Awk, Madars Virza for proofreading the book before I published it and correcting several glitches, Antons Suspans for proofreading the book after I published it, Abraham Alhashmy for giving advice on how to improve the design of the book, everyone who commented on my blog while I was writing the Awk one-liners article series, and everyone else who helped me with Awk and this book.


One Introduction
1.1 Awk One-Liners

Knowing Awk makes you really powerful when working in the shell. Check this out: suppose you want to print the usernames of all users on your system. You can do it very quickly with this one-liner:
awk -F: '{print $1}' /etc/passwd

This is really short and powerful, isn't it? As you know, the format of /etc/passwd is colon-separated:

root:x:0:0:0:/root:/bin/bash

The one-liner above says: take each line from /etc/passwd, split it on the colon (-F:), and print the first field ($1) of each line. Here are the first few lines of output when I run this program on my system:
root
bin
daemon
adm
lp
sync
...

Exactly what I expected.
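To try the same idea without depending on the contents of your own /etc/passwd, here is a minimal sketch that feeds awk two made-up passwd-style lines. The sample records and the variable names sample and users are assumptions for illustration only; they are not taken from a real system.

```shell
# Two made-up passwd-style records (hypothetical sample data).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# -F: sets the input field separator to a colon; $1 is the first field.
users=$(printf '%s\n' "$sample" | awk -F: '{ print $1 }')

printf '%s\n' "$users"
# root
# daemon
```

Changing $1 to $7 in the same command would print the last field of these sample records (the login shell) instead.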


Now compare it to a C program that I just wrote that does the same:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LEN 1024

int main() {
    char line[MAX_LINE_LEN];
    FILE *in = fopen("/etc/passwd", "r");
    if (!in) exit(EXIT_FAILURE);
    while (fgets(line, MAX_LINE_LEN, in) != NULL) {
        char *sep = strchr(line, ':');
        if (!sep) exit(EXIT_FAILURE);
        *sep = '\0';
        printf("%s\n", line);
    }
    fclose(in);
    return EXIT_SUCCESS;
}

This is much longer, and you have to compile the program before you can run it. If you make any mistakes, you have to recompile it. That's why one-liners are called one-liners: they are short, easy to write, and they do one and only one thing really well. I am pretty sure you're starting to see how mastering Awk and one-liners can make you much more efficient when working in the shell, with text files and with computers in general.

Here is another one-liner; this one numbers the lines in some file:
awk '{ print NR ". " $0 }' somefile

Isn't this beautiful? The NR special variable keeps track of the current line number, so I just print it out, followed by a dot and $0, which, as you'll learn, contains the whole line. And you're done.
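Here is a small sketch of the numbering one-liner run on standard input instead of a file. The three input lines and the variable name numbered are made up for illustration.

```shell
# Three made-up input lines (hypothetical sample data) piped into awk.
# NR is the current line number and $0 is the whole line; placing them
# next to the string ". " concatenates all three.
numbered=$(printf 'alpha\nbeta\ngamma\n' | awk '{ print NR ". " $0 }')

printf '%s\n' "$numbered"
# 1. alpha
# 2. beta
# 3. gamma
```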


I know that a lot of my readers would argue that Perl does exactly the same, so why should you learn Awk? My answer is very simple: yes, Perl does exactly the same, but why not be the master of the shell? Why not learn Awk, sed, Perl and other utilities? Besides, Perl was created based on ideas from Awk, so why not learn Awk to see how Perl evolved? That gives you a unique perspective on programming languages, doesn't it?

Overall, this book contains 70 well-explained one-liners. Once you go through them, you should have a really good understanding of Awk and you'll be the master shell problem solver. Enjoy this book!


Two Line Spacing


2.1 Double-space a file

awk '1; { print "" }'

So how does this one-liner work? A one-liner is an Awk program, and every Awk program consists of a sequence of pattern-action statements of the form pattern { action statements }. In this case there are two statements: 1 and { print "" }. In a pattern-action statement either the pattern or the action may be missing. If the pattern is missing, the action is applied to every single line of input. A missing action is equivalent to { print }. The first pattern-action statement is missing the action, therefore we can rewrite it as:
awk '1 { print }; { print "" }'

An action is applied to the line only if the pattern matches, i.e., the pattern is true. Since 1 is always true, this one-liner translates further into two print statements:
awk '{ print }; { print "" }'
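The rewriting steps above can be checked directly. The following is a minimal sketch, assuming a POSIX-style shell and any standard awk; the two-line sample input and the variable names input, a, b, and c are made up for illustration. Command substitution strips trailing newlines, so the comparison ignores the final blank line, which is the same in all three cases anyway.

```shell
# Hypothetical two-line sample input.
input='first
second'

# The original one-liner and its two rewritten forms.
a=$(printf '%s\n' "$input" | awk '1; { print "" }')
b=$(printf '%s\n' "$input" | awk '1 { print }; { print "" }')
c=$(printf '%s\n' "$input" | awk '{ print }; { print "" }')

# All three programs should produce identical output.
[ "$a" = "$b" ] && [ "$b" = "$c" ] && echo "all three forms agree"
```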

Every print statement in Awk is silently followed by the value of the ORS (Output Record Separator) variable, which is a newline by default. The first print statement with no arguments is equivalent to print $0, where $0 is the variable holding the entire line (not including the newline at the end). The second print statement seemingly prints nothing, but knowing that each print statement is followed by ORS, it actually prints a newline. So there we have it: each line gets double-spaced.
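One way to see the role of ORS is to change it. The sketch below is illustrative only: the two-line input, the "|" separator, and the variable names line_count and custom are assumptions chosen to make the effect visible, and -v is the standard awk option for assigning a variable before the program runs.

```shell
# With the default ORS (a newline), two input lines become four output
# lines: each line itself plus the newline printed by { print "" }.
line_count=$(printf 'x\ny\n' | awk '1; { print "" }' | awk 'END { print NR }')

# With ORS set to "|", every print appends "|" instead of a newline,
# so print "" emits a lone "|".
custom=$(printf 'x\ny\n' | awk -v ORS='|' '1; { print "" }')

printf '%s\n' "$line_count"   # 4
printf '%s\n' "$custom"       # x||y||
```

The doubled "|" after each letter corresponds to the two print statements: the first ends the line's own text, the second is the "empty" print.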


This is only the e-book preview.

Get the full e-book at:

http://www.catonmat.net/blog/awk-book/
