
Table of Contents
About 1.1
Linux Introduction 1.2
Command Line Introduction 1.3
Files and Directories 1.4
Working with Files and Directories 1.5
Text Processing 1.6
Shell 1.7
Shell Customization 1.8
Shell Scripting 1.9

About

Linux Command Line


Introduction to Linux commands and Shell scripting

See Linux curated resources for a more complete list of resources, including tutorials for beginners
For more related resources, visit the scripting course

Chapters
Linux Introduction
What is Linux?, Why use Linux?, Where is Linux deployed?, Linux Distros, Linux resource lists
Command Line Introduction
File System, Command Line Interface, Command Help, Do one thing and do it well
Files and Directories
pwd, clear, ls, cd, mkdir, touch, rm, cp, mv, rename, ln, tar and gzip
Working with Files and Directories
cat, less, tail, head, Text Editors, grep, find, locate, wc, du, df, touch, file, identify, basename, dirname, chmod
Text Processing
sort, uniq, comm, cmp, diff, tr, sed, awk, perl, cut, paste, column, pr
Shell
What is Shell?, Popular Shells, Wildcards, Redirection, Process Control, Running jobs in background
Shell Customization
Variables, Config files, Emacs mode Readline shortcuts
Shell Scripting
Need for scripting, Hello script, Command Line Arguments, Variables and Comparisons, Accepting User Input
interactively, if then else, for loop, while loop, Debugging, Resource lists

ebook
Read as ebook on gitbook
Download ebook for offline reading - link

Acknowledgements
unix.stackexchange and stackoverflow - for getting answers to pertinent questions as well as sharpening skills by understanding and answering questions


Devs and Hackers - helpful slack group
Forums like /r/commandline/, Weekly Coders, Hackers & All Tech related thread - for suggestions and critique

License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License


Linux Introduction
Table of Contents

What is Linux?
Why use Linux?
Where is Linux deployed?
Linux Distros
Linux resource lists

What is Linux?
Quoting from Wikipedia

Linux is a family of free and open-source software operating systems built around the Linux kernel. Typically,
Linux is packaged in a form known as a Linux distribution (or distro for short) for both desktop and server use. The
defining component of a Linux distribution is the Linux kernel, an operating system kernel first released on
September 17, 1991, by Linus Torvalds. Many Linux distributions use the word "Linux" in their name. The Free
Software Foundation uses the name GNU/Linux to refer to the operating system family, as well as specific
distributions, to emphasize that most Linux distributions are not just the Linux kernel, and that they have in
common not only the kernel, but also numerous utilities and libraries, a large proportion of which are from the
GNU project

Why use Linux?


Faster, Secure, Stable
it helps that developers from all over the world contribute, instead of just a single company
Highly configurable
Suitable for both single/multiuser environment
Well defined hierarchy and permissions to allow networking across different groups and sites
Strong set of commands to automate repetitive manual tasks
Read more on using Linux and whether it fits your computing needs on computefreely

Where is Linux deployed?


Servers
Supercomputers
To quote TOP500 article on wikipedia, "Since November 2017, all the listed supercomputers (100% of the
performance share) use an operating system based on the Linux kernel"
Embedded/IoT devices like POS, Raspberry Pi


Smart phones
Android - built on top of Linux kernel
iOS - Unix based
Personal and Enterprise Computers
And many more uses, thanks to being open source
Usage Share of Operating Systems

Linux Distros
There are various Linux flavors, called 'distributions' (distro for short), catering to the needs of beginners and advanced users alike, as well as distros highly customized for particular end use cases

There are hundreds of known distributions


One can keep track of them at distrowatch
Statistics of various Linux Distros
Popular Linux Distros compared
Light Weight Linux Distros

Installation

Usually, you'll find installation instructions on the website of the distro you chose. If you need an overview of the installation process, these should help

Installing Nearly Any Distro On Just About Any Machine


Try out Linux on Virtual Machine

Linux resource lists


Linux curated resources
Awesome linux by Aleksandar
Linux resources by Paul


Command Line Introduction


Table of Contents

File System
Absolute and Relative paths
Command Line Interface
Command Help
Do one thing and do it well
Command Structure
Command Network

For anything that is repetitive or programmable, there likely is a relevant command. Ask your peers or search online before you start writing a script. Just remember that Unix was first introduced in the late 1960s - there is likely to be a command for what you need

The initial difficulty with the command line (for those accustomed to a GUI) is the sudden shift to interacting with the computer using just text commands. After using it for a week or so, things will seem very systematic and a GUI will feel ill suited for frequent tasks. With continuous use, recalling various commands becomes easier. Shortcuts, history, aliases and tab-completion all help in the process

If you've used a scientific calculator, you'd know that it is handy despite the many functionalities crammed into a tiny screen and a plethora of multi-purpose buttons. Commands and shortcuts pack much more punch than that on a terminal

Commands presented here are Linux specific and generally behave similarly across distros
Commands in Linux usually have added features compared to POSIX specification
If any command is not found in a particular distro, either it has to be manually installed or probably an alternate
exists
The bash shell version 4+ is used throughout this material
rough overview of changes to Bash over time

File System
Before we dive into the ocean of commands, let's get a brief overview of the Linux file system. If you've used Windows, you would be familiar with C: , D: , etc.
In Linux, the directory structure starts with the / symbol, which is referred to as the root directory

man hier gives a description of the filesystem hierarchy. A few examples:


/ This is the root directory. This is where the whole tree starts.

/bin This directory contains executable programs which are needed in single user mode and to bring the system up or repair it.

/home On machines with home directories for users, these are usually beneath this directory, directly or not. The structure of this directory depends on local administration decisions.

/tmp This directory contains temporary files which may be deleted with no notice, such as by a regular job or at system boot up.

/usr This directory is usually mounted from a separate partition. It should hold only sharable, read-only data, so that it can be mounted by various machines running Linux.

/usr/bin This is the primary directory for executable programs. Most programs executed by normal users which are not needed for booting or for repairing the system and which are not installed locally should be placed in this directory.

/usr/share This directory contains subdirectories with specific application data, that can be shared among different architectures of the same OS. Often one finds stuff here that used to live in /usr/doc or /usr/lib or /usr/man.
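As a quick sanity check, a few of these standard directories can be listed on your own machine (a minimal sketch; the exact set of top-level entries varies a little between distros):

```shell
# -d shows the directory entries themselves instead of their contents
ls -d / /bin /tmp /usr /usr/bin
```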

Absolute and Relative paths


Quoting Wikipedia

An absolute or full path points to the same location in a file system regardless of the current working directory.
To do that, it must contain the root directory.

By contrast, a relative path starts from some given working directory, avoiding the need to provide the full
absolute path. A filename can be considered as a relative path based at the current working directory. If the
working directory is not the file's parent directory, a file not found error will result if the file is addressed by its
name.

/home/learnbyexample absolute path


../design relative path

unix.stackexchange: Is ~/Documents a relative or an absolute path?
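A small runnable sketch of the difference, using a throwaway directory under /tmp (the path names are purely illustrative):

```shell
# create a scratch hierarchy to navigate
mkdir -p /tmp/pathdemo/projects/design

# absolute path: starts from the root directory /
cd /tmp/pathdemo/projects/design
pwd

# relative path: interpreted from the current working directory
cd ..
pwd
```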

Further Reading

Learning the Linux File System - video tutorial


Overview of file system

Command Line Interface


Command Line Interface (CLI) allows us to interact with the computer using text commands

For example: the cd command helps in navigating to a particular directory and the ls command shows the contents of a directory. In a GUI, you'd use an explorer to navigate directories by point and click, and directory contents are shown by default

Shell and Terminal are sometimes used interchangeably to mean the same thing - a prompt where the user types and executes commands. However, they are quite different

Shell is a command line interpreter - it sets the syntax rules for invoking commands, etc
Terminal is a text input/output environment - it is responsible for visual details like font size, color, etc

We'll learn more about the Shell in later chapters. For now, open a Terminal and try these commands by typing them and pressing the Enter key. You can spot the command lines by the $ prompt at the start of the line


$ cat /etc/shells
# /etc/shells: valid login shells
/bin/sh
/bin/dash
/bin/bash
/bin/rbash
/bin/tcsh
/usr/bin/tcsh

$ echo "$SHELL"
/bin/bash

Note: Your command prompt might be different. For now, you can leave it as is, or change it to the simple prompt I prefer by executing PS1="$ "

In the above example, the cat command is used to display the contents of a file and the echo command is used to display the contents of a variable - these commands have other uses as well, which will be covered later on

Command Help
Most distros for personal use come with documentation for commands already installed. Getting used to reading manuals from the terminal is handy, and there are various ways to get specific information

man command is an interface to reference manuals

usually displayed using less command, press q key to quit the man page and h key to get help
for Linux commands, the info command gives the complete documentation
you could also read them online, for ex: GNU Coreutils manual has manuals for most of the commands
covered in this material
man man will give details about the man command itself

man bash will give you the manual page for bash

man find | gvim - to open the manual page in your favorite text editor
man -k printf will search the short descriptions in all the manual pages for the string printf
-k here is a command option
man -k is equivalent to the apropos command
Excellent resource unix.stackexchange: How do I use man pages to learn how to use commands?

For certain operations, the shell provides its own set of commands, referred to as builtin commands

type will display information about command type


typically used to get path of command or expand alias/function, use help type for documentation
See also unix.stackexchange: What is the difference between a builtin command and one that is not?
See also unix.stackexchange: Why not use “which”? What to use then?


$ type cd
cd is a shell builtin
$ type sed
sed is /bin/sed

$ # multiple commands can be given as arguments


$ type pwd awk
pwd is a shell builtin
awk is /usr/bin/awk

$ type ls
ls is aliased to `ls --color=auto'
$ type -a ls
ls is aliased to `ls --color=auto'
ls is /bin/ls

help command provides documentation for builtin commands

help help help page on help command


-m option will display usage in pseudo-manpage format

-d option gives short description for each topic, similar to whatis command
help command by itself without any argument displays all shell commands that are defined internally

$ help pwd
pwd: pwd [-LP]
Print the name of the current working directory.

Options:
-L print the value of $PWD if it names the current working directory
-P print the physical directory, without any symbolic links

By default, `pwd' behaves as if `-L' were specified.

Exit Status:
Returns 0 unless an invalid option is given or the current directory
cannot be read.

$ help -d compgen
compgen - Display possible completions depending on the options.

Here are some more companion commands

whatis displays one-line manual page descriptions


whereis locates the binary, source, and manual page files for a command
explainshell is a web app that shows the help text that matches each argument of command line
example: tar xzvf archive.tar.gz
ch is a script, inspired by explainshell, to extract option descriptions from man/help pages


$ whatis grep
grep (1) - print lines matching a pattern

$ whereis awk
awk: /usr/bin/awk /usr/share/awk /usr/share/man/man1/awk.1.gz

$ ch sort -k
sort - sort lines of text files

-k, --key=KEYDEF
sort via a key; KEYDEF gives location and type

Do one thing and do it well


The Unix Philosophy applies to Linux as well:

Write programs that do one thing and do it well

Write programs to work together

Write programs to handle text streams, because that is a universal interface

Examples given below are for demonstration purposes only, more detail in later chapters
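The three tenets combine naturally in a pipeline. A minimal sketch, counting duplicate lines (the sample data is made up for illustration):

```shell
# sort groups duplicate lines together, uniq -c counts each group,
# and sort -rn orders the counts from highest to lowest
printf 'cherry\napple\nbanana\napple\n' | sort | uniq -c | sort -rn
```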

Command Structure
only the command

clear clear the terminal screen

top display Linux processes

command with options

ls -l list directory contents, use a long listing format

df -h report file system disk space usage, print sizes in human readable format (e.g., 1K 234M 2G)

command with arguments

mkdir project create directory named 'project' in current working directory


man sort manual page for sort command

wget https://s.ntnu.no/bashguide.pdf download file from internet

command with options and arguments

rm -r project remove 'project' directory


paste -sd, ip.txt combine all lines from 'ip.txt' file to single line using , as delimiter
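The last example can be tried out as below (the ip.txt contents here are made up for illustration):

```shell
# create a sample three-line input file
printf '1\n2\n3\n' > /tmp/ip.txt

# -s joins all lines into a single line, -d, supplies the comma delimiter
paste -sd, /tmp/ip.txt    # 1,2,3
```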

single quotes vs double quotes

single quotes preserve the literal value of each character within the quotes
double quotes preserve the literal value of all characters within the quotes, with the exception of '$', '`', '\', and, when history expansion is enabled, '!'


See also stackoverflow: Difference between single and double quotes

$ echo '$SHELL'
$SHELL

$ echo "$SHELL"
/bin/bash

Command Network
Redirecting output of a command

to another command
du -sh * | sort -h calculate size of files/folders, display size in human-readable format which is then sorted

to a file (instead of displaying on terminal)


grep 'pass' *.log > pass_list.txt writes to file (if file already exists, it is overwritten)

grep 'error' *.log >> errors.txt appends to file (creates new file if necessary)
to a variable
p=$(pwd) saves the output of pwd command in variable p , there should be no spaces around =

Redirecting input

wc -l < file.txt useful to get just the number of lines, without displaying file name
tr 'a-z' 'A-Z' < ip.txt some commands like tr only work on stdin

Redirecting error

xyz 2> cmderror.log assuming a non-existent command xyz , the error message it produces gets redirected to the specified file

Redirecting output of command as input file

comm -23 <(sort file1.txt) <(sort file2.txt) process substitution, avoids need to create temporary files

Combining output of several commands

(head -n5 ~/.vimrc ; tail -n5 ~/.vimrc) > vimrc_snippet.txt multiple commands (separated by ; ) can be
grouped inside a list

Command substitution

sed -i "s|^|$(basename $PWD)/|" dir_list.txt add current directory path and forward-slash character at the

start of every line


Note the use of double quotes

stdin, stdout and stderr

< or 0< is stdin filehandle


> or 1> is stdout filehandle

2> is stderr filehandle


See also stackoverflow: stdin, stdout and stderr

More detailed discussion in Shell chapter
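The forms above can be exercised together in a scratch directory; a minimal sketch with illustrative filenames:

```shell
cd "$(mktemp -d)"    # work in a throwaway directory

printf 'apple\nbanana\ncherry\n' > fruits.txt   # > : stdout to a file
wc -l < fruits.txt                              # < : file as stdin, prints just 3
lines=$(wc -l < fruits.txt)                     # $() : capture output in a variable
echo "line count: $lines"

# 2> : stderr to a file ('|| true' keeps the sketch going despite the error)
ls no_such_file 2> err.log || true

# commands separated by ; and grouped in ( ) share one redirection
(head -n1 fruits.txt ; tail -n1 fruits.txt) > ends.txt
cat ends.txt
```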


Files and Directories


Table of Contents

pwd
cd
clear
ls
mkdir
touch
rm
cp
mv
rename
ln
tar and gzip

Let's look at commonly used commands to navigate directories, and to create and modify files and directories. For certain commands, a list of commonly used options is also given

Make it a habit to use the man command to read about a new command - for example man ls

Short descriptions for commands are shown as quoted text (taken from whatis or help -d )

pwd
print name of current/working directory

apart from knowing your current working directory, often used to copy the absolute path to be pasted elsewhere,
like in a script
some Terminal emulators display the current directory path as window/tab title

$ pwd
/home/learnbyexample

cd
Change the shell working directory

Like pwd , the cd command is a shell builtin


Let's see an example of changing the working directory to some other directory and coming back
Specifying / at the end of the path argument is optional


$ pwd
/home/learnbyexample

$ # providing an absolute path as argument


$ cd /etc
$ pwd
/etc

$ # to go back to previous working directory


$ # if there's a directory named '-', use './-' to go to that directory
$ cd -
/home/learnbyexample
$ pwd
/home/learnbyexample

Relative paths are, well, relative to the current working directory


. refers to current directory
.. refers to directory one hierarchy above

../.. refers to directory two hierarchies above and so on

$ pwd
/home/learnbyexample

$ # go to directory one hierarchy above


$ cd ..
$ pwd
/home

$ # go to directory 'learnbyexample' present in current directory


$ # './' is optional in this case
$ cd ./learnbyexample
$ pwd
/home/learnbyexample

$ # go to directory two hierarchies above


$ cd ../..
$ pwd
/

cd ~/ or cd ~ or cd will go to the directory specified by the HOME shell variable (which is usually set to the user's home directory)

$ pwd
/
$ echo "$HOME"
/home/learnbyexample

$ cd
$ pwd
/home/learnbyexample

Further Reading

Use help cd for documentation


cd Q&A on unix stackexchange
cd Q&A on stackoverflow


bash manual: Tilde Expansion

clear
clear the terminal screen

You can also use Ctrl+l short-cut to clear the Terminal screen (in addition, this retains any typed text)

ls
list directory contents

by default, ls output is sorted alphabetically

$ # if no argument is given, current directory contents are displayed


$ ls
backups hello_world.py palindrome.py projects todo
ch.sh ip.txt power.log report.log workshop_brochures

$ # absolute/relative paths can be given as arguments


$ ls /var/
backups crash local log metrics run spool
cache lib lock mail opt snap tmp
$ # for multiple arguments, listing is organized by directory
$ ls workshop_brochures/ backups/
backups:
chrome_bookmarks_02_07_2018.html dot_files

workshop_brochures:
Python_workshop_2017.pdf Scripting_course_2016.pdf

$ # single column listing


$ ls -1 backups/
chrome_bookmarks_02_07_2018.html
dot_files

-F appends a character to each file name indicating the file type (other than regular files)
/ for directories
* for executable files
@ for symbolic links

| for FIFOs
= for sockets
> for doors
the indicator details are described in info ls , not in man ls


$ ls -F
backups/ hello_world.py* palindrome.py* projects@ todo
ch.sh* ip.txt power.log report.log workshop_brochures/

$ # if you just need to distinguish file and directory, use -p


$ ls -p
backups/ hello_world.py palindrome.py projects todo
ch.sh ip.txt power.log report.log workshop_brochures/

or use the color option

long listing format


shows details like file permissions, ownership, size, timestamp, etc
See chmod section for details on permissions, groups, etc
file types are distinguished as d for directories, - for regular files, l for symbolic links, etc

$ ls -l
total 84
drwxrwxr-x 3 learnbyexample eg 4096 Jul 4 18:23 backups
-rwxr-xr-x 1 learnbyexample eg 2746 Mar 30 11:38 ch.sh
-rwxrwxr-x 1 learnbyexample eg 41 Aug 21 2017 hello_world.py
-rw-rw-r-- 1 learnbyexample eg 34 Jul 4 09:01 ip.txt
-rwxrwxr-x 1 learnbyexample eg 1236 Aug 21 2017 palindrome.py
-rw-r--r-- 1 learnbyexample eg 10449 Mar 8 2017 power.log
lrwxrwxrwx 1 learnbyexample eg 12 Jun 21 12:08 projects -> ../projects/
-rw-rw-r-- 1 learnbyexample eg 39120 Feb 25 2017 report.log
-rw-rw-r-- 1 learnbyexample eg 5987 Apr 11 11:06 todo
drwxrwxr-x 2 learnbyexample eg 4096 Jul 5 12:05 workshop_brochures

$ # to show size in human readable format instead of byte count


$ ls -lh power.log
-rw-r--r-- 1 learnbyexample eg 11K Mar 8 2017 power.log

$ # use -s option instead of -l if only size info is needed


$ ls -1sh power.log report.log
12K power.log
40K report.log

changing sorting criteria


use -t to sort by timestamp, often combined with -r to reverse the order so that most recently modified file
shows as last item
-S option sorts by file size (not suitable for directories)
-v option does version sorting (suitable for filenames with numbers in them)
-X option allows sorting by file extension (i.e. the characters after the last . in the filename)


$ ls -lhtr
total 84K
-rw-rw-r-- 1 learnbyexample eg 39K Feb 25 2017 report.log
-rw-r--r-- 1 learnbyexample eg 11K Mar 8 2017 power.log
-rwxrwxr-x 1 learnbyexample eg 1.3K Aug 21 2017 palindrome.py
-rwxrwxr-x 1 learnbyexample eg 41 Aug 21 2017 hello_world.py
-rwxr-xr-x 1 learnbyexample eg 2.7K Mar 30 11:38 ch.sh
-rw-rw-r-- 1 learnbyexample eg 5.9K Apr 11 11:06 todo
lrwxrwxrwx 1 learnbyexample eg 12 Jun 21 12:08 projects -> ../projects/
-rw-rw-r-- 1 learnbyexample eg 34 Jul 4 09:01 ip.txt
drwxrwxr-x 3 learnbyexample eg 4.0K Jul 4 18:23 backups
drwxrwxr-x 2 learnbyexample eg 4.0K Jul 5 12:05 workshop_brochures

$ ls -X
backups todo power.log hello_world.py ch.sh
projects workshop_brochures report.log palindrome.py ip.txt
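The -v option isn't shown in the listing above; here's a small sketch with scratch filenames:

```shell
cd "$(mktemp -d)"
touch file1.txt file2.txt file10.txt

# plain sort compares character by character, so file10.txt can land
# before file2.txt; -v compares the embedded numbers instead
ls
ls -v    # file1.txt file2.txt file10.txt
```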

filenames starting with . are considered hidden files

$ # -a option will show hidden files too


$ ls -a backups/dot_files/
. .. .bashrc .inputrc .vimrc

$ # . and .. are special directories pointing to current and parent directory


$ # if you recall, we have used them in specifying relative paths
$ # so, 'ls', 'ls .' and 'ls backups/..' will give same result
$ ls -aF backups/dot_files/
./ ../ .bashrc .inputrc .vimrc

$ # use -A option to show hidden files excluding . and .. special directories


$ ls -A backups/dot_files/
.bashrc .inputrc .vimrc

use -R option to recursively list sub-directories too

$ ls -ARF
.:
backups/ hello_world.py* palindrome.py* projects@ todo
ch.sh* ip.txt power.log report.log workshop_brochures/

./backups:
chrome_bookmarks_02_07_2018.html dot_files/

./backups/dot_files:
.bashrc .inputrc .vimrc

./workshop_brochures:
Python_workshop_2017.pdf Scripting_course_2016.pdf

tree command displays contents of a directory recursively as a tree-like structure

you might have to install this command, or use an equivalent command like gvfs-tree


$ # -h option will show hidden files


$ gvfs-tree -h
file:///home/learnbyexample/ls_ex
|-- backups
| |-- chrome_bookmarks_02_07_2018.html
| `-- dot_files
| |-- .bashrc
| |-- .inputrc
| `-- .vimrc
|-- ch.sh
|-- hello_world.py
|-- ip.txt
|-- palindrome.py
|-- power.log
|-- projects -> ../projects/
|-- report.log
|-- todo
`-- workshop_brochures
|-- Python_workshop_2017.pdf
`-- Scripting_course_2016.pdf

often, we want to prune which files/directories are to be listed


commands like find provide extensive features in this regard
the shell itself provides a matching technique called glob/wildcards
see Shell wildcards section for more examples and details
beginners incorrectly associate globbing with the ls command, so globbing results are shown below using the echo command as a demonstration

$ # all unquoted arguments are subjected to shell globbing interpretation


$ echo *.py *.log
hello_world.py palindrome.py power.log report.log
$ echo '*.py' *.log
*.py power.log report.log

$ # long list only files ending with .py


$ ls -l *.py
-rwxrwxr-x 1 learnbyexample eg 41 Aug 21 2017 hello_world.py
-rwxrwxr-x 1 learnbyexample eg 1236 Aug 21 2017 palindrome.py

$ # match all filenames starting with alphabets c/d/e/f/g/h/i


$ echo [c-i]*
ch.sh hello_world.py ip.txt
$ ls -sh [c-i]*
4.0K ch.sh 4.0K hello_world.py 4.0K ip.txt

use -d option to not show directory contents


$ echo b*
backups
$ # since backups is a directory, ls will list its contents
$ ls b*
chrome_bookmarks_02_07_2018.html dot_files
$ # -d option will show the directory entry instead of its contents
$ ls -d b*
backups

$ # a simple way to get only the directory entries


$ # assuming simple filenames without spaces/newlines/etc
$ echo */
backups/ projects/ workshop_brochures/
$ ls -d */
backups/ projects/ workshop_brochures/

Further Reading

man ls and info ls for more options and complete documentation


ls Q&A on unix stackexchange
ls Q&A on stackoverflow
mywiki.wooledge: avoid parsing output of ls
unix.stackexchange: why not parse ls?
unix.stackexchange: What are ./ and ../ directories?

mkdir
make directories

Linux filenames can use any character other than / and the ASCII NUL character
quote the arguments if the name contains characters like space, * , etc to prevent shell interpretation
shell considers space as argument separator, * is a globbing character, etc
unless otherwise needed, try to use only alphabets, numbers and underscores for filenames

$ # one or more absolute/relative paths can be given to create directories


$ mkdir reports 'low power adders'

$ # listing can be confusing when filename contains characters like space


$ ls
low power adders reports
$ ls -1
low power adders
reports

use -p option to create multiple directory hierarchies in one go


it is also useful in scripts to create a directory without having to check if it already exists
special variable $? gives exit status of last executed command
0 indicates success and other values indicate some kind of failure
see documentation of respective commands for details


$ mkdir reports
mkdir: cannot create directory ‘reports’: File exists
$ echo $?
1
$ # when -p is used, mkdir won't give an error if directory already exists
$ mkdir -p reports
$ echo $?
0

$ # error because 'a/b' doesn't exist


$ mkdir a/b/c
mkdir: cannot create directory ‘a/b/c’: No such file or directory
$ # with -p, any non-existing directory will be created as well
$ mkdir -p a/b/c
$ ls -1R a
a:
b

a/b:
c

a/b/c:

Further Reading

mkdir Q&A on unix stackexchange


mkdir Q&A on stackoverflow
unix.stackexchange: Characters best avoided in filenames

touch
Usually files are created using a text editor or by redirecting output of a command to a file
But sometimes, for example to test file renaming, creating empty files comes in handy
the touch command is primarily used to change timestamp of a file (see touch section of next chapter)
if a filename given to touch doesn't exist, an empty file gets created with current timestamp

$ touch ip.txt
$ ls -1F
a/
ip.txt
low power adders/
reports/

rm
remove files and directories

to delete files, specify them as separate arguments


to delete directories as well, use -r option (deletes recursively)


use the -f option to force removal without prompting, even for write protected files; it also ignores non-existing files (provided the user has appropriate permissions)

$ ls
a ip.txt low power adders reports
$ rm ip.txt
$ ls
a low power adders reports

$ rm reports
rm: cannot remove 'reports': Is a directory
$ rm -r reports
$ ls
a low power adders

$ # to remove only empty directory, same as 'rmdir' command


$ rm -d a
rm: cannot remove 'a': Directory not empty

typos like misplaced space, wrong glob, etc could wipe out files not intended for deletion
apart from having backups and snapshots, one could take some mitigating steps
using -i option to interactively delete each file
using echo as a dry run to see how the glob expands
using a trash command (see links below) instead of rm

$ rm -ri 'low power adders'
rm: remove directory 'low power adders'? n
$ ls
a low power adders

$ rm -ri a
rm: descend into directory 'a'? y
rm: descend into directory 'a/b'? y
rm: remove directory 'a/b/c'? y
rm: remove directory 'a/b'? y
rm: remove directory 'a'? y
$ ls
low power adders
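The echo dry run mentioned above can be seen in action (scratch filenames, for illustration):

```shell
cd "$(mktemp -d)"
touch a1.log a2.log keep.txt

echo *.log    # preview what the glob matches: a1.log a2.log
rm *.log
ls            # keep.txt survives
```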

Further Reading

See if a trash command is available for your distro (for ex: gvfs-trash on Ubuntu) - this will send items to the trash instead of deleting them
or, unix.stackexchange: creating a simple trash command
Files removed using rm can still be recovered with time/skill. Use shred command to overwrite files
unix.stackexchange: recover deleted files
unix.stackexchange: recovering accidentally deleted files
wiki.archlinux: Securely wipe disk
rm Q&A on unix stackexchange
rm Q&A on stackoverflow


cp
copy files and directories

to copy a single file or directory, specify the source as first argument and destination as second argument
similar to rm command, use -r for directories

$ # when destination is a directory, specified sources are placed inside that directory
$ # recall that . is a relative path referring to current directory
$ cp /usr/share/dict/words .
$ ls
low power adders words

$ cp /usr/share/dict .
cp: omitting directory '/usr/share/dict'
$ cp -r /usr/share/dict .
$ ls -1F
dict/
low power adders/
words

often, we want to copy a file for the purpose of modifying it


in such cases, a different name can be given while specifying the destination
if the destination filename already exists, it will be overwritten (see options -i and -n to avoid this)

$ cp /usr/share/dict/words words_ref.txt
$ cp -r /usr/share/dict word_lists

$ ls -1F
dict/
low power adders/
word_lists/
words
words_ref.txt

multiple files and directories can be copied at once if the destination is a directory
using the -t option, one could specify the destination directory first, followed by the sources (this is helpful with the find command and in other places)

$ mkdir bkp_dot_files

$ # here, ~ will get expanded to user's home directory


$ cp ~/.bashrc ~/.bash_profile bkp_dot_files/
$ ls -A bkp_dot_files
.bash_profile .bashrc
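A sketch of the -t form mentioned above, with illustrative filenames:

```shell
cd "$(mktemp -d)"
touch notes.txt todo.txt
mkdir bkp

# with -t, the destination directory comes first, then the sources
cp -t bkp notes.txt todo.txt    # same as: cp notes.txt todo.txt bkp/
ls bkp
```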

see man cp and info cp for more options and complete documentation
some notable options are
-u copy files from source only if they are newer than those in the destination, or if they don't exist in the destination location
-b and --backup for back up options if file with same name already exists in destination location
--preserve option to copy files along with source file attributes like timestamp

Further Reading


cp Q&A on unix stackexchange


cp Q&A on stackoverflow
rsync a fast, versatile, remote (and local) file-copying tool

rsync examples
rsync Q&A on unix stackexchange
rsync Q&A on stackoverflow

mv
move (rename) files

as the name suggests, mv can move files from one location to another

if multiple files need to be moved, the destination argument should be a directory (or specified using the -t option)
unlike rm and cp , files and directories use the same syntax - no additional option is required
use the -i option to be prompted instead of overwriting a file of the same name in the destination location

$ ls
bkp_dot_files dict low power adders word_lists words words_ref.txt
$ mkdir backups

$ mv bkp_dot_files/ backups/
$ ls -F
backups/ dict/ low power adders/ word_lists/ words words_ref.txt
$ ls -F backups/
bkp_dot_files/

$ mv dict words backups/


$ ls -F
backups/ low power adders/ word_lists/ words_ref.txt
$ ls -F backups/
bkp_dot_files/ dict/ words

like the cp command, for a single file/directory one can provide a different destination name
so, when source and destination have the same parent directory, mv acts as a renaming command

$ mv backups/bkp_dot_files backups/dot_files
$ ls -F backups/
dict/ dot_files/ words

Further Reading

mv Q&A on unix stackexchange


mv Q&A on stackoverflow

rename
renames multiple files


Note: The Perl-based rename is presented here, which is different from the util-linux-ng version. Check man rename for details

$ ls
backups low power adders word_lists words_ref.txt
$ # here, the * glob will expand to all non-hidden files in current directory
$ # -n option is for dry run, to see changes before actually renaming files
$ # s/ /_/g means replace all space characters with _ character
$ rename -n 's/ /_/g' *
rename(low power adders, low_power_adders)

$ rename 's/ /_/g' *


$ ls
backups low_power_adders word_lists words_ref.txt

Further Reading

rename Q&A on unix stackexchange


See Perl one liners for examples and details on Perl substitution command
Some more rename examples - unix.stackexchange: replace dots except last one and stackoverflow: change date format

ln
make links between files

there are two types of links - symbolic links and hard links


a symbolic link is like a pointer/shortcut to another file or directory
if the original file is deleted or moved to another location, the symbolic link will no longer work
if the symbolic link is moved to another location, it will still work if the link was created using an absolute path (for a relative path, it depends on whether or not there's another file with the same name in that location)
a symbolic link file has its own inode, permissions, timestamps, etc
most commands work the same whether the original file or the symbolic link is given as a command line argument, see their documentation for details

$ # similar to cp, a different name can be specified if needed


$ ln -s /usr/share/dict/words .
$ ls -F
words@

$ # to know which file the link points to


$ ls -l words
lrwxrwxrwx 1 learnbyexample eg 21 Jul 9 13:41 words -> /usr/share/dict/words
$ readlink words
/usr/share/dict/words
$ # the linked file may be another link
$ # use -f option to get original file
$ readlink -f words
/usr/share/dict/english

a hard link can only point to another file (not a directory) and is restricted to the same filesystem


the . and .. special directories are the exceptions, they are hard links that are automatically created
once a hard link is created, there is no distinction between the two files other than their filename/location - they have the same inode, permissions, timestamps, etc
any of the hard links will continue working even if all the other hard links are deleted
since hard links point to the same inode, the links are always in sync - any change through one of them is reflected in all the others

$ touch foo.txt
$ ln foo.txt baz.txt

$ # the -i option gives inode


$ ls -1i foo.txt baz.txt
649140 baz.txt
649140 foo.txt

Further Reading

unlink command to delete links ( rm can be used as well)

ln Q&A on unix stackexchange


ln Q&A on stackoverflow
askubuntu: What is the difference between a hard link and a symbolic link?
unix.stackexchange: What is the difference between symbolic and hard links?
unix.stackexchange: What is a Superblock, Inode, Dentry and a File?

tar and gzip


tar is an archiving utility

first, let's see an example of creating a single archive file from multiple input files
note that the archive so created is a new file and doesn't overwrite the input files

$ ls -F
backups/ low_power_adders/ word_lists/ words_ref.txt

$ # -c option creates a new archive, existing archive will be overwritten


$ # -f option allows to specify name of archive to be created
$ # rest of the arguments are the files to be archived
$ tar -cf bkp_words.tar word_lists words_ref.txt

$ ls -F
backups/ bkp_words.tar low_power_adders/ word_lists/ words_ref.txt
$ ls -sh bkp_words.tar
2.3M bkp_words.tar

once we have an archive, we can compress it using gzip


this will replace the archive file with compressed version, adding a .gz suffix


$ gzip bkp_words.tar

$ ls -F
backups/ bkp_words.tar.gz low_power_adders/ word_lists/ words_ref.txt
$ ls -sh bkp_words.tar.gz
652K bkp_words.tar.gz

to uncompress, use gunzip or gzip -d


this will replace the compressed version with the uncompressed archive file

$ gunzip bkp_words.tar.gz

$ ls -F
backups/ bkp_words.tar low_power_adders/ word_lists/ words_ref.txt
$ ls -sh bkp_words.tar
2.3M bkp_words.tar

to extract the original files from archive, use -x option

$ mkdir test_extract
$ mv bkp_words.tar test_extract/
$ cd test_extract/
$ ls
bkp_words.tar

$ tar -xf bkp_words.tar


$ ls -F
bkp_words.tar word_lists/ words_ref.txt
$ cd ..
$ rm -r test_extract/

the GNU version of tar supports compressing/uncompressing options as well

$ ls -F
backups/ low_power_adders/ word_lists/ words_ref.txt

$ # -z option gives same compression as gzip command


$ # reverse would be: tar -zxf bkp_words.tar.gz
$ tar -zcf bkp_words.tar.gz word_lists words_ref.txt
$ ls -sh bkp_words.tar.gz
652K bkp_words.tar.gz

there are loads of options for various needs, see documentation for details
-v for verbose option
-r to append files to archive
-t to list contents of archive
--exclude= to specify files to be ignored from archiving
-j and -J to use bzip2 or xz compression technique instead of -z which uses gzip
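
A short sketch of the -v , -t and --exclude options using throwaway files (all names below are made up for this demo):

```shell
# sandbox with one text file and one log file to be excluded
mkdir -p tar_demo/files
printf 'hello\n' > tar_demo/files/a.txt
printf 'temp\n' > tar_demo/files/skip.log

# --exclude ignores matching files, -v lists files as they are archived
tar --exclude='*.log' -cvf tar_demo/demo.tar -C tar_demo files

# -t lists archive contents without extracting
listing=$(tar -tf tar_demo/demo.tar)
echo "$listing"
# files/ and files/a.txt are listed, skip.log is not in the archive
```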

there are commands starting with z to work with compressed files


zcat to display file contents of compressed file on standard output
zless to display file contents of compressed file one screenful at a time
zgrep to search compressed files and so on...
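
For instance, with a throwaway compressed file:

```shell
printf 'apple\nbanana\ncherry\n' > fruits_z.txt
gzip fruits_z.txt

# view and search the compressed file without uncompressing it first
zcat fruits_z.txt.gz
zgrep -c 'an' fruits_z.txt.gz
# 1
```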


Further Reading

tar Q&A on unix stackexchange


tar Q&A on stackoverflow
superuser: gzip without tar? Why are they used together?
zip and unzip commands


Working with Files and Directories


cat
less
tail
head
Text Editors
grep
find
locate
wc
du
df
touch
file
basename
dirname
chmod

In this chapter, we will see how to display the contents of a file, search within files, search for files, get file properties and information, and understand the permissions for files/directories and how to change them to our requirements

cat
concatenate files and print on the standard output

Options

-n number output lines


-s squeeze repeated empty lines into single empty line
-e show non-printing characters and end of line
-A in addition to -e , also shows tab characters

Examples

One or more files can be given as input, so cat is often used to quickly see the contents of a small file on the terminal
But it is not needed just to pass file content as stdin to another command. See Useless Use of Cat Award
To save the output of the concatenation, just redirect stdout


$ ls
marks_2015.txt marks_2016.txt marks_2017.txt

$ cat marks_201*
Name Maths Science
foo 67 78
bar 87 85
Name Maths Science
foo 70 75
bar 85 88
Name Maths Science
foo 68 76
bar 90 90

$ # save stdout to a file


$ cat marks_201* > all_marks.txt

an often used option is -A , to see various non-printing characters and end of line markers

$ printf 'foo\0bar\tbaz \r\n'


foobar baz

$ printf 'foo\0bar\tbaz \r\n' | cat -A


foo^@bar^Ibaz ^M$

use tac to reverse input line wise

$ tac marks_2015.txt
bar 87 85
foo 67 78
Name Maths Science

$ seq 3 | tac
3
2
1

Further Reading

For more detailed examples and discussion, see section cat from command line text processing repo
cat Q&A on unix stackexchange
cat Q&A on stackoverflow

less
opposite of more

The cat command is not suitable for viewing contents of large files on the Terminal. less displays the contents of a file, automatically fits to the size of the Terminal, allows scrolling in either direction and has other options for effective viewing. Usually, the man command uses less to display the help page. The navigation options are similar to the vi editor

Commonly used commands are given below, press h for summary of options


g go to start of file

G go to end of file
q quit

/pattern search for the given pattern in forward direction

?pattern search for the given pattern in backward direction


n go to next pattern

N go to previous pattern

Example and Further Reading

less -s large_filename display contents of large_filename, consecutive blank lines are squeezed into a single blank line
Use the -N option to prefix line numbers
the less command is an improved version of the more command

differences between most, more and less


less Q&A on unix stackexchange

tail
output the last part of files

Examples

By default, tail displays last 10 lines


Use -n option to change how many lines are needed

$ # last two lines


$ tail -n2 report.log
Error: something seriously went wrong
blah blah blah

$ # all lines starting from 3rd line i.e all lines except first two lines
$ seq 13 17 | tail -n +3
15
16
17

multiple file input

$ # use -q option to avoid filename in output


$ tail -n2 report.log sample.txt
==> report.log <==
Error: something seriously went wrong
blah blah blah

==> sample.txt <==


He he he
Adios amigo

characterwise extraction
Note that this works byte-wise and is not suitable for multi-byte character encodings


$ # last three characters including the newline character


$ echo 'Hi there!' | tail -c3
e!

$ # excluding the first character


$ echo 'Hi there!' | tail -c +2
i there!

Further Reading

For more detailed examples and discussion, see section tail from command line text processing repo
tail Q&A on unix stackexchange
tail Q&A on stackoverflow

head
output the first part of files

Examples

By default, head displays first 10 lines


Use -n option to change how many lines are needed

$ head -n3 report.log


blah blah
Warning: something went wrong
more blah

$ # tail gives last 10 lines, head then gives first 2 from tail output
$ tail sample.txt | head -n2
Just do-it
Believe it

$ # except last 2 lines


$ seq 13 17 | head -n -2
13
14
15
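
head and tail can also be combined to extract a range of lines from the middle of an input, for example lines 3 to 5:

```shell
# head keeps the first 5 lines, tail then keeps the last 3 of those
seq 10 | head -n5 | tail -n3
# 3
# 4
# 5
```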

multiple file input

$ # use -q option to avoid filename in output


$ head -n3 report.log sample.txt
==> report.log <==
blah blah
Warning: something went wrong
more blah

==> sample.txt <==


Hello World!

Good day

characterwise extraction


Note that this works byte-wise and is not suitable for multi-byte character encodings

$ # first two characters


$ echo 'Hi there!' | head -c2
Hi

$ # excluding last four characters


$ echo 'Hi there!' | head -c -4
Hi the

Further Reading

For more detailed examples and discussion, see section head from command line text processing repo
head Q&A on unix stackexchange

Text Editors
For editing text files, the following applications can be used. Of these, gedit , nano , vi and/or vim are available in
most distros by default

Easy to use

gedit
geany
nano

Powerful text editors

vim
vim learning resources and vim reference for further info
emacs
atom
sublime

grep
print lines matching a pattern

grep stands for Global Regular Expression Print. It is used to search for a pattern in the given files - whether a particular word or pattern is present (or not present), names of files containing the pattern, etc. By default, matching is performed on any part of a line; options and regular expressions can be used to match only the desired part

Options

--color=auto display the matched pattern, file names, line numbers etc with color distinction
-i ignore case while matching
-v print non-matching lines, i.e it inverts the selection
-n print also line numbers of matched pattern

-c display only the count of number of lines matching the pattern


-l print only the filenames with matching pattern


-L print filenames NOT matching the pattern

-w match pattern against whole word


-x match pattern against whole line

-F interpret pattern to search as fixed string (i.e not a regular expression). Faster as well

-o print only matching parts


-A number print matching line and 'number' of lines after the matched line

-B number print matching line and 'number' of lines before the matched line

-C number print matching line and 'number' of lines before and after the matched line
-m number restrict printing to upper limit of matched lines specified by 'number'

-q no standard output, quits immediately if a match is found. Useful in scripts to check whether a file contains the search pattern or not
-s suppress error messages if a file doesn't exist or isn't readable. Again, more useful in scripts

-r recursively search all files in specified folders

-h do not prefix filename for matching lines (default behavior for single file search)
-H prefix filename for matching lines (default behavior for multiple file search)

Examples

grep 'area' report.log will print all lines containing the word area in report.log

grep 'adder power' report.log will print lines containing adder power
man grep | grep -i -A 5 'exit status' will print the matched line and the 5 lines after it, matching 'exit status' independent of case
See the Context Line Control topic in info grep for related options like --no-group-separator
grep -m 5 'error' report.log will print maximum of 5 lines containing the word error

grep "$HOME" /etc/passwd will print lines matching the value of environment variable HOME
Note the use of double quotes for variable substitution
grep -w 'or' story.txt match whole word or, not part of word like for, thorn etc
grep -x 'power' test_list.txt match whole line containing the pattern power

Note: All of the above examples are suited to the -F option as they do not use regular expressions, and will be faster with the -F option
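
The -q option mentioned above pairs naturally with shell conditionals in scripts. A small sketch with a throwaway log file:

```shell
printf 'area 45\npower 12\n' > report_demo.log

# -q produces no output, the exit status alone indicates a match
if grep -q 'power' report_demo.log; then
    result='power found'
else
    result='power not found'
fi
echo "$result"
# power found
```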

Regular Expressions

Regular Expressions help in defining precise patterns, like extracting only alphabets or numbers, matching at start of
line, end of line, character sequence, etc
The following reference is for ERE - Extended Regular Expressions

Anchors

^ match from start of line


$ match end of line

\< match beginning of word


\> match end of word
\b match edge of word
\B match other than edge of word

Character Quantifiers

. match any single character


* match preceding character 0 or more times
+ match preceding character 1 or more times


? match preceding character 0 or 1 times

{n} match preceding character exactly n times


{n,} match preceding character n or more times

{n,m} match preceding character n to m times, including n and m

{,m} match preceding character up to m times

Character classes

[aeiou] match any of these characters


[^aeiou] do not match any of these characters

[a-z] match any lowercase alphabet

[0-9] match any digit character

\w match alphabets, digits and underscore character, short cut for [a-zA-Z0-9_]

\W opposite of \w , short cut for [^a-zA-Z0-9_]

\s match white-space characters: tab, newline, vertical tab, form feed, carriage return, and space
\S match other than white-space characters

Pattern groups

| matches either of the given patterns


() patterns within () are grouped and treated as one pattern, useful in conjunction with |

\1 backreference to first grouped pattern within ()


\2 backreference to second grouped pattern within () and so on

Basic vs Extended Regular Expressions

By default, the pattern passed to grep is treated as Basic Regular Expressions(BRE), which can be overridden using
options like -E for ERE and -P for Perl Compatible Regular Expression(PCRE)
Paraphrasing from info grep

In Basic Regular Expressions the meta-characters ? + { | ( ) lose their special meaning, instead use the
backslashed versions \? \+ \{ \| \( \)

Examples

grep -i '[a-z]' report.log will print all lines having at least one alphabet character
grep '[0-9]' report.log will print all lines having at least one number
grep 'area\|power' report.log will match lines containing area or power
grep -E 'area|power' report.log will match lines containing area or power

grep -E 'hand(y|ful)' short_story.txt match handy or handful


grep -E '(Th)?is' short_story.txt match This or is
grep -iE '([a-z])\1' short_story.txt match same alphabet appearing consecutively like 'aa', 'FF', 'Ee' etc

Perl Compatible Regular Expression

grep -P '\d' report.log will print all lines having at least one number
grep -P '(Th)?is' short_story.txt match This or is
grep -oP 'file\K\d+' report.log print only digits that are preceded by the string 'file'
man pcrepattern syntax and semantics of the regular expressions that are supported by PCRE
look-around assertions example

Example input files


$ cat ip.txt
Roses are red,
Violets are blue,
Sugar is sweet,
And so are you.

$ echo -e 'Red\nGreen\nBlue\nBlack\nWhite' > colors.txt


$ cat colors.txt
Red
Green
Blue
Black
White

string search, use -F for faster results

$ grep -F 'are' ip.txt


Roses are red,
Violets are blue,
And so are you.

$ grep -Fv 'are' ip.txt


Sugar is sweet,

$ grep -Fc 'are' ip.txt


3

$ grep -F -m2 'are' ip.txt


Roses are red,
Violets are blue,

$ grep -F 'rose' ip.txt


$ grep -Fi 'rose' ip.txt
Roses are red,

regular expression, cannot use -F

$ # lines with words starting with s or S


$ grep -iE '\bs' ip.txt
Sugar is sweet,
And so are you.

$ # get only the words starting with s or S


$ grep -ioE '\bs[a-z]+' ip.txt
Sugar
sweet
so

using file input to specify search terms


$ grep -Fif colors.txt ip.txt


Roses are red,
Violets are blue,

$ echo -e 'Brown\nRed\nGreen\nBlue\nYellow\nBlack\nWhite' > more_colors.txt

$ # get common lines between two files


$ grep -Fxf colors.txt more_colors.txt
Red
Green
Blue
Black
White

$ # get lines present in more_colors.txt but not colors.txt


$ grep -Fxvf colors.txt more_colors.txt
Brown
Yellow

Further Reading

For more detailed examples and discussion, see GNU grep chapter from command line text processing repo
how grep command was born
why GNU grep is fast
Difference between grep, egrep and fgrep
grep Q&A on stackoverflow
grep Q&A on unix stackexchange

find
search for files in a directory hierarchy

Examples

Filtering based on file name

find . -iname 'power.log' search and print path of file named power.log (ignoring case) in current directory and
its sub-directories
find -name '*log' search and print path of all files whose name ends with log in current directory - using . is
optional when searching in current directory
find -not -name '*log' print path of all files whose name does NOT end with log in current directory

find -regextype egrep -regex '.*/\w+' use extended regular expression to match filenames containing only [a-zA-Z0-9_] characters
.*/ is needed to match the initial part of the file path

Filtering based on file type

find /home/guest1/proj -type f print path of all regular files found in specified directory

find /home/guest1/proj -type d print path of all directories found in specified directory
find /home/guest1/proj -type f -name '.*' print path of all hidden files

Filtering based on depth


The relative path . is considered as depth 0 directory, files and folders immediately contained in a directory are at
depth 1 and so on

find -maxdepth 1 -type f all regular files (including hidden ones) from current directory (without going to sub-directories)
find -maxdepth 1 -type f -name '[!.]*' all regular files (but not hidden ones) from current directory (without going to sub-directories)
-not -name '.*' can also be used

find -mindepth 1 -maxdepth 1 -type d all directories (including hidden ones) in current directory (without going to sub-directories)

Filtering based on file properties

find -mtime -2 print files that were modified within last two days in current directory

Note that day here means 24 hours


find -mtime +7 print files that were modified more than seven days back in current directory

find -daystart -type f -mtime -1 files that were modified from beginning of day (not past 24 hours)
find -size +10k print files with size greater than 10 kilobytes in current directory

find -size -1M print files with size less than 1 megabyte in current directory
find -size 2G print files of size 2 gigabytes in current directory
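
The name and type filters above can be tried out safely in a throwaway directory tree (all the names below are made up for this sketch):

```shell
# small sandbox: two log files at different depths, one text file
mkdir -p find_demo/sub
touch find_demo/a.log find_demo/b.txt find_demo/sub/c.log

# regular files ending with log, searched recursively
find find_demo -type f -name '*.log' | sort
# find_demo/a.log
# find_demo/sub/c.log

# directories only, including the starting point itself
find find_demo -type d | sort
# find_demo
# find_demo/sub
```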

Passing filtered files as input to other commands

find report -name '*log*' -exec rm {} \; delete all filenames containing log in report folder and its sub-folders

here rm command is called for every file matching the search conditions
since ; is a special character for shell, it needs to be escaped using \
find report -name '*log*' -delete delete all filenames containing log in report folder and its sub-folders

find -name '*.txt' -exec wc {} + the files ending with txt are all passed together as arguments to a single wc command instead of executing wc for every file
no need to escape the + character in this case
also note that the command may be invoked more than once if the number of files found is too large
find -name '*.log' -exec mv {} ../log/ \; move files ending with .log to the log directory present one hierarchy above. mv is executed once per filtered file
find -name '*.log' -exec mv -t ../log/ {} + the -t option allows specifying the target directory and then providing multiple files to be moved as arguments
Similarly, -t can be used with the cp command
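
A quick sketch of -exec with + using throwaway files: since all matched files are handed to a single wc invocation, wc also prints a total line

```shell
mkdir -p exec_demo
printf '1\n2\n' > exec_demo/x.txt
printf '3\n' > exec_demo/y.txt

# one wc call receives both files, hence the total line at the end
counts=$(find exec_demo -name '*.txt' -exec wc -l {} +)
echo "$counts"
```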

Further Reading

using find
Collection of find examples
find Q&A on unix stackexchange
find and tar example
find Q&A on stackoverflow
Why is looping over find's output bad practice?

locate
find files by name


Faster alternative to the find command when searching for a file by its name. It is based on a database, which gets updated by a cron job, so newer files may not be present in results. Use this command if it is available in your distro and you remember some part of the filename. Very useful when one has to search the entire filesystem, in which case find might take a very long time compared to locate

Examples

locate 'power' print path of files containing power in the whole filesystem

matches anywhere in the path, ex: '/home/learnbyexample/lowpower_adder/result.log' and '/home/learnbyexample/power.log' are both valid matches
implicitly, locate would change the string to *power* as no globbing characters are present in the specified string
locate -b '\power.log' print paths matching the string power.log exactly at the end of the path

'/home/learnbyexample/power.log' matches but not '/home/learnbyexample/lowpower.log'
since a globbing character ( \ ) is used while specifying the search string, it doesn't get implicitly replaced by *power.log*

locate -b '\proj_adder' the -b option also comes in handy to print only the path of a directory name, otherwise every file under that folder would also be displayed
find vs locate - pros and cons

wc
print newline, word, and byte counts for each file

Examples

$ # by default, gives newline/word/byte count (in that order)


$ wc sample.txt
5 17 78 sample.txt

$ # options to get only newline/word/byte count


$ wc -l sample.txt
5 sample.txt
$ wc -w sample.txt
17 sample.txt
$ wc -c sample.txt
78 sample.txt

$ # use shell input redirection if filename is not needed


$ wc -l < sample.txt
5

multiple file input

$ # automatically displays total at end


$ wc *.txt
5 10 57 fruits.txt
2 6 32 greeting.txt
5 17 78 sample.txt
12 33 167 total

other options


$ # use -L to get length of longest line


$ # won't count non-printable characters, tabs are converted to equivalent spaces
$ wc -L < sample.txt
24
$ printf 'foo\tbar\0baz' | wc -L
14
$ printf 'foo\tbar\0baz' | awk '{print length()}'
11

$ # -c gives byte count, -m gives character count
$ # the difference shows up for multi-byte characters


$ printf 'hi👍' | wc -m
3
$ printf 'hi👍' | wc -c
6

Further Reading

For more detailed examples and discussion, see section wc from command line text processing repo
wc Q&A on unix stackexchange
wc Q&A on stackoverflow

du
estimate file space usage

Examples

$ # By default, size is given in size of 1024 bytes


$ # Files are ignored, all directories and sub-directories are recursively reported
$ # add -a option if files are also needed and -L if links should be dereferenced
$ du
17920 ./projs/full_addr
14316 ./projs/half_addr
32952 ./projs
33880 .

$ # use -s to show total directory size without descending into sub-directories


$ # add -c to also show total size at end
$ du -s projs words.txt
32952 projs
924 words.txt

different size formatting options


$ # number of bytes
$ du -b words.txt
938848 words.txt

$ # kilobytes = 1024 bytes


$ du -sk projs
32952 projs

$ # megabytes = 1024 kilobytes


$ du -sm projs
33 projs

$ # human readable, use --si for powers of 1000 instead of 1024


$ du -h words.txt
924K words.txt

$ # sorting
$ du -sh projs/* words.txt | sort -h
712K projs/report.log
924K words.txt
14M projs/half_addr
18M projs/full_addr

Further Reading

For more detailed examples and discussion, see section du from command line text processing repo
du Q&A on unix stackexchange
du Q&A on stackoverflow

df
report file system disk space usage

Examples

$ # use df without arguments to get information on all currently mounted file systems
$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 98298500 58563816 34734748 63% /

$ # use -B option for custom size


$ # use --si for size in powers of 1000 instead of 1024
$ df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 94G 56G 34G 63% /

Use --output to report only specific fields of interest


$ df -h --output=size,used,file / /media/learnbyexample/projs
Size Used File
94G 56G /
92G 35G /media/learnbyexample/projs

$ df -h --output=pcent .
Use%
63%

$ df -h --output=pcent,fstype | awk -F'%' 'NR>2 && $1>=40'


63% ext3
40% ext4
51% ext4

Further Reading

For more detailed examples and discussion, see section df from command line text processing repo
df Q&A on stackoverflow

touch
change file timestamps

Examples

$ # last access and modification time


$ stat -c $'%x\n%y' fruits.txt
2017-07-19 17:06:01.523308599 +0530
2017-07-13 13:54:03.576055933 +0530

$ # Updating both access and modification timestamp to current time


$ # add -a to change only access timestamp and -m to change only modification
$ touch fruits.txt
$ stat -c $'%x\n%y' fruits.txt
2017-07-21 10:11:44.241921229 +0530
2017-07-21 10:11:44.241921229 +0530

$ # copy both access and modification timestamp from power.log to report.log


$ # add -a or -m as needed
$ # See also -d and -t options
$ touch -r power.log report.log
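
The -d option accepts a human readable date string, a quick sketch with a throwaway file:

```shell
# set the timestamp to an arbitrary date instead of the current time
touch -d '2018-01-01 12:00:00' time_demo.txt
stat -c '%y' time_demo.txt
# modification time now starts with 2018-01-01 12:00:00
```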

If file doesn't exist, an empty one gets created unless -c is used

$ ls foo.txt
ls: cannot access 'foo.txt': No such file or directory

$ touch foo.txt
$ ls foo.txt
foo.txt

Further Reading

For more detailed examples and discussion, see section touch from command line text processing repo


touch Q&A on unix stackexchange

file
determine file type

Examples

$ file sample.txt
sample.txt: ASCII text
$ printf 'hi\n' | file -
/dev/stdin: UTF-8 Unicode text
$ file ch
ch: Bourne-Again shell script, ASCII text executable

$ printf 'hi\r\n' | file -


/dev/stdin: ASCII text, with CRLF line terminators

$ file sunset.jpg moon.png


sunset.jpg: JPEG image data
moon.png: PNG image data, 32 x 32, 8-bit/color RGBA, non-interlaced

find all files of particular type in current directory, for example image files

$ find -type f -exec bash -c '(file -b "$0" | grep -wq "image data") && echo "$0"' {} \;
./sunset.jpg
./moon.png

$ # if filenames do not contain : or newline characters


$ find -type f -exec file {} + | awk -F: '/\<image data\>/{print $1}'
./sunset.jpg
./moon.png

Further Reading

For more detailed examples and discussion, see section file from command line text processing repo
See also identify command which describes the format and characteristics of one or more image files

basename
strip directory and suffix from filenames

Examples


$ # same as using pwd command


$ echo "$PWD"
/home/learnbyexample

$ basename "$PWD"
learnbyexample

$ # use -a option if there are multiple arguments


$ basename -a foo/a/report.log bar/y/power.log
report.log
power.log

$ # use single quotes if arguments contain space and other special shell characters
$ # use suffix option -s to strip file extension from filename
$ basename -s '.log' '/home/learnbyexample/proj adder/power.log'
power
$ # -a is implied when using -s option
$ basename -s'.log' foo/a/report.log bar/y/power.log
report
power

For more detailed examples and discussion, see section basename from command line text processing repo

dirname
strip last component from file name

Examples

$ echo "$PWD"
/home/learnbyexample

$ dirname "$PWD"
/home

$ # use single quotes if arguments contain space and other special shell characters
$ dirname '/home/learnbyexample/proj adder/power.log'
/home/learnbyexample/proj adder

$ # unlike basename, by default dirname handles multiple arguments


$ dirname foo/a/report.log bar/y/power.log
foo/a
bar/y

$ # if no / in argument, output is . to indicate current directory


$ dirname power.log
.
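
basename and dirname are often combined in scripts to split a path into its components. A small sketch reusing the example path above:

```shell
path='/home/learnbyexample/proj adder/power.log'

# double quote the variables to handle spaces in the path
dir=$(dirname "$path")
file=$(basename "$path")
name=$(basename -s '.log' "$path")

echo "$dir"
# /home/learnbyexample/proj adder
echo "$file"
# power.log
echo "$name"
# power
```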

For more detailed examples and discussion, see section dirname from command line text processing repo

chmod
change file mode bits


$ ls -l sample.txt
-rw-rw-r-- 1 learnbyexample learnbyexample 111 May 25 14:47 sample.txt

In the output of the ls -l command, the first 10 characters displayed relate to the type of the file and its permissions.

The first character indicates the file type. The most common are

- regular file

d directory

l symbolic link

for complete list, see -l option in info ls

The other 9 characters represent three sets of file permissions for 'user', 'group' and 'others' - in that order

user file permissions for the owner of the file - u

group file permissions for the group the file belongs to - g

others file permissions for everyone else - o

We'll be seeing only rwx file permissions in this section; for other types of permissions, refer to this detailed doc

Permission characters and values

| Character | Meaning | Value | File | Directory |
| --- | --- | --- | --- | --- |
| r | read | 4 | file can be read | can see contents of directory |
| w | write | 2 | file can be modified | can add/remove files in directory |
| x | execute | 1 | file can be run as a program | can access contents of directory |
| - | no permission | 0 | permission is disabled | permission is disabled |
File Permissions


$ pwd
/home/learnbyexample/linux_tutorial
$ ls -lF
total 8
-rw-rw-r-- 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl
-rw-rw-r-- 1 learnbyexample learnbyexample 111 May 25 14:47 sample.txt

$ #Files need execute permission to be run as program


$ ./hello_world.pl
bash: ./hello_world.pl: Permission denied
$ chmod +x hello_world.pl
$ ls -lF hello_world.pl
-rwxrwxr-x 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl*
$ ./hello_world.pl
Hello World

$ #Read permission
$ cat sample.txt
This is an example of adding text to a new file using cat command.
Press Ctrl+d on a newline to save and quit.
$ chmod -r sample.txt
$ ls -lF sample.txt
--w--w---- 1 learnbyexample learnbyexample 111 May 25 14:47 sample.txt
$ cat sample.txt
cat: sample.txt: Permission denied

$ #Files need write permission to modify its content


$ cat >> sample.txt
Adding a line of text at end of file
^C
$ cat sample.txt
cat: sample.txt: Permission denied
$ chmod +r sample.txt
$ ls -lF sample.txt
-rw-rw-r-- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt
$ cat sample.txt
This is an example of adding text to a new file using cat command.
Press Ctrl+d on a newline to save and quit.
Adding a line of text at end of file
$ chmod -w sample.txt
$ ls -lF sample.txt
-r--r--r-- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt
$ cat >> sample.txt
bash: sample.txt: Permission denied

Directory Permissions


$ ls -ld linux_tutorial/
drwxrwxr-x 2 learnbyexample learnbyexample 4096 May 29 10:59 linux_tutorial/

$ #Read Permission
$ ls linux_tutorial/
hello_world.pl sample.txt
$ chmod -r linux_tutorial/
$ ls -ld linux_tutorial/
d-wx-wx--x 2 learnbyexample learnbyexample 4096 May 29 10:59 linux_tutorial/
$ ls linux_tutorial/
ls: cannot open directory linux_tutorial/: Permission denied
$ chmod +r linux_tutorial/

$ #Execute Permission
$ chmod -x linux_tutorial/
$ ls -ld linux_tutorial/
drw-rw-r-- 2 learnbyexample learnbyexample 4096 May 29 10:59 linux_tutorial/
$ ls linux_tutorial/
ls: cannot access linux_tutorial/hello_world.pl: Permission denied
ls: cannot access linux_tutorial/sample.txt: Permission denied
hello_world.pl sample.txt
$ chmod +x linux_tutorial/

$ #Write Permission
$ chmod -w linux_tutorial/
$ ls -ld linux_tutorial/
dr-xr-xr-x 2 learnbyexample learnbyexample 4096 May 29 10:59 linux_tutorial/
$ touch linux_tutorial/new_file.txt
touch: cannot touch ‘linux_tutorial/new_file.txt’: Permission denied
$ chmod +w linux_tutorial/
$ ls -ld linux_tutorial/
drwxrwxr-x 2 learnbyexample learnbyexample 4096 May 29 10:59 linux_tutorial/
$ touch linux_tutorial/new_file.txt
$ ls linux_tutorial/
hello_world.pl new_file.txt sample.txt

Changing multiple permissions at once

$ # r(4) + w(2) + 0 = 6
$ # r(4) + 0 + 0 = 4
$ chmod 664 sample.txt
$ ls -lF sample.txt
-rw-rw-r-- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt

$ # r(4) + w(2) + x(1) = 7


$ # r(4) + 0 + x(1) = 5
$ chmod 755 hello_world.pl
$ ls -lF hello_world.pl
-rwxr-xr-x 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl*

$ chmod 775 report/


$ ls -ld report/
drwxrwxr-x 2 learnbyexample learnbyexample 4096 May 29 14:01 report/
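
The octal modes can be cross-checked with stat. A quick sketch, assuming GNU stat (perm_demo.txt is just a scratch file for the demo):

```shell
touch perm_demo.txt
chmod 640 perm_demo.txt         # u = r(4)+w(2), g = r(4), o = 0
stat -c '%a %A' perm_demo.txt   # octal and symbolic form of the same mode
# 640 -rw-r-----
rm perm_demo.txt
```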

Changing single permission selectively


$ chmod o-r sample.txt


$ ls -lF sample.txt
-rw-rw---- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt

$ chmod go-x hello_world.pl


$ ls -lF hello_world.pl
-rwxr--r-- 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl*

$ chmod go+x hello_world.pl


$ ls -lF hello_world.pl
-rwxr-xr-x 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl*

Recursively changing permission for directory

$ ls -lR linux_tutorial/
linux_tutorial/:
total 12
-rwxr-xr-x 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl
drwxrwxr-x 2 learnbyexample learnbyexample 4096 May 29 14:32 report
-rw-rw---- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt

linux_tutorial/report:
total 0
-rw-rw-r-- 1 learnbyexample learnbyexample 0 May 29 11:46 new_file.txt
$ ls -ld linux_tutorial/
drwxrwxr-x 3 learnbyexample learnbyexample 4096 May 29 14:32 linux_tutorial/

$ #adding/removing files to a directory depends only on parent directory permissions


$ chmod -w linux_tutorial/
$ ls -ld linux_tutorial/
dr-xr-xr-x 3 learnbyexample learnbyexample 4096 May 29 14:32 linux_tutorial/
$ ls -ld linux_tutorial/report/
drwxrwxr-x 2 learnbyexample learnbyexample 4096 May 29 14:32 linux_tutorial/report/
$ rm linux_tutorial/sample.txt
rm: cannot remove ‘linux_tutorial/sample.txt’: Permission denied
$ touch linux_tutorial/report/power.log
$ ls linux_tutorial/report/
new_file.txt power.log
$ rm linux_tutorial/report/new_file.txt
$ ls linux_tutorial/report/
power.log

$ chmod +w linux_tutorial/
$ ls -ld linux_tutorial/
drwxrwxr-x 3 learnbyexample learnbyexample 4096 May 29 14:32 linux_tutorial/
$ chmod -w -R linux_tutorial/
$ ls -lR linux_tutorial/
linux_tutorial/:
total 12
-r-xr-xr-x 1 learnbyexample learnbyexample 40 May 28 13:25 hello_world.pl
dr-xr-xr-x 2 learnbyexample learnbyexample 4096 May 29 14:40 report
-r--r----- 1 learnbyexample learnbyexample 148 May 29 11:00 sample.txt

linux_tutorial/report:
total 0
-r--r--r-- 1 learnbyexample learnbyexample 0 May 29 14:39 power.log
$ rm linux_tutorial/report/power.log
rm: remove write-protected regular empty file ‘linux_tutorial/report/power.log’? y
rm: cannot remove ‘linux_tutorial/report/power.log’: Permission denied


Which permission categories are affected by + and - with r w x (without a u g o qualifier) depends on the umask value as well. It is usually 002, which means
+r -r +x -x without u g o qualifier affect all the three categories
+w -w without u g o qualifier affects only user and group categories
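
A small demo of this interaction, assuming GNU stat (demo_umask.txt and demo_umask_dir are scratch names; the umask is set explicitly so the result is predictable):

```shell
umask 0002                   # the usual default mentioned above
touch demo_umask.txt
stat -c '%a' demo_umask.txt  # file default 666 minus umask 002
# 664
mkdir demo_umask_dir
stat -c '%a' demo_umask_dir  # directory default 777 minus umask 002
# 775
rm -r demo_umask.txt demo_umask_dir
```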

Further Reading

Linux File Permissions


Linux Permissions Primer
unix.stackexchange - Why chmod +w filename not giving write permission to other
chmod Q&A on unix stackexchange
chmod Q&A on stackoverflow


Text Processing
sort
uniq
comm
cmp
diff
tr
sed
awk
perl
cut
paste
column
pr

The rich set of text processing commands is comprehensive and time saving. Knowing even their existence is enough
to avoid the need of writing yet another script (which takes time and effort plus debugging) – a trap which many
beginners fall into. An extensive list of text processing commands and examples can be found here

sort
sort lines of text files

As the name implies, this command is used to sort files. How about alphabetic sort and numeric sort? Possible. How
about sorting a particular column? Possible. Prioritized multiple sorting order? Possible. Randomize? Unique? Just
about any sorting need is catered for by this powerful command

Options

-R random sort
-r reverse the sort order
-o redirect sorted result to specified filename, very useful to sort a file inplace
-n sort numerically
-V version sort, aware of numbers within text

-h sort human readable numbers like 4K, 3M, etc


-k sort via key
-u sort uniquely
-b ignore leading white-spaces of a line while sorting
-t use SEP as the field separator instead of non-blank to blank transition

Examples

sort dir_list.txt display sorted file on standard output


sort -bn numbers.txt -o numbers.txt sort numbers.txt numerically (ignoring leading white-spaces) and overwrite
the file with sorted output
sort -R crypto_keys.txt -o crypto_keys_random.txt sort randomly and write to new file


shuf crypto_keys.txt -o crypto_keys_random.txt can also be used

du -sh * | sort -h sort file/directory sizes in current directory in human readable format

$ cat ip.txt
6.2 : 897 : bar
3.1 : 32 : foo
2.3 : 012 : bar
1.2 : 123 : xyz

$ # -k3,3 means sort based on 3rd column only


$ # for ex: to sort from 2nd column till end, use -k2
$ sort -t: -k3,3 ip.txt
2.3 : 012 : bar
6.2 : 897 : bar
3.1 : 32 : foo
1.2 : 123 : xyz

$ # -n option for numeric sort, check out what happens when -n is not used
$ sort -t: -k2,2n ip.txt
2.3 : 012 : bar
3.1 : 32 : foo
1.2 : 123 : xyz
6.2 : 897 : bar

$ # more than one rule can be specified to resolve same values


$ sort -t: -k3,3 -k1,1rn ip.txt
6.2 : 897 : bar
2.3 : 012 : bar
3.1 : 32 : foo
1.2 : 123 : xyz
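
The difference made by -V can be seen with names containing numbers (a minimal sketch):

```shell
# plain sort compares character by character, so 10 comes before 2
printf 'file10\nfile2\nfile1\n' | sort
# file1
# file10
# file2

# -V is aware of the numbers within the text
printf 'file10\nfile2\nfile1\n' | sort -V
# file1
# file2
# file10
```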

Further Reading

sort like a master


sort Q&A on unix stackexchange
sort on multiple columns using -k option

uniq
report or omit repeated lines

This command is more specific to recognizing duplicates. Usually requires a sorted input as the comparison is made on
adjacent lines only

Options

-d print only duplicate lines

-c prefix count to occurrences


-u print only unique lines

Examples

sort test_list.txt | uniq outputs lines of test_list.txt in sorted order with duplicate lines removed


uniq <(sort test_list.txt) same command using process substitution

sort -u test_list.txt equivalent command


uniq -d sorted_list.txt print only duplicate lines

uniq -cd sorted_list.txt print only duplicate lines and prefix the line with number of times it is repeated

uniq -u sorted_list.txt print only unique lines, repeated lines are ignored
uniq Q&A on unix stackexchange

$ echo -e 'Blue\nRed\nGreen\nBlue\nRed\nBlack\nRed' > colors.txt


$ uniq colors.txt
Blue
Red
Green
Blue
Red
Black
Red

$ echo -e 'Blue\nRed\nGreen\nBlue\nRed\nBlack\nRed' | sort > sorted_colors.txt


$ uniq sorted_colors.txt
Black
Blue
Green
Red

$ uniq -d sorted_colors.txt
Blue
Red

$ uniq -cd sorted_colors.txt


2 Blue
3 Red

$ uniq -u sorted_colors.txt
Black
Green
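
A common combination is uniq -c followed by a numeric sort to get a frequency table, most common first (the awk at the end just reorders count and value for readability):

```shell
printf 'a\nb\na\nc\na\nb\n' | sort | uniq -c | sort -nr | awk '{print $2, $1}'
# a 3
# b 2
# c 1
```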

comm
compare two sorted files line by line

Without any options, it prints output in three columns - lines unique to file1, lines unique to file2 and lines common to
both files

Options

-1 suppress lines unique to file1


-2 suppress lines unique to file2
-3 suppress lines common to both files

Examples

comm -23 sorted_file1.txt sorted_file2.txt print lines unique to sorted_file1.txt


comm -23 <(sort file1.txt) <(sort file2.txt) same command using process substitution, if sorted input
files are not available


comm -13 sorted_file1.txt sorted_file2.txt print lines unique to sorted_file2.txt

comm -12 sorted_file1.txt sorted_file2.txt print lines common to both files


comm Q&A on unix stackexchange

$ echo -e 'Brown\nRed\nPurple\nBlue\nTeal\nYellow' | sort > colors_1.txt


$ echo -e 'Red\nGreen\nBlue\nBlack\nWhite' | sort > colors_2.txt

$ # the input files viewed side by side


$ paste colors_1.txt colors_2.txt
Blue Black
Brown Blue
Purple Green
Red Red
Teal White
Yellow

examples

$ # 3 column output - unique to file1, file2 and common


$ comm colors_1.txt colors_2.txt
Black
Blue
Brown
Green
Purple
Red
Teal
White
Yellow

$ # suppress 1 and 2 column, gives only common lines


$ comm -12 colors_1.txt colors_2.txt
Blue
Red

$ # suppress 1 and 3 column, gives lines unique to file2


$ comm -13 colors_1.txt colors_2.txt
Black
Green
White

$ # suppress 2 and 3 column, gives lines unique to file1


$ comm -23 colors_1.txt colors_2.txt
Brown
Purple
Teal
Yellow

cmp
compare two files byte by byte

Useful to compare binary files. If the two files are the same, no output is displayed (exit status 0)
If there is a difference, it prints the location of the first difference - byte and line number (exit status 1)
The -s option suppresses the output, useful in scripts


$ cmp /bin/grep /bin/fgrep


/bin/grep /bin/fgrep differ: byte 25, line 1

More examples here
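
A sketch of using the exit status in a script (f1.txt and f2.txt are scratch files for the demo):

```shell
printf 'hello\n' > f1.txt
cp f1.txt f2.txt
cmp -s f1.txt f2.txt && echo 'files are identical'
# files are identical
printf 'world\n' > f2.txt
cmp -s f1.txt f2.txt || echo 'files differ'
# files differ
rm f1.txt f2.txt
```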

diff
compare files line by line

Useful to compare old and new versions of text files


All the differences are printed, which might not be desirable if files are too long

Options

-s convey message when two files are same

-y two column output


-i ignore case while comparing

-w ignore white-spaces
-r recursively compare files between the two directories specified
-q report if files differ, not the details of difference

Examples

diff -s test_list_mar2.txt test_list_mar3.txt compare two files


diff -s report.log bkp/mar10/ no need to specify the second filename if the names are the same
diff -qr report/ bkp/mar10/report/ recursively compare files between report and bkp/mar10/report directories,

filenames not matching are also specified in output


see this link for detailed analysis and corner cases
diff report/ bkp/mar10/report/ | grep -w '^diff' useful trick to get only names of mismatching files (provided
no mismatches contain the whole word diff at start of line)
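
A minimal run to see the default output format (old.txt and new.txt are scratch files for the demo):

```shell
printf 'apple\nbanana\ncherry\n' > old.txt
printf 'apple\nberry\ncherry\n' > new.txt
diff old.txt new.txt
# 2c2
# < banana
# ---
# > berry
rm old.txt new.txt
```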

Further Reading

diff Q&A on unix stackexchange


gvimdiff edit two, three or four versions of a file with Vim and show differences
GUI diff and merge tools

tr
translate or delete characters

Options

-d delete the specified characters


-c complement set of characters to be replaced

Examples

tr a-z A-Z < test_list.txt convert lowercase to uppercase


tr -d ._ < test_list.txt delete the dot and underscore characters


tr a-z n-za-m < test_list.txt > encrypted_test_list.txt Encrypt by replacing every lowercase alphabet with

13th alphabet after it


Same command on encrypted text will decrypt it
tr Q&A on unix stackexchange
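
A few quick runs of the commands above (the -cd combination deletes everything except the listed characters):

```shell
echo 'Hello World' | tr a-z A-Z
# HELLO WORLD

echo 'hello' | tr a-z n-za-m          # the ROT13 encryption mentioned above
# uryyb

echo 'ph: 123-456' | tr -cd '0-9\n'   # keep only digits and the newline
# 123456
```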

sed
stream editor for filtering and transforming text

Options

-n suppress automatic printing of pattern space

-i edit files inplace (makes backup if SUFFIX supplied)

-r use extended regular expressions


-e add the script to the commands to be executed

-f add the contents of script-file to the commands to be executed


for examples and details, refer to links given below

commands

We'll be seeing examples only for three commonly used commands

d Delete the pattern space

p Print out the pattern space


s search and replace

check out 'Often-Used Commands' and 'Less Frequently-Used Commands' sections in info sed for complete list
of commands

range

By default, sed acts on the entire input. This can be refined to a specific line number or a range defined by line
numbers, a search pattern or a mix of the two

n,m range between nth line to mth line, including n and m


i~j act on ith line and i+j, i+2j, i+3j, etc
1~2 means 1st, 3rd, 5th, 7th, etc lines i.e. odd numbered lines
5~3 means 5th, 8th, 11th, etc
n only nth line

$ only last line


/pattern/ lines matching pattern
n,/pattern/ nth line to line matching pattern
n,+x nth line and x lines after
/pattern/,m line matching pattern to mth line

/pattern/,+x line matching pattern and x lines after


/pattern1/,/pattern2/ line matching pattern1 to line matching pattern2
/pattern/I lines matching pattern, pattern is case insensitive
for more details, see section 'Selecting lines with sed' in info sed
see 'Regular Expressions' in grep command for extended regular expressions reference
also check out 'Overview of Regular Expression Syntax' section in info sed

Examples for selective deletion(d)


sed '/cat/d' story.txt delete every line containing cat

sed '/cat/!d' story.txt delete every line NOT containing cat


sed '$d' story.txt delete last line of the file

sed '2,5d' story.txt delete lines 2,3,4,5 of the file

sed '1,/test/d' dir_list.txt delete all lines from beginning of file to first occurrence of line containing test (the
matched line is also deleted)
sed '/test/,$d' dir_list.txt delete all lines from line containing test to end of file

Examples for selective printing(p)

sed -n '5p' story.txt print 5th line, -n option overrides default print behavior of sed

use sed '5q;d' story.txt on large files. Read more


sed -n '/cat/p' story.txt print every line containing the text cat

equivalent to sed '/cat/!d' story.txt


sed -n '4,8!p' story.txt print all lines except lines 4 to 8
man grep | sed -n '/^\s*exit status/I,/^$/p' extract exit status information of a command from manual

/^\s*exit status/I checks for line starting with 'exit status' in case insensitive way, white-space may be
present at start of line
/^$/ empty line

man ls | sed -n '/^\s*-F/,/^$/p' extract information on command option from manual


/^\s*-F/ line starting with option '-F', white-space may be present at start of line

Examples for search and replace(s)

sed -i 's/cat/dog/g' story.txt search and replace every occurrence of cat with dog in story.txt
sed -i.bkp 's/cat/dog/g' story.txt in addition to inplace file editing, create backup file story.txt.bkp, so that if a

mistake happens, original file can be restored


sed -i.bkp 's/cat/dog/g' *.txt to perform operation on all files ending with .txt in current directory
sed -i '5,10s/cat/dog/gI' story.txt search and replace every occurrence of cat (case insensitive due to

modifier I) with dog in story.txt only in line numbers 5 to 10


sed '/cat/ s/animal/mammal/g' story.txt replace animal with mammal in all lines containing cat

Since -i option is not used, output is displayed on standard output and story.txt is not changed
spacing between range and command is optional, sed '/cat/s/animal/mammal/g' story.txt can also be
used
sed -i -e 's/cat/dog/g' -e 's/lion/tiger/g' story.txt search and replace every occurrence of cat with dog

and lion with tiger


any number of -e option can be used
sed -i 's/cat/dog/g ; s/lion/tiger/g' story.txt alternative syntax, spacing around ; is optional
sed -r 's/(.*)/abc: \1 :xyz/' list.txt add prefix 'abc: ' and suffix ' :xyz' to every line of list.txt
sed -i -r "s/(.*)/$(basename $PWD)\/\1/" dir_list.txt add current directory name and forward-slash character

at the start of every line


Note the use of double quotes to perform command substitution
sed -i -r "s|.*|$HOME/\0|" dir_list.txt add home directory and forward-slash at the start of every line
Since the value of '$HOME' itself contains forward-slash characters, we cannot use / as delimiter
Any character other than backslash or newline can be used as delimiter, for example | # ^ see this link for
more info
\0 back-reference contains entire matched string

Example input file


$ cat mem_test.txt
mreg2 = 1200 # starting address
mreg4 = 2180 # ending address

dreg5 = get(mreg2) + get(mreg4)


print dreg5

replace all reg with register

$ sed 's/reg/register/g' mem_test.txt


mregister2 = 1200 # starting address
mregister4 = 2180 # ending address

dregister5 = get(mregister2) + get(mregister4)


print dregister5

change start and end address

$ sed 's/1200/1530/; s/2180/1870/' mem_test.txt


mreg2 = 1530 # starting address
mreg4 = 1870 # ending address

dreg5 = get(mreg2) + get(mreg4)


print dreg5

$ # to make changes only on mreg initializations, use


$ # sed '/mreg[0-9] *=/ s/1200/1530/; s/2180/1870/' mem_test.txt

Using bash variables

$ s_add='1760'; e_add='2500'
$ sed "s/1200/$s_add/; s/2180/$e_add/" mem_test.txt
mreg2 = 1760 # starting address
mreg4 = 2500 # ending address

dreg5 = get(mreg2) + get(mreg4)


print dreg5

split inline commented code to comment + code

$ sed -E 's/^([^#]+)(#.*)/\2\n\1/' mem_test.txt


# starting address
mreg2 = 1200
# ending address
mreg4 = 2180

dreg5 = get(mreg2) + get(mreg4)


print dreg5

range of lines matching pattern


$ seq 20 | sed -n '/3/,/5/p'


3
4
5
13
14
15

inplace editing

$ sed -i -E 's/([md]r)eg/\1/g' mem_test.txt


$ cat mem_test.txt
mr2 = 1200 # starting address
mr4 = 2180 # ending address

dr5 = get(mr2) + get(mr4)


print dr5

$ # more than one input files can be given


$ # use glob pattern if files share commonality, ex: *.txt

Further Reading

sed basics
sed detailed tutorial
sed-book
cheat sheet
sed examples
sed one-liners explained
common search and replace examples with sed
sed Q&A on unix stackexchange
sed Q&A on stackoverflow

awk
pattern scanning and text processing language

awk derives its name from authors Alfred Aho, Peter Weinberger and Brian Kernighan.

syntax

awk 'BEGIN {initialize} condition1 {stmts} condition2 {stmts}... END {finish}'


BEGIN {initialize} used to initialize variables (could be user defined or awk variables or both), executed

once - optional block


condition1 {stmts} condition2 {stmts}... action performed for every line of input, condition is optional,
more than one block {} can be used with/without condition
END {finish} perform action once at end of program - optional block
commands can be written in a file and passed using the -f option instead of writing it all on command line
for examples and details, refer to links given below


Example input file

$ cat test.txt
abc : 123 : xyz
3 : 32 : foo
-2.3 : bar : bar

Just printing something, no input

$ awk 'BEGIN{print "Hello!\nTesting awk one-liner"}'


Hello!
Testing awk one-liner

search and replace


when the {stmts} portion of condition {stmts} is not specified, by default {print $0} is executed if the
condition evaluates to true
1 is a generally used awk idiom to print contents of $0 after performing some processing
print statement without argument will print the content of $0

$ # sub will replace only first occurrence


$ # third argument to sub specifies variable to change, defaults to $0
$ awk '{sub("3", "%")} 1' test.txt
abc : 12% : xyz
% : 32 : foo
-2.% : bar : bar

$ # gsub will replace all occurrences


$ awk '{gsub("3", "%")} 1' test.txt
abc : 12% : xyz
% : %2 : foo
-2.% : bar : bar

$ # add a condition to restrict processing only to those records


$ awk '/foo/{gsub("3", "%")} 1' test.txt
abc : 123 : xyz
% : %2 : foo
-2.3 : bar : bar

$ # using shell variables


$ r="@"
$ awk -v r_str="$r" '{sub("3", r_str)} 1' test.txt
abc : 12@ : xyz
@ : 32 : foo
-2.@ : bar : bar

$ # bash environment variables like PWD, HOME is also accessible via ENVIRON
$ s="%" awk '{sub("3", ENVIRON["s"])} 1' test.txt
abc : 12% : xyz
% : 32 : foo
-2.% : bar : bar

filtering content


$ # regex pattern, by default tested against $0


$ awk '/a/' test.txt
abc : 123 : xyz
-2.3 : bar : bar

$ # use ! to invert condition


$ awk '!/abc/' test.txt
3 : 32 : foo
-2.3 : bar : bar

$ seq 30 | awk 'END{print}'


30

$ # generic, length(var) - default is $0


$ seq 8 13 | awk 'length==1'
8
9

selecting based on line numbers


NR is record number

$ seq 123 135 | awk 'NR==7'


129

$ seq 123 135 | awk 'NR>=3 && NR<=5'


125
126
127

$ seq 5 | awk 'NR>=3'


3
4
5

$ # for large input, use exit to avoid unnecessary record processing


$ seq 14323 14563435 | awk 'NR==234{print; exit}'
14556

selecting based on start and end condition


for following examples
numbers 1 to 20 is input
regex pattern /4/ is start condition
regex pattern /6/ is end condition
f is idiomatically used to represent a flag variable


$ # records between start and end


$ seq 20 | awk '/4/{f=1; next} /6/{f=0} f'
5
15

$ # records between start and end and also includes start


$ seq 20 | awk '/4/{f=1} /6/{f=0} f'
4
5
14
15

$ # records between start and end and also includes end


$ seq 20 | awk '/4/{f=1; next} f; /6/{f=0}'
5
6
15
16

$ # records from start to end


$ seq 20 | awk '/4/{f=1} f{print} /6/{f=0}'
4
5
6
14
15
16

$ # records excluding start to end


$ seq 10 | awk '/4/{f=1} !f; /6/{f=0}'
1
2
3
7
8
9
10

column manipulations
by default, one or more consecutive spaces/tabs are considered as field separators


$ echo -e "1 3 4\na b c"


1 3 4
a b c

$ # second column
$ echo -e "1 3 4\na b c" | awk '{print $2}'
3
b

$ # last column
$ echo -e "1 3 4\na b c" | awk '{print $NF}'
4
c

$ # default output field separator is single space character


$ echo -e "1 3 4\na b c" | awk '{print $1, $3}'
1 4
a c

$ # condition for specific field


$ echo -e "1 3 4\na b c" | awk '$2 ~ /[0-9]/'
1 3 4

specifying a different input/output field separator


can be string alone or regex, multiple separators can be specified using | in regex pattern

$ awk -F' *: *' '$1 == "3"' test.txt


3 : 32 : foo

$ awk -F' *: *' '{print $1 "," $2}' test.txt


abc,123
3,32
-2.3,bar

$ awk -F' *: *' -v OFS="::" '{print $1, $2}' test.txt


abc::123
3::32
-2.3::bar

$ awk -F: -v OFS="\t" '{print $1 OFS $2}' test.txt


abc 123
3 32
-2.3 bar
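
Field separators also help with quick numeric summaries. A sketch that sums the numeric values in the 2nd column of the sample file (test_sum.txt is a scratch copy; the $2 + 0 == $2 test skips non-numeric fields like bar):

```shell
printf 'abc  : 123 : xyz\n3    : 32  : foo\n-2.3 : bar : bar\n' > test_sum.txt
awk -F' *: *' '$2 + 0 == $2 {sum += $2} END{print sum}' test_sum.txt
# 155
rm test_sum.txt
```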

dealing with duplicates, line/field wise


$ cat duplicates.txt
abc 123 ijk
foo 567 xyz
abc 123 ijk
bar 090 pqr
tst 567 zzz

$ # whole line
$ awk '!seen[$0]++' duplicates.txt
abc 123 ijk
foo 567 xyz
bar 090 pqr
tst 567 zzz

$ # particular column
$ awk '!seen[$2]++' duplicates.txt
abc 123 ijk
foo 567 xyz
bar 090 pqr

inplace editing

$ awk -i inplace '{print NR ") " $0}' test.txt


$ cat test.txt
1) abc : 123 : xyz
2) 3 : 32 : foo
3) -2.3 : bar : bar

Further Reading

awk basics
Gawk: Effective AWK Programming
awk detailed tutorial
basic tutorials for grep, awk, sed
awk one-liners explained
awk book
awk cheat-sheet for awk variables, statements, functions, etc
awk examples
awk Q&A on unix stackexchange
awk Q&A on stackoverflow

perl
The Perl 5 language interpreter

Larry Wall wrote Perl as a general purpose scripting language, borrowing features from C, shell scripting, awk,
sed, grep, cut, sort etc

Reference tables are given below for constructs frequently used with perl one-liners. Resource links are given at the
end for further reading.


Descriptions adapted from perldoc - command switches

Option Description

-e execute perl code

-n iterate over input files in a loop, lines are NOT printed by default

-p iterate over input files in a loop, lines are printed by default

-l chomp input line, $\ gets value of $/ if no argument given

-a autosplit input lines on space, implicitly sets -n for Perl version 5.20.0 and above

-F specifies the pattern to split input lines, implicitly sets -a and -n for Perl version 5.20.0 and above

-i edit files inplace, if extension provided make a backup copy

-0777 slurp entire file as single string, not advisable for large input files

Descriptions adapted from perldoc - Special Variables

Variable Description

$_ The default input and pattern-searching space

$. Current line number

$/ input record separator, newline by default

$\ output record separator, empty string by default

@F contains the fields of each line read, applicable with -a or -F option

%ENV contains current environment variables

$ARGV contains the name of the current file

Function Description

length Returns the length in characters of the value of EXPR. If EXPR is omitted, returns the length of $_

eof Returns 1 if the next read on FILEHANDLE will return end of file

Simple Perl program

$ perl -e 'print "Hello!\nTesting Perl one-liner\n"'


Hello!
Testing Perl one-liner

Example input file


$ cat test.txt
abc : 123 : xyz
3 : 32 : foo
-2.3 : bar : bar

Search and replace

$ perl -pe 's/3/%/' test.txt


abc : 12% : xyz
% : 32 : foo
-2.% : bar : bar

$ # use g flag to replace all occurrences, not just first match in line
$ perl -pe 's/3/%/g' test.txt
abc : 12% : xyz
% : %2 : foo
-2.% : bar : bar

$ # conditional replacement
$ perl -pe 's/3/@/g if /foo/' test.txt
abc : 123 : xyz
@ : @2 : foo
-2.3 : bar : bar

$ # using shell variables


$ r="@"
$ perl -pe "s/3/$r/" test.txt
abc : 12@ : xyz
@ : 32 : foo
-2.@ : bar : bar

$ # preferred approach is to use ENV hash variable


$ export s="%"
$ perl -pe 's/3/$ENV{s}/' test.txt
abc : 12% : xyz
% : 32 : foo
-2.% : bar : bar

Search and replace special characters

The \Q and q() constructs are helpful to nullify regex meta characters


$ # if not properly escaped or quoted, it can lead to errors


$ echo '*.^[}' | perl -pe 's/*.^[}/abc/'
Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE .^[}/ at -e line 1.

$ echo '*.^[}' | perl -pe 's/\*\.\^\[}/abc/'


abc

$ echo '*.^[}' | perl -pe 's/\Q*.^[}/abc/'


abc

$ echo '*.^[}' | perl -pe 's/\Q*.^[}/\$abc\$/'


$abc$

$ echo '*.^[}' | perl -pe 's/\Q*.^[}/q($abc$)/e'


$abc$

Print lines based on line number or pattern

$ perl -ne 'print if /a/' test.txt


abc : 123 : xyz
-2.3 : bar : bar

$ perl -ne 'print if !/abc/' test.txt


3 : 32 : foo
-2.3 : bar : bar

$ seq 123 135 | perl -ne 'print if $. == 7'


129

$ seq 1 30 | perl -ne 'print if eof'


30

$ # Use exit to save time on large input files


$ seq 14323 14563435 | perl -ne 'if($. == 234){print; exit}'
14556

$ # length() can also be used instead of length $_


$ seq 8 13 | perl -lne 'print if length $_ == 1'
8
9

Print range of lines based on line number or pattern


$ seq 123 135 | perl -ne 'print if $. >= 3 && $. <= 5'
125
126
127

$ # $. is default variable compared against when using ..


$ seq 123 135 | perl -ne 'print if 3..5'
125
126
127

$ # can use many alternatives, eof looks more readable


$ seq 5 | perl -ne 'print if 3..eof'
3
4
5

$ # matching regex specified by /pattern/ is checked against $_


$ seq 5 | perl -ne 'print if 3../4/'
3
4

$ seq 1 30 | perl -ne 'print if /4/../6/'


4
5
6
14
15
16
24
25
26

$ seq 2 8 | perl -ne 'print if !(/4/../6/)'


2
3
7
8

.. vs ...

$ echo -e '10\n11\n10' | perl -ne 'print if /10/../10/'


10
10

$ echo -e '10\n11\n10' | perl -ne 'print if /10/.../10/'


10
11
10

Column manipulations


$ echo -e "1 3 4\na b c" | perl -nale 'print $F[1]'


3
b

$ echo -e "1,3,4,8\na,b,c,d" | perl -F, -lane 'print $F[$#F]'


8
d

$ perl -F: -lane 'print "$F[0] $F[2]"' test.txt


abc xyz
3 foo
-2.3 bar

$ perl -F: -lane '$sum+=$F[1]; END{print $sum}' test.txt


155

$ perl -F: -lane '$F[2] =~ s/\w(?=\w)/$&,/g; print join ":", @F' test.txt
abc : 123 : x,y,z
3 : 32 : f,o,o
-2.3 : bar : b,a,r

$ perl -F'/:\s*[a-z]+/i' -lane 'print $F[0]' test.txt


abc : 123
3 : 32
-2.3

$ perl -F'\s*:\s*' -lane 'print join ",", grep {/[a-z]/i} @F' test.txt
abc,xyz
foo
bar,bar

$ perl -F: -ane 'print if (grep {/\d/} @F) < 2' test.txt
abc : 123 : xyz
-2.3 : bar : bar

Dealing with duplicates

$ cat duplicates.txt
abc 123 ijk
foo 567 xyz
abc 123 ijk
bar 090 pqr
tst 567 zzz

$ # whole line
$ perl -ne 'print if !$seen{$_}++' duplicates.txt
abc 123 ijk
foo 567 xyz
bar 090 pqr
tst 567 zzz

$ # particular column
$ perl -ane 'print if !$seen{$F[1]}++' duplicates.txt
abc 123 ijk
foo 567 xyz
bar 090 pqr


Multiline processing

$ # save previous lines to make it easier for multiline matching


$ perl -ne 'print if /3/ && $p =~ /abc/; $p = $_' test.txt
3 : 32 : foo

$ perl -ne 'print "$p$_" if /3/ && $p =~ /abc/; $p = $_' test.txt


abc : 123 : xyz
3 : 32 : foo

$ # with multiline matching, -0777 slurping not advisable for very large files
$ perl -0777 -ne 'print $1 if /.*abc.*\n(.*3.*\n)/' test.txt
3 : 32 : foo
$ perl -0777 -ne 'print $1 if /(.*abc.*\n.*3.*\n)/' test.txt
abc : 123 : xyz
3 : 32 : foo

$ # use s flag to allow .* to match across lines


$ perl -0777 -pe 's/(.*abc.*32)/ABC/s' test.txt
ABC : foo
-2.3 : bar : bar

$ # use m flag if ^$ anchors are needed to match individual lines


$ perl -0777 -pe 's/(.*abc.*3)/ABC/s' test.txt
ABC : bar : bar
$ perl -0777 -pe 's/(.*abc.*^3)/ABC/sm' test.txt
ABC : 32 : foo
-2.3 : bar : bar

$ # print multiple lines after matching line


$ perl -ne 'if(/abc/){ print; foreach (1..2){$n = <>; print $n} }' test.txt
abc : 123 : xyz
3 : 32 : foo
-2.3 : bar : bar

Using modules


$ echo 'a,b,a,c,d,1,d,c,2,3,1,b' | perl -MList::MoreUtils=uniq -F, -lane 'print join ",",uniq(@F)'


a,b,c,d,1,2,3

$ base64 test.txt
YWJjICA6IDEyMyA6IHh5egozICAgIDogMzIgIDogZm9vCi0yLjMgOiBiYXIgOiBiYXIK
$ base64 test.txt | base64 -d
abc : 123 : xyz
3 : 32 : foo
-2.3 : bar : bar
$ base64 test.txt | perl -MMIME::Base64 -ne 'print decode_base64($_)'
abc : 123 : xyz
3 : 32 : foo
-2.3 : bar : bar

$ perl -MList::MoreUtils=indexes -nale '@i = indexes { /[a-z]/i } @F if $. == 1; print join ",", @F[@i]' test.txt
abc,xyz
3,foo
-2.3,bar

In place editing

$ perl -i -pe 's/\d/*/g' test.txt


$ cat test.txt
abc : *** : xyz
* : ** : foo
-*.* : bar : bar

$ perl -i.bak -pe 's/\*/^/g' test.txt


$ cat test.txt
abc : ^^^ : xyz
^ : ^^ : foo
-^.^ : bar : bar
$ cat test.txt.bak
abc : *** : xyz
* : ** : foo
-*.* : bar : bar

Further Reading

Perl Introduction - Introductory course for Perl 5 through examples


Perl curated resources
Handy Perl regular expressions
What does this regex mean?
Perl one-liners
Perl command line switches
Env

cut


remove sections from each line of files

For column operations with well-defined delimiters, the cut command is handy

Examples

ls -l | cut -d' ' -f1 first column of ls -l


-d option specifies delimiter character, in this case it is single space character (Default delimiter is TAB

character)
-f option specifies which fields to print separated by commas, in this case field 1

cut -d':' -f1 /etc/passwd prints first column of /etc/passwd file

cut -d':' -f1,7 /etc/passwd prints 1st and 7th column of /etc/passwd file with : character in between
cut -d':' --output-delimiter=' ' -f1,7 /etc/passwd use space as delimiter between 1st and 7th column while

printing
cut Q&A on unix stackexchange
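
A quick self-contained run of the options above:

```shell
printf 'one:two:three\nfoo:bar:baz\n' | cut -d: -f1,3
# one:three
# foo:baz

printf 'one:two:three\n' | cut -d: --output-delimiter=' ' -f1,3
# one three
```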

paste
merge lines of files

Examples

paste list1.txt list2.txt list3.txt > combined_list.txt combines the three files column-wise into single file,

the entries separated by TAB character


paste -d':' list1.txt list2.txt list3.txt > combined_list.txt the entries are separated by : character instead

of TAB
See pr command for multiple character delimiter
paste Q&A on unix stackexchange

$ # joining multiple files


$ paste -d, <(seq 5) <(seq 6 10)
1,6
2,7
3,8
4,9
5,10

$ paste -d, <(seq 3) <(seq 4 6) <(seq 7 10)


1,4,7
2,5,8
3,6,9
,,10

Single column to multiple columns


$ seq 5 | paste - -
1 2
3 4
5

$ # specifying different output delimiter, default is tab


$ seq 5 | paste -d, - -
1,2
3,4
5,

$ # if number of columns to specify is large, use the printf trick


$ seq 5 | paste $(printf -- "- %.s" {1..3})
1 2 3
4 5

Combine all lines to single line

$ seq 10 | paste -sd,


1,2,3,4,5,6,7,8,9,10

$ # for multiple character delimiter, perl can be used


$ seq 10 | perl -pe 's/\n/ : / if(!eof)'
1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : 10

column
columnate lists

$ cat dishes.txt
North alootikki baati khichdi makkiroti poha
South appam bisibelebath dosa koottu sevai
West dhokla khakhra modak shiro vadapav
East handoguri litti momo rosgulla shondesh

$ column -t dishes.txt
North alootikki baati khichdi makkiroti poha
South appam bisibelebath dosa koottu sevai
West dhokla khakhra modak shiro vadapav
East handoguri litti momo rosgulla shondesh

More examples here

pr
convert text files for printing

$ pr sample.txt

2016-05-29 11:00 sample.txt Page 1

This is an example of adding text to a new file using cat command.


Press Ctrl+d on a newline to save and quit.
Adding a line of text at end of file

Options include adding headers, footers and page numbers, double spacing a file, combining multiple files column-wise, etc.
More examples here

$ # single column to multiple column, split vertically


$ # for example, in command below, output of seq is split into two
$ seq 5 | pr -2t
1 4
2 5
3

$ # different output delimiter can be used by passing string to -s option


$ seq 5 | pr -2ts' '
1 4
2 5
3

$ seq 15 | pr -5ts,
1,4,7,10,13
2,5,8,11,14
3,6,9,12,15

Use -a option to split across

$ seq 5 | pr -2ats' : '


1 : 2
3 : 4
5

$ seq 15 | pr -5ats,
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15

$ # use $ to expand characters denoted by escape characters like \t for tab


$ seq 5 | pr -3ts$'\t'
1 3 5
2 4

$ # or leave the argument to -s empty as tab is default


$ seq 5 | pr -3ts
1 3 5
2 4

The default PAGE_WIDTH is 72


The formula (col-1)*len(delimiter) + col seems to work in determining minimum PAGE_WIDTH required for multiple column output


The -J option will help in turning off line truncation

$ seq 74 | pr -36ats,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36
37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72
73,74
$ seq 74 | pr -37ats,
pr: page width too narrow

$ # (37-1)*1 + 37 = 73
$ seq 74 | pr -Jw 73 -37ats,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37
38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,
74

$ # (3-1)*4 + 3 = 11
$ seq 6 | pr -Jw 10 -3ats'::::'
pr: page width too narrow
$ seq 6 | pr -Jw 11 -3ats'::::'
1::::2::::3
4::::5::::6

Use -m option to combine multiple files in parallel

$ pr -mts', ' <(seq 3) <(seq 4 6) <(seq 7 9)


1, 4, 7
2, 5, 8
3, 6, 9

We can use a combination of different commands for complicated operations. For example, transposing a table

$ tr ' ' '\n' < dishes.txt | pr -$(wc -l < dishes.txt)t


North South West East
alootikki appam dhokla handoguri
baati bisibelebath khakhra litti
khichdi dosa modak momo
makkiroti koottu shiro rosgulla
poha sevai vadapav shondesh

Notice how pr neatly arranges the columns. If spacing is too much, we can use column

$ tr ' ' '\n' < dishes.txt | pr -$(wc -l < dishes.txt)ts | column -t


North South West East
alootikki appam dhokla handoguri
baati bisibelebath khakhra litti
khichdi dosa modak momo
makkiroti koottu shiro rosgulla
poha sevai vadapav shondesh

Shell
What is Shell?
Popular Shells
Wildcards
Redirection
Process Control
Running jobs in background

What is Shell?
Quoting from wikipedia

A Unix shell is a command-line interpreter or shell that provides a traditional Unix-like command line user
interface. Users direct the operation of the computer by entering commands as text for a command line interpreter
to execute, or by creating text scripts of one or more such commands. Users typically interact with a Unix shell
using a terminal emulator, however, direct operation via serial hardware connections, or networking session, are
common for server systems. All Unix shells provide filename wildcarding, piping, here documents, command
substitution, variables and control structures for condition-testing and iteration

Interprets user commands


from terminal, from a file or as a shell script
expand wildcards, command/variable substitution
Command history, command completion and command editing
Managing processes
Shell variables to customize the environment
Difference between shell, tty and console

Popular Shells
Like any indispensable software, the shell has undergone transformation from the days of the basic sh shell used in 1970s Unix. While bash is the default shell in most distros and the most commonly used, powerful and feature-rich shells are still being developed and released

sh bourne shell (light weight Linux distros might come with sh shell only)
bash bourne again shell
csh C shell
tcsh tenex C shell
ksh Korn shell

zsh Z shell (bourne shell with improvements, including features from bash, tcsh, ksh)
cat /etc/shells displays list of login shells available in the current Linux distro
echo $SHELL path of current user's login shell
The material presented here is primarily for interactive shell
difference between login shell and non-login shell

Further Reading

Comparison of command shells


Features and differences between various shells
syntax comparison on different shells with examples
bash shell has also been ported on Windows platform

git bash
Cygwin
MinGW
Linux Subsystem for Windows
Shell, choosing shell and changing default shells

Wildcards
It is easy to specify complete filenames as command arguments when they are few in number. But suppose one has to delete hundreds of log files spread across different sub-directories? Wildcards, also known as globbing patterns, help in such cases, provided the filenames have a commonality to exploit. We have already seen regular expressions used in commands like grep and sed . Shell wildcards are similar but have fundamental and syntactical differences

* match any character, 0 or more times

as a special case, * won't match the starting . of hidden files and it has to be explicitly specified
? match any character exactly 1 time

[aeiou] match any vowel character


[!aeiou] exclude vowel characters, i.e match a consonant
[!0-9] match any character except digits

[a-z] match any lower case alphabets


[0-9a-fA-F] match any hexadecimal character

{word1,word2} match either of the specified words


words can themselves be made of wildcards

Examples

ls txt* list all files starting with txt


ls *txt* list all files containing txt anywhere in its name
ls *txt list all files ending with txt in the current directory
ls -d .* list only hidden files and directories

rm *.??? remove any file ending with . character followed by exactly three characters
ls bkp/201[0-5] list files in bkp directory matching 2010/2011/2012/2013/2014/2015
echo *txt for dry runs, use echo command to see how the wildcard expands
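To see the character class wildcards in action without touching real data, a scratch directory can be used (a hypothetical setup, purely for demonstration):

```bash
# set up a throwaway directory with a few sample files
mkdir -p /tmp/glob_demo && cd /tmp/glob_demo
touch apple.txt egg.log ink.txt 1.txt zebra.txt

echo [aeiou]*      # names starting with a vowel
echo [!aeiou]*     # names starting with anything but a vowel
echo ?.txt         # exactly one character followed by .txt
```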

Brace Expansion

ls *{txt,log} list all files ending with txt or log in the current directory
cp ~/projects/adders/verilog/{half_,full_}adder.v . copy half_adder.v and full_adder.v to current directory
mv story.txt{,.bkp} rename story.txt as story.txt.bkp
cp story.txt{,.bkp} to create bkp file as well retain original
mv story.txt{.bkp,} rename story.txt.bkp as story.txt

mv story{,_old}.txt rename story.txt as story_old.txt


touch file{1..4}.txt same as touch file1.txt file2.txt file3.txt file4.txt

touch file_{x..z}.txt same as touch file_x.txt file_y.txt file_z.txt

rm file{1..4}.txt same as rm file1.txt file2.txt file3.txt file4.txt


echo story.txt{,.bkp} displays the expanded version 'story.txt story.txt.bkp', useful to dry run before executing the actual command
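Since brace expansion is performed by the shell itself without consulting the filesystem, echo makes all of the above easy to verify:

```bash
echo file{1..4}.txt     # file1.txt file2.txt file3.txt file4.txt
echo file_{x..z}.txt    # file_x.txt file_y.txt file_z.txt
echo story.txt{,.bkp}   # story.txt story.txt.bkp
echo {a,b}{1,2}         # a1 a2 b1 b2 - braces can also be combined
```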

Extended globs

From info bash , where pattern-list is a list of one or more patterns separated by a |

?(pattern-list) Matches zero or one occurrence of the given patterns

*(pattern-list) Matches zero or more occurrences of the given patterns

+(pattern-list) Matches one or more occurrences of the given patterns


@(pattern-list) Matches one of the given patterns

!(pattern-list) Matches anything except one of the given patterns

To check if extglob is enabled or to enable/disable:

$ shopt extglob
extglob on

$ # unset extglob
$ shopt -u extglob
$ shopt extglob
extglob off

$ # set extglob
$ shopt -s extglob
$ shopt extglob
extglob on

Examples

$ ls
123.txt main.c math.h power.log

$ echo +([0-9]).txt
123.txt

$ ls @(*.c|*.h)
main.c math.h

$ ls !(*.txt)
main.c math.h power.log
$ ls !(*.c|*.h)
123.txt power.log

Recursively search current directory and its sub-folders

Set globstar and prefix pattern with **/ to search recursively

$ find -name '*.txt'


./song_list.txt
./bar/f1.txt
./bar/baz/f2.txt

$ shopt -s globstar
$ ls **/*.txt
bar/baz/f2.txt bar/f1.txt song_list.txt

Further Reading

Glob
See topic 'Pathname Expansion' in info bash
brace expansion wiki
when to use brace expansion

Redirection
By default all results of a command are displayed on the terminal, which is the default destination for standard output.
But often, one might want to save or discard them or send as input to another command. Similarly, inputs to a command
can be given from files or from another command. Errors are special outputs generated on wrong usage of a command or command name

< or 0< is stdin filehandle


> or 1> is stdout filehandle

2> is stderr filehandle

Redirecting output of a command to a file

grep -i 'error' report/*.log > error.log create new file, overwrite if file already exists
grep -i 'fail' test_results_20mar2015.log >> all_fail_tests.log creates new file if file doesn’t exist, otherwise
append the result to existing file
./script.sh > /dev/null redirect output to a special file /dev/null that just discards everything written to it,
whatever may be the size
explicitly override the setting of noclobber with the >| redirection operator

Redirecting output of a command to another command

ls -q | wc -l the 'pipe' operator redirects stdout of ls command to wc command as stdin

du -sh * | sort -h calculate size of files/folders, display size in human-readable format which is then sorted
./script.sh | tee output.log the tee command displays standard output on terminal as well as writes to file

Combining output of several commands

(head -5 ~/.vimrc ; tail -5 ~/.vimrc) > vimrc_snippet.txt multiple commands can be grouped in () and

redirected as if single command output


commands grouped in () gets executed in a subshell environment
{ head -5 ~/.vimrc ; tail -5 ~/.vimrc ; } > vimrc_snippet.txt gets executed in current shell context
Command grouping

Command substitution

sed -i "s|^|$(basename $PWD)/|" dir_list.txt add the current directory name and a forward-slash character at the start of every line


Note the use of double quotes to perform command substitution
file_count=$(ls -q | wc -l) save command output to a variable

Command Substitution

Process Substitution

comm -23 <(sort file1.txt) <(sort file2.txt) allows creating named pipes on the fly, effectively avoiding the need to create temporary files
Process Substitution
input and output process substitution examples
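A self-contained sketch of the same idea, using printf in place of files:

```bash
# lines present only in the first input (both inputs must be sorted)
comm -23 <(printf 'a\nb\nc\n') <(printf 'b\nc\nd\n')
# prints: a
```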

Redirecting error

xyz 2> cmderror.log assuming a non-existent command xyz , it would give an error which gets redirected to the specified file
./script.sh 2> /dev/null discard error messages

Combining stdout and stderr

Assume that the file 'report.log' exists containing the text 'test' and non-existing file 'xyz.txt'

Bash version 4+:

grep 'test' report.log xyz.txt &> cmb_out.txt redirect both stdout and stderr to a file
grep 'test' report.log xyz.txt &>> cmb_out.txt append both stdout and stderr to a file

ls report.log xyz.txt |& grep '[st]' redirect both stdout and stderr as stdin

Earlier versions:

grep 'test' report.log xyz.txt > cmb_out.txt 2>&1 redirect both stdout and stderr to a file
grep 'test' report.log xyz.txt 2> cmb_out.txt 1>&2 redirect both stdout and stderr to a file

grep 'test' report.log xyz.txt >> cmb_out.txt 2>&1 append both stdout and stderr to a file

ls report.log xyz.txt 2>&1 | grep '[st]' redirect both stdout and stderr as stdin

Redirecting input

tr a-z A-Z < test_list.txt convert lowercase to uppercase, tr command only reads from stdin and doesn't
have the ability to read from a file directly
wc -l < report.log useful to avoid filename in wc output
< report.log grep 'test' useful to easily modify previous command for different command options, search
patterns, etc
grep 'test' report.log | diff - test_list.txt output of grep as one of the input file for diff command
difference between << , <<< and < <
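As a quick taste of the operators linked above, << starts a heredoc and <<< a here-string:

```bash
# here-string: feed a single string as stdin
tr 'a-z' 'A-Z' <<< 'hello world'    # HELLO WORLD

# heredoc: feed multiple lines as stdin until the EOF marker
wc -l <<EOF
line one
line two
EOF
```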

Using xargs to redirect output of command as input to another command

grep -rlZ 'pattern' | xargs -0 sed -i 's/pattern/replace/' search and replace only those files matching the
required pattern (Note: search pattern could be different for grep and sed as per requirement)
the -Z option would print filename separated by ASCII NUL character which is in turn understood by xargs
via the -0 option. This ensures the command won't break on filenames containing characters like spaces,
newlines, etc
When to use xargs
has a good example for parallel processing jobs with xargs
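A minimal sketch of the NUL-separated handoff, using a throwaway directory and hypothetical file names:

```bash
mkdir -p /tmp/xargs_demo && cd /tmp/xargs_demo
printf 'old text\n' > 'name with spaces.txt'

# -Z and -0 keep the filename intact despite the embedded spaces
grep -rlZ 'old' . | xargs -0 sed -i 's/old/new/'
cat 'name with spaces.txt'    # new text
```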

Further Reading

See topic 'REDIRECTION' in info bash


stdin, stdout and stderr
Illustrated Redirection Tutorial
short introduction
redirect a stream to another file descriptor using >&
difference between 2>&1 >foo and >foo 2>&1
redirect and append both stdout and stderr to a file
Redirections explained

Process Control
Process is any running program

Program is a set of instructions written to perform a task


Daemon, simply put, is a background process
Job in shell parlance is a process that is not a daemon, i.e. an interactive program with user control

ps

report a snapshot of the current processes

First column indicates the process id (PID)


-e select all processes
-f full-format listing

ps Q&A on unix stackexchange


ps Q&A on stackoverflow
ps tutorial

kill

send a signal to a process

kill -l list signal names


kill PID send default 'SIGTERM' signal to a process (specified by the PID) asking the process to terminate
gracefully shutdown processes
why kill -9 should be avoided
kill Q&A on unix stackexchange
kill Q&A on stackoverflow
See also pkill and killall commands

top

display Linux processes

Press M (uppercase) to sort the processes by memory usage


Press q to quit the command
Press W (uppercase) to write your favorite view of top command to ~/.toprc file and quit immediately, so that
next time you use top command, it will display in the format you like
htop is better/prettier alternative to top
install instructions here

top Q&A on unix stackexchange

free

Display amount of free and used memory in the system

free -h shows amount of free and used memory in human readable format

pgrep

look up or signal processes based on name and other attributes

pgrep -l foobar search for process names containing foobar, displays PID and full process name

pgrep -x gvim search for processes exactly named gvim

pgrep -c chrom total number of processes matching chrom

pgrep -nl chrom most recently started process matching chrom

Further Reading

Process Management
Managing Linux Processes
Linux Processes
what is a daemon
Job Control commands
Useful examples for top command

Running jobs in background


Often commands and scripts can take more than a few minutes to complete, but the user might still need to continue using
the shell. Opening a new shell might not serve the purpose if local shell variable settings are needed too. Shell provides
the & operator to push the command (or script) execution to the background and return the command prompt to the user.
However, the standard outputs and errors would still get displayed on the terminal unless appropriately redirected

tkdiff result_v1.log result_v2.log & tkdiff, if installed, shows differences between two files in a GUI. If & is
not used, the program would hog the command prompt

Pushing current job to background

What if you forgot to add & and killing the process might corrupt a lot of things?

Ctrl+z suspends the current running job


bg push the recently suspended job to background
Continue using shell
fg bring the recently pushed background job to foreground
jobs built-in command - Display status of jobs

nohup command - run a command immune to hangups, with output to a non-tty


job control

Shell Customization
Variables
Config files
Emacs mode Readline shortcuts

Variables
Quoting from article on BASH Environment & Shell Variables

Variables provide a simple way to share configuration settings between multiple applications and processes in
Linux, and are mainly set in either a terminal or shell configuration file upon start up.

They are either environmental or shell variables by convention. Both of which are usually defined using all capital
letters. This helps users distinguish environmental variables from within other contexts.

“Environment variables” have been defined for use in the current shell and will be inherited by any child shells or
processes spawned as a result of the parent. Environmental variables can also be used to pass information into
processes that are spawned by the shell

“Shell variables” are contained exclusively within the shell in which they were set or defined. They are mostly used
to keep track of ephemeral temporal data, like the current working directory in a session

Some example Variables:

HOME The home directory of the current user; the default argument for the cd builtin command. The value of this
variable is also used when performing tilde expansion
SHELL The full pathname to the shell is kept in this environment variable. If it is not set when the shell starts, bash
assigns to it the full pathname of the current user's login shell
PATH The search path for commands. It is a colon-separated list of directories in which the shell looks for
commands. A common value is /usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin
PWD and OLDPWD full path of current working directory and previous working directory
HISTFILESIZE, HISTSIZE, HISTCONTROL, HISTFILE command history related variables
PS1 The value of this parameter is expanded and used as the primary prompt string. The default value is \s-
\v\$
printenv command to display names and values of Environment variables

set builtin command to display the names and values of all the variables when used without options/arguments
echo "$HOME" use $ when Variable value is needed

User defined variables

User can define variables as well - for temporary use, in shell script, etc.
Using lowercase is preferred to avoid potential conflict with shell or environment variables

$ #array of 8-bit binary numbers in ascending order


$ dec2bin=({0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1})
$ echo "${dec2bin[2]}"
00000010
$ echo "${dec2bin[120]}"
01111000
$ echo "${dec2bin[255]}"
11111111

Further Reading

Section 'Shell Variables' in info bash


Difference between shell and environment variables
Variable behavior varies with different type of shells
How to correctly modify PATH variable
Read more on the dec2bin brace expansion example
Parameter expansion - from simple ways to get Variable values to complicated manipulations
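A small taste of the parameter expansion forms mentioned above:

```bash
f='report.log'
echo "${f%.log}"        # report      - strip shortest match of suffix
echo "${f#report.}"     # log         - strip shortest match of prefix
echo "${f/log/txt}"     # report.txt  - replace first match
echo "${#f}"            # 10          - length of the value
```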

Config files
Through use of aliases, functions, shell variables, etc one can customize the shell as per their needs

From 'FILES' section in info bash

/etc/profile The systemwide initialization file, executed for login shells

/etc/bash.bashrc The systemwide per-interactive-shell startup file


/etc/bash.bash.logout The systemwide login shell cleanup file, executed when a login shell exits
~/.bash_profile The personal initialization file, executed for login shells

~/.bashrc The individual per-interactive-shell startup file


~/.bash_logout The individual login shell cleanup file, executed when a login shell exits

~/.inputrc Individual readline initialization file

~/.bashrc

From 'INVOCATION' section in info bash

When an interactive shell that is not a login shell is started, bash reads and executes commands from
/etc/bash.bashrc and ~/.bashrc, if these files exist. This may be inhibited by using the --norc option. The --rcfile file
option will force bash to read and execute commands from file instead of /etc/bash.bashrc and ~/.bashrc.

shopt Set and unset shell options


shopt -s autocd change directory by typing just the name, without having to explicitly type the cd command
( -s sets/enables this option)
shopt -u autocd unset/disable autocd option
shopt -s dotglob include files starting with . also for wildcard expansion

shopt builtin command


set Set or unset values of shell options and positional parameters
set -o emacs Use emacs-style line editing interface
set -o vi Use vi-style line editing interface
set -o history enable command history

set +o history disable command history, useful to temporarily disable logging command history for the current session until it is re-enabled


set -o see current status of various options - are they on/off
set builtin command
aliases
aliases and functions are generally used to construct new commands or invoke commands with preferred
options
source ~/.bash_aliases to avoid cluttering the bashrc file, it is recommended to put them in a separate file

and use source command to add to bashrc


history
By default, history commands are stored in ~/.bash_history, which can be changed using the HISTFILE variable
HISTSIZE=5000 this variable affects how many commands are kept in the history of the current shell session. Use a negative number for unlimited size

HISTFILESIZE=10000 this variable affects how many commands are stored in the history file. Use a negative number for unlimited file size

HISTCONTROL=ignorespace:erasedups don't save commands with leading space and erase all previous duplicates matching current command line


shopt -s histappend append to history file instead of overwriting
using bash history efficiently
common history across sessions
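Putting the history settings together, a typical ~/.bashrc fragment could look like this (the values shown are just examples, adjust to taste):

```bash
# keep plenty of commands in memory and on disk
HISTSIZE=5000
HISTFILESIZE=10000
# skip commands starting with space, drop older duplicates
HISTCONTROL=ignorespace:erasedups
# append to the history file instead of overwriting it
shopt -s histappend
```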
Setting prompt using the PS1 variable
PS1="$ " simple prompt '$ '
PS1="\s-\v\$ " default prompt, adds bash version number, for ex: 'bash-4.3$ '
PS1="\u@\h\\$ \[$(tput sgr0)\]" is a way of setting the prompt to 'username@hostname$ '

easy way to generate PS1 - above example was generated using this site, has options to add color as well
What does the ~/.bashrc file do?
Distros like Ubuntu come with ~/.bashrc already created with useful configurations like bash_completion
sample bashrc

~/.inputrc

Key bindings for command line (readline) are customized in this file. By default, emacs-style is on and can be changed
using the set command as discussed in previous section
Some of the default key bindings are discussed later in this chapter

"\e[A": history-search-backward up arrow to match history starting with partly typed text

"\e[B": history-search-forward down arrow to search in forward direction


"\C-d": unix-filename-rubout Ctrl+d to delete from cursor backwards to filename boundary
set echo-control-characters off turn off control characters like ^C (Ctrl+C) from showing on screen
set completion-ignore-case on ignore case for Tab completion
set show-all-if-ambiguous on combines single and double Tab presses behavior into single Tab press

Simpler introduction to Readline


discussion on GNU Readline library, which allows the user to interact with and edit the command line
sample inputrc

~/.bash_aliases

Before creating an alias or function, use type alias_name to check if an existing command or alias exists with that
name

alias used without argument shows all aliases currently set, sorted in alphabetical order
alias c='clear' alias clear command to just the single letter c

Note that there should be no white-space around = operator


alias b1='cd ../' alias b1 to go back one hierarchy above
alias app='cd /home/xyz/Android/xyz/app/src/main/java/com/xyz/xyzapp/' alias frequently used long paths.

Particularly useful when working on multiple projects spanning multiple years


and if aliases are forgotten over the years, they can be recalled by opening the ~/.bash_aliases file or using the alias
command
alias oa='gvim ~/.bash_aliases' open aliases file with your favorite editor

alias sa='source ~/.bash_aliases' useful to apply changes to current session


alias ls='ls --color=auto' colorize output to distinguish file types

alias l='ls -ltrh' map favorite options; color output comes too, as the previously set alias will be substituted for ls

alias grep='grep --color=auto' colorize file names, line numbers, matched pattern, etc

alias s='du -sh * | sort -h' sort files/directories by size and display in human-readable format

\ls override alias and use original command by using the \ prefix

ch() { man $1 | sed -n "/^\s*$2/,/^$/p" ; } simple command help (ch) function to get information on a
command option
for example: ch ls -F , ch grep -o , etc
ch() { whatis $1; man $1 | sed -n "/^\s*$2/,/^$/p" ; } also prints description of command
ch does a much better job with capability to handle multiple options, multiple arguments, builtin commands, etc
explainshell does even better
o() { gnome-open "$@" &> /dev/null ; } open files with their default applications, discards output and error

messages
for example: o bashguide.pdf
$1 first positional argument

$2 second positional argument


$@ all the arguments

sample bash_aliases
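As another example of a function built on positional arguments, here is a hypothetical helper (not from the sample file) that creates a directory and changes into it in one step:

```bash
# mkcd: make a directory (creating parents as needed) and cd into it
mkcd() { mkdir -p "$1" && cd "$1"; }

# usage:
mkcd /tmp/project/src/utils
pwd    # /tmp/project/src/utils
```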

Further Reading

Sensible bash customizations


shell config files
command line navigation
difference between bashrc and bash_profile
when to use alias, functions and scripts
what does rc in bashrc stand for

Emacs mode Readline shortcuts


Ctrl+c sends SIGINT signal, requesting the current running process to terminate

how Ctrl+c works


Ctrl+c can also be used to abort the currently typed command and give fresh command prompt
Ctrl+z suspends the current running process
Tab the tab key completes the command (even aliases) or filename if it is unique, double Tab gives list of possible
matches if it is not unique
set show-all-if-ambiguous on combines single and double Tab presses behavior into single Tab press
Ctrl+r Search command history. After pressing this key sequence, type characters you wish to match from
history, then press Esc key to return to command prompt or press Enter to execute the command

Esc+b move cursor backward by one word

Esc+f move cursor forward by one word


Esc+Backspace delete backwards up to word boundary

Ctrl+a or Home move cursor to beginning of prompt

Ctrl+e or End move cursor to end of command line


Ctrl+l preserve whatever is typed in command prompt and clear the terminal screen

Ctrl+u delete from beginning of command line up to cursor

Ctrl+k delete from cursor to end of command line


Ctrl+t swap the previous two characters around

For example: if you typed sp instead of ps, press Ctrl+t when the cursor is to right of sp and it will change to ps
Esc+t swap the previous two words around

!$ last used argument

for example: if cat temp.txt was the last command used, rm !$ will delete temp.txt file
Esc+. will insert the last used argument, useful when you need to modify it before execution. Also, multiple
presses allow traversing to the second last command and so on
Mouse scroll button click highlight text you want to copy and then press scroll button of mouse in destination to
paste the text
to disable pasting text on Mouse scroll button click, use the xinput command and get the number
corresponding to your mouse, say it is 11
xinput set-button-map 11 1 0 3 to disable

xinput set-button-map 11 1 2 3 to enable back

Shell Scripting
Need for scripting
Hello script
Sourcing script
Command Line Arguments
Variables and Comparisons
Accepting User Input interactively
if then else
for loop
while loop
Reading a file
Debugging
Real world use case
Resource lists

Need for scripting


Automate repetitive manual tasks
Create specialized and custom commands
Difference between scripting and programming languages

Note:

.sh is typically used as extension for shell scripts


Material presented here is for GNU bash, version 4.3.11(1)-release

Hello script

#!/bin/bash

# Print greeting message


echo "Hello $USER"
# Print day of week
echo "Today is $(date -u +%A)"

# use single quotes for literal strings


echo 'Have a nice day'

The first line has two parts

/bin/bash is path of bash


type bash to get path
#! (called shebang) directs the program loader to use the interpreter path provided

Comments

Comments start with #


Comments can be placed at end of line of code as well
echo 'Hello' # end of code comment

Multiline comments

Single quotes vs Double quotes

Single quotes preserves the literal value of each character within the quotes
Double quotes preserves the literal value of all characters within the quotes, with the exception of '$', '`', '\', and,
when history expansion is enabled, '!'
Difference between single and double quotes

echo builtin command

help -d echo Write arguments to the standard output


By default, echo appends a newline and doesn't interpret backslash escapes
-n do not append a newline
-e enable interpretation of the following backslash escapes
-E explicitly suppress interpretation of backslash escapes

echo Q&A on unix stackexchange

$ chmod +x hello_script.sh
$ ./hello_script.sh
Hello learnbyexample
Today is Wednesday
Have a nice day

Sourcing script

$ help -d source
source - Execute commands from a file in the current shell.

If a script should be executed in the current shell environment instead of a sub-shell, use the . or source command
For example, after editing ~/.bashrc one can use source ~/.bashrc for changes to be immediately effective

$ # contents of prev_cmd.sh
prev=$(fc -ln -2 | sed 's/^[ \t]*//;q')
echo "$prev"

For example, to access history of current interactive shell from within script

$ printf 'hi there\n'


hi there
$ bash prev_cmd.sh

$ printf 'hi there\n'


hi there
$ source prev_cmd.sh
printf 'hi there\n'

Command Line Arguments

#!/bin/bash

# Print line count of files given as command line argument


echo "No of lines in '$1' is $(wc -l < "$1")"
echo "No of lines in '$2' is $(wc -l < "$2")"

Command line arguments are saved in positional variables starting with $1 $2 $3 etc
If a particular argument is a multi-word string, enclose it in quotes or use appropriate escape sequences
$0 contains the name of the script itself - useful to code different behavior based on name of script used

$@ array of all the command line arguments passed to script

$# Number of command line arguments passed to script


Use double quotes around variables when passing its value to another command
why does my shell script choke on whitespace or other special characters?
bash special parameters reference

$ ./command_line_arguments.sh hello_script.sh test\ file.txt


No of lines in 'hello_script.sh' is 9
No of lines in 'test file.txt' is 5
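To handle any number of arguments instead of hardcoding $1 and $2, the script can iterate over "$@" (a generalized sketch of the same idea):

```bash
#!/bin/bash

# Print line count for every file given as command line argument
for filename in "$@"; do
    if [[ -f $filename ]]; then
        echo "No of lines in '$filename' is $(wc -l < "$filename")"
    else
        echo "'$filename' is not a valid filename"
    fi
done
```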

Variables and Comparisons


dir_path=/home/guest space has special meaning in bash, so it cannot be used around = in variable assignments
greeting='hello world' use single quotes for literal strings

user_greeting="hello $USER" use double quotes for substitutions


echo $user_greeting use $ when variable's value is needed
no_of_lines=$(wc -l < "$filename") use double quotes around variables when passing its value to another
command
num=534 numbers can also be declared
(( num = 534 )) but using (( )) for numbers makes life much easier
(( num1 > num2 )) number comparisons are also more readable within (( ))
[[ -e story.txt ]] test if the file/directory exists
[[ $str1 == $str2 ]] for string comparisons

Further Reading

bash arithmetic expressions


how can I add numbers in a bash script?
difference between test, [ and [[ - http://mywiki.wooledge.org/BashFAQ/031
Tests and Conditionals
How to use double or single bracket, parentheses, curly braces?
Variable quoting and using braces for variable substitution
Parameters
Parameter expansion - substitute a variable or special parameter for its value
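The comparison constructs above in action (a small sketch):

```bash
num1=42 num2=7
if (( num1 > num2 )); then
    echo "$num1 is greater than $num2"
fi

str='apple'
# == inside [[ ]] does wildcard matching on the right side
if [[ $str == a* ]]; then
    echo "$str starts with a"
fi

# file existence test; /etc/passwd exists on typical Linux systems
[[ -e /etc/passwd ]] && echo 'file exists'
```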

Accepting User Input interactively

#!/bin/bash

# Get user input


echo 'Hi there! This script returns the sum of two numbers'
read -p 'Enter two numbers separated by spaces: ' number1 number2

echo -e "\n$number1 + $number2 = $((number1 + number2))"


echo 'Thank you for using the script, Have a nice day :)'

$ help -d read
read - Read a line from the standard input and split it into fields

Some useful options:
-a array     assign the words read to sequential indices of the array variable ARRAY, starting at zero
-p prompt    output the string PROMPT without a trailing newline before attempting to read
-s           do not echo input coming from a terminal


More examples with read and getting input from stdin

$ ./user_input.sh
Hi there! This script returns the sum of two numbers
Enter two numbers separated by spaces: 7 42

7 + 42 = 49
Thank you for using the script, Have a nice day :)
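The -a option listed above splits a line of input into an array. A minimal sketch, with a here-string standing in for input typed at the terminal:

```bash
#!/bin/bash

# read the words of a line into the 'fruits' array, indices starting at zero
read -r -a fruits <<< 'apple banana cherry'

echo "count : ${#fruits[@]}"
echo "second: ${fruits[1]}"
```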

if then else

#!/bin/bash

if (( $# != 2 ))
then
echo "Error!! Please provide two file names"
# simple convention for exit values is '0' for success and '1' for error
exit 1
else
# Use ; to combine multiple commands in same line
# -f option checks if file exists, ! negates the value
# white-space around [[ and ]] is necessary
if [[ ! -f $1 ]] ; then
echo "Error!! '$1' is not a valid filename" ; exit 1
else
echo "No of lines in '$1' is $(wc -l < "$1")"
fi

# Conditional Execution
[[ ! -f $2 ]] && echo "Error!! '$2' is not a valid filename" && exit 1
echo "No of lines in '$2' is $(wc -l < "$2")"
fi

When handling user provided arguments, it is always advisable to sanity check them. A simple check can save hours of frustrating debugging when things go wrong


The code inside the if [[ ! -f $1 ]] ; then block is only intended for demonstration; we could just as well have relied on the error handling of the wc command if the file doesn't exist
The default exit value is 0, so it need not be written explicitly for successful script completion
Use elif if you need to test more conditions after if
The operator && is used to execute a command only when the preceding one successfully finishes
To redirect error message to stderr, use echo "Error!! Please provide two file names" 1>&2 and so on
Control Operators && and ||
More examples for if conditional block
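Since elif comes up in the notes above, here is a standalone sketch of a chain with more than two branches (not one of the chapter's scripts):

```bash
#!/bin/bash

num=7

if (( num < 0 )); then
    echo 'negative'
elif (( num == 0 )); then
    echo 'zero'
else
    echo 'positive'
fi
```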

$ ./if_then_else.sh
Error!! Please provide two file names
$ echo $?
1

$ ./if_then_else.sh hello_script.sh
Error!! Please provide two file names
$ echo $?
1

$ ./if_then_else.sh hello_script.sh xyz.tzt


No of lines in 'hello_script.sh' is 9
Error!! 'xyz.tzt' is not a valid filename
$ echo $?
1

$ ./if_then_else.sh hello_script.sh 'test file.txt'


No of lines in 'hello_script.sh' is 9
No of lines in 'test file.txt' is 5
$ echo $?
0

Combining if with exit status of command executed

Sometimes one needs to know whether the intended command operation was successful or not and then take action depending
on the outcome. An exit status of 0 is considered a successful condition when used with the if statement. When available,
use appropriate options to suppress stdout/stderr of the command being used; otherwise redirection might be needed to
avoid cluttering the terminal output

$ grep 'echo' hello_script.sh


echo "Hello $USER"
echo "Today is $(date -u +%A)"
echo 'Have a nice day'

$ # do not write anything to standard output


$ grep -q 'echo' hello_script.sh
$ echo $?
0

$ grep -q 'echo' xyz.txt


grep: xyz.txt: No such file or directory
$ echo $?
2
$ # Suppress error messages about nonexistent or unreadable files
$ grep -qs 'echo' xyz.txt
$ echo $?
2


Example

#!/bin/bash

if grep -q 'echo' hello_script.sh ; then


# do something
echo "string found"
else
# do something else
echo "string not found"
fi

for loop

#!/bin/bash

# Ensure at least one argument is provided


(( $# == 0 )) && echo "Error!! Please provide atleast one file name" && exit 1

file_count=0
total_lines=0

# every iteration, variable file gets next positional argument


for file in "$@"
do
# Let wc show its error message if file doesn't exist
# terminate the script if wc command exit status is not 0
no_of_lines=$(wc -l < "$file") || exit 1
echo "No of lines in '$file' is $no_of_lines"
((file_count++))
((total_lines = total_lines + no_of_lines))
done

echo -e "\nTotal Number of files = $file_count"


echo "Total Number of lines = $total_lines"

This form of for loop is useful when we only need the elements of an array, without having to iterate over the length of
the array and use an index each iteration to fetch elements
In this example we use the control operator || to stop the script if wc fails, i.e. its exit status is other than 0


$ ./for_loop.sh
Error!! Please provide atleast one file name
$ echo $?
1

$ ./for_loop.sh hello_script.sh if_then_else.sh command_line_arguments.sh


No of lines in 'hello_script.sh' is 9
No of lines in 'if_then_else.sh' is 21
No of lines in 'command_line_arguments.sh' is 5

Total Number of files = 3


Total Number of lines = 35
$ echo $?
0

$ ./for_loop.sh hello_script.sh xyz.tzt


No of lines in 'hello_script.sh' is 9
./for_loop.sh: line 14: xyz.tzt: No such file or directory
$ echo $?
1

Index based for loop

#!/bin/bash

# Print 0 to 4
for ((i = 0; i < 5; i++))
do
echo $i
done

Iterating over a user defined array

$ files=('report.log' 'pass_list.txt')
$ for f in "${files[@]}"; do echo "$f"; done
report.log
pass_list.txt
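When the index is needed as well, "${!arrayname[@]}" expands to the indices of the array. A short sketch using the same array:

```bash
#!/bin/bash

files=('report.log' 'pass_list.txt')

# "${!files[@]}" expands to the indices 0 1 ...
for i in "${!files[@]}"; do
    echo "$i: ${files[$i]}"
done
```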

Files specified by glob pattern

A common mistake is to parse the output of the ls command, which is error prone and needless. Instead, filenames and
glob patterns can be used directly.

$ ls
pass_list.txt power.log report.txt

$ for f in power.log *.txt; do echo "$f"; done


power.log
pass_list.txt
report.txt

more examples and use of continue/break
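As a quick standalone illustration of continue and break:

```bash
#!/bin/bash

for num in 1 2 3 4 5; do
    (( num == 2 )) && continue  # skip the rest of this iteration
    (( num == 4 )) && break     # exit the loop entirely
    echo "$num"
done
```

This prints 1 and 3: the iteration for 2 is skipped and the loop ends when 4 is reached.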


while loop

#!/bin/bash

# Print 5 to 1
(( i = 5 ))
while (( i != 0 ))
do
echo $i
((i--))
done

Use while when you need to execute commands as long as a specified condition holds

$ ./while_loop.sh
5
4
3
2
1

Reading a file
Reading line by line

#!/bin/bash

while IFS= read -r line; do


# do something with each line
echo "$line"
done < 'files.txt'

IFS is used to specify the field separator, which is whitespace by default. IFS= clears the default value and
prevents stripping of leading and trailing whitespace from lines
The -r option for read prevents \ escapes from being interpreted
The last line of input won't be read if it is not properly terminated by a newline character

$ cat files.txt
hello_script.sh
if_then_else.sh
$ ./while_read_file.sh
hello_script.sh
if_then_else.sh
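As another sketch of the same construct, counting lines while reading them; process substitution stands in for a file here:

```bash
#!/bin/bash

count=0
while IFS= read -r line; do
    ((count++))
done < <(printf 'hello_script.sh\nif_then_else.sh\n')

echo "lines read: $count"
```

Since the redirection (unlike a pipeline) keeps the loop in the current shell, count retains its value after the loop.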

Reading line as different fields

By default, whitespace is the delimiter
Specify a different one by setting IFS


$ cat read_file_field.sh
#!/bin/bash

while IFS=: read -r genre name; do


echo -e "$genre\t:: $name"
done < 'books.txt'
$ cat books.txt
fantasy:Harry Potter
sci-fi:The Martian
mystery:Sherlock Holmes

$ ./read_file_field.sh
fantasy :: Harry Potter
sci-fi :: The Martian
mystery :: Sherlock Holmes

Reading 'n' characters at a time

$ while read -n1 char; do echo "Character read is: $char"; done <<< "\word"
Character read is: w
Character read is: o
Character read is: r
Character read is: d
Character read is:

$ # if ending newline character is not desirable


$ while read -n1 char; do echo "Character read is: $char"; done < <(echo -n "hi")
Character read is: h
Character read is: i

$ while read -r -n2 chars; do echo "Characters read: $chars"; done <<< "\word"
Characters read: \w
Characters read: or
Characters read: d

Debugging
-x Print commands and their arguments as they are executed
-v verbose option, print shell input lines as they are read
set -xv use this command to enable debugging from within script itself
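Tracing can also be limited to a region of interest by toggling set -x within the script, for example:

```bash
#!/bin/bash

echo 'not traced'

set -x                 # start printing commands before executing them
echo 'traced'
set +x                 # stop tracing from here on

echo 'not traced either'
```

The trace output goes to stderr, so stdout still carries only the echoed strings.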

$ bash -x hello_script.sh
+ echo 'Hello learnbyexample'
Hello learnbyexample
++ date -u +%A
+ echo 'Today is Friday'
Today is Friday
+ echo 'Have a nice day'
Have a nice day


$ bash -xv hello_script.sh


#!/bin/bash

# Print greeting message


echo "Hello $USER"
+ echo 'Hello learnbyexample'
Hello learnbyexample
# Print day of week
echo "Today is $(date -u +%A)"
date -u +%A
++ date -u +%A
+ echo 'Today is Friday'
Today is Friday

# use single quotes for literal strings


echo 'Have a nice day'
+ echo 'Have a nice day'
Have a nice day

Real world use case


With so much copy-pasting of commands and their output involved in creating these chapters, mistakes do happen. So a
script to check correctness comes in handy. Consider the markdown file below

## <a name="some-heading"></a>Some heading

Some explanation

```bash
$ seq 3
1
2
3

$ printf 'hi there!\n'


hi there!
```

## <a name="another-heading"></a>Another heading

More explanations

```bash
$ help -d readarray
readarray - Read lines from a file into an array variable.

$ a=5
$ printf "$a\n"
5
```

The whole file is read into an array so that the index of the next line to be read can be controlled dynamically
Once a command is identified to be tested, the expected output is collected into a variable; multiple output lines are
concatenated. Some commands do not have stdout to compare against, and accordingly the index for the next iteration is
corrected


Note that this is a sample script to demonstrate the use of shell scripting. It is not fool-proof and doesn't
proactively check for possible errors, etc
Make sure eval is used only on known, trusted commands, as is the case here
See Parameter Expansion for examples and explanations on string processing constructs
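The script below leans on a few parameter expansion constructs; a quick sketch of the ones used, with illustrative variable names:

```bash
#!/bin/bash

line='```bash and more'

echo "${line:0:7}"   # substring: 7 characters starting at index 0
echo "${line:3}"     # substring from index 3 to the end

op=$'1\n2\n3'
echo "${op//$'\n'}"  # replace every newline with nothing, i.e. delete them
```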

#!/bin/bash

cb_start=0
readarray -t lines < 'sample.md'

for ((i = 0; i < ${#lines[@]}; i++)); do


# mark start/end of command block
# Line starting with $ to be verified only between ```bash and ``` block end
[[ ${lines[$i]:0:7} == '```bash' ]] && ((cb_start=1)) && continue
[[ ${lines[$i]:0:3} == '```' ]] && ((cb_start=0)) && continue

if [[ $cb_start == 1 && ${lines[$i]:0:2} == '$ ' ]]; then


cmd="${lines[$i]:2}"

# collect command output lines until line starting with $ or ``` block end
cmp_str=''
j=1
while [[ ${lines[$i+$j]:0:2} != '$ ' && ${lines[$i+$j]:0:3} != '```' ]]; do
cmp_str+="${lines[$i+$j]}"
((j++))
done
((i+=j-1))

cmd_op=$(eval "$cmd")
if [[ "${cmd_op//$'\n'}" == "${cmp_str//$'\n'}" ]]; then
echo "Pass: $cmd"
else
echo "Fail: $cmd"
fi
fi
done

Note how sourcing the script is helpful when commands depend on previously executed commands

$ ./verify_cmds.sh
Pass: seq 3
Pass: printf 'hi there!\n'
Pass: help -d readarray
Pass: a=5
Fail: printf "$a\n"

$ source verify_cmds.sh
Pass: seq 3
Pass: printf 'hi there!\n'
Pass: help -d readarray
Pass: a=5
Pass: printf "$a\n"

Resource lists


The material in this chapter is only a basic introduction

Shell Scripting

Bash Guide - for everything related to bash and bash scripting, also has a downloadable pdf
ryanstutorial - good introductory tutorial
bash handbook
writing shell scripts
snipcademy - shell scripting
wikibooks - bash shell scripting
linuxconfig - Bash scripting tutorial
learnshell

Specific topics

using source command to execute bash script


functions
Reading file(s)
Reading file
Loop through the lines of two files in parallel
arrays
nameref
also see this FAQ
getopts
getopts tutorial
wooledge - handle command-line arguments
stackoverflow - getopts example

Handy tools, tips and reference

shellcheck - online static analysis tool that gives warnings and suggestions for scripts
See github link for more info and install instructions
Common bash scripting issues faced by beginners
bash FAQ
bash best Practices
bash pitfalls
Google shell style guide
better bash scripting
robust shell scripting
Bash Sheet
bash reference - nicely formatted and explained well
bash special variables reference
Testing exit values in bash
