Introduction To Programming in Perl

This document provides an introduction to programming in Perl. It begins with definitions of programming and programming languages. It then discusses what Perl is, why it is used, and provides a basic example program in Perl. It also covers scalar and array data types, operators for these data types, and control structures like if/else statements. The goal is to introduce basic Perl concepts to allow writing simple programs for bioinformatics tasks.

Uploaded by

Ram Sagar Mourya
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction to programming in Perl WS 2006/07: Bioinformatics I

Introduction to programming in Perl

Nicodème Paul

Nicodeme.paul@unibas.ch

http://www2.biozentrum.unibas.ch/personal/schwede/Teaching/BixI-WS0607/frame.htm

01-11-06 1

What is programming ?

Programming is breaking a task into small steps (divide and conquer).

Sum : 15 + 25 + 11 ?

15 + 25 + 11

40 + 11

51

Programs are written in a programming language such as:

Fortran, Pascal, C, C++, Java, Perl, Python, ...

Program translator

[Diagram: a program is translated by a compiler or an interpreter into machine code (0101011), which runs on the computer's processor and memory]

What is Perl ?

Perl : Practical Extraction and Report Language


by Larry Wall (1987)

• Text-processing language

• Glue language

• Very high level language

• perl (lowercase) is the compiler/interpreter program for the language

Why do we use Perl?

• Simplicity

• Rapid prototyping

• Portability

• Widely used in Bioinformatics


A first example

#!/usr/bin/perl
# The line above is the shebang line; it must not carry a trailing comment

# Pragmas
use strict; # Restrict unsafe constructs
use warnings; # Provide helpful diagnostics

# Assign 15 to $number1
my $number1 = 15;

# Assign 25 to $number2
my $number2 = 25;

# Assign 11 to $number3
my $number3 = 11;

$number1 = $number1 + $number2; # $number1 contains 40
$number1 = $number1 + $number3; # $number1 contains 51

print "My result is : $number1\n"; # Print the result on the terminal

Scalar Data Type

• $answer = 36; # an integer
• $pi = 3.14159265; # a real number
• $avocados = 6.02e23; # scientific notation
• $language = "Perl"; # a string
• $sign1 = "$language is nice"; # string with interpolation
• $sign2 = '$language is nice'; # string without interpolation

Scalar = singular variable (the $ sigil resembles the S of "scalar")
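
The difference between the two quoting styles can be checked with a short script. This is a minimal sketch that reuses the variable names from the slide; nothing else is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $language = "Perl";
my $sign1 = "$language is nice";   # double quotes: $language is replaced by its value
my $sign2 = '$language is nice';   # single quotes: the text is kept literally

print "$sign1\n";   # prints: Perl is nice
print "$sign2\n";   # prints: $language is nice
```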

Scalar Binary Operators


$u = 17; $v = 3; $s = "Perl";

Name             Example     Result
Addition         $u + $v     17 + 3 = 20
Subtraction      $u - $v     17 - 3 = 14
Multiplication   $u * $v     17 * 3 = 51
Division         $u / $v     17 / 3 = 5.66666666667
Modulus          $u % $v     17 % 3 = 2
Exponentiation   $u ** $v    17 ** 3 = 4913
Concatenation    $s . $s     "Perl" . "Perl" = "PerlPerl"
Repetition       $s x 3      "Perl" x 3 = "PerlPerlPerl"

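
The operators in the table can be tried out as follows. This is a small sketch using the same values of $u, $v and $s; the result variable names are chosen here for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $u = 17;
my $v = 3;
my $s = "Perl";

my $sum      = $u + $v;    # 20
my $diff     = $u - $v;    # 14
my $product  = $u * $v;    # 51
my $quotient = $u / $v;    # floating-point division, not integer division
my $rest     = $u % $v;    # 2
my $power    = $u ** $v;   # 4913
my $joined   = $s . $s;    # "PerlPerl"
my $repeated = $s x 3;     # "PerlPerlPerl"

print "$sum $diff $product $rest $power\n";
printf "%.2f\n", $quotient;   # prints 5.67
print "$joined $repeated\n";
```
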
Scalar Unary Operators

Numbers        Strings
abs(expr)      uc(expr)
sqrt(expr)     lc(expr)
exit(expr)     chop(variable)
exp(expr)      chomp(variable)
int(expr)      reverse(expr)
log(expr)      length(expr)

→ perldoc -f function_name
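
A few of these functions in action, as a minimal sketch. The input values are made up for illustration; note that reverse only reverses the characters of a string when it is used in scalar context, as in the assignment below.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $number   = -16;
my $absolute = abs($number);      # 16
my $root     = sqrt($absolute);   # 4
my $whole    = int(2.55);         # 2 (the fractional part is dropped)

my $word = "Perl\n";
chomp($word);                     # removes the trailing newline: "Perl"
my $upper    = uc($word);         # "PERL"
my $lower    = lc($word);         # "perl"
my $size     = length($word);     # 4
my $backward = reverse($word);    # scalar context: "lreP"
chop($word);                      # removes the last character, whatever it is: "Per"

print "$absolute $root $whole\n";
print "$upper $lower $size $backward $word\n";
```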

Context

$u = "12" + 5;
→ 17

$u = "12john" + 5;
→ 17

$u = "john12" + 5;
→ 5

use warnings;

$u = "john12" + 5;
→ Argument "john12" isn't numeric in addition (+) at line 3
→ 5

$u = "12" + 5;
→ 17

Array data type

Index   Value
0       35
1       12.4
2       "bye\n"
3       1.7e23
4       'Hi'

$data[0] = 35; $data[1] = 12.4; $data[2] = "bye\n"; $data[3] = 1.7e23; $data[4] = 'Hi';

@data = (35, 12.4, "bye\n", 1.7e23, 'Hi');

Array = plural variable (the @ sigil resembles the a of "array")

Array operators

@let = ("J", "P", "S", "D", "C");   (each example below starts from this original @let)

Operator   Example                    Result
pop        $r = pop(@let)             $r = "C"; @let = ("J","P","S","D")
push       push(@let, "G")            @let = ("J","P","S","D","C","G")
shift      $r = shift(@let)           $r = "J"; @let = ("P","S","D","C")
unshift    unshift(@let, "G")         @let = ("G","J","P","S","D","C")
splice     @a = splice(@let, 1, 2)    @a = ("P","S"); @let = ("J","D","C")
join       $r = join(':', @let)       $r = "J:P:S:D:C"
scalar     $r = scalar(@let)          $r = 5
reverse    @a = reverse(@let)         @a = ("C","D","S","P","J")

→ perldoc -f function_name

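
The table rows can be reproduced with the sketch below. It resets @let from a copy (@original, a name introduced here) before each operation, which is the assumption under which the results in the table hold.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @original = ("J", "P", "S", "D", "C");

my @let = @original;
my $last = pop(@let);                 # removes and returns the last element: "C"

@let = @original;
push(@let, "G");                      # appends "G" at the end
my $after_push = join("", @let);      # "JPSDCG"

@let = @original;
my $first = shift(@let);              # removes and returns the first element: "J"

@let = @original;
unshift(@let, "G");                   # inserts "G" at the front
my $after_unshift = join("", @let);   # "GJPSDC"

@let = @original;
my @removed = splice(@let, 1, 2);     # removes 2 elements starting at index 1
my $removed = join("", @removed);     # "PS"
my $left    = join("", @let);         # "JDC"

print "pop: $last, shift: $first\n";
print "push: $after_push, unshift: $after_unshift\n";
print "splice removed: $removed, left: $left\n";
print "count: ", scalar(@original), ", reversed: ", join("", reverse(@original)), "\n";
```
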
Search for a name

#!/usr/bin/perl

use strict;
use warnings;

my @names = ("John", "Peter", "Simon", "Dave", "Chris");

my $offset = int(rand(scalar(@names))); # random index in [0, ..., 4]; e.g. int(2.55) = 2

if ($names[$offset] eq "Simon") { # block start for the if statement
    print "Simon is found\n";
    print "Success!\n";
} # block end for the if statement
else { # block start for the else statement
    print "Simon is not found\n";
    print "Failed!\n";
} # block end for the else statement

Comparison operators

Comparison              Numeric   String   Return value
Equal                   ==        eq       1 if $a is equal to $b, otherwise ""
Not equal               !=        ne       1 if $a is not equal to $b, otherwise ""
Less than               <         lt       1 if $a is less than $b, otherwise ""
Greater than            >         gt       1 if $a is greater than $b, otherwise ""
Less than or equal      <=        le       1 if $a is not greater than $b, otherwise ""
Greater than or equal   >=        ge       1 if $a is not less than $b, otherwise ""
Comparison              <=>       cmp      0 if $a and $b are equal, 1 if $a is greater, -1 if $b is greater

"" is the empty string

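
Choosing the numeric or the string column matters, because the same two values can compare differently. A minimal sketch, with $x and $y as illustrative names (chosen to avoid $a and $b, which Perl also uses inside sort blocks):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = 10;
my $y = 9;

my $numeric_gt = ($x > $y)  ? "true" : "false";   # 10 is numerically greater than 9
my $string_gt  = ($x gt $y) ? "true" : "false";   # as strings, "10" sorts before "9"
my $numeric_cmp = ($x <=> $y);                    # 1
my $string_cmp  = ($x cmp $y);                    # -1

print "numeric: $numeric_gt ($numeric_cmp)\n";    # prints: numeric: true (1)
print "string:  $string_gt ($string_cmp)\n";      # prints: string:  false (-1)
```
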
What is true or false?

• Any number is true except for 0.

• Any string is true except for “” and “0”.

• Anything else converted to a true value string or a true value number is true.

• Anything that is not true is false.

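
The rules above can be checked value by value. This is a small sketch with a handful of test values picked for illustration; note that the string "0.0" is true because it is neither "" nor "0".

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @values = (0, 1, -3, "", "0", "0.0", "abc");
my @results;

for (my $i = 0; $i < scalar(@values); $i = $i + 1) {
    if ($values[$i]) {
        push(@results, "true");
    }
    else {
        push(@results, "false");
    }
    print "'$values[$i]' is $results[$i]\n";
}
```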

Logical operators

Example       Name   Result
$a && $b      AND    $a if $a is false, $b otherwise
$a || $b      OR     $a if $a is true, $b otherwise
! $a          NOT    True if $a is not true, false otherwise
$a and $b     AND    $a if $a is false, $b otherwise
$a or $b      OR     $a if $a is true, $b otherwise
not $a        NOT    True if $a is not true, false otherwise
$a xor $b     XOR    True if exactly one of $a and $b is true, false otherwise

Pay attention to the precedence rules:

$xyz = $x || $y || $z is not the same as $xyz = $x or $y or $z

The word operators bind more loosely than =, so the second form is parsed as ($xyz = $x) or $y or $z.

Use parentheses!

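
The precedence difference can be made visible with concrete values. A minimal sketch: the values and the names $tight, $loose and $fixed are illustrative, and the 'void' warning is switched off locally only because the unparenthesized form deliberately leaves operands unused.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($x, $y, $z) = (0, 0, 7);

my $tight = $x || $y || $z;      # || binds tighter than =, so $tight is 7

my $loose;
{
    no warnings 'void';          # the unused operands below are intentional
    $loose = $x or $y or $z;     # parsed as ($loose = $x) or $y or $z, so $loose is 0
}

my $fixed = ($x or $y or $z);    # parentheses restore the intended meaning: 7

print "tight=$tight loose=$loose fixed=$fixed\n";   # prints: tight=7 loose=0 fixed=7
```
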
Conditional statements

• Simple

Statement if (Expression);

• Compound

if (Expression) Block

if (Expression) Block else Block

if (Expression) Block elsif (Expression) Block else Block

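
Both forms can be sketched on a short DNA string. The thresholds and labels below are made up for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sequence = "TATGAAA";
my $length = length($sequence);   # 7
my $size;

# Compound form with elsif and else
if ($length < 5) {
    $size = "short";
}
elsif ($length < 10) {
    $size = "medium";
}
else {
    $size = "long";
}
print "$sequence is $size\n";     # prints: TATGAAA is medium

# Simple form: the statement runs only if the expression is true
print "$sequence has an odd length\n" if ($length % 2 == 1);
```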

Search for a name


#!/usr/bin/perl

use strict;
use warnings;

my @names = ("John", "Peter", "Simon", "Dave", "Chris");

my $offset = int(rand(scalar(@names)));
my $count = 1;

while ($names[$offset] ne "Simon") { # block start for the while statement
    $offset = int(rand(scalar(@names)));
    $count = $count + 1;
} # block end for the while statement

print "Simon is found after $count trials\n";

Check for a name

#!/usr/bin/perl

use strict;
use warnings;

my @names = ("John", "Peter", "Simon", "Dave", "Chris");

for (my $i = 0; $i < scalar(@names); $i = $i + 1) { # block start for the for loop
    if ($names[$i] eq "Simon") {
        print "Simon is found\n";
    }
} # end block for the for loop

Check for a name


#!/usr/bin/perl

use strict;
use warnings;

my @names = ("John", "Peter", "Simon", "Dave", "Chris");

for (my $i = 0; $i < scalar(@names); $i = $i + 1) { # block start for the for loop
    if ($names[$i] eq "Simon") {
        print "Simon is found\n";
        last; # jump outside of the loop
    }
} # end block for the for loop

Loop statements

• Simple

Statement while (Expression);

• Compound

while (Expression) Block

for (Initialization; Expression; Incrementing) Block

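
The three loop forms side by side, as a minimal sketch. The computations (doubling, summing, multiplying) are arbitrary examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Simple form: the statement is repeated while the expression is true
my $n = 1;
$n = $n * 2 while ($n < 100);     # 1, 2, 4, ..., 64, 128

# Compound while: sum of 1..5
my $i = 0;
my $sum = 0;
while ($i < 5) {
    $i = $i + 1;
    $sum = $sum + $i;
}

# Compound for: product of 1..5
my $product = 1;
for (my $j = 1; $j <= 5; $j = $j + 1) {
    $product = $product * $j;
}

print "n=$n sum=$sum product=$product\n";   # prints: n=128 sum=15 product=120
```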

Hashes

%names

Key     Value
John    5
Peter   3
Simon   11
Dave    1
Chris   4

$names{"John"} = 5; $names{"Peter"} = 3; $names{"Simon"} = 11;
$names{"Dave"} = 1; $names{"Chris"} = 4;

Hash = key/value pairs (sigil %)

Check for a name

#!/usr/bin/perl

use strict;
use warnings;

my %names = ("John", 5, "Peter", 3, "Simon", 11, "Dave", 1, "Chris", 4);

my $key = "Simon";

if (exists $names{$key}) { # exists returns true if the key is in %names, otherwise false
    print "$key is found, his value is : $names{$key}\n";
}
else {
    print "$key is not found\n";
}

Check for a name


#!/usr/bin/perl

use strict;
use warnings;

my %names = (
    "John" => 5,
    "Peter" => 3,
    "Simon" => 11,
    "Dave" => 0,
    "Chris" => 4
);
my $key = "Simon";

if (exists $names{$key}) {
    print "$key is found, his value is : $names{$key}\n";
}
else {
    print "$key is not found\n";
}

Hash operators

Operator   Example               Result
exists     exists $hash{$key}    Returns true if $key is in %hash, otherwise it returns false
delete     delete $hash{$key}    Deletes $key => $hash{$key} from %hash
each       each %hash            Steps through a hash one key/value pair at a time
keys       keys %hash            Returns a list consisting of all the keys of %hash
values     values %hash          Returns a list consisting of all the values of %hash

→ perldoc -f function_name
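
A short sketch of these operators on a small hash. Because a hash has no fixed order, the keys and values are passed through sort here to make the output predictable; sort is an addition that does not appear on the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %names = ("John" => 5, "Peter" => 3, "Simon" => 11);

delete $names{"Peter"};                                   # removes the Peter => 3 pair
my $peter_left = (exists $names{"Peter"}) ? "yes" : "no"; # "no"

my @keys   = sort(keys %names);                           # ("John", "Simon")
my @values = sort { $a <=> $b } values %names;            # (5, 11)

# each returns one key/value pair per call until the hash is exhausted
my $total = 0;
while (my ($key, $value) = each %names) {
    $total = $total + $value;
}

print "Peter still present: $peter_left\n";
print "keys: @keys, values: @values, total: $total\n";
```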

Getting user input


#!/usr/bin/perl

use strict;
use warnings;

my $line;
print "Type something : ";

while ($line = <STDIN>) { # STDIN : Standard Input
    if ($line eq "\n") {
        print "That was just a blank line\n";
    }
    else {
        print "Input : $line";
    }
    print "Type something : ";
}

→ Ctrl-C to exit

Reading from a file

#!/usr/bin/perl

use strict;
use warnings;

print "Enter the filename: ";

my $filename = <STDIN>; # Read Standard Input for a filename
chomp($filename); # Remove the end of line character

if (! (-e $filename)) { # Test whether the file exists
    print "File not found\n";
    exit 1;
}

open(IN, $filename) || die "Could not open $filename\n";

my @names = <IN>; # Store the content of the file in an array
close(IN);

print @names;

Input file
John
Peter
Simon
Dave
Chris

Reading from a file


#!/usr/bin/perl

use strict;
use warnings;

print "Enter the input file name : ";

my $filename = <STDIN>; # Read Standard Input for a filename
chomp($filename); # Remove the end of line character

if (! (-e $filename)) {
    print "File not found\n";
    exit 1;
}
my %names = ();
my ($key, $value);
open(IN, $filename) || die "Could not open $filename\n";
while (my $line = <IN>) {
    chomp($line);
    ($key, $value) = split(/\t/, $line); # name and value are separated by a tab
    $names{$key} = $value;
}
close(IN);
$, = " "; # It contains the separator for the print statement
print %names, "\n";

Input file : data.txt (name and value separated by a tab)
John 5
Peter 3
Simon 11
Dave 1
Chris 4

Input and output functions

Function   Syntax                     Description
open       open FILEHANDLE, EXPR      Opens a file to be referred to using FILEHANDLE
close      close FILEHANDLE           Closes the file associated with FILEHANDLE
print      print [FILEHANDLE] LIST    Prints each element of LIST to FILEHANDLE

→ perldoc -f function_name
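
The three functions together, in a sketch that writes a file and reads it back. The file name names_out.txt is hypothetical; the leading ">" in the EXPR passed to open selects write mode, which the earlier slides (read-only examples) do not show. The file is removed at the end with unlink.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $filename = "names_out.txt";   # hypothetical output file name
my @names = ("John", "Peter", "Simon");

# A leading ">" opens the file for writing (it is created or overwritten)
open(OUT, ">$filename") || die "Could not open $filename for writing\n";
foreach my $name (@names) {
    print OUT "$name\n";          # print to the filehandle OUT instead of the terminal
}
close(OUT);

# Read the file back to check what was written
open(IN, $filename) || die "Could not open $filename\n";
my @lines = <IN>;
close(IN);

print "Read back ", scalar(@lines), " lines\n";   # prints: Read back 3 lines
unlink($filename);                # clean up the example file
```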

Testing files

Example         Name        Result
-e $filename    Exists      True if the file named in $filename exists, otherwise false
-r $filename    Readable    True if the file named in $filename is readable, otherwise false
-w $filename    Writable    True if the file named in $filename is writable, otherwise false
-d $filename    Directory   True if the file named in $filename is a directory, otherwise false
-f $filename    File        True if the file named in $filename is a regular file, otherwise false
-T $filename    Text File   True if the file named in $filename is a text file, otherwise false

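
A quick way to try the tests is to apply them to the current directory ".", which always exists. This is only a sketch; the variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $path = ".";   # the current directory

my $exists  = (-e $path) ? "yes" : "no";   # it exists
my $is_dir  = (-d $path) ? "yes" : "no";   # it is a directory
my $is_file = (-f $path) ? "yes" : "no";   # so it is not a regular file

print "exists: $exists, directory: $is_dir, regular file: $is_file\n";
# prints: exists: yes, directory: yes, regular file: no
```
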
Regular expressions

#!/usr/bin/perl

use strict;
use warnings;

my $filename = "data.txt";
my $line;
my %data = ();
my $key;

open(IN, $filename) || die "Could not open $filename\n";
while ($line = <IN>) {
    chomp($line);
    if ($line =~ /^>/) { # check for ids using pattern matching
        $key = $line;
    }
    else {
        $data{$key} = $line;
    }
}
close(IN);
my @ids = keys %data;
my @sequences = values %data;
$, = " ";
print @ids, "\n", @sequences, "\n";

Input file : data.txt
>id1
ATTGTC
>id2
GGTCCT
>id3
TATGAAA
>id4
GTGTATA

Regular expressions

EXPR =~ m/PATTERN/
m// operator (matching): searches the string in the scalar EXPR (or $_) for PATTERN. In scalar context the operator returns true (1) if successful, false ("") otherwise. In list context m// returns a list of the substrings matched by any capturing parentheses in PATTERN. PATTERN undergoes double-quote interpolation.
Example: $line = ">id1"; $line =~ /^>/

VAR =~ s/PATTERN/REPLACEMENT/
s/// operator (substitution): searches the string in the scalar variable VAR (or $_) for PATTERN and, if found, replaces the matched substring with the REPLACEMENT text. In scalar and list context s/// returns the number of times it succeeded. Both PATTERN and REPLACEMENT undergo double-quote interpolation.
Example: $line = ">id1"; $line =~ s/>//

VAR =~ tr/SEARCHLIST/REPLACEMENTLIST/
tr/// operator (transliteration): scans the string in the scalar variable VAR (or $_) character by character and replaces each occurrence of a character found in SEARCHLIST with the corresponding character in REPLACEMENTLIST. In scalar and list context tr/// returns the number of characters replaced or deleted. SEARCHLIST is NOT a regular expression, and neither SEARCHLIST nor REPLACEMENTLIST undergoes full double-quote interpolation (backslash sequences are processed, but variables are not interpolated).
Example: $line = "id1"; $line =~ tr/a-z/A-Z/

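
The return values described above can be observed directly. A minimal sketch applying the three operators in turn to the header line ">id1"; the result variable names are chosen here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = ">id1";

my $is_header  = ($line =~ m/^>/) ? "yes" : "no";   # scalar context: true, so "yes"
my $replaced   = ($line =~ s/>//);                  # 1 substitution; $line is now "id1"
my $translated = ($line =~ tr/a-z/A-Z/);            # 2 characters changed; $line is now "ID1"
my ($digits)   = ($line =~ m/(\d+)/);               # list context: captured text "1"

print "$is_header $replaced $translated $line $digits\n";   # prints: yes 1 2 ID1 1
```
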
Regular expressions

#!/usr/bin/perl

use strict;
use warnings;

my $filename = "data.txt";
my $line;
my %data = ();
my $key;

open(IN, $filename) || die "Could not open $filename\n";
while ($line = <IN>) {
    chomp($line);
    if ($line =~ /^>/) { # check for ids using pattern matching
        $line =~ s/>//; # substitute > by nothing in id
        $line =~ tr/a-z/A-Z/; # translate lower case to upper case
        $key = $line;
    }
    else {
        $data{$key} = $line;
    }
}
close(IN);
my @ids = keys %data;
my @sequences = values %data;
$, = " ";
print @ids, "\n", @sequences, "\n";

Input file : data.txt
>id1
ATTGTC
>id2
GGTCCT
>id3
TATGAAA
>id4
GTGTATA

Regular expressions

Symbol    Meaning
\...      Used to escape metacharacters (including itself) or to make the next character a metacharacter (like \s, \w, \n)
...|...   Alternation (match one or the other)
(...)     Grouping (treat as a unit)
[...]     Character class (match one character from a set)
^         True at the beginning of the string (or sometimes after any newline)
$         True at the end of the string (or sometimes before any newline)
.         Match any one character (except newline, normally)

Example: $seq =~ /AAA$/

Regular expressions

Quantifier    Meaning
*             Match 0 or more times (maximal)
+             Match 1 or more times (maximal)
?             Match 0 or 1 time (maximal)
{COUNT}       Match exactly COUNT times
{MIN,}        Match at least MIN times (maximal)
{MIN,MAX}     Match at least MIN times but not more than MAX times (maximal)
*?            Match 0 or more times (minimal)
+?            Match 1 or more times (minimal)
??            Match 0 or 1 time (minimal)
{MIN,}?       Match at least MIN times (minimal)
{MIN,MAX}?    Match at least MIN times but not more than MAX times (minimal)

Example: $seq = "TATGAAA"; $seq =~ /.*AAA$/; $seq =~ /.*A{3}$/
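
The difference between maximal and minimal matching shows up when the matched text is captured. A small sketch on the same sequence; the patterns /(T.*A)/ and /(T.*?A)/ are examples added here, not taken from the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $seq = "TATGAAA";

my ($maximal) = ($seq =~ /(T.*A)/);    # .* takes as much as possible: "TATGAAA"
my ($minimal) = ($seq =~ /(T.*?A)/);   # .*? takes as little as possible: "TA"
my $ends_with_aaa = ($seq =~ /A{3}$/) ? "yes" : "no";

print "maximal: $maximal\n";
print "minimal: $minimal\n";
print "ends with AAA: $ends_with_aaa\n";
```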

Regular expressions

Symbol   Meaning                 Character Class
\d       Digit                   [0-9]
\D       Nondigit                [^0-9]
\s       Whitespace              [ \t\n\r\f]
\S       Nonwhitespace           [^ \t\n\r\f]
\w       Word character          [a-zA-Z0-9_]
\W       Non-(word character)    [^a-zA-Z0-9_]

Example: $id = "id2"; $id =~ /id\d+$/

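
Character classes combine naturally with capturing parentheses. The sketch below splits a made-up record (an id, a tab, then a DNA sequence) into its two parts; the record layout is an assumption for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $record = "id2\tGGTCCT";   # hypothetical record: id, whitespace, sequence
my $id = "";
my $sequence = "";

# \w+ matches the id, \s+ the separating whitespace, [ACGT]+ the bases
if ($record =~ /^(\w+)\s+([ACGT]+)$/) {
    $id = $1;
    $sequence = $2;
}

my $valid_id = ($id =~ /^id\d+$/) ? "yes" : "no";

print "id=$id sequence=$sequence valid=$valid_id\n";   # prints: id=id2 sequence=GGTCCT valid=yes
```
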
Subroutines or functions

#!/usr/bin/perl
use strict;
use warnings;

my $filename1 = "data1.txt";
my $filename2 = "data2.txt";
my %data1 = get_data($filename1); # subroutine call
my %data2 = get_data($filename2); # subroutine call
$, = " ";
print keys %data1, "\n", values %data1, "\n";
print keys %data2, "\n", values %data2, "\n";

sub get_data {
    my $filename = shift(@_);
    my $key;
    my %tmp = ();
    open(IN, $filename) || die "Could not open $filename\n";
    while (my $line = <IN>) {
        chomp($line);
        if ($line =~ /^>/) {
            $line =~ s/>//;
            $line =~ tr/a-z/A-Z/;
            $key = $line;
        }
        else {
            $tmp{$key} = $line;
        }
    }
    close(IN);
    return %tmp;
}

Input file : data1.txt
>id1
ATTGTC
>id2
GGTCCT
>id3
TATGAAA
>id4
GTGTATA

Input file : data2.txt
>id5
ATAAAAA
>id6
GGAATTT
>id7
TATGATT
>id8
GTGTAAT
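
The same mechanics (arguments arrive in @_, results leave through return) can be seen in a smaller subroutine. The function gc_content below is a hypothetical helper written for this illustration, not part of the course material; it assumes a non-empty sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns the fraction of G and C bases in a non-empty DNA sequence
sub gc_content {
    my $sequence = shift(@_);                # first argument of the call
    my $gc_count = ($sequence =~ tr/GC//);   # counts G and C without changing the string
    return $gc_count / length($sequence);
}

my $fraction = gc_content("GGTCCT");         # 4 of the 6 bases are G or C
printf "GC content: %.2f\n", $fraction;      # prints: GC content: 0.67
```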

Packages

Main program:

#!/usr/bin/perl

use strict;
use warnings;
use MyTools;

my $filename1 = "data1.txt";
my $filename2 = "data2.txt";

my %data1 = MyTools::get_data($filename1);
my %data2 = MyTools::get_data($filename2);

$, = " "; # set the print separator

print keys %data1, "\n", values %data1, "\n";
print keys %data2, "\n", values %data2, "\n";

Package file (MyTools.pm):

package MyTools;

sub get_data {
    my $filename = shift(@_);
    my $key;
    my %tmp = ();
    open(IN, $filename) || die "Could not open $filename\n";
    while (my $line = <IN>) {
        chomp($line);
        if ($line =~ /^>/) {
            $line =~ s/>//;
            $key = $line;
        }
        else {
            $tmp{$key} = $line;
        }
    }
    close(IN);
    return %tmp;
}
1; # this should be your last line

Comprehensive Perl Archive Network (CPAN): http://www.cpan.org/

References

• Recommended Books
– Beginner
» "Learning Perl", 4th Edition by Randal Schwartz, Tom Phoenix & brian d foy
» "Beginning Perl for Bioinformatics", 1st Edition by James Tisdall
» "Developing Bioinformatics Computer Skills", 1st Edition by Cynthia Gibas & Per Jambeck