
Unit7_RegularExpressionpdf__2023_10_17_09_16_29

This document provides an overview of Regular Expressions in Python, detailing their purpose for pattern matching and information extraction. It covers character classes, predefined character classes, metacharacters, and various regex functions such as match(), search(), and findall(). Additionally, it includes examples for mobile number and email validation, as well as extracting data from text files.

Uploaded by

mohammed sani

01CE1705-Programming with Python

Unit-7 Regular Expression

Prof. Chetan Chudasama


Computer Engineering Department
Regular Expression
• A regular expression is used for pattern matching.
• It is an expression that describes a pattern, which can be used to extract information from text.
• Python has a built-in module called re for working with regular expressions. The re module
is a collection of predefined functions used to process input text.
Applications:-
• Regular expressions are used for validation.
• They are used to develop pattern-matching applications.
• To develop translators such as compilers, interpreters, and assemblers.
• To develop communication protocols such as TCP/IP, UDP, and HTTPS.
Character Classes
Set             Description
[abc]           Either a, b, or c
[^abc]          Any character except a, b, and c
[a-z]           Any lowercase letter
[A-Z]           Any uppercase letter
[a-zA-Z]        Any letter
[0-9]           Any digit
[a-zA-Z0-9]     Any alphanumeric character
[^a-zA-Z0-9]    Any character except an alphanumeric character (a special character)
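The sets in the table above can be tried out with re.findall(), which is covered later in this unit. The following is a minimal sketch; the sample string is made up for illustration.

```python
import re

# Hypothetical sample text, chosen only to illustrate the sets
sample = "Py3 is fun!"

# [a-z] matches one lowercase letter at a time
lowercase = re.findall(r"[a-z]", sample)

# [A-Z] matches one uppercase letter at a time
uppercase = re.findall(r"[A-Z]", sample)

# [0-9] matches one digit at a time
digits = re.findall(r"[0-9]", sample)

# [^a-zA-Z0-9] matches anything that is NOT a letter or a digit
specials = re.findall(r"[^a-zA-Z0-9]", sample)

print(lowercase)  # ['y', 'i', 's', 'f', 'u', 'n']
print(uppercase)  # ['P']
print(digits)     # ['3']
print(specials)   # [' ', ' ', '!']
```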
Predefined Character Classes
Character       Description
\s              Any whitespace character (space, tab, newline)
\S              Any character except whitespace
\d              Any digit
\D              Any character except a digit
\w              Any word character (alphanumeric character or underscore)
\W              Any character except a word character (a special character)
. (dot)         Any character except a newline
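A short sketch of the predefined classes in action is given below. The sample string is hypothetical; it ends with a newline to show that the dot skips newlines unless the re.DOTALL flag is passed.

```python
import re

# Hypothetical sample: letters, digits, an underscore, a space and a newline
sample = "a1 b_2\n"

digits = re.findall(r"\d", sample)       # digits only
non_digits = re.findall(r"\D", sample)   # everything that is not a digit
word_chars = re.findall(r"\w", sample)   # letters, digits and underscore
spaces = re.findall(r"\s", sample)       # space and newline are both whitespace

# By default the dot does not match a newline; re.DOTALL changes that
dot_default = re.findall(r".", sample)
dot_all = re.findall(r".", sample, re.DOTALL)

print(digits)            # ['1', '2']
print(non_digits)        # ['a', ' ', 'b', '_', '\n']
print(word_chars)        # ['a', '1', 'b', '_', '2']
print(spaces)            # [' ', '\n']
print(len(dot_default))  # 6
print(len(dot_all))      # 7
```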
Metacharacters
Character       Description
[]              A set of characters
\               Escapes a special character; also introduces predefined classes such as \d
^               Starts with
$               Ends with
*               Zero or more occurrences
+               One or more occurrences
?               Zero or one occurrence
{}              A specified number of occurrences ({n} exactly n, {m,n} from m to n)
|               Either or
()              Capture and group
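The metacharacters above are easiest to understand by comparing their results on the same input. The sketch below uses made-up strings and the search() and findall() functions described later in this unit.

```python
import re

# ^ and $ anchor the pattern to the start and the end of the string
starts_with_py = bool(re.search(r"^py", "python"))
ends_with_on = bool(re.search(r"on$", "python"))

# *, + and ? control how often the preceding character may repeat
star = re.findall(r"ab*", "a ab abb")      # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", "a ab abb")      # 'a' followed by one or more 'b'
optional = re.findall(r"ab?", "a ab abb")  # 'a' followed by at most one 'b'

# {m,n} asks for between m and n repetitions
two_or_three = re.findall(r"\d{2,3}", "1 22 333 4444")

# | chooses between alternatives
animals = re.findall(r"cat|dog", "cat dog bird")

# () captures parts of the match so they can be read separately
m = re.search(r"(\d{2})-(\d{2})", "date 17-10")

print(starts_with_py, ends_with_on)  # True True
print(star)          # ['a', 'ab', 'abb']
print(plus)          # ['ab', 'abb']
print(optional)      # ['a', 'ab', 'ab']
print(two_or_three)  # ['22', '333', '444']
print(animals)       # ['cat', 'dog']
print(m.group(1), m.group(2))  # 17 10
```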
Regex Functions
match():-
• It tests whether the input string starts with the specified pattern.
• On success, it returns a match object; on failure, it returns None.
Example:-
import re
str1=input("Enter pattern to check ")
# re.match() looks for the pattern only at the beginning of the string
m=re.match(str1,'abcdefgh')
if m is not None:
    print('Match is available at beginning of the string')
else:
    print('Match is not available at beginning of the string')
Example:-
import re
target_string="virat kohli was born on 5th Nov"
# A raw string (r'...') keeps the backslash in \w from being treated as a string escape
result=re.match(r'\w{5}',target_string)
print("Match object:",result)
print("Match value:",result.group())
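Because match() returns None on failure, calling group() on the result without a check raises an AttributeError when the pattern does not match. A minimal sketch of a guarded version follows; the helper name first_word is hypothetical and not part of the re module.

```python
import re

def first_word(text):
    """Return the leading run of word characters, or None if the text does not start with one."""
    m = re.match(r"\w+", text)
    # Only call group() when a match object was actually returned
    return m.group() if m is not None else None

print(first_word("virat kohli"))      # virat
print(first_word("  leading space"))  # None, because the string starts with a space
```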
fullmatch():-
• It returns a match object if and only if the entire string matches the pattern. Otherwise, it
returns None.
Example:-
import re
str1="marwadi university rajkot"
print(re.match("marwadi",str1))
print(re.fullmatch("marwadi",str1))
search():-
• It tests whether the specified pattern is present anywhere in the given string.
• It returns None if the pattern does not match; otherwise it returns a match object (re.Match)
that contains information about the matching part of the string.
• It stops after the first match.
Example:-
import re
test_string="marwadi university rajkot university"
s=re.search("university",test_string)
print(s)
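The match object returned by search() can be queried for the matched text and its position. The sketch below reuses the test string from the example above and shows that only the first occurrence is reported.

```python
import re

test_string = "marwadi university rajkot university"
s = re.search("university", test_string)

if s is not None:
    matched_text = s.group()   # the text that matched
    start_index = s.start()    # index where the match begins
    end_index = s.end()        # index just past the end of the match
    position = s.span()        # (start, end) as a tuple
    print(matched_text, start_index, end_index, position)
    # university 8 18 (8, 18)
```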
findall():-
• It returns a list containing all matches.
• The list contains the matches in the order they are found.
• If no matches are found, an empty list is returned.
Example:-
import re
test_string="marwadi university rajkot university"
list1=re.findall("university",test_string)
print(list1)
Example:-
import re
lst=re.findall('[0-9]','a7b9k6z')
print(lst)
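When the pattern contains capturing groups, findall() returns the groups rather than the whole match: a list of strings for one group, and a list of tuples for several groups. A small sketch with a made-up input string:

```python
import re

# Hypothetical input: name=value pairs, one of which has a non-numeric value
text = "x=10, y=20, z=abc"

# Two groups, so each result is a (name, value) tuple
pairs = re.findall(r"(\w+)=(\d+)", text)

# One group, so each result is just the captured name
names = re.findall(r"(\w+)=\d+", text)

print(pairs)  # [('x', '10'), ('y', '20')]
print(names)  # ['x', 'y']
```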
sub():-
• It replaces the matches with the text of your choice.
• We can control the number of replacements by specifying the count parameter.
Example:-
import re
# count=2 replaces only the first two digits; a raw string keeps \d intact
result=re.sub(r'\d','#','a7b9k5t9k',count=2)
print(result)
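The replacement passed to sub() does not have to be fixed text. As a sketch, the replacement can refer back to captured groups with \1, \2 and so on, or it can be a function that receives each match object. The input strings below are made up for illustration.

```python
import re

# \3-\2-\1 rearranges the three captured groups (day, month, year)
swapped = re.sub(r"(\d{2})-(\d{2})-(\d{4})", r"\3-\2-\1", "17-10-2023")

# A function is called once per match and returns the replacement text
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "a7b9k5")

print(swapped)  # 2023-10-17
print(doubled)  # a14b18k10
```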
subn():-
• This function is similar to sub(), but it also returns the number of replacements made.
• It returns a tuple consisting of the new string and the number of replacements.
Example:-
import re
# subn() returns a tuple: (new string, number of replacements)
str1=re.subn(r'\d','#','a7b9k5t9k')
print("The result string is:",str1[0])
print("The number of replacements:",str1[1])
split():-
• This function splits the given string at each occurrence of a particular character or
pattern.
• It returns a list of the substrings that lie between the matches.
Example:-
import re
result=re.split('-','10-20-30-40-50')
print(result)
# The dot is a metacharacter, so it is escaped to split on a literal '.'
lst=re.split(r'\.','www.marwadiuniversity.ac.in')
print(lst)
for x in lst:
    print(x)
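split() becomes more useful than str.split() when the separator is a pattern rather than a fixed character. The sketch below, with made-up inputs, also shows the maxsplit parameter and the effect of a capturing group in the pattern.

```python
import re

# Split on any run of commas, semicolons or whitespace
colours = re.split(r"[,;\s]+", "red, green;blue  yellow")

# maxsplit limits the number of splits; the rest of the string stays together
limited = re.split("-", "10-20-30-40-50", maxsplit=2)

# A capturing group keeps the separators in the result list
with_separators = re.split(r"(-)", "10-20")

print(colours)          # ['red', 'green', 'blue', 'yellow']
print(limited)          # ['10', '20', '30-40-50']
print(with_separators)  # ['10', '-', '20']
```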
compile():-
• It compiles a regular expression pattern into a pattern object, which can be used for
pattern matching.
• It also helps to reuse a pattern for several searches without rewriting it.
Example:-
import re
# Compile once, then reuse the pattern object; the raw string keeps \d intact
pattern=re.compile(r"\d{3}")
print(pattern.findall("virat's lucky number is 183 018 251"))
print(pattern.findall("rohit's lucky number is 45 209"))
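A compiled pattern object offers the same operations as the module-level functions (search, findall, sub and others), and compile() also accepts flags. A brief sketch using the re.IGNORECASE flag on a made-up sentence:

```python
import re

# re.IGNORECASE makes the pattern match regardless of letter case
pattern = re.compile(r"university", re.IGNORECASE)

text = "Marwadi University, rajkot UNIVERSITY"

found = pattern.findall(text)             # all matches, in their original case
first_at = pattern.search(text).start()   # index of the first match
replaced = pattern.sub("college", text)   # replace every match

print(found)     # ['University', 'UNIVERSITY']
print(first_at)  # 8
print(replaced)  # Marwadi college, rajkot college
```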
Examples
Example:1 Mobile number validation
import re
s=input("Enter mobile number ")
# The +91 prefix is required, followed by a digit from 6 to 9 and nine more digits
m=re.fullmatch(r'(\+91)[6-9][0-9]{9}',s)
if m is not None:
    print(s,'is a valid mobile number')
else:
    print(s,'is not a valid mobile number')
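The pattern in this example accepts a number only when it carries the +91 prefix. As a variation (an assumption, not part of the original example), the prefix can be made optional with ?, and the check can be wrapped in a function so that it can be tested without keyboard input. The function name is_valid_mobile is hypothetical.

```python
import re

# Assumed variation: (\+91)? makes the country code optional
MOBILE_PATTERN = re.compile(r"(\+91)?[6-9][0-9]{9}")

def is_valid_mobile(number):
    """Return True if the whole string is a 10-digit mobile number, optionally prefixed with +91."""
    return MOBILE_PATTERN.fullmatch(number) is not None

print(is_valid_mobile("+919876543210"))  # True
print(is_valid_mobile("9876543210"))     # True
print(is_valid_mobile("5876543210"))     # False, first digit must be 6 to 9
print(is_valid_mobile("+91987654321"))   # False, only nine digits after the prefix
```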
Example:2 Email validation
import re
s=input("Enter mail id ")
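# Local part: a letter, more letters/digits, an optional '.' or '_', more letters/digits
# Domain: 5 to 10 letters followed by .com or .in
m=re.fullmatch(r"[a-zA-Z][a-zA-Z0-9]+[._]?[a-zA-Z0-9]+@[a-zA-Z]{5,10}\.(com|in)",s)
if m is not None:
    print("Valid mail id")
else:
    print("Invalid mail id")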
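It helps to run the pattern against a few addresses to see exactly what it accepts. The sketch below uses the same pattern as the example; the addresses are made up. Note that this pattern is a teaching simplification: it rejects many real addresses, such as those with short domain names or other top-level domains.

```python
import re

EMAIL_PATTERN = re.compile(
    r"[a-zA-Z][a-zA-Z0-9]+[._]?[a-zA-Z0-9]+@[a-zA-Z]{5,10}\.(com|in)"
)

# Hypothetical addresses chosen to exercise each part of the pattern
samples = [
    "chetan.c@gmail.com",  # valid: one dot in the local part
    "abc@gmail.com",       # valid: shortest local part the pattern allows
    "ab@gmail.com",        # invalid: local part needs at least three characters
    "user@mu.in",          # invalid: domain name shorter than five letters
    "user@example.org",    # invalid: only .com and .in are allowed
]

results = {mail: EMAIL_PATTERN.fullmatch(mail) is not None for mail in samples}

for mail, ok in results.items():
    print(mail, "->", "Valid" if ok else "Invalid")
```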
Example:3 Extract all mobile number from text file and store it into another file
import re
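# The with statement closes both files automatically, even if an error occurs
with open("input.txt","r") as f1, open("output.txt","w") as f2:
    for line in f1:
        # Find every 10-digit number that starts with a digit from 6 to 9
        list1=re.findall('[6-9][0-9]{9}',line)
        for number in list1:
            f2.write(number+"\n")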
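The pattern [6-9][0-9]{9} also matches ten digits that sit inside a longer run of digits, such as an order id. One possible refinement (an assumption, not part of the original example) is to add \b word boundaries on both sides. The sketch below runs on temporary files with made-up content, so it does not need an existing input.txt.

```python
import re
import tempfile
from pathlib import Path

def extract_numbers(text):
    """Return 10-digit numbers starting with 6-9 that are not part of a longer word or digit run."""
    return re.findall(r"\b[6-9][0-9]{9}\b", text)

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.txt"
    dst = Path(tmp) / "output.txt"

    # Made-up input: two mobile numbers and one 15-digit order id
    src.write_text("Call 9876543210 or 7012345678.\nOrder id 123456789012345.\n")

    numbers = extract_numbers(src.read_text())
    dst.write_text("\n".join(numbers) + "\n")

    saved = dst.read_text().splitlines()

print(saved)  # ['9876543210', '7012345678']
```

Without the \b boundaries, the same input would also yield 6789012345, which is a slice of the order id rather than a mobile number.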
Example:4 Web Scraping
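import re
import urllib.request
# read() returns bytes, so decode the page before applying a string pattern
with urllib.request.urlopen("https://www.w3schools.com/python/ref_list_append.asp") as s:
    text=s.read().decode("utf-8",errors="replace")
print(text)
# Find every run of 14 consecutive digits in the page source
m1=re.findall("[0-9]{14}",text)
print(m1)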
Thank You
