Data Analysis using Python Lab
Ex.no: 3
Regular Expression: Demonstrate usage of regular expression
In this tutorial, you will learn about regular expressions
(RegEx) and use Python's "re" module to work with them.
A Regular Expression (RegEx) is a sequence of characters that
defines a search pattern.
For example: ^a...s$
The above code defines a RegEx pattern. The pattern is: any
five-character string starting with a and ending with s.
A Regular Expression (RegEx) is a special sequence of
characters that defines a search pattern for finding a string or
set of strings. It can detect the presence or absence of text by
matching it against a particular pattern, and it can also split a
pattern into one or more sub-patterns. Python provides the re
module to support regex. One of its main functions is
re.search(), which takes a regular expression and a string and
returns the first match, or None if there is no match.
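The search behaviour described above can be sketched as follows. This is a minimal illustration; the sample text and variable names are assumptions made up for the demo, not part of the original exercise.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "Order 66 shipped on day 12"

# re.search() scans the string and stops at the first match.
first = re.search(r"\d+", text)
print(first.group())   # 66

# When nothing matches, re.search() returns None.
missing = re.search(r"xyz", text)
print(missing)         # None
```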
A pattern defined using RegEx can be used to match against a
string.
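As a rough sketch, the pattern ^a...s$ can be tried against a few strings to see which ones match. The candidate strings below are illustrative assumptions, not taken from the exercise.

```python
import re

pattern = r"^a...s$"

# Illustrative candidates (assumed for this sketch).
candidates = ["abs", "alias", "abyss", "Alias", "An abacus"]

# Record whether each candidate matches the whole pattern.
results = {s: bool(re.search(pattern, s)) for s in candidates}

for s, matched in results.items():
    print(f"{s!r}: {'match' if matched else 'no match'}")
```

Only "alias" and "abyss" match: "abs" is too short, "Alias" starts with an uppercase A, and "An abacus" neither starts with a lowercase a nor has exactly five characters.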
Python has a module named "re" to work with RegEx.
Here's an example:
import re
pattern = r'^a...s$'
test_string = 'abyss'
result = re.match(pattern, test_string)
if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")
Here, we used the re.match() function to check whether pattern
matches test_string. re.match() only looks for a match at the
beginning of the string. It returns a match object if the match
succeeds. If not, it returns None.
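Because re.match() only tries the pattern at the start of the string, it can return None even when the pattern occurs later in the text. The sketch below contrasts it with re.search(); the input string is a made-up example.

```python
import re

text = "say abyss"   # hypothetical input

# re.match() only tries the pattern at position 0.
at_start = re.match(r"a...s", text)

# re.search() tries every position until it finds a match.
anywhere = re.search(r"a...s", text)

print(at_start)           # None
print(anywhere.group())   # abyss
print(anywhere.span())    # (4, 9)
```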
Specify Pattern Using RegEx
• To specify regular expressions, metacharacters are used. In
the above example, ^ and $ are metacharacters.
Metacharacters
• Metacharacters are characters that are interpreted in a
special way by a RegEx engine. Here's a list of metacharacters:
[] . ^ $ * + ? {} () \ |
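Since these characters have a special meaning, matching them literally requires a backslash in front of them. The following is a small sketch of that idea; the sample string is an assumption for illustration.

```python
import re

price = "Total: $5.00 (approx.)"   # hypothetical sample string

# Escaped with a backslash, "$" and "." match the literal characters
# instead of acting as metacharacters.
literal = re.search(r"\$5\.00", price)
print(literal.group())      # $5.00

# re.escape() adds the needed backslashes automatically.
escaped = re.escape("$5.00")
print(escaped)              # \$5\.00
```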
[] - Square brackets
• Square brackets specify a set of characters you wish to
match.
• For example, [abc] matches if the string you are trying to
match contains any of the characters a, b or c.
• You can also specify a range of characters using - inside square
brackets.
• [a-e] is the same as [abcde].
• [1-4] is the same as [1234].
• [0-39] is the same as [01239].
• You can complement (invert) the character set by placing the
caret ^ symbol immediately after the opening square bracket.
• [^abc] means any character except a or b or c.
• [^0-9] means any non-digit character.
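The character sets above can be checked with a short sketch like the one below. The sample text is an assumption chosen so that each set picks out different characters.

```python
import re

text = "a1 b2 c3 d9 x5"   # hypothetical sample text

any_abc = re.findall(r"[abc]", text)       # any of a, b or c
a_to_e = re.findall(r"[a-e]", text)        # range a through e
digits = re.findall(r"[0-39]", text)       # 0, 1, 2, 3 or 9
non_digits = re.findall(r"[^0-9 ]", text)  # not a digit and not a space

print(any_abc)     # ['a', 'b', 'c']
print(a_to_e)      # ['a', 'b', 'c', 'd']
print(digits)      # ['1', '2', '3', '9']
print(non_digits)  # ['a', 'b', 'c', 'd', 'x']
```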
Special Sequences
• Special sequences make commonly used patterns easier to
write. Common ones include \d (any digit), \s (any whitespace
character), \w (any word character) and \b (a word boundary).
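A few widely used special sequences from Python's re module are \d, \w, \s and \b. The sketch below shows them in action; the sample text and variable names are assumptions for illustration.

```python
import re

text = "Room 42, floor 7"   # hypothetical sample text

digits = re.findall(r"\d+", text)       # runs of digits
words = re.findall(r"\w+", text)        # runs of word characters
spaces = re.findall(r"\s", text)        # single whitespace characters
f_words = re.findall(r"\bf\w*", text)   # words starting with "f"

print(digits)    # ['42', '7']
print(words)     # ['Room', '42', 'floor', '7']
print(spaces)    # [' ', ' ', ' ']
print(f_words)   # ['floor']
```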
re.findall()
• The re.findall() method returns a list of strings containing all
matches.
Example 1: re.findall()
# Program to extract numbers from a string
import re
string = 'hello 12 hi 89. Howdy 34'
pattern = r'\d+'
result = re.findall(pattern, string)
print(result)
# Output: ['12', '89', '34']
• If the pattern is not found, re.findall() returns an empty list.
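The empty-list case can be sketched like this, with an input string assumed for the demo. Because an empty list is falsy, the result can be tested directly in an if statement.

```python
import re

# Hypothetical input that contains no digits.
result = re.findall(r"\d+", "no digits here")
print(result)   # []

# An empty list is falsy, so it can be checked directly.
if not result:
    print("No matches found.")
```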
re.sub()
• The re.sub() function returns a string in which every matched
occurrence is replaced with the content of the replace variable.
• The syntax of re.sub() is:
re.sub(pattern, replace, string)
Example 3: re.sub()
# Program to remove all whitespaces
import re
# multiline string
string = 'abc 12\
de 23 \n f45 6'
# matches all whitespace characters
pattern = r'\s+'
# empty string
replace = ''
new_string = re.sub(pattern, replace, string)
print(new_string)
# Output: abc12de23f456
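The replacement does not have to be an empty string. As a variation on the example above, this sketch collapses each run of whitespace into a single space; the input string is an assumption for illustration.

```python
import re

messy = "abc   12\tde 23 \n f45 6"   # hypothetical input

# Replace every run of whitespace with exactly one space.
tidy = re.sub(r"\s+", " ", messy)
print(tidy)   # abc 12 de 23 f45 6
```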