Pandas Series.str.findall() Method

The Series.str.findall() method in Pandas finds all occurrences of a pattern or regular expression in each string of a Series or Index. It is equivalent to applying re.findall() to every element of the Series/Index.

The method returns a Series or Index of lists, where each list contains all non-overlapping matches of the pattern found in the corresponding string. This makes it useful for extracting every occurrence of a pattern from a Pandas Series, an Index, or a DataFrame column.
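The equivalence with re.findall() can be checked directly. The following is a minimal sketch with made-up sample strings; the variable names via_pandas and via_re are only illustrative.

```python
import re
import pandas as pd

s = pd.Series(['tutorials', 'articles', 'Examples'])

# Vectorized call through the .str accessor
via_pandas = s.str.findall('t')

# The same search done element by element with the standard re module
via_re = s.apply(lambda text: re.findall('t', text))

print(via_pandas.tolist())
print(via_re.tolist())
```

Both print statements should show the same nested list, one inner list per input string.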

Syntax

Following is the syntax of the Pandas Series.str.findall() method −

Series.str.findall(pat, flags=0)

Parameters

The Series.str.findall() method accepts the following parameters −

pat − A string representing the pattern or regular expression to be searched for.
flags − An optional integer, default is 0 (no flags). Flags from the re module, such as re.IGNORECASE, modify the pattern matching behavior.
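As a small illustration of the flags parameter, the sketch below runs the same pattern with and without re.IGNORECASE. The sample strings are made up for this illustration.

```python
import re
import pandas as pd

s = pd.Series(['Tutorials', 'TesT', 'examples'])

# Default (flags=0): only lowercase 't' matches
case_sensitive = s.str.findall('t')

# re.IGNORECASE: both 't' and 'T' match
case_insensitive = s.str.findall('t', flags=re.IGNORECASE)

print(case_sensitive.tolist())
print(case_insensitive.tolist())
```

Several flags can be combined with the bitwise OR operator, as with the re module itself, for example re.IGNORECASE | re.MULTILINE.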

Return Value

The Series.str.findall() method returns a Series or Index of lists. Each list contains all non-overlapping matches of the pattern found in the corresponding string: the matches are strings when the pattern has at most one capture group, and tuples of strings when it has several, following the same rule as re.findall(). If no matches are found, an empty list is returned for that element.
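The shape of the returned lists can be seen in a short sketch. The sample data below is made up, and it includes a missing value on the assumption that, as with other .str methods, a missing entry stays missing instead of becoming an empty list.

```python
import pandas as pd

s = pd.Series(['a1 b2', 'c3', None, 'no digits'])

# No capture groups: each match is the whole matched substring
whole = s.str.findall(r'[a-z]\d')

# Two capture groups: each match is a tuple with one item per group
pairs = s.str.findall(r'([a-z])(\d)')

print(whole)
print(pairs)

# True only for the row whose input was missing
print(whole.isna().tolist())
```

Note the difference between the third and fourth rows: a missing input gives a missing result, while a string with no matches gives an empty list.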

Example 1

This example demonstrates finding all occurrences of the substring 't' in each string element in a Series.

import pandas as pd

# Create a Series of strings
s = pd.Series(['tutorials', 'articles', 'Examples'])

# Find all occurrences of the substring 't' in each string
result = s.str.findall('t')

print("Input Series:")
print(s)
print("\nOccurrences of 't':")
print(result)

When we run the above code, it produces the following output −

Input Series:
0    tutorials
1     articles
2     Examples
dtype: object

Occurrences of 't':
0    [t, t]
1       [t]
2        []
dtype: object

An empty list [] indicates that there are no occurrences of the pattern in the element.

Example 2

This example demonstrates finding all occurrences of a pattern using a regular expression. Here, we look for every 't' followed by any single character (the '.' matches any character except a newline).

import pandas as pd

# Create a Series of strings
s = pd.Series(['tutorials', 'testing', 'test cases'])

# Find all substrings starting with 't' followed by any character
result = s.str.findall(r't.')

print("Input Series:")
print(s)
print("\nOccurrences of pattern 't.':")
print(result)

When we run the above code, it produces the following output −

Input Series:
0     tutorials
1       testing
2    test cases
dtype: object

Occurrences of pattern 't.':
0    [tu, to]
1    [te, ti]
2    [te, t ]
dtype: object

Each list in the output holds the two-character substrings that match the pattern 't.'. In the last row, 't ' is a 't' followed by a space.
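Because the result is a Series of lists, it can be post-processed with other Pandas methods. The following sketch, using made-up sample strings, counts the matches per row with str.len() and flattens them to one match per row with explode(); neither step is part of findall() itself.

```python
import pandas as pd

s = pd.Series(['tutorials', 'test', 'cases'])

matches = s.str.findall(r't.')

# Number of matches in each row (length of each list)
counts = matches.str.len()

# One match per row; the original index label is repeated,
# and a row with an empty list becomes a single missing value
flat = matches.explode()

print(counts)
print(flat)
```

In 'test', the final 't' is not matched because no character follows it.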

Example 3

This example demonstrates applying the Series.str.findall() method to a DataFrame column. We search the 'Email' column for all occurrences of the domain 'tutorialspoint.com'.

import pandas as pd

# Create a DataFrame 
df = pd.DataFrame({
    'Email': ['user1@example.com', 'info@tutorialspoint.com', 'contact@website.org']
})

# Find all occurrences of 'tutorialspoint.com' in the 'Email' column.
# The dot is escaped so it matches a literal '.' rather than any character.
result = df['Email'].str.findall(r'tutorialspoint\.com')

print("Input DataFrame:")
print(df)
print("\nOccurrences of 'tutorialspoint.com':")
print(result)

When we run the above code, it produces the following output −

Input DataFrame:
                     Email
0        user1@example.com
1  info@tutorialspoint.com
2      contact@website.org

Occurrences of 'tutorialspoint.com':
0                      []
1    [tutorialspoint.com]
2                      []
Name: Email, dtype: object

The output shows that the pattern 'tutorialspoint.com' is found in the second email address only.
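Since pat is always interpreted as a regular expression, an unescaped '.' in a domain name matches any character. The sketch below shows the difference using re.escape() from the standard library; the second address is a made-up example chosen to trigger the looser match.

```python
import re
import pandas as pd

emails = pd.Series(['info@tutorialspoint.com', 'info@tutorialspoint-com.net'])

# Unescaped: '.' matches any character, so 'tutorialspoint-com' also matches
unescaped = emails.str.findall('tutorialspoint.com')

# Escaped: re.escape() turns '.' into a literal dot
escaped = emails.str.findall(re.escape('tutorialspoint.com'))

print(unescaped.tolist())
print(escaped.tolist())
```

Escaping by hand, as in r'tutorialspoint\.com', gives the same result as re.escape() for this pattern.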
