Pandas Series.str.contains() Method



The Series.str.contains() method in Pandas tests whether a pattern or regular expression is contained in each string of a Series or Index. It returns a boolean Series or Index indicating, element by element, whether the pattern was found.

This method is useful for filtering and identifying strings that match a specific pattern, which can be a literal string or a regular expression.
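As a minimal sketch of the filtering use case, the boolean result can serve as a mask to select matching rows. The Series name and sample values here are made up for illustration, and na=False is passed on the assumption that missing values should count as non-matches.

```python
import pandas as pd
import numpy as np

# Hypothetical sample data, for illustration only
animals = pd.Series(['panda', 'dog', 'frog', np.nan])

# na=False treats missing values as "no match", so the mask is purely boolean
mask = animals.str.contains('og', na=False)

# Keep only the elements where the mask is True
matches = animals[mask]
print(matches)
```

Without na=False, the mask could contain a missing value, which generally cannot be used directly for boolean indexing.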

Syntax

Following is the syntax of the Pandas Series.str.contains() method −

Series.str.contains(pat, case=True, flags=0, na=None, regex=True)

Parameters

The Series.str.contains() method accepts the following parameters −

  • pat − The character sequence or regular expression to search for.

  • case − A boolean indicating if the search should be case-sensitive. Default is True.

  • flags − An integer value for regex flags from the re module. Default is 0 (no flags).

  • na − A scalar value used as the result for missing values. When it is not specified, missing values yield numpy.nan for object dtype and pandas.NA for StringDtype.

  • regex − A boolean indicating if the pattern should be treated as a regex. Default is True.
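The following sketch illustrates how the case, flags, and na parameters described above might be combined. The sample data is hypothetical, and the variable names are chosen only for this illustration.

```python
import re
import pandas as pd
import numpy as np

# Hypothetical sample data with mixed case and a missing value
s = pd.Series(['Panda', 'DOG', 'house and python', np.nan])

# case=False makes the search case-insensitive
ci = s.str.contains('dog', case=False)

# flags=re.IGNORECASE gives the same matches when the pattern is a regex
flagged = s.str.contains('dog', flags=re.IGNORECASE, regex=True)

# na=False replaces the result for missing values with False
filled = s.str.contains('dog', case=False, na=False)

print(filled)
```

Passing na=False is a common choice when the result is going to be used as a filter, since it leaves only True and False values.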

Return Value

The Series.str.contains() method returns a Series or Index of boolean values indicating whether the given pattern is present in each string element of the Series or Index.

Example 1

This example demonstrates the basic usage of the Series.str.contains() method by checking whether the string 'og' is present in each element of a Series.

import pandas as pd
import numpy as np

# Create a Series of strings
s1 = pd.Series(['panda', 'dog', 'house and python', '23', np.nan])

# Check if 'og' is present in each string (literal match, not regex)
result = s1.str.contains('og', regex=False)

print("Input Series:")
print(s1)
print("\nSeries after calling str.contains('og', regex=False):")
print(result)

When we run the above code, it produces the following output −

Input Series:
0               panda
1                 dog
2    house and python
3                  23
4                 NaN
dtype: object

Series after calling str.contains('og', regex=False):
0    False
1     True
2    False
3    False
4      NaN
dtype: object

Example 2

This example demonstrates how to use the str.contains() method on the Index of a Series to identify labels that contain the sub-string '26'.

import pandas as pd
import numpy as np

# Create a Series with string labels as its index
s = pd.Series([1, 2, 3, 4, 5], index=['panda', 'dog', 'house and python', '26.0', np.nan])

# Check if '26' is present in each string (literal match, not regex)
result = s.index.str.contains('26', regex=False)

print("Input Series:")
print(s)
print("\nIndex after calling str.contains('26', regex=False):")
print(result)

Following is the output of the above code −

Input Series:
panda               1
dog                 2
house and python    3
26.0                4
NaN                 5
dtype: int64

Index after calling str.contains('26', regex=False):
Index([False, False, False, True, nan], dtype='object')

Example 3

In this example, we apply the Series.str.contains() method with a regular expression to match any string containing 'house' or 'dog'.

import pandas as pd
import numpy as np

# Create a Series of strings
s1 = pd.Series(['panda', 'dog', 'house and python', '23', np.nan])

# Check if 'house' or 'dog' is present in each string (regex match)
result = s1.str.contains('house|dog', regex=True)

print("Input Series:")
print(s1)
print("\nSeries after calling str.contains('house|dog', regex=True):")
print(result)

Output of the above code is as follows −

Input Series:
0               panda
1                 dog
2    house and python
3                  23
4                 NaN
dtype: object

Series after calling str.contains('house|dog', regex=True):
0    False
1     True
2     True
3    False
4      NaN
dtype: object
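
As a further sketch of how the regex parameter changes the result, the snippet below searches the same data once as a regular expression and once as a literal string. The sample values are hypothetical and were picked to contain the regex metacharacters '.' and '|'.

```python
import pandas as pd

# Hypothetical sample data containing regex metacharacters
values = pd.Series(['3.50', '3450', 'a|b', 'ab'])

# As a regex, '.' matches any character, so '3.5' also matches '3450'
as_regex = values.str.contains('3.5', regex=True)

# As a literal, the characters '3.5' must appear exactly
as_literal = values.str.contains('3.5', regex=False)

# '|' means alternation in a regex but is an ordinary character in a literal search
pipe_literal = values.str.contains('a|b', regex=False)

print(as_regex.tolist())
print(as_literal.tolist())
print(pipe_literal.tolist())
```

When the search text comes from user input or contains punctuation, regex=False is usually the safer choice.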