Pandas Series.str.casefold() Method



The Series.str.casefold() method in Pandas casefolds each string in a Series or Index. Casefolding is a more aggressive form of lowercasing intended for text normalization: it removes case distinctions that lower() can leave in place, which makes it well suited to case-insensitive comparisons and to handling text uniformly.

This method is equivalent to Python's built-in str.casefold() method and is typically used to standardize text data in data analysis tasks.
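To see what "more aggressive" means in practice, the short sketch below compares str.lower() with str.casefold() on the German sharp s. The sample words are chosen only for illustration.

```python
import pandas as pd

# Two spellings of the same word; they differ only by case and by 'ß' vs 'SS'
s = pd.Series(['Straße', 'STRASSE'])

lowered = s.str.lower()      # 'ß' is already lowercase, so it is left alone
folded = s.str.casefold()    # casefolding expands 'ß' to 'ss'

print(lowered.tolist())      # ['straße', 'strasse']
print(folded.tolist())       # ['strasse', 'strasse']
```

After casefolding, both values are identical, so an equality check treats them as the same word; after lowercasing they still differ.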

Syntax

Following is the syntax of the Pandas Series.str.casefold() method −

Series.str.casefold()

Parameters

The Pandas Series.str.casefold() method does not accept any parameters.

Return Value

The Series.str.casefold() method returns a Series or Index of the same shape, where each string has been casefolded. For most text the result matches lowercasing, but some characters fold differently; for example, the German 'ß' becomes 'ss'.
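As a small illustration of the return value, the sketch below calls the method on both an Index and a Series. The labels and the missing entry are made up for this example; it shows that the result keeps the same type, length, and index labels, and that a missing value stays missing instead of raising an error.

```python
import pandas as pd

# Calling the method on an Index returns an Index
idx = pd.Index(['Mon', 'TUE', 'wEd'])
folded_idx = idx.str.casefold()
print(list(folded_idx))              # ['mon', 'tue', 'wed']

# Calling it on a Series returns a Series with the same index labels
s = pd.Series(['Hi', None, 'BYE'], index=['a', 'b', 'c'])
folded_s = s.str.casefold()
print(folded_s.index.tolist())       # ['a', 'b', 'c']
print(folded_s.isna().tolist())      # [False, True, False]
```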

Example 1

Let's look at a basic example to understand how the Series.str.casefold() method works −

import pandas as pd

# Create a Series
s = pd.Series(['Hi', 'WELCOME to', 'TUTORIALSPOINT'])

# Display the input Series
print("Input Series")
print(s)

# Apply the casefold method
print("Series after applying the casefold:")
print(s.str.casefold())

When we run the above program, it produces the following result −

Input Series
0                Hi
1        WELCOME to
2    TUTORIALSPOINT
dtype: object

Series after applying the casefold:
0                hi
1        welcome to
2    tutorialspoint
dtype: object

Example 2

In this example, we'll apply the Series.str.casefold() method to a single column of a DataFrame −

import pandas as pd

# Create a DataFrame
df = pd.DataFrame({'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'], 'Subject': ['Math', 'English', 'Science', 'Music', 'Games']})

# Print the original DataFrame
print("Original DataFrame:")
print(df)

# Apply the casefold method to the 'Day' column
df['Day'] = df['Day'].str.casefold()

# Print the modified DataFrame
print("Modified DataFrame:")
print(df)

Following is the output of the above code −

Original DataFrame:
   Day  Subject
0  Mon     Math
1  Tue  English
2  Wed  Science
3  Thu    Music
4  Fri    Games

Modified DataFrame:
   Day  Subject
0  mon     Math
1  tue  English
2  wed  Science
3  thu    Music
4  fri    Games
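Overwriting the column discards the original casing. If that casing still matters for display, one option is to store the casefolded values in a separate column and compare on that. The sketch below assumes a hypothetical helper column named 'Day_key' and uses made-up data with inconsistent casing.

```python
import pandas as pd

# Sample data with inconsistent casing in 'Day' (made up for this sketch)
df = pd.DataFrame({'Day': ['Mon', 'TUE', 'mon'],
                   'Subject': ['Math', 'English', 'Science']})

# 'Day_key' is a hypothetical helper column; 'Day' itself is left unchanged
df['Day_key'] = df['Day'].str.casefold()

# Flag rows whose day matches another row once case is ignored
dupes = df['Day_key'].duplicated(keep=False)

print(df['Day'].tolist())       # ['Mon', 'TUE', 'mon']
print(df['Day_key'].tolist())   # ['mon', 'tue', 'mon']
print(dupes.tolist())           # [True, False, True]
```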

Example 3

Let's see another example where we apply Series.str.casefold() to every string column of a DataFrame at once −

import pandas as pd

# Create a DataFrame with mixed-case text
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'CHARLIE', 'david'], 'Role': ['Admin', 'user', 'MANAGER', 'staff']})

# Print the original DataFrame
print("Original DataFrame:")
print(df)

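# Apply casefold to every string column (here, both 'Name' and 'Role').
# is_string_dtype also matches pandas' dedicated string dtypes, not only object
df = df.apply(lambda x: x.str.casefold() if pd.api.types.is_string_dtype(x) else x)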

# Print the modified DataFrame
print("Modified DataFrame:")
print(df)

Output of the above code is as follows −

Original DataFrame:
      Name     Role
0    Alice    Admin
1      Bob     user
2  CHARLIE  MANAGER
3    david    staff

Modified DataFrame:
      Name     Role
0    alice    admin
1      bob     user
2  charlie  manager
3    david    staff
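Because casefolding is mainly useful for case-insensitive comparison, the same data can also be filtered without modifying it. The following is a minimal sketch; the variable wanted stands in for a hypothetical search term such as user input.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'CHARLIE', 'david'],
                   'Role': ['Admin', 'user', 'MANAGER', 'staff']})

# Hypothetical search term whose casing differs from the stored value
wanted = 'ADMIN'

# Casefold both sides so the comparison ignores case
mask = df['Role'].str.casefold() == wanted.casefold()

print(df.loc[mask, 'Name'].tolist())   # ['Alice']
```

The DataFrame keeps its original casing; only the temporary values used for the comparison are casefolded.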