Pandas Series.str.encode() Method



The Series.str.encode() method in Pandas is used to encode character strings in a Series or Index into byte strings using the specified encoding. This method is useful for converting text data into encoded formats for storage or transmission.

This method applies Python's built-in str.encode() to each element, which provides an easy way to encode text data within a Pandas Series or Index.
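The relationship to the built-in str.encode() can be sketched as follows. This is a minimal illustration, not part of the original tutorial; the sample words and variable names are chosen here for demonstration only.

```python
import pandas as pd

# Sample data chosen for illustration
words = pd.Series(['pandas', 'naïve', 'café'])

# Vectorized form: a single call on the .str accessor
via_accessor = words.str.encode('utf-8')

# Element-wise form using Python's built-in str.encode()
via_builtin = [w.encode('utf-8') for w in words]

# Both approaches should yield the same byte strings
print(via_accessor.tolist() == via_builtin)
```

The accessor form avoids the explicit loop and keeps the original index labels on the result.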

Syntax

Following is the syntax of the Pandas Series.str.encode() method −

Series.str.encode(encoding, errors='strict')

Parameters

The Series.str.encode() method accepts the following parameters −

  • encoding − A string representing the name of the encoding to use for encoding the text.

  • errors − An optional string specifying how encoding errors should be handled. The default is 'strict', which raises a UnicodeEncodeError on encoding errors. Other options include 'ignore', 'replace', 'backslashreplace', and 'namereplace'.
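To see how the errors parameter changes the result, the sketch below encodes strings containing non-ASCII characters with the 'ascii' codec under several handlers. The sample strings and variable names are assumptions made for this illustration.

```python
import pandas as pd

# Two of these strings contain characters that ASCII cannot represent
ser = pd.Series(['café', 'naïve', 'plain'])

# 'strict' (the default) raises UnicodeEncodeError on the first bad character
try:
    ser.str.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'ignore' silently drops characters that cannot be encoded
ignored = ser.str.encode('ascii', errors='ignore')

# 'replace' substitutes a question mark for each such character
replaced = ser.str.encode('ascii', errors='replace')

# 'backslashreplace' substitutes a backslash escape sequence
escaped = ser.str.encode('ascii', errors='backslashreplace')

print(strict_failed)
print(ignored.tolist())
print(replaced.tolist())
print(escaped.tolist())
```

Note that 'ignore' and 'replace' lose information, so the original text cannot be recovered from their output.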

Return Value

The Series.str.encode() method returns a Series or Index, matching the type of the calling object, whose elements are the encoded byte strings (Python bytes objects). The result has the object dtype.
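The following sketch checks the return type for both a Series and an Index, and shows that Series.str.decode() can reverse the encoding. The data and variable names are assumptions made for this illustration.

```python
import pandas as pd

ser = pd.Series(['alpha', 'beta'])
idx = pd.Index(['alpha', 'beta'])

# Encoding a Series gives a Series; encoding an Index gives an Index
encoded_ser = ser.str.encode('utf-8')
encoded_idx = idx.str.encode('utf-8')

print(type(encoded_ser).__name__)
print(type(encoded_idx).__name__)

# Each element is a Python bytes object
print(type(encoded_ser.iloc[0]).__name__)

# Decoding with the same codec recovers the original strings
decoded = encoded_ser.str.decode('utf-8')
print(decoded.tolist())
```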

Example

In this example, we demonstrate the basic usage of the Series.str.encode() method by encoding a Series of strings using the 'ascii' encoding.

import pandas as pd

# Create a Series of strings
ser = pd.Series(['Tutorialspoint', '123', '$'])

# Encode strings using 'ascii' encoding
result = ser.str.encode('ascii')

print("Input Series:")
print(ser)
print("\nSeries after calling str.encode('ascii'):")
print(result)

When we run the above code, it produces the following output −

Input Series:
0    Tutorialspoint
1               123
2                 $
dtype: object

Series after calling str.encode('ascii'):
0    b'Tutorialspoint'
1               b'123'
2                 b'$'
dtype: object

Example

This example demonstrates how to use the Series.str.encode() method to encode a column of strings in a DataFrame using the 'utf-8' encoding.

import pandas as pd

# Create a DataFrame with a column of strings
df = pd.DataFrame({'COLUMN1': ['©', '€', '\U0001f1c0']})

# Encode strings using 'utf-8' encoding
result = df['COLUMN1'].str.encode('utf-8')

print("Input DataFrame:")
print(df)
print("\nDataFrame column after calling str.encode('utf-8'):")
print(result)

Following is the output of the above code −

Input DataFrame:
  COLUMN1
0       ©
1       €
2     

DataFrame column after calling str.encode('utf-8'):
0    b'\xc2\xa9'
1    b'\xe2\x82\xac'
2    b'\xf0\x9f\x87\x80'
Name: COLUMN1, dtype: object
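The output above has elements of different byte lengths because UTF-8 is a variable-width encoding. The sketch below compares character counts with byte counts; the sample characters are chosen here for illustration and are not taken from the example above.

```python
import pandas as pd

# One character each: 'A', copyright sign, euro sign, grinning face emoji
ser = pd.Series(['A', '\u00a9', '\u20ac', '\U0001F600'])

encoded = ser.str.encode('utf-8')

# Number of characters in each original string
char_counts = ser.str.len()

# Number of bytes in each encoded element, computed with the built-in len()
byte_counts = encoded.map(len)

print(char_counts.tolist())
print(byte_counts.tolist())
```

Every input here is a single character, yet the encoded forms occupy one to four bytes.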

Example

Here's another example demonstrating the use of the Series.str.encode() method to encode strings with special characters using the 'utf-8' encoding.

import pandas as pd

# Create a Series of strings with special characters
ser = pd.Series(['✔', '✓', '✜'])

# Encode strings using 'utf-8' encoding
result = ser.str.encode('utf-8')

print("Input Series:")
print(ser)
print("\nSeries after calling str.encode('utf-8'):")
print(result)

Following is the output of the above code −

Input Series:
0    ✔
1    ✓
2    ✜
dtype: object

Series after calling str.encode('utf-8'):
0    b'\xe2\x9c\x94'
1    b'\xe2\x9c\x93'
2    b'\xe2\x9c\x9c'
dtype: object