Pandas DataFrame to_xml() Method

The Python Pandas library provides the DataFrame.to_xml() method to convert the contents of a DataFrame into an XML document. The result can be written to a file or buffer, or returned as a string, and several parameters let you customize the structure and format of the XML output.

XML (Extensible Markup Language) is a widely used format for data interchange and storage. Using this method, you can generate structured and hierarchical representations of tabular data.

Syntax

The syntax of the to_xml() method is as follows −

DataFrame.to_xml(path_or_buffer=None, *, index=True, root_name='data', row_name='row', na_rep=None, attr_cols=None, elem_cols=None, namespaces=None, prefix=None, encoding='utf-8', xml_declaration=True, pretty_print=True, parser='lxml', stylesheet=None, compression='infer', storage_options=None)

Parameters

The Python Pandas to_xml() method accepts the following parameters −

path_or_buffer: The file path or buffer to write the XML output to. If None, the XML is returned as a string instead of being saved to a file.
index: Boolean that determines whether to include the DataFrame index in the XML. Defaults to True.
root_name: The name of the root element in the XML. Defaults to 'data'.
row_name: The name of each row element. Defaults to 'row'.
na_rep: String representation for missing values (NaN). Defaults to None, in which case missing values are written as empty elements.
attr_cols: List of columns to be written as attributes of the row elements.
elem_cols: List of columns to be written as child elements of the row element. By default, all columns are written as child elements.
namespaces: Dictionary mapping namespace prefixes to URIs, declared on the root element.
prefix: Namespace prefix for elements and attributes. It should be one of the keys in namespaces.
encoding: Encoding for the XML output. Defaults to 'utf-8'.
xml_declaration: Boolean to include the XML declaration at the top of the document. Defaults to True.
pretty_print: Whether to pretty-print the XML with indentation and line breaks. Defaults to True.
parser: Specifies the parser module to use, either 'lxml' or 'etree' (Python's built-in xml.etree). Default is 'lxml', which requires the lxml package to be installed.
stylesheet: An optional XSLT stylesheet used to transform the XML output. This option requires lxml.
compression: Compression options for the output file. The default, 'infer', picks the compression from the file extension.
storage_options: Additional options for a particular storage connection, such as a remote file system.
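
As a minimal sketch of how a few of these parameters combine, the snippet below drops the index, omits the XML declaration, and turns off pretty-printing. The sample data is made up for illustration, and parser='etree' is passed only so that the code runs without the optional lxml package.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Illustrative data, not taken from a real dataset
df = pd.DataFrame({'Name': ['Kiran', 'Priya'], 'Age': [25, 30]})

# index=False drops the <index> element from every row,
# xml_declaration=False omits the leading <?xml ...?> line,
# pretty_print=False writes the document without indentation.
xml_data = df.to_xml(index=False, xml_declaration=False,
                     pretty_print=False, parser='etree')
print(xml_data)

# Parse the result to inspect its structure
root = ET.fromstring(xml_data)
child_tags = [child.tag for child in root[0]]
print(root.tag, child_tags)
```

Parsing the string back with ElementTree is just a convenient way to check the structure; it is not required for normal use.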

Note − The to_xml() method does not support advanced XML features like DTD, CData, XSD schemas, or processing instructions. Namespaces are supported only at the root level, and the layout of the output can be transformed using the stylesheet parameter.
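
The root-level namespace support mentioned above can be sketched as follows. The prefix 'doc' and the URI 'https://example.com' are placeholder values chosen for this example, and parser='etree' is used only to avoid depending on lxml.

```python
import xml.etree.ElementTree as ET

import pandas as pd

df = pd.DataFrame({'Name': ['Kiran', 'Priya'], 'Age': [25, 30]})

# 'doc' and the URI are placeholder values for this sketch.
# The prefix must be one of the keys of the namespaces dictionary.
xml_data = df.to_xml(index=False,
                     namespaces={'doc': 'https://example.com'},
                     prefix='doc',
                     parser='etree')
print(xml_data)

# ElementTree reports namespaced tags in the form '{uri}localname'
root = ET.fromstring(xml_data.encode('utf-8'))
print(root.tag)
```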

Return Value

The to_xml() method returns the XML document as a string if path_or_buffer is None. Otherwise, it writes the XML to the given file path or buffer and returns None.
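
The two return behaviours can be checked with a short sketch like the one below. The file name 'people.xml' and the temporary directory are assumptions made for the example, and parser='etree' is used so that lxml is not required.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'Name': ['Kiran', 'Priya'], 'Age': [25, 30]})

# No path given: the XML document comes back as a string
as_string = df.to_xml(parser='etree')
print(type(as_string))

# Path given: the file is written and the call returns None
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'people.xml')  # hypothetical file name
    result = df.to_xml(path, parser='etree')
    file_exists = os.path.exists(path)

print(result, file_exists)
```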

Example: Converting a DataFrame to an XML String

Here is a basic example that demonstrates how the DataFrame.to_xml() method converts a pandas DataFrame to an XML string.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Convert to XML
xml_data = df.to_xml()

print('Output XML String:')
print(xml_data)

Output of the above code is as follows −

Output XML String:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>0</index>
    <Name>Kiran</Name>
    <Age>25</Age>
    <City>New Delhi</City>
  </row>
  <row>
    <index>1</index>
    <Name>Priya</Name>
    <Age>30</Age>
    <City>Hyderabad</City>
  </row>
  <row>
    <index>2</index>
    <Name>Naveen</Name>
    <Age>35</Age>
    <City>Chennai</City>
  </row>
</data>

Example: Saving DataFrame to an XML File

The following example shows how to save a Pandas DataFrame as an XML file. Here we pass the file path to the to_xml() method.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Save the XML to a file
df.to_xml('output.xml')
print("DataFrame saved to 'output.xml'")

The output of the above code is as follows −

DataFrame saved to 'output.xml'
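
To confirm what was written, the file can be read back. The following sketch writes to a temporary directory (an assumption, so that nothing is left behind on disk), prints the file contents, and reloads them with pd.read_xml(). Both calls use parser='etree' so that lxml is not required.

```python
import os
import tempfile

import pandas as pd

data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.xml')

    # index=False keeps the index out of the file, so the
    # reloaded DataFrame has the same three columns
    df.to_xml(path, index=False, parser='etree')

    # Show the raw XML that was written
    with open(path, encoding='utf-8') as xml_file:
        print(xml_file.read())

    # Load the file back into a DataFrame
    restored = pd.read_xml(path, parser='etree')

print(restored)
```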

Example: Customizing Root and Row Names while Converting a DataFrame to XML

The following example demonstrates specifying custom names for the root and row elements of an XML string created from a Pandas DataFrame. For this, you can use the root_name and row_name parameters of the to_xml() method.

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Customizing the root and row names
xml_data = df.to_xml(root_name='employees', row_name='employee')

print('Output XML String with custom root and row names:')
print(xml_data)

When we run the above program, it produces the following result −

Output XML String with custom root and row names:
<?xml version='1.0' encoding='utf-8'?>
<employees>
  <employee>
    <index>0</index>
    <Name>Kiran</Name>
    <Age>25</Age>
    <City>New Delhi</City>
  </employee>
  <employee>
    <index>1</index>
    <Name>Priya</Name>
    <Age>30</Age>
    <City>Hyderabad</City>
  </employee>
  <employee>
    <index>2</index>
    <Name>Naveen</Name>
    <Age>35</Age>
    <City>Chennai</City>
  </employee>
</employees>

Example: Converting a DataFrame to XML with Missing Values

This example demonstrates using the na_rep parameter to handle missing values in the DataFrame while converting it to XML.

import pandas as pd

# Create a DataFrame
data = {'Name': [None, 'Priya', 'Naveen'], 'Age': [25, None, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Convert to XML with missing value representation
xml_data = df.to_xml(na_rep='NA')

print('Output XML String with missing value representation:')
print(xml_data)

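Output of the above code is as follows. Because the Age column contains a missing value, pandas stores it as floating-point numbers, so the remaining ages appear as 25.0 and 35.0 −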

Output XML String with missing value representation:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>0</index>
    <Name>NA</Name>
    <Age>25.0</Age>
    <City>New Delhi</City>
  </row>
  <row>
    <index>1</index>
    <Name>Priya</Name>
    <Age>NA</Age>
    <City>Hyderabad</City>
  </row>
  <row>
    <index>2</index>
    <Name>Naveen</Name>
    <Age>35.0</Age>
    <City>Chennai</City>
  </row>
</data>
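
For comparison, here is a small sketch of what happens when na_rep is left at its default of None: the missing value is written as an empty element. The data is illustrative, and parser='etree' is used so that lxml is not required.

```python
import xml.etree.ElementTree as ET

import pandas as pd

df = pd.DataFrame({'Name': [None, 'Priya'], 'City': ['New Delhi', 'Hyderabad']})

# na_rep is not passed, so the missing Name has no text content
xml_data = df.to_xml(index=False, parser='etree')
print(xml_data)

# The <Name> element of the first row exists but is empty
root = ET.fromstring(xml_data.encode('utf-8'))
first_name = root[0].find('Name')
print(first_name is not None, first_name.text)
```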

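Example: Specifying Columns as Attributes in the XML

This example converts a Pandas DataFrame to XML with a specific column written as an attribute of each row element. When only attr_cols is given, the index and the listed columns become attributes, and the columns that are not listed (here Age and City) are left out of the output.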

import pandas as pd

# Create a DataFrame
data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Convert columns to attributes
xml_data = df.to_xml(attr_cols=['Name'])

print('Output XML String with specific columns as attributes:')
print(xml_data)

The output of the above code is as follows −

Output XML String with specific columns as attributes:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row index="0" Name="Kiran"/>
  <row index="1" Name="Priya"/>
  <row index="2" Name="Naveen"/>
</data>
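
To keep the remaining columns in the output, attr_cols can be combined with elem_cols. The following is a minimal sketch of that combination; it sets index=False so that only the listed columns appear, and uses parser='etree' so that lxml is not required.

```python
import xml.etree.ElementTree as ET

import pandas as pd

data = {'Name': ['Kiran', 'Priya', 'Naveen'], 'Age': [25, 30, 35], 'City': ['New Delhi', 'Hyderabad', 'Chennai']}
df = pd.DataFrame(data)

# Name becomes an attribute of each row element,
# while Age and City stay as child elements
xml_data = df.to_xml(index=False,
                     attr_cols=['Name'],
                     elem_cols=['Age', 'City'],
                     parser='etree')
print(xml_data)

# Inspect the first row: its attributes and its child elements
root = ET.fromstring(xml_data.encode('utf-8'))
first_row = root[0]
children = [(child.tag, child.text) for child in first_row]
print(first_row.attrib, children)
```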
