Session - 7 Data Operations in A File Using Pandas

Uploaded by

mouneshyatham99
Copyright
© All Rights Reserved

Session-7: Data file operations using pandas

Aim: Develop a Python program that reads data from a CSV file and applies various
operations using the Pandas library.

Software requirement: Python with the Pandas library (and python-docx for the Word
document example)

Program:

import pandas as pd
# Read data from a CSV file (replace 'data.csv' with your file path)
df = pd.read_csv('data.csv')
# Display the first few rows of the DataFrame
print("First 5 rows:")
print(df.head())
# Basic statistics
print("\nSummary Statistics:")
print(df.describe())
# Filtering data
filtered_df = df[df['Age'] > 25]
# Sorting data
sorted_df = df.sort_values(by='Age', ascending=False)
# Grouping and aggregation
grouped_df = df.groupby('Department')['Salary'].mean()
# Adding a new column
df['Salary Increased'] = df['Salary'] * 1.1
# Save the modified DataFrame to a new CSV file
df.to_csv('modified_data.csv', index=False)
# Pivot table: mean salary by department and gender
pivot_table = df.pivot_table(index='Department', columns='Gender',
                             values='Salary', aggfunc='mean')

# Display the results
print("\nFiltered DataFrame:")
print(filtered_df)
print("\nSorted DataFrame:")
print(sorted_df)
print("\nGrouped DataFrame:")
print(grouped_df)
print("\nDataFrame with Added Column:")
print(df)
print("\nPivot Table:")
print(pivot_table)
# This program reads data from a CSV file and applies various operations (filtering,
# sorting, grouping, adding a new column, and creating a pivot table) using Pandas.
# The CSV file must contain the columns 'Age', 'Department', 'Gender' and 'Salary'.
# Make sure to replace 'data.csv' with the path to your CSV file, and adjust the
# operations as needed for your specific data and requirements.
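
To try the program without an existing data set, a small sample file can be generated first. The sketch below is only an illustration: the records are made up, the file is written to a temporary folder, and the check for required columns is an added assumption about what the program needs, not part of the original listing.

```python
import os
import tempfile

import pandas as pd

# Made-up sample records; the column names match those used by the program.
sample = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dana", "Eli", "Fay"],
    "Age": [23, 31, 28, 45, 25, 38],
    "Gender": ["F", "M", "M", "F", "M", "F"],
    "Department": ["HR", "IT", "IT", "HR", "Sales", "Sales"],
    "Salary": [30000, 52000, 48000, 61000, 35000, 44000],
})

# Write to a temporary folder so that no existing file is overwritten.
csv_path = os.path.join(tempfile.mkdtemp(), "data.csv")
sample.to_csv(csv_path, index=False)

# Read the file back, as the program does.
df = pd.read_csv(csv_path)

# Check that every column the program relies on is present.
required = ["Age", "Department", "Gender", "Salary"]
missing = [col for col in required if col not in df.columns]
if missing:
    raise KeyError(f"CSV is missing required columns: {missing}")

# Two of the operations from the program, applied to the sample data.
filtered_df = df[df["Age"] > 25]
grouped_df = df.groupby("Department")["Salary"].mean()
print(filtered_df)
print(grouped_df)
```

Checking the column names up front gives a clearer error message than the KeyError Pandas raises later if, for example, the 'Salary' column is absent.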

Theory: Pandas is designed for structured data sources such as CSV files, Excel
workbooks, and databases, so it cannot read a Word document directly. The text of a
.docx file can, however, be extracted with the python-docx library and loaded into a
DataFrame, after which the usual Pandas operations apply. Note that python-docx reads
the .docx format only; an older .doc file must first be saved as .docx.
First, install the python-docx library:
pip install python-docx
The following Python program reads the paragraphs of a Word document, converts them
to a DataFrame, and applies some operations using Pandas:

import pandas as pd
from docx import Document

# Read data from a Word document (replace 'document.docx' with your file path)
document = Document('document.docx')
# Extract the text of each paragraph in the Word document
text = []
for paragraph in document.paragraphs:
    text.append(paragraph.text)
# Create a DataFrame from the extracted text
df = pd.DataFrame({'Text': text})
# Display the first few rows of the DataFrame
print("First 5 rows:")
print(df.head())
# Basic statistics (for a text column: count, unique, top and freq)
print("\nSummary Statistics:")
print(df.describe())
# Filter data: keep paragraphs containing 'keyword' (replace with your search term)
filtered_df = df[df['Text'].str.contains('keyword')]
# Save the filtered DataFrame to a new CSV file
filtered_df.to_csv('filtered_data.csv', index=False)
# Display the filtered DataFrame
print("\nFiltered DataFrame:")
print(filtered_df)
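
The keyword filter above is case-sensitive and treats the search term as a regular expression. The sketch below shows optional arguments of `str.contains` that relax this. The sample lines are made up and stand in for extracted paragraphs, and the `None` entry is an assumed missing value (for example, an empty field read back from a CSV file, which Pandas loads as NaN).

```python
import pandas as pd

# Made-up lines standing in for paragraphs extracted from a document.
df = pd.DataFrame(
    {"Text": ["Keyword in title", "no match here", None, "a keyword (again)"]}
)

# case=False ignores letter case, na=False treats missing text as "no match",
# and regex=False matches the search term literally.
mask = df["Text"].str.contains("keyword", case=False, na=False, regex=False)
filtered_df = df[mask]
print(filtered_df)
```

Without `na=False`, a missing value yields a missing entry in the mask, which cannot be used reliably for row selection.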
Result: A Python program was developed that reads data from a CSV file and applies
various operations using the Pandas library.
