CW MD Jahid Hasan 2024
CW MD Jahid Hasan 2024
CW MD Jahid Hasan 2024
EXPLANATORY NOTE
COURSE WORK: CSV (Comma Separated Values) Module in
Python
Class: Basics of programming
___________________
MD JAHID HASAN
________________________
Verified By- Ph.D. E.A. Anikeev
Voronezh – 2024
1
Tabel of Content
Introduction …………………………………………………………..…………. 3
1. Working with CSV files in Python ...……………………………………………. 3
2. Analysis of information sources on CSV Module in Python
2.1. Python Documentation ……………………………………………………… 4
2.2. Free Code Camp …………………………………………………………….. 8
2.3. Geeks for geeks ……………………………………………………………… 9
2.4. Study Tonight ……………………………………………………………….. 11
3. Description of the program
3.1. Introduction …………………………………………………………………. 13
3.2. Objective ……………………………………………………………………. 13
3.3. Required Component ……………………………………………………….. 13
3.4. Program Functionality ……………………………………………………… 14
3.5. Algorithm …………………………………………………………………… 14
3.6. Block Diagram ……………………………………………………………… 15
4. Description of the programming language tools
4.1. Program Code ………………………………………………………………. 16
4.2. Output - Program results ……………………………………………………. 16
4.3. Describe Output …………………………………………………………….. 17
4.4. Functions Used ……………………………………………………………… 17
4.5. Libraries Used ………………………………………………………………. 17
Conclusion ……………………………………………………………………….. 18
Literature …………………………………………………………………………. 19
Appendix – Program Code ……………………………………………………….. 20
2
Introduction
The so-called CSV (Comma Separated Values) format is the most common import and
export format for spreadsheets and databases. CSV format was used for many years prior
to attempts to describe the format in a standardized way in RFC 4180. The lack of a
well-defined standard means that subtle differences often exist in the data produced and
consumed by different applications. These differences can make it annoying to process
CSV files from multiple sources. Still, while the delimiters and quoting characters vary, the
overall format is similar enough that it is possible to write a single module which can
efficiently manipulate such data, hiding the details of reading and writing the data from the
programmer.
The csv module implements classes to read and write tabular data in CSV format. It allows
programmers to say, “write this data in the format preferred by Excel,” or “read data from
this file which was generated by Excel,” without knowing the precise details of the CSV
format used by Excel. Programmers can also describe the CSV formats understood by
other applications or define their own special-purpose CSV formats.
The csv module’s reader and writer objects read and write sequences. Programmers can
also read and write data in dictionary form using the DictReader and DictWriter classes.
Return a reader object that will process lines from the given csvfile. A csvfile must be an
iterable of strings, each in the reader’s defined csv format. A csvfile is most commonly a
file-like object or list. If csvfile is a file object, it should be opened with newline=''. An
optional dialect parameter can be given which is used to define a set of parameters specific
to a particular CSV dialect. It may be an instance of a subclass of the Dialect class or one
of the strings returned by the list_dialects() function. The other optional fmtparams
keyword arguments can be given to override individual formatting parameters in the
current dialect. For full details about the dialect and formatting parameters, see section
Dialects and Formatting Parameters.
Each row read from the csv file is returned as a list of strings. No automatic data type
conversion is performed unless the QUOTE_NONNUMERIC format option is specified
(in which case unquoted fields are transformed into floats).
3
1. Working with CSV files in Python
Python is one of the important fields for data scientists and many programmers to handle a
variety of data. CSV (Comma-Separated Values) is one of the prevalent and accessible file
formats for storing and exchanging tabular data.
Reading from a CSV file is done using the reader object. The CSV file is opened as a text
file with Python’s built-in open() function, which returns a file object. In this example, we
first open the CSV file in READ mode, file object is converted to csv.reader object and
further operation takes place. Code and detailed explanation is given below.
the CSV file is opened using the open() method in ‘r’ mode(specifies read mode while
opening a file) which returns the file object then it is read by using the reader() method of
CSV module that returns the reader object that iterates throughout the lines in the specified
CSV document.
CSV (Comma Separated Values) is a simple file format used to store tabular data, such as a
spreadsheet or database. A CSV file stores tabular data (numbers and text) in plain text.
Each line of the file is a data record. Each record consists of one or more fields, separated
by commas. The use of the comma as a field separator is the source of the name for this file
format. For working CSV files in Python, there is an inbuilt module called CSV.
4
2. Analysis of information sources on CSV Module
in Python
2.1. Python Documentation
The csv module implements classes to read and write tabular data in CSV format. It allows
programmers to say, “write this data in the format preferred by Excel,” or “read data from
this file which was generated by Excel,” without knowing the precise details of the CSV
format used by Excel. Programmers can also describe the CSV formats understood by
other applications or define their own special-purpose CSV formats.
The csv module’s reader and writer objects read and write sequences. Programmers can
also read and write data in dictionary form using the DictReader and DictWriter classes.
Module Contents
5
A short usage example:
6
A short usage example:
import csv
with open('eggs.csv', 'w', newline='') as csvfile:
spamwriter = csv.writer(csvfile, delimiter=' ',
quotechar='|', quoting=csv.QUOTE_MINIMAL)
spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
To make it easier to specify the format of input and output records, specific formatting
parameters are grouped together into dialects. A dialect is a subclass of the Dialect class
containing various attributes describing the format of the CSV file. When creating reader
or writer objects, the programmer can specify a string or a subclass of the Dialect class as
the dialect parameter. In addition to, or instead of, the dialect parameter, the programmer
can also specify individual formatting parameters, which have the same names as the
attributes defined below for the Dialect class.
Dialect.delimiter
A one-character string used to separate fields. It defaults to ','.
Dialect.doublequote
Controls how instances of quotechar appearing inside a field should themselves be quoted.
When True, the character is doubled. When False, the escapechar is used as a prefix to the
quotechar. It defaults to True.
7
2.2. Free Code Camp
CSV is an acronym for comma-separated values. It's a file format that you can use to store
tabular data, such as in a spreadsheet. You can also use it to store data from a tabular
database. We can refer to each row in a CSV file as a data record. Each data record consists
of one or more fields, separated by commas.This article shows you how to use the Python
built-in module called csv to create CSV files. In order to fully comprehend this tutorial,
you should have a good understanding of the fundamentals of the Python programming
language.
The csv module has two classes that you can use in writing data to CSV. These classes are:
You can use the csv.writer class to write data into a CSV file. The class returns a writer object,
which you can then use to convert data into delimited strings.To ensure that the newline
characters inside the quoted fields interpret correctly, open a CSV file object with newline=''.
import csv
writer.writerow(field)
writer.writerow(["Oladele Damilola", "40", "Nigeria"])
writer.writerow(["Alina Hricko", "23", "Ukraine"])
writer.writerow(["Isabel Walter", "50", "United Kingdom"])
8
2.3. Geeks for geeks
A CSV (Comma Separated Values) file is a form of plain text document that uses a
particular format to organize tabular information. CSV file format is a bounded text
document that uses a comma to distinguish the values. Every row in the document is a data
log. Each log is composed of one or more fields, divided by commas. It is the most popular
file format for importing and exporting spreadsheets and databases.
Using csv.reader()
At first, the CSV file is opened using the open() method in ‘r’ mode(specifies read mode
while opening a file) which returns the file object then it is read by using the reader()
method of CSV module that returns the reader object that iterates throughout the lines in
the specified CSV document.
Note: The ‘with’ keyword is used along with the open() method as it simplifies exception
handling and automatically closes the CSV file.
Example: This code reads and prints the contents of a CSV file named ‘Giants.csv’ using
the csv module in Python. It opens the file in read mode, reads the lines, and prints them
one by one using a for loop. The csv.reader() function is used to read the CSV file, and the
data from each row is printed to the console.
import csv
with open('Giants.csv', mode ='r')as file:
csvFile = csv.reader(file)
for lines in csvFile:
print(lines)
9
Using csv.DictReader() class
It is similar to the previous method, the CSV file is first opened using the open() method then it
is read by using the DictReader class of csv module which works like a regular reader but maps
the information in the CSV file into a dictionary. The very first line of the file consists of
dictionary keys.
Example: This code reads and prints the contents of a CSV file named ‘Giants.csv’ using the
csv module with DictReader. It opens the file in read mode, reads the lines, and prints them one
by one. csv.DictReader() reads the CSV file and treats the first row as headers, creating a
dictionary for each row where the header values are the keys. The code prints each row as a
dictionary, making it easier to work with structured CSV data.
import csv
with open('Giants.csv', mode ='r') as file:
csvFile = csv.DictReader(file)
for lines in csvFile:
print(lines)
It is very easy and simple to read a CSV file using pandas library functions. Here read_csv()
method of pandas library is used to read data from CSV files.
Example: This code uses the pandas library to read and display the contents of a CSV file
named ‘Giants.csv.’ It reads the CSV file and stores it as a DataFrame using the
pandas.read_csv() function. Finally, it prints the entire DataFrame, which provides a
structured and tabular representation of the CSV data. This is a common approach when
working with tabular data in Python, as pandas offers powerful tools for data manipulation and
analysis.
import pandas
csvFile = pandas.read_csv('Giants.csv')
print(csvFile)
10
2.4. Study To Night
CSV stands for Comma Separated Values. The file uses a separator character called
delimiter to separate each value. CSV is a typical format for information exchange as it's
smaller, straightforward, and general. Each line of the file is a data record. The standard
format of the data record is defined by rows and columns. Each record comprises one or
more fields, separated by commas.
If we take a table having thousands of data, the .csv file has the ability to separate the values
using commas into distinguishable fields or columns. Normally, first-line tells the heading
or column name of data and after that actual data set is listed.
To read data from the CSV file, we must use the reader function to generate a reader object.
We use python’s open() function to open a text file, which returns a file object. This is then
passed to the reader.
The csv.reader() method returns a reader object which will iterate over each line in the given
CSV file. Each row read from the CSV file is returned as a list of strings.
import csv
Instead of printing a list of individual String elements, CSV data can be directly printed in
the form of an ordered dictionary. The first line of the CSV file is assumed to contain the
keys to use to build the dictionary.
The csv.DictReader() function creates an object that operates like a regular reader but maps
the information in each row to a dictionary whose keys are given by the optional fieldnames
parameter.
11
#import necessary modules
import csv
data = csv.DictReader(open("model.csv"))
for row in data:
print(row)
The CSV file has initial spaces, quotes around each entry, and uses a delimiter. The
csv.register_dialect() function is used to define a custom dialect.
The custom dialect requires a name in the form of a string. Other specifications can be done
either by passing a sub-class of Dialect class, or by individual formatting patterns.
import csv
While creating the reader object, we pass dialect='myDialect' to specify that the reader
instance must use that particular dialect. The advantage of using dialect is that it makes the
program more modular. CSV files and different ways to read a CSV file by using several
built-in modules and libraries. We used Dialect class, Dictionary Reader object, and CSV
Reader object. We used some custom parsing codes as well to parse the CSV file using
different text files and CSV files.
12
3. Description of the program
3.1. Introduction:
The CSV (Comma Separated Values) module in Python is a powerful and flexible library
designed to facilitate reading from and writing to CSV files. CSV files are a widely used
format for storing tabular data, where each line in the file represents a row, and columns
within that row are separated by commas or other delimiters.
3.2. Objective:
The objective of this document and program is to provide a concise and practical guide on
using Python's built-in `csv` module to read from and write to CSV files. It aims to
demonstrate how to efficiently handle CSV data by showcasing basic usage of
`csv.reader`, `csv.writer`, `csv.DictReader`, and `csv.DictWriter`, and by providing a
concrete example where data is filtered based on a condition. Additionally, the guide
highlights customization options for delimiters, quote characters, and line terminators,
enabling users to adapt the CSV handling to various formats and requirements.
13
3.4. Program Functionality:
The program reads data from an input CSV file, filters out rows where the age is less than
30, and writes the filtered data to a new output CSV file. This is achieved through the
following steps:
3.5. Algorithm:
1. Start
2. Define Function filter_csv(input_file, output_file):
● Open Input File:
1. Use with open(input_file, newline='') as infile to open the input CSV file
for reading.
● Open Output File:
1. Use with open(output_file, 'w', newline='') as outfile to open the output
CSV file for writing.
● Initialize CSV Reader:
1. Initialize csv.DictReader(infile) to read the input file as dictionaries.
● Initialize CSV Writer:
1. Initialize csv.DictWriter(outfile, fieldnames=reader.fieldnames) to write to
the output file using the same headers as the input file.
● Write Header to Output File:
1. Use writer.writeheader() to write the column headers to the output file.
● Filter and Write Rows:
1. For each row in reader:
● Check if int(row['Age']) >= 30.
● If the condition is true, write the row to the output file using
writer.writerow(row).
14
3. Main Execution:
● Call filter_csv Function:
1. Call filter_csv('input.csv', 'output.csv') with the appropriate file names.
● Print Completion Message:
1. Print "Filtered data has been written to output.csv" to indicate the process is
complete.
4. End
Start
-------------------------
-------------------------
Read input CSV file and open output CSV file for writing
-------------------------
-------------------------
-------------------------
-------------------------
End
if __name__ == "__main__":
filter_csv('input.csv', 'output.csv')
print("Filtered data has been written to output.csv")
Figure – 2 Simple Array Program in Python
4.2. Output - Program Result:
Name,Age,City Name,Age,City
Alice,30,New York Alice,30,New York
Bob,25,Los Angeles Charlie,35,Chicago
Charlie,35,Chicago Eve,40,Seattle
Diana,28,Houston
Eve,40,Seattle
16
4.3. Describe Output
filter_csv(input_file, output_file): Opens the input CSV file for reading and the output CSV
file for writing.
1. open(): Used to open files in different modes ('r' for reading, 'w' for writing, 'a' for
appending). It returns a file object.
2. csv.DictReader(): Creates a reader object that iterates over lines in the CSV file,
interpreting each line as a dictionary where the keys are column headers and the values
are the corresponding row values.
3. csv.DictWriter(): Creates a writer object that enables writing data to a CSV file in
dictionary format. It requires specifying the fieldnames (column names) for the CSV file.
4. writer.writeheader(): Writes the header (fieldnames) to the output CSV file.
5. writer.writerow(row): Writes a row of data (a dictionary) to the output CSV file.
17
Conclusion:
The CSV (Comma Separated Values) module in Python provides a powerful and flexible
tool for reading from and writing to CSV files, a common format for storing tabular data.
With its user-friendly interface and built-in functionalities, the csv module simplifies the
process of handling CSV files, offering classes like csv.reader and csv.writer for basic CSV
operations and csv.DictReader and csv.DictWriter for more advanced operations with
dictionary-based data. The module's versatility allows developers to handle different CSV
dialects and customize various aspects of CSV parsing and formatting, such as delimiters,
quoting characters, and line terminators. By leveraging the csv module, Python developers
can efficiently manipulate CSV data, making it an essential tool for data processing,
analysis, and manipulation tasks. Overall, the CSV module enhances productivity and
facilitates seamless interaction with CSV files, making it a valuable asset in Python
programming for handling tabular data. Source Analysis of this Online Websites: Various
online resources such as tutorials, forums, and programming websites were consulted to
gather information on CSV module in Python. These resources provided insights into
different techniques for array manipulation and best practices for writing efficient code.
Python Documentation: The official Python documentation, particularly the documentation
for lists and built-in functions, served as a valuable reference for understanding the syntax
and usage of various functions and methods used in the program. It provided detailed
explanations and examples that helped in implementing the desired functionalities. Based
on the insights gathered from the source analysis, functions were implemented for array
initialization, access, modification, arithmetic operations, and displaying array information.
These functions were designed to be simple, efficient, and easy to understand. The program
was tested iteratively to ensure that each function works as intended and produces the
expected output. Debugging was performed to identify and fix any errors or unexpected
behaviors encountered during testing. Efforts were made to optimize the code for clarity,
readability, and performance. List comprehensions were used for concise and efficient
implementation of array arithmetic operations. Comments and docstrings were added
throughout the code to explain the purpose and functionality of each function. This
documentation enhances code readability and makes it easier for other developers to
understand and maintain the code in the future. Overall, the development of the Python
Array Operations Program involved thorough research, careful implementation, and
rigorous testing to ensure its functionality and reliability.
18
Literature
1. Documentation of Python 3.12.2. The Python Module in CSV File Reading and
Writing.(Last updated on Apr 03, 2024) URL:
https://docs.python.org/3.12/library/csv.html (accessed: 26/05/2024).
2. Free Code Camp Dionysia Lemonaki Published Article About The Python Module in
CSV File Reading and Writing On (31 Jan 2022). The article covers arrays that you
create by importing the array module. URL:
https://www.freecodecamp.org/news/how-to-create-a-csv-file-in-python/ (accessed:
26/05/2024)
3. Geeks for Geeks Organization Published Article About The Python Module in CSV
File Reading and Writing (21 Nov 2023). The article covers arrays that you create
Integers and Doubles, Slicing, Removing and Searching module. URL:
https://www.geeksforgeeks.org/working-csv-files-python/ (accessed: 26/05/2024)
4. Study To Night Published Article About The Python Module in CSV File Reading and
Writing Using an initializer On (27 Oct 2023). URL:
https://www.studytonight.com/python-howtos/how-to-read-csv-to-list-in-python
(accessed: 26/05/2024)
19
Appendix - Program code
import csv
if __name__ == "__main__":
filter_csv('input.csv', 'output.csv')
print("Filtered data has been written to output.csv")
20