0% found this document useful (0 votes)
22 views8 pages

CSV Files

Download as docx, pdf, or txt
Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1/ 8

Introduction

CSV (Comma Separated Values) is a simple file format used to store


tabular data, such as a spreadsheet or database. A CSV file stores tabular
data (numbers and text) in plain text.

Each line of the file is a data record. Each record consists of one or more
fields, separated by commas.
For working CSV files in Python, there is an inbuilt module called
CSV.

CSV is a common format for data interchange as it is compact, simple, and


general. Many online services allow their users to export tabular data from
the website into a CSV file. Files of CSV will open into Excel, and nearly all
databases have a tool to allow import from CSV files.

It is important to know to work with CSV because we mostly rely on CSV data
in our day-to-day lives as data scientists.

The so-called CSV (Comma Separated Values) format is the most common
import and export format for spreadsheets and databases. The lack
of a well-defined standard means that subtle differences often exist in the
data produced and consumed by different applications. These differences
can make it annoying to process CSV files from multiple sources. Still, while
the delimiters and quoting characters vary, the overall format is similar
enough that it is possible to write a single module which can efficiently
manipulate such data, hiding the details of reading and writing the data from
the programmer.

The csv module’s reader and writer objects read and write
sequences.

Example (A CSV File):


Structure of CSV:

We have a file named “Salary_Data.csv.” The first line of a CSV file is the
header and contains the names of the fields/features.
After the header, each line of the file is an observation/a record. The values
of a record are separated by “comma.”

CSV File Characteristics


 One line for each record

 Comma separated fields

 Space-characters adjacent to commas are ignored

 Fields with in-built commas, double quotes are separated by double


quote characters.
When Use CSV?
 When data has a strict tabular structure

 To transfer large database between programs

 To import and export data to office applications

 To store, manage and modify shopping cart catalogue

Excel Vs CSV

Excel CSV
It stores data along with formatting It only stores data
and formulae
Extension is .xls or xlsx Extension is .csv
File cannot be opened with other File can be opened with other editors
editors like notepad like notepad
Consumes more memory Consumes less memory

Working with CSV files in Python


While we could use the built-in open() function to work with CSV files in
Python, there is a dedicated csv module that makes working with CSV files
much easier.

Before we can use the methods to the csv module, we need to


import the module first using:
Write:

import scv

Read a CSV File


Python provides various functions to read csv file. We are describing few
method of reading function.

Using csv.reader() function

In Python, the csv.reader() module is used to read the csv file. It takes each
row of the file and makes a list of all the columns.

Example 1: (Method 1)

import csv

file = open("C:\\Users\\Student\\Desktop\\Sample100.csv","r")

# Opening a file in read mode

print(file.read())

Output:

Example 2: (Method 2)

import csv

with open("C:\\Users\\Student\\Desktop\\Sample100.csv","r") as f1:

csv_read=csv.reader(f1) # csv.reader object to read the CSV file.

# Return a reader object which will iterate over lines in the given csvfile
for line in csv_read:

print(line)

Note: csv.reader return a reader object which will iterate over lines in the
given csvfile

Output:

Note: Read CSV file Having Tab Delimiter

import csv

with open('people.csv', 'r',) as file:

reader = csv.reader(file, delimiter = '\t')

for row in reader:

print(row)

Find total number of rows in the file

Instead of printing the data we can find out the number rows in a particular
data set.

Example:
import csv

with open("C:\\Users\\Student\\Desktop\\covideffects.csv","r") as f1:

csv_read=csv.reader(f1)

for line in csv_read:

pass

print("Total no. of rows: %d"%(csv_read.line_num))

Output:

Total no. of rows: 111439

Write data in CSV File

We can also write any new and existing CSV files in Python by using the
csv.writer() module. It is similar to the csv.reader() module and also has two
methods, i.e., writer function or the Dict Writer class.

It presents two functions, i.e., writerow() and writerows().


The writerow() function only write one row, and the writerows() function
write more than one row.

Example:

import csv

with open("C:\\Users\\Student\\Desktop\\covideffects.csv","w") as f1:

csv_write = csv.writer(f1)

csv_write.writerow(['Madhu', '2021', '15/12/2021', 'Wednesday', 'All', 'All',


'All', '$', '1000000', '168000000'])

print(("Data is written successfully"))

Output:

Data is written successfully

Note:

To add data (append) in already existing file open file in “a” mode.
Close the file

.close() method is used to close the opened file. Once it is closed, we cannot
perform any operations on it.

file.close()
Implementing the above code using with() statement:
Syntax: with open(filename, mode) as alias_filename:

Modes:

‘r’ – to read an existing file,


‘w’ – to create a new file if the given file doesn’t exist and write to it,
‘a’ – to append to existing file content,

CSV Advantages

 CSV is human readable and easy to create and manage


 CSV files can be read using almost any text editor.
 CSV provides a straightforward information schema

 CSV is faster to handle

 CSV is smaller in size

 CSV is considered to be standard format

 CSV Files can be easily imported into most databases


 Text in CSV Files is human-readable.

CSV Disadvantages

 CSV allows to move most basic data only. Complex configurations


cannot be imported and exported this way
 There is no distinction between text and numeric values

 Problems with importing CSV into SQL (no distinction between NULL
and quotes)

 Poor support of special characters

 Lack of universal standard

You might also like