
Data Analytics Systems & Algorithms COMP 20036: Week 10


Data Analytics Systems & Algorithms

COMP 20036
Week 10
File / Data Handling

• How to Open a Text File


• How to Create a Text File
• How to Append Data to a File
• How to Read a File
• How to Read a File line by line
• File Modes in Python
File Handling in Python

file_object = open("filename", "mode")

filename: the name (or path) of the file to open.
mode: a string that specifies how the file is opened, for example
for reading or for writing.
File Modes in Python

'r' Opens the file for reading. This is the default mode.
'w' Opens the file for writing.
If the file does not exist, it creates a new file.
If the file exists, it truncates the file.
'x' Creates a new file. If the file already exists, the operation fails.
'a' Opens the file in append mode; data is written at the end of the file.
If the file does not exist, it creates a new file.
't' Opens the file in text mode. This is the default.
'b' Opens the file in binary mode.
'+' Opens the file for reading and writing (updating). It is combined
with another mode, as in 'r+', 'w+' or 'a+'.
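The behaviour of the 'x', 'a' and 'w' modes listed above can be checked with a short experiment. This is a minimal sketch that works in a temporary directory; the file name modes_demo.txt is a made-up name for illustration only.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(path, "x") as f:
        f.write("first\n")
    try:
        open(path, "x")
        x_failed = False
    except FileExistsError:
        x_failed = True

    # 'a' keeps the existing content and writes at the end.
    with open(path, "a") as f:
        f.write("second\n")
    with open(path, "r") as f:
        after_append = f.read()

    # 'w' truncates the file before writing.
    with open(path, "w") as f:
        f.write("replaced\n")
    with open(path, "r") as f:
        after_write = f.read()

print(x_failed)
print(repr(after_append))
print(repr(after_write))
```

The second open with 'x' raises FileExistsError, the append keeps the first line, and the final 'w' leaves only the replacement text.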
How to Create a Text File

f = open("filename.txt", "w+")
for i in range(10):
    # in text mode "\n" is translated to the platform's line ending
    f.write("This is line %d\n" % (i + 1))
f.close()

The for loop runs over a range of 10 numbers, and the write function
writes one line to the file on each pass. Calling close() afterwards
makes sure the data is flushed to disk.

\content\gdrive\My Drive\Colab Notebooks\
How to Append Data to a File

f = open("filename.txt", "a+")
for i in range(2):
    f.write("Appended line %d\n" % (i + 1))
f.close()

The "a" in the mode opens the file for appending and creates the file
if it does not exist. The plus sign additionally allows reading from
the file. In our case the file already exists, so the new lines are
added after the existing content.
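As a small illustration of append mode with the plus sign, the sketch below appends to an existing file and then reads it back through the same file object. It assumes a temporary file named append_demo.txt, which is a made-up name. Note that seek(0) is needed before reading, because writes in append mode always leave the position at the end of the file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "append_demo.txt")  # hypothetical file name

    # Create the file with one line of existing content.
    with open(path, "w") as f:
        f.write("This is line 1\n")

    # "a+" appends and also allows reading.
    with open(path, "a+") as f:
        f.write("Appended line 1\n")
        f.seek(0)  # move back to the start before reading
        lines = f.read().splitlines()

print(lines)
```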
How to Read a File

f = open("filename.txt", "r")
if f.mode == 'r':
    contents = f.read()
def main():
    f = open("filename.txt", "w+")
    # f = open("filename.txt", "a+")
    for i in range(10):
        f.write("This is line %d\n" % (i + 1))
    f.close()
    # Open the file again and read the contents
    f = open("filename.txt", "r")
    if f.mode == 'r':
        contents = f.read()
        print(contents)
    # or, readlines reads the individual lines into a list;
    # rewind first, because read() left the position at the end of the file
    f.seek(0)
    fl = f.readlines()
    for x in fl:
        print(x, end="")  # each line already ends with a newline
    f.close()

if __name__ == "__main__":
    main()
Summary

• Python allows you to read, write and delete files


• Use the function open("filename","w+") to create a file. The +
tells the Python interpreter to open the file with read and write
permissions.
• To append data to an existing file, use
open("filename", "a").
• Use the read function to read the ENTIRE contents of a file.
• Use the readlines function to read the content of the file line
by line into a list.
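Two points from the summary can be sketched in a few lines: reading a file line by line, and deleting a file. This is only one possible approach. It uses a with block, which closes the file automatically, iterates over the file object directly, and removes the file with os.remove. The file name lines_demo.txt is a made-up name.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lines_demo.txt")  # hypothetical file name

    # Write three lines; the with block closes the file for us.
    with open(path, "w") as f:
        for i in range(3):
            f.write("This is line %d\n" % (i + 1))

    # Iterating over the file object yields one line at a time.
    stripped = []
    with open(path, "r") as f:
        for line in f:
            stripped.append(line.rstrip("\n"))

    # Delete the file and confirm that it is gone.
    os.remove(path)
    still_exists = os.path.exists(path)

print(stripped)
print(still_exists)
```

Iterating over the file object avoids loading the whole file into memory, which matters for large files.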
File / Data Handling

• What is a CSV file?


• CSV Sample File.
• Python CSV Module
• CSV Module Functions
• Reading CSV Files
• Reading as a Dictionary
• Writing to CSV Files
• Reading CSV Files with Pandas
• Writing to CSV Files with Pandas
What is a CSV file?

• Data in the form of tables is often stored as CSV, which stands for
"comma-separated values".
• This is a text format intended for the presentation of tabular data.
• Each line of the file is one line of the table.
• The values of individual columns are separated by a separator symbol - a
comma (,), a semicolon (;) or another symbol.
• CSV can be easily read and processed by Python.
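The separator symbol mentioned above can be illustrated with Python's standard csv module. The sketch below parses the same two rows once with a comma and once with a semicolon; the sample text is made up for illustration, and io.StringIO stands in for a real file.

```python
import csv
import io

# The same table, written with two different separators.
comma_text = "name,year\nPython,1991\n"
semicolon_text = "name;year\nPython;1991\n"

# The comma is the default separator for csv.reader.
comma_rows = list(csv.reader(io.StringIO(comma_text)))

# Any other separator is passed as the delimiter argument.
semicolon_rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))

print(comma_rows)
print(semicolon_rows)
```

Both calls produce the same list of rows, so the choice of separator only affects how the file is written and parsed, not the data itself.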
Reading and Writing CSV Files in
Python using CSV Module & Pandas
• A CSV file is a type of plain text file that uses specific structuring to arrange
tabular data.
• CSV is a common format for data interchange as it's compact, simple and
general.
• Many online services allow their users to export tabular data from the website
into a CSV file.
• CSV files open in Excel, and nearly all databases have a tool to import
data from a CSV file.
• The standard format is defined by rows and columns of data.
• Each row is terminated by a newline, which begins the next row.
• Within a row, each column is separated by a comma.
Consider the following Table

Table Data

Programming language   Designed by          Appeared   Extension
Python                 Guido van Rossum     1991       .py
Java                   James Gosling        1995       .java
C++                    Bjarne Stroustrup    1985       .cpp

You can represent this table in CSV as below.

CSV Data

Programming language,Designed by,Appeared,Extension
Python,Guido van Rossum,1991,.py
Java,James Gosling,1995,.java
C++,Bjarne Stroustrup,1985,.cpp

As you can see, each row is on a new line, and each column is separated by a
comma. This is an example of what a CSV file looks like.
Python CSV Module
Python provides a csv module to handle CSV files. To read or write data, you loop through the rows of the CSV.
The module splits each row into its columns for you, so you do not need to call the split method yourself.

CSV Module Functions


In the csv module documentation you can find the following functions and constants:

csv.field_size_limit – return the maximum field size
csv.get_dialect – get the dialect associated with a name
csv.list_dialects – show all registered dialects
csv.reader – read data from a CSV file
csv.register_dialect – associate a dialect with a name
csv.writer – write data to a CSV file
csv.unregister_dialect – delete the dialect associated with a name from the dialect registry
csv.QUOTE_ALL – quote everything, regardless of type
csv.QUOTE_MINIMAL – quote only fields that contain special characters
csv.QUOTE_NONNUMERIC – quote all fields that are not numbers
csv.QUOTE_NONE – don't quote anything in the output
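The effect of the quoting constants is easiest to see by writing the same row several times. The sketch below is illustrative only: the helper name render and the sample row are made up, and the output goes to an in-memory buffer instead of a file. csv.QUOTE_NONE is left out because the sample row contains a comma, which that setting cannot write without an escape character.

```python
import csv
import io

# One field contains a comma, and one field is a number.
row = ["C++", "Stroustrup, Bjarne", 1985]

def render(quoting):
    """Write the sample row with the given quoting rule and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=quoting, lineterminator="\n").writerow(row)
    return buffer.getvalue()

minimal = render(csv.QUOTE_MINIMAL)
quote_all = render(csv.QUOTE_ALL)
nonnumeric = render(csv.QUOTE_NONNUMERIC)

print(minimal, end="")     # C++,"Stroustrup, Bjarne",1985
print(quote_all, end="")   # "C++","Stroustrup, Bjarne","1985"
print(nonnumeric, end="")  # "C++","Stroustrup, Bjarne",1985
```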
How to Read a CSV File

# import necessary modules
import csv

# newline='' lets the csv module handle line endings itself
with open('data.csv', 'rt', newline='') as f:
    data = csv.reader(f)
    for row in data:
        print(row)

To read data from a CSV file, use the reader function to create a reader object.

Iterating over the reader returns each row of the file as a list of its column values, all of them strings.
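Because the reader returns every value as a string, a common pattern is to take the header row off first and convert numeric columns explicitly. The following is a minimal sketch of that idea. It uses an in-memory sample instead of data.csv, and the variable names are chosen for illustration.

```python
import csv
import io

# In-memory stand-in for a CSV file with a header row.
sample = io.StringIO(
    "Programming language,Designed by,Appeared,Extension\n"
    "Python,Guido van Rossum,1991,.py\n"
    "Java,James Gosling,1995,.java\n"
)

reader = csv.reader(sample)
header = next(reader)   # the first row holds the column names
rows = list(reader)     # the remaining rows hold the data

# Values arrive as strings, so convert the year column to integers.
years = [int(row[2]) for row in rows]

print(header)
print(years)
```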
How to Read a CSV as a Dictionary

# import necessary modules
import csv

# the with block closes the file when the loop is done
with open("file2.csv", newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row)
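With DictReader, each row is a dictionary keyed by the column names from the first line of the file, so columns can be accessed by name instead of by position. A small sketch, assuming an in-memory sample in place of file2.csv:

```python
import csv
import io

# In-memory stand-in for a CSV file with a header row.
sample = io.StringIO(
    "Programming language,Designed by,Appeared,Extension\n"
    "Python,Guido van Rossum,1991,.py\n"
    "Java,James Gosling,1995,.java\n"
)

reader = csv.DictReader(sample)

# Look up columns by name to map each language to its designer.
designers = {row["Programming language"]: row["Designed by"] for row in reader}

# The column names are available once the first row has been read.
fieldnames = reader.fieldnames

print(designers)
print(fieldnames)
```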
How to write CSV File

# import necessary modules
import csv

# raw string, so the backslash in the Windows path is not treated as an escape
with open(r'X:\writeData.csv', mode='w', newline='') as file:
    writer = csv.writer(file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)

    # write the header row, then one row per language
    writer.writerow(['Programming language', 'Designed by', 'Appeared', 'Extension'])
    writer.writerow(['Python', 'Guido van Rossum', '1991', '.py'])
    writer.writerow(['Java', 'James Gosling', '1995', '.java'])
    writer.writerow(['C++', 'Bjarne Stroustrup', '1985', '.cpp'])

When you have a set of data that you would like to store in a CSV file, use the writer() function to create a
writer object. Each call to the writerow() function then writes one row (line) to the file.
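When the data is already held as a list of dictionaries, the csv module's DictWriter class can write the header and all rows in two calls. This is a sketch of an alternative to repeated writerow() calls, not a replacement for the example above. The sample data is shortened, and the output goes to an in-memory buffer instead of a file.

```python
import csv
import io

# Rows held as dictionaries keyed by column name.
rows = [
    {"Programming language": "Python", "Appeared": 1991},
    {"Programming language": "Java", "Appeared": 1995},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["Programming language", "Appeared"],
    lineterminator="\n",
)
writer.writeheader()    # writes the column names as the first row
writer.writerows(rows)  # writes every dictionary as one row

output = buffer.getvalue()
print(output, end="")
```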
Reading CSV Files with Pandas

• Pandas is an open-source library that allows you to
perform data manipulation in Python.
• Pandas provides an easy way to create, manipulate
and delete data.
• Reading a CSV file into a pandas DataFrame is
quick and easy:
#import necessary modules
import pandas
result = pandas.read_csv('data.csv')
print(result)
Writing to CSV Files with Pandas

from pandas import DataFrame

C = {'Programming language': ['Python', 'Java', 'C++'],
     'Designed by': ['Guido van Rossum', 'James Gosling', 'Bjarne Stroustrup'],
     'Appeared': ['1991', '1995', '1985'],
     'Extension': ['.py', '.java', '.cpp'],
     }
df = DataFrame(C, columns=['Programming language', 'Designed by',
                           'Appeared', 'Extension'])
# write here the path where the result file will be stored
df.to_csv(r'X:\pandaresult.csv', index=False, header=True)
print(df)
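Writing and reading with pandas can be checked together in a round trip. The sketch below assumes pandas is installed and uses a shortened version of the table. Calling to_csv without a path returns the CSV text as a string, which is then read back from an in-memory buffer instead of a file.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Programming language": ["Python", "Java"],
    "Appeared": [1991, 1995],
})

# Without a path argument, to_csv returns the CSV text.
csv_text = df.to_csv(index=False)

# Read the text back into a new DataFrame.
restored = pd.read_csv(io.StringIO(csv_text))

print(restored)
```

Setting index=False keeps the row index out of the file, so the restored DataFrame has the same two columns as the original.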
Conclusion

CSV files are widely used in software applications because they are easy to
read and manage, and their small size makes them relatively fast to
process and transmit.

The csv module provides functions and classes that let you read and write
CSV files easily. The official Python documentation describes more options
and related modules. CSV is a convenient format for saving, viewing, and
sending tabular data. It is not as hard to learn as it may seem at the
beginning, and with a little practice you'll master it.

Pandas is a great alternative for reading and writing CSV files.


Q&A Sessions

Thank You
