UNIT-03 FILES
Files
A file is simply a sequence of characters stored on your computer or
network.
One of the things that makes a file different from a string or list of
characters is that the file exists even after a program ends.
This makes a file useful for maintaining information that must be
remembered for a long period of time.
Within a Python program a file is represented by a file object.
This value does not actually hold the contents of the file; rather, the
value is a portal through which the user can access the contents of
the file.
A file value is used in three distinct steps: the file is opened, values
are read from or written to it, and the file is closed.
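The three steps can be sketched as follows. This is only an illustration: the scratch file name and the two lines of text are assumptions made for the example, and the file is created in a temporary directory so nothing of yours is touched.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "peas.txt")

# Writing follows the same three steps: open, write, close.
f = open(path, "w")
f.write("pease porridge hot\n")
f.write("pease porridge cold\n")
f.close()

# Reading the text back.
f = open(path)        # step 1: open (the mode defaults to "r", read)
contents = f.read()   # step 2: read the whole file as one string
f.close()             # step 3: close

print(contents)
```

After close() the file value can no longer be used for reading or writing, although the file itself remains on disk.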
txt = "Orange,,,,,ssqqqww....."
x = txt.rstrip(",.qsw")
print(x)    # prints: Orange
Warning! Opening a file for writing removes its old contents
Copy the four lines given earlier into a file named peas2.txt. Then try executing the
following two lines:
Open the file with Notepad or some other text editor. What has happened? Remember, opening a
file for writing causes the old contents of the file to be deleted. This is true even if no new values are
written into the file.
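The effect can be reproduced with a short sketch. It is one plausible form of the experiment, not necessarily the exact lines meant above: the scratch file and its contents are assumptions, and the append mode "a" is shown only as a contrast.

```python
import os
import tempfile

# Hypothetical scratch copy; the real peas2.txt is not touched.
path = os.path.join(tempfile.mkdtemp(), "peas2.txt")

with open(path, "w") as f:
    f.write("line one\nline two\n")
size_before = os.path.getsize(path)

# Open for writing and close again without writing anything.
f = open(path, "w")
f.close()
size_after = os.path.getsize(path)   # the old contents are gone

# Opening with "a" (append) keeps whatever is already in the file.
with open(path, "a") as f:
    f.write("added later\n")
with open(path) as f:
    final_contents = f.read()

print(size_before, size_after, repr(final_contents))
```

When old contents must be kept, open the file with mode "a" or read it before reopening it for writing.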
Rewriting Word Count Program
def freqCount(f):  # f is a file of input
    freq = {}
    line = f.readline()
    while line:
        words = line.split()
        for word in words:
            freq[word] = freq.get(word, 0) + 1
        line = f.readline()
    return freq

def main():
    f = open("text.txt")
    freq = freqCount(f)
    f.close()
    # now all words have been read
    for word in freq:
        # the count is an integer, so convert it before concatenating
        print(word + ' occurs ' + str(freq[word]) + ' times')
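Because freqCount only ever calls readline on its argument, it can be tried without a file on disk. The sketch below restates the function so that it is self-contained and feeds it an in-memory io.StringIO object; the sample text is an assumption made for illustration.

```python
import io

def freqCount(f):
    """Count how often each whitespace-separated word occurs in f."""
    freq = {}
    line = f.readline()
    while line:                       # readline returns "" at end of input
        for word in line.split():
            freq[word] = freq.get(word, 0) + 1
        line = f.readline()
    return freq

# In-memory stand-in for a text file (sample text is hypothetical).
sample = io.StringIO("the cat saw the dog\nthe dog ran\n")
freq = freqCount(sample)

for word in sorted(freq):
    print(word + " occurs " + str(freq[word]) + " times")
```

Any object that offers readline in the same way, including standard input, can be passed to the function unchanged.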
Operating System Commands
The operating system (such as Windows, Mac, or Unix) is normally
in charge of the management of files.
There are a number of useful operating system commands that can
be executed from within a Python program by including the os
module.
The two most useful commands are os.remove(name), which
deletes (removes) the named file, and os.rename(oldname,
newname), which renames a file.
>>> import os
>>> os.remove("gone.txt") # delete file named gone.txt
>>> os.rename("fred.txt", "alice.txt") # fred.txt becomes alice.txt
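A minimal sketch of both commands in a throwaway directory follows. The directory and the file contents are assumptions for the example; the file names mirror the ones used above.

```python
import os
import tempfile

# Work in a temporary directory so no real files are affected.
workdir = tempfile.mkdtemp()
gone = os.path.join(workdir, "gone.txt")
fred = os.path.join(workdir, "fred.txt")
alice = os.path.join(workdir, "alice.txt")

# Create the two files that the commands will act on.
for name in (gone, fred):
    f = open(name, "w")
    f.write("some text\n")
    f.close()

os.remove(gone)          # delete gone.txt
os.rename(fred, alice)   # fred.txt becomes alice.txt

remaining = sorted(os.listdir(workdir))
print(remaining)

# Removing a file that does not exist raises an error.
try:
    os.remove(gone)
except FileNotFoundError:
    removed_twice = False
else:
    removed_twice = True
```

Both commands act immediately and os.remove does not use a recycle bin, so it is worth checking the name before calling it.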
A file value can be used in a for statement. The resulting loop reads
from the file line by line, and assigns each line to the for variable.
f = open("peas.txt")
for line in f:
    # strings have no reverse() method; a slice with step -1 reverses the text
    print(line.rstrip("\n")[::-1])
f.close()
Standard I/O
import sys

def main():
    # invoke frequency program, reading from console input
    freq = freqCount(sys.stdin)
    # now all words have been read
    for word in freq:
        print(word + ' occurs ' + str(freq[word]) + ' times')
• A more subtle use of the sys module is to change these variables
(sys.stdin, sys.stdout, and sys.stderr), thereby altering the effect
of the standard functions.
• To see an example, execute the following program, and then
examine the files output.txt and error.txt.
import sys
sys.stdout = open('output.txt', 'w')
sys.stderr = open('error.txt', 'w')
print("see where this goes")
print(5/4)
print(7.0/0)         # raises ZeroDivisionError; the traceback is written to error.txt
sys.stdout.close()   # not reached, because the error above ends the program
sys.stderr.close()
• There are several other functions and variables defined in the
sys module. The function sys.exit("message") can be used to
terminate a running Python program.
• The variable sys.argv is a list of the command-line options
passed to a program.
• On systems that support command-line arguments these are
often used to pass information, such as file names, into a
program. Assume that echo.py is the following simple
program:
import sys
print(sys.argv)
The following might be an example execution:
$ python echo.py abc def
['echo.py', 'abc', 'def']
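Returning to the redirection of sys.stdout: once the variable has been changed, every later print goes to the new destination until the variable is changed back. The sketch below is one way to make the change temporary. It uses an in-memory io.StringIO buffer in place of output.txt, which is an assumption made so the example leaves no files behind.

```python
import io
import sys

buffer = io.StringIO()        # in-memory stand-in for output.txt
saved_stdout = sys.stdout     # remember the original destination

sys.stdout = buffer
try:
    print("see where this goes")
    print(5 / 4)
finally:
    sys.stdout = saved_stdout  # always restore, even if an error occurs

captured = buffer.getvalue()
print("captured:", repr(captured))
```

The try/finally pair guarantees that the original sys.stdout is put back, so output printed afterwards appears on the console again.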
Persistence and Pickle
• There is an alternative module that is also useful in saving
and restoring the values of Python variables.
• This module is, somewhat humorously, known as pickle.
(When you pickle a fruit or vegetable you are saving it for
long-term storage.)
• A more common name for pickling is serialization.
1. The pickle module supplies two functions, dump and load.
These can be used to save the contents of most Python
variables to a file and later restore their values.
2. The following is an example:
import pickle
…
obj = ... # create some Python value
f = open(filename, 'wb')   # pickle data is binary
pickle.dump(obj, f)        # the value comes first, then the file
f.close()
Later, perhaps in a different program or at a
different time, the contents of the variable
can be retrieved from the file as follows:
import pickle
…
f = open(filename, 'rb')   # open for reading, not writing
obj = pickle.load(f)
f.close()
Multiple objects can be saved and restored in the
same file. However, the user is responsible for
remembering the order in which values were saved.
Most Python objects can be saved and restored
using pickle and/or shelve.
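A complete round trip with two values might look like the sketch below. The file name and the sample values are assumptions for illustration. Note the argument order of dump (value first, then file) and the binary modes "wb" and "rb", which pickle requires in Python 3.

```python
import os
import pickle
import tempfile

# Hypothetical file in a temporary directory.
filename = os.path.join(tempfile.mkdtemp(), "values.pickle")

names = ["fred", "alice"]
counts = {"pease": 2, "porridge": 2}

# Save two values, one after the other.
f = open(filename, "wb")
pickle.dump(names, f)
pickle.dump(counts, f)
f.close()

# Restore them in the same order in which they were saved.
f = open(filename, "rb")
first = pickle.load(f)
second = pickle.load(f)
f.close()

print(first, second)
```

Each call to load returns the next value in the file, which is why the order of the dump calls has to be remembered. Only unpickle files from a source you trust, since loading a pickle can execute code.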
Example – File Sort
Consider the main program. Let us assume that the input is
contained in the file input.txt, and the output should go into file
output.txt.
At a high level, we can describe the algorithm as follows:
import os
# step 1: make all the temporary files
try:
    fin = open("input.txt")
except IOError:
    print('unable to open input.txt')
else:
    tlist = makeTempFiles(fin)
    fin.close()
    # step 2: merge temp files
    while len(tlist) > 1:
        mergeTwoIntoOne(tlist)
    # step 3: rename the remaining temp file
    tname = tlist.pop()
    os.rename(tname, "output.txt")
def makeTempFiles(fin):
    # read from fin and break into temp files
    tnames = []  # make empty list of temp files
    done = False
    while not done:
        tn = makeTempFileName()
        tnames.append(tn)
        fn = open(tn, "w")
        lines = []
        i = 0
        while not done and i < 100:
            i = i + 1
            line = fin.readline()
            if line:
                lines.append(line)
            else:
                done = True
        lines.sort()  # sort the lines just read (at most 100)
        fn.writelines(lines)
        fn.close()
    return tnames

def mergeTwoIntoOne(tlist):
    ta = tlist.pop(0)  # first file name
    tb = tlist.pop(0)  # second file name
    tn = makeTempFileName()  # make output file name
    tlist.append(tn)
    fa = open(ta)
    fb = open(tb)
    fn = open(tn, "w")
    mergeFiles(fa, fb, fn)
    fa.close()
    fb.close()
    os.remove(ta)  # remove temp files
    os.remove(tb)
    fn.close()
def mergeFiles(fa, fb, fn):
    # merge the contents of fa and fb into fn
    # step 1: merge as long as both files have lines
    linea = fa.readline()
    lineb = fb.readline()
    while linea and lineb:
        if linea < lineb:
            fn.write(linea)
            linea = fa.readline()
        else:
            fn.write(lineb)
            lineb = fb.readline()
    # step 2: write remaining lines
    # only one of the following will do anything
    while linea:
        fn.write(linea)
        linea = fa.readline()
    while lineb:
        fn.write(lineb)
        lineb = fb.readline()
• We have started from a high-level description of the original
problem, reduced each task to smaller problems, and then
repeatedly addressed each of the smaller problems until
everything is reduced to simple Python statements.
• All that is left is putting together the pieces, and verifying
that it works as it should.
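For comparison, the standard library offers heapq.merge, which merges already-sorted inputs in the same spirit as the merge step written by hand above. The sketch below is an alternative, not the method used in this unit. It uses in-memory io.StringIO objects in place of the temporary files, and the sample lines are assumptions.

```python
import heapq
import io

# In-memory stand-ins for two sorted temporary files (sample data is hypothetical).
fa = io.StringIO("apple\ncherry\n")
fb = io.StringIO("banana\ndate\nfig\n")
fn = io.StringIO()

# Iterating over a file yields its lines; heapq.merge interleaves
# the two sorted streams into one sorted stream.
fn.writelines(heapq.merge(fa, fb))

merged = fn.getvalue()
print(merged)
```

Like the hand-written version, this reads the inputs one line at a time instead of loading them whole, which is the point of sorting through temporary files.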
Reading from a URL
The urllib module provides a simple way to read the contents of a
file stored at a specific URL.
Its urlopen function returns an object that uses the same interface as a file.
import urllib.request   # in Python 3, urlopen lives in urllib.request
remotefile = urllib.request.urlopen("http://www.python.org")
line = remotefile.readline()
while line:
    print(line)   # each line arrives as bytes
    line = remotefile.readline()
remotefile.close()