File Handling

Let's explore how Python data can be stored in different file formats, and why file
handling is important: it is what lets a program create, read, update, and delete files.
Python's built-in open() function is the starting point for working with data in
various file formats, such as .txt, .json, .xml, .csv, .tsv, and .xlsx. Let's start by
understanding file handling in Python, focusing on the common .txt file format.

Let's look at a few file formats:

.txt (Text File)


● Description: Plain text files are one of the most common file types for storing
textual data. They contain unformatted text, meaning no special formatting
such as bold, italics, images, or other types of media embeddings.
● Use Cases: Ideal for storing notes, configuration instructions, or any form of
textual data that doesn't require formatting. Text files are universally
compatible across different operating systems and software.
● Characteristics: Easy to create, edit, and read with standard text editors. They
support a simple structure, making them highly accessible but limited in
capabilities for complex data representation.

.csv (Comma-Separated Values)


● Description: CSV files store tabular data (numbers and text) in plain text, with
each line representing a data record. Each record consists of one or more
fields, separated by commas.
● Use Cases: Commonly used for exporting and importing data to and from
spreadsheet programs like Microsoft Excel or Google Sheets. Ideal for data
manipulation, statistical analysis, and easy data exchange between different
applications.
● Characteristics: While the CSV format is simple and widely supported, it lacks
standardisation for complex data structures or definitions, which can lead to
inconsistencies in data interpretation across different systems.

.xlsx (Excel File Format)

● Description: An Excel file format used by Microsoft Excel, part of the Office
suite. It supports complex spreadsheets with multiple sheets, formulas,
charts, and rich formatting.
● Use Cases: Extensively used in business, finance, and academia for data
analysis, reporting, and complex data modeling. Excel files can store a vast
range of data types, from simple lists to complex statistical data sets.
● Characteristics: .xlsx files use the Office Open XML format, a ZIP-compressed
package of XML files, which keeps storage efficient. They support advanced
features like pivot tables, conditional formatting, and the ability to embed
images and other objects. This format is highly versatile but requires specific
software to fully utilize its capabilities.

.xml (eXtensible Markup Language)


● Description: XML is a markup language that defines a set of rules for
encoding documents in a format that is both human-readable and
machine-readable. It's designed to store and transport data.
● Use Cases: Widely used for representing complex data structures in web
services, configuration files, and document formats (like XHTML). It facilitates
data sharing across different systems, particularly those connected via the
internet.
● Characteristics: XML allows users to define their own tags, making it
extremely flexible in representing hierarchical data structures. It supports
metadata, attributes, and namespaces, which enhance data description and
complexity.

.json (JavaScript Object Notation)


● Description: JSON is a lightweight data interchange format that is easy for
humans to read and write, and easy for machines to parse and generate. It
uses text format to store and transport data objects consisting of
attribute-value pairs.
● Use Cases: Frequently used in web applications for data interchange between
clients and servers. It's also used for configuration files and data storage due
to its easy-to-use structure and wide language support.
● Characteristics: JSON supports arrays and objects, allowing for a hierarchical
organization of data. It is language-independent but uses conventions familiar
to programmers of the C-family of languages, including C, C++, C#, Java,
JavaScript, Perl, Python, and many others.

JSON in JavaScript

In JavaScript, objects are collections of key-value pairs where the keys are strings,
and the values can be strings, numbers, booleans, arrays, or even other objects. This
structure allows for the representation of complex data hierarchies in a readable and
easily accessible manner. JSON mirrors this structure, providing a textual
representation of data that can be easily converted to a JavaScript object. When we
say JSON is a "stringified" JavaScript object, we mean that it is a string
representation of a JavaScript object that follows the JSON format. This conversion
is done using the JSON.stringify() method in JavaScript, which takes a JavaScript
object and returns a JSON string.

JSON and Python Dictionaries


Similarly, in Python, dictionaries store data in key-value pairs, where keys are
unique and can be of any hashable type, such as strings and numbers, and values
can be anything: numbers, strings, lists, or even other dictionaries. This makes
Python dictionaries structurally similar to JavaScript objects. When working with
JSON in Python, the conversion involves serializing a Python dictionary into a JSON
string or deserializing a JSON string into a Python dictionary. This is achieved using
the json module in Python, with json.dumps() for serialization (dictionary to JSON
string) and json.loads() for deserialization (JSON string to dictionary). Because
JSON object keys are always strings, json.dumps() writes a numeric key such as 1
as "1".

File Handling in Python


File handling is a fundamental aspect of programming, enabling the manipulation of
file data. Python's open() function is pivotal for this purpose, providing a gateway to
file operations.

Syntax: open(filename, mode)

The mode parameter specifies the operation: read (r), append (a), write (w), create (x),
text (t), and binary (b). Here's a detailed look at each mode:
● "r" - Read: This is the default mode. It opens a file for reading and raises a
FileNotFoundError if the file does not exist. Useful for reading file contents
without modifying them.
● "a" - Append: Opens a file for appending at the end without truncating its
existing content. If the file doesn't exist, it's created. Ideal for adding data to
logs or data files.
● "w" - Write: Opens a file for writing, truncating the file first. If the file doesn't
exist, it's created. Use this mode for writing new data while removing old data.
● "x" - Create: Specifically for creating a new file. Raises a FileExistsError if the
file already exists, ensuring no data is overwritten accidentally.
● "t" - Text: The default data mode, combined with one of the modes above (so "r"
means "rt"). Files are read and written as strings.
● "b" - Binary: For binary mode operations. It treats files as binary, which is
essential for non-text files like images and executables.

Understanding the open() Function


To read from a file, you first need to open it using the open() function. The syntax for
opening a file is open('filename', 'mode').

If you do not specify the mode, Python assumes you want to read the file, hence the
default mode is 'r' (or 'rt', which stands for "read text"). Here's an example of opening
a file named reading_file_example.txt located in a directory named files:
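The original listing is not reproduced here, so what follows is only a minimal sketch of such a call. It assumes a scratch directory (created with tempfile purely so the example runs anywhere) and made-up sample content; only the names files and reading_file_example.txt come from the text.

```python
import os
import tempfile

# Assumption: build files/reading_file_example.txt in a scratch directory
# so the example runs anywhere; the sample content is made up.
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, "files"))
    path = os.path.join(base, "files", "reading_file_example.txt")
    with open(path, "w", encoding="utf-8") as setup:
        setup.write("This is an example to show how to open a file and read.\n")

    f = open(path)         # no mode given, so the file is opened as 'r' (read, text)
    print(f)               # e.g. <_io.TextIOWrapper name='...' mode='r' encoding='...'>
    opened_mode = f.mode   # 'r'
    f.close()              # release the file handle when done
```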

This operation returns a file object (<_io.TextIOWrapper>) containing details about
the file, such as its name, mode ('r'), and encoding (often 'UTF-8', although the
default depends on the platform).

When working with file paths in Python, especially on Windows, you need to be
mindful of the backslash (\) character because it is used to escape characters that
otherwise have a special meaning, like newline (\n) or tab (\t). To correctly specify a
Windows file path in the Python open() function, use double backslashes (\\) so that
each backslash is taken literally instead of starting an escape sequence.
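As a rough illustration, the snippet below compares a few ways of spelling the same Windows path. The path itself is hypothetical, and the raw-string and forward-slash spellings are additions not mentioned in the text above; they are shown only as commonly used alternatives.

```python
# Hypothetical Windows path, used only to compare the spellings.
escaped = "C:\\Users\\name\\notes.txt"   # each backslash doubled
raw = r"C:\Users\name\notes.txt"         # raw string: backslashes are kept literally
forward = "C:/Users/name/notes.txt"      # forward slashes are generally accepted too

print(escaped == raw)   # True: both spell the same path

# What goes wrong without escaping: \n and \t turn into control characters.
broken = "C:\new\table.txt"
print("\n" in broken, "\t" in broken)   # True True
```

None of these strings is opened here, so the snippet runs on any operating system.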

Reading from Files

Once a file is opened for reading, there are several methods to retrieve its contents:

● read() Method: Reads the entire content of the file into a single string.
Optionally, you can pass an integer to read(number) to specify the number of
characters to read.

To read just the first 10 characters, call read(10).
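Here is a small sketch of both calls, assuming a scratch file with made-up content so that it runs anywhere. It also shows that a second read() continues from where the first one stopped.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Assumption: sample file and content created just for this example.
    path = os.path.join(base, "reading_file_example.txt")
    with open(path, "w", encoding="utf-8") as setup:
        setup.write("This is an example to show how to open a file and read.\n")

    f = open(path, encoding="utf-8")
    first_ten = f.read(10)   # only the first 10 characters
    rest = f.read()          # everything after those 10 characters
    f.close()

print(repr(first_ten))   # 'This is an'
```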

● readline() Method: Reads a single line from the file each time it is called. If
the end of the file (EOF) is reached, readline() returns an empty string ('').
● readlines() Method: Reads all the lines in a file and returns them as a list of
strings.

Alternatively, you can use read().splitlines() to achieve a similar result without the
newline characters.
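The line-oriented methods can be compared side by side. This is a sketch on a made-up two-line file (an assumption for illustration), not the original listing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Assumption: a two-line sample file created just for this example.
    path = os.path.join(base, "reading_file_example.txt")
    with open(path, "w", encoding="utf-8") as setup:
        setup.write("first line\nsecond line\n")

    f = open(path, encoding="utf-8")
    line_one = f.readline()   # 'first line\n'
    line_two = f.readline()   # 'second line\n'
    at_eof = f.readline()     # '' because the end of the file was reached
    f.close()

    f = open(path, encoding="utf-8")
    with_newlines = f.readlines()   # ['first line\n', 'second line\n']
    f.close()

    f = open(path, encoding="utf-8")
    without_newlines = f.read().splitlines()   # ['first line', 'second line']
    f.close()
```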

Ensuring Files Are Closed


It's important to close files after you're done with them to free up system resources.
Using f.close() is a manual way to close files, but it's easy to forget. A more robust
approach is to use the with statement, which automatically closes the file once the
nested block of code finishes, even if an exception is raised inside it:
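A minimal sketch of the with statement follows, again on a made-up scratch file. The closed attribute of the file object is used to check the file's state inside and after the block.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Assumption: sample file created just for this example.
    path = os.path.join(base, "reading_file_example.txt")
    with open(path, "w", encoding="utf-8") as setup:
        setup.write("first line\nsecond line\n")

    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
        open_inside = not f.closed   # True: the file is open inside the block

    closed_after = f.closed          # True: closed automatically on leaving the block
```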

Append Mode ("a")


● Usage: The append mode is used to add new content to the end of a file
without removing the existing content. If the specified file does not exist,
Python will create a new file.
● Syntax: open('filename', 'a')
● Example: Appending text to an existing file.
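A sketch of such an append, assuming the file already holds some made-up content without a trailing newline (the scratch directory is there only so the snippet runs anywhere):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "reading_file_example.txt")
    # Assumption: pre-existing content that does not end with a newline.
    with open(path, "w", encoding="utf-8") as setup:
        setup.write("Existing content.")

    with open(path, "a", encoding="utf-8") as f:
        f.write("\nThis text has to be appended at the end.")

    with open(path, encoding="utf-8") as f:
        content = f.read()

print(content)
```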
In this example, the text "This text has to be appended at the end." is added to
the end of the reading_file_example.txt file. The leading \n makes the new text
start on its own line when the existing content does not already end with a newline.

Write Mode ("w")


● Usage: The write mode is used to write content to a file. If the file already
exists, its content will be erased before the new content is written. If the file
does not exist, Python will create it.
● Syntax: open('filename', 'w')
● Example: Creating a new file or overwriting an existing file.
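A sketch of write mode, assuming a scratch directory so the file is known not to exist beforehand:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "writing_file_example.txt")
    existed_before = os.path.exists(path)   # False in a fresh scratch directory

    with open(path, "w", encoding="utf-8") as f:
        f.write("This text will be written in a newly created file.")

    with open(path, encoding="utf-8") as f:
        content = f.read()

print(content)
```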

In this example, writing_file_example.txt is either created (if it doesn't already
exist) or overwritten with the text "This text will be written in a newly created
file.".

Practical Examples
● Appending Log Entries: Use append mode to add new log entries to a log file,
preserving the existing entries.

● Generating Reports: Use write mode to create or overwrite a report file with
fresh data.
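Both patterns can be sketched with two small helpers. The function names, file names, and sample data below are hypothetical and chosen only for illustration.

```python
import os
import tempfile
from datetime import datetime


def log_event(log_path, message):
    """Append one timestamped entry; earlier entries are preserved."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat()} {message}\n")


def write_report(report_path, rows):
    """Create the report, or overwrite it with fresh data."""
    with open(report_path, "w", encoding="utf-8") as report:
        for name, value in rows:
            report.write(f"{name}: {value}\n")


with tempfile.TemporaryDirectory() as base:
    log_path = os.path.join(base, "app.log")
    report_path = os.path.join(base, "report.txt")

    log_event(log_path, "started")
    log_event(log_path, "finished")             # added after the first entry
    write_report(report_path, [("total", 3)])
    write_report(report_path, [("total", 5)])   # replaces the earlier report

    with open(log_path, encoding="utf-8") as f:
        log_lines = f.read().splitlines()
    with open(report_path, encoding="utf-8") as f:
        report_text = f.read()
```

After running, the log holds both entries while the report holds only the latest data.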

"x" - Create Mode

● Purpose: The "x" mode is used for creating a new file. If the file already exists,
the operation will fail, making it a safe way to ensure that new files are created
without overwriting existing files.
● Use Case: Ideal for scenarios where you want to ensure that a new file is
being created and not accidentally overwriting an existing file.
● Example:
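The original listing is missing, so here is a hedged sketch using a hypothetical file name. The first open() creates the file; the second attempt on the same path raises FileExistsError.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "new_file.txt")   # hypothetical file name

    # First attempt: the file does not exist yet, so it is created.
    with open(path, "x", encoding="utf-8") as f:
        f.write("Created with exclusive mode.")

    # Second attempt: the file now exists, so "x" refuses to open it.
    try:
        with open(path, "x", encoding="utf-8") as f:
            f.write("This would overwrite the file.")
        second_attempt = "created"
    except FileExistsError:
        second_attempt = "refused"

    with open(path, encoding="utf-8") as f:
        content = f.read()   # still the original text

print(second_attempt)   # refused
```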

"t" - Text Mode


● Purpose: The "t" mode stands for text mode, which is the default mode when
no binary mode is specified. It's used for reading and writing standard text
files.
● Use Case: Suitable for dealing with text files, such as .txt or .csv files, where
the content is intended to be readable as standard text.
● Example:
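A sketch of the kind of listing described, with made-up content (an assumption) written to text_file.txt inside a scratch directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "text_file.txt")

    # "wt" spells out write + text; plain "w" behaves the same way.
    with open(path, "wt", encoding="utf-8") as f:
        f.write("Hello, text mode!")   # sample content is an assumption

    # "rt" spells out read + text; the data comes back as a str.
    with open(path, "rt", encoding="utf-8") as f:
        content = f.read()

print(type(content))   # <class 'str'>
```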

In this example, text_file.txt is opened in text mode for writing. The mode "wt"
explicitly specifies writing in text mode, although just "w" would suffice since "t" is
the default.

"b" - Binary Mode

● Purpose: The "b" mode is used for binary files. This mode treats the file as a
binary blob, which is essential for non-text files like images, videos, executable
files, or even text files encoded in a specific encoding that you want to read or
write byte by byte.
● Use Case: Necessary for processing binary data, such as reading or writing
images, audio files, or any file where the data must not be treated as text.
● Example:
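The sketch below stands in for the missing listing. As an assumption, it does not use a real image: it writes only the 8-byte PNG signature to image.png so that the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "image.png")

    # Assumption: a stand-in file holding only the 8-byte PNG signature.
    with open(path, "wb") as setup:
        setup.write(b"\x89PNG\r\n\x1a\n")

    with open(path, "rb") as f:
        data = f.read()     # bytes, not str

    first_byte = data[0]    # indexing bytes yields an int

print(type(data))    # <class 'bytes'>
print(first_byte)    # 137
```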

In this example, image.png is opened in binary read mode. The file's content is read
as bytes, and the example prints the first byte of the file.

Understanding JSON in Python


JSON (JavaScript Object Notation) is a lightweight data interchange format that is
easy for humans to read and write and easy for machines to parse and generate. It is
widely used in web applications for exchanging data between a client and server, as
well as in many programming and configuration contexts because it is
language-independent. JSON represents data as text in the form of key-value pairs
and arrays.

Files with .json Extension


A file with a .json extension stores data in JSON format. It is essentially a text file
that contains data structured in the JSON notation. This data format is particularly
useful for storing configurations, serializing and deserializing complex data
structures, and facilitating data exchange between different languages and
platforms.

Example of JSON Data


A Python dictionary holding a person's details has a direct JSON counterpart: the
same key-value pairs written as a string, which could be stored in a .json file.
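The original record is not available, so the person below is hypothetical. The sketch places a Python dictionary next to the same data written by hand as JSON text; note how False and None become false and null.

```python
# Hypothetical person record; the original listing is not available.
person_dict = {
    "name": "Alex",
    "age": 30,
    "is_student": False,
    "skills": ["Python", "SQL"],
    "address": None,
}

# The same data written by hand as JSON text (a plain Python string).
person_json = (
    '{"name": "Alex", "age": 30, "is_student": false, '
    '"skills": ["Python", "SQL"], "address": null}'
)

print(type(person_dict))   # <class 'dict'>
print(type(person_json))   # <class 'str'>
```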

Converting JSON to a Dictionary


To convert a JSON string back into a Python dictionary, the json module provides the
loads() function. This process is known as deserialization or decoding.
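A minimal sketch of deserialization with a hypothetical JSON string. It also shows the json.JSONDecodeError raised for text that is not valid JSON, here because of single quotes.

```python
import json

# Hypothetical JSON text, for example as read from a .json file.
person_json = '{"name": "Alex", "age": 30, "is_student": false}'

person = json.loads(person_json)   # JSON string -> Python dictionary
print(person["name"])              # Alex
print(type(person))                # <class 'dict'>

# Invalid JSON raises json.JSONDecodeError.
try:
    json.loads("{'name': 'Alex'}")   # single quotes are not valid JSON
    parse_error = None
except json.JSONDecodeError as error:
    parse_error = str(error)

print(parse_error is not None)     # True
```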

Converting a Dictionary to JSON

Converting a Python dictionary to a JSON string is done using the dumps() function
from the json module. This process is known as serialization or encoding.
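A minimal sketch of serialization with a hypothetical dictionary. The indent argument of json.dumps() is optional and is used here only to produce a more readable string.

```python
import json

# Hypothetical dictionary used only for illustration.
person = {"name": "Alex", "age": 30, "is_student": False, "address": None}

compact = json.dumps(person)            # Python dictionary -> JSON string
pretty = json.dumps(person, indent=4)   # same data, spread over several lines

print(compact)   # {"name": "Alex", "age": 30, "is_student": false, "address": null}
print(pretty)
```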

Question 1: File Handling Basics

Task: Write a Python script that reads a file called input.txt and prints its content to
the console. Use a try-except block to handle the case where the file does not exist,
printing an error message: "File not found."

Question 2: Writing to Files


Task: Create a Python program that asks the user for their name and age, then writes
this information to a file named user_info.txt in the format: "Name: [user's name],
Age: [user's age]". If the file already exists, the new information should be appended
without overwriting the existing content.

Question 3: JSON Processing


Task: Given a JSON file named data.json containing an array of objects with keys
name and email, write a Python script to read this file and print each person's name
and email in separate lines. Handle JSON parsing errors with appropriate error
messages.

Question 4: Using "x" Mode for File Creation


Task: Write a Python script that attempts to create a file named exclusive.txt
using the "x" mode, writing "This is exclusive content" to it. Use exception handling to
catch the error if the file already exists and print "Cannot create file: File already
exists." (FileExistsError)
