gh-135243: improve CSV docs #135246

Open: wants to merge 4 commits into main

32 changes: 17 additions & 15 deletions Doc/library/csv.rst
@@ -14,23 +14,25 @@

--------------

-The so-called CSV (Comma Separated Values) format is the most common import and
-export format for spreadsheets and databases. CSV format was used for many
-years prior to attempts to describe the format in a standardized way in
-:rfc:`4180`. The lack of a well-defined standard means that subtle differences
-often exist in the data produced and consumed by different applications. These
-differences can make it annoying to process CSV files from multiple sources.
-Still, while the delimiters and quoting characters vary, the overall format is
-similar enough that it is possible to write a single module which can
-efficiently manipulate such data, hiding the details of reading and writing the
-data from the programmer.
+The Comma Separated Values (CSV) format is the most common import and
+export format for spreadsheets and databases. The basic format is columns
+of text data separated by a comma delimiter. The standards for CSV data are
+defined in :rfc:`4180`.
+
+The CSV format was used for many years before the standards were defined, and
+adherence to the standards is inconsistent. As a result, there can be variations
+in the delimiters and quoting characters in the CSV data that is produced and
+consumed by different applications. These differences can make it troublesome to
+process CSV files from multiple sources. However, the basic format is standard enough
+that a single module can efficiently manipulate this data and enable the programmer to
+read and write files without having to account for inconsistencies.

The :mod:`csv` module implements classes to read and write tabular data in CSV
-format. It allows programmers to say, "write this data in the format preferred
-by Excel," or "read data from this file which was generated by Excel," without
-knowing the precise details of the CSV format used by Excel. Programmers can
-also describe the CSV formats understood by other applications or define their
-own special-purpose CSV formats.
+format. For example, the module enables programmers to say, "write this data in the
Member

I don't think we need this change.

Author

The reasoning for the changes is:

  1. "It" is ambiguous, although it refers to the module.
  2. The two statements the programmer is saying (to themself?) are a scenario. Adding "for example" (which is typical in technical documentation) highlights that this is one thing someone can do, but not the only thing.
  3. Very, very minor: there is a technical difference between "allows" and "enables".

+format preferred by Excel," or "read data from this file which was generated by Excel,"
+without knowing the precise details of the CSV format that is understood by Excel.
+Programmers can also describe CSV formats that are understood by other applications or
+define custom dialects (CSV formats) for specific use cases.
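
For illustration only (this is not part of the diff), the dialect idea described in this paragraph might look roughly like the sketch below. It writes the same rows once with the built-in ``excel`` dialect and once with a custom dialect, then reads the custom output back. The dialect name ``pipes`` and the sample rows are hypothetical and chosen only for the example.

```python
import csv
import io

# Hypothetical custom dialect for illustration: pipe-delimited, minimal quoting.
# Unspecified parameters fall back to the csv module's defaults.
csv.register_dialect("pipes", delimiter="|", quoting=csv.QUOTE_MINIMAL)

# Sample data (made up); the second row contains the custom delimiter.
rows = [["name", "note"], ["Ada", "likes a|b"], ["Bob", "plain"]]

# "Write this data in the format preferred by Excel."
excel_buf = io.StringIO(newline="")
csv.writer(excel_buf, dialect="excel").writerows(rows)

# Write the same data using the custom dialect instead.
pipes_buf = io.StringIO(newline="")
csv.writer(pipes_buf, dialect="pipes").writerows(rows)

# Read the custom-dialect text back; the field containing "|" was quoted
# on output, so it is recovered as a single field.
pipes_buf.seek(0)
round_trip = list(csv.reader(pipes_buf, dialect="pipes"))
```

In this sketch the caller only names a dialect; the quoting and delimiter details are handled by the module, which is the point the paragraph makes.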

The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and
write sequences. Programmers can also read and write data in dictionary form