Pandas 1
Pandas also contains functionality for time series analysis and analyzing text data.
Use read_csv() with the path to the CSV file to read a comma-separated values file
into a DataFrame.
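A minimal sketch of read_csv() in use. The CSV content and column names are made up for illustration, and io.StringIO stands in for a real file path so the example runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path instead.
csv_text = "name,age\nAda,36\nLinus,54\n"

# read_csv() accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a file on disk, the call would look like pd.read_csv("data.csv"), where "data.csv" is a placeholder path.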
Reading text files is similar to reading CSV files. The only nuance is that you need
to specify a separator with the sep argument. The separator is the symbol that
separates the values (columns) within each row of the file. Comma (sep = ","),
whitespace (sep = "\s+"), tab (sep = "\t"), and colon (sep = ":") are commonly used
separators.
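The sketch below reads the same small table with three different separators. The sample text is invented for illustration, and io.StringIO again stands in for a text file path.

```python
import io

import pandas as pd

# Hypothetical file contents, one per separator style.
tab_text = "name\tage\nAda\t36\n"
colon_text = "name:age\nAda:36\n"
space_text = "name   age\nAda    36\n"

df_tab = pd.read_csv(io.StringIO(tab_text), sep="\t")
df_colon = pd.read_csv(io.StringIO(colon_text), sep=":")
# A raw string keeps the backslash intact; \s+ matches runs of whitespace.
df_space = pd.read_csv(io.StringIO(space_text), sep=r"\s+")

print(df_tab)
```

All three calls should produce the same one-row DataFrame, since only the delimiter differs between the inputs.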
Reading Excel files (both XLS and XLSX) is as easy as calling the read_excel()
function with the file path as input.
Reading Excel files with multiple sheets is not that different. You just need to specify
one additional argument, sheet_name, where you can pass either a string for the
sheet name or an integer for the sheet position (note that Python uses 0-indexing,
so the first sheet can be accessed with sheet_name = 0).
Similar to the read_csv() function, you can use read_json() for JSON file types with
the JSON file name as the argument.
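A minimal sketch of read_json(), assuming a JSON file that holds a list of records. The data is invented for illustration, and io.StringIO stands in for the JSON file name.

```python
import io

import pandas as pd

# Hypothetical JSON content: a list of records, one per row.
json_text = '[{"name": "Ada", "age": 36}, {"name": "Linus", "age": 54}]'

# Wrapping the text in StringIO mimics reading from a file.
df = pd.read_json(io.StringIO(json_text))
print(df)
```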
The .describe() method returns summary statistics for all numeric columns: count,
mean, standard deviation, minimum, maximum, and quartiles.
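The following sketch calls .describe() on a small made-up DataFrame. The column names and values are assumptions chosen only to show which columns are summarized.

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
df = pd.DataFrame(
    {
        "name": ["Ada", "Linus", "Grace"],
        "age": [36, 54, 45],
        "score": [9.5, 7.0, 8.0],
    }
)

# By default only the numeric columns (age, score) are summarized.
summary = df.describe()
print(summary)

# The range is not reported directly, but follows from max and min.
age_range = summary.loc["max", "age"] - summary.loc["min", "age"]
print(age_range)
```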
The .info() method is a quick way to look at the data types, missing values, and data
size of a DataFrame. Setting the show_counts argument to True shows the number of
non-missing values in each column. Setting memory_usage to True shows the total
memory usage of the DataFrame elements. When verbose is set to True, .info() prints
the full summary.
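A sketch of .info() with the three arguments discussed above, run on an invented DataFrame with one missing value. Passing buf is an addition made here so the report can be captured as a string. Without it, .info() prints directly to the console.

```python
import io

import pandas as pd

# Hypothetical data with one missing value in the "name" column.
df = pd.DataFrame({"name": ["Ada", "Linus", None], "age": [36, 54, 45]})

# buf redirects the report into a string buffer instead of stdout.
buffer = io.StringIO()
df.info(verbose=True, show_counts=True, memory_usage=True, buf=buffer)

report = buffer.getvalue()
print(report)
```

The non-null count for name should be one lower than for age, which is how missing values show up in the report.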
Calling the .columns attribute of a DataFrame object returns the column names in
the form of an Index object.
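A short sketch of the .columns attribute on a made-up DataFrame, including a conversion to a plain list for cases where an Index object is inconvenient.

```python
import pandas as pd

# Hypothetical single-row DataFrame.
df = pd.DataFrame({"name": ["Ada"], "age": [36]})

cols = df.columns          # an Index object holding the column names
col_list = list(cols)      # plain Python list of the same names

print(cols)
print(col_list)
```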