2. Introduction - Data Visualization
Data visualization is the graphical representation of information and data. By using
visual elements like charts, graphs, and maps, data visualization tools provide an
accessible way to see and understand trends, outliers, and patterns in data.
Additionally, it provides an excellent way for employees or business owners to present
data to non-technical audiences without confusion.
In the world of Big Data, data visualization tools and technologies are essential for
analyzing massive amounts of information and making data-driven decisions.
Something as simple as presenting data in graphic format may seem to have no
downsides. But sometimes data can be misrepresented or misinterpreted when placed
in the wrong style of data visualization. When choosing to create a data visualization,
it’s best to keep both the advantages and disadvantages in mind.
Advantages:
Our eyes are drawn to colors and patterns. We can quickly identify red from blue, and
squares from circles. Our culture is visual, including everything from art and
advertisements to TV and movies. Data visualization is another form of visual art that
grabs our interest and keeps our eyes on the message.
Some other advantages of data visualization include:
- Sharing information easily.
- Exploring opportunities interactively.
- Visualizing patterns and relationships.
Disadvantages:
While there are many advantages, some of the disadvantages may seem less obvious.
For example, when viewing a visualization with many different data points, it’s easy to
make an inaccurate assumption. Or sometimes the visualization is simply designed
poorly, so that it’s biased or confusing.
Some other disadvantages include:
- Biased or inaccurate information.
- Correlation can be mistaken for causation.
- Core messages can get lost in translation.
Data Visualization through Python:
Python provides various libraries for visualizing data. Each comes with different
features and supports various types of graphs. In this tutorial, we will be
discussing one such library, Matplotlib, and three of the chart types it supports.
- Matplotlib
- Bar Charts
- Line Charts
- Scatterplots
a). Matplotlib:
Matplotlib is a plotting library, used mainly for 2-D graphics, for the
Python programming language and its numerical mathematics extension NumPy.
It provides an object-oriented API for embedding plots into applications
using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK.
There is also a procedural interface, pyplot; the older "pylab" interface
is now discouraged.
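The difference between the procedural and the object-oriented style can be sketched as follows. This is a minimal illustration, not the only way to use either interface: the sine-curve data is made up, and the procedural calls go through pyplot, the module normally used for that style today. The "Agg" backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)  # made-up sample data

# Procedural style: pyplot functions act on the "current" figure and axes.
plt.figure()
plt.plot(x, np.sin(x))
plt.title("procedural style")

# Object-oriented style: create Figure and Axes objects and call their methods.
fig, ax = plt.subplots()
(line,) = ax.plot(x, np.sin(x))
ax.set_title("object-oriented style")
```

Both halves draw the same curve. With an interactive backend, calling plt.show() afterwards would display the two figures.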
Matplotlib is an easy-to-use, low-level data visualization library that is
built on NumPy arrays. It offers various plot types, such as scatter plots,
line plots, and histograms, and gives control over most elements of a figure.
Key Features:
- It supports various types of graphical representation, including
  - line graphs,
  - bar graphs,
  - histograms.
- It works with NumPy arrays and the broader SciPy stack.
- It offers a large number of plot types for understanding trends and
  spotting correlations.
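As a rough illustration of these features, the sketch below draws a line graph, a bar graph, and a histogram directly from NumPy arrays. All data values, variable names, and the figure size are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the made-up data is repeatable
x = np.arange(10)
y = x ** 2
samples = rng.normal(loc=0.0, scale=1.0, size=500)

fig, (ax_line, ax_bar, ax_hist) = plt.subplots(1, 3, figsize=(12, 3))

ax_line.plot(x, y)            # line graph
ax_line.set_title("Line graph")

bars = ax_bar.bar(x, y)       # bar graph
ax_bar.set_title("Bar graph")

# hist() bins the samples and returns the counts and the bin edges.
counts, bin_edges, _ = ax_hist.hist(samples, bins=20)
ax_hist.set_title("Histogram")

fig.tight_layout()
```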
b). Bar Plots:
Bar plots in Python are commonly created with the bar() function of the
Matplotlib library.
It accepts data in the form of lists, NumPy arrays, and pandas Series (for
example, the columns of a DataFrame).
There are several types of bar plots, including
- simple bar plots,
- grouped bar plots,
- stacked bar plots,
- horizontal bar plots, and
- error bar plots.
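The variants listed above can be sketched with a few calls to bar() and barh(). This is one possible way to build each type, not a canonical recipe; the categories, values, and error sizes are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to open a window
import matplotlib.pyplot as plt
import numpy as np

labels = ["A", "B", "C"]          # hypothetical categories
first = np.array([3, 5, 2])       # hypothetical values, series 1
second = np.array([4, 1, 6])      # hypothetical values, series 2
x = np.arange(len(labels))
w = 0.4                           # width of one bar within a group

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Grouped: shift the two series to the left and right of each tick.
ax = axes[0, 0]
ax.bar(x - w / 2, first, width=w, label="first")
ax.bar(x + w / 2, second, width=w, label="second")
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_title("Grouped")
ax.legend()

# Stacked: start the second series where the first one ends, using `bottom`.
ax = axes[0, 1]
ax.bar(labels, first)
stacked = ax.bar(labels, second, bottom=first)
ax.set_title("Stacked")

# Horizontal: barh() draws the bars along the x-axis instead.
ax = axes[1, 0]
horizontal = ax.barh(labels, first)
ax.set_title("Horizontal")

# Error bars: `yerr` adds a vertical error bar on top of each bar.
ax = axes[1, 1]
ax.bar(labels, first, yerr=[0.5, 0.8, 0.3], capsize=4)
ax.set_title("Error bars")

fig.tight_layout()
```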
Creating a Bar Plot in Python Using Matplotlib:
The Matplotlib bar() function is the easiest way to create a bar chart. We import
the matplotlib.pyplot module as plt and call plt.bar().
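The following code creates a bar plot in Matplotlib:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(3)      # bar positions (sample values)
height = [5, 7, 3]    # bar heights (sample values)
plt.bar(x, height, width=0.8, bottom=None, align='center')  # optional arguments shown with their defaults
plt.show()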
The bar width in bar charts is controlled by the “width” parameter of the
bar() function. It sets the width of each bar in data units (0.8 by default)
and can be a single number or one value per bar.
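To see the effect of the “width” parameter, the sketch below draws the same bars three times with different widths. The category names and values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to open a window
import matplotlib.pyplot as plt

categories = ["Mon", "Tue", "Wed"]   # hypothetical categories
values = [12, 7, 9]                  # hypothetical values

fig, (ax_narrow, ax_default, ax_wide) = plt.subplots(
    1, 3, sharey=True, figsize=(9, 3)
)

narrow = ax_narrow.bar(categories, values, width=0.3)   # thin bars, wide gaps
ax_narrow.set_title("width=0.3")

default = ax_default.bar(categories, values)            # default width of 0.8
ax_default.set_title("default width")

wide = ax_wide.bar(categories, values, width=1.0)       # adjacent bars touch
ax_wide.set_title("width=1.0")

fig.tight_layout()
```

Because the bars sit one data unit apart, a width of 1.0 leaves no gap between neighbours, while smaller values leave progressively more space.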