FIT5147 Data Exploration and Visualization
Semester 1, 2018
Data Visualization Project

Name: VARUN MATHUR


ID: 28954114
Contents

1. Introduction
2. Design
3. Implementation
4. User guide
5. Conclusion and Reflection
6. References
1. Introduction

This project uses a dataset on supercars with many parameters describing the technical
specifications of each car. These include the top speed, engine size, name of the car, make
of the car, torque, etc.
Every year in Delhi, the capital of India, an automotive show called “Auto Expo” is held.
Many supercars and superbikes from all over the country are showcased there for the public,
including supercars such as the Bugatti Veyron and the Ferrari F50.
In this project I am building a Shiny app that demonstrates the different features of the
supercars. My first intended audience is the general public, who are mainly interested in
seeing the cars for pleasure or in photographing them.
My other intended audience is supercar enthusiasts and supercar manufacturers, who are
interested in the technical specs of the cars: how the different parameters vary with each
other (such as top speed versus horsepower), in which years the top speed was highest, and
similar questions. They can explore this information through different kinds of plots, such
as scatter plots, line graphs, word clouds and bar graphs. The application has a server side
and a UI (user interface) side, which together show the different plots and graphs of the
dataset.

2. Design

For the design, the 5-sheet design methodology was chosen because it provides a structured
process for development and leads to a client-led information visualisation solution.
Different sketches are made over the 5 sheets, each of which improves on the ideas of the
previous sheet. The 5th sheet contains the final design of the application that the client
requires.
My design uses the following process:
Sheet 01:

In the first sheet, all the ideas are generated through brainstorming and rough sketches.
The initial focus is on quantity, to capture all possible ideas. I first included all the
graphs and plots that I could think of, then filtered the ideas to avoid duplication.
Multiple views/possibilities of a graph are shown in the sheet.
In the upper left corner of the sheet I have shown the different attributes of the supercar
in the form of a cloud. This gives a rough idea of how to use them in different plots. I
thought of some plots that use just one variable, such as a density plot or a histogram.

The name of the car in the dataset consists of both the name of the car and the year it was
manufactured. From this I derived the decade of the car, which can be used as a varying
attribute for the sliders in the Shiny app. The user will be able to choose the make of the
car (such as Audi or Mercedes) from a dropdown. Accordingly, a heat map or a scatter plot can
show how the plot varies with the make of the car, the torque, etc. Various other plots such
as a bubble graph or a word cloud are also shown.
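
The decade derivation described above can be sketched as follows. The app itself was written
in R; this is only a minimal illustration in Python. It assumes that the year appears in the
name as a four-digit number, which the report does not spell out. The function name
decade_from_name and the sample names are hypothetical and are not taken from the dataset.

```python
import re

# Assumption: the manufacturing year appears as a four-digit number
# somewhere in the car name (the real file layout may differ).
YEAR_PATTERN = re.compile(r"\b(19|20)\d{2}\b")


def decade_from_name(car_name):
    """Return the decade (e.g. 1990) found in a car name, or None."""
    match = YEAR_PATTERN.search(car_name)
    if match is None:
        return None
    year = int(match.group(0))
    # Round the year down to the start of its decade.
    return year - year % 10


if __name__ == "__main__":
    # Made-up sample names, for illustration only.
    for name in ["Ferrari F50 1995", "Bugatti Veyron 2005", "Unknown car"]:
        print(name, "->", decade_from_name(name))
```

Returning None for names without a year keeps such rows visible, so they can be inspected
rather than silently placed in a wrong decade.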

Sheet 02, 03 and 04:

The 2nd, 3rd and 4th sheets give a more general outline of the application that needs to be
developed. In this case I have combined all my intermediate ideas into one sheet (sheets 2, 3
and 4).
Here, I outline who my intended audience is. I thought about the different dropdowns that can
be shown to the user, the different sliders that can be used, and which varying attributes
can be shown as sliders. Initially, the user will be able to choose the X axis for the
scatter plot or the line graph. Depending on what the user chooses, the plot will change
accordingly.
Similarly, I can provide a slider for the decade: as the user slides over the decade, a heat
map shows the variation in the density of the make of the car. A histogram can likewise show
the make of the car on the x axis, with the plot changing as the user slides over the top
speed slider. The user will be able to choose as many makes of the car as they like in this
case.
Finally, in the bottom right corner, I have shown the different operations that the app will
support, such as using sliders and dropdowns. It will be browser supported. In the discussion
part, I have shown what the app will be capable of doing, such as performing data analysis on
different variables. The app will help me to learn a lot about developing a Shiny
application. However, it will be very difficult to incorporate live streaming in this
application, as all the data is already present in the .txt files.

Sheet 05:

The final sheet is the realization design. This is what the developer expects the application
to finally look like, and it is the design that may finally be implemented.

In my design, I have come up with the following layout for the implementation of my
application. I will provide the user with different dropdowns and sliders that can be used
for the visualization.
1) Initially the user will be provided with dropdowns for choosing the X variable and the Y
variable for the plot.
2) Once the user chooses the respective variables, a line graph/scatter plot will be shown
which depicts the variation of the 2 chosen variables.
3) A slider for the top speed will be provided, along with 2 more dropdowns for selecting the
decade and the make of the car. The user will be able to choose multiple makes of the car
and multiple decades. A bar graph will show how the count changes as the top speed changes.
4) A word cloud will show the names of the supercars in accordance with the selected top
speed and decade.
5) Different operations such as using sliders and dropdowns are discussed. The app will be
browser supported and will show the different types of data analysis through various plots.
6) Details of the app are discussed, such as requiring R to do the analysis, the time it will
take to complete the application, other hardware requirements, etc.
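
Item 3 of the list above amounts to filtering the cars by the selected decades, makes and top
speed range, then counting what remains per make. A minimal sketch in Python (the app itself
uses R) might look like this. The field names make, decade and top_speed and all the records
are made up for illustration. Treating an empty selection as "no restriction" is an
assumption that matches the behaviour described in the user guide.

```python
from collections import Counter


def filter_cars(cars, decades=(), makes=(), speed_range=(0, float("inf"))):
    """Keep cars matching the selections; an empty selection keeps everything."""
    low, high = speed_range
    return [
        car for car in cars
        if (not decades or car["decade"] in decades)
        and (not makes or car["make"] in makes)
        and low <= car["top_speed"] <= high
    ]


def count_by_make(cars):
    """Bar heights for the count plot: number of cars per make."""
    return dict(Counter(car["make"] for car in cars))


# Made-up records, for illustration only.
CARS = [
    {"make": "Audi", "decade": 2000, "top_speed": 155},
    {"make": "Audi", "decade": 2010, "top_speed": 196},
    {"make": "Ferrari", "decade": 1990, "top_speed": 202},
    {"make": "Ferrari", "decade": 2000, "top_speed": 217},
]

if __name__ == "__main__":
    selected = filter_cars(CARS, decades=[2000], speed_range=(150, 220))
    print(count_by_make(selected))
```

Keeping the filter separate from the count means the same filtered list could also feed the
word cloud in item 4.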

3. Implementation

Implementing this app required several libraries. The process involved some data wrangling so
that the data could be brought into a better format for analysis.
The application consists of 2 parts. The first is the server side, which contains the
reactive part. All the data wrangling is performed here, and all the commands for the
different plots are coded here. Depending on what the user chooses in the UI side of the
application, the corresponding function is executed in the server side of the app.
The other part of the app is the UI (user interface). Here all the commands for providing the
dropdowns and the sliders are implemented, and the variables to be shown in the dropdowns are
listed. The main panel and the sidebar panel are also defined here.
The different libraries that were used are as follows:

Sno  Library used  Description

1    shiny         Used to build interactive web apps straight from R. This
                   library is used to build the Shiny application, which
                   consists of 2 parts: the server side and the UI side.

2    ggplot2       Used for creating complex plots and graphics. In this
                   application, it implements the scatter plot and the line
                   graph in a single plot, and the bar graph for the count
                   comparison.

3    reshape2      Used for melting the data on the attributes make_nm and
                   decade. This is done to plot the bar graph and to get the
                   count of cars for each decade in a proper manner.

4    dplyr         Used for data cleaning and manipulation. In this
                   application, it is used for merging and mutating, mainly
                   to derive the decade from the year attribute and the make
                   of the car from the name of the car. These are later used
                   in the different plots.

5    word cloud    A word cloud is a visual representation of text data that
                   adds simplicity and clarity. In this application, the word
                   cloud is built using the names of the cars and changes as
                   the top speed and the decade of the car change.
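
The wrangling attributed to dplyr and reshape2 above (merging the attribute tables, deriving
the make and decade, and counting per group) can be illustrated in plain Python. This is a
sketch under several assumptions, not the R code of the app. It assumes each source file maps
a car name to a single value, that names look like "<make> <model> <year>", and that an inner
join is acceptable. All function names and values below are made up.

```python
from collections import Counter


def merge_by_name(**tables):
    """Inner-join several {car_name: value} tables on the car name."""
    common = set.intersection(*(set(t) for t in tables.values()))
    return [
        {"name": name, **{attr: table[name] for attr, table in tables.items()}}
        for name in sorted(common)
    ]


def add_make_and_decade(row):
    """Assumption: names look like '<make> <model> <year>'."""
    parts = row["name"].split()
    year = int(parts[-1])
    return {**row, "make": parts[0], "decade": year - year % 10}


def count_by_make_and_decade(rows):
    """Long-format counts, one entry per (make, decade) pair."""
    return Counter((row["make"], row["decade"]) for row in rows)


# Made-up values standing in for two of the attribute files.
TOP_SPEED = {"Audi R8 2008": 187, "Audi TT 2001": 151, "Ferrari F50 1995": 202}
HORSEPOWER = {"Audi R8 2008": 414, "Audi TT 2001": 222, "Lotus Elise 1999": 118}

if __name__ == "__main__":
    rows = [add_make_and_decade(r)
            for r in merge_by_name(top_speed=TOP_SPEED, horsepower=HORSEPOWER)]
    print(count_by_make_and_decade(rows))
```

With an inner join, cars missing from any one table drop out of the result. This is one
possible reason why some decades might show no data in the plots.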

4. User guide

This section provides a step-by-step guide to using the application.
Steps to be followed:
1) Initially the user should click on the “Run App” icon in the application. This is shown
below:

2) Note that the user will have to install all the libraries mentioned in the commented code
in the application before running the app.
After running the app, the user will be shown the following window. The user will need to
click on the “Open in Browser” icon in the window. The same application will then open in
the user’s default browser. This is shown below:

3) Now the user will be provided with different options to choose from. The user can choose
any attribute as the X axis and any attribute as the Y axis. If the user doesn’t choose any
decade, the plot will be shown for all the decades for the 2 chosen axis variables. This is
shown below:

If the user chooses a particular decade or multiple decades, only those will be plotted. This
is shown below:

4) Now we analyse the count plot. The user will need to select one or more decades and one or
more makes of the car, and then move the top speed slider. The count plot shows the make of
the car on the X axis. As the user moves the top speed slider, the plot shows the top speed
of that car on the Y axis for that particular decade. This is shown below:

Note: If no top speed data is present for that particular decade or for that particular range
of the top speed, then nothing will be shown in the count comparison graph. An example of
this is shown below:

5) If the user does not select any decade, the count plot will show the top speed data for
all the decades for the chosen cars. This is shown below:

6) On the other hand, if the user does not choose any decade but chooses a make of the car
for which data is not present in all the decades, then only those decades where the top
speed data is present will be shown. Hence only a select few decades will be shown. This is
shown below:

7) If the user does not select any decade or any car, then the data for all the cars and for
all the decades will be shown.

8) Lastly, I have shown a word cloud, using the car name as the text data. As the user
changes the top speed and the decade, the word cloud changes too. It shows which supercars
were more prominent in that decade and for that range of the top speed, and the size of the
car names varies accordingly. This is shown below:
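
The selection behind the word cloud in step 8 can be sketched in Python as below (the app
itself is written in R). The report does not state how the size of each name is computed, so
using the number of matching records per name is an assumption. The field names and records
are also made up.

```python
from collections import Counter


def word_cloud_weights(cars, decades=(), speed_range=(0, float("inf"))):
    """Count how often each car name occurs among the selected cars.

    A word cloud would draw names with larger counts in a larger font.
    An empty `decades` selection means all decades.
    """
    low, high = speed_range
    names = [
        car["name"] for car in cars
        if (not decades or car["decade"] in decades)
        and low <= car["top_speed"] <= high
    ]
    # most_common() sorts by count, largest first.
    return Counter(names).most_common()


# Made-up records; repeated names stand for several variants of one model.
CARS = [
    {"name": "Audi R8", "decade": 2000, "top_speed": 187},
    {"name": "Audi R8", "decade": 2010, "top_speed": 196},
    {"name": "Audi R8", "decade": 2010, "top_speed": 199},
    {"name": "Ferrari F50", "decade": 1990, "top_speed": 202},
]

if __name__ == "__main__":
    print(word_cloud_weights(CARS, decades=[2010]))
```

An empty result here corresponds to the case in the note under step 4, where no data exists
for the chosen decade or top speed range.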

5. Conclusion and Reflection

In this project, I learnt a great deal about using Shiny. I did an in-depth analysis of the
dataset and learnt how to develop a Shiny application and how to use the various libraries to
perform the data visualization.
Libraries such as “reshape2” and “dplyr” helped me to understand how data wrangling needs to
be performed to get the data into a format that is suitable for visualization. I learnt a lot
about different graphs, such as combining a line graph with a scatter plot and performing a
count comparison using a bar plot. Libraries such as ggplot2 were used for this purpose.
The application helped me understand how to create the server side and the UI (user
interface) side of the application and how to link the two to produce the output. It helped me
to understand what varying attributes are and how they can be used in sliders. I learnt how to
create dropdowns and how to populate them with different attributes of the dataset. Keeping
the intended audience in mind, I hope I was able to convey as much information about the
supercars as possible.
Overall, this application really helped me in learning the different uses of R libraries: how
they can be used for wrangling and cleaning, how they can be used for data visualization, and
how text data can be represented in a simple and clear form using a word cloud, all through a
Shiny application.

6. References

The following are the links from which I retrieved the data (.txt files):
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_0-60-times_DATA.txt
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_engine-size_DATA.txt
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_horsepower_DATA.txt
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_power-to-weight_DATA.txt
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_top-speed_DATA.txt
http://www.sharpsightlabs.com/wp-content/uploads/2014/11/auto-snout_torque_DATA.txt

In addition, I used the following sites to get additional information on supercars:
Inside Supercars. (2011). Retrieved from
https://www.thesupercarscollective.com/inside-supercars/

Exotics, Sports Cars, & Supercars | Pics, Reviews, & More. (2018). Retrieved from
https://www.supercars.net/blog/

The Supercar Blog. (2018). Retrieved from http://www.thesupercarblog.com/
