Ls 5 Big Data Visualization

This document discusses challenges and techniques for big data visualization. It begins by introducing data visualization and noting its benefits. Some key challenges of big data visualization are discussed, including scalability, heterogeneity of data, and information loss from dimensionality reduction. Potential solutions discussed include hardware improvements, clustering data, dealing with outliers, and interactive visualization. Common visualization types and tools for big data are also outlined.

Uploaded by

Nikita Mandhan

Ls. 5.

Big Data Visualization

5.1 Introduction to Data Visualization


Data visualization is the representation of data in a systematic
form, including the attributes and variables of each unit of
information [1]. Visualization-based data discovery methods allow
business users to mash up disparate data sources to create custom
analytical views. Advanced analytics can be integrated into these
methods to support the creation of interactive and animated graphics
on desktops, laptops, or mobile devices such as tablets and
smartphones [2]. Table 1 [3] shows the benefits of data visualization
according to the respondent percentages of a survey.

Table 1: Benefits of Data Visualization Tools.


Some points of advice for visualization are [4]:
(1) Do not forget the metadata. Data about data can be very
revealing.
(2) Participation matters. Visualization tools should be interactive,
and user engagement is very important.
(3) Encourage interactivity. Static data tools don’t lead to discovery
as well as interactive tools do.

Big data are high-volume, high-velocity, and/or high-variety datasets
that require new forms of processing to enable enhanced process
optimization, insight discovery, and decision making. The challenges
of Big Data lie in data capture, storage, analysis, sharing,
searching, and visualization [5].

Visualization can be thought of as the “front end” of big data. The
following are common data visualization myths [4]:
• All data must be visualized: It is important not to rely too
heavily on visualization; some data does not need visualization
methods to uncover its messages.
• Only good data should be visualized: A simple and quick
visualization can highlight something wrong with the data just as it
helps uncover interesting trends.
• Visualization will always manifest the right decision or action:
Visualization cannot replace critical thinking.
• Visualization will lead to certainty: Just because data is
visualized does not mean it shows an accurate picture of what is
important. Visualization can be manipulated with different effects.

Visualization approaches are used to create tables, diagrams, images,
and other intuitive displays that represent data. Visualizing big data
is not as easy as visualizing traditional small data sets. Extensions
of traditional visualization approaches have already emerged, but they
are far from sufficient. In large-scale data visualization, many
researchers use feature extraction and geometric modeling to greatly
reduce data size before the actual data rendering. Choosing a proper
data representation is also very important when visualizing big data
[5].

The goal of this chapter is to present new methods and advances in
Big Data visualization by introducing conventional visualization
methods and the extension of some of them to handle big data,
discussing the challenges of big data visualization, and analyzing
technological progress in big data visualization.

In this study, the authors first searched for papers related to data
visualization that were published in recent years, using the
university library system. At this stage, the authors mainly
summarized traditional data visualization methods and new progress in
this area. Next, the authors searched for papers related to big data
visualization. Most of these papers were published in the past three
years because big data is a newer area. At this stage, the authors
found that most conventional data visualization methods do not apply
to big data, and that the extensions of some conventional
visualization approaches to big data are far from sufficient in
functionality. The authors therefore focused on big data
visualization challenges as well as new methods, technological
progress, and tools developed for big data visualization.

5.2 Challenges to Big Data Visualization


Scalability and dynamics are two major challenges in visual
analytics. Table 2 shows the research status for static data and
dynamic data according to the data size. For big dynamic data,
solutions that work for type A problems or type B problems alone
often do not work for combined A and B problems [9].

Table 2. The research status and challenge of visual analytics

The visualization-based methods take the challenges presented by
the “four Vs” of big data and turn them into the following
opportunities [2]:

• Volume: The methods are developed to work with an immense number
of datasets and make it possible to derive meaning from large volumes
of data.
• Variety: The methods are developed to combine as many data
sources as needed.
• Velocity: With the methods, businesses can replace batch processing
with real-time stream processing.
• Value: The methods not only enable users to create attractive
infographics and heatmaps, but also create business value by gaining
insights from big data.
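The velocity point above can be illustrated with a minimal Python
sketch. Instead of recomputing an aggregate over the whole batch each
time, a running value is updated once per incoming event. The class
name SlidingWindowMean, the helper stream_means, and the window size
are hypothetical names chosen for this illustration; they do not come
from the cited sources.

```python
from collections import deque


class SlidingWindowMean:
    """Keeps the mean of the most recent `size` values, updated per event."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, value):
        """Add one new reading and return the current windowed mean."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            # Drop the oldest reading so the cost per event stays constant.
            self.total -= self.window.popleft()
        return self.total / len(self.window)


def stream_means(readings, size=3):
    """Feed readings one by one, as a stream would, and collect the means."""
    aggregator = SlidingWindowMean(size)
    return [aggregator.update(reading) for reading in readings]
```

A chart bound to such an aggregator can be refreshed after every
update call, rather than waiting for a batch job to finish.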

Visualizing big data with diversity and heterogeneity (structured,
semi-structured, and unstructured) is a big problem. Speed is a
desired factor in big data analysis. Designing a new visualization
tool with efficient indexing is not easy for big data. Cloud
computing and advanced graphical user interfaces can be combined with
big data to better manage big data scalability [3].

Visualization systems must contend with unstructured data forms
such as graphs, tables, text, trees, and other metadata. Big data often
has unstructured formats. Due to bandwidth limitations and power
requirements, visualization should move closer to the data to extract
meaningful information efficiently. Visualization software should be
run in an in situ manner. Because of the big data size, the need for
massive parallelization is a challenge in visualization. The challenge
in parallel visualization algorithms is decomposing a problem into
independent tasks that can be run concurrently [10].
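As a minimal sketch of this kind of decomposition, the Python example
below splits a list of values into chunks, counts histogram bins for
each chunk independently, and then merges the partial counts. The
function names and the default parameters are assumptions made for
this illustration. Threads are used only to keep the sketch
self-contained; in CPython, a process pool or a distributed framework
would be needed for a real speedup on CPU-bound work.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Split a list into consecutive, independent chunks."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_histogram(chunk, bin_width):
    """Count values per bin for one chunk; needs no data from other chunks."""
    return Counter(int(v // bin_width) for v in chunk)


def parallel_histogram(values, bin_width=10, chunk_size=1000, workers=4):
    """Build a histogram by merging per-chunk counts computed concurrently."""
    chunks = chunked(values, chunk_size)
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda c: partial_histogram(c, bin_width), chunks):
            total.update(part)  # merging is a cheap, order-independent sum
    return dict(total)
```

The design choice that matters here is that each chunk can be
processed without looking at any other chunk, so only the small
per-chunk counts need to be combined at the end.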

Effective data visualization is a key part of the discovery process in
the era of big data. For the challenges of high complexity and high
dimensionality in big data, there are different dimensionality
reduction methods. However, they may not always be applicable.
The more dimensions that are visualized effectively, the higher the
chances of recognizing potentially interesting patterns, correlations,
or outliers [11].
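To make the trade-off concrete, the following Python sketch uses a
very simple stand-in for dimensionality reduction: it keeps only the
k columns with the largest variance and reports what fraction of the
total variance survives. This is plain variance-based feature
selection, not PCA and not a method taken from [11]; the function
name keep_top_variance_dims is hypothetical.

```python
from statistics import pvariance


def keep_top_variance_dims(rows, k):
    """Keep the k columns with the largest variance.

    Returns (reduced_rows, kept_column_indexes, retained_variance_fraction).
    """
    n_dims = len(rows[0])
    variances = [pvariance([row[d] for row in rows]) for d in range(n_dims)]
    # Rank columns from most to least spread, then keep the first k.
    ranked = sorted(range(n_dims), key=lambda d: variances[d], reverse=True)
    kept = sorted(ranked[:k])
    reduced = [[row[d] for d in kept] for row in rows]
    total = sum(variances)
    retained = sum(variances[d] for d in kept) / total if total else 1.0
    return reduced, kept, retained
```

A retained fraction below 1.0 is a rough indicator that some
structure in the dropped columns is no longer visible in the reduced
view.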
There are also the following problems for big data visualization [12]:

• Visual noise: Most of the objects in a dataset are too closely
related to each other, so users cannot distinguish them as separate
objects on the screen.
• Information loss: Reducing the visible data set can help, but it
leads to information loss.
• Large image perception: Data visualization methods are limited not
only by the aspect ratio and resolution of the device, but also by
physical perception limits.
• High rate of image change: Users observing the data cannot react to
the number of data changes or their intensity on the display.
• High performance requirements: These are hardly noticeable in static
visualization, which has lower visualization speed requirements.

Perceptual and interactive scalability are also challenges of big data
visualization. Visualizing every data point can lead to over-plotting
and may overwhelm users’ perceptual and cognitive capacities;
reducing the data through sampling or filtering can elide interesting
structures or outliers. Querying large data stores can result in high
latency, disrupting fluent interaction [13].
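One common way to limit over-plotting without sampling away points is
to aggregate them into bins and draw one mark per bin. The Python
sketch below counts points per square grid cell; the function name
bin_points and the cell_size parameter are assumptions made for this
illustration, not an API from the cited work.

```python
from collections import Counter


def bin_points(points, cell_size):
    """Count points per square grid cell instead of drawing each point.

    `points` is an iterable of (x, y) pairs; returns {(col, row): count}.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return dict(counts)
```

The number of marks to render is then bounded by the number of
occupied cells rather than by the number of data points, and the
counts can be mapped to color as in a heat map.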

In Big Data applications, it is difficult to conduct data visualization
because of the large size and high dimensionality of big data. Most
current Big Data visualization tools perform poorly in scalability,
functionality, and response time. Uncertainty can arise during a
visual analytics process and poses a great challenge to effective
uncertainty-aware visualization [5].

Potential solutions to some challenges or problems about
visualization and big data were presented [14]:

1. Meeting the need for speed: One possible solution is hardware.
Increased memory and powerful parallel processing can be used.
Another method is putting data in-memory but using a grid
computing approach, where many machines are used.
2. Understanding the data: One solution is to have the proper domain
expertise in place.
3. Addressing data quality: It is necessary to ensure the data is clean
through the process of data governance or information management.
4. Displaying meaningful results: One way is to cluster data into a
higher-level view where smaller groups of data are visible and the
data can be effectively visualized.
5. Dealing with outliers: Possible solutions are to remove the outliers
from the data or create a separate chart for the outliers.
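For point 5, a minimal Python sketch of separating outliers so they
can be charted on their own is given below. The source does not say
how outliers should be identified; the interquartile-range (IQR)
fence rule with k = 1.5 used here is an assumption, and the function
name split_outliers is hypothetical.

```python
from statistics import quantiles


def split_outliers(values, k=1.5):
    """Split values into (inliers, outliers) using the IQR fence rule.

    Requires at least two values, as statistics.quantiles does.
    """
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    inliers = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return inliers, outliers
```

The first list can feed the main chart and the second a separate
chart, so extreme values neither distort the main axis scale nor
disappear from view.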

Common general types of data visualization:

• Charts.
• Tables.
• Graphs.
• Maps.
• Infographics.
• Dashboards.

Visualization Tools for Big Data


1. Power BI
2. Kibana
3. Grafana
4. Tableau

Analytical Visualization Techniques for Big Data

• Word Clouds. Word clouds work simply: the larger and bolder a word
appears in the cloud, the more often that word occurs in a source of
text information (such as a lecture, newspaper post or database). ...
• Symbol Maps. ...
• Line charts. ...
• Pie charts. ...
• Bar Charts. ...
• Heat Maps.
... and many more.
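The word cloud idea can be sketched in a few lines of Python: count
how often each word occurs and scale the font size with the count.
The function name word_sizes, the size range, and the linear scaling
are assumptions made for this illustration; real word cloud tools
usually also filter out common stop words, which this sketch does not
do.

```python
import re
from collections import Counter


def word_sizes(text, min_size=10, max_size=48, top_n=5):
    """Map the most frequent words in `text` to font sizes.

    The most frequent word gets `max_size`; others scale linearly by count.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    peak = counts[0][1]
    return {
        word: round(min_size + (max_size - min_size) * count / peak)
        for word, count in counts
    }
```

The returned mapping only decides how large each word is drawn;
placing the words on the canvas is a separate layout step.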
References:
1. http://pubs.sciepub.com/dt/1/1/7/
2. https://online.hbs.edu/blog/post/data-visualization-techniques
3. https://medium.com/xnewdata/introduction-to-data-visualization-for-big-data-2dcc504529b2 (Section 2)
4. https://dimensionless.in/visualization-techniques-for-datasets-in-big-data/
5. https://www.mygreatlearning.com/blog/understanding-data-visualization-techniques/
