t-viSNE: Interactive Assessment and Interpretation of t-SNE Projections

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, VOL. XX, NO. X, JULY 2020
Angelos Chatzimparmpas, Student Member, IEEE, Rafael M. Martins, Member, IEEE Computer Society,
and Andreas Kerren, Senior Member, IEEE
arXiv:2002.06910v4 [cs.LG] 1 Dec 2020
Abstract: t-Distributed Stochastic Neighbor Embedding (t-SNE) for the visualization of multidimensional data has proven to be a popular approach, with successful applications in a wide range of domains. Despite their usefulness, t-SNE projections can be hard to interpret or even misleading, which hurts the trustworthiness of the results. Understanding the details of t-SNE itself and the reasons behind specific patterns in its output may be a daunting task, especially for non-experts in dimensionality reduction. In this work, we present t-viSNE, an interactive tool for the visual exploration of t-SNE projections that enables analysts to inspect different aspects of their accuracy and meaning, such as the effects of hyper-parameters, distance and neighborhood preservation, densities and costs of specific neighborhoods, and the correlations between dimensions and visual patterns. We propose a coherent, accessible, and well-integrated collection of different views for the visualization of t-SNE projections. The applicability and usability of t-viSNE are demonstrated through hypothetical usage scenarios with real data sets. Finally, we present the results of a user study where the tool's effectiveness was evaluated. By bringing to light information that would normally be lost after running t-SNE, we hope to support analysts in using t-SNE and in making its results easier to understand.
Index Terms: Interpretable t-SNE, dimensionality reduction, high-dimensional data, explainable machine learning, visualization.
• Angelos Chatzimparmpas, Rafael M. Martins, and Andreas Kerren are with the Department of Computer Science and Media Technology, Linnaeus University, Växjö 35195, Sweden. E-mail: {angelos.chatzimparmpas,rafael.martins,andreas.kerren}@lnu.se.
Manuscript received October XX, 2019; revised February XX, 2020.
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

1 INTRODUCTION
Dimensionality Reduction (DR) techniques are an important part of the toolbox of high-dimensional data analysis, with early techniques such as Principal Component Analysis (PCA) [1] and Multidimensional Scaling (MDS) [2] being several decades old now. The problem that DR tries to solve is, in general, to find a low-dimensional representation of a high-dimensional data set that retains, as much as possible, its original structure. When used for visualization, the output is set to two or three dimensions, and the results are commonly visualized with scatterplots, where similar objects are modeled by nearby points, and dissimilar ones by distant points.
Linear DR methods, such as PCA, are easier to understand and to explain, since the remaining axes are linear combinations of the original dimensions, which establishes a direct relationship between the low-dimensional and the high-dimensional data set. When the specific constraints of being simple and easily explainable are relaxed, other more intricate non-linear DR (or manifold learning) methods can be used in order to capture much more complex high-dimensional patterns [3]. In general, non-linear DR methods opt to maintain local structures to the detriment of global ones, i.e., their algorithms favor the optimization of neighborhoods of points and mostly disregard large distances. Although non-linear DR methods have also been around for quite some time (e.g., Sammon Mapping [4]), they have gained popularity in the past few years, due to increasingly better performance, with techniques such as Isomap [5], LLE [6], or LAMP [7]; a few comparative review papers on general DR exist already, see the surveys [8] or [9]. This popularity reached its peak after the publication of t-distributed Stochastic Neighbor Embedding (t-SNE) [10]. Through a series of complex transformations and fine-tuned optimization procedures (cf. Section 3), t-SNE usually manages to create low-dimensional representations that capture complex patterns from the high-dimensional space very accurately, showing them as well-separated clusters of points. It has been used successfully in many different domains such as single-cell mass cytometry [11], natural language processing [12], and cancer analysis [13].
t-SNE's inherent complexity, however, has also raised concerns regarding the trustworthiness of the results and the difficulty in interpreting them. Wattenberg et al. [14] demonstrated several important pitfalls of t-SNE, such as (i) the highly-complicated relationship between input parameters and visualization, (ii) the apparent irrelevance of the sizes (or density) of high-dimensional clusters, (iii) the disregard for the distance between clusters, (iv) the appearance of clusters even when the input is random, and (v) the difficulty in assessing and (vi) interpreting shapes. Although they also include advice and guidelines for using t-SNE effectively, the examples use simple and carefully-engineered artificial data sets, for which the original appearance is clear. Therefore, one question remains open: how to avoid such pitfalls with real-world high-dimensional data, possibly in the thousands of dimensions, when little or no previous knowledge is available?
Inspired by the work of Wattenberg et al. [14] and the existing visualization literature on interpreting and assessing DR methods [15], [16], we present t-viSNE, a tool designed to support the interactive exploration of t-SNE projections (an extension to our previous poster abstract [17]). In contrast to other, more general approaches, t-viSNE was designed with the specific problems related to the investigation of t-SNE projections in mind, bringing to light some of the hidden internal workings of the algorithm which, when visualized, may provide important insights about the high-dimensional data set under analysis. Our proposed solution
is composed of a set of coordinated views that work together in order to fulfill four main goals: (G1) facilitate the choice of hyper-parameters through visual exploration and the use of quality metrics; (G2) provide a quick overview of the accuracy of the projection, to support the decision of either moving forward with the analysis or repeating the process of hyper-parameter exploration; (G3) provide the means to investigate quality further, differentiating between the trustworthiness of different regions of the projection; and (G4) allow the interpretation of different visible patterns of the projection in terms of the original data set's dimensions.
The implemented views are a mix of adapted and improved classic techniques (e.g., our Shepard Heatmap and Adaptive Parallel Coordinates Plot (PCP)), new proposals (e.g., the Dimension Correlation view), and standard visual mappings with information that is usually hidden or lost after the projection is created (e.g., Density and Remaining Cost views). They were created in a careful design process that aimed to bring forward a selection of visualization techniques, combined and put together as a coherent whole in order to support, as much as possible, an accessible and usable analysis workflow with t-SNE. To the best of our knowledge, t-viSNE is the first interactive visualization tool designed with the goal of alleviating the specific shortcomings of t-SNE and supporting, at the same time and in a coherent and usable way, the assessment of quality and the interpretation of patterns in t-SNE projections. In summary, our contributions consist of
• a selection of different views, interaction techniques, and visual mappings designed to support the interpretation and assessment of t-SNE projections;
• their implementation in a carefully-designed system geared towards supporting analysts in overcoming well-documented difficulties of working with t-SNE; and
• a discussion on the design and the outcomes of a user study that showed promising results.
Although our proposed solution is inspired by the work of Wattenberg et al. [14] and touches on most of the points raised by the authors, not all of them are fully covered by t-viSNE. More specifically, t-viSNE addresses points (ii), (iii), (v), and (vi) described previously, partially covers (i), and leaves point (iv) for future work, i.e., we only omit the investigation on how the formation of clusters might erroneously convey messages to the users even when the input is random. Thus, we intend this work to be a comprehensive proposal of possible solutions to the problem of opening t-SNE's black box, and to provide important and relevant steps towards that final goal.
The rest of this paper is organized as follows. In the next two sections, we discuss literature that is related to visual, interactive assessment and interpretation of t-SNE projections as well as the necessary background information on how the t-SNE algorithm works. Section 4 presents our visualization approach including the various features of t-viSNE in the three categories: overview, quality, and dimensions. We then demonstrate the effectiveness of t-viSNE by describing two use cases with real data in Section 5. Thereafter in Section 6, we discuss the usability and applicability of t-viSNE by reporting the results of a user study. Section 7 discusses our design choices, limitations, and possible future work. Finally, Section 8 concludes our paper.

2 RELATED WORK
A DR method is an algorithm that projects a high-dimensional data set to a low-dimensional representation, preserving the structure of the original data as much as possible. Most of these algorithms have some (or many) hyper-parameters that may considerably affect their results, but setting them correctly is not a trivial task. In Subsection 2.1, we briefly describe techniques that try to solve this problem, and discuss the differences to our tool's functionality. The resulting projection is usually visualized with scatterplots, which support tasks such as finding groups of similar points, correlations, and outliers [16]. However, a scatterplot is simply the first step in analyzing a high-dimensional data set through a projection: questions regarding the quality of the results (see Subsection 2.2) and how to interpret them (see Subsection 2.3) are pervasive in the literature on the subject. A few other tools have been proposed throughout the years that incorporate these techniques to deal with the problem of supporting the exploration of multidimensional data with DR. In Subsection 2.4, we discuss their goals and trade-offs, and compare them with t-viSNE.
Additionally, a summary of the tools discussed in this section and a feature comparison with t-viSNE is presented in Table 1. A tick indicates that the tool has the corresponding features/capabilities, while a tick in parentheses means the tool offers implicit support (i.e., it could be done manually, in an ad hoc manner, but is not explicitly supported). The table does not include works that do not contain a concrete visualization tool as their research contribution, as in Schreck et al. [18], for instance. Furthermore, we excluded the works which are not generalizable and focus on specific domain applications such as [19], [20].

TABLE 1: Feature comparison of t-viSNE [21] with other related tools from the literature. The last column indicates if the tool (T) and/or its source code (SC) are available online (last checked: January 15, 2020).
Tool | Multiple DR Algor. Support | Visual t-SNE Param. Expl. | Global Quality Assess. | Local Quality Assess. | Interact. Shape Analysis | Rank Dim. | Available
(Features/Capabilities column groups: Overview, Quality, Dimensions)
t-viSNE ✓ ✓ ✓ ✓ ✓ ✓ T+SC
Clustervision [51] ✓ ✓ (✓) (✓) ✓ T
VisCoDeR [22] ✓ ✓ ✓ ✓ T
Clustrophile 2 [23] ✓ ✓ (✓) (✓)
AxiSketcher [47] ✓ ✓ ✓
GEP [63] ✓ ✓ (✓) T+SC
ccPCA [44] ✓ ✓ T+SC
DimReader [45] ✓ (✓) SC
Coimbra et al. [42] ✓ ✓ ✓ ✓ T+SC
Praxis [46] ✓ (✓) ✓
FocusChanger [50] ✓ ✓
Probing Proj. [36] ✓ ✓ T+SC
ProxiLens [34] ✓
Stress Maps [29] ✓

2.1 Hyper-parameter Exploration
VisCoDeR [22] supports the comparison between multiple projections generated by different DR techniques and parameter settings, similarly to our initial parameter exploration, using a scatterplot view with an on-top heatmap visualization for evaluating the quality of these projections. In contrast to t-viSNE, it does not support further exploratory visual analysis tasks after the layout is selected, such as optimizing the hyper-parameters for specific user selections. Clustrophile 2 [23] contains a Clustering Tour feature to partially assist users in immediately exploring the space of potential clustering results by visualizing previous and current solution states, and providing choices of modalities by which the user can restrain how parameters are updated.
user can restrain how parameters are updated. These features help views for projection exploration, but they focus on projection-
with the investigation of the quality of different clustering results agnostic 3-D scatterplots, and the widgets have different goals.
(see the subsection below) in relation to the users’ analytical Probing Projections [36] is another such interactive system that
tasks. However, t-viSNE supports the visual exploration of a supports both explaining and assessing projections, but limited
predetermined space of solutions, which allows users to optimize to MDS [43]. Groups of points can be compared in terms of
sublocations and highlight these patterns with new projections. the data set’s dimensions, and a heatmap of the distribution of a
selected dimension can be overlaid on the visualization, but there
2.2 Quality Assessment is no special prioritization of dimensions to deal with very high-
dimensional data; the user must simply cycle through all of them in
One way to obtain an indication of a projection’s quality is
order to find the most relevant one. Fujiwara et al. [44] proposed
to compute a single scalar value, equivalent to a final score.
the contrasting clusters in PCA (ccPCA) method to find which
Examples are Normalized Stress [7], Trustworthiness and Con-
dimensions contributed more to the formation of a selected cluster
tinuity [24], and Distance Consistency (DSC) [25]. More recently,
and why it differs from the rest of the dataset, based on information
ClustMe [26] was proposed as a perception-based measure that
on separation and internal vs. external variability. We have similar
ranks scatterplots based on cluster-related patterns. While this
goals, but approach them with different methods. For exploring
might be useful for quick overviews or automatic selection of
clusters and selections in general, we use PCA to filter and order a
projections, a single score fails to capture more intricate details,
local PCP plot; this could be easily adapted to use ccPCA instead
such as where and why a projection is good or bad [27]. In contrast,
as an underlying method for choosing which dimensions to filter
local measures such as the projection precision score (pps) [18]
and how to re-order the axes, without affecting the overall pro-
describe the quality for each individual point of the projection,
posed analytical flow of the tool. On the other hand, ccPCA does
which can then be visualized as an extra layer on top of the
not deal with the analysis of shapes, which we support with our
scatterplot itself. These measures usually focus on the preservation
proposed Dimension Correlation. Other recent approaches include
of neighborhoods [28], [29], [30] or distances [27], [31], [32]. As
DimReader [45], where the authors create so-called generalized
an example, the set difference from Martins et al. [33] uses the
axes for non-linear DR methods, but besides explaining a single
Jaccard set-distance between the two sets of neighbors of a point
dimension at a time, it is currently unclear how exactly it can be
in low- and high-dimensional space in order to compute a measure
used in an interactive exploration scenario; and Praxis [46], with
of Neighborhood Preservation. We have chosen to adopt it in our
two methods—backward and forward projection—but it requires
work, in contrast to others, because of its intuitive interpretation,
fast out-of-sample extensions which are not available for the
simple computation, and straightforward adaptation for displaying
original t-SNE.
the preservation of neighborhoods of different scales.
Most similarly to one of our proposed interactions (the Di-
Quality measures can also be used in interactive explorations
mension Correlation, Subsection 4.4), in AxiSketcher [47] (and
to expose errors on demand and direct the user’s further explo-
its prior version InterAxis [48]) the user can draw a polyline in
rations [34]. Liu et al. [35] use a combination of hierarchical
the scatterplot to identify a shape, which results in new non-linear
clustering and different local measures to guide the user in
high-dimensional axes to match the user’s intentions. Since the
manipulating the projection and testing hypotheses about the data.
resulting dimension contributions to the axes are not uniform, it
Probing Projections [36] simulates a correction of the points’ posi-
is not possible to represent them using simple means such as bar
tions according to a reference point, showing how a more accurate
charts. In our Dimension Correlation tool, the user also draws a
projection would look like. Fernstad et al. [37] use combinations
polyline to identify a shape, but our intention is exactly the oppo-
of quality measures to determine the most interesting dimensions
site of AxiSketcher: we want to capture dimension contributions in
of the data and guide user exploration. t-viSNE is similar to these
an easy and accessible way. For this, we project low-dimensional
works in its use of measures to guide the user’s exploration, but
points into the line (not high-dimensional ones, as in AxiSketcher),
we use measures and mappings that are either specific to t-SNE’s
and we compute the dimension contributions in a different way,
algorithm or customized to be more useful in this scenario. For
using Spearman’s rank correlation. In summary, although there is
more details about the assessment of quality in projections, we
a superficial similarity between the two techniques regarding how
refer the reader to Nonato and Aupetit’s recent survey [16].
the user interacts with the scatterplot, their goals and their inner
workings are quite different. Since t-viSNE adopts an approach of
2.3 Interpretation of Projections combining many different coordinated views, it is important for
Some attempts to enrich scatterplots with automatically-derived the Dimension Correlation to maintain—as much as possible—
statistical descriptions of patterns [38], [39], [40] have shown the users’ mental map of the projection, and to give simple and
that static mappings may be useful in simple scenarios, but the straightforward interpretations of the patterns they see.
complex relations between low- and high-dimensional space in
non-linear projections cannot be well represented. In such cases,
interactive visual interfaces are necessary, as noted by Sacha et 2.4 Comparison with Other Tools
al. [15] in their survey on interaction techniques for DR. Inter- Other than the ones discussed so far, some interactive tools have
active solutions for specific domains such as text [19], [20] and been designed with either specific DR methods in mind, such as
images [7], [41] use inherent characteristics of the data in order SIRIUS [49], and FocusChanger [50], or for specific domains,
to explain layouts, however, they are not easily generalizable to such as Cytosplore [11]. t-SNE can also be used to explore and
other domains. In their tool, Coimbra et al. [42] support interactive judge different clustering partitions of the same data set, as in
exploration of 3-D projections using adapted biplots and different Clustervision [51].
widgets for viewpoint selection. Our tool is similar to theirs SIRIUS [49] focuses on the concurrent exploration of sim-
from the perspective of providing a collection of interconnected ilarity relationships between instances and between dimensions,
analyzing their relationship and providing interaction techniques via a dual visualization approach, with two coordinated side-by-side scatterplots. In t-viSNE, we focus on bringing forward hidden information about the DR algorithm that is usually lost, with all the interactions occurring in a single main scatterplot view (and some additional auxiliary views). One of our goals is to also support the user in testing the quality of the algorithm to increase its trustworthiness, a task that is not supported by SIRIUS.
FocusChanger [50] empowers users to perform local analyses by setting Points of Interest (POIs) in a linear projection, which is then updated to enhance the representation of the selected POIs. When hovering over specific points, the information of true neighborhood of other points is mapped to the saturation of the color. This allows for a simple mechanism of quality assessment, but hurts the possibility of using color for other mappings and requires pointwise interaction. The used projections are linear and, thus, potentially not as representative and useful as t-SNE. Similar to Andromeda, it relies on the possibility of quickly updating them, which might not be currently feasible with t-SNE.
Cytosplore [11] is an example of tools that use t-SNE for visual data exploration within a specific domain: single-cell analysis with mass cytometry data. Apart from showing a t-SNE projection of the data, Cytosplore is also supported by a domain-specific clustering technique which serves as the base for the rest of the provided visualizations, but is not generalizable to other domains.
Clustervision [51] is a visualization tool used to test multiple batches of a varying number of clusters and allows the users to pick the best partitioning according to their task. Then, the dimensions are ordered according to a cluster separation importance ranking. As a result, the interpretation and assessment of the final results are intrinsically tied to the choice of clustering algorithm, which is an external technique that is (in general) not related to the DR itself. Thus, the quality of the results is tied to the quality of the chosen clustering algorithm. With t-viSNE it is also possible to explore the results of a clustering technique by, for example, mapping them to labels, then using the labels as regions of interest during the interactive exploration of the data. However, the labels do not influence the results of t-viSNE, whether they exist or not, since we did not intend to tie the quality of our results to other external (and independent) techniques.

3 OVERVIEW OF T-SNE
All the details about t-SNE's algorithm have been exhaustively described since its first publication (see, e.g., [52], [53], [54]). Here, we give a quick overview of the general steps of the algorithm and focus mostly on the specific details that are important for understanding the features of our tool.
The input to t-SNE is an $n \times N$ data matrix $X$, composed of a set of $n$ instances $x_i$ (rows) in $N$ dimensions (columns). As the first step, pairwise distances between instances are transformed into probability distributions that represent neighborhoods in the following way. For every pair of instances $(x_i, x_j)$ with $1 \le i, j \le n$, a probability $p_{ij}$ is computed as
$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \qquad p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}. \tag{1}$$
Equation (1) can be interpreted as the probability that two instances $x_i$ and $x_j$ would pick each other as close neighbors. It is roughly equivalent to centering a multivariate Gaussian around $x_i$ and setting $p_{j|i}$ to the Gaussian-transformed value of its distance to $x_j$, then the same but centered on $x_j$, and finally combining both. That translates to high probabilities for near neighbors and very small probabilities for farther ones. One of the most important things to notice from Equation 1, however, is that the variance of the Gaussian, i.e., $\sigma_i$, is different for each $x_i$: that means that the bandwidth of the Gaussian changes for each high-dimensional instance, in order to capture the variations in density for different high-dimensional neighborhoods. This value is found iteratively by trial-and-error, using binary search, until a user-defined perplexity is reached, with $Perp(i) = 2^{H(i)}$ and $H(i) = -\sum_j p_{j|i} \log_2 p_{j|i}$.
Considering $P$ as the joint distribution including all pairwise probabilities computed according to Equation 1, the goal of t-SNE is, then, to find another probability distribution $Q$ that faithfully represents $P$ in low-dimensional spaces, usually in 2-D or 3-D (to allow for their straightforward visualization). Each pair of low-dimensional points $(y_i, y_j)$ is also modeled as a probability, now called $q_{ij}$, as
$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}. \tag{2}$$
Instead of using Gaussians again, a Student's t-distribution with one degree of freedom is used for $Q$. Notice that, as opposed to $P$, the distribution of $Q$ is not parameterized with a variable neighborhood density (i.e., there is no $\sigma_i$). This means that, potentially, neighborhoods with very different densities in the original high-dimensional space may be mapped into areas of equivalent size in the low-dimensional representation.
The search for a $Q$ that faithfully represents $P$ in a low-dimensional space is done by optimizing a cost function ($C$) given by the Kullback-Leibler (KL) divergence between the two distributions,
$$C = KL(P \,\|\, Q) = \sum_i KL(P_i \,\|\, Q_i), \qquad KL(P_i \,\|\, Q_i) = \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}, \tag{3}$$
which is performed with gradient descent for a user-specified number of iterations. In each iteration, every point $y_i$ is adjusted towards the direction of the largest decrease in its associated cost $KL(P_i \,\|\, Q_i)$, i.e., the Kullback-Leibler Divergence (KLD) between the low-dimensional neighborhood of $y_i$ and the high-dimensional neighborhood of $x_i$. Computing this cost involves comparing $y_i$ with all other points, which results in a complexity of $O(n^2)$. The final remaining cost $C$ after the optimization is, then, a sum of all the remaining costs $KL(P_i \,\|\, Q_i)$.
It is important to notice that the original t-SNE algorithm has been updated and accelerated in many different ways throughout the years, most famously by the original author [52], but also by other researchers using techniques such as approximations [53] and parallel computing [54]. These newer versions give mostly accurate results, but are not completely exact. Please refer to Subsection 7.2 for a discussion on why we chose to use the Barnes–Hut t-SNE algorithm in this paper [52].

4 T-VISNE: A VISUAL INSPECTOR OF T-SNE
Most of the related works described in Section 2 deal with the problem of assessing and interpreting DR in general, and aim to be applicable to a wide range of different scenarios, providing solutions that overlook the specific shortcomings of each DR method. While this approach has its merits, a gap remains regarding the treatment of method-specific problems that might lead to more directly-applicable results. However, very few
single DR methods have enough widespread acceptance to warrant customized treatments (with the exception of PCA and MDS, for example). Nowadays, arguably, the situation has changed: t-SNE is almost a standard DR method for both analysts and researchers. Due to this, it is our understanding that a set of methods that is specifically designed to meet t-SNE's shortcomings deserves its place among the current body of work in the interpretation and assessment of DR methods, and its potential is large enough to deserve its own treatment.
In this section we describe t-viSNE, a web-based system that implements an assortment of views and interaction tools that bring to light many facets of a t-SNE projection which are usually hidden behind its black box. We aim to enhance the trust in and interpretability of t-SNE through visualization and exploration of the model, the data, and the hyper-parameters. An overall picture of the interface is shown in Figure 1, and each of its different views is described below, divided into our four design goals: Hyper-parameter Exploration (G1), Overview (G2), Quality (G3), and Dimensions (G4). Further discussions on the design choices behind some of the views can be found in Subsection 7.1.

Fig. 1: Visual inspection of t-SNE results with t-viSNE: (a) a panel for uploading data sets, choosing between two execution modes (grid search or a single set of parameters), and storing new (or loading previous) executions; (b) overview of the results with data-specific labels encoded with categorical colors; (c) the Shepard Heatmap of all pairwise distances; (d) the histogram with the Density and Remaining Cost distributions; (e) list of available projections, ranked by quality; (f) the main scatterplot view representing the Density of neighborhoods in the original high-dimensional space and the Remaining Cost of each point; (g) the Neighborhood Preservation bar chart/line plot; (h) control elements for the different interaction modes of the tool; (i) the visual mapping panel with a variety of options for the users such as an annotation tool for saving notes for multi-session analyses; (j) the Dimension Correlation bar chart visualizing the correlations between the data dimensions; and (k) the Adaptive PCP plot representing the most important dimensions.

4.1 Goal 1: Hyper-parameter Exploration
Significantly-different t-SNE projections can be generated from the same data set, due to its well-known sensitivity to hyper-parameter settings [14]. We propose to support users in finding a good t-SNE projection for their data by using visual exploration, as follows. A Grid Search mode (Figure 1(a)) initiates a systematic parameter search that computes 500 projections by varying the parameters perplexity, learning rate, and max iterations. From this pool of 500 projections, 25 representative examples are singled out and shown to the user, in a matrix of thumbnails depicted in Figure 2, as suggestions of possible projections of the data. In order to choose the representatives, we partition the pool of 500 projections into 25 clusters (with K-Medoids [55]), using Procrustes distance [56] as the dissimilarity measure. The medoids of the 25 resulting clusters are used as representatives. This whole process is transparent to the user and happens in the backend; only the representatives are shown. We give extra support to the user by providing the results of 5 quality measures for each representative projection: neighborhood hit (NH), trustworthiness (T), continuity (C), normalized stress (S), and Shepard diagram correlation (SDC), accompanied by the quality metrics average (QMA). They are shown as a grayscale heatmap under each cell of the thumbnail matrix (Figure 2). For more details on the quality measures, please refer to [9]. It is important to clarify, however, that these quality measures are offered only as a support for the visual analysis. The main goal here is not to show the 25 best projections, but the most diverse ones; it is then the task of users, through visual exploration and by matching their own personal preferences, to choose the one that looks most promising.
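The selection of representatives described above (pairwise Procrustes dissimilarities followed by K-Medoids) can be illustrated with a minimal sketch. It assumes 2-D projections given as lists of (x, y) pairs with matching point order; the names `procrustes_disparity`, `k_medoids`, and `pick_representatives` are hypothetical and not part of t-viSNE. The disparity uses complex arithmetic for the 2-D case (SciPy's `scipy.spatial.procrustes` reports a similar quantity), and the clustering step is a simple alternating variant that stands in for, and may differ from, the K-Medoids algorithm of [55].

```python
import math
import random


def _standardize(points):
    """Center 2-D points and scale them to unit norm, encoded as complex numbers.

    Assumes the points are not all identical (otherwise the norm is zero).
    """
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [v - mean for v in z]
    norm = math.sqrt(sum(abs(v) ** 2 for v in z))
    return [v / norm for v in z]


def procrustes_disparity(a, b):
    """Residual in [0, 1] after the best translation, scaling, rotation, or reflection."""
    za, zb = _standardize(a), _standardize(b)
    rotation = abs(sum(u * v.conjugate() for u, v in zip(za, zb)))
    reflection = abs(sum(u * v for u, v in zip(za, zb)))
    return max(0.0, 1.0 - max(rotation, reflection) ** 2)


def k_medoids(dist, k, max_iter=100, seed=0):
    """Alternating K-Medoids on a precomputed dissimilarity matrix (illustrative)."""
    n = len(dist)
    medoids = random.Random(seed).sample(range(n), k)
    for _ in range(max_iter):
        # Assignment step: attach every item to its nearest medoid.
        clusters = [[] for _ in range(k)]
        for i in range(n):
            nearest = min(range(k), key=lambda c: dist[i][medoids[c]])
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes the total distance within its cluster.
        updated = [
            min(members, key=lambda m: sum(dist[m][j] for j in members))
            if members else medoids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids


def pick_representatives(projections, k, seed=0):
    """Return the indices of k medoid projections from a pool of projections."""
    n = len(projections)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = procrustes_disparity(projections[i], projections[j])
            dist[i][j] = dist[j][i] = d
    return k_medoids(dist, k, seed=seed)
```

With the numbers given in the text, the pool would hold the 500 grid-search projections and `k` would be 25. Using a disparity that ignores rotation, reflection, and scale means that two projections showing the same layout in a different orientation are treated as near-duplicates, which matches the stated aim of showing diverse rather than best projections.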
Fig. 2: Hyper-parameter exploration (presented in a dialog at the beginning of an analytical session), with 25 representative projections from a pool of 500 alternatives obtained through a grid search. Five quality metrics, plus their Quality Metrics Average (QMA), are also displayed to support the visual analysis. The thumbnails are sorted according to the QMA and ordered row-wise from top to bottom. The currently-selected projection is indicated by a red box (top row, third column).

After choosing a projection, users will proceed with the visual analysis using all the functionalities described in the next sections. However, the hyper-parameter exploration does not necessarily stop here. The top 6 representatives (according to a user-selected quality measure) are still shown at the top of the main view (Figure 1(e)), and the projection can be switched at any time if the user is not satisfied with the initial choice. We also provide the mechanism for a selection-based ranking of the representatives. During the exploration of the projection, if the user finds a certain pattern of interest (i.e., cluster, shape, etc.), one possible question might be whether this specific pattern is better visible or better represented in another projection. After selecting these points, the list of top representatives can be ranked again to contain the projections with the best quality regarding the selection (as opposed to the best global quality, which is the default). The way this “selection-based quality” is computed is by adapting the global quality measures we used, taking advantage of the fact that they all work by aggregating a measure-specific quality computation over all the points of the projection. In the case of the selection-based quality, we aggregate only over the selected points to reach the final value of the quality measure, which is then used to re-rank the representatives.

4.2 Goal 2: Overview

The main view of the tool (Figure 1(f)) presents the t-SNE results as an interactive scatterplot, with specific mappings on the points’ colors and sizes (see Subsection 4.3 for details). There are four Interaction Modes (Figure 1(h)) for this view, as described next. The first (and default) mode, t-SNE Points Exploration, activates panning, zooming, and hovering, supporting the user to focus on individual patterns of the projection, and to investigate specific points’ dimensions. The second mode, Group Selection, provides a lasso selection tool that triggers updates in other views, such as the Neighborhood Preservation view (Subsection 4.3) and the Adaptive PCP (Subsection 4.4). The third option, Dimension Correlation, provides a tool for the user to check the hypothesis that a visual pattern, as observed, is strongly correlated to a pattern in the high-dimensional space (Subsection 4.4). The final mode, Reset Filters, removes every filter applied with the previously-described interaction modes.

To complement the main view, the Overview (Figure 1(b)) shows the static t-SNE projection and serves as a contextual anchor that is independent of the interactions and/or filters applied to the main view. Data-specific labels (when those exist) are shown using a categorical colormap, along with simple statistics about the data set.

4.3 Goal 3: Quality

Before the users move on with a more detailed interpretation of the patterns that are visible in the scatterplot resulting from t-SNE’s projection, it is important that they trust what they see. We approach the investigation of quality both globally, with simplified and aggregated views for the entire projection, and locally, so that the users can check if specific visible patterns are indeed present in the original space of the data set.

Shepard Heatmap. A Shepard Diagram [57] is a common way of assessing the accuracy of a visualization produced by a projection method. It consists of a scatterplot where each point represents a pair of instances from the data set. The value of the y-axis indicates their distance in the N-dimensional (N-D) space, and the x-axis their 2-D distance. Both axes are scaled between 0.0 (minimum distance) and 1.0 (maximum distance), with the origin located on the top-left. For large data sets, however, such a scatterplot may become hard to read due to the very large number of points (in the order of $n^2$). To avoid this clutter problem and increase the readability of the Shepard Diagram for large data sets, we propose the Shepard Heatmap (Figure 1(c)), which is an aggregated version of the Shepard Diagram, with the information of the number of points in each cell mapped to a single-hue colormap.

The main goal of the Shepard Heatmap is to offer a broad, simplified overview of the accuracy of the projection in terms of distance preservation: cells close to the main diagonal of the heatmap indicate that the respective pairs of instances have been represented in the 2-D space with distances that are comparable to their original N-D distances. Although it is well-known that t-SNE’s goal is not to preserve distances [10], but neighborhoods, the Shepard Heatmap still provides useful information to the analyst: if the cell values are closer to the y-axis than to the x-axis, then a large part of the data has been compressed, i.e., a diverse range of distances from the N-D space have been represented with small distances in 2-D. The opposite scenario (cell values being closer to x than y) indicates that a small range of N-D distances
have been spread in a wide spectrum of distances in the 2-D visualization.

[Fig. 3 panels: (a) PCA; (b) t-SNE; (c) t-viSNE. Color legend: Density, from low to high. Clusters are labeled 1 to 3.]

Fig. 3: The importance of the visual mapping of Density, using three 5-D Gaussian clusters with varying standard deviations and slight overlap. (a) A simple linear projection using PCA shows the clusters’ varying density. (b) A t-SNE projection shows all clusters with roughly the same size. (c) t-viSNE accurately shows the densities of the clusters (color-encoded) and helps us identify, for example, that clusters 2 and 3 are separate.

Visual Mapping. The Visual Mapping panel (Figure 1(i)) includes controls for mapping Density ($1/\sigma_i$) and Remaining Cost ($KLD(P_i \| Q_i)$) of each point to either color or size in the main view. These correspond to information extracted from the t-SNE algorithm itself, which would otherwise be hidden from the analyst. Their inspection, however, may prove fruitful, as we describe next.

As we discussed in Section 3, when t-SNE models the N-D space as probability distributions, each instance is assigned a different $\sigma_i$ that represents the Density of that instance’s original neighborhood. However, during the projection to the low-dimensional representation (2- or 3-D), this information is usually lost, and neighborhoods with different densities appear to be very similar. Consider the simple example from Figure 3, where three 5-D Gaussian clusters (with varying densities) are projected into 2-D using PCA and t-SNE. The linear projection of PCA shows quite clearly that the clusters have different densities. The t-SNE projection, on the other hand, shows three clusters that are basically identical. We propose to recover this lost density information by extracting the values of $\sigma_i$ from the t-SNE process and mapping them on top of the points (using a sequential colormap, by default). The actual mapping is done with $\sigma_i^{-2}$, so that higher densities (lower values of $\sigma_i$) are mapped to higher values. As an example of the practical consequences of such a mapping, the visualization of the different density profiles of clusters 2 and 3 in t-viSNE (Figure 3(c)) helps to identify that they are separate clusters and not a single large one, which could have been an erroneous insight in case no extra information was available (as in Figure 3(b)).

The second option of the Visual Mapping panel, the Remaining Cost, indicates (in the points’ sizes, by default) the final value of $KLD(P_i \| Q_i)$ for each instance $x_i$, i.e., the remaining cost for each instance after the last iteration of t-SNE’s optimization procedure (see Section 3). It is common for the information of the overall remaining cost ($KLD(P \| Q)$) to be used as a direct judgment of the projection’s quality. However, this perspective is limited, because a low overall remaining cost does not mean that the entire projection is equally good (and vice-versa for a high overall remaining cost). This is related to the idea of local quality measures that has been motivated and explored in different previous work (see Section 2), and shares the potential advantages of these measures. Hence, it allows the analyst to investigate which points (or groups of points) were harder to optimize according to t-SNE’s cost function and, thus, affects the perception of the local trustworthiness of different areas of the projection. A simple example is shown in Figure 4, using the well-known Iris data set [58]. A group of points with high remaining cost can be found in the middle of the largest cluster in Figure 4(a). This cluster is, actually, a mix of two different species (versicolor and virginica), and the points with high remaining cost belong to the area where the two species are mixed, indicating instances where the 2-D mapping might not be as straightforward as the rest. The dimensions of the selected points are highlighted in Figure 4(b) using a PCP (see Subsection 4.4), confirming that these points are indeed characterized by dimension values that are relatively common to both species, which makes them harder to separate into isolated clusters.

[Fig. 4 panels: (a) Points with high remaining cost; (b) Dimensions of selected points; (c) Neighborhood Preservation of selected points. Legend: versicolor, virginica, setosa. Annotations: mixed labels, selection, projection average, selected points, number of neighbors.]

Fig. 4: Investigation of a group of points from the well-known Iris data set [58]. (a) The points’ sizes indicate that a region in-between the species versicolor and virginica has the highest Remaining Cost. (b) The points have similar dimension values, but are classified as different species. (c) Neighborhood Preservation starts high (for close neighbors), but steadily decreases.

Neighborhood Preservation. Since the proposal of non-linear DR methods, the idea of prioritizing the preservation of close neighborhoods instead of pairwise distances in projections has been accepted as a positive trade-off, especially in visualization scenarios. The t-SNE algorithm also follows this idea: by transforming the pairwise distances into probability distributions using Gaussians (cf. Section 3), it aims to preserve only the closest neighbors of each point, effectively ignoring farther ones. Due to this, the ability to investigate the extent to which such neighborhoods are preserved is one important piece of the puzzle that forms a full assessment of the accuracy of a t-SNE projection.

We present a Neighborhood Preservation plot (Figure 1(g)) that shows an overview of the preservation of neighborhoods of different sizes (k) in both the entire projection and the current selection, based on the Jaccard distance between sets:

$$NP_k = \sum_i \frac{1}{n} \cdot \frac{\nu_k^2(i) \cap \nu_k^N(i)}{\nu_k^2(i) \cup \nu_k^N(i)}, \qquad (4)$$

where $\nu_k^2(i)$ is the k-neighborhood of instance i in 2-D, $\nu_k^N(i)$ is the k-neighborhood of instance i in N-D, and n is the number of selected points (or the size of the data set, if nothing is selected). For each value of k, $NP_k$ yields the average preservation of neighborhoods of up to k points, centered at the n selected points (or for the entire projection, if nothing is selected). This is an aggregated and interactive adaptation of ideas introduced by, for example, Joia et al. [7] and Martins et al. [33]. The default visualization for the Neighborhood Preservation is a bar chart (as
described below), but users have two more options to visualize the same information using line plots (see Subsection 7.1 for a discussion and comparison).
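To make Equation 4 concrete, the sketch below computes the Neighborhood Preservation value for a toy data set, reading the fraction as the ratio between the sizes of the intersection and the union of the two neighbor sets (the usual Jaccard index). The function names, the brute-force Euclidean neighbor search, and the choice to search for neighbors among all points (not only the selected ones) are assumptions made for illustration; they are not details confirmed by the text.

```python
import math


def k_nearest(points, i, k):
    """Set of indices of the k nearest neighbors of points[i], excluding i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])


def neighborhood_preservation(high, low, k, selected=None):
    """Average Jaccard overlap of k-neighborhoods in N-D (`high`) and 2-D (`low`).

    `selected` lists the indices of the selected points; if it is None,
    the average runs over the whole data set.
    """
    indices = list(range(len(high))) if selected is None else list(selected)
    total = 0.0
    for i in indices:
        nn_low = k_nearest(low, i, k)
        nn_high = k_nearest(high, i, k)
        total += len(nn_low & nn_high) / len(nn_low | nn_high)
    return total / len(indices)


# Toy data: four 3-D points on a line and two candidate 2-D layouts.
high = [(0, 0, 0), (1, 0, 0), (2.5, 0, 0), (10, 0, 0)]
faithful = [(0, 0), (1, 0), (2.5, 0), (10, 0)]    # same ordering as N-D
scrambled = [(0, 0), (10, 0), (2.5, 0), (1, 0)]   # points 1 and 3 swapped

np_faithful = neighborhood_preservation(high, faithful, k=1)
np_scrambled = neighborhood_preservation(high, scrambled, k=1)
np_scrambled_k2 = neighborhood_preservation(high, scrambled, k=2)
```

Evaluating the function for a range of k values, once for the whole data set and once for a selection, would give the two series of values that a plot like the one described in the text compares.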
The black bars are always fixed, showing the average preservation for all points of the projection. For example, in Figure 4(c), the relatively tall black bars starting from the point k = 20 mean that, on average, neighborhoods of 20 points or more are well preserved. The same rationale applies to the gray-colored bars. However, their values change in connection with the lasso selection, so that they always show an up-to-date view of the Neighborhood Preservation centered at the selected group of points. This allows the analyst to compare them to the rest of the projection to get a relative assessment, which is important since there are no absolute rules as to how much preservation is good or bad; such insights depend on the scale of the data set and of each high-dimensional pattern. In Figure 4(c), for example, the tall gray-colored bars around k = 4 mean that, on average, neighborhoods of around 4 points are well preserved for the selected points. This is in contrast to the overall preservation, which starts low and grows slowly with k. Since the selected points are positioned at the border between the two species clusters, they have very close near neighbors (i.e., points which are located in-between species), but as the value of k grows, their neighborhoods become more mixed and, thus, less well-preserved.

4.4 Goal 4: Dimensions

Having established trust in the visualization, the users then proceed to identify and investigate the visible patterns from the projected data. One of the most common analytical tasks in any DR-based workflow is, for example, to identify clusters of similar points [16], with the goal of detecting patterns in the organization of the data in the high-dimensional space. Irregularly-shaped clusters are also of interest [14], which suggests that the points’ organization along a non-linear multidimensional axis might be relevant. The problem of explaining the reasons why those clusters are formed is tackled by a number of t-viSNE views that are described next.

Adaptive Parallel Coordinates Plot. Our first proposal to support the task of interpreting patterns in a t-SNE projection is an Adaptive PCP [59], as shown in Figure 1(k). It highlights the dimensions of the points selected with the lasso tool, using a maximum of 8 axes at any time, to avoid clutter. The shown axes (and their order) are, however, not fixed, as is the usual case. Instead, they are adapted to the selection in the following way. First, a Principal Component Analysis (PCA) [1] is performed using only the selected points, but with all dimensions. That yields two results: (1) a set of eigenvectors that represent a new base that best explains the variance of the selected points, and (2) a set of eigenvalues that represent how much variance is explained by each eigenvector. Simulating a reduction of the dimensions of the selected points to 1-dimensional space using PCA, we pick the eigenvector with the largest eigenvalue, i.e., the most representative one. This N-D vector can be seen as a sequence w of N weights, one per original dimension, where the value of $w_j$ indicates the importance of dimension j in explaining the variance of the user-selected subset of the data. Finally, we sort w in descending order, then pick the dimensions that correspond to the first (up to) 8 values of the sorted w. These are the (up to) 8 dimensions shown in the PCP axes, in the same descending order (from left to right).

Apart from the adaptive filtering and re-ordering of the axes, we maintained a rather standard visual presentation of the PCP plot, to make sure it is as easy and natural as possible for users to inspect it. The colors reflect the labels of the data with the same colors as in the overview (Subsection 4.2), when available, and the rest of the instances of the data, which are not selected, are shown with high transparency. Each axis maps the entire range of each dimension, from bottom to top. A simple example is given in Figure 4(b), where we can see that the dimensions of the selected points roughly appear at the intersection between two species, versicolor (brown) and virginica (orange).

Fig. 5: The Dimension Correlation tool. (a) Nearby points are projected to a user-drawn path, creating a user-induced ordering. Here 7, 3, 4, and so on are data instance IDs. (b) The user-induced ordering is compared to dimension-specific orderings using a correlation measure. (c) Results are shown in the lengths of bars, ordered by the absolute value of the correlation (with highest on top). Note that if the same polyline is drawn by the user in the opposite direction over a pattern, then the signs of the correlations change but not their magnitude.

Dimension Correlation. Supporting the interpretation of clusters is definitely one important step towards interpreting t-SNE, but it does not cover the entire picture. As it has been noted by Wattenberg et al. [14], t-SNE commonly generates visual patterns with different shapes, which may or may not faithfully represent the actual shapes of the original high-dimensional patterns. It is natural to expect that the user, upon seeing an oddly-shaped pattern, will come up with different hypotheses about why that shape exists, or at least will be curious to try to understand what exactly caused such a shape to appear.

We propose the Dimension Correlation tool, a novel interactive tool to explore and interpret such shapes in a t-SNE projection. It is triggered by a user interaction that consists of drawing a polyline with the mouse (i.e., a sequence of connected line segments), following the shape of the pattern detected by the user. After the polyline is finished, all points within a user-defined range ρ of the polyline are selected and “projected” onto the polyline, in the following way (cf. Figure 5): (1) we find the minimum distance $d_{ip}$ between each point i in the scatterplot to the polyline p, defined as the minimum distance from i to any segment of p; (2) every point i such that $d_{ip} > ρ$ is discarded; and (3) for the remaining points, we find the point $p_i$ that is the projection of i into p, i.e., the projection of i into the segment of p that is closest to i.

If we ignore the actual distances between the points $p_i$ obtained along the polyline, a user-defined ordering can be induced (or extracted) for the points i that were not discarded during the process (cf. Figure 5(b)). This is one possible way of modeling, in a simple and unambiguous way, the shape of the visual pattern perceived by the user. Based on this ordering, we can then investigate which dimensions are more correlated to the pattern, i.e., are more relevant to explain its significance. For that, we first
generate a set of dimension-specific orderings for the same points i that were projected onto the polyline, using the values of these points along each dimension for the ordering (cf. Figure 5(b)). For example, in a data set X, for the dimension-specific ordering of dimension j, the values $X_{i,j}$ will be used (for the selected points i). The relevance of each dimension is then defined as the absolute value of the correlation between its dimension-specific ordering and the user-defined ordering of the points i, which is equivalent to the Spearman’s rank correlation coefficient [60]. We use the absolute value here, because the fact that the correlation is positive or negative is not critical. A strong negative correlation simply means that the pattern goes in the opposite direction of the one used when drawing the polyline.

The results (i.e., relevances of each dimension) are finally shown in an interactive horizontal bar chart (Figure 1(j)), where the dimensions are sorted from top to bottom according to relevance (with the most relevant on the top). While the relevance is computed using the absolute value of the correlation, we decided to show the original value in the bars (including negative correlations to the left of the central axis) to avoid possibly misleading the analyst. This is illustrated in Figure 5(c). The final component of the Dimension Correlation tool is the ability to explore the different dimensions by clicking on the bars, which will change the colormap of the main view to reflect the values of the points for that specific dimension.

It is important to notice that the goal of the Dimension Correlation tool is not to dictate exactly which are the dimensions that cause the formation of a shape in a t-SNE projection. We propose a way to suggest the most interesting dimensions according to a detected visual pattern, in order to help analysts to prioritize the dimensions they will investigate further. Mapping the values of specific dimensions on top of the points of the scatterplot (usually with colors) is a common way to try to find relationships between dimensions and patterns during the exploration of DR projections. Without any support, it is also usually a cumbersome activity for high-dimensional data sets, requiring analysts to cycle through a large number of dimensions. Our intention with the Dimension Correlation tool is to work towards closing this gap.

5 USE CASES

In this section, we demonstrate how our tool can support users to better understand the general behavior of t-SNE and to validate the quality of t-SNE results by presenting a typical usage scenario and a more detailed use case, both based on data sets from the medical domain. This section follows the methodology from Ming et al. [61] in order to showcase our tool’s abilities to open the black box of an ML approach in a similar way. However, the usage tasks discussed in the following are very different due to our use of the unsupervised t-SNE algorithm, in contrast to their investigations of supervised ML techniques.

5.1 Usage Scenario: Understanding a Cancer Classifier

Anna is a medical student who is enthusiastic about becoming a specialist in identifying and treating breast cancer. She heard about a DR algorithm called t-SNE, and she is eager to know if it can help her to identify cancer cells accurately. Personally, Anna does not completely trust the decisions made from automatic algorithms (such as classifiers), so she would prefer to use an interactive visualization. She decides, then, to use t-SNE to explore the Breast Cancer Wisconsin data set which she downloaded from the UCI machine learning repository [58]. The data set contains measurements for 699 breast cancer cases, labeled into benign or malignant cancer. The nine dimensions included in this data set are cytological characteristics rated from 1 to 10 (higher means closer to malignant) when the instances were collected. However, she read on the Internet that t-SNE is a complex algorithm, and most of its decisions are hidden from the user perspective. After finding that t-viSNE allows her to interpret and assess t-SNE’s results, she decides to use it.

Overall Accuracy. Anna loads the data into t-viSNE and starts the hyper-parameter exploration with a grid search. After the execution, she sees several projections that accurately separate the two classes. As she does not have any special preference, she selects the top-left projection, because the projections are sorted from best to worst based on the average of all the provided quality metrics. After the resulting scatterplot is loaded in the main view, she starts to investigate the overall quality by looking at the Shepard Heatmap, see Figure 6(b). Most values are situated along the diagonal of the heatmap, which, as she learned from the documentation of the tool, suggests that it is a rather accurate projection. Also, by examining the distribution of points by color in the overview (Figure 6(a)), she gets the impression that the points are mostly correctly arranged into two classes (malignant cancer cases on the left and benign cancer cases on the right). Since labels are not used by t-SNE (it is an unsupervised technique), this further supports her initial assumption that the produced results are accurate.

When she looks at the main view again, one thing catches her eye: there is quite a difference in density between the two large clusters of points (as shown by the points’ colors in Figure 6(c)). The cluster to the left (mostly malignant cases) has low density in general, as opposed to the cluster to the right (mostly benign cases), which seems to be quite dense. “That is strange,” she thinks. It seems to be the opposite of what the t-SNE projection is showing, since the cluster to the left looks more compact in the projection. It also indicates that benign cases are more homogeneous in the high-dimensional space, being closer to each other than the malignant cases, and it could also mean that malignant cases have a less clear profile.

Interpretation of Clusters. Anna is satisfied with her initial look at the data set through the projection, but one question comes up in her mind: how did the algorithm manage to separate the cases between benign and malignant? If she understood how that worked, she might not only be able to validate if the results make sense, but also use that knowledge to better understand the differences between the cases in terms of cytological characteristics. Anna uses the Dimension Correlation in order to determine the role of the data set’s dimensions in the outcome of the projection. She interactively draws a polyline with her mouse following the pattern from the benign cases to the malignant ones, as shown in Figure 6(c). By looking at the Dimension Correlation view (see Figure 6(d)), she observes that “mitoses” is the least important dimension due to its weak correlation (approximately 18%). She validates her hypothesis by clicking on the “mitoses” dimension and observing that the actual dimension values look almost randomly distributed throughout the projected points. Afterward, she resets the current selection and draws two new polylines, which are perpendicular to the previous one, through the points of (1) the malignant class (see Figure 6(e)) and (2) the benign class (not shown due to space limits). For this new investigation, she is only
interested in the highest correlations, so she sets a threshold for a minimum of 20% for a correlation to be visible. For the first case (1), it appears that t-SNE separates the malignant class according to “normal nucleoli,” “size uniformity,” and “shape uniformity” in one area (as explained in Figure 5) and the other area due to “bare nuclei” (Figure 6(f)). The order and direction of the produced bar charts (in accordance with the orientation of the initially-drawn shape) allowed her to reach this conclusion. In the second case (2) (not included due to space constraints), she spotted that there is a pattern of a rapid increase in the “clump thickness” (more than 80% correlation) when going from the middle-left side to the bottom side of the cluster with the benign classified points. “This is new,” she thinks. These connections between the dimensions and the formation of the clusters are something she was previously not aware of.

[Fig. 6 panels: (a) to (h), grouped by analytical flow into Overall Accuracy, Interpretation of Clusters, and Investigation of Outliers. Dimension labels visible in the panels: Bland Chromatin, Single Epithelial Cell Size, Normal Nucleoli, Mitoses, Size Uniformity, Bare Nuclei, Shape Uniformity. Selection counters: 10/699 and 5/699 selected points.]

Fig. 6: Usage scenario based on the Breast Cancer Wisconsin data set. The Overview (a) and the Shepard Heatmap (b) indicate that the overall accuracy is good. The high density of benign cases (c) seems to indicate that their high-dimensional profile is clearer and less diverse than malignant cases, which are more sparse. Different combinations of dimensions are correlated with patterns between clusters (c, d) and inside clusters (e, f), which affects the interpretation of clusters. The investigation of outliers leads to identifying points that are hard to classify due to class mixing (g) and groups with identical dimension values (h).

Investigation of Outliers. Next, by looking back at the t-SNE overview, she identifies a red-colored instance positioned far away from the rest of the malignant points, which grabs her attention (Figure 6(a), bottom). She thinks it might be an error in the projection, and decides to examine it closer by selecting a few points around the potential outlier with the lasso selection (only one point in the selection is malignant, while all others are benign). The PCP view adapts to the selection (Figure 6(g)), and she is able to acknowledge that, indeed, these points have very similar values for most dimensions, so the seemingly erroneous positioning of the point was not t-SNE’s fault. “These points are very similar, which means it must be hard to decide exactly where they belong,” Anna thinks. She is glad she could investigate them further and check their dimensions with interactive visualization; an automatic procedure might have simply misclassified that instance, with no clear explanation of why that happened. Finally, when zooming into the main scatterplot view, she discovers a larger number of mini-clusters, such as the one shown in Figure 6(c), where compact points form a tight subcluster at lower zoom levels. By looking at the PCP again (Figure 6(h)), she realizes that these points are all exactly the same (i.e., they have the same dimension values). After investigating similar subclusters with a large density, she learns that t-SNE formed this and more mini-clusters in different areas of the projection as a result of their high (usually identical) similarity.

5.2 Use Case: Improving Diabetes Classification

In our use case, we chose the Pima Indian Diabetes data set [62] to illustrate how t-viSNE can lead to a better overview, quality of the results, dimension understanding, and even performance improvements. The data set includes 768 female patients of Pima Indian heritage, aged between 21 and 81. The main task in this example is to classify the patients into positive (which have diabetes; 268 data points) or negative to diabetes (i.e., healthy; 500 data points). Every data instance contains eight dimensions: the number of times each patient/person was pregnant and their age, plasma glucose concentration level, diastolic blood pressure, skin thickness, insulin level, body mass index (BMI), and diabetes pedigree function (DPF), which is a function measuring the hereditary or genetic risk of having diabetes.

Overall Accuracy. We start by executing a grid search and, after a few seconds, we are presented with 25 representative projections. As we notice that the projections lack high values in continuity, we choose to sort the projections based on this quality metric for further investigation. Next, as the projections are quite different and none of them appears to have a clear advantage over the others, we pick one with good values for all the rest of the quality metrics (i.e., greater than 40%). The overview in Figure 7(a) shows the selected projection with three clear clusters of varying sizes (marked with C1, C2, and C3). However, the labels seem to be mixed in all of them. That means either the projections are not very good, or the labels are simply very hard to separate. By analyzing the Shepard Heatmap (Figure 7(b)), it seems that there is a distortion in how the projection represents the original N-D distances: the darker cells of the heatmap are above the diagonal and concentrated near the origin, which means that the lowest N-D distances (up to 30% of the maximum) have been represented in the projection with a wide range of 2-D distances (up to 60% of the maximum). While it may be argued that the
[Fig. 7 panels: (a) to (g). Annotations: clusters C1, C2, and C3; low-density cluster; low-density “tip”; high remaining cost; zoomed-in view (d); user-drawn polyline; heatmap origin and diagonal. Dimension labels: Insulin, Glucose, BMI.]
Fig. 7: Use case based on the Pima Indian Diabetes data set. Although there are three separate clusters C1–C3, the class labels are
mostly mixed (a), and the Shepard Heatmap (b) indicates that smaller N-D distances are spread out in 2-D. Some insights about the
clusters (c): C1 has a small area with high remaining cost (d); C2 has a clearly-distorted shape that is highly correlated with the Insulin
dimension (f, g); and C3 is tight in the projection, but sparse (low density) in N-D. All (red-colored) selected areas show, in general,
good Neighborhood Preservation (e) starting from k = 20, except for subcluster 1 in C1, whose preservation decreases as k increases.
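The Shepard Heatmap in panel (b) can be read as a binned Shepard diagram. The following is a minimal sketch of that idea, not the t-viSNE implementation: it assumes Euclidean distances, divides both distance sets by their maxima, and counts point pairs in a 10 × 10 grid (the default bin count stated in Subsection 7.1). The function names and the random toy data are illustrative assumptions.

```python
import numpy as np

def pairwise_distances(points: np.ndarray) -> np.ndarray:
    """Return the Euclidean distances between all pairs i < j as a flat vector."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(points), k=1)
    return dist[i, j]

def shepard_heatmap(high_dim: np.ndarray, low_dim: np.ndarray, bins: int = 10) -> np.ndarray:
    """Count point pairs per (N-D distance, 2-D distance) cell.

    Both distance vectors are divided by their maximum, so each axis spans [0, 1].
    Rows index N-D distance bins, columns index 2-D distance bins.
    """
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    d_high = d_high / d_high.max()
    d_low = d_low / d_low.max()
    counts, _, _ = np.histogram2d(d_high, d_low, bins=bins, range=[[0, 1], [0, 1]])
    return counts

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))   # toy N-D data (assumption, not the Pima data set)
Y = X[:, :2]                   # stand-in for a 2-D projection (assumption)
heatmap = shepard_heatmap(X, Y)
```

In such a grid, pairs whose relative distance is kept by the projection fall near the diagonal. Mass in cells with a small N-D distance but a large 2-D distance corresponds to the pattern described for Figure 7(b), where small N-D distances are spread out in 2-D.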
data is too spread in the projection, we must always consider that t-SNE’s goal is not to
preserve all pairwise distances, but only close neighborhoods. The projection has used most of
its available 2-D space to represent (as best as possible) the smallest N-D distances, which can
be considered a good trade-off for this specific objective. In the following paragraphs, we
concentrate on some of the goals described in Subsection 4.3 and Subsection 4.4 for each of the
three clusters.

C1: Remaining Cost Looking at the main view (Figure 7(c), marker 1), we detect an area on the top
of cluster C1 with slightly increased size for a few points (in comparison to the other points in
the same cluster), which means there are high values of remaining cost in this small area. This
is usually a sign of a badly-optimized area that should not be trusted. To confirm that, we look
at the KLD distribution (Figure 7(d)): the vast majority of points are located between 0.1 and
0.6 on the x-axis. This means that those were very well optimized (notice that the y-axis is in
log scale). Only a handful of points show higher costs, and those few larger points in C1 belong
to this group. Additionally, when we inspect the Neighborhood Preservation plot (Figure 7(e)), we
see that the badly-optimized area has lower values compared to the projection’s average, but the
values decrease even more after k > 26 in contrast to those around k = 20. That means these
points are not well-positioned compared to both very close neighbors and the entire projection.
These two aspects of our investigation confirm our reservations against this area.

C2: Interpretation of Patterns One salient pattern that stands out in the projection
(Figure 7(c)) is the long curved shape of cluster C2. As opposed to C1 and C3, which look like
ordinary (formless) clusters, the points in C2 have been laid out in the 2-D projection in an
elongated shape going from top to bottom, with slight curves to the right and then to the left.
It would be natural to hypothesize that there is some specific underlying factor in the data that
caused this shape to happen, and to be curious as to what exactly that factor is. Our proposed
Dimension Correlation tool was designed to answer such questions. For that, we first draw a
polyline that simulates a “skeleton” of C2’s shape (user-drawn polyline in Figure 7(c)). The
results show high correlation for the “insulin” dimension along our drawn path, with a value of
just below 70% (Figure 7(f)), and low correlation with all other dimensions. Finally, we click on
the bar to indicate that we want this specific dimension’s values to be presented, which results
in a clear color gradient from the bottom to the top of C2 (Figure 7(g)). This color gradient
corresponds to increasing levels of insulin, as can be seen in the color legend. We can then
interpret that the insulin dimension has a high correlation with the formation of this specific
shape.

C3: Densities The next step in our analysis is to confirm whether the layout of the points
accurately represents the original N-D densities of the clusters. By inspecting the distribution
of colors over the points in the main view (Figure 7(c)), we can see that each cluster has a
different density profile: C1 presents the most dense neighborhoods, C2 has average-to-high
density throughout most of its points (with a small tip with very low density), and C3 has low
density overall. This quick look is enough to catch two interesting insights: we confirm that the
neighborhoods with highest densities (i.e., containing the smallest pairwise distances) are
indeed spread out by the projection, as we had initially hypothesized from the Shepard Heatmap;
and we detect a quite counter-intuitive phenomenon where the areas with the lowest density in N-D
(or the most sparse areas) are represented in 2-D in the most compact neighborhoods (marked in
Figure 7(c) as “low-density ‘tip’” 2 and “low-density cluster” 3). By inspecting the Neighborhood
Preservation of the latter low-density area in C3 (Figure 7(e)), we have more evidence that this
insight is indeed correct, since the preservation of the small “low-density cluster” 3 is already
relatively high at k = 20 and increases further, with a peak around k = 30. We can conclude that,
even though this area is sparse in N-D, it presents high cohesiveness in its neighborhood, which
causes t-SNE to embed the corresponding points as a compact group.
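The Neighborhood Preservation values discussed for C1 and C3 can be illustrated with a short sketch. It assumes a common definition: for a given k, the fraction of each point's k nearest neighbors in N-D that are also among its k nearest neighbors in 2-D, averaged over all points or over a selection. t-viSNE takes its quality measures from Projlib, whose exact computation may differ; the function names and toy data below are hypothetical.

```python
import numpy as np

def knn_indices(points: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of every point (excluding the point itself)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def neighborhood_preservation(high_dim, low_dim, k, selection=None):
    """Mean fraction of N-D k-nearest neighbors that are kept in the 2-D layout.

    `selection` is an optional list of point indices (e.g., a lasso selection);
    without it, the average over the whole projection is returned.
    """
    nn_high = knn_indices(high_dim, k)
    nn_low = knn_indices(low_dim, k)
    rows = range(len(high_dim)) if selection is None else selection
    shared = [len(set(nn_high[i]) & set(nn_low[i])) / k for i in rows]
    return float(np.mean(shared))

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))        # toy N-D data (assumption)
Y_same = X.copy()                   # "projection" that keeps every distance
Y_rand = rng.normal(size=(60, 2))   # unrelated 2-D layout

perfect = neighborhood_preservation(X, Y_same, k=20)
noisy = neighborhood_preservation(X, Y_rand, k=20)
subset = neighborhood_preservation(X, Y_same, k=20, selection=[0, 1, 2])
```

Evaluating the function for a range of k values, once for a selection and once for the whole projection, gives the two series that a plot such as Figure 7(e) compares.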
Closing the Visual Analysis Loop A more detailed investigation of C3 (Figure 7(c), marker 3)
shows that some of the internal variation of this cluster has not been well represented by t-SNE,
since the points are mostly overlapping. It appears that the variation of the other clusters was
prioritized by the algorithm, leaving C3 with the appearance of almost a single point. Such
insights, found only through the visual analysis, also contribute to the investigation of the
quality of the projection, and in t-viSNE they can be used to trigger a search for an improved
projection before the visual analysis proceeds. Thus, we use a lasso selection to choose C3, then
use the “optimize selection” button (see Figure 1(e), top right) to identify the best projections
for the selection. After sorting the six results based on QMA, the chosen one can be seen in
Figure 1. The main difference between this new projection and the previously-analyzed example is
that perplexity is set to 10 instead of 50, making the clusters much sparser. The values of all
the quality metrics are still high for this new projection, and cluster C3 can now be entirely
explored without the necessity to zoom in (cf. Figure 1(f), highlighted). In Figure 1(g), values
of k = 1 to k = 13 are high, demonstrating the good Neighborhood Preservation in C3. Also,
Dimension Correlation investigation indicates that “BMI” and “glucose” are highly correlated to
C3 (see Figure 1(j)), and Figure 1(k) highlights the differences in the dimensions and the
instances of C3 in connection to the better separated (compared to before) true labels, cf.
Figure 1(b).

6 USER EVALUATION
In addition to the described use cases, we performed a comparative user experiment in order to
gather evidence on the effectiveness of our visualization tool against another state-of-the-art
alternative, Google’s Embedding Projector (GEP) [63], as described next. The results of a
pre-study, with a single group that tested only t-viSNE, can be found in the supplemental
material of this paper.

6.1 Comparative User Experiment
The main goal of the study was to test if t-viSNE improved the usability and effectiveness of the
exploration of high-dimensional data with t-SNE when compared to another state-of-the-art tool.
Table 1 in Section 2 was used as the basis for an Analysis of Competing Hypotheses (ACH) [64], a
methodology for the fair comparison of a collection of opposing hypotheses; in our case, the
multiple different views offered by our tool in terms of the capabilities and various
possibilities they convey to the user. After the analysis, we decided on GEP mainly because it
has a good overlap of functionalities with t-viSNE, is well-known, available online, and works
correctly with user-provided data. VisCoDeR [22], for example, also provides an overlap of
features, but the focus of the tool and the tasks it supports (the comparison of DR methods) is
very different from the focus of our experiment. Clustervision [51], on the other hand, did not
work when we tried to load our own data sets.

Research Questions The goals of the experiment are defined by two research questions, RQ1: “Will
the users spend the same time performing the tasks in both tools?”, and RQ2: “Will both tools
provide, from the users’ perspective, the same level of support for the given tasks?” Thus, we
were interested in checking the completion time for the tasks in each tool (related to RQ1). For
answering RQ2, we studied the users’ feedback both for specific tasks (i.e., the tool
supportiveness) and in general (with the help of the ICE-T methodology, cf. Section 6.3).

[Figure 8: image not reproduced. Panels: Age Distribution (in years, for the t-viSNE and GEP
groups); Gender Distribution (Male, Female, Other); Completed Education (None, BSc, MSc, PhD);
Experience with InfoVis (Yes, No); Experience with DR (A Lot, A Little, Never); Experience with
t-SNE (A Lot, A Little, Never). Vertical axes: # of Participants.]
Fig. 8: Statistics on the participants of our comparative user study, split into the two groups
using t-viSNE and Google’s Embedding Projector (GEP).

Participants Our target group was data analysts who were interested in analyzing high-dimensional
data, felt they needed better tools for the job, and preferably were familiar with either t-SNE
or DR in general. We reached out to volunteers through relevant mailing lists and by contacting
visualization research groups at three universities in Sweden, and the 28 respondents (19
researchers, 6 students, and 3 practitioners) were assigned to two groups of 14 individuals: GEP
and t-viSNE. The assignment was performed by preserving, as much as possible, the balance between
completed education, previous experiences, and other characteristics, see Figure 8. All of the
participants except one had no color perception issues. The one who reported a minor distinction
problem between almost identical shades of red and green confirmed having no problem correctly
perceiving the specific color gradients when using the tool (t-viSNE). Therefore, we decided to
keep those results in the study. For more details of our participants, we refer to Figure 8.

Study Design Each participant took part individually (i.e., the study was performed
asynchronously for each subject, in a silent room), using the same hardware, and the study was
organized into four main steps, which were identical for both groups except that each interacted
with the corresponding group’s tool (GEP or t-viSNE). First, they were shown a video tutorial
which discussed t-SNE itself and the main features of the tool (cf. supplemental material of this
work). An illustrated transcription of this tutorial was available at all times in the form of a
printout. In the second step, after watching the video, the participants had a fixed time slot to
play with the tool without any specific goal, and to ask questions. After this time slot ended,
no more questions were answered. The third step was to perform a set of specific tasks described
in a handout, using a t-SNE projection of the Breast Cancer Wisconsin data set provided by the
tool, and to answer the questions related to these tasks (see Tasks below for details).
Participants were also asked to notify us when each task was completed, so we could track the
task-specific completion times.

Finally, in the fourth step, they filled out a feedback form based on the ICE-T methodology [65].
The ICE-T evaluation form focuses on the value in and the interactivity of visualizations. It has
four main high-level components, i.e., the pillars of the approach proposed by Wall et al. [65]:
Insight, Confidence, Essence, and Time (ICE-T). Each of these pillars consists of two to three
sub-questions representing the mid-level guidelines. Subsequently, each of these mid-level
guidelines has one to three low-level heuristics, adding up to a total number of 21 heuristics in
the end. In more detail: Insight is the ability to impel and identify insights or insightful
questions about the data. Confidence is the ability to produce confidence and trustworthiness in
the data. Essence
[Figure 9: image not reproduced. Top row: Completion Time (in minutes) per task and Tool
Supportiveness (Likert scale, 1 to 5) per task, for t-viSNE and GEP. Bottom row: histograms
(# of Participants) of the answers to Q.1.1 (Task 1), Q.1.2 (Task 1, # of trials), Q.2.1
(Task 2), Q.3.1 (Task 3), Q.4.1 (Task 4), Q.5.1 (Task 5), and Q.6.1 (Task 6).]
Fig. 9: Results of the comparative study: the top charts show completion time and tool supportiveness (as judged by participants) for all
the tasks of the study, and the bottom row includes the histograms of the participants’ responses in all questions/tasks. The completion
times between the two groups were very similar, but t-viSNE got consistently higher scores for tool supportiveness in all tasks. For a
detailed analysis of each of the individual tasks’ results, please refer to Subsection 6.2.
is the capability to communicate concisely an overall essence of the data. Lastly, Time is the
capability to reduce the necessary total time to respond to a large variety of queries about
data. The operationalization of this conceptual approach led to the ICE-T evaluation form, where
raters can give an answer for each heuristic in a 7-point Likert rating scale, or the “Not
Applicable” answer.

Tasks Six tasks were provided to the participants, without any specific mention of the tool’s
features. In consequence, the participants themselves were responsible for performing them to the
best of their abilities. The six tasks were designed to match the six main pitfalls of the
exploration of high-dimensional data with t-SNE, as defined by Wattenberg et al. [14]. Their
numbering follows the same order as described in Section 1 (so Task 1 is related to pitfall (i),
Task 2 is related to pitfall (ii), and so on). Each task consisted of one, two, or three
questions that the participants were asked to answer, all with multiple choices (except for
Q.1.2) including “I do not know”. Before moving on to the next task, they were also asked to rate
how supportive the tool was for that task. A quick summary of the tasks is presented together
with the results in the next section. Please refer to the handout provided in the supplemental
materials for the complete description, as seen by the participants.

6.2 Results
Figure 9 provides a summary of the data gathered during the experiment, more specifically: the
task completion times, the reported supportiveness of the tools on each task, and the
distributions of answers to each task. The analysis of the ICE-T results can be found further
below in Subsection 6.3.

One initial observation is that the overall Completion Time for both groups was remarkably
similar. With the exception of Tasks 1 and 5, where t-viSNE users performed faster than GEP
users, in general the results have not shown any statistically significant difference.

To answer RQ1, we detected no statistically significant difference in the time the users needed
to perform the given tasks for both tools, in general.

On the other hand, t-viSNE obtained consistently higher scores for Tool Supportiveness, with a
higher average in all the proposed tasks. The bulk of the distributions of the supportiveness
scores from the two groups overlap little, mostly near outliers (the “N/A” option was chosen
three times, all in the GEP group). While this is of course based on subjective user feedback, we
consider that it is nonetheless an important aspect of the results; since both t-viSNE and GEP
mainly aim to support the exploratory visual analysis of high-dimensional data (through many
different coordinated views and interactive tools), it may become hard to set a single, concrete
ground-truth for evaluating their performance as a whole. Thus, the users’ perception of how much
the tool supported their intended goals can be one (but not the only) good indication of how
useful the tool actually is.

Task-Specific Qualitative Analysis We proceed by comparing the results of the two groups in each
task individually, using the task-specific histograms from the bottom row of Figure 9. Our goal
here is to perform an informal and qualitative analysis of the results, using the data from the
experiment as input, to obtain more insights on the differences of the user experiences with the
two different tools.

In Task 1, Choosing More Effective Parameters, participants were asked to find problems in their
chosen t-SNE layout (Q.1.1) and to note how many times they tuned the t-SNE parameters before
starting the experiment (Q.1.2). In both cases, smaller is better (i.e., fewer problems and
easier to tune the parameters). For Q.1.1, the distributions of the answers of the two groups are
symmetrically opposite: most t-viSNE users found fewer problems with the initial layout (answers
2 and 3), while most GEP users found many problems, to the point of considering them too hard to
count (i.e., answers 3 and 4). This indicates that t-viSNE users were, in general, more satisfied
with the t-SNE layout after setting the parameters. The answers to Q.1.2 also show that t-viSNE
users needed fewer iterations to find a good parameter setting.

In Task 2, Deciding About (Ir-)Relevant Sizes of Clusters, the goal was to determine the relative
density (or, conversely, the sparsity) of the clusters. The expected answer (see the
visualization in Figure 6(c), for example) is that the benign cluster is denser (even though it
may appear less dense, when no extra information is provided in the projection), which
corresponds to answer 1. We can see from the histogram of Q.2.1 that most of the t-viSNE group
agreed with this result, while the GEP group mostly chose answer 2: “The benign cluster is
sparser than the malignant”.
For Task 3, Evaluating Original Space Distances, participants had to judge the quality of the
distance preservation in the projection. Most participants from the t-viSNE group chose answer 3,
good (but not perfect) distance preservation, which seems to align well with the Shepard Heatmap
from Figure 6(b), for example. The answers from the GEP group were mostly scattered, with a
tendency towards answers 4 (distances are only slightly preserved) or 5 (“I do not know”).

Task 4, Extracting Patterns from the Projection, consisted simply of determining the number of
clusters in the projection. The results from both groups were quite similarly distributed, with
most participants choosing 2 clusters (as expected, see e.g. Figure 6(a)). One difference is that
4 participants from the GEP group chose 1 cluster, which could indicate that GEP failed to
clearly separate the two clusters in some cases.

For Task 5, Observing and Exploring Shapes, participants were asked to determine the least
important dimension that affected the shape of the clusters. All participants from the t-viSNE
group chose answer 4, mitoses, in agreement with our own observations for this data set (e.g.,
Figure 6(d)) and previous work (e.g., [66]). While we cannot claim that this is the correct
answer, the results are encouraging from the perspective of the consistency of the participants’
experience with the tool. The GEP answers, on the other hand, were mostly scattered, but with a
tendency towards answer 5 (“I do not know”).

Finally, the goal of Task 6, Interpreting and Assessing Local Topology, was to find and interpret
“unusual” patterns in the projection, more specifically formations that are known to happen in
this data set because of identical points, i.e., data points which have the same values for all
dimensions. This corresponded to answer 1, which was correctly identified by most participants
from the t-viSNE group. The answers for the GEP group were again mostly scattered, with 6 of them
choosing “I do not know” (against 2 only from the t-viSNE group).

6.3 ICE-T Results
As described in Subsection 6.1, we complemented the data from the tasks themselves by using the
ICE-T methodology and questionnaire to gather and compare structured user feedback from both
groups. The scores obtained from all participants, for all ICE-T components, can be seen in
Table 2. Larger is better, with green indicating good results (as opposed to red). The raw data
is accompanied by two statistical analyses: the two-tailed 95% confidence intervals (CIs) per
component (t* = 2.16, N = 14); and the results of one-tailed Mann-Whitney U tests [60], also one
per component, with a significance level of 0.01 (U* = 47). We chose a non-parametric test due to
the small sample size and its robustness to non-normality in the data distribution.

A quick visual inspection of the two tables already hints at t-viSNE having higher scores than
GEP in all components, with all cells being green-colored (as opposed to GEP’s table, which
contains many red-colored cells). Indeed, the smallest score for t-viSNE was 4.75, while GEP got
many scores under 4 (or even under 3). Following the trend of the previously-presented results,
the Time component is the one with the most similar scores between the two tools. On the other
hand, the Confidence component had the largest difference, which suggests that participants were
significantly more confident in their results when using t-viSNE than with GEP. The observed
conclusions are confirmed when we compare the component-wise CIs for both groups (none of them
overlap) and the results of all component-wise Mann-Whitney U tests, with all U’s well below the
critical value of 47, showing that t-viSNE had significantly larger scores in all four ICE-T
components. These results, together with the tools’ supportiveness outcomes (discussed above in
Subsection 6.2), suggest that our tool provides a better level of support for the given tasks
than GEP, which answers RQ2.

TABLE 2: Results from the ICE-T feedback. t-viSNE obtained significantly larger scores than
Google’s Embedding Projector (GEP) in all components. (Cell colors, with a legend from 7 to 1,
are not reproduced.)

t-viSNE
Components       Insight       Time          Essence       Confidence    Average
Participant 7    6.63          6.60          7.00          7.00          6.81
Participant 9    6.88          6.80          7.00          6.50          6.79
Participant 6    6.88          7.00          6.75          6.50          6.78
Participant 5    6.63          6.00          6.00          6.67          6.32
Participant 12   6.25          6.20          6.50          6.25          6.30
Participant 8    6.63          5.80          6.75          6.00          6.29
Participant 13   6.25          6.40          6.25          6.25          6.29
Participant 11   6.63          6.00          6.50          6.00          6.28
Participant 14   6.88          6.60          5.75          5.75          6.24
Participant 4    5.71          6.00          6.50          5.75          5.99
Participant 10   6.13          5.20          6.50          5.50          5.83
Participant 3    6.00          5.80          6.00          5.50          5.83
Participant 1    6.13          6.20          4.75          5.50          5.64
Participant 2    5.63          5.40          5.50          5.25          5.44
95% C.I.         6.37 ± 0.24   6.14 ± 0.29   6.27 ± 0.36   6.03 ± 0.30   6.20 ± 0.24

GEP
Components       Insight       Time          Essence       Confidence    Average
Participant 24   6.00          5.80          5.75          6.33          5.97
Participant 17   6.00          6.00          5.67          5.33          5.75
Participant 21   5.83          6.40          6.25          3.75          5.56
Participant 15   5.00          5.40          6.00          5.33          5.43
Participant 26   6.13          5.60          5.50          4.25          5.37
Participant 25   5.50          5.40          5.75          4.75          5.35
Participant 23   6.13          5.40          4.50          4.75          5.19
Participant 22   5.50          5.40          3.25          4.75          4.73
Participant 18   4.75          5.80          4.25          3.75          4.64
Participant 19   4.75          5.20          4.75          3.67          4.59
Participant 20   4.88          4.80          4.00          4.25          4.48
Participant 16   4.50          4.60          3.75          3.67          4.13
Participant 27   5.00          4.20          3.75          2.00          3.74
Participant 28   3.88          5.00          3.25          2.25          3.59
95% C.I.         5.27 ± 0.40   5.36 ± 0.33   4.74 ± 0.61   4.20 ± 0.67   4.89 ± 0.43
U-value          15 (< 47)     29.5 (< 47)   18.5 (< 47)   12 (< 47)     7 (< 47)

7 DISCUSSION
In this section, we discuss different aspects of the design choices of our implementation,
elaborate on our experiences with developing t-viSNE, and lastly, we present limitations and
future work.

7.1 Design Choices
Shepard Heatmap vs. Shepard Diagram We propose the Shepard Heatmap, instead of simply adopting a
Shepard Diagram as usual in previous work, in order to make sure this view reaches its intended
goal: to be a quick and simple overview of the quality. A full scatterplot with (n^2 − n)/2
points and variable transparency would have done a similar job when it comes to avoiding clutter,
but that would mean (a) a lot of unnecessary details, such as outliers, would be visible and
might attract the user’s attention, and (b) t-viSNE would show several scatterplots at the same
time, which could be confusing for the user. During our design process, we realized that a
different abstraction, with less detail, was the superior choice based on the hypothesis that
grid-based binning can reduce cluttering and overlapping [67], while hiding some of the
less-prominent details. In Figure 10, we show two examples
of the results of this trade-off: for the smaller Iris data set, both diagrams seem to convey the
same patterns, but for the somewhat larger Breast Cancer Wisconsin data set (described in
Section 5), the patterns are more confusing while using a Shepard Diagram. We decided to
implement both approaches in our tool, as shown in Figure 1(c), so that the user may choose to
fall back to the more common scatterplot-based view if desired. Additionally, as for the bin
sizes of the heatmap, we decided to keep them constant (with 10 bins by default) in order to make
sure that every projection can be interpreted in a predictable way, without extra training or
parameter settings required from the users. The color scale of the heatmap adapts automatically
to the range of distances of the loaded data set, divided into 10 discrete sub-ranges.
Implementing both bin sizes (grid and color) as user-defined parameters would be a trivial
addition to the tool.

[Figure 10: image not reproduced. Rows: Iris Data Set, Breast Cancer Wisconsin Data Set. Columns:
Shepard Diagram, Shepard Heatmap.]
Fig. 10: Comparison: Shepard Heatmap vs. Shepard Diagram.

Different Colormaps There are quite a few different colormaps being used simultaneously in
t-viSNE: as a bare minimum, there is a categorical one for the labels in the overview (and the
PCP), a single-hue sequential one for the Shepard Heatmap, and a multi-hue sequential one for the
main view. We carefully chose these colormaps, considering Gestalt laws and recent research
results [68], in order to make sure they are efficient, do not interfere with each other, and
that it is as clear as possible that they represent different things.

Visual Abstraction for Neighborhood Preservation The Neighborhood Preservation plot
(Figure 1(g)) can be visualized as a bar chart (by default), a difference bar chart, a standard
line plot, or a difference line plot, as shown in Figure 11. Although they show basically the
same information, each one has advantages and disadvantages. On the one hand, we found the bar
chart (a) to be better when comparing the projection’s average with the selection’s average when
we search for discrete k-values, and during the initial state (no selection of points), where the
user can easily distinguish the bars having the same size. It can optionally be replaced by the
line plot (c), with similar effects; however, it can become confusing when there is very little
difference between the selection and the projection average, due to the overlap of the two lines.
The difference line plot (d), on the other hand, builds on the standard plot by highlighting the
differences between the selection and the global average, shown as positive and negative values
around the 0 value of the y-axis. It provides a clearer overall picture of the difference in
preservation among all the shown scales, but compromises the precision and simplicity of
interpretation of the y-axis (where the exact percentage of Neighborhood Preservation was
previously shown). The difference bar chart (b) is a combination of the designs (a) and (d).
Similar to (d), the interpretation of the y-values might be misleading. Lacking a clear winner in
this case, we opted to let the users decide.

[Figure 11: image not reproduced. Panels: (a) Bar chart (default option), y-axis “Preservation,
%”; (b) Difference bar chart, y-axis “+/- Preservation”; (c) Line plot, y-axis “Preservation, %”;
(d) Difference line plot, y-axis “+/- Preservation”.]
Fig. 11: Four options for the visualization of Neighborhood Preservation (using the Iris data
set).

Adaptive PCP vs. PCP Although it is not uncommon to find tools that use PCP views together with
DR-based scatterplots (e.g., iPCA [69]) with various schemes for re-ordering and prioritizing the
axes (e.g., [70], [71]), the arrangement and presentation of these PCPs are usually static in
order to reflect attributes of the data (or the projection) as a whole. In our proposed Adaptive
PCP, the arrangement of the axes is dynamically updated every time the user makes a new selection
(using a local PCA); this way, the PCP only shows, at any given time, the most relevant
dimensions for the user’s current focus, which may differ significantly from the global aspects
of the projection as a whole. Coupled with the Dimension Correlation view, this provides a
highly-customized toolset for inspecting and interpreting the meanings of specific neighborhoods
of data points.

To briefly present the benefits of using our technique, we employ the Single Photon Emission
Computed Tomography (SPECTF) data set [58] with 44 dimensions. In Figure 12, we can observe that
the standard PCP is cluttered, especially for the case without any selection. Thus, it is hard to
see why the normal class is actually separated from the abnormal one. Furthermore, the numerous
axis labels introduce even further cluttering and confusion for the users of the standard PCP.
Instead, our Adaptive PCP utilizes PCA as a degree-of-interest function and only displays the 8
most informative dimensions. It enables the analyst to discover that patients classified as
abnormal have less fluctuating measurements than the others, which becomes even more salient in
the selection case where the measurements for the normal class (in brown color) are rather stable
when patients are in both rest and stress conditions.

Labels In order to better explain the contribution of t-viSNE, the data sets used in our use
cases contain predefined labels, which is not the case in general when using unsupervised
learning techniques, such as t-SNE. There is no restriction, however, to
having labels when using t-viSNE; one might use the results of a clustering algorithm, for
example, as a replacement for pre-defined labels, or simply no labels at all. Apart from not
having any specific color mapping in the overview and the PCP, none of the other techniques are
affected by it.

[Figure 12: image not reproduced. Rows: PCP, Adaptive PCP. Columns: Global, Selection. # of
Instances: 267; # of Dimensions: 44. Classes: Abnormal, Normal. Dimensions: ROI#R = Region of
Interest (#Number) in Rest; ROI#S = Region of Interest (#Number) in Stress.]
Fig. 12: Adaptive PCP vs. PCP on the SPECTF data set. We demonstrate two cases: without selection
of points (left) and with selection of ten points all belonging to the normal class (right).

7.2 Limitations and Future Work
We implemented t-viSNE in JavaScript and WebGL, using a combination of D3.js [72], Three.js [73],
and Plotly.js [74] for the frontend. In the backend, it uses Laurens van der Maaten’s Barnes-Hut
t-SNE implementation written in Python and C++ [52], and Projlib [75] for the quality measures.
The use cases and experiments were performed on a MacBook Pro 2018 with a 2.8 GHz Intel Core i7
CPU, a Radeon Pro 555 2048 MB GPU, 16 GB of RAM, and running macOS Mojave.

Performance There are two reasons why we decided to use the Barnes-Hut implementation of the
original t-SNE algorithm [52],

example of an inherent characteristic of t-SNE, since it comes directly from its algorithm. A
limitation that arises from building a tool that is tuned to tackle problems concerning a
particular algorithm is the possibility of the algorithm becoming obsolete or being replaced by a
newer, better alternative. We argue, though, that more than a decade after its proposal, it has
now become quite clear that t-SNE is not going away anytime soon. Papers are still regularly
coming out proving its stability [76], [77], [78], and high-impact applications and publications
in many different domains geared towards non-visualization and non-ML experts are based on it
[79], [80]. Even in the improbable scenario that t-SNE becomes obsolete soon, the fact that most
of our proposed views can be re-used or adapted to different DR methods means that our work is
still relevant and largely future-proof.

User Study The goals of the comparative study presented in this paper were to provide initial
evidence of the acceptance of t-viSNE by analysts, the consistency of their results when
exploring a t-SNE projection using our tool, and the improvement over another state-of-the-art
tool. The tasks of the study were designed to test how each tool helps the analyst in overcoming
the six pitfalls defined by Wattenberg et al. [14], which was also one of the design goals of
t-viSNE itself. Since that might not have been the case for GEP, this could be seen as a bias
towards t-viSNE. Nevertheless, while it may not reflect reality in the same way as, e.g., a
large-scale field study performed with real-world experts in their actual working environment
[81], the positive results from the study showed that our approach is promising and deserves to
be developed and tested further, which will be done in future work.

Progressive Quality Analysis The remaining costs are one aspect of estimating the projection
quality. This means that projected points with high remaining costs can be moved by an additional
optimization step. Akin to this idea, t-viSNE might show a preview of the data points in the next
optimization step. In consequence, users could determine whether the t-SNE optimization is
completed or not, simply by observing the points’
instead of a newer and faster implementation [53], [54]. First, each trajectories in low-dimensional space. This remains as possible
fast and approximated implementation of t-SNE introduces its own future work.
variations to the algorithm, and we did not want these variations to
influence the design of our tool or introduce unnecessary bias in 8 C ONCLUSIONS
the results of our study. Second, in this phase of the research, In this paper, we introduced t-viSNE, an interactive tool for the
we were mainly concerned with designing and validating the visual investigation of t-SNE projections. By partly opening the
system with the right set of views and the right analysis workflow, black box of the t-SNE algorithm, we managed to give power
so we decided to prioritize the ease of implementation over the to users allowing them to test the quality of the projections and
raw performance. Replacing the actual implementation of t-SNE understand the rationale behind the choices of the algorithm when
should be straightforward, if deemed necessary. forming clusters. Additionally, we brought into light the usually
Other DR Methods Although our main design goal was to lost information from the inner parts of the algorithm such as
support the investigation of t-SNE projections, most of our views densities of points and highlighted areas which are not well-
and interaction techniques are not strictly confined to the t-SNE optimized according to t-SNE. To confirm the effectiveness of t-
algorithm. For example, the Dimension Correlation view could, viSNE, we presented a hypothetical usage scenario and a use case
in theory, be applied to any projection generated by any other with real-world data sets. We also evaluated our approach with a
algorithm. Its motivation, however, came from the fact that t- user study by comparing it with Google’s Embedding Projector
SNE is especially known to generate hard-to-interpret shapes in its (GEP): the results show that, in general, the participants could
output [14], so the necessity of exploring and investigating such manage to reach the intended analysis tasks even with limited
shapes became more apparent than with other DR methods. The training, and their feedback indicates that t-viSNE reached a better
same goes for other views, such as Neighborhood Preservation level of support for the given tasks than GEP. However, both tools
or Adaptive PCP: the inspiration and the design constraints came were similar with respect to completion time.
from known shortcomings and characteristics of t-SNE, such as
its focus on optimizing neighborhoods of points in detriment ACKNOWLEDGMENTS
of global distances, but the implementation could be re-used The authors are thankful to Margit Pohl, Vienna University of
in different scenarios. The analysis of density, however, is one Technology, for her suggestions to improve the evaluation section.

REFERENCES

[1] I. T. Jolliffe and J. Cadima, “Principal Component Analysis: A Review and Recent Developments,” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol. 374, no. 2065, pp. 1–16, 2016.
[2] I. Borg and P. J. Groenen, Modern Multidimensional Scaling: Theory and Applications. Springer Series in Statistics, 2005.
[3] J. A. Lee and M. Verleysen, Nonlinear Dimensionality Reduction. Information Science and Statistics, 2007.
[4] J. W. Sammon, “A Nonlinear Mapping for Data Structure Analysis,” IEEE Transactions on Computers, vol. C-18, no. 5, pp. 401–409, 1969.
[5] J. B. Tenenbaum, V. De Silva, and J. C. Langford, “A Global Geometric Framework for Nonlinear Dimensionality Reduction,” Science, vol. 290, no. 5500, pp. 2319–2323, 2000.
[6] S. T. Roweis and L. K. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, 2000.
[7] P. Joia, D. Coimbra, J. A. Cuminato, F. V. Paulovich, and L. G. Nonato, “Local Affine Multidimensional Projection,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2563–2571, 2011.
[8] L. van der Maaten, E. Postma, and J. van den Herik, “Dimensionality Reduction: A Comparative Review,” Journal of Machine Learning Research, vol. 10, pp. 66–71, 2009.
[9] M. Espadoto, R. M. Martins, A. Kerren, N. S. T. Hirata, and A. C. Telea, “Towards a Quantitative Survey of Dimension Reduction Techniques,” IEEE Transactions on Visualization and Computer Graphics, 2019.
[10] L. van der Maaten and G. Hinton, “Visualizing Data Using t-SNE,” Journal of Machine Learning Research, vol. 9, pp. 2579–2605, 2008.
[11] T. Höllt, N. Pezzotti, V. van Unen, F. Koning, E. Eisemann, B. Lelieveldt, and A. Vilanova, “Cytosplore: Interactive Immune Cell Phenotyping for Large Single-Cell Datasets,” Computer Graphics Forum, vol. 35, no. 3, pp. 171–180, 2016.
[12] M. Johnson, M. Schuster, Q. V. Le, M. Krikun, Y. Wu, Z. Chen, N. Thorat, F. Viégas, M. Wattenberg, G. Corrado, M. Hughes, and J. Dean, “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation,” Transactions of the Association for Computational Linguistics, vol. 5, pp. 339–351, 2017.
[13] E. D. Amir, K. L. Davis, M. D. Tadmor, E. F. Simonds, J. H. Levine, S. C. Bendall, D. K. Shenfeld, S. Krishnaswamy, G. P. Nolan, and D. Pe’er, “viSNE Enables Visualization of High Dimensional Single-Cell Data and Reveals Phenotypic Heterogeneity of Leukemia,” Nature Biotechnology, vol. 31, no. 6, pp. 545–552, 2013.
[14] M. Wattenberg, F. Viégas, and I. Johnson, “How to Use t-SNE Effectively,” Distill, 2016. [Online]. Available: http://distill.pub/2016/misread-tsne
[15] D. Sacha, L. Zhang, M. Sedlmair, J. A. Lee, J. Peltonen, D. Weiskopf, S. C. North, and D. A. Keim, “Visual Interaction with Dimensionality Reduction: A Structured Literature Analysis,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 241–250, 2017.
[16] L. G. Nonato and M. Aupetit, “Multidimensional Projection for Visual Analytics: Linking Techniques with Distortions, Tasks, and Layout Enrichment,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 8, pp. 2650–2673, 2019.
[17] A. Chatzimparmpas, R. M. Martins, and A. Kerren, “t-viSNE: A Visual Inspector for the Exploration of t-SNE,” in Poster Abstracts, IEEE Information Visualization (VIS ’18), 2018.
[18] T. Schreck, T. von Landesberger, and S. Bremm, “Techniques for Precision-Based Visual Analysis of Projected Data,” Information Visualization, vol. 9, no. 3, pp. 181–193, 2010.
[19] E. Sherkat, S. Nourashrafeddin, E. E. Milios, and R. Minghim, “Interactive Document Clustering Revisited: A Visual Analytics Approach,” in Proceedings of the 23rd International Conference on Intelligent User Interfaces, ser. IUI ’18. ACM, 2018, pp. 281–292.
[20] A. Endert, P. Fiaux, and C. North, “Semantic Interaction for Visual Text Analytics,” in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ser. CHI ’12. ACM, 2012, pp. 473–482.
[21] “t-viSNE Code,” 2020, accessed April 04, 2020. [Online]. Available: http://bit.ly/t-visne-code
[22] R. Cutura, S. Holzer, M. Aupetit, and M. Sedlmair, “VisCoDeR: A Tool for Visually Comparing Dimensionality Reduction Algorithms,” in Proceedings of the European Symposium on Artificial Neural Networks (ESANN ’18). i6doc.com publication, 2018, pp. 105–110.
[23] M. Cavallo and Ç. Demiralp, “Clustrophile 2: Guided Visual Clustering Analysis,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 267–276, 2019.
[24] J. Venna and S. Kaski, “Visualizing Gene Interaction Graphs with Local Multidimensional Scaling,” in Proceedings of the European Symposium on Artificial Neural Networks (ESANN ’06), 2006, pp. 557–562.
[25] M. Sips, B. Neubert, J. Lewis, and P. Hanrahan, “Selecting Good Views of High-Dimensional Data Using Class Consistency,” Computer Graphics Forum, vol. 28, no. 3, pp. 831–838, 2009.
[26] M. M. Abbas, M. Aupetit, M. Sedlmair, and H. Bensmail, “ClustMe: A Visual Quality Measure for Ranking Monochrome Scatterplots based on Cluster Patterns,” Computer Graphics Forum, vol. 38, no. 3, pp. 225–236, 2019.
[27] R. M. Martins, D. Coimbra, R. Minghim, and A. C. Telea, “Visual Analysis of Dimensionality Reduction Quality for Parameterized Projections,” Computers & Graphics, vol. 41, pp. 26–42, 2014.
[28] B. Mokbel, W. Lueks, A. Gisbrecht, and B. Hammer, “Visualizing the Quality of Dimensionality Reduction,” Neurocomputing, vol. 112, pp. 109–123, 2013, Advances in Artificial Neural Networks, Machine Learning, and Computational Intelligence.
[29] C. Seifert, V. Sabol, and W. Kienreich, “Stress Maps: Analysing Local Phenomena in Dimensionality Reduction Based Visualisations,” in Proceedings of the International Symposium on Visual Analytics Science and Technology (EuroVAST ’10). The Eurographics Association, 2010.
[30] J. A. Lee and M. Verleysen, “Quality Assessment of Dimensionality Reduction: Rank-Based Criteria,” Neurocomputing, vol. 72, no. 7, pp. 1431–1443, 2009, Advances in Machine Learning and Computational Intelligence.
[31] S. Lespinats and M. Aupetit, “CheckViz: Sanity Check and Topological Clues for Linear and Non-Linear Mappings,” Computer Graphics Forum, vol. 30, no. 1, pp. 113–125, 2011.
[32] M. Aupetit, “Visualizing Distortions and Recovering Topology in Continuous Projection Techniques,” Neurocomputing, vol. 70, no. 7–9, pp. 1304–1330, 2007.
[33] R. M. Martins, R. Minghim, and A. C. Telea, “Explaining Neighborhood Preservation for Multidimensional Projections,” in Proceedings of the Computer Graphics & Visual Computing (CGVC ’15). Eurographics, 2015, pp. 121–128.
[34] N. Heulot, M. Aupetit, and J.-D. Fekete, “ProxiLens: Interactive Exploration of High-Dimensional Data Using Projections,” in Proceedings of the EuroVis Workshop on Visual Analytics using Multidimensional Projections. The Eurographics Association, 2013.
[35] S. Liu, B. Wang, P.-T. Bremer, and V. Pascucci, “Distortion-Guided Structure-Driven Interactive Exploration of High-Dimensional Data,” Computer Graphics Forum, vol. 33, no. 3, pp. 101–110, 2014.
[36] J. Stahnke, M. Dörk, B. Müller, and A. Thom, “Probing Projections: Interaction Techniques for Interpreting Arrangements and Errors of Dimensionality Reductions,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 629–638, 2016.
[37] S. J. Fernstad, J. Shaw, and J. Johansson, “Quality-Based Guidance for Exploratory Dimensionality Reduction,” Information Visualization, vol. 12, no. 1, pp. 44–64, 2013.
[38] R. da Silva, P. Rauber, R. M. Martins, R. Minghim, and A. C. Telea, “Attribute-Based Visual Explanation of Multidimensional Projections,” in Proceedings of the EuroVis Workshop on Visual Analytics (EuroVA ’15), 2015, pp. 31–35.
[39] E. Kandogan, “Just-in-Time Annotation of Clusters, Outliers, and Trends in Point-Based Data Visualizations,” in Proceedings of the IEEE Conference on Visual Analytics Science and Technology (VAST ’12). IEEE, 2012, pp. 73–82.
[40] Y. Chen, S. Barlowe, and J. Yang, “Click2Annotate: Automated Insight Externalization with Rich Semantics,” in Proceedings of the IEEE Symposium on Visual Analytics Science and Technology (VAST ’10). IEEE, 2010, pp. 155–162.
[41] L. Tan, Y. Song, S. Liu, and L. Xie, “ImageHive: Interactive Content-Aware Image Summarization,” IEEE Computer Graphics and Applications, vol. 32, no. 1, pp. 46–55, 2012.
[42] D. B. Coimbra, R. M. Martins, T. T. Neves, A. C. Telea, and F. V. Paulovich, “Explaining Three-Dimensional Dimensionality Reduction Plots,” Information Visualization, vol. 15, no. 2, pp. 154–172, 2016.
[43] I. Borg and P. Groenen, “Modern Multidimensional Scaling: Theory and Applications,” Journal of Educational Measurement, vol. 40, no. 3, pp. 277–280, 2003.
[44] T. Fujiwara, O. Kwon, and K. Ma, “Supporting Analysis of Dimensionality Reduction Results with Contrastive Learning,” IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 1, pp. 45–55, 2020.
[45] R. Faust, D. Glickenstein, and C. Scheidegger, “DimReader: Axis Lines that Explain Non-Linear Projections,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 481–490, 2019.
[46] M. Cavallo and Ç. Demiralp, “A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration,” in Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, ser. CHI EA ’18. ACM, 2018, pp. D112:1–D112:4.
[47] B. C. Kwon, H. Kim, E. Wall, J. Choo, H. Park, and A. Endert, “AxiSketcher: Interactive Nonlinear Axis Mapping of Visualizations through User Drawings,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 1, pp. 221–230, 2017.
[48] H. Kim, J. Choo, H. Park, and A. Endert, “InterAxis: Steering Scatterplot Axes via Observation-Level Interaction,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 131–140, 2016.
[49] M. Dowling, J. Wenskovitch, J. T. Fry, S. Leman, L. House, and C. North, “SIRIUS: Dual, Symmetric, Interactive Dimension Reductions,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 172–182, 2019.
[50] C. Lai, Y. Zhao, and X. Yuan, “Exploring High-Dimensional Data Through Locally Enhanced Projections,” Journal of Visual Languages & Computing, vol. 48, pp. 144–156, 2018.
[51] B. C. Kwon, B. Eysenbach, J. Verma, K. Ng, C. De Filippi, W. F. Stewart, and A. Perer, “Clustervision: Visual Supervision of Unsupervised Clustering,” IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 142–151, 2018.
[52] L. van der Maaten, “Accelerating t-SNE Using Tree-Based Algorithms,” Journal of Machine Learning Research, vol. 15, no. 1, pp. 3221–3245, 2014.
[53] N. Pezzotti, B. P. F. Lelieveldt, L. v. d. Maaten, T. Höllt, E. Eisemann, and A. Vilanova, “Approximated and User Steerable tSNE for Progressive Visual Analytics,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 7, pp. 1739–1752, 2017.
[54] D. M. Chan, R. Rao, F. Huang, and J. F. Canny, “t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data,” in Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). IEEE, 2018, pp. 330–338.
[55] L. Kaufman and P. Rousseeuw, “Clustering by Means of Medoids,” Faculty of Mathematics and Informatics, Delft University of Technology, the Netherlands, Tech. Rep., 1987.
[56] N. Duta, Procrustes Shape Distance. Springer US, 2015, pp. 1278–1279.
[57] J. D. Leeuw and P. Mair, “Shepard Diagram,” in Wiley StatsRef: Statistics Reference Online. American Cancer Society, 2015, pp. 1–3.
[58] D. Dua and C. Graff, “UCI Machine Learning Repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[59] A. Inselberg and B. Dimsdale, “Parallel Coordinates: A Tool for Visualizing Multi-Dimensional Geometry,” in Proceedings of the 1st Conference on Visualization (Vis ’90). IEEE, 1990, pp. 361–378.
[60] G. W. Corder and D. I. Foreman, Nonparametric Statistics: A Step-by-Step Approach. John Wiley & Sons, 2014.
[61] Y. Ming, H. Qu, and E. Bertini, “RuleMatrix: Visualizing and Understanding Classifiers with Rules,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 342–352, 2019.
[62] J. Smith, J. Everhart, W. Dickson, W. Knowler, and R. Johannes, “Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus,” in Proceedings of the Annual Symposium Computer Application in Medical Care. American Medical Informatics Association, 1988, pp. 261–265.
[63] D. Smilkov, N. Thorat, C. Nicholson, E. Reif, F. B. Viégas, and M. Wattenberg, “Embedding Projector: Interactive Visualization and Interpretation of Embeddings,” in Proceedings of the NIPS 2016 Workshop on Interpretable Machine Learning for Complex Systems, 2016.
[64] R. J. Heuer, Analysis of Competing Hypotheses. Psychology of Intelligence Analysis, 1999.
[65] E. Wall, M. Agnihotri, L. Matzen, K. Divis, M. Haass, A. Endert, and J. Stasko, “A Heuristic Approach to Value-Driven Evaluation of Visualizations,” IEEE Transactions on Visualization and Computer Graphics, vol. 25, no. 1, pp. 491–500, 2019.
[66] L. R. Borges, “Analysis of the Wisconsin Breast Cancer Dataset and Machine Learning for Breast Cancer Detection,” in Proceedings of the XI Workshop on Computational Vision (WVC), 2015.
[67] S. M. Longshaw, M. J. Turner, and W. T. Hewitt, “Interactive Grid Based Binning for Information Visualization,” in Theory and Practice of Computer Graphics, I. S. Lim and W. Tang, Eds. The Eurographics Association, 2008.
[68] Y. Liu and J. Heer, “Somewhere over the Rainbow: An Empirical Assessment of Quantitative Colormaps,” in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, ser. CHI ’18. ACM, 2018, pp. 598:1–598:12.
[69] D. H. Jeong, C. Ziemkiewicz, B. Fisher, W. Ribarsky, and R. Chang, “iPCA: An Interactive System for PCA-Based Visual Analytics,” Computer Graphics Forum, vol. 28, no. 3, pp. 767–774, 2009.
[70] M. Ankerst, S. Berchtold, and D. A. Keim, “Similarity Clustering of Dimensions for an Enhanced Visualization of Multidimensional Data,” in Proceedings of the IEEE Symposium on Information Visualization, 1998, pp. 52–60.
[71] L. F. Lu, M. L. Huang, and J. Zhang, “Two Axes Re-Ordering Methods in Parallel Coordinates Plots,” Journal of Visual Languages & Computing, vol. 33, pp. 3–12, 2016.
[72] “D3 — Data-Driven Documents,” 2011, accessed April 04, 2020. [Online]. Available: https://d3js.org/
[73] “Three.js — JavaScript 3D Library,” 2010, accessed April 04, 2020. [Online]. Available: https://threejs.org
[74] “Plotly — JavaScript Open Source Graphing Library,” 2010, accessed April 04, 2020. [Online]. Available: https://plot.ly
[75] “Projlib – A python library to support research on multidimensional projections,” 2020. [Online]. Available: https://github.com/rafaelmessias/projlib
[76] C. de Bodt, D. Mulders, M. Verleysen, and J. A. Lee, “Perplexity-Free t-SNE and Twice Student tt-SNE,” in Proceedings of the European Symposium on Artificial Neural Networks (ESANN ’18), 2018.
[77] C. de Bodt, D. Mulders, M. Verleysen, and J. A. Lee, “Extensive Assessment of Barnes-Hut t-SNE,” in Proceedings of the European Symposium on Artificial Neural Networks (ESANN ’18), 2018.
[78] G. C. Linderman and S. Steinerberger, “Clustering with t-SNE, Provably,” SIAM Journal on Mathematics of Data Science, vol. 1, no. 2, pp. 313–332, 2019.
[79] V. van Unen, T. Höllt, N. Pezzotti, N. Li, M. J. Reinders, E. Eisemann, F. Koning, A. Vilanova, and B. P. Lelieveldt, “Visual Analysis of Mass Cytometry Data by Hierarchical Stochastic Neighbour Embedding Reveals Rare Cell Types,” Nature Communications, vol. 8, no. 1, p. 1740, 2017.
[80] G. C. Linderman, M. Rachh, J. G. Hoskins, S. Steinerberger, and Y. Kluger, “Fast Interpolation-Based t-SNE for Improved Visualization of Single-Cell RNA-Seq Data,” Nature Methods, vol. 16, no. 3, p. 243, 2019.
[81] S. Carpendale, “Evaluating Information Visualizations,” in Information Visualization: Human-Centered Issues and Perspectives. Springer Berlin Heidelberg, 2008, pp. 19–45.

Angelos Chatzimparmpas is a PhD student within the ISOVIS research group and the Linnaeus University Centre for Data Intensive Sciences and Applications at the Department of Computer Science and Media Technology, Linnaeus University, Sweden. His main research interests include visual exploration of the inner parts and the quality of machine learning models with a specific focus on engineering smarter cyber-physical systems, as well as visual analytics approaches involving such models.

Rafael M. Martins is a Senior Lecturer at the Department of Computer Science and Media Technology at Linnaeus University, Sweden. His PhD research involved mainly the visual exploration of the quality of dimensionality reduction (DR) techniques, a topic he continues to investigate, in addition to other related research areas such as the interpretation of DR layouts and the application of DR techniques in different domains including software engineering and digital humanities.

Andreas Kerren is a Professor of Computer Science at the Department of Computer Science and Media Technology at Linnaeus University, Sweden, and head of the ISOVIS research group. He is also a key researcher at the Linnaeus University Centre for Data Intensive Sciences and Applications, contributing his expertise in information visualization and visual analytics. His research mainly focuses on the explorative analysis and visualization of typically large and complex information spaces, for example in the humanities or the life sciences.
