Syllabus - Introduction To Data Visualization - Summer 2021
Richard Traunmüller
Video lecture by Richard Traunmüller
Data visualization is one of the most powerful tools to explore, understand and communicate patterns in
quantitative information. At the same time, good data visualization is a surprisingly difficult task and
demands three quite different skills: substantive knowledge, statistical skill, and artistic sense. The course
is intended to introduce participants to key principles of analytic design and useful visualization techniques
for the exploration and presentation of univariate and multivariate data. This course is highly applied in
nature and emphasizes the practical aspects of data visualization in the social sciences. Students will
learn how to evaluate data visualizations based on principles of analytic design, how to construct
compelling visualizations using the free statistics software R, and how to explore and present their data
with visual methods.
Course Objectives
Prerequisites
Course prerequisites are a basic understanding of statistics and bivariate linear regression. Some
experience with a statistical software package would help, preferably with basic data management
tasks in R (loading and merging data, generating and recoding variables, etc.). I will provide detailed code
examples to get students up to speed.
This is an online course using a flipped classroom design. It covers the same material and content as an
on-site course but runs differently. In this course, you are responsible for watching video-recorded lectures
and reading the required literature for each unit prior to participating in mandatory weekly one-hour online
meetings where students have the chance to discuss the materials from a unit with the instructor.
Although this is an online course where students have more freedom in when they engage with the course
materials, students are expected to spend the same amount of time overall on all activities in the course
– including preparatory activities (readings, studying), in-class activities (watching prerecorded videos,
attending the live online meetings), and follow-up activities (working on assignments and exams) – as in
an on-site course. As a rule of thumb, you can expect to spend approximately 3 hours per week on in-class activities
and 9 hours per week on out-of-class activities (preparing for class, readings, assignments, projects,
studying for quizzes and exams). Therefore, the workload in all courses will be approximately 12 hours per week.
Please note that the actual workload will depend on your prior knowledge.
In preparation for the weekly online meetings, students are expected to watch the lecture videos and read
the assigned literature before the start of the meeting. In addition, students are encouraged to post
questions about the materials covered in the videos and readings of the week in the forum before the
meetings (deadline for posting questions is Monday, 12:00 EDT/18:00 CEST).
Students have the opportunity to use the Conferences feature in Canvas to connect with peers outside
the scheduled weekly online meetings (e.g., for study groups). Students are not required to use Canvas
Conferences and can of course use other online meeting platforms such as Google Hangouts, Skype or
Microsoft Teams.
Grading
A+ 100 - 97
A 96 - 93
A- 92 - 90
B+ 89 - 87
B 86 - 83
B- 82 - 80
Etc.
The grading scale is a base scale recommended by the MDM. Variations for grading on a scale are at the
discretion of the instructor.
The final grade will be communicated under the assignment "Final Grade" in the Canvas course. Please
note that the letter grade written in parentheses in Canvas is the correct final grade. The point-grade
displayed alongside the letter grade is irrelevant and can be ignored. Assignment due dates are
indicated in the syllabus. Extensions will be granted sparingly and are at the instructor's discretion.
The learning experience in this course will mainly rely on the online interaction between the students and
the instructors during the weekly online meetings. Therefore, we encourage all students in this course to
use a web camera and a headset. Decent quality headsets and webcams are available for less than $20
each. We ask students to refrain from using built-in webcams and speakers on their desktops or laptops.
We know from our experience in previous online courses that these reduce the quality of video and
audio transmission and therefore diminish the overall learning experience for all students in the
course. In addition, we suggest that students use a wired connection (LAN), if available, when connecting
to the online meetings. Wireless connections (WLAN) are usually less stable and may drop.
Mannheim Business School would also like to officially inform you that, in order to facilitate your
participation in this course, your personal data will be processed by and on systems run by MBS and our
subcontractors. Detailed information is available in our privacy policy and the information for data
subjects.
This course is intended to provide students with a thorough introduction to the best practice of modern
data visualization from a social science perspective.
The course is highly applied in nature and emphasizes the practical aspects of data visualization in the
social sciences. To illustrate the concepts and methods, examples and data from the social sciences will
be used throughout the course. Alongside conceptual discussions, the course will spend some time on how
to produce data visualizations using the free statistical programming language R. Course participants will
get hands-on advice on producing modern visualizations for their own practical problems.
After an introduction to data visualization showing new and classic examples, the course will discuss data
visualization as a methodology for social science data analysis and exploration rather than simply turning
data into visual objects. Data visualization is about solving problems with data, where visualization is the
means to an overarching goal. The course distinguishes between high-level goals (exploration vs.
presentation) and low-level goals (making specific comparisons and revealing specific patterns).
Understanding data visualization as a methodology also implies that, instead of focusing on single
graphics and formats, one should think about how they are used in the larger context of a data analysis.
This immediately leads to considerations of how to use and combine multiple graphs either of different
subsets of the data or different formats of the same subset.
Next, the course will introduce the fundamentals of graphical perception – how humans see and
process visual stimuli. We will take a closer look at how to best achieve the low-level goals of making
specific comparisons and finding specific patterns in the data. Certain graphical formats are generally
superior to others. Understanding the workings of graphical perception suggests specific design principles
that improve the detectability of patterns in the data and decrease the cognitive load in processing them.
We will apply this knowledge in a discussion of the relative merits of familiar formats such as bar, dot, and
line charts.
Comparisons are the heart of any analysis of quantitative data. The course will give an overview of the
graphical formats and visual techniques that optimally support some of the most fundamental data analytic
tasks: comparing before and after, comparing subgroups, comparing to a standard, comparing to a larger
context, etc. Students will get to know the slope graph, sparklines, and the bullet graph as lesser-known
but highly effective formats for making visual comparisons. Importantly, this session will present one of
the most powerful methods of data visualization: the small multiple design. We will stress the importance
of arrangement, sorting, and visual reference elements in enabling effective comparisons.
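The role of arrangement, sorting, and shared reference elements in a small multiple design can be illustrated with a minimal sketch. It is written in Python purely for illustration (the course itself works in R), and the function name `small_multiple_layout` and the regional data are hypothetical. The sketch groups observations into one panel per subgroup, orders the panels by their mean value, and computes a single y-axis range that every panel would share so that panels remain directly comparable.

```python
from collections import defaultdict
from statistics import mean


def small_multiple_layout(records):
    """Group (group, x, y) records into panels, order panels by mean y
    (largest first), and compute one y-range shared by all panels."""
    panels = defaultdict(list)
    for group, x, y in records:
        panels[group].append((x, y))

    # A shared axis range acts as a common visual reference across panels.
    all_y = [y for _, _, y in records]
    shared_ylim = (min(all_y), max(all_y))

    # Sorting panels by a summary statistic makes the ranking visible at a glance.
    ordered = sorted(
        panels,
        key=lambda g: mean(y for _, y in panels[g]),
        reverse=True,
    )
    return ordered, shared_ylim, dict(panels)


# Hypothetical data: (region, year, value)
records = [
    ("North", 2019, 40), ("North", 2020, 44),
    ("South", 2019, 55), ("South", 2020, 61),
    ("West", 2019, 30), ("West", 2020, 29),
]

ordered, shared_ylim, panels = small_multiple_layout(records)
print(ordered)
print(shared_ylim)
```

In a plotting library, each entry of `ordered` would become one panel drawn with the same `shared_ylim`; letting each panel choose its own axis range would hide differences in level between subgroups.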
Social science data analysis is fundamentally about relations between variables. This includes how a
variable changes over time as a special case. The course discusses scatterplot variants for the effective
display of bivariate relationships. Considerable time will be spent on the problem of overplotting, how to
deal with it (e.g., through the use of jittering or alpha blending), and how to enhance scatterplots with
additional plot elements. One important enhancement that will receive detailed treatment is the addition
of both parametric and non-parametric scatterplot smoothers that reveal general trends in data. We will
close with a discussion of how the aspect ratio of a graphic affects the perceived strength of a correlation
or time trend.
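The two remedies for overplotting named above can be sketched numerically. The example is a minimal illustration in Python (the course itself works in R); the function names `jitter` and `stacked_opacity` and the survey responses are hypothetical. Jittering adds a small amount of random noise to the plotted positions so that coinciding points separate. Alpha blending draws each point semi-transparently, so that, assuming standard "over" compositing of identically colored points, n exactly overlapping points with transparency alpha produce an opacity of 1 - (1 - alpha)^n, and denser regions appear darker.

```python
import random


def jitter(values, width=0.2, seed=0):
    """Return a copy of values with uniform noise in [-width, +width] added.
    The noise is for display only; analyses should use the original values."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [v + rng.uniform(-width, width) for v in values]


def stacked_opacity(alpha, n):
    """Opacity where n points with the same alpha overlap exactly,
    assuming standard 'over' compositing: 1 - (1 - alpha)^n."""
    return 1.0 - (1.0 - alpha) ** n


# Hypothetical survey item on a 1-5 scale: many respondents share each value.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5]
jittered = jitter(responses)

for n in (1, 3, 10):
    print(n, round(stacked_opacity(0.2, n), 3))
```

The jitter width is a design choice: it should be large enough to separate coinciding points but small enough that neighboring categories do not blend into each other.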
Readings
Mandatory Readings
Few, Stephen (2012). Show Me the Numbers: Designing Tables and Graphs to Enlighten (Second
Edition). Analytics Press.
Complementary Readings
Will be provided on the course web page.
Academic Conduct
Clear definitions of the forms of academic misconduct, including cheating and plagiarism, as well as
information about disciplinary sanctions for academic misconduct may be found at
https://www.president.umd.edu/sites/president.umd.edu/files/documents/policies/III-100A.pdf (University
of Maryland) and
Knowledge of these rules is the responsibility of the student and ignorance of them does not excuse
misconduct. The student is expected to be familiar with these guidelines before submitting any written
work or taking any exams in this course. Lack of familiarity with these rules in no way constitutes an
excuse for acts of misconduct. Charges of plagiarism and other forms of academic misconduct will be
dealt with very seriously and may result in oral or written reprimands, a lower or failing grade on the
assignment, a lower or failing grade for the course, suspension, and/or, in some cases, expulsion from
the university.
In order to receive services, students at the University of Maryland must contact the Accessibility &
Disability Service (ADS) office to register in person for services. Please call the office to set up an
appointment to register with an ADS counselor. Contact the ADS office at 301.314.7682;
https://www.counseling.umd.edu/ads/.
Students at the Mannheim Business School should contact the Commissioner and Counsellor for
Disabled Students and Students with Chronic Illnesses at
http://www.uni-mannheim.de/studienbueros/english/counselling/disabled_persons_and_persons_with_chronic_illnesses/
Course Evaluation
In an effort to improve the learning experience for students in our online courses, students will be invited
to participate in an online course evaluation at the end of the course. Participation is entirely voluntary
and highly appreciated.
Sessions
Lab assignment: due Wednesday, June 23, 2021, 12:00 EDT/18:00 CEST
Required Readings:
• Few, S. (2012). Introduction. (Ch. 1: 1-13)
• Few, S. (2012). Differing Roles of Tables and Graphs. (Ch. 3: 39-51)
Recommended Readings:
• Tufte, E. (2006). Principles of Analytic Design. In: Tufte, E.: Beautiful Evidence. Graphics Press.
(Ch. 5).
• Gelman, A. & Unwin, A. (2013). Infovis and Statistical Graphs: Different Goals, Different Looks
(with Discussion). Journal of Computational and Graphical Statistics 22: 2-28.
Lab assignment: due Wednesday, June 30, 2021, 12:00 EDT/18:00 CEST
Required Readings:
• Few, S. (2012). Visual Perception and Graphical Communication. (Ch. 5: 61-86)
• Few, S. (2012). Fundamental Variations of Graphs. (Ch. 6: 87-135)
• Few, S. (2012). General Design for Communication. (Ch. 7: 141-154)
Recommended Readings:
• Few, S. (2012). Silly Graphs that are Best Forsaken. (Ch. 12: 271-285)
• Heer, J. & Bostock, M. (2010). Crowdsourcing Human Perception: Using Mechanical Turk to
Assess Visualization Design. ACM Human Factors in Computing Systems (CHI) 2010.
Week 3: Making Visual Comparisons
• Exploring and making visual comparisons using the football data set and R
Required Readings:
• Few, S. (2012). General Graph Design. (Ch. 9: 191-203)
• Few, S. (2012). Component Level Graph Design. (Ch. 10: 205-255)
• Few, S. (2012). Displaying Many Variables at Once. (Ch. 11: 257-270)
Recommended Readings:
• Tufte, E. (2001). Data Density and Small Multiples. In: Tufte, E.: The Visual Display of Quantitative
Information. Graphics Press. Ch. 8.
Lab assignment: due Wednesday, July 14, 2021, 12:00 EDT/18:00 CEST
Required Readings:
• Few, S. (2009). Correlation Analysis. In: Few, S.: Now You See It. Analytics Press.
(Ch. 11: 245-279).
Recommended Readings:
• Jacoby, W. G. (2000). Loess: a nonparametric tool for depicting relationships between variables.
Electoral Studies 19: 577-613.
Project/Homework/Final exam
Final visualization project due: Tuesday, July 20, 2021, 12:00 EDT/18:00 CEST.