
CSSS 569 · Visualizing Data and Models

Winter Quarter 2025


University of Washington

Christopher Adolph · Professor · Political Science and CSSS

Class Meets: Mondays & Wednesdays 4:30–5:50 PM, Smith Hall 205

Office: All office hours held remotely via Zoom, cadolph@uw.edu

Section Meets: Fridays 3:30–4:45 PM, all sections taught via Zoom

Teaching Assistant: Ramses Llobet, rllobet@uw.edu

Overview. Visual displays are an integral part of most social science presentations and
can make or break a paper. Good visuals help researchers uncover patterns and relation-
ships they would otherwise miss. Ever more sophisticated statistical models cry out for
clear, easy-to-understand visual representations of model findings. Yet social scientists
seldom put as much care into designing visual displays as they devote to crafting effec-
tive prose. This course takes the design of graphics and tables seriously and explores
a variety of visual techniques for investigating patterns in data, summarizing statis-
tical results, and efficiently representing the robustness of such results to alternative
modeling assumptions. Emphasis is placed on the principles of effective visualization,
examples from the social sciences, novel visual displays, and the implementation of
recommended techniques using the R statistical environment and the R packages tile,
simcf, and ggplot2.

Prerequisites. No specific courses are required, but some graduate-level quantitative methods
coursework is prerequisite, as many of the applications we consider will assume famil-
iarity with the basics of research design and quantitative inference (linear regression &
elementary maximum likelihood).

Office Hours. Chris Adolph: by appointment via Zoom. Ramses Llobet: by appointment
via Zoom.

Course Website. Consult http://faculty.washington.edu/cadolph/vis for problem
sets, notes, and announcements.

Use of Generative Text and Images Prohibited. Students are prohibited from using
generative text or generative images – so-called artificial intelligence tools such as
ChatGPT or MidJourney – to assist in completing any course assignments. Students
should not use chatbots based on large language models to complete class assignments
because of the fundamental challenges these tools have in generating accurate state-
ments (“hallucination”) and the intrinsic inability of these tools to properly attribute
sources of information. Moreover, a strong ethical case can be made against the use of
either generative text or images in academic work due to the unauthorized use of copy-
righted materials to train the models underlying these tools. Regardless of the merits
of these tools, reliance on them in an instructional environment deprives students of
the opportunity to hone the research, writing, and coding skills required to evaluate or
refine their outputs. Even if there is a case for using chatbots or AI art in some contexts,
doing so in this course contradicts its core pedagogical aims.
Students may not use chatbots or AI art to produce, in whole or in part, either rough
or final drafts of computer code, figures, assignment write-ups, presentations, or pa-
pers: use of chatbots to assist in any of these tasks will be considered cheating and/or
academic fraud. With prior instructor approval, limited exceptions may be made only
for the use of AI to process text or image data into usable machine-readable formats;
in such cases, students should be mindful of ethical considerations in using such tools,
practical considerations regarding the reliability of data processed using generative AI,
and appropriate techniques for mitigating bias and hallucination. If you are uncertain
whether use of a specific resource violates these guidelines, ask your instructor before using it.

Penalty for Cheating or Academic Fraud. Any student caught by the instructor cheating or
plagiarizing on any assignment will receive a grade of X for the course and
will be reported to the Dean’s office in the College of Arts and Sciences.

Notice Required by State Law. Washington state law requires that UW develop a policy
for accommodation of student absences or significant hardship due to reasons of faith or conscience,
or for organized religious activities. The UW’s policy, including more information about how to
request an accommodation, is available at Religious Accommodations Policy
(https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy). Accommodations
must be requested within the first two weeks of this course using the Religious Accommodations
Request form (https://registrar.washington.edu/students/religious-accommodations-request).

Other relevant university policies. See this website:
https://registrar.washington.edu/staffandfaculty/syllabi-guidelines

Course Requirements
Homework (30%) I will assign three homeworks covering topics that include exploring
datasets, visualizing the results of statistical inference, and designing and programming
new visualizations. For some assignments, it will be possible to use a variety of graphics
packages to complete the assignment, but for most problems, R will be required or
strongly recommended. Help will be available for R and any other package specifically
recommended for the assignment, but not for other packages.

Breakout Groups (30%) Starting next week, students will self-select into a small Zoom
discussion group investigating the application of visual displays to a specific scientific
problem or area. This problem might consist of a difficult kind of model or dataset to
visualize. Alternatively, it might be a problematic or promising visual display method
used frequently in the student’s field which the student hopes to replace, improve, or
perfect. In past years, students investigated interactive graphics, animations, and visu-
alizations for text data, network data, hierarchical and multilevel data, spatial data, and
time series, respectively, among other topics. Students may choose among these topics
or propose their own. I reserve the right to decide which groups are large enough to
be viable and to combine groups if needed.
Before our joint Zoom meeting, each member of the breakout group will write and
circulate by emailed PDF to the group and to me a 2–5 page memo, complete with
(original or borrowed) graphics, illustrating a relevant data visualization problem they
wish to tackle and briefly sketching possible strategies for solving it. This memo need
not solve the data visualization problem and may not necessarily even present an
actual data analysis – the goal is to start a conversation about how we might approach a
student-selected visualization challenge. Each group will meet at least once for discus-
sion of their problem area and individual memos led by me. This meeting will occur
no earlier than the start of Week 4 (Monday, 27 January) and no later than the end of
Week 7 (Friday, 21 February).
By 9 AM Monday, 3 March, each group will post to Canvas a single, group-authored
report of 5 to 8 pages sharing lessons learned, recommendations for best practices,
and outstanding problems in the area studied by the group. (As a guide, imagine
the most useful brief introductory essay you could have read before further exploring
your breakout group’s topic: this is the ideal final report.)
During the week of 3 March, I will facilitate a (written) online discussion based on
these reports. Members of the class may ask any other group questions about their topic
and conclusions. Each member of the class should ask (at least) one original question
of another group, and each member should help answer at least one question directed
at their own group.
Credit for this portion of the course will be based on the individual memo, partic-
ipation in breakout discussions, the final report, and participation in the online class
discussion.

Final presentation (40%) Over the final two meetings of the course, each student will
present a poster¹ applying the tools learned in class to their own research. Alternatively,
students can take a published article in their field and show how better visuals
would either more clearly convey the findings or cast doubt on them, or present an
innovation in statistical graphics, preferably one which comes with software to help
implement the innovation. The final presentation may address problems related to
the topics pursued in the breakout group, but should represent primarily the work of
the presenting student, not the group: this is a separate assignment, and it is usually
more fruitful to tackle a second problem for the final presentation. Likewise, it’s use-
ful for the final poster to be substantially different from the homeworks, though it
may represent an evolution of a project explored in the homework assignments. Final
presentations must be emailed to your instructor in PDF format for credit to be given.

¹ Posters are used as an alternative to slide presentations in many fields. Guidance on poster
construction will be provided later in the quarter for students who have never made a scientific
poster. Students presenting interactive graphics as part of their final presentation should bring
a laptop displaying the interactive graphic, perhaps with a supporting poster explaining the
project if needed.

Group projects are permitted, but each member must have primary responsibility
for at least one figure, and this should be indicated in the email sending the poster to
me (but not in the poster itself).

NB: We will use Google Sheets to coordinate formation of breakout topics and groups, scheduling of
breakout meetings, and scheduling of final posters. Google Sheets requiring your attention will be
announced on the course mailing list. Prompt attention to Google Sheets requests is essential
to keeping the course on schedule.

Course texts
Visual display books are expensive; students should order based on their interests. The
description following each text may help select the most useful texts for permanent purchase. The
starred texts are the most essential for purchase.

Kieran Healy. 2018. Data Visualization. Princeton University Press. (Amazon: $45.95)
Guide to data visualization implementing many of this course’s recommendations in R’s ggplot2.

Edward R. Tufte. 2001. The Visual Display of Quantitative Information. Graphics Press. 2nd ed. (Amazon: $28.00)
The most famous and possibly the best book on data visualization ever written. Fun to read and essential.

William S. Cleveland. 1993. Visualizing Data. Hobart Press. (Amazon: $59.85)
Classic on the design of data visuals from a statistical perspective, especially for exploratory data analysis with many conditioning variables.

Paul Murrell. 2018. R Graphics. Chapman & Hall. 3rd ed. (Amazon: $82.95)
The authority on R’s various graphics engines; excellent technical reference for both beginners and programmers.

Colin Ware. 2020. Information Visualization. Morgan Kaufmann. 4th ed. (Amazon: $53.37)
Collects a wealth of cognitive science research on how people see and process data visuals. Helpful background; less emphasis on application.

Claus O. Wilke. 2019. Fundamentals of Data Visualization. O’Reilly. (Amazon: $53.73, but available free here: https://clauswilke.com/dataviz)
Nuts-and-bolts examples of effective visualization contrasted with common mistakes; short chapters and quick intuitions.

Nathan Yau. 2024. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics. Indianapolis: Wiley. 2nd ed. (Amazon: $24.10)
Gentle introduction to use of R and other packages to perform exploratory data analysis and make beautiful visual displays.

Recommended for further reading

Chris Beeley. 2013. Web Application Development with R Using Shiny. Packt Publishing.
Jacques Bertin. 1967. [2010.] Semiologie graphique. [Semiology of Graphics.]
trans. William J. Berg. ESRI Press.
R. Dennis Cook. 1998. Regression Graphics. Wiley Interscience.
Dianne Cook & Deborah F. Swayne. 2007. Interactive and Dynamic Graphics for Data
Analysis. Springer-Verlag.
Michael Friendly. 2000. Visualizing Categorical Data. SAS Publishing.
Ben Fry. 2007. Visualizing Data. O’Reilly.
Kosuke Imai. 2017. Quantitative Social Science: An Introduction. Princeton Univ. Press.
Julie Steele and Noah Iliinsky, eds. 2010. Beautiful Visualization. O’Reilly Media, Inc.
David McCandless. 2009. The Visual Miscellaneum. Harper Design.
Isabel Meirelles. 2013. Design for Information. Rockport Publishers.
Oscar Perpinan Lamigueiro. 2014. Displaying Time Series, Spatial, and Space-Time Data
with R. Chapman & Hall/CRC.
Deepayan Sarkar. 2008. Lattice: Multivariate Data Visualization with R. Springer-Verlag.

Edward Tufte. 1990. Envisioning Information. Graphics Press.
Edward Tufte. 1997. Visual Explanations. Graphics Press.
Edward Tufte. 2006. Beautiful Evidence. Graphics Press.
Howard Wainer. 2005. Graphic Discovery. Princeton University Press.
Hadley Wickham. 2009. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag.
Leland Wilkinson. 1999. The Grammar of Graphics. Springer-Verlag.
Graham Wills. 2012. Visualizing Time: Designing Graphical Representations for Statistical
Data. Springer.
Yihui Xie. 2013. Dynamic Documents with R and knitr. Chapman & Hall/CRC.

Tools
It’s easier than ever to create beautiful and effective scientific graphics, but not all graph-
ical software is created equal. Many commonly used packages – particularly Microsoft
Excel and its clones – combine inflexibility with poor default settings.

For the most part, students are not required to use a specific package, but are encour-
aged to use software that allows: (1) flexible generation of virtually any diagram, (2)
command line or code interface, perhaps in addition to a graphical interface, and (3)
widely usable output, such as postscript or PDF.

Recommended Software for Visual Display

R & RStudio. In-class code examples will use the R statistical language, which has all
these virtues in addition to being free, open source, and widely used. You can obtain
R at http://www.r-project.org. Throughout the course, I will provide example code
in R and can only promise detailed homework help for the R package. At least one
homework will require students to use R, so it’s worth downloading now. In particular,
the course will provide readings and examples drawing on the popular ggplot2 graphics
package and the instructor’s own tile graphics package, both available for R.

Illustrator. Adobe Illustrator is the industry standard for retouching postscript and PDF
graphics. Unfortunately, it is also (a) very expensive, even with an academic license
and (b) now only available as part of a subscription to a package of Adobe software (see
the Tech Center page at the University Bookstore’s website for details). Illustrator is
not required for the course but is worth considering as students develop their visualization
skills, especially for touching up final illustrations.

Other free tools. Yau’s Visualize This discusses other tools for getting data off the web
(like the Python programming language), constructing interactive graphics (like the
Processing language), and for working with maps (using SVG). Although we will not
cover these tools in class, they may be of use for student projects. A wealth of tools have
emerged to work in conjunction with R to create interactive graphics, animations, and
slides for the web (especially Shiny, but also rCharts, Slidify, gridSVG, and others).

Course outline
The readings for this course are complementary to the lectures and often cover topics
or directions we don’t have time to get to in lecture. It is thus more important than
usual for a statistics class that students should come to class having read the material
assigned for that day. The reading load for this class is considerably longer (in pages,
if not minutes) than the typical statistics class but is fun, quick, and essential: the best
way to learn effective visualization is to see how other scholars do it. Some of the
readings, particularly from the Journal of Computational and Graphical Statistics ( JCGS),
have technical portions, but in most cases these details can be skimmed unless/until you
want to code up these methods for yourself. Readings marked Optional are intended
to be read now if you are interested in or working on the graphical problem described
therein.
Note that if you are not familiar with R, you should begin reading the “optional”
selections from Zuur immediately.
On some days, we will open class with a “Gallery” in which I will present for dis-
cussion several innovative or problematic visualizations (see the course site for a list).
This will give everyone a chance to see the principles of the course in action, and learn
from both the successes and mistakes of other scientists, including your instructor. In
some cases, these “Gallery” lectures may be provided by pre-recorded video to save
class time.

Part I: Theory of Visualization


Monday, 6 January · Introduction
Optional: Tufte, Visual And Statistical Thinking, pp. 5–15

Wednesday, 8 January – Wednesday, 15 January · Principles of Information Visualization
Required: Tufte, VDQI, all
Wilke, Ch. 1–3, 15, 17, 29
Suggested: Yau, Ch. 1
Optional: Richard A. Feinberg and Howard Wainer. 2011. “Extracting
sunbeams from cucumbers.” JCGS 20:4.
NO CLASS MONDAY, 20 JANUARY: MARTIN LUTHER KING, JR. DAY

Wednesday, 22 January – Monday, 27 January · Cognitive Issues in Visualization


Required: Healy, Ch. 1
Wilke, Ch. 4–7, 19, 20
Suggested: Yau, Ch. 2–4
Optional: Jeffrey Heer and Michael Bostock. 2010. “Crowdsourcing graphical
perception: Using Mechanical Turk to assess visualization design.”
ACM Human Factors in Computing Systems (CHI). 203–212.
Ware, Ch. 1, 4, 5
Ware, Ch. 6
Rick Wicklin. 2011. “Visualizing airline delays and cancelations.”
JCGS 20.2 (heatmap example)
PROBLEM SET 1 DUE MONDAY, 27 JANUARY BY CANVAS SUBMISSION

Wednesday, 29 January – Monday, 3 February · Programming Visual Displays


Required: Healy, Ch. 2–5, 8
Wilke, Ch. 9–12, 18, 22–25, 27
Suggested: Murrell, Ch. 1–3, 6–7, 9–10
Yau, Ch. 5–6
Optional: Murrell, Ch. 4–5, 8, 11–17 (on lattice, ggplot2, advanced grid,
categorical data, maps, networks, 3D, dynamic
and interactive graphics)
Hadley Wickham. 2010. “A Layered Grammar of Graphics.”
JCGS 19:1. (on ggplot2)

Part II: Visualization for Statistical Applications
Wednesday, 5 February – Monday, 10 February · Exploratory Data Analysis
Required: Cleveland, Visualizing Data, selections.
W. N. Venables and B. D. Ripley. 2010. Modern applied statistics with S.
4th ed. Springer. Ch. 5 & 11.
Wilke, Ch. 13–14
Healy, Ch. 7 (on maps)
Suggested: Yau, Ch. 7 (on maps)
Optional: Ben Fry. 2007. Visualizing Data. O’Reilly. Ch. 1, 2, 4.
William G. Jacoby. 1998. “Statistical Graphics for Visualizing
Multivariate Data.” Sage Papers on Quantitative Applications in the
Social Sciences, selections.
Catherine B. Hurley. 2004. “Clustering visualizations of
multidimensional data.” JCGS 13:4.
Rida E. Moustafa, Ali S. Hadi, and Jürgen Symanzik. 2011.
“Multi-class data exploration using space transformed visualization
plots.” JCGS 20:2. (read for essential points and graphics)
Danny Holten. 2006. “Hierarchical edge bundles: Visualization of
adjacency relations in hierarchical data.” IEEE Transactions on
Visualization and Computer Graphics. 12:5 (on network data).
Christopher G. Healey. 2001. “Combining perception and
impressionistic techniques for nonphotorealistic visualization of
multidimensional data.” SIGGRAPH Paper.
Christopher Adolph. 2003. “Visual interpretation and presentation of
Monte Carlo results.” The Political Methodologist

Wednesday, 12 February – Monday, 24 February · Visualizing Model Inference
Required: Gary King, Michael Tomz, and Jason Wittenberg. 2000. “Making the
most of statistical analyses: Interpretation and presentation.”
American Journal of Political Science 44:2
Healy, Ch. 6.
Wilke, Ch. 16, 21
Suggested: Yau, Ch. 6, 8–9.
Optional: Andrew Gelman, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s
practice what we preach: Turning tables into graphs.” The American
Statistician 56:2.
Andrew Gelman. 2011. “Why tables are really much better than
graphs (with responses and rejoinder).” JCGS 20:1.
Rob J. Hyndman and Han Lin Shang. 2010. “Rainbow plots, bagplots,
and boxplots for functional data.” JCGS 19:1.
Ying Sun and Marc G. Genton. 2011. “Functional boxplots.”
JCGS 20:2 (note final figure for 3D confidence intervals).
PROBLEM SET 2 DUE WEDNESDAY, 19 FEBRUARY BY CANVAS SUBMISSION

NO CLASS MONDAY, 17 FEBRUARY: PRESIDENTS’ DAY

Wednesday, 26 February · Visualizing Model Robustness and Interactions


Required: Andrew Gelman. 2004. “Exploratory data analysis for complex models
(with response and rejoinder).” JCGS 13:4
Optional: Achim Zeileis, David Meyer, and Kurt Hornik. 2007. “Residual-based
shadings for visualizing (conditional) independence.” JCGS 16:3.
Christopher Adolph. 2013. Bankers, Bureaucrats, and Central Bank Politics:
The Myth of Neutrality. Cambridge University Press. Selected
chapters on the display of interactive specifications.

Monday, 3 March · Interactive Visual Displays


Required: “Tutorial: Building ‘Shiny’ Applications with R.”
shiny.rstudio.com/tutorial

Potential Bonus Lecture · Advanced LaTeX for Scientific Typesetting (if time permits)
Recommended: Tobias Oetiker, Hubert Partl, Irene Hyna, and Elisabeth Schlegl. 2021.
The Not-So-Short Introduction to LaTeX. Version 6.4.
Ch. 1–3 and possibly 6.
Optional: Will Robertson. “The fontspec package.” 2020.
Version 2.7i. (full modern type support for advanced LaTeX users).

Part III: Student Presentations


Wednesday, 5 March – Wednesday, 12 March · Final Poster Presentations
Students will have a chance to express preferred presentation dates, which we will ac-
commodate as far as is feasible given the constraint of keeping the number of presenta-
tions roughly equal across dates.

PROBLEM SET 3 DUE WEDNESDAY, 12 MARCH BY CANVAS SUBMISSION
