
Evaluation

Overview (1 of 2)
• Evaluation is the fourth main process of UX design that we
identified in Chapter 3.
• By evaluation we mean reviewing, trying out or testing a design
idea, a piece of software, a product or a service to discover
whether it meets some criteria.
• These criteria will often be summed up by the guidelines for good
design introduced in Chapter 5, namely that the system is
learnable, effective and accommodating.
• Sometimes the designer will want to focus on UX and measure
users’ enjoyment, engagement and aesthetic appreciation.
• At other times, the designer might be more interested in some
other characteristic of the design, such as whether a particular
web page has been accessed or whether a particular service
moment is causing users to walk away from the interaction.
Overview (2 of 2)
• UX designers are not concerned just with surface
features such as the design of icons or choice of
colours.
• They are also interested in whether the system
is fit for its purpose, enjoyable and engaging, and
whether people can quickly understand and use
the service.
• Evaluation is central to human-centred design
and is undertaken throughout the design process
whenever a designer needs to check an idea,
review a design concept or get reaction to a
physical design.
Objectives
• Appreciate the uses of a range of generally
applicable evaluation techniques designed for
use with and without users.
• Understand expert-based evaluation methods.
• Understand participant-based evaluation
methods.
• Understand and use data analytics.
• Apply the techniques in appropriate contexts.
Contents
• Introduction
• Data analytics
• Expert evaluation
• Participant-based evaluation
• Evaluation in practice
• Evaluation: further issues
Introduction
• The techniques in this chapter will allow you to evaluate
many types of product, system or service.
• Evaluation of different types of system, or evaluation in
different contexts, may offer particular challenges.
• Evaluation is closely tied to the other key activities of UX
design: understanding, design and envisionment.
• Evaluation is also critically dependent on the form of
envisionment used to represent the system. You will only
be able to evaluate features that are represented in a form
appropriate for the type of evaluation.
• There are also issues concerning who is involved in the
evaluation.
Design
• In our human-centred approach to design, we evaluate
designs right from the earliest idea.
• For example, very early ideas for a service can be
discussed with other designers in a team meeting.
• Mock-ups can be quickly reviewed, and later in the design
process, more realistic prototyping and testing of a
partially finished system can be evaluated with users.
• Statistical evaluations of the near-complete product or
service in its intended setting can be undertaken.
• Once the completed system is fully implemented,
designers can evaluate alternative interface designs by
gathering data about system performance.
Three main types of evaluation
• One involves a usability expert, or a UX designer, reviewing some
form of envisioned version of a design. These are expert-based
methods.
• Another involves recruiting people to use an envisioned version of a
system. These are participant-based methods, also called ‘user
testing’.
• A third method is to gather data on system performance once the
system or service is deployed. These methods are known as data
analytics.
• Expert-based methods will often pick up significant usability or UX
issues quickly, but experts will sometimes miss detailed issues that
real users find difficult.
• Participant methods must be used at some point in the development
process to get real feedback from users.
• Data analytics can be gathered and analysed once a system or
service is implemented.
Data analytics (1 of 2)
• It has often been said that this is the era of ‘big
data’.
• Huge amounts of data are being generated across
many different fields.
• The Internet of Things (IoT) refers to the
interconnectedness of sensors and devices with
one another and across the internet.
• Mobile devices are collecting increasing amounts
of personal data such as how many steps someone
has taken in a day.
• Other sensors measure heart rate, blood pressure
or levels of excitement in a person.
The quantified self (QS)
• The availability of various bio-sensors in
mobile and wearable devices has led to a
movement known as the quantified self or
personal analytics.
• Frequently associated with trying to get
people to behave in a more healthy way, QS
poses interesting questions about data
gathering and use.
• For example, a watch will vibrate if a wearer
has not stood up or moved around for an
hour.
• It monitors and displays heart rate data.
• Other personal data such as the number of
steps someone has taken in a day or the
number of stairs they have climbed are
presented on personal ‘dashboard’
visualizations. How people react to these
various representations of themselves is an
interesting issue (e.g. see Choe et al., 2014).
Data analytics (2 of 2)
• In terms of evaluating UX and other aspects of interactive
systems design, data analytics provides designers with data on
system performance and the behaviours of individuals in
interacting with systems and services.
• Data analytics also provides designers with interesting
visualizations of the data and tools to help manipulate and
analyse the data.
• The best known of the data analytics providers is Google
Analytics.
• This is a free service that provides data about where users of
websites and apps have come from (including their country,
and potentially more detailed information about location and the
device they were using) and what they did when they interacted
with the system (such as how long they used the system, which
pages of a site they visited, the order in which they viewed pages
and so on).
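• To make this concrete, the sketch below (in Python, using pandas) shows the kind of aggregation an analytics dashboard performs over raw page-view records; the field names and values are illustrative assumptions, not Google Analytics’ actual data model.

    import pandas as pd

    # Illustrative page-view records; real analytics data is far richer.
    views = pd.DataFrame([
        {"user": "u1", "country": "UK", "device": "mobile",  "page": "/home",    "seconds": 40},
        {"user": "u1", "country": "UK", "device": "mobile",  "page": "/pricing", "seconds": 95},
        {"user": "u2", "country": "DE", "device": "desktop", "page": "/home",    "seconds": 12},
    ])

    # Where did visitors come from, and how long did they spend in total?
    print(views.groupby("country")["seconds"].sum())

    # Which pages were viewed on which devices?
    print(views.groupby(["device", "page"]).size())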
Facebook analytics
• Facebook analytics for apps is a free service that can be
installed and provides information about who used an app on
Facebook.
• Since users on Facebook have often provided a lot of personal
information, more details of the users can be found.
• Google Analytics can provide demographic information based
on what users have told Google, using a similar approach to that
used to target Google Ads (advertisements).
• The data from Google or Facebook analytics is displayed
using a ‘dashboard’.
• Using these data analytics services, designers can examine
the activities of individuals and different groups such as
Android phone users, people who accessed from a desktop
machine using a particular browser and people who access
the site from a particular location.
Other data analytics
• Other data analytic tools will provide
a ‘heat map’ of a website showing
which parts of a page are clicked on
most frequently.
• Heat maps can also be produced
from eye-tracking software which
measures where people are looking
on a display.
• Other tools will allow the analyst to
follow people’s browsing behaviour in
real time, watching what they click
on, how long they spend on
particular sections and whether there
are particular service moments
where people drop out of the
customer journey.
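• As a minimal sketch of the idea, recorded (x, y) click coordinates can be binned into a two-dimensional histogram and rendered as a heat map; the Python code below uses invented, randomly generated click positions purely for illustration.

    import numpy as np
    import matplotlib.pyplot as plt

    # Invented click positions clustering near the top centre of a 1280x720 page.
    rng = np.random.default_rng(0)
    clicks_x = rng.normal(640, 120, size=2000)
    clicks_y = rng.normal(200, 80, size=2000)

    # Bin the clicks into a grid: high counts correspond to 'hot' areas.
    heat, _, _ = np.histogram2d(clicks_x, clicks_y,
                                bins=(64, 36), range=[[0, 1280], [0, 720]])

    plt.imshow(heat.T, origin="lower", extent=[0, 1280, 0, 720], cmap="hot")
    plt.title("Click density (illustrative data)")
    plt.show()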
Understanding through data analytics
• The ability to understand user behaviour through data analytics, combined with
the ability to rapidly deploy new versions of software, is changing the nature of
interactive software development.
• For example, a games company that has deployed a game on Facebook can
watch what players are doing in real time.
• If they notice some particular phenomenon – such as many people dropping
out of the game before they move on to the next level – they can easily
change the game – perhaps by introducing a surprise prize of extra money just
before the end of the level – and thus encourage players to keep playing.
• In other circumstances, a company may issue its software with two alternative
interfaces or with slightly different interfaces. The two interfaces are randomly
assigned to users as they log onto a site.
• By looking at the analytics of the two interfaces, analysts can see which is
performing better.
• This is known as A/B testing and is increasingly used to refine the UX of
commercial websites.
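• A minimal sketch of how A/B test results might be analysed is shown below: a two-proportion z-test compares the conversion rates of the two interfaces. The counts are invented for illustration, and commercial A/B testing platforms typically run this kind of analysis for you.

    from statsmodels.stats.proportion import proportions_ztest

    # Invented results: users who completed the goal under each variant.
    conversions = [310, 365]   # variant A, variant B
    visitors = [5000, 5000]    # users randomly assigned to each variant

    z_stat, p_value = proportions_ztest(conversions, visitors)
    print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
    # A small p-value (say, below 0.05) suggests the difference in conversion
    # rate between the two interfaces is unlikely to be due to chance alone.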
Expert evaluation
• A simple, relatively quick and effective method of
evaluation is to get a UX or usability expert to look at a
service or system and try using it.
• As we said in the introduction, this is no substitute for
getting real people to use a design but expert evaluation is
effective, particularly early in the design process.
• Experts will pick up common problems based on their
experience and will identify factors that might otherwise
interfere with an evaluation by non-experts.
• Although the methods have been around for over 20 years,
expert-based methods are still widely used by industry
(Rohrer et al., 2016).
• However, to help the experts structure their evaluation, it is
useful to adopt a particular approach.
Heuristic evaluation (1 of 3)
• Heuristic evaluation refers to a number of methods in
which a person trained in HCI, UX or interaction design
examines a proposed design to see how it measures up
against a list of principles, guidelines or ‘heuristics’ for
good design.
• This review may be a quick discussion over the shoulder
of a colleague or may be a formal, carefully documented
process.
• Ideally, several people with expertise in interactive
systems design should review the interface.
• Each expert notes the problems and the relevant heuristic
and suggests a solution where possible.
Heuristic evaluation (2 of 3)
• It is also helpful if a severity rating, say on a scale of 1 to 3, is
added, according to the likely impact of the problem, as
recommended by Dumas and Fox (2012) in their comprehensive
review of usability testing.
• However, they also note the disappointing level of correlation
amongst experts in rating severity of problems.
• Evaluators work independently and then combine results.
• They may need to work through any training materials and be
briefed by the design team about the functionality.
• The scenarios used in the design process are valuable here.
• The list of design principles above can be summarized by the
three overarching usability principles of learnability (principles 1–
4), effectiveness (principles 5–9) and accommodation (principles
10–12).
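• As a minimal sketch, the findings of independent experts could be recorded and merged along the lines below; the 1–3 severity scale follows the text, while the example issue, heuristic name and fields are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class Issue:
        location: str   # where in the interface the problem occurs
        heuristic: str  # which principle or heuristic is violated
        severity: int   # 1 = minor irritation ... 3 = blocks the task
        note: str       # description and, where possible, a suggested fix

    expert_a = [Issue("checkout page", "visibility of system status", 3,
                      "No feedback after pressing 'Pay'; add a progress indicator")]
    expert_b = [Issue("checkout page", "visibility of system status", 2,
                      "Unclear whether payment has started")]

    # Experts work independently; their findings are then combined and
    # sorted by severity so the most serious problems are tackled first.
    for issue in sorted(expert_a + expert_b, key=lambda i: i.severity, reverse=True):
        print(f"[severity {issue.severity}] {issue.location}: {issue.note}")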
Heuristic evaluation (3 of 3)
• Heuristic evaluation therefore is valuable as
formative evaluation to help the designer
improve the interaction at an early stage.
• It is less suited to summative evaluation, for example
establishing that one design measurably outperforms another.
• If that is what we need to do, then we must
carry out properly designed and controlled
experiments with a much greater number of
participants.
• However, the more controlled the testing
situation becomes, the less it is likely to
resemble the real world, which leads us to the
question of ‘ecological validity’.
Ecological validity (1 of 2)
• In real life, people multitask, use several applications in
parallel or in quick succession, are interrupted, improvise,
ask other people for help, use applications intermittently
and adapt technologies for purposes the designers never
imagined.
• We have unpredictable, complex but generally effective
coping strategies for everyday life and the technologies
supporting it.
• People switch channels and interleave activities.
• The small tasks which are the focus of most evaluations are
usually part of lengthy sequences directed towards aims
which change according to circumstances.
• All of this is extremely difficult to reproduce in testing and is
often deliberately excluded from expert evaluations.
Ecological validity (2 of 2)
• So, the results of most evaluation can only
ever be indicative of issues in real-life
usage.
• Ecological validity is concerned with
making an evaluation as life-like as
possible.
• Designers can create circumstances that
are as close to the real life environment as
possible when undertaking an evaluation.
• Designs that appear robust in controlled,
‘laboratory’ settings can perform much less
well in real-life, stressed situations.
Cognitive walkthrough (1 of 3)
• Cognitive walkthrough is a rigorous paper-based technique for
checking through the detailed design and logic of steps in an
interaction.
• It is derived from the human information processor view of
cognition and closely related to task analysis.
• In essence, the cognitive walkthrough entails a usability or UX
analyst stepping through the cognitive tasks that must be carried
out in interacting with technology.
• Originally developed by Lewis et al. (1990) for applications
where people browse and explore information, it has been
extended to interactive systems in general (Wharton et al.,
1994).
• Aside from its systematic approach, the great strength of the
cognitive walkthrough is that it is based on well-established
theory rather than the trial and error of a heuristically based
approach.
Cognitive walkthrough (2 of 3)
• Inputs to the process are:
‒ An understanding of the people who are expected to use the system.
‒ A set of concrete scenarios representing both (a) very common and (b)
uncommon but critical sequences of activities.
‒ A complete description of the interface to the system – this should comprise
both a representation of how the interface is presented, for example screen
designs, and the correct sequence of actions for achieving the scenario
tasks, usually as a hierarchical task analysis (HTA).
• Having gathered these materials together, the analyst asks the following
four questions for each individual step in the interaction:
‒ Will the people using the system try to achieve the right effect?
‒ Will they notice that the correct action is available?
‒ Will they associate the correct action with the effect that they are trying to
achieve?
‒ If the correct action is performed, will people see that progress is being
made towards the goal of their activity?
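• As a minimal sketch, the walkthrough judgements could be recorded as below: each step in the correct action sequence is checked against the four questions, and any ‘no’ flags a potential problem. The scenario and steps are illustrative assumptions.

    QUESTIONS = [
        "Will users try to achieve the right effect?",
        "Will they notice that the correct action is available?",
        "Will they associate the correct action with the desired effect?",
        "If the correct action is performed, will they see progress being made?",
    ]

    # Illustrative scenario: booking a journey. One answer per question, per step.
    walkthrough = [
        {"step": "Open the booking form", "answers": [True, True, True, True]},
        {"step": "Select a return date",  "answers": [True, False, True, True]},
    ]

    for entry in walkthrough:
        for question, ok in zip(QUESTIONS, entry["answers"]):
            if not ok:
                print(f"Problem at '{entry['step']}': {question}")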
Cognitive walkthrough (3 of 3)
• If any of the questions is answered in the negative,
then a usability problem has been identified and is
recorded, but redesign suggestions are not made at
this point.
• If the walkthrough is being used as originally devised,
this process is carried out as a group exercise by
analysts and designers together.
• The analysts step through usage scenarios and the
design team are required to explain how the user
would identify, carry out and monitor the correct
sequence of actions.
• Software designers in organizations with structured
quality procedures in place will find some similarities
to program code walkthroughs.
Usability evaluation
• Most of the expert-based evaluation methods
focus on the usability of systems.
• For example, our own set of heuristics focuses on
usability, as does Nielsen’s.
• Other writers develop heuristics specifically for
websites or particular types of websites such as
e-commerce sites.
• However, there is no problem with designers
devising their own heuristics that focus on
particular aspects of the UX that they are
interested in.
Semantic differential
• Recall from Chapter 7 the discussion on semantic
differential and semantic understanding as ways of
helping designers to understand users’ views of a domain.
• Also in Chapter 8, we discussed how descriptive
adjectives (which essentially are semantic descriptors)
could be used as a method of envisioning the
characteristics that some UX should aim to achieve.
• These descriptors can then be used as an evaluation tool,
with UX experts working through an envisionment of a
design and rating the experience against the specific
characteristics that the design was intended to achieve.
• For example, we undertook an expert-based walkthrough
of the Visit Scotland app (Section 8.3) to see if it achieved
its objectives of being engaging, authoritative and modern.
Participant-based evaluation
• Whereas expert, heuristic evaluations can be
carried out by designers on their own, there can
be no substitute for involving some real people in
the evaluation.
• Participant evaluation aims to do exactly that.
• There are many ways to involve people that
require various degrees of cooperation.
• The methods range from designers sitting with
participants as they work through a system to
leaving people alone with the technology and
observing what they do through a two-way
mirror.
Cooperative evaluation
• Andrew Monk and colleagues (Monk et al.,
1993) at the University of York (UK) developed
cooperative evaluation as a means of
maximizing the data gathered from a simple
testing session.
• The technique is ‘cooperative’ because
participants are not passive subjects but work
as co-evaluators (Figure 10.6).
• It has proved a reliable but economical
technique in diverse applications. Table 10.1
and the sample questions are edited from
Appendix 1 in Monk et al. (1993).
Guidelines for cooperative evaluation
Participatory heuristic evaluation
• The developers of participatory heuristic evaluation
(Muller et al., 1998) claim that it extends the power of
heuristic evaluation without adding greatly to the effort
required.
• An expanded list of heuristics is provided, based on
those of Nielsen and Mack (1994) – but of course you
could use any heuristics such as those introduced
earlier (Chapter 4).
• The procedure for the use of participatory heuristic
evaluation is just as for the expert version, but the
participants are involved as ‘work-domain experts’
alongside usability experts and must be briefed about
what is required.
Co-discovery (1 of 2)
• Co-discovery is a naturalistic, informal technique that is
particularly good for capturing first impressions. It is best
used in the later stages of design.
• The standard approach of watching individual people
interacting with the technology, and possibly ‘thinking
aloud’ as they do so, can be varied by having participants
explore new technology in pairs.
• For example, a series of pairs of people could be given a
prototype of a new digital camera and asked to
experiment with its features by taking pictures of each
other and objects in the room.
• It is a good idea to use people who know each other
quite well.
• As with most other techniques, it also helps to set
participants some realistic tasks to try out.
Co-discovery (2 of 2)
• Depending on the data to be collected, the
evaluator can take an active part in the
session by asking questions or suggesting
activities, or simply monitor the interaction
either live or using a video recording.
• Inevitably, asking specific questions skews the
output towards the evaluator’s interests, but
does help ensure that all important angles are
covered.
• The term ‘co-discovery’ originates from Kemp
and van Gelderen (1996) who provide a
detailed description of its use.
Controlled experiments (1 of 3)
• Another way of undertaking participant evaluation is to set up a
controlled experiment.
• Controlled experiments are appropriate where the designer is
interested in particular features of a design, perhaps comparing
one design to another to see which is better.
• The first thing to do when considering a controlled experiment
approach to evaluation is to establish what it is that you are
looking at.
• This is the independent variable.
• For example, you might want to compare two different designs of
a website, or two different ways of selecting a function on a
mobile phone application.
• Later, we describe an experiment that examined two different
ways of presenting an audio interface to select locations of
objects (Chapter 18). The independent variable was the type of
audio interface.
Controlled experiments (2 of 3)
• Once you have established what it is you are looking at, you
need to decide how you are going to measure the difference.
• These are the dependent variables.
• You might want to judge which web design is better based on the
number of clicks needed to achieve some task; speed of access
could be the dependent variable for selecting a function.
• In the case of the audio interface, accuracy of location was the
dependent variable.
• Once the independent and dependent variables have been
agreed, the experiment needs to be designed to avoid anything
getting in the way of the relationship between independent and
dependent variables.
• You want to ensure a balanced and clear relationship between
independent and dependent variables so that you can be sure
you are looking at the relationship between them and nothing
else.
Controlled experiments (3 of 3)
• The next stage is to decide whether each participant will
participate in all conditions (the so-called within-subject design)
or whether each participant will perform in only one condition (the
so-called between-subject design).
• Having got some participants to agree to participate in a
controlled experiment, it is tempting to try to find out as much as
possible, but keeping the experiment focused on the chosen
variables usually gives clearer results.
• A controlled experiment will often result in some quantitative
data: the measures of the dependent values.
• This data can then be analysed using statistics, for example
comparing the average time to do something across two
conditions or the average number of clicks.
• So, to undertake controlled experiments, you will need some
basic understanding of probability theory, of experimental theory
and, of course, of statistics.
• Statistical software such as SPSS will help you design the
experiment and analyse your data.
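• As a minimal sketch, the analysis of a between-subject comparison of two designs might look like the Python code below; the task times are invented, and the choice of an independent-samples t-test assumes the usual conditions for that test hold.

    from scipy import stats

    # Invented task-completion times (seconds) for two interface designs,
    # each measured on a different group of participants (between-subject).
    design_a = [34.1, 29.8, 41.2, 36.5, 30.9, 38.4, 33.7, 35.2]
    design_b = [27.4, 31.0, 25.8, 29.3, 26.7, 30.1, 28.5, 27.9]

    t_stat, p_value = stats.ttest_ind(design_a, design_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    # For a within-subject design, where the same participants use both
    # interfaces, stats.ttest_rel (a paired test) would be used instead.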
Evaluation in practice
• The main steps in undertaking a simple but effective
evaluation project are:
‒ Establish the aims of the evaluation, the intended
participants in the evaluation, the context of use and the state
of the technology; obtain or construct scenarios illustrating
how the application will be used.
‒ Select evaluation methods. These should be a combination
of expert-based review methods and participant methods.
‒ Carry out expert review.
‒ Plan participant testing; use the results of the expert review
to help focus this.
‒ Recruit people and organize testing venue and equipment.
‒ Carry out the evaluation.
‒ Analyse results, document and report back to designers.
Source: Adapted from Vredenburg, K., Mao, J.-Y., Smith, P.W. and Carey, T. (2002) A survey of user-centred design practice, Proceedings of SIGCHI conference on human factors in computing
systems, MN, 20–25 April, pp. 471–478, Table 3. © 2002 ACM, Inc. Reprinted by permission
Aims of the evaluation (1 of 2)
• Deciding the aim(s) for evaluation helps determine the type
of data required.
• It is useful to write down the main questions you need to
answer.
• For example, in the evaluation of the early concept for a
virtual training environment, the aims were to investigate the
following:
‒ Do the trainers understand and welcome the basic idea of the virtual
training environment?
‒ Would they use it to extend or replace existing training courses?
‒ How close to reality should the virtual environment be?
‒ What features are required to support record keeping and
administration?
• The data we were interested in at this stage was largely
qualitative (non-numerical), so appropriate data gathering
methods were interviews and discussions with the trainers.
Aims of the evaluation (2 of 2)
• If the aim of the evaluation is the comparison of two different
designs, then much more focused questions will be
required and the data gathered will be more quantitative. In
the virtual training environment, for example, some questions
we asked were:
‒ Is it quicker to reach a particular room in the virtual
environment using mouse, cursor keys or joystick?
‒ Is it easier to open a virtual door by clicking on the handle
or selecting the ‘open’ icon from a tools palette?
• Underlying issues were the focus on speed and ease of
operation.
• This illustrates the link between analysis and evaluation – in
this case, it had been identified that these qualities were
crucial for the acceptability of the virtual training environment.
• With questions such as these, we are likely to need
quantitative (numerical) data to support design choices.
Metrics and measures
• In most evaluations there is a task – something the participant
needs to get done – and it is reasonably straightforward to
decide whether the task has been achieved successfully or
not.
• There is one major difficulty: deciding the acceptable figure
for, say, the percentage of tasks successfully completed. Is
this 95 per cent, 80 per cent or 50 per cent?
• There are three things to keep in mind when deciding metrics:
‒ Just because something can be measured, it doesn’t
mean it should be.
‒ Always refer back to the overall purpose and context of
use of the technology.
‒ Consider the usefulness of the data you are likely to
obtain against the resources it will take to test against the
metrics.
• The last point is particularly important in practice.
People (1 of 2)
• The most important people in evaluation are the people who will
use the system.
• Analysis work should have identified the characteristics of these
people and represented these in the form of personas.
• Relevant data can include knowledge of the activities the
technology is intended to support, skills relating to input and
output devices, experience, education, training and physical and
cognitive capabilities.
• Nielsen’s recommended sample of 3–5 participants has been
accepted wisdom in usability practice for over a decade.
• However, some practitioners and researchers advise that this is
too few.
• We consider that in many real-world situations, obtaining even 3–
5 people is difficult, so we continue to recommend small test
numbers as part of a pragmatic evaluation strategy.
People (2 of 2)
• However, testing such a small number makes sense only if you have a
relatively homogeneous group to design for – for example, experienced
managers who use a customer database system or computer games
players aged between 16 and 25.
• If you have a heterogeneous set of customers that your design is aimed
at, then you will need to run 3–5 people from each group through your
tests.
• Finding representative participants should be straightforward if you are
developing an in-house application.
• If you cannot recruit any genuine participants – people who are really
representative of the target customers – and you are the designer of the
software, at least have someone else try to use it.
• Consider your own role and that of others in the evaluation team if you
have one.
• Our recommended method for basic testing requires an evaluator to sit
with each user and engage with them as they carry out the test tasks.
Physical and physiological measures (1 of 2)
• Eye-movement tracking (or ‘eye tracking’) can show participants’
changing focus on different areas of the screen.
• This can indicate which features of a user interface have attracted
attention, and in which order, or capture larger-scale gaze patterns
indicating how people move around the screen.
• Eye tracking is very popular with website designers as it can be
used to highlight which parts of the page are most looked at, the
so-called ‘hot spots’, and which are missed altogether.
• Eye-tracking software is readily available to provide maps of the
screen.
• Some of it can also measure pupil dilation, which is taken as an
indication of arousal. Your pupil dilates if you like what you see.
• Physiological techniques in evaluation rely on the fact that all our
emotions – anxiety, pleasure, apprehension, delight, surprise and
so on – generate physiological changes.
Physical and physiological measures (2 of 2)
• The most common measures are of changes in heart rate, the rate of
respiration, skin temperature, blood volume, pulse and galvanic skin
response (an indicator of the amount of perspiration).
• Sensors can be attached to the participant’s body (commonly the
fingertips) and linked to software which converts the results to numerical
and graphical formats for analysis.
• Another key aspect of the evaluation is the data that you will need to
gather about the people.
• Eye-tracking can be used to see where people are looking.
• Face recognition can determine if people are looking happy or sad,
confused or angry.
• The Facial Action Coding System (FACS) is a robust way of measuring
emotion through facial expression.
• These various measures can be combined into a powerful way of
evaluating UX.
The test plan and task specification
• A plan should be drawn up to guide the evaluation. The
plan specifies:
‒ Aims of the test session
‒ Practical details, including where and when it will be conducted,
how long each session will last, the specification of equipment and
materials for testing and data collection, and any technical support
that may be necessary
‒ Numbers and types of participant
‒ Tasks to be performed, with a definition of successful completion.
This section also specifies what data should be collected and how
it will be analysed.
• You should now conduct a pilot session and fix any
unforeseen difficulties. For example, task completion time
is often much longer than expected and instructions may
need clarification.
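• As a minimal sketch, a test plan could be captured as a simple structure like the one below so it can be shared, piloted and checked off; the particular fields and values are illustrative assumptions rather than a prescribed format.

    test_plan = {
        "aims": ["Can first-time users complete a booking unaided?"],
        "practical_details": {
            "venue": "usability lab",
            "session_length_minutes": 45,
            "equipment": ["laptop with prototype", "screen recorder"],
        },
        "participants": {"number": 5, "profile": "experienced managers"},
        "tasks": [
            {"description": "Book a return trip for two people",
             "success_criterion": "booking confirmed within 10 minutes",
             "data_collected": ["completion time", "errors", "think-aloud notes"]},
        ],
    }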
Reporting usability evaluation results
• However competent and complete the evaluation, it is only worthwhile if
the results are acted upon.
• Even if you are both designer and evaluator, you need an organized list
of findings so that you can prioritize redesign work.
• If you are reporting back to a design/development team, it is crucial that
they can see immediately what the problem is, how significant its
consequences are and ideally what needs to be done to fix it.
• The report should be ordered either by areas of the system concerned
or by severity of problem.
• For the latter, you could adopt a three- or five-point scale, perhaps
ranging from ‘would prevent participant from proceeding further’ to
‘minor irritation’.
• Adding a note of the general usability principle concerned may help
designers to understand why there is a difficulty but often more specific
explanation will be needed.
• Alternatively, sometimes the problem is so obvious that explanation is
superfluous.
Evaluating usability
• There are several standard ways of measuring usability, but
probably the best known and most robust is the System
Usability Scale (SUS).
• Jeff Sauro presents the scale as illustrated in Figure 10.10. He
suggests that any score over 68 is above average and indicates
a reasonable level of usability.
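• As background, the standard SUS scoring rule converts the ten 1–5 responses into a 0–100 score: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the total is multiplied by 2.5. A minimal sketch, with invented responses:

    def sus_score(responses):
        """Return the 0-100 SUS score for ten responses on 1-5 scales."""
        assert len(responses) == 10
        total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based: even i = odd-numbered item
                    for i, r in enumerate(responses))
        return total * 2.5

    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # 85.0, above the 68 benchmark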
Evaluating UX (1 of 2)
• There are a number of tools and methods
specifically aimed at evaluating user experience.
• They differentiate between the pragmatic qualities
of the UX and the hedonic qualities (Hassenzahl,
2010).
• The user experience questionnaire describes
these qualities as illustrated.
• A 26-item questionnaire is used to gather data
about a UX (Figure 10.12).
• On-line spreadsheets are available to help with
the statistical analysis of the data gathered.
User experience questionnaire (UEQ)
UEQ semantic differential
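• As a minimal sketch of UEQ-style analysis: the 7-point responses are recoded to a −3 to +3 range and averaged per scale. The grouping of items into scales below is purely illustrative; the official UEQ analysis spreadsheet defines the real item-to-scale mapping and handles reversed items.

    # Invented responses to four items on 1-7 scales.
    raw = {"item_1": 6, "item_2": 5, "item_3": 2, "item_4": 3}

    # Recode 1..7 to -3..+3 so that 0 is the neutral midpoint.
    recoded = {item: value - 4 for item, value in raw.items()}

    # Illustrative grouping only; see the UEQ documentation for the real scales.
    scales = {"pragmatic quality": ["item_1", "item_2"],
              "hedonic quality": ["item_3", "item_4"]}

    for scale, items in scales.items():
        mean = sum(recoded[i] for i in items) / len(items)
        print(f"{scale}: {mean:+.2f}")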
Evaluating UX (2 of 2)
• An alternative is to use the AttrakDiff online questionnaire
(Figure 10.13).
• This has a similar approach but uses different terms.
• Both of these questionnaires can be used as they are, and this
has the advantage that comparisons can be made across
products and services.
• For specific evaluations, however, UX designers may need to
change the terms used on the semantic differential scales.
Evaluating presence
• Designers of virtual reality (VR) and augmented reality (AR)
applications are often concerned with the sense of presence, of
being ‘there’ in the virtual environment rather than ‘here’ in the
room where the technology is being used.
• A strong sense of presence is thought to be crucial for such
applications as games, those designed to treat phobias, to allow
people to ‘visit’ real places they may never see otherwise or indeed
for some workplace applications such as training to operate
effectively under stress.
• This is a very current research topic and there are no techniques
that deal with all the issues satisfactorily.
• The Sense of Presence Inventory (SOPI) can be used to measure
media presence.
• The Witmer and Singer Immersive Tendencies Questionnaire
(Witmer and Singer, 1998) is the best known of such instruments.
• Other approaches to measuring presence attempt to avoid such
layers of indirection by observing behaviour in the virtual
environment or by direct physiological measures.
Evaluation at home
• People at home are much less of a ‘captive audience’ for the
evaluator than those at work.
• They are also likely to be more concerned about protecting their
privacy and generally unwilling to spend their valuable leisure time
in helping you with your usability evaluation.
• So, it is important that data gathering techniques are interesting
and stimulating for users and make as little demand on time and
effort as possible.
• Petersen et al. (2002), for example, were interested in the evolution
over time of relationships with technology in the home.
• Diaries were also distributed as a data collection tool, but in this
instance the non-completion rate was high.
• Where the family is the focus of interest, techniques should be
engaging for children as well as adults – not only does this help to
ensure that all viewpoints are covered but also working with
children is a good way of drawing parents into evaluation activities.
Summary

• This chapter has presented an overview of the key issues in evaluation.


• Designing the evaluation of an interactive system, product or service
requires as much attention and effort as designing any other aspect of
that system.
• Designers need to be aware of the possibilities and limitations of
different approaches and, in addition to studying the theory, they need
plenty of practical experience.
• Designers need to focus hard on what features of a system or product
they want to evaluate.
• They need to think hard about the state that the system or product is in
and hence whether they can evaluate those features.
Questions and Answers
