Computers & Education: Kai-Hsiang Yang, Bou-Chuan Lu

Computers & Education 160 (2021) 104033
Towards the successful game-based learning: Detection and feedback to misconceptions is the key☆
Kai-Hsiang Yang *, Bou-Chuan Lu
Department of Mathematics and Information Education, National Taipei University of Education, Taipei, Taiwan, ROC
Keywords: Elementary education; Games; Teaching/learning strategies
Abstract
In recent years, digital game-based learning models with suitable teaching strategies have proved to be of great educational
benefit. Two-tier tests are an effective diagnostic tool and can be used to diagnose students’ misconceptions. This study
therefore designed an educational game with a two-tier testing mechanism to detect students’ misconceptions during the game and
provide different kinds of feedback for those misconceptions. In addition, we used lag sequence analysis to analyze the
behavioral patterns that students exhibited during gameplay in order to understand the influence of different types of feedback
content and reading time on learning effectiveness. The results indicate no significant difference in learning effectiveness
between digital game-based learning with two-tier testing and general digital game-based learning. However, the results of the
lag sequence analysis show that students who are willing to spend time reading feedback for the misconceptions are likely to
achieve better learning effectiveness. Furthermore, incorporating two-tier testing into digital games can effectively reduce
mathematics anxiety and help learners learn well.
1. Introduction
In recent years, digital game-based learning has attracted considerable attention in academia and has become a popular research
topic (Chen, Zou, Cheng, & Xie, 2020). Many related studies have suggested that digital game-based learning exerts a positive impact
on the learning motivation and attitude of learners (Tapingkae, Panjaburee, Hwang, & Srisawasdi, 2020; Taub et al., 2020; Yang,
Chang, Hwang, & Zou, 2020). Besides, a number of studies have pointed out that suitable teaching strategies must be integrated into
digital game-based learning in order to effectively enhance the learning achievements and problem-solving capabilities of students
(Hsu & Wang, 2018; Sun, Chen, & Chu, 2018). Many studies have mentioned that it is important to add different learning strategies and
feedback to digital games. However, the learning content and question feedback in these games are fixed at design time, so the
games cannot dynamically diagnose learners’ misconceptions and then provide appropriate feedback. Therefore, how to develop a
digital game-based learning model with a mechanism that can diagnose misconceptions is an important research issue.
A two-tier test is a type of assessment model that enables teachers to detect the reasons why learners solve problems the way they do
and determine whether the learners have any misconceptions or alternative conceptions. The development procedure of two-tier
testing was proposed by Treagust (1988) and comprises a development framework with three phases and ten steps. Many studies
have used two-tier tests to determine how much groups of students understood or misunderstood various concepts. Although
two-tier testing has proved to be an effective diagnostic tool, no research has yet explored its effect in a
digital game-based learning environment.
☆ This study is supported in part by the Ministry of Science and Technology of the Republic of China under contract numbers
MOST 107-2511-H-152-007-MY2 and 109-2511-H-152-009.
* Corresponding author.
E-mail address: khyang@tea.ntue.edu.tw (K.-H. Yang).
https://doi.org/10.1016/j.compedu.2020.104033
Received 19 March 2020; Received in revised form 21 September 2020; Accepted 25 September 2020
Available online 29 September 2020
0360-1315/© 2020 Published by Elsevier Ltd.
To study the effect of the two-tier testing in the digital game-based learning environment, this study developed a digital game
system with two-tier testing and applied it to the learning of factors and multiples. The participants of our experiment were fifth-grade
elementary school students. We compared the influence of the developed system and a general digital game system on the learning
effectiveness and mathematics anxiety of the students. We also investigated the influence of feedback content and reading time on
learning effectiveness, and used lag sequence analysis to analyze the sequences of student behavior. Lag sequence analysis can help
researchers check the sequence relationship between each learning behavior based on statistical theory. It can also easily identify
important sequential behavior patterns and illustrate the relationship between behaviors through visual charts. Many studies have
used this technique to analyze the learning behavior patterns of learners in games to understand how games affect learner learning
(Sun, Kuo, Hou, & Lin, 2017; Yang, Chu, & Chiang, 2018). The research questions of this study were as follows:
(1). Do students using the game-based learning model with two-tier testing present better learning effectiveness than those using a
general game-based learning model?
(2). Do students using the game-based learning model with two-tier testing show greater improvement in terms of mathematics
anxiety than those using a general game-based learning model?
(3). What learning behavior patterns do students using the game-based learning model with two-tier testing display?
2. Literature review
2.1. Game-based learning
In recent years, digital game-based learning has attracted considerable attention in academia and has become a popular research
topic (Chen et al., 2020). Many related studies have suggested that digital game-based learning exerts a positive impact on the learning
motivation and attitude of learners (Taub et al., 2020; Yang et al., 2020). Digital games have interesting storylines, clear goals, and
tasks to be solved, which makes teaching more diverse and effectively increases the learning interest and learning effectiveness of
students.
A number of studies have pointed out that suitable teaching strategies must be integrated into digital game-based learning in order
to effectively enhance the learning achievements and problem-solving capabilities of students (Hsu & Wang, 2018; Sun et al., 2018;
Tapingkae et al., 2020; Yang, 2017). For example, Yang (2017) proposed a mastery theory-based digital game with different feedback
models, and compared the differences in the learning behavior of students using the two feedback models. The results showed that
students using either feedback model can achieve the same learning performance as those in the conventional learning method with a
teacher involved. Hsu and Wang (2018) used an online puzzle game to investigate the influence of game mechanics and
student-generated questions on arithmetic thinking. Their results indicated that integrating game mechanics and student-generated
questions can effectively improve the arithmetic thinking of learners. Sun et al. (2018) examined the influence of the scaffolding
strategy and reward mechanisms on students. They found that the right scaffolds not only improve gaming experiences and creativity
but also help learners formulate even more complex learning strategies. They explained that incorporating the scaffolding theory into
digital game-based learning can help reduce frustration, enhance problem-solving skills, and, through reward mechanisms, increase
learning motivation. Tapingkae et al. (2020) proposed a formative assessment-based contextual gaming approach to guide students to
make decisions and to monitor their learning during the gaming process. Their experimental results showed that the proposed
approach not only enhanced the students’ digital citizenship behaviors, but also promoted their motivations and perceptions.
As shown above, many related studies have mentioned that it is important to add different learning strategies and feedback to
digital games, but the learning content and question feedback in these games are fixed at design time. The games therefore cannot
dynamically diagnose learners’ misconceptions and then provide appropriate feedback. Therefore, how to develop a digital
game-based learning model with a mechanism that can diagnose misconceptions is an important research problem.
2.2. Two-tier test
A two-tier test is a type of assessment model that enables teachers to detect the reasons why learners solve problems the way they do
and determine whether the learners have any misconceptions or alternative conceptions. The development procedure of two-tier
testing was proposed by Treagust (1988) and comprises a development framework with three phases and ten steps. The first phase
is to define the content of the diagnostic tool using four steps: (1) identify propositional knowledge statements, (2) develop the concept
map, (3) relate propositional knowledge to the concept map, and (4) validate the content. The second phase involves collecting
information on misconceptions using three steps: (1) review relevant literature, (2) conduct unstructured interviews, and (3) develop
multiple-choice and open-ended questions. The third phase is to develop the diagnostic tool using three steps: (1) develop the two-tier
test, (2) design a two-way specification table, and (3) continue to refine the tool.
Treagust (1988) utilized two-tier testing to divide questions into two tiers. The first tier describes the context of a learning scenario
or a conceptual problem. Based on his or her knowledge, the learner determines whether the context or concept is correct. In other
words, the first tier assesses the learner’s propositional knowledge regarding the context or concept. In the second tier, the learner must
explain his or her understanding of the context or concept in the first tier. In other words, the second tier investigates the learner’s
explanatory knowledge and mental model. The choices and explanations of the two tiers should be designed based on the alternative
conceptions, misconceptions, or common errors of learners. Teachers will then be able to diagnose what concepts learners have based
on their responses.
In recent years, the two-tier testing has been widely used in various learning fields. For instance, Yang, Chen, and Hwang (2015)
performed path analysis and observed that students adopted the learning-by-reviewing strategy in their learning behavior patterns in
two-tier testing. After this, researchers began to investigate the influence of incorporating feedback mechanisms into two-tier testing
on learning. For instance, Yang, Fu, Hwang, and Yang (2017) developed an interactive mathematics learning system based on two-tier
testing to help university students learn calculus, and they demonstrated that instant diagnosis and feedback can enhance the
mathematics performance and confidence of students. Maier, Wolf, and Randler (2016) also used a computer-assisted formative
assessment to examine different types of feedback and diagnostic problems with multiple tiers. They established that models with
feedback mechanisms benefit learning. In short, adding instant and proper feedback and guidance mechanisms to two-tier tests can
help learners correct their misconceptions. In this case, two-tier testing is not just a diagnostic tool but an effective teaching instrument
(Tsai & Chou, 2002).
Although two-tier testing has proved to be an effective diagnostic tool, no research has yet explored its effect in a
digital game-based learning environment. Therefore, this remains an important research topic.
3. Development of digital game-based learning game with two-tier testing
To explore the learning effect of the digital game-based learning model with two-tier testing, we chose the unit on factors and
multiples in mathematics as the learning content of the game. The main reason is that mathematics is an important subject
closely linked to daily life, and students easily form misconceptions while learning it, which reduces learning
effectiveness (Lin, Yih, & Tsai, 2014). The concepts of factors and multiples are important basic concepts in mathematics.
However, because these concepts are abstract and prone to semantic misunderstanding, learners easily form
misconceptions through intuitive thinking, such as believing that a larger number has more factors or that a smaller number has
more multiples (Lin et al., 2014). As a result, learners cannot learn smoothly, their interest in learning declines, and their
mathematics anxiety increases, which eventually leads them to give up on mathematics (Verkijika & De Wet, 2015). In recent years,
many studies have helped students learn mathematics through digital games that incorporate learning strategies, and integrating
effective learning strategies or tools into digital games is key. Related research has pointed out that two-tier testing is a good
teaching tool that can diagnose learners’ misconceptions. With the help of two-tier testing, students’ knowledge can be evaluated
and their misconceptions diagnosed, so that digital games can provide students with appropriate feedback and learning materials
to support learning (Treagust, 1988).
In this study, we first conducted a literature review and expert interviews to develop a two-tier test for factors and multiples, and
then used RPG Maker to create a digital game-based learning system with two-tier testing. In this role-playing game, the learner can
solve game tasks while learning mathematical concepts. The game will diagnose the learner’s possible misconceptions through the
two-tier testing, and then provide appropriate feedback to help the learner correct the wrong concept. Fig. 1 displays the framework of
the developed system. The system contains a two-tier test database, a learning materials database on factors and multiples, a game
database, and a learning process database. The two-tier test database includes misconceptions regarding factors and multiples, a
two-way specification table, two-tier questions, and an instant-feedback mechanism. The learning materials database contains
mathematical concepts, questions, learning tasks, and learning skills. The content of the learning materials was designed by an
instructional designer based on key points in elementary textbooks. The game database includes game scenes and maps, game
scenarios, and reward mechanisms. The scenarios were designed as immersive experiences for older elementary school students. The
learning process database records the learning behaviors of the students during the game. Key learning behaviors are recorded
using specific codes designed by the teacher.
Fig. 1. Structure of developed game.
At the beginning, the game explains the background of the story, the rules, and the final goals that
must be achieved. Students play the role of a civilian in a virtual kingdom. They explore and go on adventures with the
goal of becoming a soldier of the king, and they perform tasks (that is, learn concepts) designed by the game developer. In the
game scenario, the students face a series of challenges. They must observe the game environment, talk to non-player characters
(NPCs) to extract important concepts, and choose tasks to complete. By completing tasks, they can earn more weapons or coins to
upgrade their levels. Students accumulate experience during the game so that they can learn more advanced concepts and complete
more difficult challenges. The game also contains other game elements; for example, students can obtain weapons or coins while
exploring the game environment. These elements are important for encouraging students to keep playing and to complete
tasks.
In addition, the system provides students with learning materials, allowing them to practice their mathematical knowledge before
accepting learning tasks. When students choose a task, the process will be divided into the concept learning step and the challenge task
step. In the former, students learn mathematics by investigating in the environment and talking with NPCs, and at the same time they
can obtain concept-review lectures (as shown in Fig. 2). Once they enter the challenge task step, the game will present a two-tier test
question for students to answer, and will diagnose the student’s concept based on the student’s answer and the explanation of the
answer. Then the game will dynamically provide appropriate feedback and learning materials for students. If the task is not completed,
the students can still choose to challenge this task again later. Fig. 3 presents the process of the two-tier test assessment.
The design of the two-tier test in this study follows the development process recommended by Treagust (1988): analyzing
propositional knowledge and concept maps, collecting misconceptions, and developing the diagnostic tool. The resulting two-tier
test on factors and multiples contains 12 questions, each designed around an important sub-concept of factors and multiples.
Each two-tier question is divided into two levels. The first level is a single-choice question with four options that checks
whether the concept is correct. The options comprise one correct option, two wrong options that reflect specific misconceptions,
and one wrong option not caused by a misconception. The second level is a single-choice question with four options that explains
the concept. The options comprise one correct concept and three misconceptions. An example of a two-tier
test question is shown in Fig. 4.
The main feature of two-tier testing is its ability to diagnose concepts. The system determines whether the student holds a
misconception based on his or her answer to the first-level question and explanation at the second level. The system divides the
answer status into three types: correct concept, misconception error, and non-misconception error. A correct concept means that
the student answered both levels correctly and therefore holds the correct concept. A misconception error means
that the student chose options reflecting the same misconception at both levels, indicating that the student holds that
particular misconception. A non-misconception error means that the student’s answers at the two levels are logically inconsistent,
indicating that the student answered by trial and error. The mechanism of the two-tier testing is shown in Fig. 5.
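The three answer statuses described above can be illustrated with a minimal Python sketch. The item encoding, the option labels, the function name `diagnose`, and the third misconception tag are hypothetical and are not taken from the developed system. The sketch also assumes that any wrong response pair that does not share one misconception tag counts as a non-misconception error.

```python
from typing import Optional, Tuple

# Hypothetical encoding of one two-tier item. Each wrong option is tagged with the
# misconception it reflects, or None when it is not caused by a misconception.
EXAMPLE_ITEM = {
    "tier1_correct": "A",
    "tier1_misconceptions": {
        "B": "larger number has more factors",
        "C": "smaller number has more multiples",
        "D": None,
    },
    "tier2_correct": "a",
    "tier2_misconceptions": {
        "b": "larger number has more factors",
        "c": "smaller number has more multiples",
        "d": "other misconception (placeholder)",
    },
}


def diagnose(item: dict, first_choice: str, second_choice: str) -> Tuple[str, Optional[str]]:
    """Classify a two-tier response and return (status, misconception tag or None)."""
    if first_choice == item["tier1_correct"] and second_choice == item["tier2_correct"]:
        return "correct concept", None
    tag1 = item["tier1_misconceptions"].get(first_choice)
    tag2 = item["tier2_misconceptions"].get(second_choice)
    # Same misconception chosen at both levels: the student holds that misconception.
    if tag1 is not None and tag1 == tag2:
        return "misconception error", tag1
    # Assumption: every other combination is treated as an inconsistent answer.
    return "non-misconception error", None
```

A call such as `diagnose(EXAMPLE_ITEM, "B", "b")` returns the misconception tag, which a game could then use to select targeted feedback.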
Fig. 2. Screenshot of concept-review lecture.
Fig. 3. Feedback mechanism of two-tier test.
Fig. 4. Example of two-tier test.
4. Experiment design
To determine the influence of the proposed digital game-based learning method with two-tier testing on learning effectiveness and
mathematics anxiety and analyze the learning behavior patterns of students in the system, we performed a teaching experiment with
fifth-grade elementary school students.
4.1. Participants
The participants of this study comprised 53 fifth-graders, 27 of whom were in the experimental group and 26 of whom were in the
control group. The experimental group used the digital game-based learning system with two-tier testing and the control group used a
general digital game-based learning system. The participants all received the same computer skill training courses, and all were
instructed by the same teacher who had more than 10 years of teaching experience.
Fig. 5. Framework of two-tier test mechanism.
4.2. Measuring tools
The measuring tools of this study included a learning-achievement test on factors and multiples concepts and a
mathematical-anxiety questionnaire. The learning-achievement test on factors and multiples concepts covered divisibility, factors,
common factors, multiples, the multiples of 2, 5, 10, and 3, and common multiples. The test items were designed by the researchers
and then examined by a mathematics professor and an experienced elementary school teacher to ensure expert validity. For the
mathematical-anxiety questionnaire, we adopted the mathematical-anxiety scale used by Verkijika and De Wet (2015), which came from
the Fennema-Sherman Math Attitude Scales (FSMAS). The mathematical-anxiety scale contained four positively worded items and four
negatively worded items. In data analysis, the reliability of the scale was 0.96. The negatively worded items are reverse scored,
so a higher score represents a lower level of mathematics anxiety.
4.3. Coding schema of game-based learning behavior
To examine the learning behavior of students during the game, we developed a coding schema divided into two categories of
behavior: learning and non-learning, as shown in Table 1.
4.4. Experimental process
To investigate the influence of the digital game-based learning model with two-tier testing on learning, we conducted a teaching
experiment using the concepts of factors and multiples in elementary mathematics. Fig. 6 exhibits the procedure of the experiment.
Before the experiment, the learners were first given a lesson on factors and multiples for one week, and then took a pretest to measure
their prior knowledge of the mathematical concepts and their anxiety. During the experiment, the students in the experimental group
learned using a digital game-based learning model with two-tier testing, whereas the control group learned using a general digital
game-based learning model. The digital game-based learning system recorded the behavioral patterns presented by the students while
they played the game. The digital game-based learning system of the experimental group additionally recorded the processes of the
students taking the two-tier tests. After the experiment, a posttest was administered to compare the influences of the two types of
digital game-based learning models on learning effectiveness and mathematics anxiety in the two groups.
Table 1
Coding schema of game-based learning behavior.
Code  Code Name  Behavior Definition                                                                                  Category of Behavior
E     Explore    Exploratory behavior in the game, including walking, investigating signboards, and talking to NPCs  Non-learning
L     Learn      Doing learning activities in the game system                                                         Learning
A     Answer     Answering questions (including two-tier test questions)
R     Review     Reading factors-and-multiples review lectures
C     Correct    Selecting the correct answer to a question
I     Incorrect  Selecting an incorrect answer to a question
Fig. 6. Diagram of experiment design.
5. Experimental results
5.1. Analysis of learning achievement
5.1.1. Experimental group vs. control group

To compare the learning achievements of the students in the experimental group and the control group, we employed one-way
ANCOVA to evaluate their test scores. The assumption of homogeneity of regression was not violated (F = 0.393, p > .05),
indicating a common regression coefficient for one-way ANCOVA. Table 2 presents the posttest scores of the two groups. The
mean and standard deviation of the scores of the experimental group were 63.11 and 17.98, respectively, and the mean and standard
deviation of the scores of the control group were 59.96 and 20.16. The ANCOVA revealed no significant
difference between the two groups (F = 0.987, p > .05, η² = 0.019), and the η² of 0.019 represents a small effect size. This
result implies that using the digital game-based learning model with two-tier testing to teach students is roughly as effective as using a
general digital game-based learning model.
To study the effect further, we investigated the influence of reading time and different types of feedback on student learning. We
defined “reading time” as the interval between the moment the system provides feedback after the student gives an incorrect answer
(Behavior R) and the moment the student gives another answer (Behavior A). For example, if Behavior R occurs at 25 min 10 s and
Behavior A takes place at 25 min 37 s, the reading time is 27 s.
The digital game-based learning systems recorded the time it took the students to read feedback. We used the mean reading time of
15 s as the demarcation point. The students who took more than 15 s to read feedback were assigned to the “long-time reading group”,
and those who took 15 s or less were assigned to the “short-time reading group”. In the experimental group, 15 students were in the
long-time reading group, and 12 students were in the short-time reading group. In the control group, 15 students were in the long-time
reading group, and 11 students were in the short-time reading group. The feedback content of the experimental group was based on the
misconceptions diagnosed using two-tier testing, which means the content was tailored for individual students. The feedback content
of the control group was general feedback based on circumstances that are common in assessments, which may not have been equally
helpful for all students.
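The reading-time definition and the 15 s split described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the event-log format (time-ordered pairs of a timestamp in seconds and a behavior code), the function names, and the use of each student's mean reading time for group assignment are hypothetical and are not confirmed by the system description.

```python
from statistics import mean
from typing import List, Optional, Tuple


def reading_times(events: List[Tuple[int, str]]) -> List[int]:
    """Return the seconds elapsed from each feedback event (R) to the next answer (A).

    `events` is a time-ordered list of (timestamp_in_seconds, behavior_code).
    """
    times = []
    feedback_at = None
    for timestamp, code in events:
        if code == "R":
            feedback_at = timestamp  # remember when the latest feedback was shown
        elif code == "A" and feedback_at is not None:
            times.append(timestamp - feedback_at)
            feedback_at = None  # wait for the next feedback event
    return times


def reading_group(times: List[int], cutoff: float = 15.0) -> Optional[str]:
    """Assign a student to a reading group from his or her mean reading time (assumption)."""
    if not times:
        return None  # the student never received feedback
    return "long-time" if mean(times) > cutoff else "short-time"


# Worked example from the text: R at 25 min 10 s, A at 25 min 37 s.
example = [(25 * 60 + 10, "R"), (25 * 60 + 37, "A")]
print(reading_times(example))  # [27]
```

A student whose reading times average exactly 15 s falls in the short-time group, matching the "15 s or less" rule in the text.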
5.1.2. Experimental vs. control groups for long-time readers

We used one-way ANCOVA to examine the differences between the learning achievements of the students in the long-time reading
groups of the experimental and control groups. According to the results in Table 3, the mean and standard deviation of the posttest
scores obtained by the students in the long-time reading group in the experimental group were 71.2 and 15.05, whereas the mean and
standard deviation of the posttest scores obtained by the students in the long-time reading group in the control group were 63.5 and
15.33. This result indicates that the learning effectiveness of the students who used digital game-based learning with two-tier testing
and spent more time reading the feedback was significantly better than that of the students who used general digital game-based
learning and spent more time reading the feedback (F = 5.585, p < .05, η² = 0.171); the η² of 0.171 represents a large
effect size. This also means that spending more time reading feedback for the misconceptions promotes learning effectiveness and
enables students to grasp concepts significantly better than does spending more time reading general feedback.
Table 2
ANCOVA results of post-test scores.
Group               N   Mean   S.D.   Adjusted mean  Std. error  F       η²
Experimental group  27  63.11  17.98  63.55          2.85        0.987   0.019
Control group       26  59.96  20.16  59.51          2.90
Table 3
ANCOVA results of post-test scores of long-time reading groups.
Group               N   Mean   S.D.   Adjusted mean  Std. error  F       η²
Experimental group  15  71.2   15.05  71.54          2.50        5.585*  0.171
Control group       15  63.5   15.33  63.16          2.50
*p < .05.
5.1.3. Long-time vs. short-time reading groups in experimental group

We used one-way ANCOVA to examine the differences between the learning achievements of the students in the long-time reading
group and the short-time reading group in the experimental group. According to the results in Table 4, the mean and standard
deviation of the posttest scores obtained by the students in the long-time reading group in the experimental group were 71.2 and 15.05,
whereas the mean and standard deviation of the posttest scores obtained by the students in the short-time reading group in the
experimental group were 53.0 and 16.60. The ANCOVA showed that the learning effectiveness of the students
who used digital game-based learning with two-tier testing and spent more time reading the feedback was significantly better than that
of the students who used digital game-based learning with two-tier testing but spent less time reading the feedback (F = 12.159,
p < .01, η² = 0.336); the η² of 0.336 represents a large effect size. This also means that spending more time reading
feedback for the misconceptions promotes learning effectiveness and enables students to grasp concepts significantly better than
spending less time reading feedback for the misconceptions.
5.1.4. Long-time vs. short-time reading groups in control group

Finally, one-way ANCOVA was adopted to examine the differences between the learning achievements of the students in the
long-time reading group and the short-time reading group in the control group. Table 5 shows the posttest scores for the two
reading groups in the control group. The mean and standard deviation of the posttest scores in the long-time reading group were 63.5 and
15.33, whereas the mean and standard deviation of the posttest scores in the short-time reading group were 55.14 and 25.34.
The ANCOVA revealed no significant difference between the two groups (F = 0.887, p > .05, η² = 0.037), and the
η² of 0.037 represents a small effect size. This suggests that the amount of time spent reading general feedback does
not impact learning effectiveness.
5.2. Analysis of mathematics anxiety
This study also examined the validity of the different digital game-based learning models from the perspective of mathematics
anxiety. One-way ANCOVA was adopted to compare the mathematics anxiety of the students in the experimental group and the
control group. According to the results in Table 6, the mean and standard deviation of the posttest questionnaire scores in the
experimental group were 4.14 and 0.59, whereas the mean and standard deviation of the posttest questionnaire scores in the control
group were 3.58 and 0.93. The ANCOVA showed that the students who used digital game-based learning with
two-tier testing felt a significantly lower level of mathematics anxiety than those who used general digital game-based learning
(F = 4.04, p < .05, η² = 0.075); the η² of 0.075 represents a medium effect size.
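As a consistency check on the effect sizes reported for the five ANCOVA comparisons, partial η² can be recovered from each F value. The Python sketch below is illustrative only. It assumes two groups and a single covariate (the pretest), so that the error degrees of freedom equal N − 3, and it assumes the commonly cited benchmarks of 0.01, 0.06, and 0.14 for small, medium, and large effects. The text states neither the degrees of freedom nor the benchmarks, and the function names are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def ancova_error_df(n_total: int, n_groups: int = 2, n_covariates: int = 1) -> int:
    """Error degrees of freedom for a one-way ANCOVA (assumed design)."""
    return n_total - n_groups - n_covariates


def effect_size_label(eta_sq: float) -> str:
    """Label an effect size using the assumed 0.01 / 0.06 / 0.14 benchmarks."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# (comparison, F, total N) as reported in the text and in Tables 2 to 6.
REPORTED = [
    ("experimental vs. control, achievement", 0.987, 53),
    ("long-time readers, experimental vs. control", 5.585, 30),
    ("experimental group, long vs. short", 12.159, 27),
    ("control group, long vs. short", 0.887, 26),
    ("experimental vs. control, anxiety", 4.04, 53),
]

for name, f_value, n_total in REPORTED:
    eta_sq = partial_eta_squared(f_value, 1, ancova_error_df(n_total))
    print(f"{name}: eta squared = {eta_sq:.3f} ({effect_size_label(eta_sq)})")
```

Under these assumptions the computed values round to the reported 0.019, 0.171, 0.336, 0.037, and 0.075, and the labels agree with the small, large, large, small, and medium effect sizes stated in the text.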
5.3. Analysis of learning behavior patterns
Lag sequence analysis was used to examine the influence of learning behavior in the digital game on the learning performance. The
learning process database of the game recorded the learning behaviors of the students, and lag sequence analysis was applied, the
results of which were converted into a behavioral transition diagram. Then, we examined and analyzed potential student behaviors. A
total of 14313 learning behavior codes were collected from the 53 learners engaged in gameplay. The 27 students in the experimental
group generated 8049 behavior codes, and the 26 students in the control group produced 6264 behavior codes.
Lag sequence analysis was applied to the collected learning behavior codes. The system first calculated the frequencies of each
behavior sequence and then used a matrix to derive the adjusted residuals table. Tables 7 and 8 present the adjusted residuals tables of
the two groups. The rows of the residuals tables are the initial behaviors of the students, and the columns are the behaviors that
occur next. The values are the Z scores of the behavior sequences, which represent the significance of the sequences. For instance,
the Z score of the behavior sequence of learning (L) and then exploring (E) in the experimental group was 9.36. A Z score greater
than 1.96 means that the behavior sequence occurs significantly more often than expected (p < .05). As shown in Table 7, the
following ten behavior sequences were found to be significant in the experimental group: E→E, E→L, L→E, L→L, A→C, A→I, R→L, R→A,
C→A, and I→R. As shown in Table 8, the following eleven behavior sequences were found to be significant in the control group: E→E,
E→L, L→L, L→R, A→C, A→I, R→L, R→A, C→A, I→A, and I→R.

Table 4
ANCOVA results of post-test scores of reading groups in experimental group.
Group               N     Mean     SD       Adjusted mean   Std. error   F          η²
Long-time group     15    71.2     15.05    69.86           2.89         12.159**   0.336
Short-time group    12    53.0     16.60    54.68           3.23
**p < .01.

Table 5
ANCOVA results of post-test scores of reading groups in control group.
Group               N     Mean     SD       Adjusted mean   Std. error   F          η²
Long-time group     15    63.5     15.33    62.54           4.19         0.887      0.037
Short-time group    11    55.14    25.34    56.45           4.90

Table 6
ANCOVA results of mathematical-anxiety questionnaire scores.
Group                 N     Mean    SD      Adjusted mean   Std. error   F        η²
Experimental group    27    4.14    0.59    4.04            0.13         4.04*    0.075
Control group         26    3.58    0.93    3.68            0.13
*p < .05.

Table 7
Results of residuals transition in experimental group.
     E          L          A          R          C          I
E    35.79*     3.05*      −13.16     −10.36     −12.92     −15.8
L    9.36*      24.89*     −13.33     −4.39      −14.15     −17.31
A    −21.31     −23.27     −16.49     −18.43     54.42*     66.58*
R    −18.92     6.28*      38.97*     −5.66      −11.17     −13.66
C    1.35       −5.62      23.89*     −8         −6.06      −7.42
I    −15.79     −17.24     −12.22     59.49*     −7.41      −9.07
*p < .05.

Table 8
Results of residuals transition in control group.
     E          L          A          R          C          I
E    29.39*     9.49*      −12.1      −17.95     −14.65     −12.31
L    −1.87      14.95*     −11.05     16.27*     −15.95     −13.41
A    −19.96     −21.62     −13.55     −12.84     58.21*     48.94*
R    −5.24      2.77*      16.56*     −0.77      −9.43      −7.93
C    −1.37      −4.23      26.28*     −9.44      −7.32      −6.16
I    −12.34     −13.37     9.63*      32.92*     −6.16      −5.18
*p < .05.
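The computation behind an adjusted residuals table can be illustrated with a minimal sketch. This is not the system used in the study; it assumes the commonly used adjusted-residual formula for a lag-1 transition frequency table, z = (observed - expected) / sqrt(expected × (1 - row share) × (1 - column share)), and the function name `adjusted_residuals` and the toy sequence are hypothetical.

```python
from collections import Counter
from math import sqrt


def adjusted_residuals(sequences, codes):
    """Return {(initial, next): z} for lag-1 transitions pooled over all sequences.

    Assumes every behavior in `sequences` appears in `codes`.
    """
    counts = Counter()
    for seq in sequences:
        # Pair each behavior with the one that follows it; pairs never span two students.
        counts.update(zip(seq, seq[1:]))

    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one transition is required")
    row = {a: sum(counts[(a, b)] for b in codes) for a in codes}  # times a came first
    col = {b: sum(counts[(a, b)] for a in codes) for b in codes}  # times b came next

    z = {}
    for a in codes:
        for b in codes:
            expected = row[a] * col[b] / total
            variance = expected * (1 - row[a] / total) * (1 - col[b] / total)
            z[(a, b)] = (counts[(a, b)] - expected) / sqrt(variance) if variance > 0 else 0.0
    return z


# Toy data (hypothetical, not from the study): one student alternating between two codes.
toy = [list("LELELELE")]
z = adjusted_residuals(toy, codes=["L", "E"])
significant = sorted(pair for pair, value in z.items() if value > 1.96)
print(significant)
```

In this toy sequence only the alternating transitions L→E and E→L exceed the 1.96 threshold, which mirrors how the starred cells in Tables 7 and 8 are identified. Pooling transitions per student before summing is a design choice of the sketch so that the last behavior of one learner is never paired with the first behavior of the next.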
We then derived the behavioral transition diagrams (as shown in Figs. 8 and 9) to determine significant differences between the
experimental and control groups. The arrows indicate the order of the behaviors, and the values next to the arrows are the Z scores of
the behavior sequences. The width of the arrows represents the relative significance of the behavior sequences. We compared the
behavioral transition diagrams of the two groups to find similarities and differences, which are summarized in Fig. 10; their causes
are examined in the Discussion section.
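The comparison of the two groups can be reproduced directly from the significant sequences listed for Tables 7 and 8. The sketch below is only an illustration using set operations; the variable names are hypothetical, and the arrows are written as `->` for convenience.

```python
# Significant behavior sequences as listed in the text for Tables 7 and 8.
experimental = set("E->E E->L L->E L->L A->C A->I R->L R->A C->A I->R".split())
control = set("E->E E->L L->L L->R A->C A->I R->L R->A C->A I->A I->R".split())

shared = experimental & control             # significant in both groups
only_experimental = experimental - control  # significant only in the experimental group
only_control = control - experimental       # significant only in the control group

print(len(shared), sorted(only_experimental), sorted(only_control))
```

The two groups share nine sequences; L→E is significant only in the experimental group, and L→R and I→A are significant only in the control group.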
6. Discussion
Mathematics is a subject that is closely linked to everyday life. However, misconceptions often arise in the learning of mathematics
(Lin et al., 2014), which reduces learning effectiveness. We therefore developed a digital game-based learning system with a two-tier
testing mechanism to identify the misconceptions of learners. The system provides appropriate feedback to help students learn correct
concepts and reduce their mathematics anxiety. We further used lag sequence analysis to investigate the behavioral patterns of
students and analyze their underlying causes.
The experimental results allow the research questions of this study to be answered. The discussion of the findings is organized
around three aspects: (1) the impact of the game-based learning model with two-tier testing on learning achievements, (2) the impact
of the game-based learning model with two-tier testing on students’ mathematics anxiety, and (3) the impact of the game-based
learning model with two-tier testing on students’ behavior differences.

Regarding the impact of the game-based learning model with two-tier testing on learning achievements, our results showed no
significant difference in learning effectiveness between the learners in the experimental group and those in the control group. This
indicates that using the digital game-based learning model with two-tier testing to teach students is roughly as effective as using a
general digital game-based learning model. The lag sequence analysis of learning behaviors revealed that the two groups shared nine
significant behavior sequences. There were more similarities than differences, which may explain the lack of a significant difference
in learning effectiveness between the two groups. Moreover, we grouped the participants by the length of time they spent reading
feedback. Students who made good use of the feedback mechanism were classified into the long-time reading group; students who spent
little time reading the feedback were classified into the short-time reading group. Based on the analysis results, we plotted the
reading groups in Fig. 7, where the vertical axis is the post-test score and the horizontal axis is reading time. The solid line
represents the experimental group, and the dashed line represents the control group.
Based on this figure, we have the following findings: (1) the learning effectiveness of the students in the long-time reading group in
the experimental group was significantly better than that of the students in the long-time reading group in the control group. This
indicates that students who made use of the feedback mechanisms in the two-tier tests in the experimental group could construct
correct concept schemas, effectively retained the concepts, reviewed less frequently after learning, applied the trial-and-error method
to answer questions less frequently, and had more energy and better prior knowledge to learn the next concept. This result shows that
two-tier testing can effectively detect misconceptions and that learners can achieve better learning effectiveness through learning
materials that provide feedback on their misconceptions, gaining the correct knowledge via instant feedback and review (Yang et al.,
2017). (2) The
learning effectiveness of the students in the long-time reading group in the experimental group was significantly better than that of the
students in the short-time reading group in the experimental group. This indicates that even when feedback on the misconceptions is
provided, its impact on learning effectiveness depends on whether learners take the time to read it. (3) The learning effectiveness
of the long-time and short-time reading groups in the control group did not differ significantly, which means that the time learners
spend reading general feedback content does not influence learning effectiveness. The behavioral patterns unique to the control
group revealed that some
students in the control group did not review concepts after they answered a question incorrectly. This implies that learning materials
that are not tailored to the needs of the learners have no educational value at all. On the whole, we discovered that the best learning
method in our experiment was to allow adequate time to learn concepts with the feedback mechanism of two-tier testing. This enables
learners to identify their misconceptions or alternative conceptions, reinforce their mathematical knowledge, and ultimately achieve
better learning effectiveness. This also verifies that digital game-based learning must be coordinated with a suitable teaching
instrument or strategy in order to effectively enhance learning effectiveness (Hsu & Wang, 2018; Sun et al., 2018; Vanbecelaere et
al., 2020).
With regard to mathematics anxiety, the level of mathematics anxiety was significantly lower in the experimental group than in the
control group. The lower level of mathematics anxiety in the experimental group can be attributed to the fact that two-tier testing can
increase the learning frequency of students and provide effective and suitable learning content, thereby gradually reducing the
pressure brought by problem-solving and reducing mathematics anxiety. In addition, the results of the lag sequence analysis of
behavioral patterns indicated that learners in the experimental group immersed themselves in the game scenario as they learned
concepts, which means that the learners could gain knowledge happily and easily. This explains why students in the experimental
group displayed lower levels of mathematics anxiety. In contrast, the assessment mechanisms in the learning system used by the
control group had only one-tier questions, and it was easier to guess the correct answers. Nevertheless, generalized feedback decreases
learning effectiveness and gradually increases mathematics anxiety. The behavioral pattern analysis results revealed that the students
in the control group often reviewed previous concepts as they learned. This indicates that the students had to repeatedly confirm the
connections between new and old knowledge and that they had doubts over whether they could learn mathematics effectively and
correctly, which means they experienced more mathematics anxiety. The students in the control group also tended to give another
response directly after answering a question incorrectly without reviewing the relevant concepts. It is therefore likely that the students
were answering questions by trial and error, which implies that the students believed that reading the feedback content was pointless
and just wanted the game to end quickly. This is a display of greater mathematics anxiety. This result also demonstrates that the
misconception diagnosis function of two-tier testing can help reduce mathematics anxiety.
Finally, we conducted an in-depth investigation on the learning behaviors of the students using the lag sequence analysis. The
analysis results showed that the students in both groups presented continuous learning behavior, continuous exploring of the game,
immediate learning after exploring the game, and the reviewing of old concepts before learning new concepts. These patterns indicate
that the learning content and the game content were designed well and are highly connected to each other, thereby enabling learners to
enjoy learning mathematical concepts in the game. During the assessment stage, there were students in both groups who reviewed
concepts after giving an incorrect answer and made another attempt to answer the question after relearning relevant concepts. This
shows that these students wanted to do well in mathematics: after answering a question incorrectly, they sought the correct answers
on their own, learned by reviewing, and seized the opportunity to answer the question again to confirm whether they had grasped the
concept completely. The analysis results also revealed that only students in the experimental group
immediately wanted to continue with the game after learning. This indicates that the concept diagnosis of two-tier testing and
appropriate feedback content can help students gradually establish correct knowledge and permanently retain it. Thus, students in the
experimental group had the correct prior knowledge as they learned new concepts, which made it easier for them to gain new
knowledge and immerse themselves in the game. This also explains why the students who spent more time reading the feedback in the
experimental group displayed better learning effectiveness and why the students in the experimental group presented lower levels of
mathematics anxiety. Only students in the control group made another attempt at answering the question after giving an incorrect
answer without reviewing the concepts. This implies that they were not receptive to the generalized feedback content; some students in
the control group may have believed that reading and reviewing the learning materials was pointless and thus gave up reviewing.
Consequently, they could not correct any misconceptions they may have had or learn the correct concepts by the time they took the
assessment, and this ultimately affected their learning of new concepts. From these results, we can infer that feedback content that is
not tailored to the needs of the learners has no educational value at all. These analysis results also explain why the learning
effectiveness of the students in the long-time reading group in the experimental group was significantly better than that of the
students in the control group and why mathematics anxiety was greater in the control group than in the experimental group.
Fig. 7. Interaction between different reading times and learning achievement.
Fig. 8. Behavioral transition diagram of experimental group.
Fig. 9. Behavioral transition diagram of control group.
Fig. 10. Behavioral comparison diagram of two groups.
7. Conclusions and implications
In summary, this preliminary study examined how to enhance the detection of, and feedback on, misconceptions in digital game-based
learning. An educational game with two-tier testing was designed to detect the misconceptions of students in the game and
provide different kinds of feedback for the misconceptions. In addition, we used lag sequence analysis to analyze the behavioral
patterns of students during gameplay to understand the influence of different types of feedback content and reading time on learning
effectiveness. The results indicate that digital game-based learning with two-tier testing and general digital game-based learning do
not differ significantly in learning effectiveness. However, we did find out from the results of the lag sequence analysis that
students who are willing to spend time reading feedback for the misconceptions are likely to achieve better learning effectiveness.
Furthermore, incorporating two-tier testing into digital games can effectively reduce mathematics anxiety and help learners learn
well. These significant findings have not been explicitly presented in related research, and thus we believe this study brings
valuable insights regarding the issue of detection and feedback in digital game-based learning.
These results have some implications for teachers and game designers. First, we encourage teachers to develop a set of two-tier
testing problems based on possible misunderstandings in the learning content, which can be used to diagnose possible misconceptions
of students. In addition, corresponding feedback needs to be designed for different misconceptions to enhance the learning
effect. Second, we recommend that instructional designers and game developers discuss the design of games with teachers in
order to develop games suitable for learning in the classroom. In addition, game design must integrate the two-tier testing questions
and feedback designed by the teachers, so that misconceptions can be diagnosed during the game and appropriate feedback can be
provided to the students.
Our suggestions for future research are as follows. We developed two-tier test questions for the factors and multiples unit in the
fifth-grade mathematics curriculum using the two-tier test development process proposed by Treagust (1988) and used RPG Maker to
design a role-playing game. Due to the functional limitations of the software, we could not give the students open-ended questions. We
suggest that future studies incorporate two-tier testing into different types of games and use open-ended questions for the second tier
so that students can explain their thinking models and concepts. This will provide teachers with more information with which to
analyze the misconceptions of learners and identify any misconceptions that were not discovered previously. Moreover, the sample
size in this study was small, and the results cannot be generalized to all students’ perceptions in all circumstances. It would be better to
conduct long-term and large-scale experiments with the two-tier testing in digital games.
Credit author statement
Kai-Hsiang Yang: Conceptualization, Methodology, Validation, Formal analysis, Resources, Writing - Original Draft, Writing -
Review & Editing, Visualization, Supervision. Bou-Chuan Lu: Software, Investigation.
References
Chen, X., Zou, D., Cheng, G., & Xie, H. (2020). Detecting latent topics and trends in educational technologies over four decades using structural topic modeling: A
retrospective of all volumes of Computers & Education. Computers & Education, 151, 103855.
Hsu, C. C., & Wang, T. I. (2018). Applying game mechanics and student-generated questions to an online puzzle-based game learning system to promote algorithmic
thinking skills. Computers & Education, 121, 73–88.
Lin, Y. H., Yih, J. M., & Tsai, T. N. (2014). Clustering approach to investigate intuitive rule usage with factor and multiple on mathematics problems. International
Journal of Intelligent Technologies and Applied Statistics, 7(3), 267–282.
Maier, U., Wolf, N., & Randler, C. (2016). Effects of a computer-assisted formative assessment intervention based on multiple-tier diagnostic items and different
feedback types. Computers & Education, 95, 85–98.
Sun, C. T., Chen, L. X., & Chu, H. M. (2018). Associations among scaffold presentation, reward mechanisms and problem-solving behaviors in game play. Computers &
Education, 119, 95–111.
Sun, J. C. Y., Kuo, C. Y., Hou, H. T., & Lin, Y. Y. (2017). Exploring learners’ sequential behavioral patterns, flow experience, and learning performance in an anti-
phishing educational game. Educational Technology & Society, 20(1), 45–60.
Tapingkae, P., Panjaburee, P., Hwang, G. J., & Srisawasdi, N. (2020). Effects of a formative assessment-based contextual gaming approach on students’ digital
citizenship behaviours, learning motivations, and perceptions. Computers & Education, Article 103998.
Taub, M., Sawyer, R., Smith, A., Rowe, J., Azevedo, R., & Lester, J. (2020). The agency effect: The impact of student agency on learning, emotions, and problem-
solving behaviors in a game-based learning environment. Computers & Education, 147, 103781.
Treagust, D. F. (1988). Development and use of diagnostic tests to evaluate students’ misconceptions in science. International Journal of Science Education, 10(2),
159–169.
Tsai, C. C., & Chou, C. (2002). Diagnosing students’ alternative conceptions in science. Journal of Computer Assisted Learning, 18(2), 157–165.
Vanbecelaere, S., Van den Berghe, K., Cornillie, F., Sasanguie, D., Reynvoet, B., & Depaepe, F. (2020). The effects of two digital educational games on cognitive and
non-cognitive math and reading outcomes. Computers & Education, 143, 103680.
Verkijika, S. F., & De Wet, L. (2015). Using a brain-computer interface (BCI) in reducing math anxiety: Evidence from South Africa. Computers & Education, 81,
113–122.
Yang, K. H. (2017). Learning behavior and achievement analysis of a digital game-based learning approach integrating mastery learning theory and different feedback
models. Interactive Learning Environments, 25(2), 235–248.
Yang, Q. F., Chang, S. C., Hwang, G. J., & Zou, D. (2020). Balancing cognitive complexity and gaming level: Effects of a cognitive complexity-based competition game
on EFL students’ English vocabulary learning performance, anxiety and behaviors. Computers & Education, 148, 103808.
Yang, T. C., Chen, S. Y., & Hwang, G. J. (2015). The influences of a two-tier test strategy on student learning: A lag sequential analysis approach. Computers &
Education, 82, 366–377.
Yang, K. H., Chu, H. C., & Chiang, L. Y. (2018). Effects of a progressive prompting-based educational game on second graders’ mathematics learning performance and
behavioral patterns. Journal of Educational Technology & Society, 21(2), 322–334.
Yang, T. C., Fu, H. T., Hwang, G. J., & Yang, S. J. (2017). Development of an interactive mathematics learning system based on a two-tier test diagnostic and guiding
strategy. Australasian Journal of Educational Technology, 33(1), 62–80.