Deep Learning in Gaming and Animations
Explainable AI (XAI) for Engineering Applications
Series Editors: Aditya Khamparia and Deepak Gupta
Edited by
Vikas Chaudhary, Moolchand Sharma, Prerna Sharma,
and Deevyankar Agarwal
First edition published 2022
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2022 selection and editorial matter, Vikas Chaudhary, Moolchand Sharma, Prerna Sharma, and
Deevyankar Agarwal; individual chapters, the contributors
Reasonable efforts have been made to publish reliable data and information, but the author and
publisher cannot assume responsibility for the validity of all materials or the consequences of their
use. The authors and publishers have attempted to trace the copyright holders of all material
reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged, please write and
let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com
or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark Notice: Product or corporate names may be trademarks or registered trademarks and are
used only for identification and explanation without intent to infringe.
DOI: 10.1201/9781003231530
Dr. Vikas Chaudhary would like to dedicate this book to his father
Sh. Rajendra Singh and his mother Smt. Santosh for their constant
support and motivation; and his family members, including his
wife Ms. Amita Panwar, his daughter Astha Chaudhary, and
his sons Shivansh Chaudhary and Anmol Chaudhary. I would
also like to give my special thanks to the publisher and my
other co-editors for having faith in my abilities. Before all and
after all, the main thanks should be to the Almighty God.
Mr. Moolchand Sharma would like to dedicate this book to his
father Sh. Naresh Kumar Sharma and his mother Smt. Rambati
Sharma, for their constant support and motivation, and his family
members, including his wife Ms. Pratibha Sharma and his son
Dhairya Sharma. I would also like to give my special thanks to the
publisher and my other co-editors for having faith in my abilities.
Ms. Prerna Sharma would like to dedicate this book to
her father Mr. Vipin Sharma, mother Ms. Suman Sharma,
husband Mr. Parminder Mann and her family for their
support and constant motivation. Specially dedicated to
her beloved sons Pratyaksh, Kairav & Kevin Mann
Mr. Deevyankar Agarwal would like to dedicate this book
to his father Sh. Anil Kumar Agarwal and his mother Smt.
Sunita Agarwal, and his wife Ms. Aparna Agarwal, and his
son Jai Agarwal for their constant support and motivation.
I would also like to give my special thanks to the publisher
and my other co-editors for having faith in my abilities.
Contents
List of Figures and Tables
Preface
Editors
Contributors
Chapter 1 Checkers-AI
Priyanshi Gupta, Vividha and Preeti Nagrath
Index
List of Figures and Tables
FIGURES
Figure 1.1 An example of payoff matrix for a zero-sum two-player game
Figure 1.2 Code implementation of the heuristic function
Figure 1.3 Binary Search tree example
Figure 1.4 A Minimax tree
Figure 1.5 Algorithmic representation of a Minimax tree example
Figure 1.6 Alpha-Beta pruning illustration
Figure 1.7 Alpha-Beta pruning illustration
Figure 1.8 Alpha-Beta pruning illustration
Figure 1.9 Flowchart depicting the implementation algorithm
Figure 1.10 Before single capture of a piece
Figure 1.11 After single capture of a piece
Figure 1.12 Before multiple capture of pieces
Figure 1.13 After multiple capture of pieces
Figure 1.14 Upgradation of a piece into king on reaching the end side of the opponent
Figure 1.15 Displaying the result of the game
Figure 1.16 Use case flowchart
Figure 1.17 System environment diagram
Figure 2.1 Setting up keyframes for timing the bouncing balls
Figure 2.2 Classification of autonomous agents
Figure 2.3 A topology of agent
Figure 2.4 Five emerging technologies that will change the world in next five years
Figure 2.5 Attractive opportunities in the 3D animation market
Figure 3.1 The MDA framework
Figure 3.2 Proposed functionality diagram
TABLES
Table 2.1 Agent Properties Based on the Coordination, Planning, and Cooperative Ability
Table 2.2 Summary of AR Frameworks
Table 2.3 A Summary of Game Engines
Table 3.1 Levels of Fear (See Ntokos, 2018)
Table 4.1 Overview of ML Algorithm used in IoT Networks
Table 4.2 Overview of IoT Applications, Protocols and Algorithm
Table 8.1 Main Contributions of the Above Explored Algorithms
Table 8.2 Popular Datasets in Video Game Content Generation
Preface
We are delighted to launch our book, “Deep Learning in Gaming and Animations:
Principles and Applications.” Artificial intelligence has been a growing resource
for video games for years now. Most video games, whether they are racing games,
shooting games, or strategy games, have various elements controlled by AI, such
as the enemy bots or neutral characters. Even the ambiguous characters that do not
seem to be doing much are programmed to add more depth to the game and give you
clues about your next steps. Today’s world is significantly influenced by innovative
technologies such as artificial intelligence, deep learning, machine learning, and IoT.
This book aims to present the various approaches, techniques, and applications that
are available in the field of gaming and animations. It is a valuable source of
knowledge for researchers, engineers, practitioners, and graduate and doctoral
students working in the same field. It will also be helpful for faculty members of
graduate schools and universities. Around 25 full-length chapters were received.
Amongst these manuscripts, eight chapters have been included in this volume. All the
chapters submitted were peer-reviewed by at least two independent reviewers, who
were provided with a detailed review proforma. The comments from the reviewers were
communicated to the authors, who incorporated the suggestions in their revised
manuscripts. The recommendations from two reviewers were taken into consideration
while selecting chapters for inclusion in this volume. The exhaustiveness of the
review process is evident, given the large number of articles received addressing a
wide range of research areas. The stringent review process ensured that each
published chapter met rigorous academic and scientific standards.
We would also like to thank the authors of the published chapters for adhering to
the schedule and incorporating the review comments. We wish to extend our heartfelt
acknowledgment to the authors, peer-reviewers, committee members, and production
staff whose diligent work shaped this volume. We especially want to thank our
dedicated team of peer-reviewers who volunteered for the arduous and tedious step of
quality checking and critique on the submitted chapters.
Vikas Chaudhary,
Moolchand Sharma,
Prerna Sharma,
Deevyankar Agarwal,
Editors,
October 5, 2021
Editors
Vikas Chaudhary is a professor in the Computer Science & Engineering department
at JIMS Engineering Management Technical Campus, Greater Noida. He has 18 years of
teaching and research experience. He obtained a Doctorate from the National Institute
of Technology, Kurukshetra, India, in Machine Learning/Unsupervised Learning. He has
published various research papers in international journals of Springer, Elsevier,
and Taylor & Francis, as well as various papers in IEEE international conferences and
national conferences. He is a reviewer for a Springer journal as well as for many
IEEE conferences. He has written a book on Cryptography & Network Security. His
research areas are machine learning and artificial neural networks.
chapters with international publishers (Wiley and Elsevier). She has worked
extensively on Computational Intelligence. Her areas of interest include artificial
intelligence, machine learning, nature-inspired computing, soft computing, and cloud
computing. She is associated with various professional bodies such as IAENG, ICSES,
UACEE, and the Internet Society. She has a rich academic background and 8 years of
teaching experience. She is a doctoral researcher at Delhi Technological University
(DTU), Delhi. She completed her postgraduate degree in 2011 at USIT, GGSIPU, and her
graduate degree in 2009 at GPMCE, GGSIPU.
CONTENTS
1.1 Introduction
1.2 Related Work
1.3 Methodology
1.3.1 Game Theory
1.3.2 Zero-Sum Game
1.3.3 Heuristic Function
1.3.4 Search Tree
1.3.5 Minimax Approach
1.3.6 Alpha-Beta Pruning
1.3.7 Minimax vs Alpha-Beta Pruning
1.4 Implementation
1.4.1 Game Algorithm
1.4.2 Graphical User Interface
1.5 Utility and Application
1.5.1 System Environment
1.6 Conclusion and Future Scope
1.6.1 Conclusion
1.6.2 Future Scope
References
1.1 INTRODUCTION
DOI: 10.1201/9781003231530-1
For years, the application of artificial intelligence (AI) in video games has been
growing steadily. Most video games, whether racing games, fighting games, or others,
have multiple elements that are controlled by AI [1]. AI is a field of computer
science that can be used to construct a variety of intelligent games, whether board
games, video games, or educational games, that respond to a human player in ways an
ordinary computer program could not [2].
Checkers, the game considered in this chapter, is one of the most widely played board
games. Winning requires thought and tactics, and a player who does not use sound
methods can easily lose. Checkers is a family of two-player strategy board games that
involve diagonal movements of game pieces [3]. American checkers is the most common
form. It is played on an 8 × 8 board of light and dark squares in the familiar
“checkerboard” pattern, with 12 pieces per side. As with all types of checkers, it is
played by two players who sit on opposite sides of the board and take turns. The
pieces are typically black, red, or white. Opponent pieces are captured by jumping
over them. Because checkers has fewer types of pieces than chess, there are far fewer
kinds of interaction between the pieces.
In AI, a heuristic solution is based on knowledge or on the study of how people
reason. A heuristic approach solves a problem faster than traditional methods. Its
goal is to find an adequate solution within a reasonable amount of time. A heuristic
is thus a way of guiding the search toward the goal.
This chapter is divided into several sections. The first section gives an
introduction to the game of checkers and its possible computations, along with the
rules. The second section is the literature survey, which covers the existing
computer programs for playing checkers and their history. The third section is the
methodology section, which gives details about the methods and fields used in the
project. The fourth section is the implementation section, in which the application
of the algorithm is briefly explained along with the implementation of the graphical
user interface (GUI) of the game and its snapshots. The fifth section is the utility
and application section, which describes the basic uses, features, and system
environment of the game developed. The last section is the conclusion, in which the
results and findings are presented along with the future scope of the project.
software named Chinook. It was the first computer program to win the world champion
title in checkers in competition against humans. The Chinook program contains an
opening book, which is a library of opening moves from games played by grandmasters;
a deep search algorithm with an efficient move evaluation function; and an end-game
database for all positions with eight or fewer pieces. Because many earlier advances
in game playing used the search tree and heuristic function technique, we chose the
Alpha-Beta pruning algorithm as the major focus of this chapter. Various search
algorithms that aim to minimize the tree search were developed before Alpha-Beta
pruning. The algorithm has been used in games such as chess, checkers, tic-tac-toe,
Isola, and many more.
1.3 METHODOLOGY
The different techniques studied and discussed to understand and implement the
checkers game are as follows.
net gains and losses of the involved parties can be less than or greater than zero. A
zero-sum game is sometimes called a purely competitive game, whereas non-zero-sum
games can be competitive or non-competitive. Zero-sum games are most commonly solved
using the Minimax theorem, which is closely related to linear programming duality
[7], or using Nash equilibrium. Many people have a cognitive bias, known as zero-sum
bias, toward seeing situations as zero-sum. A payoff matrix is a convenient
representation of a game. Consider the two-player zero-sum game illustrated in
Figure 1.1. The order of play is as follows: the first player (Red) secretly chooses
one of two actions, 1 or 2; the second player (Blue), unaware of the first player’s
decision, secretly chooses one of three actions, A, B, or C. The choices are then
revealed, and each player’s point total changes according to the payoff for those
choices. For example, suppose Red chooses action 2 and Blue chooses action B. When
the payoff is allocated, Red wins 20 points and Blue loses 20 points. Rather than
committing to a single action, the two players can assign probabilities to their
respective actions and use a random process that selects an action for them
according to these probabilities.
Each player chooses the probabilities so as to minimize the maximum expected point
loss, regardless of the opponent’s strategy. This leads to a linear programming
problem whose solution gives each player’s optimal strategy. This Minimax approach
can determine optimal strategies for all two-player zero-sum games. In this example,
Red should choose action 1 with probability 4/7 and action 2 with probability 3/7.
For Blue, the probabilities are 0, 4/7, and 3/7 for actions A, B, and C,
respectively. Red then wins 20/7 points per game on average.
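The mixed strategies quoted above can be checked with exact arithmetic. The following is a minimal sketch rather than the chapter's own code: Figure 1.1 is not reproduced in this text, so the payoff matrix `PAYOFF` is an assumed one, chosen to be consistent with the probabilities and the 20/7 value stated above. The variable names are likewise illustrative.

```python
from fractions import Fraction

# ASSUMED payoff matrix (points won by Red); Figure 1.1 is not reproduced here.
# Rows are Red's actions 1 and 2; columns are Blue's actions A, B, and C.
# Blue's payoff is the negative of each entry, since the game is zero-sum.
PAYOFF = [
    [30, -10, 20],
    [-10, 20, -20],
]

red_mix = [Fraction(4, 7), Fraction(3, 7)]                 # P(action 1), P(action 2)
blue_mix = [Fraction(0), Fraction(4, 7), Fraction(3, 7)]   # P(A), P(B), P(C)

# Worst case for Red: Blue answers Red's mix with its best pure action.
red_guarantee = min(
    sum(red_mix[i] * PAYOFF[i][j] for i in range(2)) for j in range(3)
)

# Worst case for Blue: Red answers Blue's mix with its best pure action.
blue_cap = max(
    sum(blue_mix[j] * PAYOFF[i][j] for j in range(3)) for i in range(2)
)

print(red_guarantee, blue_cap)  # 20/7 20/7
```

Because Red can guarantee at least 20/7 and Blue can hold Red to at most 20/7, the two bounds meet, which is what the Minimax theorem predicts for this pair of strategies.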
1.3.3 Heuristic Function
In computer science, AI, and mathematical optimization, a heuristic is a technique
intended to solve a problem faster when classical search is too slow. This is done by
trading optimality, completeness, accuracy, or precision for speed. In a way, it can
be regarded as a shortcut. A heuristic function is a function that, at each branching
step of a search algorithm, ranks the alternatives based on the available information
to decide which branch to pursue; in this way, the exact solution may be approximated
[8]. Heuristic functions have a different meaning in single-agent search, where they
return an estimate of the distance of a given state from a goal. The heuristic
function in this project is used to make the search in the algorithm easier. Like a
full-space search algorithm, the heuristic search initially tries every possibility
at each step. However, the search along a branch can be stopped at any moment if the
current possibility is already worse than the best alternative found so far. For the
heuristic function proposed in this chapter, we hypothesize that an ideal weight for
the king piece exists. Assigning the king a weight much greater than that of a
regular piece, however, would make the player give the king too much priority,
leading to situations where the player harms their chances of winning by losing
regular pieces for the sake of a king. We therefore balance the value of a pawn
against the value of a king in a function called PieceSumDifference.
Pawn’s Value = 1
King’s Value = 3
As shown in Figure 1.2, the function counts the kings and pawns each player has in
the current position and uses these counts to assign a value to the position from the
current player’s point of view.
The maximizing player is the red player and the minimizing player is the yellow
player. The heuristic value returned is therefore the difference between the piece
values of the maximizing player and those of the minimizing player.
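Since Figure 1.2 is not reproduced in this text, the heuristic described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the board encoding (`"r"`/`"R"` for red pawns and kings, `"y"`/`"Y"` for yellow pawns and kings, `"."` for empty squares) and the function name `piece_sum_difference` are assumptions. Only the weights 1 and 3 come from the chapter.

```python
# Weights from the chapter: a pawn is worth 1 and a king is worth 3.
PAWN_VALUE = 1
KING_VALUE = 3

# ASSUMED board encoding (hypothetical): red is the maximizing player,
# yellow is the minimizing player, so yellow pieces count negatively.
WEIGHTS = {
    "r": PAWN_VALUE,
    "R": KING_VALUE,
    "y": -PAWN_VALUE,
    "Y": -KING_VALUE,
}

def piece_sum_difference(board):
    """Return red's weighted piece total minus yellow's weighted piece total."""
    return sum(WEIGHTS.get(square, 0) for row in board for square in row)

# A tiny 3 x 4 fragment of a board, used only to exercise the function.
sample_board = [
    ["r", ".", "R", "."],
    [".", "y", ".", "."],
    ["y", ".", "Y", "r"],
]

print(piece_sum_difference(sample_board))  # (1 + 3 + 1) - (1 + 1 + 3) = 0
```

A positive value indicates a material advantage for red, a negative value an advantage for yellow, and zero a materially balanced position.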
1.3.5 Minimax Approach
In AI and game theory, Minimax is a decision rule used to minimize the possible loss
in the worst-case (maximum loss) scenario. Two assumptions are made about the game
before applying this strategy. The first is that the human player plays optimally and
tries to win. The second is that the game is purely strategic. The Minimax algorithm
is a modified version of the backward-induction algorithm. When stated in terms of
gains, it maximizes the minimum gain and is then called maximin. It covers both games
in which the players move simultaneously and games in which they take alternate
moves, with each player minimizing the maximum possible gain of the other player.
Because the game is zero-sum, a player who does so also maximizes their own minimum
gain. Minimax and maximin are not identical in general, however; maximin can also be
used in non-zero-sum games.
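The decision rule can be illustrated on a small explicit game tree. The sketch below is a generic Minimax implementation, not the chapter's code; representing the tree as nested Python lists with numeric leaves is an assumption made for brevity.

```python
def minimax(node, maximizing):
    """Return the Minimax value of a tree given as nested lists.

    A leaf is a number (the static value of a final position); an inner node
    is a list of child nodes. The players alternate at each level.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The maximizing player moves first and has two moves; the minimizing player
# then replies with one of two moves, leading to the leaf values below.
tree = [[3, 5], [2, 9]]

print(minimax(tree, True))  # max(min(3, 5), min(2, 9)) = 3
```

The maximizing player prefers the first move: it guarantees a value of 3, whereas the second move lets the opponent force a value of 2.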
1.3.6 Alpha-Beta Pruning
The Minimax approach amounts to an exhaustive search of the solution space. In a
realistic game such as checkers, the search space is large, so it is more efficient
to reduce the search space heuristically. The Alpha-Beta pruning algorithm provides a
mechanism to decrease the number of nodes evaluated, thus reducing the search space
in the Minimax tree [12]. Apart from checkers, it is also used in other two-player
games. The search of a branch is stopped as soon as at least one value has been found
that proves the branch to be worse than a previously examined alternative. Such moves
need not be evaluated further, which avoids unnecessary computation. Alpha-Beta
pruning discards these branches of the search tree without affecting the final
decision: it returns the same result as Minimax in spite of pruning [13]. Apart from
reducing the search time, pruning makes it possible to search deeper, allowing a
greater depth of the subtree to be scanned in the same time. Two values are used in
the algorithm, Alpha and Beta. Alpha stores the minimum score that the maximizing
player is assured of. Beta stores the maximum score that the minimizing player is
assured of. Initially, Alpha is −infinity and Beta is +infinity. The gap between them
gets smaller as the search continues. When Beta becomes less than or equal to Alpha,
the search below the current node can be cut off.
In Figure 1.6, Alpha-Beta pruning cuts the gray subtrees while moving from left to
right. They need not be explored, since the group of subtrees as a whole yields the
value of an equivalent subtree or worse, and so cannot affect the final result. The
implementation of this algorithm is demonstrated in Figures 1.7 and 1.8.
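The pruning rule can be sketched in a few lines. This is a generic Alpha-Beta implementation under the same assumed nested-list tree representation used for illustration, not the code shown in Figures 1.7 and 1.8; the `visited` list is a hypothetical addition that records which leaves were actually evaluated.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record each leaf that is really evaluated
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)  # best score the maximizer is assured of
            if beta <= alpha:
                break                 # the minimizer will never allow this branch
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)        # best score the minimizer is assured of
        if beta <= alpha:
            break                     # the maximizer already has a better option
    return best

tree = [[3, 5], [2, 9]]
visited = []
value = alphabeta(tree, True, visited=visited)

print(value, visited)  # 3 [3, 5, 2]
```

After the first move has been valued at 3, the reply worth 2 is enough to show that the second move is worse, so the leaf 9 is never evaluated. The returned value is the same as plain Minimax would give.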
and allowing a deeper search to be done in the same time [15]. Unlike its
predecessor, it belongs to the branch and bound class of algorithms.
During Alpha-Beta pruning, the subtrees are usually dominated temporarily by either a
first-player advantage (when many first-player moves are good, but all second-player
responses must be examined in order to find a refutation) or vice versa. If the move
ordering is poor, this advantage can switch sides repeatedly during the search,
leading to inefficiency each time. Because the number of positions searched decreases
exponentially with each move nearer the current position, it is worth spending
considerable effort on sorting early moves.
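The effect of move ordering can be made concrete by counting evaluated leaves for two orderings of the same moves. This is a hedged sketch on an assumed toy tree, not a measurement from the chapter's program. Sorting by the true minimum of each subtree is a stand-in for the cheap heuristic estimate a real engine would use to order moves.

```python
import math

def alphabeta(node, maximizing, alpha, beta, counter):
    """Alpha-Beta search over nested lists; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, not maximizing, alpha, beta, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:
            break  # remaining children cannot change the result
    return best

def search(tree):
    """Return (value, number of leaves evaluated) for the maximizing player."""
    counter = [0]
    value = alphabeta(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]

# Three moves for the maximizer, each with three replies for the minimizer.
poorly_ordered = [[1, 20, 30], [2, 40, 50], [8, 9, 10]]
# Stand-in for heuristic ordering: try the most promising move first.
well_ordered = sorted(poorly_ordered, key=min, reverse=True)

print(search(poorly_ordered))  # (8, 9): every leaf is evaluated
print(search(well_ordered))    # (8, 5): four leaves are pruned
```

Both orderings return the same value, but examining the strongest move first raises Alpha early, so a single reply is enough to refute each weaker move.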
1.4 IMPLEMENTATION
The algorithmic implementation and the GUI are described in the following sections.
its turn, a search algorithm is called that allows the program to look ahead at
possible future positions before deciding which move to make in the current position.
Suppose it is white’s turn to move and that, at every turn, there are only two
possible moves to choose from. We can visualize these moves as two separate branches,
at the end of which are two new positions in which it is black’s turn to move. We
continue expanding moves until we either reach the end of the game or decide to stop
because going deeper would take too much time. Either way, at the end of the tree we
have to perform a static evaluation of the final positions. A static evaluation
estimates how good a position is for one side without making any more moves. Large
values favor white and small values favor black. For this reason, white always tries
to maximize the evaluation and is known as the maximizing player, while black always
tries to minimize the evaluation and is known as the minimizing player. We start by
evaluating the positions at the bottom left of the tree. In the position above them,
it was white’s turn to move, and since white always chooses the move that leads to
the maximum evaluation, we assign that value to the node; we then complete the
evaluation of its sibling node using the right branch in the same way. One level up,
black tries to minimize the evaluation, so we compare the two values and assign the
lower one to the position. We continue up the tree in this way, alternately taking
maximum and minimum values, until we obtain the most favorable move for white at the
root. This is where pruning comes into the picture. It would take a lot of time to go
down all the branches to find the best value. Pruning uses the minimum or maximum
value of the positions already evaluated instead of exploring every node in a branch.
If a player already knows that a better option is available, there is no need to go
down the other branch. These checks are made through the Alpha and Beta parameters.
In such a case, we do not have to waste any computation evaluating the remaining
final positions, so we prune them from the tree.
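The look-ahead procedure described above, with its depth limit, static evaluation, and Alpha-Beta checks, can be sketched on a toy game. The game used here is an assumption chosen for brevity and is not checkers: players alternately remove one or two stones, and whoever takes the last stone wins. This gives exactly two moves per turn, as in the walkthrough. All function names are illustrative.

```python
import math

def legal_moves(stones):
    """Toy game (not checkers): a player may remove 1 or 2 stones."""
    return [take for take in (1, 2) if take <= stones]

def static_evaluation(stones, white_to_move):
    """Large values favor white, small values favor black.

    With no stones left, the side that just moved took the last stone and won.
    Any other position reached at the depth limit is scored as undecided (0).
    """
    if stones == 0:
        return -1 if white_to_move else 1
    return 0

def search(stones, depth, white_to_move, alpha=-math.inf, beta=math.inf):
    """Depth-limited Minimax with Alpha-Beta checks; white maximizes."""
    if depth == 0 or stones == 0:
        return static_evaluation(stones, white_to_move)
    if white_to_move:
        best = -math.inf
        for take in legal_moves(stones):
            best = max(best, search(stones - take, depth - 1, False, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # black already has a better option elsewhere
        return best
    best = math.inf
    for take in legal_moves(stones):
        best = min(best, search(stones - take, depth - 1, True, alpha, beta))
        beta = min(beta, best)
        if beta <= alpha:
            break  # white already has a better option elsewhere
    return best

def best_move(stones, depth):
    """White's move with the highest look-ahead value."""
    return max(legal_moves(stones),
               key=lambda take: search(stones - take, depth - 1, False))

print(best_move(4, 10))        # 1: leaving three stones wins for white
print(search(4, 1, True))      # 0: one move of look-ahead is too shallow to tell
```

The second call shows the trade-off mentioned in the text: stopping early saves time, but the static evaluation at the cutoff may not yet reveal who is winning.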
FIGURE 1.14 Upgradation of a piece into king on reaching the end side of the opponent.
The game is a one-player checkers game in which the user plays against the AI and
tests their skills in strategic and logical gaming. Some benefits of playing checkers
are stated below:
the content (a character can change its place). Figure 1.17 depicts the system environ-
ment diagram.
• Build a data management system to store user details such as name and number of
wins and losses.
• Add new features such as two-player games and different types of checkers.
• Use the same algorithms to make more two-player games such as tic-tac-toe and
chess.
REFERENCES
1. Idzham, K., Khalishah, M., Steven, Y., Aminuddin, M.S., Syawani, H., Zain, A.M., and Yusoff, Y., Study of Artificial Intelligence into Checkers Game using HTML and JavaScript. IOP Conf. Ser.: Mater. Sci. Eng., 864, 012091, 2020. doi:10.1088/1757-899X/864/1/012091
2. Alkharusi, S., Checkers Research Paper Based on AI (2), 1, 7, 2020. https://www.researchgate.net/publication/339337169_checkers_research_paper_based_on_AI_2
3. Masters, J., Draughts, Checkers – Online Guide. www.tradgames.org.uk
4. Sutton R., Samuel’s Checkers Player. In: Sammut C. and Webb G.I. (eds.), Encyclopedia of Machine Learning. Boston, MA: Springer, 2011. https://doi.org/10.1007/978-0-387-30164-8_740
5. Samuel, A.L., Some Studies in Machine Learning Using the Game of Checkers. IBM J. Res. Dev., 3 (3), 210–229, 1959. CiteSeerX 10.1.1.368.2254. doi:10.1147/rd.33.0210
6. Myerson, R., Game Theory: Analysis of Conflict. Cambridge, MA; London, England: Harvard University Press, p. 1, 1991. doi: 10.2307/j.ctvjsf522.15
7. Pearl, J., Heuristics: Intelligent Search Strategies for Computer Problem Solving. Reading, MA: Addison-Wesley Pub. Co., Inc., p. 3, 1984. OSTI 5127296.
8. Black, P. and Vreda, P., “Search Tree”. Dictionary of Algorithms and Data Structures Figure 2. 2005. https://levelup.gitconnected.com/an-into-to-binary-search-trees-432f94d180da
9. Binmore, K., Playing for Real: A Text on Game Theory. New York, 2012. Oxford Scholarship Online: http://dx.doi.org/10.1093/acprof:osobl/9780199924530.001.0001
10. Kuo, Jonathan C.T., Artificial Intelligence at Play — Connect Four (Mini-max Algorithm Explained), 2020. https://medium.com/analytics-vidhya/artificial-intelligence-at-play-connect-four-minimax-algorithm-explained-3b5fc32e4a4f
11. Russell, S.J. and Norvig, P., Artificial Intelligence: A Modern Approach (2nd ed.). Upper Saddle River, NJ: Prentice Hall, 2003, pp. 163–171. ISBN 0-13-790395-2. Figure 3: By Nuno Nogueira (Nmnoguera) http://en.wikipedia.org/wiki/Image:Minimax.svg, created in Inkscape by author, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=2276653
12. Russell, S.J. and Norvig, P., Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River, NJ: Prentice Hall, 2010.
13. McCarthy, J., Human Level AI Is Harder Than It Seemed in 1955. 2005. Retrieved 2006-12-20.
14. Edwards, D.J. and Hart, T.P., The Alpha–Beta Heuristic (AIM-030). RLE and MIT Computation Center: Massachusetts Institute of Technology, 1961. hdl:1721.1/6098.
15. Knuth, D.E. and Moore, R.W., An Analysis of Alpha-Beta Pruning. Artif. Intell., 6 (4), 293–326, 1975. doi:10.1016/0004-3702(75)90019-3 S2CID 7894372
2 The Future of Automatically Generated Animation with AI
Preety Khatri
CONTENTS
2.1 Introduction
2.2 AI’s Role in Animation
2.2.1 How AI Replaces Animation
2.2.2 Various Agents in AI
2.3 AI Latest Techniques in Animation
2.3.1 Latest AI Technology
2.3.2 AR Technology
2.3.3 VR Technology
2.4 The Traditional and Modern Animation
2.4.1 Traditional Animation
2.4.2 Stop-Motion
2.4.3 Modern Animation
2.5 Future Aspects of Animation with AI
2.6 Conclusion
References
2.1 INTRODUCTION
Computer animation extends the conventional method of animation with something
new and improved. Animation is a technique that uses a sequence of images in
frames to create the illusion of movement: a recording device captures a sequence
of individual states of a dynamic scene, and the images are then viewed on a screen.
An animation can be described as a movie made up of a series of rendered images.
Features such as file size, file format, frames per second (fps), compression, and
output size are used to control the quality of the images or frames [1, 2]. The most
popular form of animation is keyframing, in which the animator defines the
animation at several key points and the computer creates all of the transition
frames between each pair of keys. Changes to the position, rotation, and scale of
objects can all be animated in this way.
Setting the length of the animation in frames and fps is very important when
animating. Keys are then placed at chosen frames along the timeline [3, 4].
DOI: 10.1201/9781003231530-2 19
To move, rotate, or resize an object, a key is placed at the beginning and at the
end of the desired path. For example, if an object moves from point X to point Y in
4 seconds at 50 fps, the move spans 4 × 50 = 200 frames, so place the two keys
200 frames apart. Real-time animation allows you to give objects physical
properties and provides a variety of controls and features for managing them. In
the x, y, and z planes, you can create objects, alter masses, create actors, control
friction, and control forces. Real-time animation is used to build architectural
walk-throughs and a variety of three-dimensional (3D) games.
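The frame arithmetic and the in-between frames for the scenario above (a 4-second
move at 50 fps) can be sketched in a few lines of Python. This is a minimal
illustration that assumes linear interpolation between two position keys; the
function names and the coordinates of X and Y are hypothetical, and animation
packages usually offer eased or spline interpolation as well.

```python
def frames_for(duration_s, fps):
    """Number of frames spanned by a move of the given duration."""
    return round(duration_s * fps)


def interpolate(key_a, key_b, frame):
    """Linearly interpolate a position between two keys.

    Each key is a (frame_number, (x, y, z)) pair.
    """
    (f0, p0), (f1, p1) = key_a, key_b
    t = (frame - f0) / (f1 - f0)  # 0.0 at the first key, 1.0 at the second
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))


span = frames_for(4, 50)              # 4 s at 50 fps
key_x = (0, (0.0, 0.0, 0.0))          # key at point X (assumed coordinates)
key_y = (span, (10.0, 0.0, 0.0))      # key at point Y, `span` frames later
midpoint = interpolate(key_x, key_y, span // 2)
```

Here `span` is the distance in frames between the two keys, and `midpoint` is the
transition frame the computer would generate halfway along the path.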
There are two ways to view computer animation and its evolution. The first
approach combines traditional animation techniques with the use of a computer.
The second focuses on simulation models based on the laws of physics and
dynamics [5]. For example, traditional methods allow us to build 3D characters
with enhanced gestures, while simulation methods are used to model human actions
accurately. Take a bouncing ball, whose motion can be enhanced by adding squash
and stretch [3]. When an object is squashed, it flattens and spreads out, indicating
that it is made of a soft, pliable material. Many traditional animators use this
technique. It does not produce a physically accurate simulation, but it conveys the
idea to the audience. A bouncing ball's motion can also be fully simulated by
computer using the laws of mechanics, such as conservation of momentum and
Newton’s laws, as shown in Figure 2.1 [2].
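The simulation approach to the bouncing ball can be sketched as follows. This is a
rough illustration, not the method of any particular animation system: it assumes
a ball dropped vertically, integrates Newton's second law with a fixed time step,
and models the impact with a coefficient of restitution. The function name and all
parameter values are assumptions, and squash and stretch is not modeled.

```python
def simulate_bounce(height, dt=0.001, restitution=0.8, g=9.81, duration=3.0):
    """Drop a ball from `height` metres and return (time, height) samples."""
    y, v, t = height, 0.0, 0.0
    samples = [(t, y)]
    for _ in range(int(duration / dt)):
        v -= g * dt                # gravity changes the velocity
        y += v * dt                # the velocity changes the position
        if y < 0.0:                # the ball has reached the ground
            y = 0.0
            v = -v * restitution   # rebound, losing some speed on impact
        t += dt
        samples.append((t, y))
    return samples


trajectory = simulate_bounce(1.0)
```

Each sample can be rendered as one frame; a smaller `restitution` makes every
rebound lower than the one before it.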
In the multimedia industry, autonomous virtual actors and real-time animation
are critical because interactive use of functionality is a direct advantage.
Television and film producers are eager to develop new programs and features
that let audiences participate interactively [6]. Editors, publishers, and writers
of interactive TV programs, CD-ROMs, and CD-Is, which are becoming
increasingly interactive, need real-time animation capabilities [3].
Computer animation has been regarded for many years as a cutting-edge medium
for visual effects (VFX) in films and for advertising. The rapid development of
powerful super workstations in recent years has given rise to new areas such as
video gaming, virtual reality (VR), and multimedia. The use of real-time animation
and digital technology has become a significant concern. With traditional
television, the audience can only choose which programs to watch [7]. With the
latest advances in multimedia products and digital and interactive television,
viewers will be able to interact with programs. This development would pave the
way for programming personalized for each viewer [6].
Computer animation has developed as follows. It began with very basic methods
derived from conventional animation and keyframes [3]. Several time-consuming
rendering methods have also been developed. Importing inverse kinematics and
dynamics from robotics paved the way for more advanced simulation methods.
Computer animation has since focused on dynamic simulation and physics-based
methods, particularly for collision detection and deformations [5].
With the advent of super workstations and VR applications, brute-force approaches
such as rotoscoping are likely to resurface. In the future, real-time complex
animation systems will be built using simulation and VR devices [8]. Autonomous
actors, and real actors whose motion is captured by sensors, will be animated by
the machine using real-time behavioral simulation with complex physics-based
interactions with the environment. These dynamic scenes could be shared with
partners at distant locations.
most workers, which are both boring and menial. To engage in such activities, most
people need some motivation. Inventors use technology to make life simpler
wherever they can. Computers and appliances, such as washing machines and
microwaves, already perform our everyday tasks. Several AI-based automation
tools can now perform many of the tasks that animators perform. As a result, we
can conclude that AI is now a dominant force in the industry and that, for such
tasks, technology can perform better than humans [12].
2.2.2 Various Agents in AI
Agents, which are primarily concerned with intelligent systems, are one of the
most critical areas in AI [13]. In AI terms, an agent is an entity that exists in a
given environment and makes its own decisions. It uses sensors to perceive the
environment and actuators to act on it. Intelligent software agents and robots are
examples.
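The sense, decide, and act cycle of an agent can be illustrated with a toy example.
The sketch below is purely hypothetical: the class names, the one-dimensional
world, and the decision rule are all assumptions chosen to keep the example short,
and real agents use far richer percepts and actions.

```python
class LineWorld:
    """A toy environment: the agent's position on an integer line."""

    def __init__(self, start):
        self.position = start

    def sense(self):
        """Sensor: what the agent perceives about the environment."""
        return self.position

    def actuate(self, action):
        """Actuator: how the agent changes the environment."""
        self.position += action


class ReflexAgent:
    """A simple reflex agent that walks toward a target position."""

    def __init__(self, target):
        self.target = target

    def decide(self, percept):
        """Map the current percept to an action."""
        if percept < self.target:
            return +1   # step right
        if percept > self.target:
            return -1   # step left
        return 0        # already at the target


def run(agent, world, max_steps=100):
    """Repeat the sense-decide-act cycle; return the number of steps taken."""
    for step in range(max_steps):
        action = agent.decide(world.sense())
        if action == 0:
            return step
        world.actuate(action)
    return max_steps


world = LineWorld(start=0)
steps_taken = run(ReflexAgent(target=5), world)
```

The agent never reads the world's state directly; it only sees what `sense`
returns and only changes the world through `actuate`, which mirrors the
sensor and actuator roles described above.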
Figure 2.2 shows how autonomous agents are classified. The term “real-life
agents” refers to living animals such as mammals, reptiles, fish, and birds.
Mechanical robotic agents, on the other hand, are also agents from the AI
perspective, for example, the rovers used in NASA’s Mars missions. Computer
agents are agents that exist only in a virtual or web-based world [10].
Software agents include agents that process human language, agents that gather
information, agents that are informative, agents that are intelligent and learn,
and agents that are programmed for a specific purpose such as entertainment, e.g.,
in games, VFX, and 3D animation. Software agents and artificial life agents are
examples of computational agents. Computer agents are further classified as
search agents, learning agents, planning agents, conversational agents, intelligent
agents, and so on [13].
Figure 2.3 shows a typology of intelligent agents, collaborative agents,
collaborative learning agents, and interface agents based on their learning,
coordination, autonomy, and other characteristics. Agents operate at a higher
level than symbols and require high-level communication.
Agents take the form of a variety of bots: chatterbots, which are used for web
chatting; annoybots, which are used to disrupt chat rooms and newsgroups;
spambots, which collect email addresses from the web and then generate junk
mail; mail bots, which are used to manage and filter email; and spider bots, which
are used to scrape content from the internet.
From the above, we may conclude that agents are those who act or exert force.
Anything that produces or is capable of producing an effect is referred to as an
agent. An agent may also be someone authorized to act for or in the place of
another, such as a government delegate, an emissary, or an official who engages in
undercover activities; it may also be a business representative, as shown in
Table 2.1 [14]. Based on coordination, planning, and cooperative ability, the
agent’s properties can be classified as:
TABLE 2.1
Agent Properties Based on the Coordination, Planning, and Cooperative
Ability
AI algorithms can easily do data-driven work in far less time than people would
need. This new technology can now accomplish what would take teams of
animators weeks. AI’s real strength lies in making seamless edits, nuanced
characterization, and significant VFX, in addition to providing a solid track
record. These algorithms are still very costly and are used mainly by large
companies such as Disney. Even so, animators are uncertain about their future
careers as a result.
The technology is intended to make human lives easier, but the tricky part is that
not every artist shares this mentality. For some, the intrusion of technology into
their artistic work is frustrating, if not alarming. Filmmakers have not yet fully
embraced the AI movement, in which trained systems are used to generate worlds
and animations. This is primarily due to a lack of communication between
filmmakers and the people who build the technology.
2.3.1 Latest AI Technology
Digital technology is evolving rapidly. We regularly use new technologies such as
the voice assistants Alexa and Siri, Google Maps, and so on. AI in animation is
currently being developed and is used to some degree by animators to speed up
time-consuming tasks. One of the key benefits of using AI to automate such stages
of animation [18] is that speeding them up leaves more time to focus on other
activities.
Companies like Adobe have already developed features for their standard
animation suite that sync characters’ mouth movements to sound. Others are
going even further, using data collected through machine learning (ML) to create
solutions that completely automate characters’ movements and facial expressions
as they speak, effectively automating this delicate, time-consuming task.
ML is a branch of AI in which computers are not explicitly programmed to
perform a task. Instead, they learn and improve automatically from experience.
Deep learning is a form of ML that uses artificial neural networks to make
predictions. Unsupervised learning, supervised learning, and reinforcement
learning are the main categories of ML algorithms.
In unsupervised learning, the algorithm does not use labeled data; it works on the
data without any guidance. In supervised learning, the algorithm infers a function
from training data that consists of pairs of an input object and the desired
output. In reinforcement learning, the machine takes actions so as to maximize a
reward, and in this way finds the best option to choose.
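The difference between the three categories can be made concrete with three tiny,
deliberately simplified Python functions. They are illustrative assumptions, not
algorithms taken from this chapter: a least-squares line fit stands in for
supervised learning, a two-cluster k-means on one-dimensional values stands in for
unsupervised learning, and a deterministic "try each action once, then exploit"
loop stands in for reinforcement learning.

```python
def fit_line(xs, ys):
    """Supervised: learn y = a*x + b from labeled (input, output) pairs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (k-means, k=2).

    Assumes the values contain at least two distinct numbers.
    """
    c0, c1 = min(values), max(values)
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


def greedy_bandit(rewards, rounds=20):
    """Reinforcement: try every action once, then repeat the best-paying one."""
    estimates = [0.0] * len(rewards)
    total = 0.0
    for t in range(rounds):
        if t < len(rewards):
            action = t                                   # explore
        else:
            action = max(range(len(rewards)),
                         key=lambda a: estimates[a])     # exploit
        reward = rewards[action]   # feedback from a deterministic environment
        estimates[action] = reward
        total += reward
    return estimates, total
```

Only `fit_line` is given the desired outputs; `two_means` sees inputs alone, and
`greedy_bandit` sees nothing but the reward that follows each action it takes.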
FIGURE 2.4 Five emerging technologies that will change the world in the next five years.
(Source: Forrester Research, Inc., unauthorized production, citation, or distribution prohibited.)
Figure 2.4 presents five new technologies that will change the world in the next
five years. It groups emerging technologies into systems of engagement
technologies, systems of insight technologies, and supporting technologies [19];
the technologies covered include the Internet of Things (IoT), virtual and
augmented reality (VR and AR), hybrid wireless, AI/cognitive, intelligent agents,
real-time interaction management, spatial analytics, cloud-native application
frameworks, and insight platforms.
2.3.2 AR Technology
In early AR systems, the user wore a head-mounted display (HMD) and carried
additional equipment, which involved many limitations. The hardware was bulky,
and the HMD interfered with the user’s normal view [20]. It was also prone to
causing discomfort, such as dizziness or nausea. In addition, tuning the
performance of AR systems for that particular hardware demanded significant
effort and time from developers. These drawbacks have been alleviated by the
appearance and rapid evolution of smartphones. Powerful smartphones now on
the market offer the technology that once required an HMD. As a result, the user
does not have to wear extra hardware, and versatility is enhanced. Various
smartphone AR applications that layer digital information over the physical world
are now available on the market [21].
With the aid of the smartphone camera, we can view digital data and position it at
a specific location. Information may be positioned in a variety of ways, including
by the location of the mobile device. Modern smartphones have Global Navigation
Satellite System (GNSS) localization, orientation sensors, and other features. The
orientation sensor processes the raw sensor data from the accelerometer to obtain
orientation information. It provides pitch (degrees of rotation about the x-axis)
and roll (degrees of rotation about the y-axis), and the device’s orientation is
determined from these values. When a smartphone is at a location with known
coordinates and orientation, the digital information is visualized using the
camera. The accuracy of GNSS positioning is a disadvantage of this approach.
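How pitch and roll can be derived from raw accelerometer data is sketched below.
The formulas are the common gravity-vector approximation and hold only while the
device is at rest; the function name is hypothetical, the axis layout is assumed
to be x to the right, y toward the top of the screen, and z out of the screen, and
sign conventions differ between platforms, so the platform's own sensor
documentation takes precedence.

```python
import math


def pitch_roll_from_accel(ax, ay, az):
    """Estimate pitch and roll in degrees from one accelerometer reading.

    (ax, ay, az) is the measured acceleration in m/s^2 with the device at
    rest, so the reading is dominated by gravity.
    """
    pitch = math.degrees(math.atan2(ay, math.hypot(ax, az)))  # about the x-axis
    roll = math.degrees(math.atan2(-ax, az))                  # about the y-axis
    return pitch, roll


flat = pitch_roll_from_accel(0.0, 0.0, 9.81)      # lying flat, screen up
upright = pitch_roll_from_accel(0.0, 9.81, 0.0)   # held upright
```

With the device lying flat both angles are zero, and holding it upright gives a
pitch of 90 degrees under the assumed conventions.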
The accuracy decreases in areas with tall buildings or a large number of trees,
where the signal bounces off obstacles. Furthermore, this approach is ineffective
indoors. The use of markers is a different way of displaying digital data. A marker
is a square with a light-colored, usually white, center and a dark-colored, usually
black, outer frame [22]. Each marker carries a unique pattern, making it one of a
kind. In the marker recognition process, the smartphone camera captures and
processes images in real time in order to identify a pattern. The image recognition
techniques focus on color detection, known shapes, and repeated or geometric
patterns.
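The last step of marker recognition, deciding which known marker a detected
square corresponds to, can be sketched as a pattern comparison. The example below
assumes that an earlier image-processing stage (not shown) has already located the
square and sampled its interior into a small grid of 0 and 1 cells; the grid size,
the marker name, and the function names are illustrative assumptions rather than
the API of any AR framework.

```python
def rotate(grid):
    """Rotate a square grid of 0/1 cells by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def identify(observed, known_markers):
    """Match an observed grid against known patterns in all four orientations.

    Returns (marker_id, quarter_turns) where quarter_turns is the number of
    clockwise rotations applied to the observed grid before it matched, or
    None when no known marker matches.
    """
    candidate = [list(row) for row in observed]
    for turns in range(4):
        for marker_id, pattern in known_markers.items():
            if candidate == pattern:
                return marker_id, turns
        candidate = rotate(candidate)
    return None


# A hypothetical 3x3 pattern with no rotational symmetry.
MARKERS = {"A": [[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 0]]}

seen = rotate(MARKERS["A"])        # the camera sees the marker turned once
match = identify(seen, MARKERS)
```

Because the pattern is not rotationally symmetric, the match also reveals how the
marker is oriented relative to the camera, which an application needs in order to
orient the overlaid content.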
When a marker is detected, the virtual information is overlaid on the camera
image. Natural feature tracking (NFT) is a technique that tracks the natural
features of objects or pictures instead of markers. NFT thus enables applications
to recognize and track natural features on objects and images. Digital data such
as two-dimensional (2D) images, 3D models, video, audio, text, and animation can
be overlaid. This opens up the possibility of developing a wide range of
applications, such as:
TABLE 2.2
Summary of AR Frameworks
(x = supported, - = not supported)
Framework  | Location-Based | Operating System                                 | NFT | Marker-Based
Droid AR   | x              | Android                                          | -   | x
AR Toolkit | -              | iOS, Android, Linux, Windows, Mac OS X, Unity 3D | x   | x
Beyond AR  | x              | Android                                          | -   | -
Vuforia    | -              | Android, iOS, Unity 3D                           | x   | x
Mixer      | -              | Android & iPhone                                 | x   | x
progress [20] due to many growing software structures. A framework usually
provides some basic features that can be used to build more complex applications.
The most significant advantages of using frameworks are that they provide a
general application structure, they aid collaboration among developers, and many
resources and libraries are available for them. There are many platforms available
these days for quickly creating AR applications [21].
Table 2.2 summarizes the current state of free AR framework functionality.
Many frameworks are no longer maintained, and as a result, the list is constantly
changing. AR Toolkit and Vuforia are the most user-friendly and comprehensive
tools, but if you want to build a location-based app, you will need to use one of
the other frameworks [2].
2.3.3 VR Technology
Although AR and VR are closely linked technologies, they represent different
realities. AR adds elements to reality, while VR creates a new reality that is not
real. VR is an artificial world developed with specialized software. It displays a
3D image that can be explored interactively using controls such as a game
console, a computer mouse, or sensor-equipped gloves [23].
VR systems use an HMD, such as a pair of glasses, to view the virtual world. The
first VR devices had poor graphics quality and needed a complicated HMD. HMDs
have advanced dramatically in recent years, and there are now commercial
solutions on the market, such as the Oculus Rift, a VR headset developed by
Oculus VR [7].
VR relies on stereoscopic vision, a method of presenting 3D visual information
that gives a picture the illusion of depth. In natural stereo vision, the eyes,
because of their separation, produce two images with minor differences between
them, and the brain processes these differences to construct depth perception. The
HMD displays a stereoscopic view: two images are shown, one for each eye. A
small control is used to adjust the interpupillary distance of the display.
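The geometry behind the two slightly different images can be sketched with a
simplified model. The example assumes two parallel pinhole cameras separated by
the interpupillary distance (IPD); under that assumption a point at depth z is
shifted horizontally between the two images by d = f × IPD / z, where f is the
focal length. The function names and the numeric values are illustrative only and
do not describe any particular headset.

```python
def eye_positions(head, right, ipd):
    """Place the two virtual cameras half an IPD to either side of the head.

    `head` is the head position and `right` a unit vector pointing to the
    viewer's right, both as (x, y, z) tuples; `ipd` is in the same units.
    """
    half = ipd / 2.0
    left_eye = tuple(h - half * r for h, r in zip(head, right))
    right_eye = tuple(h + half * r for h, r in zip(head, right))
    return left_eye, right_eye


def disparity(focal_length, ipd, depth):
    """Horizontal shift between the two images of a point at `depth`."""
    return focal_length * ipd / depth


left, right = eye_positions((0.0, 1.7, 0.0), (1.0, 0.0, 0.0), ipd=0.064)
near = disparity(focal_length=800.0, ipd=0.064, depth=2.0)   # in pixels
far = disparity(focal_length=800.0, ipd=0.064, depth=4.0)
```

Nearby points produce a larger shift than distant ones, and this difference is
the cue the brain uses to perceive depth; a different IPD changes every shift,
which is why the distance is adjustable.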
Eye separation varies between individuals, so this adjustment is needed for a
realistic stereoscopic view. The HMD also contains sensors that track the user’s
head movements and update the picture accordingly, so that the virtual world
appears to surround the user. For the time being, such a device must be connected
to a computer or a smartphone. Some users report headaches or motion sickness
as a result of the immersive 3D vision. To avoid this, the lenses must be adjusted
to each individual’s vision. Even so, prolonged use can cause anxiety.
Viewing VR through an HMD can be somewhat uncomfortable. A promising
alternative is a semi-immersive app in which a 3D model is shown on the
smartphone screen and the user interacts with it through the touch screen.
Several frameworks let users load 3D models and interact with them, and they
can be used to build simple semi-immersive applications. However, if you want to
work with 3D models for animation, rendering, and other purposes, a game engine
is the best choice.
A game engine is a software framework developed for the creation and
development of video games that can be played on a variety of platforms.
Loading, rendering, object collision detection, animation, physics, inputs,
graphical user interface (GUI), and AI are the engine’s key components. The game
engine also has other tools for creating the actual game, such as terrains,
characters, and real-world object behaviors. It includes all of the resources needed
to create a VR app, and many game engines also support stereoscopic vision [24].
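Two of the components listed above, physics and object collision detection, can be
illustrated with a very small update loop. This is a sketch under simplifying
assumptions (2D axis-aligned boxes, constant velocities, a fixed time step), and
the class and function names are hypothetical; real engines use far more elaborate
broad-phase and narrow-phase collision schemes.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """An axis-aligned rectangle with a constant velocity."""
    x: float
    y: float
    width: float
    height: float
    vx: float = 0.0
    vy: float = 0.0


def overlaps(a, b):
    """Bounding-box test: true when the two rectangles intersect."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def step(boxes, dt):
    """One engine tick: advance positions, then report colliding pairs."""
    for box in boxes:
        box.x += box.vx * dt
        box.y += box.vy * dt
    return [(i, j)
            for i in range(len(boxes))
            for j in range(i + 1, len(boxes))
            if overlaps(boxes[i], boxes[j])]


player = Box(0.0, 0.0, 1.0, 1.0, vx=1.0)   # moves one unit per second
wall = Box(3.0, 0.0, 1.0, 1.0)             # static obstacle
collisions = [step([player, wall], dt=1.0) for _ in range(3)]
```

Each call to `step` plays the role of one frame of the engine's main loop; the
list it returns is what game logic would react to, for example by stopping the
player or triggering an animation.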
min3D, a lightweight 3D framework for Android based on OpenGL ES v1.0/1.1,
is one of the several 3D model frameworks available. It includes tools for loading
and modifying .md2, .3ds, and .obj files, but it is no longer maintained. The
disadvantage of the previously mentioned frameworks is that they are compatible
only with Android and are difficult for beginner programmers to use when
creating apps. A game engine, on the other hand, is the best way to build a virtual
world and extend basic functionality.
Unreal Engine 4, CryEngine, and Unity are the most common game engines at
the moment. These game engines are prevalent, and although each has its own set of
advantages and disadvantages, they are all compatible with VR glasses [25]. Unity is
the simplest to use and is compatible with all mobile devices, but the graphics quality
is lower, and real-time simulation is impossible.
Unreal Engine 4 has incredible graphics capabilities, allowing for the development
of hyper-realistic scenes, but it is only compatible with iOS and Android. However,
it is easy to use. Finally, because of the engine’s steep learning curve, CryEngine is
better suited to experienced developers, though the graphics quality is excellent. The
main characteristics of these game engines are summarized in Table 2.3 [25].
TABLE 2.3
A Summary of Game Engines
It is the equivalent of ignoring the roots of animation and film, which are
traditional art and photography. The two are intertwined, and neither would exist
without advances in technology and artistic movements. Many people are upset by
computer animation and protective of traditional methods, yet it is simply a
different way of working that gives thousands of artists, designers, and engineers
more freedom to work as they like, in addition to a more open and structured
workflow.
When it comes down to it, it is all about personal preference. It might also come
down to what is best for the project. People should consider that creating digital
animations opens up a world of possibilities and access that conventional
animation cannot match.
Having a range of controls in whatever environment you want to work in,
physically and artistically, is a good example. Previously, only animation studios
had the physical resources to create and screen animated films. Now anyone has
unrestricted access to make an animated film and distribute it as they see fit.
Because of all the possibilities for animating [9], the field is far more open in
terms of creativity. There is no single way or single form in which to animate. It
is surprising when someone criticizes the field, because it is so extensive and
accessible.
Whether 2D or 3D is better nowadays has almost nothing to do with technological
properties; it depends on which one works better for the project. Recently, there
have been some fantastic 2D films, such as The Iron Giant and Lilo & Stitch, as
well as some terrible 3D films, such as Final Fantasy. Spirit, Treasure Planet, and
Sinbad failed simply because they were poor movies in plot and pacing, not
because of the medium in which they were produced. And some of the most
popular 3D films include Shark Tale (2004), Planes (2013), Norm of the North
(2016), and Alpha and Omega (2010). Essentially, 3D is yet another tool to aid in
the filmmaking process. We use it where it is most appropriate for the project, but
the medium does not determine the result. Ideas, pencils, and paper are the best
starting points for a film. The way you tell your story has a significant impact on
its success. What best suits the project is decided by finding the media and
resources that will work with the story.
At this point in the history of animation and its technology, having the best of
both worlds, physical and digital, is critical. Technology has simply moved
animation into a more modern reality. It has rarely replaced the key elements
that capture the heart of animation, which is what many people fear whenever
they believe technology is about to bring a significant shift to a field.
Many people are afraid of being replaced one day by a computer or by someone
who has the tools to do three jobs for one low price.
In reality, new kinds of employment will always emerge and will most likely
replace old ones. However, this will also lead to more creative content in other
fields that definitely require the labor of a human mind. Only if the future
someday produces a computer with an artificial mind advanced enough to create
fantastic animation on its own will we have something to be concerned
about [17].
technology was prevalent in the cinema before the advent of computers. This
animation technique uses a sequence of sketches on transparent sheets.
Traditional animation starts with the creation of a plot. Once the plot is chosen,
the artist creates a storyboard that resembles a comic book [4]. The storyboard
depicts the sequence of shots, camera angles, and framing. The animator may need
to redo a scene several times at the storyboard stage before the director
approves it.
Traditional animators work in batches, drawing one picture or frame at a time. In
conventional animation, the pencil is the primary tool. The artist draws on a sheet
of transparent paper that fits onto the peg bar on the desk. The peg bar is a
traditional animation device used to hold drawings in place. The artist creates the
character solely with a pencil; the drawings are then photographed or scanned and
synchronized with the necessary soundtracks. Before the animation is sent to the
supervisors, the artist checks and refines the work [4].
Each pencil-drawn frame of the animation is transferred to a cel, or celluloid
sheet. The cels are painted and superimposed on background images. Background
artists typically paint the sets on which the animated sequence’s action takes
place, and acrylic paint was commonly used for the backdrop. Animators study
models, dolls, puppets, and real-life figures for inspiration. The main animator in
an animation studio creates a character’s key frames, or main frames [4].
Tweening (in-betweening) is a technique in which the primary animator draws the
main points of the action while the junior animator completes the intermediate
frames.
The animator uses keyframes to control the movement of the character’s arms,
eyes, and mouth. A supervising animator, a small group of lead animators, and
many assistant animators work together in a large-budget animation production
group. Key animators decide the main action of the character. After that, the
approved pencil animation drawings are photographed with a black-and-white
animation camera.
Traditional animation includes 2D cel animation and stop-motion animation,
even though both can use digital recording techniques in the end. What matters
most is the process used to create the animation. Cel animation typically involves
hand drawing, hand inking, and hand painting each frame on actual paper and
cels, while stop-motion involves working with physical models and objects shot
with the camera one frame at a time. Traditional animation gives you the classic
cartoon movie, such as Mulan. It is a method of drawing pictures, each slightly
different from the one before, photographing them, and then playing them in
sequence to animate them.
2.4.2 Stop-Motion
Stop-motion animation has been a staple of film special effects for almost as long
as films have existed. It is an enjoyable and straightforward animation technique.
Even if you do not know it, you have probably seen stop-motion animation in
advertisements, music videos, TV shows, and movies. Although it is common to
think of stop-motion as a single form [12], it is not: stop-motion techniques can
create a wide variety of film forms, not just sound animation. If you combine
equal parts digital camera, computer, and imagination, you are on your way.
Though computer-generated animation is flashier, stop-motion animation has its
own rich culture.
2.4.3 Modern Animation
In computer animation, the effect is created digitally using either 2D or 3D
models. 2D computer animation commonly virtualizes the traditional 2D animation
workspace, bringing pen and paper into the digital realm to recreate cartoon
animation styles and workflows. 3D computer animation usually mixes
conventional timelines with workflows adapted to work in 3D virtual space [2].
In any case, if you are animating on a computer screen, you are dealing with
computer animation. Avatar and Antz, for example, contain 3D animation, while
Cartoon Network and Nickelodeon use 2D for the majority of their cartoon
features. Pixar’s animation style is 3D computer animation, which entails
animating with computers in a 3D world.
Computer graphics and animation are now used in almost every audiovisual
medium, from movies to commercials, because they make it possible to create
objects, environments, and characters that would be impossible or difficult to
construct in real life. Computer animation has become an integral part of our
everyday lives and is used for movies, education, and a variety of other
commercial purposes.
FIGURE 2.5 Attractive opportunities in the 3D animation market. (Source:
MarketsandMarkets analysis.)
2.6 CONCLUSION
This chapter addressed the future of automatically generated animation with AI.
It discussed the effect of AI on animation and how modern animation differs from
traditional animation [8], and it examined whether AI can replace animators,
which is a significant concern these days.
The chapter also discussed the latest technologies for 3D models, which provide
users with authentic and enriching experiences when visualizing 3D objects. Some
users may find these apps bothersome or inconvenient, especially those that
require VR glasses. As a result, some users may prefer AR apps or viewing 3D
models on the mobile screen [27].
As AI technology improves and spreads across the industry, animators,
filmmakers, and designers are jumping in and making high-quality animations
with fewer people in less time. AI is also having a positive effect on animation and
motion graphics, pushing the boundaries of what can be done with animation.
Disney is known for using AI to create storyboard animations solely from scripts:
when the script includes a term like “turn right,” the character turns in that
direction in the animation.
REFERENCES
1. Agrawal, A., Gans, J., and Goldfarb, A., eds., Introduction to: “Economics of Artificial
Intelligence.” In Economics of Artificial Intelligence. Toronto: nber.org, 2018.
http://www.nber.org/chapters/c14005.pdf.
2. Funge J., Making Them Behave: Cognitive Models for Computer Animation, Ph.D.
thesis, Department of Computer Science, University of Toronto, 1998.
3. Burtnyk N, and Wein M., Computer-generated Keyframe Animation, J. SMPTE, 80,
149–153, 1971.
4. Lasseter J., Principles of Traditional Animation Applied to 3D Computer Animation,
Proc. SIGGRAPH ‘87, Computer Graphics, 21 (4), 35–44, 1987.
5. Amiguet-Vercher J., Szarowicz A., and Forte P., Synchronized Multi-agent Simulations
for Automated Crowd Scene Simulation, AGENT-1 Workshop Proc., IJCAI, 1, 4–10,
2001.
6. Mayer, R.E., and Moreno, R., Animation as an Aid to Multimedia Learning. Educational
Psych Rev., 14 (1), 87–99, 2002.
7. CSIRO Mathematical and Information Sciences (n.d.), Virtual Reality for Teaching Anatomy
and Surgery, 2004. Retrieved from http://www.siaa.asn.au/docs/CSIROscopestudy.pdf.
8. Boden, M.A., AI: Its Nature and Future. Oxford, New York: Oxford University Press,
2016.
9. Park O.C., and Gittelman S.S., Selective use of Animation and Feedback in Computer-
based Instruction, Educ. Technol. Res. Dev., 40, 27–38, 1992.
10. Fikes R., and Nilsson, N., STRIPS: A New Approach to the Application of Theorem
Proving to Problem-Solving, Artif. Intell., 2, 189–208, 1971.
11. Milheim, W.D., How to Use Animation in Computer-assisted Learning, Br. J. Educ.
Technol., 24 (3), 171–178, 1993.
12. Palmer, S., and Elkerton, J., Animated Demonstrations for Learning Procedural
Computer-based Tasks, Hum. Comput. Interact., 8, 193–216, 1993.
13. Funge J., Tu X., and Terzopoulos D., Cognitive Modeling: Knowledge, Reasoning, and
Planning for Intelligent Characters. Computer Graphics Proceedings: SIGGRAPH 99,
1999.
14. Bass, A.S., Non-Tech Businesses Are Beginning to Use Artificial Intelligence. Financial
Times, 2018.
15. Rieber, L.P., Animation as Feedback in Computer-Based Simulation: Representation
Matters, Educ. Technol. Res. Dev., 44, 5–22, 1996.
16. Long D., The AIPS-98 Planning Competition, AI Mag., 21 (2), 13–33, 2000.
17. Reynolds C.W., Computer Animation with Scripts and Actors, Proc. SIGGRAPH’82,
289–296, 1982.
18. Russell S., and Norvig P., Artificial Intelligence, A Modern Approach. London:
Prentice-Hall, 1999.
19. Tversky, B., Morrison, J. B., and Bétrancourt M., Animation: Can it facilitate?, Int. J.
Hum. Comput. Stud., 57, 247–262, 2002.
20. ARTOOLKIT, Open source augmented reality SDK. Artoolkit.org, 2016. [online]
Available from: http://artoolkit.org.
21. Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E., and Ivkovic, M.,
Augmented Reality Technologies, Systems, and Applications, Multimedia Tools Appl.,
51 (1), 341–377, 2011.
22. Schnotz, W., Böckheler, J., and Grzondziel, H., Individual and Cooperative Learning
with Interactive Animated Pictures, Eur. J. Psychol. Ed., 14, 245–265, 1999.
23. Beier K.P., Virtual Reality: A Short Introduction, 2004. Retrieved from http://www-vrl.
umich.edu/intro/.
24. Zeltzer D., Towards an Integrated View of 3D Computer Animation, The Visual
Computer, 1 (4), 249–259, 1985.
25. CRYENGINE. Cryengine. 2016. [online] Available from: https://www.cryengine.com/
features.
26. Fikes R., Hart, P., and Nilsson, N., Learning and Executing Generalized Robot Plans,
Art. Intell., 3, 251–288, 1972.
27. Draganov, I.R. and Boumbarov, O.L., Investigating Oculus Rift Virtual Reality Display
Applicability to Medical Assistive System for Motor Disabled Patients. The 8th IEEE
International Conference, 2015.
3 Artificial Intelligence
as Futuristic Approach
for Narrative Gaming
Toka Haroun, Vikas Rao Naidu,
and Aparna Agarwal
CONTENTS
3.1 Introduction
3.2 Related Works
3.2.1 AI for Computer Games
3.2.2 AI for Adaptive Computer Games
3.2.3 AI in Video Games: Toward a Unified Framework
3.2.4 Narrative in Video Games
3.2.5 Narrative Game Mechanics
3.2.6 Interactive Narrative
3.2.7 Adventure Games and Puzzle Design
3.2.8 The Horror Genre and Video Games
3.3 Player Experience
3.4 Methodologies used in Game Development
3.5 AI Elements in the Proposed Narrative Gaming Model
3.6 Q-Algorithm for AI in Gaming
3.7 Conclusion
References
3.1 INTRODUCTION
Artificial intelligence (AI) in video games comprises the systems designed and
developed to create the choices and actions of non-player characters (NPCs) within
a game. Current video games offer a fascinating testing ground for researching new
ideas in AI. These games integrate diverse and detailed environments with systems
that are developed to allow dynamic, complex, and smart real-time choices. In
various video game genres such as strategy, action, and role-playing games, NPCs
are built on rule-based systems with variations for different scenarios. However,
machine learning (ML) methods are sometimes applied to allow NPCs to adapt and
learn from their interactions with the player character according to their success
or failure. Although ML can improve an NPC’s overall performance, it is not usually
applied in video games.
DOI: 10.1201/9781003231530-3
Social isolation measures due to COVID-19 have affected many people, especially
young adults, causing a loss of a sense of community, negative impacts on learning
and development, and an increase in impersonality (Sikali, 2020). Computer games
can offer entertainment and stress relief during the pandemic (Ferguson, 2020).
Narrative and storytelling games can help players in social isolation by offering
an immersive storytelling experience where the player can explore and interact with
the game environment to reveal the plot of the game.
This research is aimed at supporting the emotional wellbeing of people in social
isolation, especially young adults (Anderton Kevin, 2018). It provides players with
an immersive experience that is unlikely to be encountered in the real world, in
which they solve puzzles and explore a thrilling and mysterious environment from
the safety of their homes (Butler, 2016).
This chapter will also contribute to the independent game development community
by using recent studies and tools in game development to create an immersive
storytelling experience for players that is accessible online.
In line with the gaming objectives of AI, game developers generally target seven
goals in order to deliver an enjoyable and thrilling gaming experience to the
players:
forward. For the deterministic shortest path problem, natural assumptions of the
default results are proven. In this context, the authors adjust the conventional cost
function into a trajectory-tracking function, which is also an efficient cost-to-target
network tracking function. This significantly contributes to the conceptual frame-
work of the problem domain. The Lyapunov method presents a new principle of
balance and consistency to the shortest path decision-making process.
In his paper titled “Fractal study of stealthy pathfinding aesthetics,” Ron Coleman
uses a fractal framework to examine the aesthetic qualities of a new category of
stealth-based pathfinding that seeks to prevent detection in video games (Coleman,
2009). This study is interesting because AI research has paid comparatively little
attention to aesthetics in pathfinding. According to the fractal framework, the
published data indicate that stealthy paths are distinct in their aesthetic value
in comparison to control paths. The author also demonstrates, with statistical
results, that paths created by different stealth rules are distinct from one another.
Frank Dignum et al. discuss research on multiagent systems and their promise for
developing cognitively intelligent NPCs. However, due to compatibility issues, the
technology is not easily implemented in game engines (Dignum et al., 2009). Game
engines have dynamic, real-time features that favor centralized control and
efficiency, whereas multiagent platforms focus on the independence of agents. Using
multiagent systems to create more independent and intelligent NPCs will help
advance and improve gameplay.
In a paper titled “A multiagent potential fields-based bot for real-time strategy
games,” Johan Hagelbäck et al. discuss the use of AI in real-time strategy games
where bots are given potential field-based systems that allow them to plan attacks,
find enemies, avoid one another, and explore their environment (Hagelbäck and
Johansson, 2009).
A paper titled “Combining artificial intelligence methods for learning bots in a
real time strategy game,” written by Robin Baumgarten et al., discusses how AI is
used in strategy games to simulate human-centric gameplay, planning attacks and
movement using decision tree-based learning, case-based reasoning (CBR), and
annealing (Baumgarten, Colton and Morris, 2009).
In their paper titled “Enhancing artificial intelligence on a real mobile game,”
Fabio Aiolli et al. discuss technical issues encountered during game development,
such as creating a complex and engaging AI to play against. Such complexity is
difficult to achieve in mobile game development (Palazzi and Aiolli, 2009). The
authors suggest the use of an ML algorithm that addresses this issue by adapting to
and predicting human strategies within the game.
In the paper “Breeding terrains with genetic terrain programming—the evolution of
terrain generators,” Miguel Frade et al. discuss the role of AI in generating
terrains for level design. This allows level designers to create more diverse and
advanced terrain types with better features and aesthetics (Frade, Fernandez De Vega
and Cotta, 2009). The paper studies the various terrains created by AI, their
characteristics, and their resolution. The results show that the use of AI in
terrain generation can preserve detailed features without compromising on
resolution.
The fourth and final spatial design is Emergent Narrative. This provides the player
with abundant world-building material so that players can design their own
narrative. The narrative is not pre-constructed by the developers; it is designed
to be chaotic, similar to the real world. The author uses The Sims games as an
example, where the player can make decisions to interact with other characters and
have desires, and where those decisions have consequences, keeping the player
immersed and engaged while maintaining freedom in their own narrative.
explaining how those actions led to the specific ending the player got by the end of
the game.
Bycer wrote that there are endless possibilities for how narrative mechanics can
evolve and change gameplay to tell better stories that are more immersive and
interesting. He hoped that AAA game development companies would start implementing
them in their new games. He also recommended narrative mechanics that have not yet
been developed for games, such as changing the behavior and appearance of NPCs
according to the player’s behavior, and altering the player character’s senses
based on the rate of success or failure at solving puzzles in the game.
3.2.6 Interactive Narrative
Interactive narrative is a type of interactive entertainment in video games where
the player can affect the story through their actions during gameplay (Riedl, 2012).
Narrative in video games provides meaning and context to game events and actions,
provides motivation for the player’s actions in game, and acts as a link that
transitions the player through the various events and tasks within the game.
However, the player has limited control over the events of the storyline and can
only exert a small influence through choices, which allow the game to have multiple
branching storylines. Branching storylines are not often developed due to technical
difficulties, but AI can help create interactive narrative by generating multiple
storyline branches easily and efficiently.
There are two different approaches to using AI in interactive narrative: emergent
narrative and drama management. In emergent narrative, the AI simulates realistic
and independent characters within the game, while in drama management the AI
creates and drives the storyline of the game according to the player’s actions and
preferences. Interactive narrative gives the player not just control over the story
but also the illusion of a realistic world full of choices and possibilities to
explore.
Puzzles in video games have a strong impact because they can affect the player’s
actions and progress the story. They give meaning to in-game actions such as
inspecting items to find clues and searching for a solution. They can also have
consequences, giving punishments or rewards that affect the overall story and
progression. Puzzles must be designed to let the player progress in the story and
toward the game goals, so that they are perceived not as meaningless obstacles but
as meaningful and impactful choices that make the game progress.
The thesis discusses ten principles for designing puzzles in video games. The first
is to make them easy to understand so that the players do not feel confused. The
second is to make them easy to begin solving so that the players do not feel
intimidated and are willing to attempt them. The third is to show progress, so that
the player feels they are accomplishing something by interacting with and
manipulating the objects available. The fourth is to show that the puzzle can be
solved by giving the player cues of progress. The fifth is to make the puzzle
interesting by adding more difficult challenges. The sixth is to give the player
multiple options to progress the game, scattering different puzzles that the player
can choose from to avoid the frustration that results from a lack of progression.
The seventh is to show the player that all elements of the puzzles are connected,
which helps in solving more difficult challenges. The eighth is to offer clues that
can help the player solve the puzzle. The ninth is to show the answer to the puzzle
indirectly so that when the player finds the solution, they gain a sense of
accomplishment. The final principle is to be careful when using optical illusions
in puzzle design, as they might make the player feel frustrated.
Horror helps young adults confront and master their fears in a safe space (Jamie
Madigan, 2015). This helps release feelings of anxiety caused by real-world fears
and worries and replaces them with feelings of relief and exhilaration, as well as
a sense of reward from conquering their own fear (Elio Martino, 2019).
Experiencing fear in a fictional environment causes a rush of emotions that other
video game genres are unable to provide. This is a result of the excitation trans-
fer theory where fear transfers feelings of pleasure similar to that of riding a roller
coaster knowing that it is safe (Nicolas Brown, 2020).
This article discusses the various issues modern Horror games face and how they
are different to older games of the same genre. The author argues that horror games
have fallen out of fashion and are no longer popular in the game industry. AAA game
development companies have moved away from the genre despite the enormous suc-
cesses of games such as Dead Space 3 and Resident Evil 6. This is due to the indus-
try’s focus on combat which has resulted in the rise of Action-Horror games rather
than focusing on puzzle-solving, narrative, and adventure.
Meanwhile, independent game development companies such as Frictional Games started
to move away from action-horror and implement new game mechanics (such as stealth
and sanity) to help the player focus on the story while also increasing the fear
elements in the game. The success of the game has resulted in a rise in the
popularity of indie horror games, with many games released using similar stealth
mechanics and removing combat (Bycer, 2019).
However, many of these games lack an understanding of the psychology behind what
evokes tension and fear in the player, as well as what immerses the player in the
world and the narrative of the game. This includes providing the player with details
about the game world and narrative, as well as game mechanics and loops that keep
the player immersed in the story. The difficulty of designing these elements is one
of the reasons that independent game developers have switched to a different
approach: designing games for an audience instead of players. A trend called “Let’s
Play” started gaining popularity on YouTube in 2010, with many YouTubers creating
videos of themselves playing and reacting to these games.
This article is significant to the proposed project because it explains the current
issues in Modern Horror game design and shows how they are different from the
classic Horror games.
In this chapter, a tool has been designed to help game developers create a horror
game. The author points out the difficulty of developing horror games due to
differing interpretations of the nature of fear. This chapter offers a solution by
categorizing fear on a scale tool that measures the level of fear in game events.
This tool helps the developer understand how certain game mechanics, lighting, and
enemy design should be shaped in order to achieve the desired emotional response
from the player, as well as plan the intensity of certain game events along with the
pacing. The tool also acts as a bridge between understanding the psychology behind
fear and enjoyment in horror video games (Ntokos, 2018).
It is crucial to understand which emotions the game is meant to provoke in the
player and how these emotions contribute to the player’s enjoyment and immersion.
The level of fear tool contains ten levels of fear, which describe the intensity of
the emotion.
The author advises game developers to use the scale shown in Table 3.1 to plan the
game events and develop accordingly. The scale tool can be applied to atmosphere,
audio, and enemy AI.
Atmosphere is crucial in horror games because it controls how the player feels in
the environment. It is the element that sets the feel, style, and tone of the
environment, and it comprises elements such as sound, lighting, and gameplay
experience. Darkness is suitable for creating a scary atmosphere because it pushes
the player to imagine their surroundings, which increases tension. Any element that
blocks the player’s visibility will create a sense of unease.
Audio has a strong impact on player experience and game atmosphere. In Horror
games, Auditory Hallucinations affect the player’s experience by increasing tension
and fear through hearing sounds that did not happen (Demarque and Lima, 2013). They
take various types and forms, such as strange noises of ghosts or crying.
TABLE 3.1
Levels of Fear (see Ntokos, 2018)
This chapter conducted an experiment in which people played two different versions
of the same game, one with Auditory Hallucinations and one without. The results
showed that people who played the version with Auditory Hallucinations felt more
scared while playing.
The Survival Horror genre has many elements that make it well suited to
storytelling and narration. Examining the genre from the narrative perspective sheds
light on many aspects of Horror games that make it unique among genres and better at
implementing effective narrative structures. The player character is always put in
situations where (s)he takes the role of a detective who is exploring and
investigating the surrounding environment. This keeps the player engaged in the game
and trying to reconstruct its narrative, while also being unsure about the strange
and paranormal elements happening in the game world and threatening the character.
The player’s goal is not just to discover the story, but to find a solution and
escape at the same time. This structure is one of the reasons that characters in
Horror games usually suffer from memory loss, which enables the player to share the
experience of uncovering the past with the character, creating a connection and
increasing immersion.
Environmental storytelling is one of the powerful mechanics used in horror games,
where the narrative unfolds to the player through the game world and architecture.
Buildings and objects such as toys, furniture, or notes are used to tell stories
from the past. The level design becomes a crucial part of the game narrative and is
treated almost as one of the characters in the game (Kirkland, 2009).
Survival Horror games show that there are ways to represent narrative in an
interactive game experience through spatial environmental design even though
the player’s pathway is designed in the game world through quests and tasks.
Storytelling in Survival Horror is also designed to feed the player the sense of not
being in control in an interactive narrative experience. However, in order to achieve
this, storytelling usually takes a linear design in the game giving the player the
illusion of being in control over the story events while providing the player with
interactivity. This causes a sense of fear and tension in the game through showing
that there are more powerful dark forces that control the narrative and manipulate
the player in the game world.
methods help the game designer in analyzing both the end results to improve project
implementation, and the implementation process to improve the final result.
Approaching the project from both viewpoints accounts for the various possibilities
and dependencies that arise when creating systems and subsystems that need to
interact with each other and produce dynamic and complex behaviors inside the game.
The MDA framework is designed to help designers and game developers closely examine
interdependencies in a game in order to create the desired end result, while
scholars and researchers need to study and understand them before drawing any
conclusions about the resulting game experience.
Players view the game as a set of rules that need to be followed; these rules build
the system that is being played, and playing the system results in the player
having fun. This view can be formalized for developers using the MDA model as
follows: Mechanics are the basic structure of the game and the code that forms the
rules; Dynamics are the result of the interdependencies of the mechanics, that is,
how they affect the player’s input and the game’s output to the player; and
Aesthetics are the player’s emotional responses and experiences in the game.
This framework is built on a fundamental idea that video games should be
described as artifacts and not as media. The author suggests that video games identi-
fied as artifacts mean that the behavior of a game is formed by its content and their
interactions together forming a system and not the media elements that are presented
to the player.
Each element of the MDA framework can be used to view separate parts of the game
that can be linked together to provide the full experience desired by the
developers, as shown in Figure 3.1. The designer views the mechanics as the elements
that create the dynamic behaviors in the system, and these dynamics create the
desired aesthetic experiences for the player. The player, however, views the
aesthetics as the main element that sets the mood of the game, which results from
the dynamics and, ultimately, the underlying mechanics.
The author urges developers and researchers to think about both the views of the
designer and the player in order to understand how game elements interdepend and
interact with one another as well as create a game that focuses on player experience
as opposed to a game that focuses on the given features and mechanics.
FIGURE 3.1 The MDA framework (see Hunicke, Leblanc and Zubek, 2004).
certain goal while resistance is the challenges that stand in front of the player block-
ing the player from achieving their objectives. This kind of pattern leads the player
to find a solution to remove the resistance and continue with the objective.
The second pattern is based on AI movement where the player moves in the envi-
ronment responding to the movements and actions of AI characters in game which
can be NPCs or enemies.
The third pattern is based on a target path where the player needs to reach a cer-
tain visible target which directs and moves the character to that location.
The fourth pattern is based on collection paths where the player moves in the
environment motivated by collecting rewards or resources such as battery pickups
or coins.
The fifth and final pattern is player vulnerability path which forces the player to
adapt to the environment due to vulnerability in a given scenario such as needing to
hide to avoid enemies.
games have been observed to gather references and understand their requirements
and functionalities until initial features for the proposed project have been identified.
Various sources and tutorials have been studied to learn how to implement the initial
identified features inside Unreal Engine as shown in Figure 3.2.
Game development projects share many similarities with software engineering
projects (Kortmann and Harteveld, 2009). Agile methodologies are suitable for
complex projects where the requirements are not clear at the start of the project.
Game development projects are complex, with uncertain requirements that might need
to change during development because of their multidisciplinary nature, which draws
on skills and tasks from art design, sound, gameplay, and programming (Petrillo and
Pimenta, 2010). This makes it difficult for clients to have a clear picture of the
needed outcomes at the start of project development. Agile methodology allows
developers to gain a better understanding of the clients’ needs, provide suitable
solutions that improve the quality of the project, and develop more efficiently by
spending more time adding improvements to the project. Therefore, many game
development companies have adopted Agile methodologies (Stacey and Nandhakumar,
2008).
The functional prototype is implemented in the third person design template of
Unreal Engine. This template includes a third person character and camera where the
camera follows the player character in the three-dimensional (3D) space as shown in
Figure 3.3. The first-person mode has been designed by moving the camera to the
head of the player and linking it to the head mesh so that it moves with the movement
of the player character. A spotlight has been added in front of the head mesh to cre-
ate the light for the flashlight. The arrow attached to the character mesh points at the
direction which the player character faces when spawned in the level.
The player can walk and look around in all directions inside the 3D space by
using the mouse and keyboard to navigate in the x, y, and z axes. When the player
presses the w, s, a, and d keyboard buttons, the coordinate in the background gets
updated accordingly.
The screenshot shows the player’s screen in the functional prototype. The user
interface shows the player’s hunger and stamina levels. These bars decrease over
time. Hunger is replenished by picking up consumable items (minor food pickup and
full food pickup), while stamina regenerates over time; however, the player
character cannot sprint while the stamina is regenerating.
At the bottom, a flashlight battery indicator shows the player how much battery is
left, as shown in Figures 3.4, 3.5 and 3.6. The player cannot turn on the flashlight
if the battery is empty and must pick up batteries to refill it.
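The hunger, stamina, and flashlight battery rules described above can be sketched
in a few lines of Python. This is only an illustration under stated assumptions, not
the prototype's Blueprint logic: the class name SurvivalStats, the drain and
regeneration rates, and the rule that sprinting stays blocked until stamina is full
again are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


def clamp(value: float, low: float = 0.0, high: float = 100.0) -> float:
    """Keep a stat inside its valid range."""
    return max(low, min(high, value))


@dataclass
class SurvivalStats:
    hunger: float = 100.0
    stamina: float = 100.0
    battery: float = 100.0
    flashlight_on: bool = False
    regenerating: bool = False  # True while stamina is refilling

    # Hypothetical rates in units per second (not taken from the prototype).
    HUNGER_DRAIN = 0.5
    STAMINA_DRAIN = 10.0
    STAMINA_REGEN = 5.0
    BATTERY_DRAIN = 2.0

    def can_sprint(self) -> bool:
        # Assumption: sprinting is blocked for as long as stamina is regenerating.
        return not self.regenerating and self.stamina > 0.0

    def tick(self, dt: float, wants_sprint: bool) -> None:
        """Advance all stats by dt seconds."""
        self.hunger = clamp(self.hunger - self.HUNGER_DRAIN * dt)
        if wants_sprint and self.can_sprint():
            self.stamina = clamp(self.stamina - self.STAMINA_DRAIN * dt)
        elif self.stamina < 100.0:
            self.regenerating = True
            self.stamina = clamp(self.stamina + self.STAMINA_REGEN * dt)
            if self.stamina >= 100.0:
                self.regenerating = False
        if self.flashlight_on:
            self.battery = clamp(self.battery - self.BATTERY_DRAIN * dt)
            if self.battery <= 0.0:
                self.flashlight_on = False  # an empty battery switches the light off

    def toggle_flashlight(self) -> bool:
        """Toggle the flashlight; refuse to switch on with an empty battery."""
        if not self.flashlight_on and self.battery <= 0.0:
            return False
        self.flashlight_on = not self.flashlight_on
        return self.flashlight_on

    def eat(self, amount: float) -> None:
        """Food pickup: restore hunger by the given amount."""
        self.hunger = clamp(self.hunger + amount)

    def pick_up_battery(self, amount: float = 100.0) -> None:
        """Battery pickup: refill the flashlight battery."""
        self.battery = clamp(self.battery + amount)
```

A game loop would call tick once per frame with the elapsed time, and the UI bars
would simply read the three values.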
There is a dynamic objectives icon, meaning that it changes and is updated according
to the player’s actions. If the player collides with a new objective, it is added
and shown on the UI. If the player completes an objective, an “objective complete”
message is shown.
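The objective behavior can be illustrated with a small tracker. The sketch below is
an assumption about how such a system might be organized, not the prototype's
implementation; the class ObjectiveTracker, its method names, and the message
strings are hypothetical.

```python
class ObjectiveTracker:
    """Tracks active objectives and the messages the UI should display."""

    def __init__(self):
        self.active = []      # objectives currently shown on the UI
        self.completed = []   # objectives already finished
        self.messages = []    # stand-in for on-screen UI text

    def on_objective_trigger(self, name):
        """Called when the player collides with an objective trigger."""
        if name in self.active or name in self.completed:
            return  # ignore objectives that were already seen
        self.active.append(name)
        self.messages.append(f"New objective: {name}")

    def complete(self, name):
        """Called when the player fulfils an active objective."""
        if name not in self.active:
            return
        self.active.remove(name)
        self.completed.append(name)
        self.messages.append(f"Objective complete: {name}")
```

Keeping the messages in a list separates the game logic from the widget that
renders them, so the same tracker can be tested without a user interface.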
The screenshot shows the use of the interaction and inspection system in the
prototype. This is implemented by creating a Line Trace that detects whether there
are any inspectable items for the player to pick up. A message is shown to the
player if the player is looking at an inspectable item.
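The idea behind a line trace can be sketched outside the engine as a ray cast from
the camera along the view direction. The example below is a simplified stand-in for
the engine feature, assuming each item is approximated by a bounding sphere; the
Item class, the line_trace function, and the default trace distance are hypothetical
and are not Unreal Engine APIs.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Item:
    name: str
    center: Vec3
    radius: float            # bounding sphere used as a simple collision shape
    inspectable: bool = True


def line_trace(origin: Vec3, direction: Vec3, items: Sequence[Item],
               max_distance: float = 300.0) -> Optional[Item]:
    """Return the nearest inspectable item hit by the ray, or None.

    Simplification: non-inspectable items are skipped and do not block the ray.
    """
    length = math.sqrt(sum(c * c for c in direction))
    d = tuple(c / length for c in direction)  # normalise the view direction
    best, best_t = None, max_distance
    for item in items:
        if not item.inspectable:
            continue
        # Vector from the ray origin to the sphere centre.
        oc = tuple(c - o for c, o in zip(item.center, origin))
        t_closest = sum(a * b for a, b in zip(oc, d))  # projection onto the ray
        if t_closest < 0:
            continue  # the item is behind the player
        dist_sq = sum(a * a for a in oc) - t_closest ** 2
        if dist_sq > item.radius ** 2:
            continue  # the ray passes outside the sphere
        t_hit = t_closest - math.sqrt(item.radius ** 2 - dist_sq)
        if t_hit <= best_t:
            best, best_t = item, t_hit
    return best
```

The UI message would be shown whenever the function returns an item and hidden when
it returns None.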
Items are created as children of a parent item and share its main characteristics,
including being picked up and inspected by the player. In the screenshot, the player
is looking at the parent item, and a message shows the player that (s)he can inspect
the item.
When the player picks up the inspectable item, a message is shown explaining that a
left click allows item inspection while a right click drops the item on the floor,
where it is affected by physics.
When the player left clicks to inspect the item, (s)he can rotate the item by left
clicking and moving the mouse. There is also an option to zoom by scrolling the
mouse wheel. The item description is visible on the side. The description is dynamic
and changes according to the item being held. This is scripted inside the script
graph of the inspection UI. The player can exit inspection mode by right clicking
twice: first to hold the item and a second time to drop it.
The user interface has been designed using Unreal Engine widgets, as shown in
Figure 3.8.
Figure 3.6 shows the blueprints for an AI going toward the player. This is done by
creating a trigger box and placing it in the scene at the desired location. A
trigger box is a box collision that is invisible to the player in game. It is used
to trigger events when something overlaps with it. In Figure 3.7, the box is
triggered by the player character by casting to (calling) the player’s class. From
this class, the location of the player is stored in the blueprint and used for
playing a sound effect. This is done by adding a play sound at location node and
connecting the player’s location to the location input of the node. Then the AI
actor is stored in the blueprint by getting all the actors from the AI class. This
enables the developer to access any actor in the specified class, as shown in
Figures 3.9 and 3.10. The actors are then stored as a copy and the AI actor is set
to visible. An AI move to node is then added. This node enables the AI to move
toward the desired location. Its inputs are as follows. The pawn input is connected
to the object reference, specifying which actor or which AI should be moved. The
location input is a vector variable representing the desired destination the AI
should move to. The target actor input is also an object or actor reference;
however, this input specifies an actor that the AI should move toward. Acceptance
radius is a float variable that specifies how close the AI must get to the desired
destination for the move to count as a success. The last input is stop on overlap, a
Boolean variable that specifies whether or not the AI will stop moving when
overlapping or colliding with something. The node has three output execution exits:
if the first is connected, any nodes or events that follow will run regardless of
whether the AI move node succeeded in reaching the desired location. The second exit
is for the success of the node, and the third is for the failure. The final output
is the movement result, which stores the result of the path followed for later use.
After the success of the AI move node, the AI actor and the trigger box are
destroyed to ensure that the player does not trigger the event again and to avoid
any bugs. The overall blueprint of the game is shown in Figures 3.11 and 3.12.
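The event flow described above (trigger overlap, sound, reveal the AI, move to the
player, clean up on success) can be mirrored in plain Python. This is a
two-dimensional sketch under simplifying assumptions, not a translation of the
Blueprint: the classes TriggerBox and AIActor, the functions ai_move_to and
on_player_moved, the step-based movement, and the step limit used to signal failure
are all hypothetical, and the stop on overlap input is omitted.

```python
import math
from dataclasses import dataclass
from enum import Enum


class MoveResult(Enum):
    SUCCESS = "success"
    FAILED = "failed"


@dataclass
class TriggerBox:
    min_x: float
    min_y: float
    max_x: float
    max_y: float
    active: bool = True

    def overlaps(self, x: float, y: float) -> bool:
        return (self.active and self.min_x <= x <= self.max_x
                and self.min_y <= y <= self.max_y)


@dataclass
class AIActor:
    x: float
    y: float
    speed: float = 1.0      # distance covered per step (hypothetical value)
    visible: bool = False
    alive: bool = True


def ai_move_to(ai, dest, acceptance_radius=5.0, max_steps=1000):
    """Step the AI toward dest; succeed once it is within acceptance_radius."""
    for _ in range(max_steps):
        dx, dy = dest[0] - ai.x, dest[1] - ai.y
        dist = math.hypot(dx, dy)
        if dist <= acceptance_radius:
            return MoveResult.SUCCESS
        step = min(ai.speed, dist)
        ai.x += dx / dist * step
        ai.y += dy / dist * step
    return MoveResult.FAILED  # destination not reached within the step limit


def on_player_moved(player_pos, box, ai, log):
    """Event handler mirroring the described flow; returns None if not triggered."""
    if not box.overlaps(*player_pos):
        return None
    log.append(f"play sound at {player_pos}")  # stand-in for the sound node
    ai.visible = True
    result = ai_move_to(ai, player_pos)
    if result is MoveResult.SUCCESS:
        ai.alive = False     # destroy the AI actor
        box.active = False   # destroy the trigger so the event cannot repeat
    return result
```

Deactivating the trigger after a successful move reproduces the safeguard described
above: a second overlap no longer starts the event.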
strategies in order to gain the highest scores. The author has conducted a study in
which a Deep Q-learning algorithm scored 50 points after just five minutes of
training on the game Snake. The Snake game was developed using Python and Pygame, a
library that allows developers to create simple games using Python. At first the AI
agent did not know how to play the game as it had not been trained yet; however,
after 100 iterations, which is equivalent to five minutes of playing, the agent
started to score points, and after 200 iterations, the agent scored 83 points.
The method of reinforcement learning is based on the Markov Decision Process. This
process provides a mathematical framework for modeling decision-making in which
outcomes are partly random and partly under the control of the decision-maker. The
author has used Deep Q-learning instead of supervised ML because in supervised ML
the agent is trained with targeted inputs and outputs. The output is the correct
answer, which must be predicted from the inputs provided. This method is not
effective here because it does not show which action is the best to take in the game
to achieve the highest score.
When using reinforcement learning, two components are present: the game, which
represents the environment, and the snake, which represents the AI agent whose deep
neural network makes the snake’s decisions in game. When the agent performs an
action, the environment provides a positive or negative reward that depends on how
beneficial that specific action was in that game state. The agent’s objective is
then to learn the strategy of actions that provides the best rewards in each game
state. The game state is the set of observations received by the agent throughout
different iterations in the game environment. It can comprise variables such as
position and speed. The agent’s decision-making process for determining the best
strategy of actions in the game is called a policy in reinforcement learning.
The AI agent makes decisions according to the Q-table, a matrix that
associates every state the agent can be in with all the possible actions it can take
in that state. Values in the table reflect how likely each action is to succeed
and the reward it is expected to bring. However, the author
argues that representing the policy as a table is a problem in reinforcement
learning. The table constrains the state space of the environment and
makes it difficult to handle a large number of different states. Therefore, Deep
Q-learning is preferred because it represents the policy in a deep neural network
whose values are updated by applying the Bellman equation.
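As a minimal sketch of the update described above, the following Python snippet applies the Bellman equation to a single Q-value. The function name `bellman_update`, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions and are not taken from the author’s study.

```python
def bellman_update(q_old, reward, next_q_values, alpha=0.1, gamma=0.9):
    """Return the updated Q-value for one (state, action) pair.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    # Best value reachable from the next state; 0 if the episode has ended.
    best_next = max(next_q_values) if next_q_values else 0.0
    # Bellman target: immediate reward plus discounted future value.
    target = reward + gamma * best_next
    # Move the old estimate a fraction alpha toward the target.
    return q_old + alpha * (target - q_old)


updated = bellman_update(q_old=0.0, reward=10.0, next_q_values=[1.0, 2.0, 0.5])
terminal = bellman_update(q_old=5.0, reward=-10.0, next_q_values=[])
```

In Deep Q-learning the same target is typically used as the training label for the neural network instead of being written into a table cell.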
The algorithm for the snake agent is as follows. First, the game
starts and the Q-values are initialized randomly. Second, the system observes the
current state. Third, an action is taken according to the observed state. The action
can be random at first to allow exploration and maximize learning, so that
later the agent can rely on its neural network. Finally, the agent receives a reward
from the environment according to the action taken and how it affects the current
state. The action produces a new state, and the Q-value is then updated using the
Bellman equation. The original state, action, reward,
and new state are all stored for training the neural network, in
a method called Replay Memory. These operations repeat until the game ends or
another condition is met.
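The exploration step and the Replay Memory described above can be sketched as follows. This is an illustrative outline only: the class name `ReplayMemory`, the function `choose_action`, the buffer capacity, and the use of an epsilon-greedy rule are assumptions, not details confirmed by the author’s study.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=1000):
        # A deque with maxlen silently discards the oldest experience when full.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, capped at the number of stored items.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def choose_action(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


memory = ReplayMemory(capacity=3)
for step in range(5):
    memory.store(state=step, action=0, reward=1.0, next_state=step + 1, done=False)
batch = memory.sample(2)
greedy_choice = choose_action([0.1, 0.9, 0.3], epsilon=0.0)
```

Lowering `epsilon` over time would shift the agent from random exploration toward relying on its learned values.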
A state represents the agent’s situation and serves as the neural network’s
input. In the author’s study, the state in the snake game was given as an array of
eleven Boolean variables. These Booleans indicate whether
or not the agent is close to danger, the direction of the agent’s movement, and the
position of the food.
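One possible encoding of such a state is sketched below. The split into three danger flags, four direction flags, and four food flags is an assumption made for illustration; the text above only states that there are eleven Booleans covering danger, direction, and food position. The function name `encode_state` and the grid representation are likewise hypothetical.

```python
def encode_state(head, direction, food, obstacles, grid_size):
    """Return 11 Booleans: 3 danger flags, 4 direction flags, 4 food flags."""
    # Directions are (dx, dy) with y growing downward, as on a Pygame screen.
    order = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, down, left, up (clockwise)
    i = order.index(direction)
    straight, right, left = order[i], order[(i + 1) % 4], order[(i - 1) % 4]

    def is_danger(step):
        # A cell is dangerous if it is off the grid or occupied by an obstacle.
        x, y = head[0] + step[0], head[1] + step[1]
        off_grid = not (0 <= x < grid_size and 0 <= y < grid_size)
        return off_grid or (x, y) in obstacles

    return [
        is_danger(straight), is_danger(right), is_danger(left),
        direction == (-1, 0), direction == (1, 0),   # moving left, moving right
        direction == (0, -1), direction == (0, 1),   # moving up, moving down
        food[0] < head[0], food[0] > head[0],        # food to the left, right
        food[1] < head[1], food[1] > head[1],        # food above, below
    ]


state = encode_state(head=(0, 5), direction=(-1, 0), food=(3, 2),
                     obstacles=set(), grid_size=10)
```

Here `obstacles` would hold the snake’s own body cells, so that collisions with itself and with the walls are both reported as danger.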
The agent should try to collect as many positive rewards and as few negative
rewards as possible. In the snake game, a positive reward is
given by adding ten points to the score every time the snake (AI agent) eats a fruit,
while a negative reward is given by removing ten points from the score every time
the snake (AI agent) hits itself or a wall. Additional points can be added to the score
for each move the snake makes while avoiding death. This affects the AI’s
decision-making process, since moving in a certain way can earn additional
rewards. This shows that by applying reinforcement learning, an AI can outsmart
human players by exploiting flaws in opponents’ strategies to come up with actions that
are unanticipated.
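The reward scheme above can be written as a small function. The values of plus and minus ten come from the text; the function name `step_reward` and the default survival bonus of zero are assumptions for illustration.

```python
def step_reward(ate_fruit, died, survival_bonus=0.0):
    """Reward for one move: +10 for fruit, -10 for death, optional bonus otherwise."""
    if died:
        return -10.0
    # The optional bonus rewards each move on which the snake stays alive.
    reward = survival_bonus
    if ate_fruit:
        reward += 10.0
    return reward


fruit_reward = step_reward(ate_fruit=True, died=False)
death_reward = step_reward(ate_fruit=False, died=True)
idle_reward = step_reward(ate_fruit=False, died=False, survival_bonus=0.1)
```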
The author’s study shows that a simple AI agent can learn to
understand the mechanics of an environment, make decisions that gain positive rewards and
avoid negative rewards without being informed of the rules, and respond in
unanticipated ways by exploiting flaws in opponents’ strategies.
lead to negative rewards or no rewards at all in the future. This process drives
the AI agent’s decision making during the second stage, which is exploitation
(sacredgames, 2017).
The author has given the equation below for Q-learning:
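For reference, the standard textbook form of the Q-learning update is sketched here. The notation is an assumption and may differ from the author’s: $s_t$ and $a_t$ are the current state and action, $r_{t+1}$ is the reward received, $\alpha$ is the learning rate, and $\gamma$ is the discount factor.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```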
An example built in Unreal Engine shows the use of Q-learning on an
AI agent. The agent, or non-player character, starts the training stage by
moving at random between four locations in the environment: three bowls
containing food and a light switch. Each location has its own associated value,
and during the training stage the AI agent learns these values as well as the
associated behavior (how to get food from the bowls and how the light switch
affects the availability of food).
The desired outcome is for the agent to start making decisions and behave with
intention according to the values and behaviors observed during the
training stage. The agent should identify that food can be obtained from the
three bowls only if the light switch is turned on. The AI agent is therefore expected to
go to the light switch first and then go to the food bowls.
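A toy version of this scenario can be sketched with tabular Q-learning in plain Python. This is not the Unreal Engine implementation: the two-state model (light off or on), the reward of 1 for reaching food, and the parameters `alpha`, `gamma`, and `episodes` are all assumptions chosen for illustration.

```python
import random

SWITCH, BOWLS = 0, (1, 2, 3)      # the four locations the agent can visit
LIGHT_OFF, LIGHT_ON = 0, 1        # the only state variable in this toy model


def step(state, action):
    """Return (next_state, reward, done) for visiting one location."""
    if action == SWITCH:
        return LIGHT_ON, 0.0, False
    if state == LIGHT_ON:
        return state, 1.0, True    # food is available only with the light on
    return state, 0.0, False       # bowl visited in the dark: nothing happens


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * 4 for _ in range(2)]          # q[state][action]
    for _episode in range(episodes):
        state = LIGHT_OFF
        for _move in range(50):
            action = rng.randrange(4)          # training stage: move at random
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy(q, state):
    """Exploitation stage: pick the action with the highest learned value."""
    return max(range(4), key=lambda a: q[state][a])


q = train()
plan = [greedy(q, LIGHT_OFF), greedy(q, LIGHT_ON)]
```

After training, the greedy policy visits the switch while the light is off and a bowl once it is on, which mirrors the intentional behavior described above.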
The Q-learning algorithm gives an AI agent realistic, lifelike, and dynamic
behavior by imitating the drive behind the carrot-and-stick example. The agent’s
decisions are based on the rewards available in the environment, which are stored in a
table containing all the possible rewards, both positive and negative. The algorithm can be
used to create various intentional behaviors that enhance gameplay through more lifelike
non-player characters.
3.7 CONCLUSION
AI has become an integral part of game development. This chapter focuses on the use
of AI in 3D narrative games to enhance the player’s experience and increase
immersion. It proposes using AI or ML to create immersive, complex,
and dynamic game narratives that change according to the player’s play pattern
and allow the player to have more freedom and make interesting choices. Various
approaches have been studied in this chapter, including approaches for narrative
game design, enhancing the player’s experience through the MDA model,
and the use of AI in creating dynamic video games. It has been found that
various types of AI are used in game development. These are usually built
on a rule-based system that conditions the AI to behave in certain ways
in response to different scenarios, but sometimes ML is used to allow the AI to
learn and adapt to the player’s patterns. However, supervised ML is not appropriate
for narrative games. Therefore, this study recommends the use of unsupervised or
reinforcement learning algorithms in developing narrative games. This keeps the player
in a controlled environment while allowing the AI to learn and record
the player’s patterns and respond accordingly, creating an enjoyable and immersive
playing experience.
REFERENCES
Aarseth, E. (2012) ‘A narrative theory of games’, in Proceedings of the International
Conference on the Foundations of Digital Games, pp. 129–133. doi: 10.1145/2282338.2282365.
Afram, R. (2013) ‘Puzzle Design in Adventure Games’, Degree Project in Game Design,
15 ECTS Credits, Game Design and Programming, Spring 2013.
Anderton, K. (2018) The Impact Of Gaming: A Benefit To Society [Infographic], Forbes.
Baumgarten, R., Colton, S. and Morris, M. (2009) ‘Combining AI methods for learning bots
in a real-time strategy game’, International Journal of Computer Games Technology,
2009, pp. 1–10. doi: 10.1155/2009/129075.
Boonen, C. S. and Mieritz, D. (2018) ‘Paralysing fear: player agency parameters in Horror
games’, in Proceedings of Nordic DiGRA 2018.
Brown, A. D., Stacey, P. and Nandhakumar, J. (2008) ‘Making sense of sensemaking
narratives’, Human Relations, 61(8), pp. 1035–1062. doi: 10.1177/0018726708094858.
Butler, M. (2016) Why People Play Horror Games, iNews.
Bycer, J. (2012) Extreme Storytelling: The Use of Narrative Mechanics, Gamasutra.
Bycer, J. (2019) The Problem of Modern Horror Game Design - Game Wisdom, Game
Wisdom.
Carlquist, J. (2002) ‘Playing the Story Computer Games as a Narrative Genre’, Human IT:
Journal for Information Technology Studies as a Human Science, 6(3), pp. 7–53.
Clempner, J. B. (2009) ‘A Shortest-path Lyapunov Approach for Forward Decision Processes’,
International Journal of Computer Games Technology, Volume 2009, pp. 1–12. doi:
10.1155/2009/162450.
Coleman, R. (2009) ‘Fractal Analysis of Stealthy Pathfinding Aesthetics’, International
Journal of Computer Games Technology, Volume 2009, pp. 1–7. doi: 10.1155/
2009/670459.
Comi, M. (2018) ‘How to Teach AI to Play Games: Deep Reinforcement Learning’, Towards
Data Science. Available at:
https://towardsdatascience.com/how-to-teach-an-ai-to-play-games-deep-reinforcement-learning-28f9b920440a
(Accessed: 7 March 2021).
El Rhalibi, A., Wong, K. W. and Price, M. (2009) ‘Artificial intelligence for computer games’,
International Journal of Computer Games Technology, Volume 2009, pp. 1–3. doi: 10.1155/
2009/251652.
Riedl, M. O. (2012) ‘Interactive Narrative: A Novel Application of Artificial Intelligence
for Computer Games’, in Proceedings of
the Twenty-Sixth AAAI Conference on Artificial Intelligence. Atlanta, Georgia, USA:
Georgia Institute of Technology, pp. 2160–2165.
Rouse, R. (2009) ‘Match Made in Hell: The Inevitable Success of the Horror Genre in Video
Games’, in Perron, B. (ed.) Horror Video Games Essays on the Fusion of Fear and
Play. Jefferson, North Carolina: McFarland & Company, Inc., pp. 15–25.
sacredgames (2017) ‘Reinforcement Learning: Q-Algorithm in a Match to Sample Task –
Machine Learning In Unreal Engine’, sacredgames, 19 December. Available at:
https://unrealai.wordpress.com/2017/12/19/q-learning/ (Accessed: 16 March 2021).
Safadi, F., Fonteneau, R. and Ernst, D. (2015) ‘Artificial Intelligence in Video Games: Towards
a Unified Framework’, International Journal of Computer Games Technology, 2015,
pp. 1–30. doi: 10.1155/2015/271296.
Sikali, K. (2020) ‘The Dangers of Social Distancing: How COVID-19 can Reshape our Social
Experience’, Journal of Community Psychology, 48(8), pp. 2435–2438. doi: 10.1002/
jcop.22430.
Tanskanen, S. (2018) ‘Player Immersion in Video Games Designing an Immersive Game
Project’, p. 75.
4 Review on Using
Artificial Intelligence
Related Deep Learning
Techniques in Gaming
and Recent Networks
Mujahid Tabassum, Sundresan Perumal,
Hadi Nabipour Afrouzi, Saad Bin Abdul
Kashem, and Waqar Hassan
CONTENTS
4.1 Introduction
4.2 Internet of Things (IoT)
4.3 IoT Infrastructure
4.3.1 Components
4.3.1.1 Identification Block
4.3.1.2 Sensing Block
4.3.1.3 Communication Block
4.3.1.4 Computation Block
4.3.1.5 Service Block and Semantics Block
4.3.2 Protocols
4.3.3 Applications
4.3.3.1 Home Automation
4.3.3.2 Smart Agriculture
4.3.3.3 eHealth
4.3.3.4 Logistics
4.4 Deep Learning in Gaming and Animation
4.4.1 MotionScan Technology
4.4.2 Framework (Architectural Model)
4.4.3 Appearance Model
4.5 IoT and 5G Technology
4.6 Artificial Intelligence
4.6.1 Supervised Learning
4.6.1.1 Regression
4.6.1.2 Classification
DOI: 10.1201/9781003231530-4
4.1 INTRODUCTION
Internet technology has brought various new concepts and opportunities
to life. Today the Internet of Things (IoT) has become a fast-growing
industry in developed and developing countries. Thanks to Internet availability, we can
instantly access real-world devices and digital information anywhere at any time. With
the Internet’s help, most of the things we use every day stay connected to us while we are
away from home, for example watches, home TVs, child monitors, house monitoring
systems, air conditioners, cars, and many more. The IoT industry’s rapid growth
therefore requires reliable IoT applications, protocols, and platforms to support future
industries such as smart cities, smart homes, smart appliances, gaming, e-health, logistics,
and intelligent farming.
IoT can be described as a network of neighboring things that connect via the Internet
to share collected information without requiring human-to-human or
human-to-computer interaction. It is a network of many tiny sensor devices that
sense and collect various kinds of information, depending on their characteristics, and
transmit it to the respective user via a wireless medium. Environmental and physical
parameters are sensed and monitored so that pre-defined actions can be taken. Users can
access live data and are instantly notified so that they can take appropriate measures.
IoT networks offer several benefits to various industries. According to an IDC
forecast, there will be 41.6 billion IoT devices connected over the Internet by 2025,
generating approximately 79.4 zettabytes (ZB) of data [1].
IoT systems include numerous business applications that offer many benefits to
humankind. The new Industrial IoT and the fourth industrial revolution (Industry 4.0)
give designers and implementers versatility and increase their decision-making
abilities through developments in IoT and machine learning (ML). Besides ML, cloud
computing services provide further benefits to companies and individuals in terms of
usability and productivity.
Artificial intelligence (AI) and ML techniques facilitate communication
between systems, allowing them to make their own autonomous choices [2]. A simple
IoT network can be used to offer a self-optimizing network service. Such an individually
designed network is expected to be configured for massive data transmission and
reception between various autonomous devices, with freedom in time and channel. The
connected devices determine and calculate the shortest path for successful
transmission. AI makes these devices more efficient and helps them communicate with
each other effectively.
In the field of gaming, opponents of human players often have advanced AI
involving genetic algorithms (GA). Strategies used in the past are programmed
into the AI so that it can learn and develop from past experience. Learning
techniques allow the AI to avoid repeating past mistakes, which enhances the game. This
allows for a more realistic experience for human players, as they need to change their
strategy over time. It also helps to avoid situations where the human player finds a
sequence of moves that always leads to success, meaning that the game is no longer
a challenge. A GA requires a genetic representation of candidate solutions in the
problem domain and a fitness function that determines the quality of each candidate. The
fitness function evaluates an individual candidate and measures its properties,
and it is tailored to the problem domain. In many cases, especially code
optimization, the fitness function can simply be a system timing function.
Once the genetic representation and fitness function are determined, the GA creates an
initial population of candidates and then repeatedly applies selection,
crossover, and mutation operators to increase the candidates’ fitness values.
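The GA loop described above can be sketched in a few lines of Python. The toy problem (maximizing the number of 1 bits in a bit string), the tournament selection scheme, the elitism step, and all parameter values are assumptions chosen for illustration; a game AI would substitute its own representation and fitness function.

```python
import random


def fitness(bits):
    """Toy fitness: the number of 1 bits (higher is better)."""
    return sum(bits)


def evolve(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    # Initial population of random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _generation in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]            # elitism: keep the best unchanged
        while len(next_gen) < pop_size:
            # Selection: the fitter of three random candidates becomes a parent.
            mother = max(rng.sample(population, 3), key=fitness)
            father = max(rng.sample(population, 3), key=fitness)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = evolve()
```

Because the best candidate is always carried over unchanged, the best fitness recorded in `history` never decreases from one generation to the next.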
This chapter discusses the involvement of deep learning (DL) in IoT networks,
covering components, applications, protocols, gaming, and expectations of the 5G network.
We discuss how various ML and DL algorithms and concepts enhance the productivity
and efficiency of IoT networks. The last part discusses future trends
and challenges.
4.3 IoT INFRASTRUCTURE
4.3.1 Components
IoT networks usually combine a large number of devices
connected worldwide. IoT technology connects smart devices, gateways, and data
networks via cloud computing and applications. These intelligent devices are mostly
located at various distances from each other in multiple scenarios and are controlled by a
centralized management system that processes and saves the data or information. The entire
IoT infrastructure is made up of various elements, blocks, components, and protocols.
IoT components consist of a sensing unit, a communication unit, a computation unit, an
Internet (connectivity) unit, and appropriate protocols and services. In addition to
features and application capabilities, IoT communication protocols,
computational processing speed, and cloud services define the strengths and
limitations of IoT platforms [5]. The growing demand for and involvement of IoT in recent
industries signal the need to improve interoperability between various applications and
services according to user requirements. The IoT infrastructure model consists of six blocks [5]:
with each other, especially with the upper system that handles the collected data. A
gateway or bridge is used when connected devices cannot adequately
communicate with other systems via a specific protocol. A gateway is therefore
used to communicate among various networks via communication protocols. IoT
devices use several communication technologies, such as WiFi, ZigBee, Near Field
Communication (NFC), Bluetooth Low-Energy (BLE), Long Term Evolution (LTE),
LoRa, SigFox, and NarrowBand (NB)-IoT, to connect to the Internet. In
IoT networks, identification methods are considered better than authentication in
terms of performance gains. Across the different recognition technologies, each entity is
made unique through its identification [6]. Several communication protocols are used
in IoT networks, such as the Constrained Application Protocol (CoAP) and Message
Queuing Telemetry Transport (MQTT), to connect IoT objects with the data management
system.
4.3.2 Protocols
IoT protocols can be divided into two basic types: IoT network protocols,
which manage network traffic, and IoT data protocols, which manage
data availability. An IoT network is not simply a machine-to-machine (M2M)
system, because in M2M communication the devices communicate with each other
directly. In M2M communication, devices such as sensors, actuators, and embedded
systems capture or sense data and share it with other connected devices. For
example, electrical appliances such as bulbs, air conditioners, fridges, and fans can be
controlled over RF or Bluetooth using smartphones or remotes. The electrical
appliance and the smartphone are then two different machines interacting
with each other. In IoT networks, however, physical devices are embedded with
sensors, software, and electronics and communicate through a cloud-based or
Internet-related network. IoT networks are about sensor automation and an Internet platform [7].
Figure 4.1 shows the conceptual difference between M2M and IoT.
In an IoT network, most parts of the system need to be configured, maintained, and
monitored to offer the expected services and data management [7].
CoAP is a communication protocol designed mainly for IoT networks and
low-power devices. It allows connected devices to communicate among themselves
and with the Internet using similar protocols. It generates lightweight traffic and is
intended for resource-constrained Internet devices such as sensor nodes. It is a
service-layer protocol that offers one-to-one communication like the Hypertext Transfer
Protocol (HTTP), but it was designed to run over the User Datagram Protocol
(UDP) over IP rather than over TCP, which makes it more efficient than HTTP. It uses
fewer resources than HTTP and supports observation, execution, discovery, reading, and
writing of resources. CoAP is a good choice for applications based on web services [8].
MQTT is another communication protocol used in IoT networks. It was
developed in 1999 by Arlen Nipper (Arcom) and Andy Stanford-Clark (IBM) to
collect data from various electrical devices. It is mostly used for remote
monitoring in IoT networks [8]. It is implemented over TCP/IP as a lightweight
communication protocol and works on a hub-and-spoke architecture. Devices
correspond through a message broker server rather than directly, so MQTT does not
form a direct M2M communication platform. It uses three elements: a subscriber, a
publisher, and a broker. MQTT supports Secure Sockets Layer (SSL) and Transport Layer
Security (TLS) for security services. In WAN-based IoT networks, MQTT is
considered the better choice because of its broker concept. The broker is the central point
of contact between sensor devices. MQTT is a popular choice for IoT devices
because it provides ample routing functions for cheap, memory-constrained,
small devices on low-bandwidth networks. It is useful on poor-bandwidth
networks, especially across numerous remote locations. For this reason, Amazon
and Microsoft Azure use the MQTT protocol in their services.
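The publisher, subscriber, and broker roles can be illustrated with a small in-memory sketch. This is not the MQTT protocol itself: there is no network transport, quality-of-service level, or topic wildcard, and the class name `Broker`, its methods, and the topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Central hub that forwards each published message to topic subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)    # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Publishers never address subscribers directly; the broker routes.
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)                    # number of deliveries made


received = []
broker = Broker()
broker.subscribe("home/livingroom/temperature",
                 lambda topic, payload: received.append((topic, payload)))
delivered = broker.publish("home/livingroom/temperature", "21.5")
ignored = broker.publish("home/kitchen/humidity", "40")
```

The sensor publishing the temperature needs to know only the broker and the topic name, which is what keeps the devices at the spokes simple.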
Bluetooth is one of the most widely used short-distance wireless technologies in
IoT networks. Users can get Bluetooth applications that let wearable technology
connect to smart devices quickly. One of the more recently developed Bluetooth protocols
for IoT networks is the Bluetooth Low-Energy (BLE) protocol, shown in
Figure 4.2. It offers the same services as Bluetooth with lower power consumption.
However, BLE is not designed to transfer large files and is mainly suitable for
small amounts of data [9].
WiFi is another well-known and favored IoT network protocol, accepted
by many industries because it offers fast data transfer and handles large amounts of data.
The WiFi 802.11 family of standards allows users to transmit hundreds of megabits
per second. However, WiFi consumes a considerable amount of battery
power, which is one of its drawbacks. It operates in the 2.4 GHz and 5 GHz frequency
bands [9, 10].
ZigBee is another protocol, designed mainly for industrial IoT use
rather than for consumers. It operates mostly in the 2.4 GHz frequency band with
a lower data transmission rate [9, 10].
The Data Distribution Service (DDS) is another standard protocol for
high-efficiency, scalable machine-to-machine communication.
It was developed by the Object Management Group. Users can transfer data on both
small-scale devices and cloud services using DDS. It uses two important layers:
Data-Centric Publish-Subscribe (DCPS) and the Data Local Reconstruction Layer
(DLRL). DCPS delivers information to the subscribers, and DLRL offers
an interface to DCPS [9].
NFC is another IoT-related technology, mostly used in recent mobile
phones. It allows users to connect with electronic devices, use digital content, and
perform contactless payment transactions. It mainly emphasizes contactless
communication between electronic devices. It has a very short range of about 4 cm
between the two electronic devices [9].
Many IoT applications need to provide service over a longer distance. Cellular
networking technologies such as 4G/5G can be used in such IoT
applications. Cellular is an IoT communication protocol that can transmit a large
quantity of data over long distances. However, users must consider the cost
of the cellular network, which can be quite expensive, and its higher power consumption.
For this purpose, SparqEE has introduced a cellular kit named CELLv1.0, which
can be used on Arduino and Raspberry Pi platforms to get the cellular technology
4.3.3 Applications
IoT is still an emerging technology, but it already plays an important role in many
industries [13] and has helped to enhance numerous applications that
have changed our lives in many ways. The IoT-I project’s 2010 survey described 65 IoT
scenarios covering 12 areas: travel, smart home, smart city, lifestyle, gaming,
agriculture, supply chains, emergency services, public health care, user engagement, culture
and tourism, climate, and environment [14]. Results from the IoT Analytics web portal show
the growing number of IoT projects in various regions worldwide [15]. Figure 4.4
shows the ever-increasing involvement of IoT in multiple industries.
Smart City and Industrial settings lead in their heavy usage of IoT-related networks.
4.3.3.3 eHealth
The health industry is also benefiting from the use of IoT networks.
Because of the ubiquitous capabilities of IoT networks, all connected devices can be tracked
and monitored to collect relevant live data [19]. Using wireless sensor networks (WSNs),
patients’ live medical data can be collected anywhere and at any time, which helps to save
lives and to understand variable health parameters. Several AI-based smart applications,
such as live health monitoring, fall detection, diet monitoring, body mass index (BMI)
monitoring, and blood pressure and heartbeat monitoring, help older people and disabled
people to live independently. With the help of these smart devices and applications, doctors
can continuously and effectively monitor patients’ health over long distances.
The growing proportion of older people has created many healthcare problems today. For
instance, some countries have old-age homes and rehabilitation centers that care for
elderly or sick people and offer health services. IoT networks help in these settings
to continuously control and monitor patients’ health parameters [13].
The Google Health application was released in 2008 as a personal health monitoring
application. It allows users to share their personal health information
with health service providers voluntarily. The application takes input
from users or monitors the user’s health and generates a complete report on their health
record. Later, the application was upgraded so that the user’s records are synchronized
to the cloud for further analysis. More recently, various patient monitoring and
hospital-oriented systems have been designed and developed, and significant changes have
been made to incorporate multidisciplinary fields of information [20].
4.3.3.4 Logistics
Like other fields, the transport and logistics industries are also taking advantage of
IoT systems. IoT helps the transportation industry improve efficiency and
precision across the entire supply chain through real-time monitoring and tracking of
object movements from source to destination using attached RFID tags. In addition, IoT
provides promising solutions for transforming transport and can predict vehicle
movements in car parks or on the road. For example, BMW, Honda, and other vehicle
manufacturers use several sensors and tags to monitor the environment and provide
drivers with driving directions through an intelligent computer system. Moreover, other
transportation and logistics operations such as route control, warning emission, and
track monitoring can also be enhanced by IoT [13]. Overall, IoT applications
can be grouped into 54 primary applications across 12 main domains, reflecting their
usability and involvement in many industries [15].
believable. However, anticipating every single way the player interacts with the
world is almost impossible.
Furthermore, the switch between animations can look clunky and canned.
Transitions between moves are usually handled by reusable algorithms that work in the
same way every time. Consider how a character could sit in a chair or place a box.
This is made more difficult because objects come in varying sizes. Picking up things
of various shapes and sizes or resting the arms on seats of various sizes may look
uncomfortable. A variety of moves should be considered and animated when moving
an object, including beginning to walk, slowing down, turning appropriately when
positioning the foot, and engaging with the object. The NSM, by contrast, produces
a wide range of high-quality moves and actions from a single network. The NSM has
been trained to transition from one movement to the next in a natural manner.
Based on the previous poses and the scene geometry, the network predicts the
character’s next pose.
Mo-cap is the process of recording people’s real-life movements with a cam-
era aimed at capturing those precise movements in a scene created by a computer
4.4.1 MotionScan Technology
Back in 2011, L.A. Noire introduced life-like facial animation that looked
ahead of every other game. Now, almost a decade later, few other games have
come close to that level of facial expression [22].
The life-like facial animations in L.A. Noire were created with MotionScan technology,
used by Rockstar Studios. Because this face-scanning technology
is very expensive and the captured animation files are very large,
most publishers do not use it
for their games. However, this may soon change due to recent advances in
mo-cap driven by deep learning.
In the following research work, the authors introduce a DL framework for creating
animations from a source image of a face that follows the movements of another
face in a driving video, similar to MotionScan technology. They propose a
self-supervised training method that can use unlabelled videos of a given object category to
learn the significant dynamics that define movement. They then show how those motion
dynamics can be combined with a static image to create motion videos [23].
The Motion Module has an encoder that learns a latent representation
containing sparse keypoints that are important to the motion of
the object, which in this case is the face. The motion of these keypoints across the
frames of the driving video creates a motion field, driven by a function that the
model has to learn. The authors use a first-order Taylor expansion to approximate this
function and build the motion field. According to the authors, this is
the first time a first-order approach has been used to model movement. In addition,
a dense motion field is created by combining the learned affine transformations of
these keypoints. The dense motion field predicts the motion of every pixel in the frame,
rather than focusing only on the keypoints of the sparse motion field. The Motion
Module also generates an occlusion map, highlighting the frame pixels that need
to be inpainted [24].
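As a hedged illustration of the first-order idea, the motion near each keypoint can be written as a first-order Taylor expansion. The notation below is an assumption and is not taken from the cited work: $T$ is the motion function mapping pixel locations between frames, $p_k$ is the $k$-th keypoint, and $J_k$ is the Jacobian of $T$ at $p_k$, which acts as a local affine transformation.

```latex
T(z) \approx T(p_k) + J_k \, (z - p_k),
\qquad
J_k = \left. \frac{\partial T}{\partial z} \right|_{z = p_k}
```

A zeroth-order model would keep only the first term, moving every pixel near a keypoint by the same offset; the Jacobian term additionally allows local rotation and scaling around the keypoint.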
4.4.3 Appearance Model
The Appearance Module uses an encoder to encode the source image, which is combined with
the motion field and occlusion map to animate the source image. A generator model
is used for this purpose. During self-supervised training, a still frame
from the driving video is used as the source image, and the learned motion field is
used to warp the source image. Since an actual video frame serves as the ground truth for
the generated result, the training is self-supervised. During the inference
phase, this source image can be replaced with another image of the same object
category, and it does not need to come from the driving video [24].
There are several DL applications in the current market such as Self Driving
Cars, Entertainment, Visual Recognition, Virtual Assistants, and Natural Language
Processing.
the market demand of IoT networks in the coming time, to stimulate new social
development and economic needs. The emerging conditions of potential IoT networks and
the advancement of 5G wireless technologies are two significant developments that
propel the 5G IoT [25]. At present, IoT networks use 3G and 4G networks extensively,
but these networks are not fully optimized for IoT applications. 4G refers to LTE,
which has significant capabilities to offer various Internet-based services for IoT
networks. 4G networks are faster and more reliable, and provide more consistent
services to users, than other technologies such as BLE, WiMAX, ZigBee, SigFox, LoRa,
and others [20]. However, with the proliferation of IoT networks and applications, 5G
networks are expected to offer fast and sustainable Internet services. Figure 4.7 shows
the mobile network evolution from 3G to the coming 5G-enabled IoT generation [25].
The new 5G networks are expected to operate exclusively on the 4G LTE core network
to offer data and Internet services at speeds of 10 Gbps while connecting thousands of
devices. A great deal of academic and industrial research is under way to understand
various aspects of IoT and 5G networks. The main objective is to enhance the
productivity of IoT networks using 5G technology. A 5G collaboration project involving
Cisco, Intel, and Verizon has unveiled a novel series of “neuroscience-based
algorithms” that adapt video quality to the demands of the human eye, indicating that
wireless networks are being integrated with human intelligence. 5G will contribute to
IoT success by connecting thousands of intelligent devices that interact without human
participation. 5G-IoT networks are expected to offer real-time, on-demand,
re-configurable services, which requires the 5G-IoT architecture to provide smart
operations at each phase. The 5G-IoT networks are expected to offer [25]:
between inputs and forecast objective outputs. It is used in many applications to solve
various problems, for example in WSNs, where it helps to solve routing, localization,
fault detection, data aggregation, congestion, and energy harvesting issues. It can be
divided into two sub-categories: Classification and Regression [26].
4.6.1.1 Regression
This method predicts a value (Y) from a given set of parameters (X), where the
variables are continuous or quantitative. It is a simple ML method used to predict
accurate results with few errors.
Linear Regression: The purpose of linear regression is to learn a function f that
maps x to Y. It can be represented with the following mathematical model. In the
equation, Y is the dependent variable (output), x is the independent variable (input),
f is the function that relates x to Y, and ε is the expected random error.
Y = f(x) + ε
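As an illustration of the model above, the following minimal sketch fits a straight
line f(x) = slope · x + intercept by ordinary least squares and treats what is left
over as the error term ε. The sample values and the function name fit_line are
hypothetical and serve only as an example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = slope * x + intercept + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sensor data: input x and noisy output Y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
# The residuals play the role of the random error term in the equation.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(round(slope, 3), round(intercept, 3))
```

With an intercept in the model, the least-squares residuals sum to zero up to
rounding error, which is a quick sanity check on the fit.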
Support Vector Regression (SVR): The Support Vector Machine (SVM) method is
used to solve regression problems through a process known as SVR.
4.6.1.2 Classification
Classification can be further divided into various fields.
TABLE 4.1
Overview of ML Algorithms Used in IoT Networks
Table 4.1 shows an overview of the ML algorithms used in IoT networks [27, 28].
ML must balance each other and effectively fulfill them. The popularity of and demand
for data analytics have also increased in recent years, along with ML algorithms.
ML integration in the IoT field has opened the way to more productive, efficient,
economical, and accurate IoT solutions. Using ML in IoT applications improves
communication and computational performance, gives better controllability, and enhances
decision-making. IoT networks consist of thousands to billions of ubiquitous sensing
devices that offer reliable and consistent communication to improve and secure everyday
life and industry workflows. Recently, IoT applications have improved significantly
with the convergence of ML and AI techniques. The latest ML and AI techniques allow
users to inspect the collected data easily and make critical operational decisions
rapidly. Therefore, in many industries, ML and AI techniques are being used to make
rapid and successful decisions based on collected data. Today, data analytics has
gained significant attention in IoT networks for the following reasons [29]:
• Massive Volume of Data Generated from IoT Devices: A large number of devices
are connected within a single IoT network. They produce high volumes of data
that are hard for the user to maintain and understand. Therefore, it is
essential to use intelligent data analytics techniques to extract relevant
data rapidly.
• Variability in Huge Volume Data Collected from Heterogeneous
Devices: In an IoT network, many devices are connected and exchange and
process information among themselves. Because of the heterogeneity of IoT
networks, data quality, processing, and storage are considered challenging tasks.
• Data Uncertainty: In IoT networks, devices can change their characteristics
due to limitations such as battery drain, network failure, path loss,
interference, and others. Uncertainty is considered normal in practical data
analysis. Data loss is always present in IoT, so efficient analytics
techniques are needed to pre-process the data. It is essential to use proper
analytical techniques and assessment models to enhance the accuracy of
decisions based on collected data.
• Balance Scalability with Efficiency: Normally, IoT networks store the
collected data in the cloud and perform analytics there to make decisions.
Moving data from smart devices to the cloud can cause delay when transferring
between networks of different speeds. This can be challenging for
time-critical applications, especially when many IoT devices are connected.
Therefore, it is essential to balance the accuracy and speed of data
analytics techniques in many IoT networks.
Data analytics, as shown in Figure 4.11, can be classified as follows [29]:
• Descriptive Analytics: Normally, IoT networks collect a large amount of
continuous data from a large deployment area via smart devices. This
information is stored in the cloud for user interaction. The user can generate
detailed observations and opinions based on historical data collected from
smart devices using appropriate ML algorithms. Descriptive analytics is
defined as the method used to study and interpret raw data and turn it into
meaningful information. Data aggregation, data mining, mathematical and logical
operations, data summarization, and others are examples of descriptive analytics.
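As a small illustration of data summarization, the sketch below condenses a batch of
raw readings into a few descriptive statistics using Python's standard statistics
module. The temperature values and the function name describe are hypothetical.

```python
import statistics


def describe(readings):
    """Summarize raw sensor readings into descriptive statistics."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": statistics.mean(readings),
        "median": statistics.median(readings),
        "stdev": statistics.stdev(readings),
    }


# Hypothetical temperature readings (degrees Celsius) with one spike.
temperatures_c = [21.5, 22.0, 21.8, 35.2, 22.1, 21.9]
summary = describe(temperatures_c)
for name, value in summary.items():
    print(name, value)
```

Comparing the median with the maximum makes the single spike stand out, which is
the kind of observation a user would draw from historical data.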
Review on Using AI Related Deep Learning Techniques 85
TABLE 4.2
Overview of IoT Applications, Protocols, and Algorithms
are used to predict future values and to classify analyzed data. Classification
and regression trees, used to classify the behavior of smart citizens, are another
fast-training algorithm. Neural networks are appropriate learning models for function
approximation problems that predict the categories of data [30].
4.9 CONCLUSION
IoT consists of a wide variety of linked appliances that share enormous volumes of
data. The IoT platform has become a part of our daily lives. However, IoT equipment
is constrained in computational and connectivity capabilities, which are bottlenecks
in creating flexible, smart solutions using ML techniques. Although advancements
in technology and software upgrades pave the way toward accelerated IoT expansion,
device delivery, and precise analysis of high-volume IoT data, we also conclude that
it has proved challenging to combine intelligent technologies from various fields. ML
techniques play an important role in the success of IoT networks and assist in product
enhancement. Because IoT networks continue to grow, many improvements are required in
security, data handling, processing, speed, protocols, and platform connectivity.
Optimized power consumption and reduced sensor size are expected to minimize
deployment cost. Work is ongoing to lower infrastructure and operating costs while
delivering better performance.
A few years ago, we would never have imagined deep learning applications such as
driverless cars and the virtual assistants Alexa, Siri, and Google Assistant. But
today, these creations are part of our daily lives. DL continues to captivate us with
endless possibilities such as fraud detection and pixel restoration.
REFERENCES
1. N.a, Help Net Security, 41.6 Billion IoT Devices will be Generating 79.4 zettabytes of
Data in 2025, 2019. Access date: 10 July 2020, Access Link: https://www.helpnetsecu-
rity.com/2019/06/21/connected-iot-devices-forecast/
2. Tabassum, M., & Mathew, K., A Genetic Algorithm Analysis towards Optimization
Solutions. Int. J. Digit. Inf. Wireless Commun. (IJDIWC), 4(1), 124–142, 2014.
3. Elijah, O., Rahman, T.A., Orikumhi, I., Leow, C.Y., & Hindia, M.N., An Overview of
Internet of Things (IoT) and Data Analytics in Agriculture: Benefits and Challenges.
IEEE Internet Things J., 5(5), 3758–3773, 2018.
4. Seth, D., Eswaran, S., Mukherjee, T., & Sachdeva, M. (2020). A Deep Learning
Framework for Ensuring Responsible Play in Skill-based Cash Gaming. In 19th IEEE
International Conference on Machine Learning and Applications (ICMLA), IEEE,
454–459.
5. Hejazi, H., Rajab, H., Cinkler, T., & Lengyel, L., Survey of Platforms for Massive
IoT. In 2018 IEEE International Conference on Future IoT Technologies (Future IoT)
(pp. 1–8). IEEE, January, 2018.
6. Tabassum, M., Perumal, S., Mohanan, S., Suresh, P., Cheriyan, S., & Hassan, W., IoT, IR
4.0, and AI Technology Usability and Future Trend Demands: Multi-Criteria Decision-
Making for Technology Evaluation. In P. Suresh (ed.), Design Methodologies and Tools
for 5G Network Development and Application (pp. 109–144). IGI Global, Hershey,
Pennsylvania, USA, 2021.
7. B. Mishra and A. Kertesz, “The Use of MQTT in M2M and IoT Systems: A Survey,” in
IEEE Access, vol. 8, pp. 201071–201086, 2020, DOI: 10.1109/ACCESS.2020.3035849.
8. AVSYSTEM, Internet of Things, “IoT vs M2M - What Is the Difference?”, 2019.
Access date: 10 July 2020, Access Link: https://www.avsystem.com/blog/iot-and-m2m-
what-is-the-difference/.
9. Fries, J. IoT Agenda, “Why are IoT Developers Confused by MQTT and CoAP?”, 2017.
Access Date: 19 July 2020, Access Link: https://internetofthingsagenda.techtarget.com/blog/
IoT-Agenda/Why-are-IoT-developers-confused-by-MQTT-and-CoAP#:~:text=CoAP%20
was%20started%20by%20a,it%20is%20in%20the%20lead.
10. Ubuntupit, M. Top 15 Standard IoT Protocols That You Must Know about, Access
Date: 25 June 2020. Access Link: https://www.ubuntupit.com/top-15-standard-iot-
protocols-that-you-must-know-about/
11. SpareqEE, “SparqEE CELLv1.0”. Access Date: 10 July 2020, Access Link: http://www.
sparqee.com/portfolio/sparqee-cell/.
12. Tabassum, M., & Zen, K., Performance Evaluation of ZigBee in Indoor and Outdoor
Environment. In 2015 9th International Conference on IT in Asia (CITA) (pp. 1–7).
IEEE, August, 2015.
13. Safaei, B., Monazzah, A.M.H., Bafroei, M.B., & Ejlali, A., Reliability Side-effects
in Internet of Things Application Layer Protocols. In 2017 2nd International
Conference on System Reliability and Safety (ICSRS) (pp. 207–212). IEEE,
December, 2017.
14. Sharma, T. & Tabassum, M., Enhanced Algorithm to Optimize QoS and Security
Parameters in Ad hoc Networks. In P. Suresh (ed.), Design Methodologies and Tools
for 5G Network Development and Application (pp. 1–27). IGI Global, Hershey,
Pennsylvania, USA, 2021.
15. IoT-I, Internet of Things Initiative, FP7 EU project, FP7-ICT-2009-5-257565.
16. Scully, P. IOT ANALYTICS, “The Top 10 IoT Segments in 2018 - based on 1,600
real IoT projects”, 2018. Access Date: 28 June 2020, Access Link: https://iot-analyt-
ics.com/top-10-iot-segments-2018-real-iot-projects/?_scpsug=crawled_5484401_
4ca49840-17a8-11e8-854f-f01fafd7b417.
17. Tabassum, M. & Zen, K., Signal Interference Evaluation of Eko Wireless Sensor
Network. In 19th International Conference on Transformative Research in Science and
Engineering, Business and Social Innovation (SDPS 2014), 2014.
18. Tabassum, M. & Zen, K., Evaluation and Improvement of Data Availability in WSNs
Cluster Base Routing Protocol. J. Telecommun. Electron. Comput. Eng. (JTEC), 9(2–9),
111–116, 2017.
19. Tabassum, M., Perumal, S., & Ab Halim, A.H., Review and Evaluation of Data
Availability and Network Consistency in Wireless Sensor Networks. Malaysian J. Sci.
Health Technol., 4(Special), 56–64, 2019.
20. Brendard, S., Tabassum, M., & Chua, H., Wireless Body Area Networks Channel
Decongestion Algorithm. In 2015 9th International Conference on IT in Asia (CITA)
(pp. 1–6). IEEE, August, 2015.
21. Chen, S., Gao, X., Wang, J., Xiao, Y., Zhang, Y., & Xu, G., Brand-new Speech
Animation Technology based on First Order Motion Model and MelGAN-VC. J. Phys.
Conf. Ser., 1828(1), 012029, 2021.
22. Saxena, S. MotionScan: Towards Brain Concussion Detection with a Mobile Tablet
Device. 2016.
23. Siarohin, A., Lathuilière, S., Tulyakov, S., Ricci, E., & Sebe, N. First Order Motion
Model for Image Animation, 2020. arXiv preprint arXiv:2003.00196.
24. Xue, W., Madonski, R., Lakomy, K., Gao, Z., & Huang, Y. Add-on Module of Active
Disturbance Rejection for Set-Point Tracking of Motion Control Systems. IEEE Trans.
Ind. Appl., 53, 4028–4040, 2017.
25. Yuehong, Y.I.N., Zeng, Y., Chen, X., & Fan, Y. The Internet of Things in Healthcare:
An Overview. J. Indus. Inform. Integ., 1, 3–13, 2016.
26. Li, S., Da Xu, L., & Zhao, S. 5G Internet of Things: A survey. J. Indus. Inform. Integ.,
10, 1–9, 2018.
27. Kumar, D.P., Amgoth, T., & Annavarapu, C.S.R. Machine Learning Algorithms for
Wireless Sensor Networks: A Survey. Inf. Fusion, 49, 1–25, 2019.
28. Mahdavinejad, M.S., Rezvan, M., Barekatain, M., Adibi, P., Barnaghi, P., & Sheth,
A.P. Machine learning for Internet of Things data analysis: A survey. Digit. Commun.
Networks, 4(3), 161–175, 2018.
29. Tabassum, M. & Mathew, K. A Genetic Algorithm Analysis Towards Optimization
Solutions. Int. J. Digit. Inform. Wireless Commun. (IJDIWC), 4(1), 124–142, 2014.
30. Chang, J.H., Tabassum, M., Qidwai, U., Kashem, S.B.A., Suresh, P., & Saravanakumar,
U., Design and Evaluate Low-Cost Wireless Sensor Network Infrastructure to Monitor
the Jetty Docking Area in Rural Areas. In Advances in Smart System Technologies (pp.
689–700). Springer, Singapore, 2020.
31. Liang, C.B., Tabassum, M., Kashem, S.B.A., Zama, Z., Suresh, P., & Saravanakumar,
U., Smart Home Security System Based on Zigbee. In Advances in Smart System
Technologies (pp. 827–836). Springer, Singapore, 2020.
32. Ali, A.B., Tabassum, M., & Mathew, K. A Comparative Study of IGP and EGP Routing
Protocols, Performance Evaluation Along Load Balancing and Redundancy Across
Different AS. In Proceedings of the International MultiConference of Engineers and
Computer Scientists (Vol. 2, pp. 487–967), 2016, March.
5 A Review on Deep Learning Algorithms for Image Processing in Gaming and Animations
Sugandha Chakraverti, Ashish Kumar Chakraverti,
Piyush Bhushan Singh, and Rakesh Ranjan
CONTENTS
5.1 Introduction
5.2 Machine Learning
5.3 Deep Learning
5.3.1 Important Architectures in Deep Learning
5.4 Content-Based Gaming Image Retrieval (CBGIR)
5.5 Image Classification
5.5.1 Adding Deep Learning to Neural Networks
5.5.2 Benefits of Image Classification
5.5.3 The Image Classification Theory
5.6 Image Processing
5.6.1 Point Operations
5.7 Image Processing using Deep Learning
5.8 Conclusion
References
5.1 INTRODUCTION
Most images carry limited or one-sided information: because focal points differ,
objects that are sharp in one image may appear blurred in another. In most cases,
images lack valid information or, in other words, present only one side of the content.
Multi-modal images are divided into several types, each of which reflects different
information. Doctors, for instance, sometimes find the information so fragmented that
it leads to misunderstandings. Not only doctors but also many researchers in the field
of image processing, utilization, and gaming diagnosis have faced the same problem.
In recent years, a growing body of research has brought image processing together
with gaming and animation, where it continues to prove its effectiveness. An image
fusion solution automatically detects different types of images and integrates them
into a clear image. In short, image fusion is a precisely structured algorithm that
combines two or more images to create a new one.
DOI: 10.1201/9781003231530-5
Image fusion is also known for producing multi-modal gaming images, which have
attracted the attention of a wide audience and of researchers in quite recent years.
Its methods fall into three classes: DL algorithms, transform domain algorithms, and
spatial domain algorithms. Recently, the focus has been mainly on DL methods and
techniques, which has encouraged scholars across the globe to conduct DL research on
image processing and its alternatives. Recent work on DL in this area includes
pixel-level image fusion (whose steps include training, classification, weighting,
and fusion), convolutional neural networks (CNN), convolutional sparse representation,
stacked auto-encoders (SAEs), and the deep Boltzmann machine.
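The weighting and fusion steps of pixel-level fusion can be sketched as a per-pixel
weighted average of two images. In a DL-based method the weight map would come from a
trained network; in this minimal example it is simply written by hand, and the names
and values are hypothetical.

```python
def fuse_pixelwise(img_a, img_b, weight_a):
    """Per-pixel weighted average of two equally sized grayscale images.

    weight_a[i][j] in [0, 1] is the share taken from img_a; the rest
    comes from img_b.
    """
    fused = []
    for row_a, row_b, row_w in zip(img_a, img_b, weight_a):
        fused.append([w * a + (1.0 - w) * b
                      for a, b, w in zip(row_a, row_b, row_w)])
    return fused


# Two hypothetical 2x2 grayscale images of the same scene.
img_a = [[100.0, 100.0], [100.0, 100.0]]
img_b = [[0.0, 200.0], [50.0, 150.0]]
# Hand-written weight map: trust img_a fully at top-left, img_b at top-right.
weights = [[1.0, 0.0], [0.5, 0.5]]

fused = fuse_pixelwise(img_a, img_b, weights)
print(fused)
```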
Humans use their vision to collect information, and how much they collect depends
significantly on image quality. Spatial resolution is therefore an important measure
of this attribute, and High-Resolution (HR) images are preferred. Because of the
devices involved, the shooting conditions, and artificial factors, however, it is at
times quite difficult for an imaging system to capture information without
interruption, distortion, or change. In practice, therefore, the achievable resolution
remains strictly limited.
Improving the quality of images is neither easy nor inexpensive, and it is an active
topic in research and engineering fields related to image science. In practice, the
expense lies in the hardware needed to increase image resolution. In this chapter, we
therefore also go through the image super-resolution technology used for improving
image quality.
Image super-resolution technology [1] is an image processing approach that
reconstructs a clear, high-quality image from a single low-resolution (LR) frame.
Super-resolution improves the visual effect of the image and is also beneficial for
feature extraction, information recognition, traffic monitoring, and security
monitoring.
Conventional shallow learning schemes are limited and over-simplified, and they
struggle to recover the high-frequency information of an image, so the reconstructed
image compares poorly with the original. This has led to DL approaches being combined
with super-resolution to improve the quality of the reconstruction.
Fields such as artificial intelligence, DL, and machine learning have established
positions in computing, real-time imagery, finance, medical science and health, and
other areas. The drive to improve their performance is visible in the open-source
frameworks used in research on practical and advanced technologies, with the hope of
reaching a $24 trillion world market by 2024. Because these methods are now both
available and necessary in almost every field, researchers are working to push their
error rates as low as possible. Governments are therefore discussing greater
investment in these advanced learning methods, in order to keep pace with the
competition in developing datasets and research [3].
These days, DL techniques rely heavily on artificial neural networks (NN) for image
processing and extraction, and for image processing the NN of choice is the CNN. A CNN
learns its features automatically and is similar in many respects to a recurrent
neural network (RNN), except that RNNs are mainly used for language-related
computation. An RNN uses feedback loops, in which the output of a layer at one step is
fed back as part of its input at the next step.
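The feedback loop of an RNN can be sketched with a single scalar recurrent unit. The
weights below are arbitrary illustrative values, not trained parameters, and the
function names are hypothetical.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, bias):
    """One step of a scalar recurrent unit: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + bias)


def run_rnn(inputs, w_x=0.5, w_h=0.8, bias=0.0):
    """Feed a sequence through the unit, carrying the hidden state forward."""
    h = 0.0
    states = []
    for x_t in inputs:
        # Feedback loop: the previous output h re-enters as an input.
        h = rnn_step(x_t, h, w_x, w_h, bias)
        states.append(h)
    return states


# A single pulse followed by silence: later states still "remember" it.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Even though the second and third inputs are zero, the hidden state stays positive,
because the earlier output keeps being fed back into the unit.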
There is hardly any visible limit to the discussion of DL. DL is used so widely
these days that it is even used in data processing units. Its capabilities make it
possible to generalize from the huge data volumes of smart and metropolitan cities and
to evaluate them in the areas of gaming and animation. DL in particular supports
faster analysis for an accurate diagnosis. The complexity of the health sector is also
managed well using DL, which helps to diagnose diseases, reduce the labor required,
and so on. The following section provides an overview of DL applications for
gaming image processing [5].
to a 13-character symbolic code constructed by a deep SAE. Other related experiments
on the dataset cover the retrieval performance of the proposed method, the error
parameter, and the classification performance [13].
Gaming image retrieval research helps radiologists make use of gaming and animated
images over the long run. It has brought advances to imaging techniques and to
automatic diagnosis systems in the medical sector. Gaming images in modern hospitals
are stored in Digital Imaging and Communications in Medicine (DICOM) format, and
repositories of gaming images are retrieved through text. A CBGIR system, in
contrast, relies on visual content and can therefore retrieve relevant gaming
images consistently and effectively [14].
Image classification also helps in analyzing gaming images. It is also used to
suggest and depict traces of illness in the human body. Other uses include organizing
various types of photo collections.
A CNN takes raw pixel data as input and extracts the necessary features, such as
textures and shapes, from the image. In this way CNNs successfully extract the content
of an image. The input image of a CNN has dimensions W×H×C, where W and H are the
width and height of the image in pixels, respectively, and C is the number of color
channels in the image.
Generally, a CNN consists of a stack of modules that perform three operations:
Convolution, Rectified Linear Unit (ReLU), and Pooling. Convolution creates a feature
map from the input image. ReLU brings non-linearity into the model, whereas Pooling
reduces the dimensions of the feature map. Once the resulting feature map is reshaped
into a long vector, fully connected layers are used to complete the final task with
the help of the Softmax activation function. Each output of this function lies
between 0 and 1. In this review, any image processing method that takes the Softmax
output and modifies it compared to baseline methods is considered a
“post-processing” technique [16].
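The sequence of operations described above can be traced on a tiny single-channel
image with plain Python lists. This is a minimal sketch: the kernel and the weights of
the fully connected layer are hand-picked for illustration rather than learned, and
all function names are hypothetical.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation (the 'convolution' used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]


def dense(vector, weights):
    """Fully connected layer without bias: one score per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def softmax(scores):
    """Turn scores into values in (0, 1) that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# 5x5 image with a vertical edge between the second and third columns.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_kernel = [[-1.0, 1.0], [-1.0, 1.0]]  # responds to dark-to-bright edges

feature_map = relu(conv2d(image, edge_kernel))
pooled = max_pool2x2(feature_map)
flat = [v for row in pooled for v in row]  # reshape into a long vector
class_weights = [[0.5, 0.0, 0.5, 0.0],     # hypothetical class 0: "edge on left"
                 [0.0, 0.5, 0.0, 0.5]]     # hypothetical class 1: "edge on right"
probs = softmax(dense(flat, class_weights))
print(pooled, probs)
```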
Accuracy is the basic measure of the performance of deep networks in image
classification. It is calculated as the ratio between the number of correctly
classified images and the total number of images, and it serves as the metric for CNN
tasks in gaming imaging such as patch-based classification [17].
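The accuracy ratio can be written out directly. The labels below are hypothetical
and stand in for the predicted and true classes of four image patches.

```python
def accuracy(predicted, actual):
    """Fraction of images whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical patch labels: three of the four predictions are correct.
predicted = ["cat", "dog", "cat", "bird"]
actual = ["cat", "dog", "dog", "bird"]
score = accuracy(predicted, actual)
print(score)
```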
5.6.1 Point Operations
The simplest approach is to replace each pixel value with a new one that depends
solely on the original value. This is known as a “point” operation, which
distinguishes it from other operations such as “neighborhood” and “global”
operations. In particular, if the original brightness values cover only a small
fraction of the total range, new values can be assigned that increase the contrast of
the image. This is called “Point Transformation.” The relationship between original
and replacement brightness values is often called a transfer function, and it can
alter the appearance of an image as in Figure 5.3.
FIGURE 5.3 Examples of different transfer functions applied to an image: (a) original,
(b) logarithmic, (c) histogram equalization, and (d) negative.
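Point operations of this kind can be sketched as transfer functions applied to each
pixel independently. The example assumes 8-bit grayscale values (0 to 255); the
function names and the sample image are hypothetical, and histogram equalization is
left out because it needs the histogram of the whole image.

```python
import math


def apply_point_op(image, transfer):
    """Apply a transfer function to every pixel independently."""
    return [[transfer(v) for v in row] for row in image]


def negative(v):
    """Invert brightness on an 8-bit scale."""
    return 255 - v


def log_transform(v):
    """Logarithmic transfer function: expands dark values, compresses bright ones."""
    return round(255 * math.log1p(v) / math.log1p(255))


def make_contrast_stretch(lo, hi):
    """Build a transfer function that maps the range [lo, hi] onto [0, 255]."""
    def stretch(v):
        v = min(max(v, lo), hi)
        return round(255 * (v - lo) / (hi - lo))
    return stretch


# Low-contrast image: values occupy only 100..130 of the 0..255 range.
image = [[100, 110], [120, 130]]

stretched = apply_point_op(image, make_contrast_stretch(100, 130))
inverted = apply_point_op(image, negative)
print(stretched, inverted)
```

The stretched image uses the full brightness range, which is the contrast increase
that a point transformation provides.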
type DL. The summary presents several possible advantages of using DL methods
in gaming and image mining. It also gives insight into content-based image
retrieval and into identifying flaws in gaming images.
Alongside optical sensing and image capturing systems [18], process images offer a
new perspective for process monitoring. When these process images are put to use, a
Deep Belief Network (DBN) is applied directly to extract features from the images and
to detect faults. Sub-networks are used to extract local features from the sub-images,
and a global network then extracts the overall features, which improves training
efficiency without reducing fault detection accuracy. In addition, a new monitoring
statistic is developed that is tailored to the DL framework [19].
As far as current AI techniques are concerned, DL techniques are among the most
established in the health, infrastructure, research, gaming, and UI/UX sectors. Their
availability is spreading rapidly with each passing day because of their effective
and accurate results. They also offer implicit feature engineering, the ability to
integrate word embeddings, and the ability to deal with complex and unstructured
datasets. Beyond this, the concept also applies to digital text in electronic health
records (EHRs), clinical text on social media, and text in electronic gaming reports
and gaming images. With the growing popularity of DL and its algorithms, its role
across several domains has grown as well. This is clearly observed in the 2017–2020
reports [20], where publications in 2019 were reported at up to four times the number
in 2018. Growth has continued to accelerate since DL was applied to gaming techniques
and animation applications for image processing [20].
5.8 CONCLUSION
The technical translation of DL methods in image processing is still in its infancy.
These techniques have proved very useful in tackling image processing tasks, yet at
times they are held back by certain limitations. Because annotated experimental data
were scarce, researchers fell back on simulated data alone, and hardly any of the
proposed techniques have been validated at large scale. This chapter discusses the
challenges of clinical translation for DL methods in image processing by summarizing
the major findings on processing images in different respects. Wilson et al. [21]
examined the challenges and issues faced during clinical translation using
spectroscopic optical imaging techniques. This gave us a broader basis for focusing on
the hurdles that lie ahead in translating imaging modalities. This chapter therefore
concentrates on what DL applications make possible when integrated with image
processing approaches.
Finally, this review shows that DL methods are prominently used in fields such as
personal activity intelligence (PAI) and CBGIR to simplify a large number of
translations over the long term. With the help of several studies, research, and
investigations, we found that even these small studies on image processing contribute
to the bigger picture, namely the future open challenges associated with DL
applications, such as detecting problems in optical inversion, image post-processing,
and annotation in semantic imaging.
REFERENCES
1. Zhang J., Shao, M., Yu, L., and Li, Y., Image Super-Resolution Reconstruction Based
on Sparse Representation and Deep Learning, Signal Process. Image Commun., 87,
115925, September 2020.
2. Arunakranthi, G., Rajkumar, B., Chandra, V., Rao, S., and Harshvardhan, A., Advanced
Patterns of Predictions and Cavernous Data Analytics Using Quantum Machine
Learning, Mater. Today Proc., 2021.
3. Raju, B., and Bonagiri, R., A Cavernous Analytics Using Advanced Machine Learning
for Real World Datasets in Research Implementations, Mater. Today Proc., 2021.
4. Li, Y., Zhao, J., and Zhihan, Lv., Li, J., Medical Image Fusion Method by Deep
Learning, Int. J. Cogn. Comput. Eng., 2, June 2021, 21-29.
5. Bhattacharya, S., Reddy Maddikunta, P.K., Pham, Q.-V., Gadekallu, T.R.,
Krishnan, S.R., Chowdhary, C.L., Alazab, M., and Piran, Md. J., Deep Learning and
Medical Image Processing for Coronavirus (COVID-19) Pandemic: A Survey, Sustain.
Cities Soc., 65, 102589, February 2021.
6. Bengio, Y., Courville, A., and Vincent, P., Representation Learning: A Review and New
Perspectives, IEEE Trans. Pattern Anal. Mach. Intell., 35, 1798–1828, 2013.
7. Goodfellow, I., Bengio, Y., and Courville, A., Deep Learning, vol. 1, MIT
Press, Cambridge, 2016.
8. Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., et al.,
A Survey on Deep Learning in Medical Image Analysis, Med. Image Anal., 42, 60–88,
2017.
9. Vincent, P., Larochelle, H., Bengio, Y., and Manzagol, P.-A., Extracting and Composing
Robust Features with Denoising Autoencoders, Proc. 25th Int. Conf. Machine Learning,
ACM, 1096–1103, 2008.
10. Holden, D., Saito, J., Komura, T., and Joyce, T., Learning Motion Manifolds with
Convolutional Autoencoders, SIGGRAPH Asia 2015 Technical Briefs, ACM, 1–4.
November 2015 Article No.: 18. https://doi.org/10.1145/2820903.2820918.
11. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., and Manzagol, P.-A., Stacked
Denoising Autoencoders: Learning Useful Representations in a Deep Network with a
Local Denoising Criterion, J. Mach. Learn. Res., 11, 3371–3408, 2010.
12. Huang, F.J., Boureau, Y.-L., LeCun, Y., et al.,
Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object
Recognition, IEEE Conf. Comput. Vision Pattern Recog., 2007. CVPR’07, IEEE, 1–8,
2007.
13. Ozturk, S., Stacked Auto-Encoder Based Tagging with Deep Features for Content-
Based Medical Image Retrieval, Expert Systems Appl., 161, 113693, 2020.
14. Shamna, P., Govindan, V.K., and Abdul Nazeer, K.A., Content-Based Medical Image
Retrieval by Spatial Matching of Visual Words, J. King Saud Univ. Comput. Inf. Sci.,
2018. https://doi.org/10.1016/j.jksuci.2018.10.002
15. https://www.thinkautomation.com/eli5/eli5-what-is-image-classification-in-deep-
learning/
16. Salvi, M., Acharya, U.R., Molinari, F., and Meiburger, K.M., The Impact of Pre- and
Post-Image Processing Techniques on Deep Learning Frameworks: A Comprehensive
Review for Digital Pathology Image Analysis, Comput. Biol. Medicine, 128, 104129,
January 2021.
17. LeCun, Y., Bengio, Y., and Hinton, G., Deep Learning, Nature, 521, 436–444,
2015.
18. Karanam, S.R., Shrinivas, Y., and Vamshi Krishna, M., Study on Image Processing
Using Deep Learning Techniques, Mater. Today Proc., 2020.
19. Lyu, Y., Chen, J., and Song, Y., Image-Based Process Monitoring Using Deep Learning
Framework, Chemom. Intell. Lab. Syst., 189, 8–17, 2019.
20. Pandey, B., Pandey, D.K., Mishra, B.P., and Rhmann, W., A Comprehensive Survey
of Deep Learning in the field of Medical Imaging and Medical Natural Language
Processing: Challenges and Research Directions, J. King Saud. Univ. Comput. Inf. Sci.,
29 January, 2021.
21. Wilson, B.C., Jermyn, M., and Leblond, F., Challenges and Opportunities in Clinical
Translation of Biomedical Optical Spectroscopy and Imaging, J. Biomed. Opt., 23(3),1–
13, 2018. doi: 10.1117/1.JBO.23.3.030901. PMID: 29512358; PMCID: PMC5838403.
6 Artificial Intelligence in Games: Transforming the Gaming Skills
Abhisht Joshi, Moolchand Sharma, and Jafar Al Zubi
CONTENTS
6.1 Introduction
6.1.2 How will AI Reinvent the Experience of Gaming?
6.1.2.1 Advantages of AI in Gaming
6.2 Gaming Experience
6.2.1 Power of Voice in Gaming
6.2.2 Real Gaming Experience to Players
6.2.2.1 Three-Dimensional Visualization Techniques
6.2.2.2 Simulation Based on Physics
6.2.2.3 Virtual Reality
6.2.2.4 Augmented Reality
6.2.2.5 Extended Reality
6.2.3 Necessity of RL for Adaptability in Games
6.2.4 In-Game Support to Players by AI Powered Chatbots
6.2.5 Gives an Overall Ultimate Gaming Experience
6.3 Early Game AI vs Complex Game AI
6.3.1 Early Game AI
6.3.2 Complex Game AI
6.3.3 Video Game AI
6.3.3.1 Finite State Machines
6.3.3.2 Path-Finding
6.3.3.3 Real-Time Play Complex AI
6.4 Conclusion and Future Scope
References
6.1 INTRODUCTION
DOI: 10.1201/9781003231530-6
It is important to consider the past and present before embarking on future
game production. Game design and production began in October 1958, when
William Higinbotham, an American physicist, created what is widely regarded as
the first video game. It was a simple tennis game, “Tennis for Two,” a forerunner
of “Pong,” and it was very popular at the time. Since then, the number of games,
video game platforms, and game types has increased dramatically. As a result,
more and more people became willing to play, and many took up gaming. The
gaming industry has evolved rapidly because of significant advancements in
hardware and in design and development strategies. Artificial Intelligence (AI)
has aided the expansion by improving the gaming experience. In addition, it has
piqued players’ attention by exceeding their expectations for the game. AI
software improves creation processes such as image production, animation
generation, the performance of non-player characters (NPCs), story layers,
characterization, and graphics authenticity.
When you talk about AI, you are talking about computers that mimic the human
mind. The primary goal of AI in gaming is to make games smarter. It gives games
a natural system to support the intelligent actions of NPCs. AI offers intelligent
game controls that allow for clever character communication and movement. To put
it another way, AI brings the game closer to reality.
In the other direction, games serve as a test-bed for AI research, which
investigates the dynamic relationships between agents and game environments in
general. Various games provide agents with fascinating and complex problems to
solve, making video games ideal for studying how AI technologies can achieve
human-level performance. These virtual worlds are safe and manageable.
Furthermore, game environments offer an endless supply of data for machine
learning (ML) algorithms and can run much faster than real time. Because of these
characteristics, games are a distinctive and popular domain for AI study. On the
other hand, AI has helped games improve in terms of how we play, understand,
and design them [1].
In general, game AI is concerned with perception and decision-making in virtual
environments. There are some significant challenges and solutions associated with
these components. The first challenge is that, particularly in strategy games,
the state space of the game is extremely large. With the rise of representation
learning, deep neural networks have been used to model such large-scale state
spaces effectively. The second issue is that it is difficult to learn proper policies
for making decisions in a dynamic, unknown environment. Data-driven approaches
such as supervised learning and reinforcement learning (RL) are viable solutions for
this issue. The third issue is that the vast majority of game AI is created for a
specific, controlled virtual world. Therefore, transferring an AI’s ability between
games can be a major challenge, and a more generalized learning system is required
to address it. When we look at AI in gaming in detail, there are two types
of game AI techniques: deterministic and non-deterministic [2, 3].
Deterministic AI techniques are the bread and butter of game AI. These
techniques are easy to implement, understand, test, and debug because they are
predictable, fast, and simple. Despite their many advantages, deterministic methods
place the burden of anticipating all scenarios and coding all actions on the
developers’ shoulders. Furthermore, deterministic approaches do not support
learning or evolution. After a little play, deterministic behaviors become
predictable. This, in a sense, shortens a game’s lifespan.
On the other hand, non-deterministic methods make learning and unpredictable
gameplay easier. Furthermore, developers are not required to explicitly code all
actions in anticipation of every potential scenario. Non-deterministic approaches
can also learn and extrapolate on their own and foster emergent behavior, that is,
behavior that occurs without explicit instructions.
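To make the contrast concrete, here is a minimal Python sketch that is not taken
from any particular game. The function names, distance thresholds, and action
weights are all hypothetical. The first function is a fixed rule set, so the same
input always yields the same action. The second picks among actions with weighted
randomness, which illustrates only the unpredictability of non-deterministic
methods, not their ability to learn.

```python
import random


def deterministic_action(distance_to_player: float) -> str:
    """Fixed rules: the same distance always produces the same action."""
    if distance_to_player < 2.0:
        return "attack"
    if distance_to_player < 10.0:
        return "chase"
    return "patrol"


def nondeterministic_action(distance_to_player: float, rng: random.Random) -> str:
    """Weighted random choice: the same distance can produce different actions."""
    if distance_to_player < 2.0:
        weights = {"attack": 0.7, "retreat": 0.2, "chase": 0.1}
    elif distance_to_player < 10.0:
        weights = {"chase": 0.6, "patrol": 0.3, "retreat": 0.1}
    else:
        weights = {"patrol": 0.9, "chase": 0.1}
    actions = list(weights)
    return rng.choices(actions, weights=[weights[a] for a in actions], k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    print(deterministic_action(1.0))
    print([nondeterministic_action(1.0, rng) for _ in range(5)])
```

A player facing the first NPC can learn its thresholds after a few encounters,
whereas the second NPC remains harder to anticipate even though its code is
just as short.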
Among ML methods, RL has long been used to solve these problems. In recent
years, deep learning (DL) has achieved impressive success in computer vision and
natural language processing (NLP) [4]. As a result, agents based on deep
reinforcement learning (DRL) have achieved performance beyond human ability in
many video games. At the same time, there are still some challenges
in this domain that need to be tackled.
The worldwide gaming industry is valued at more than $109 billion, making it
one of the central areas in which to perform research and improve performance
by integrating AI, specifically DL. Figure 6.1 shows the size of the gaming
industry compared to other industries. Most gamers do not think of gaming
as a sector, but it is becoming such a large and influential entertainment industry
that it attracts an increasing number of professionals. To understand it, we must
first break down the gaming value chain. Games are created with game engines,
which are normally licensed from the engine owners. The game creators then depend
on publishers to bring their creations to life [4].
With AI changing the landscape of gaming, we are revolutionizing the gaming industry
with captivating gaming experiences that cannot be achieved on any other platforms.
Jefferson Valadares, CEO, Doppio Games
AI has shown positive results in the gaming industry since its inception, and it
has played a part in nearly every aspect of the industry. It not only learned to
play chess, but it also defeated some of the best players in the world. Furthermore,
gaming companies are now employing AI in a more dynamic setback process, such
as determining the main AI benefits in games.
more enjoyable gaming experience. When playing these games, players simply
tell their characters what to do. For example, gamers may use simple verbal
commands to tell their game characters to run, swim, jump, hide, open a door,
change weapons, and so on. Since voice commands save players from remembering
various keys and key combinations, they can provide a smoother and more realistic
gaming experience. Compared with buttons or keys, voice commands are a more
natural way to interact.
The following are some examples of voice-controlled AI games:
a. Bot Colony
b. Broken Seal
c. The 3% Challenge
d. Westworld: The Maze
e. LEGO: Duplo Stories
Accents are a major issue in this type of game: different world regions have
different accents, which can confuse voice-controlled games, so the user has to
repeat a command several times, resulting in a poor gaming experience. Voice AI
can also make a game more interactive, because an AI-generated voice sounds
natural, and the generated voice plays a crucial role in making the game more
realistic. AI provides digitally created voices, and its ability to initiate
interactions through dialogue helps players complete tasks. Voice AI has only
recently started to reshape the gaming industry. It has great potential to create
a customized and dynamic gaming experience for gamers. There is still much
progress to be made in voice AI, and research is being conducted to improve
current systems. Scientists and game developers are collaborating to improve its
accuracy [5, 6].
making the AI model obey the laws of physics, so that the player can relate the
game to real life and the gaming experience feels more real, as shown in
Figure 6.2.
“X” represents a variable for any current or future spatial computing technologies.
Figure 6.6 depicts a clearer picture of the XR.
Games today are intelligent and visually pleasing in ways that were unimaginable
10–15 years ago. Figure 6.7 shows the evolution of game characters, and Figure 6.8
shows an example of the evolution of graphics in games over the years.
AR has changed the definition of gaming. The AR game “Pokémon Go” is a
well-known example. Another example, from VR, is “Until You Fall,” a VR sword
combat game. One of the most common problems with video games is
that items look fine from afar, but as you move in for a closer look, objects
render poorly and become pixelated, resulting in bad visual quality. NVIDIA and
Microsoft are cooperating to solve this issue by investing in DL to improve
rendering. Computer vision and DL algorithms can help with the dynamic rendering
of finer details, which was a problem in the past. Algorithms acting as NPCs are
another significant factor in improving the realism of the gaming experience.
AI-based NPCs allow you to play against less predictable opponents. These
adversaries may also change their difficulty level based on the user. As a player
learns how to play the game, the enemies may become smarter and react, based on
the user’s behavior, in specific ways that the user has never seen before.
Companies are developing NPCs by using data from top players as training data,
allowing for better and quicker reinforcement training.
The way players communicate with friendly NPCs is another big challenge in
creating a realistic virtual environment. To complete your goals in many games,
you must speak with scripted characters. These interactions, on the other hand, are
normally brief and follow on-screen prompts. NLP could allow you to speak to
in-game characters aloud and receive accurate responses, similar to how Siri, Alexa,
and Google Assistant work. Furthermore, games with VR haptics or imaging of the
player may allow computer vision algorithms to detect body language and intentions,
enhancing the experience of interacting with NPCs even more [9, 10].
DeepMind’s AlphaZero. AlphaZero is a program that aims to understand the
recurring patterns and properties of games, such as board symmetry.
This is possible thanks to a new type of RL in which AlphaGo Zero serves as its
own teacher. The software begins with an untrained artificial neural network that
has no prior knowledge of Go. Then, combining this neural network with a strong
search algorithm, the program plays the game against itself. As it plays, the
neural network is trained and tuned to predict moves. Figure 6.10 shows AlphaGo
Zero beating Ke Jie.
reasoning and logic for efficient gameplay. VR and XR are also driving up demand
for AI-powered chatbots like never before. Chatbots operated by AI contribute to the
improvement of gaming experiences in the following ways:
a few other companies have also worked on AI to convert real-world videos into
virtual worlds, making it ideal for video games [13].
6.3.1 Early Game AI
The first known example of AI in gaming was conceived as a demonstration of
computing for the Festival of Britain in 1951. That year, Ferranti introduced the
custom-built Nimrod machine, which used the game of Nim to demonstrate its
mathematical potential. Nim is a two-player game in which players take turns
removing one to three items from a pile. Depending on the variant played, the
player who removes the last item either wins or loses the game. Although Nimrod
was created to play Nim, Ferranti believed that a computer able to play a complex
game could also solve complex problems. Nimrod had no program. Instead, it
followed a complex hard-wired set of logic. Thus, Nimrod did not have AI in the
traditional sense, but its ability to play Nim made it a competitive opponent that
surprised and intimidated players alike.
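For the single-pile form of Nim described above, optimal play has a simple closed
form. The sketch below is a modern illustration in Python and is not a
description of Nimrod’s hard-wired logic. It assumes the rule that the player who
takes the last item wins; the function name best_nim_move is hypothetical.

```python
def best_nim_move(pile: int, max_take: int = 3) -> int:
    """Return how many items to take from a single pile.

    Assumes the player who takes the last item wins. The winning idea is to
    leave the opponent a pile that is a multiple of (max_take + 1).
    """
    if pile <= 0:
        raise ValueError("pile must contain at least one item")
    remainder = pile % (max_take + 1)
    # If the pile is already a multiple of (max_take + 1), every move loses
    # against perfect play, so take the minimum and hope for a mistake.
    return remainder if remainder else 1


if __name__ == "__main__":
    for pile in (10, 7, 8):
        print(pile, "->", best_nim_move(pile))
```

With the default limit of three items per turn, a player who always leaves a
multiple of four can answer any move of k items by taking 4 - k, and therefore
takes the last item.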
6.3.2 Complex Game AI
Arthur Samuel of IBM, who coined the term “machine learning,” introduced one of
the first applications of ML to games. In 1956, Samuel took on the game of
checkers, which combines simple rules with complex strategy. His work on IBM’s
first commercial scientific computer, the IBM 701, resulted in the development of
two key principles in ML. The first was alpha-beta pruning.
Patrick Winston summarized the basic idea of alpha-beta as follows: “Do not take
the time to see how bad an idea is if it is undeniably awful.” Alpha-beta pruning
is a tree-search optimization technique that reduces the number of nodes the
Minimax algorithm must evaluate inside a tree, in this case for two-player games.
In a nutshell, when the machine chooses a move, the game is represented as a
state space tree, with each level representing the next move of the computer or
the player. The Minimax algorithm aims to maximize the minimum gain, that is, the
best result a player can guarantee against an opponent who plays optimally.
Alpha-beta removes from the evaluation those parts of the tree that cannot
improve a player’s reward. In the example shown in Figure 6.11, only the middle
move can help player X win (assuming that both players play optimally).
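The pruning idea can be sketched in a few lines of Python. This is a generic
illustration on a toy tree, not Samuel’s checkers program and not the tree of
Figure 6.11. The tree is assumed to be a nested list whose leaves are numeric
rewards for the maximizing player, and the visited list is a hypothetical helper
that records which leaves were actually evaluated.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta pruning.

    `node` is either a number (a leaf reward) or a list of child nodes.
    """
    if isinstance(node, (int, float)):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer already has a better option
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:  # the maximizer already has a better option
            break
    return best


if __name__ == "__main__":
    tree = [[3, 5], [2, 9], [0, 1]]  # root is a max node, its children are min nodes
    seen = []
    print(alphabeta(tree, visited=seen), seen)
```

On this toy tree, the first branch guarantees the maximizer a reward of 3. As
soon as the second branch reveals a leaf worth 2 and the third a leaf worth 0,
those branches cannot beat 3, so their remaining leaves are never evaluated.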
Samuel’s second central idea was the combination of self-play and rote learning.
Samuel pitted his software against itself to improve its play, and it remembered
each move it had encountered (dutifully recorded on magnetic tape) along with its
terminal “reward” value. His software played checkers at an amateur level, but
it incorporated ideas that are still used today. For example, the alpha-beta search
method was revived in the game of chess 40 years later. Garry Kasparov, a chess
grandmaster, lost a game to IBM’s Deep Blue in 1996 and a full match in 1997.
Deep Blue ran a parallel game-state tree search, pruned with alpha-beta pruning,
to speed up the calculation of the computer’s next move (searching 200 million
positions per second and looking up to 20 moves ahead). In 2015, Google combined
DL (neural networks) for move selection with Monte Carlo tree search and
previously learned moves to play the game of Go. In 2015, AlphaGo defeated a
professional human player, and in 2017, it defeated the world’s top player.
AlphaGo reapplied self-play (which Samuel had developed for checkers approximately
60 years earlier) to improve its play and to store the moves it had previously
encountered.
6.3.3 Video Game AI
Early video games did not rely on AI for their opponents; instead, they used
state machines with predictable movement patterns (as in Space Invaders). In 1980,
Pac-Man raised the difficulty by having the enemies path-find toward the player
(or away from the player when fleeing). Furthermore, each opponent had a distinct
personality, making the game more unpredictable.
Let us look at some of the current methods in video games.
FIGURE 6.12 The finite state machine for a real time strategy worker NPC.
Figure 6.12 depicts an NPC to whom the player can delegate a role. Once delegated,
the NPC performs the task based on the various behaviors required to complete it.
The FSM in Figure 6.12 depicts a worker NPC who is tasked with collecting
resources by the player. Following its assignment, the NPC locates the desired
resource, collects it, and transports it to a collection point. This cycle repeats until
all resources have been depleted, at which point the NPC will become idle.
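The worker cycle described above can be sketched as a small finite state machine
in Python. Since the figure itself is not reproduced here, the state names, the
WorkerNPC class, and the rule that each trip carries one unit of resource are
assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LOCATE_RESOURCE = auto()
    COLLECT = auto()
    TRANSPORT = auto()


class WorkerNPC:
    """Worker that gathers resources until none are left, then goes idle."""

    def __init__(self, resources_available: int):
        self.state = State.IDLE
        self.resources_available = resources_available
        self.delivered = 0

    def assign(self) -> None:
        """The player delegates the gathering task to an idle worker."""
        if self.state is State.IDLE and self.resources_available > 0:
            self.state = State.LOCATE_RESOURCE

    def update(self) -> None:
        """Advance the machine by one transition per game tick."""
        if self.state is State.LOCATE_RESOURCE:
            # Nothing left to find means the task is over.
            self.state = State.COLLECT if self.resources_available > 0 else State.IDLE
        elif self.state is State.COLLECT:
            self.resources_available -= 1
            self.state = State.TRANSPORT
        elif self.state is State.TRANSPORT:
            self.delivered += 1
            self.state = State.LOCATE_RESOURCE


if __name__ == "__main__":
    npc = WorkerNPC(resources_available=2)
    npc.assign()
    while npc.state is not State.IDLE:
        print(npc.state.name)
        npc.update()
    print("delivered:", npc.delivered)
```

Each state handles only its own transition, which is what makes FSMs easy to
test and debug: the behavior of the worker at any moment is fully determined by
its current state and the remaining resources.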
6.3.3.2 Path-Finding
In games, the ability of NPCs to navigate their surroundings is a common
feature. Path-finding is the term used to describe this ability, which can be
implemented in several ways. One of the most well-known is A*. The A* algorithm
is a variant of Dijkstra’s shortest path algorithm, in which the locations an NPC
can reach are represented as the nodes of a graph (see Figure 6.13). A* works
with two lists: an open list and a closed list. The open list contains the nodes
that have been discovered but not yet visited, while the closed list contains all
the visited nodes. The algorithm starts with a node from the open list and looks
for nodes accessible from it, adding them to the open list (if they are not
already in the closed list). Next, each node’s score is determined from the cost
of reaching it (some nodes might be more expensive than others to pass through)
and a heuristic (an estimate of the shortest distance between the node and the
goal). The process is repeated until the goal is reached. Then, the algorithm
follows the nodes’ decreasing scores back to the starting node to determine the
shortest path.
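A compact Python sketch of A* on a grid map follows. It is a simplified
illustration under stated assumptions rather than the graph of Figure 6.13: the
map is a list of strings in which '#' marks a wall, every step costs 1, movement
is in four directions, and the heuristic is the Manhattan distance. The function
name a_star and the grid encoding are hypothetical.

```python
import heapq


def a_star(grid, start, goal):
    """Return the shortest path from start to goal as a list of (row, col), or None."""

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # entries are (score, cost so far, cell)
    closed = set()                              # cells already visited
    cost_so_far = {start: 0}
    came_from = {}

    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:         # walk back to the start
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if current in closed:
            continue
        closed.add(current)
        row, col = current
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
                continue
            if grid[r][c] == "#" or nxt in closed:
                continue
            new_cost = cost + 1
            if new_cost < cost_so_far.get(nxt, float("inf")):
                cost_so_far[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


if __name__ == "__main__":
    world = ["....",
             ".##.",
             "...."]
    print(a_star(world, (0, 0), (2, 3)))
```

The priority queue plays the role of the open list and the set plays the role of
the closed list. Replacing the heuristic with a function that always returns 0
turns the same code into Dijkstra’s algorithm.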
A* is only one of several path-finding algorithms. In applications with large
maps and limited CPU resources, a hierarchical path-finding algorithm can cluster
the world into larger nodes. Hierarchical path-finding thus produces a smaller
number of nodes, decreasing the amount of time spent searching.
The agent’s perception space is very large, since the network’s inputs are made
up of around 20,000 real values. The number of actions available is also very
large: there are about 170,000 unique actions in total, of which about 1,000 are
typically possible at any time (the rest being ruled out by cooldowns, items not
present, etc.). OpenAI Five used proximal policy optimization for learning,
which is simple to use and performs well in practice. AI is transforming both how
we play games and how they are made. While the methods
used have evolved over time, some ideas have remained unchanged, such as Arthur
Samuel’s principle of self-play in the creation of learning agents. From checkers
on the IBM 701 to complex real-time games trained with the aid of distributed
networks of CPUs and GPUs, ML is an integral part of games, and games are a
test-bed for new ML methods [14].
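The passage names proximal policy optimization (PPO) but does not give its
objective. As a hedged illustration, the Python sketch below computes the clipped
surrogate objective commonly associated with PPO for a batch of samples. The
function name, the plain-list inputs, and the value epsilon = 0.2 are assumptions
for illustration; the text does not state which settings OpenAI Five used.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch.

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Taking the minimum removes any incentive to move the policy
        # further than epsilon away from the old policy.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    print(ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

In training, this quantity is maximized with respect to the policy parameters.
The clipping keeps each update close to the policy that collected the data, which
is one reason the method is regarded as simple and stable to use.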
extending the game’s lifespan. Since learning and changing games are inherently
unpredictable, AI developers have historically approached learning strategies with
caution. Non-deterministic AI, which has its own set of problems, encompasses the
techniques for understanding and reacting to character behavior. However, building
and evaluating non-deterministic learning AI techniques takes longer. In addition,
debugging becomes harder because it is difficult to fully comprehend what
the AI is doing. These obstacles have proven to be major impediments to AI’s
widespread adoption. However, much of this is changing. Several famous games,
including Creatures, Black & White, Battlecruiser 3000AD, Dirt Track Racing,
Fields of Battle, and Heavy Gear, used non-deterministic AI. Their achievements
sparked renewed interest in AI techniques such as decision trees, neural networks,
genetic algorithms, and probabilistic methods. Non-deterministic methods are used
in combination with more conventional deterministic methods in these success-
ful games. They are used only when they are required and for problems for which
they are ideally suited. A neural network is not a magic pill that will solve all AI
problems in a game; however, it can be used with excellent results for specific AI
tasks. When using non-deterministic methods, we suggest taking this approach.
This way, you can isolate the parts of your AI system that are unpredictable and
difficult to create, test, and debug while maintaining the majority of your AI system
in its traditional form.
REFERENCES
1. Tamburrini, G. & Altiero, F. (2021). Research Programs Based on Machine
Intelligence Games. In Chiodo, S. & Schiaffonati, V. (Eds.), Italian Philosophy of
Technology. Philosophy of Engineering and Technology, vol 35. Springer, Cham. DOI:
10.1007/978-3-030-54522-2_11
2. Singh, T. & Mishra, J. (2021). Learning With Artificial Intelligence Systems:
Application, Challenges, and Opportunities. In Verma, S. & Tomar, P. (Eds.), Impact
of AI Technologies on Teaching, Learning, and Research in Higher Education
(pp. 236–253). IGI Global, Hershey, Pennsylvania, 2021. DOI: 10.4018/978-1-7998-
4763-2.ch015
3. Westera, W., Prada, R., Mascarenhas, S. et al. (2020). Artificial Intelligence Moving
Serious Gaming: Presenting Reusable Game AI Components. Education and
Information Technologies, 25, 351–380. DOI: 10.1007/s10639-019-09968-2
4. Ho, M. (2021, February 2). Video Games and Scientific Research. DOI: 10.31219/osf.io/
cj593
5. Vamsidhar, E., Kanagachidambaresan, G.R., & Prakash, K.B. (2021) Application of
Machine Learning and Deep Learning. In Prakash, K.B. & Kanagachidambaresan, G.R.
(Eds.), Programming With TensorFlow. EAI/Springer Innovations in Communication
and Computing. Springer, Cham. DOI: 10.1007/978-3-030-57077-4_8
6. Ahmad, F., Ahmed, Z., & Muneeb, S. (2021). Effect of Gaming Mode Upon the
Players’ Cognitive Performance During Brain Games Plays: An Exploratory
Research. International Journal of Game-Based Learning, 11(1), 67–76. DOI: 10.4018/
IJGBL.2021010105
7. Spronck, P., Liu, J., Schaul, T., & Togelius, J. (Eds.) (2020). Artificial and Computational
Intelligence in Games: Revolutions in Computational Game AI: Report from Dagstuhl
Seminar 19511. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Dagstuhl. https://
drops.dagstuhl.de/opus/volltexte/2020/12011/
CONTENTS
7.1 Introduction
7.2 Literature Survey
7.2.1 Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
7.2.2 Conditional Generative Adversarial Networks
7.2.3 Image Generation and Recognition
7.2.4 Progressive Growing of GANs for Improved Quality, Stability, and Variation
7.2.5 Diverse Image Generation via Self-Conditioned GANs
7.2.6 Deformable GANs for Pose-Based Human Image Generation
7.2.7 Systematic Analysis of Image Generation Using GANs
7.3 Methodology
7.3.1 Working Intuition
7.3.2 Architecture
7.3.3 Training
7.4 Results
7.5 Conclusion
References
7.1 INTRODUCTION
Artificial intelligence has long been associated with gaming. Deep learning (DL)
research has significantly improved the consistency and innovation of games. DL
has proven to be a highly beneficial addition to game development. DL is used
DOI: 10.1201/9781003231530-7
CGANs are a better option for training both models in a controlled manner. They
work by conditioning both networks, the Discriminator and the Generator, on
additional information during training. The Generator and the Discriminator
remain separate networks. If the Generator and Discriminator are both conditioned
on some additional information, say Y, the GAN can be extended to a conditional
model. Y may be anything, such as class labels, data from other modalities, or
some other type of auxiliary data. The conditioning can be done by feeding Y into
both networks as an extra input layer. In the Generator, the prior input noise and
the information Y are combined in a joint hidden representation. The adversarial
training framework allows for considerable flexibility in how this hidden
representation is composed. In the Discriminator, Y and the sample to be judged
(real data or the Generator’s output) are provided as inputs to a discriminative
function [3].
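One common and simplified way to feed Y into both networks is to append a one-hot
encoding of a class label to each network’s input vector. The Python sketch below
shows only this input construction; the helper names are hypothetical, the
networks themselves are omitted, and concatenation at the input is an assumption
made here for illustration rather than the only way to build the joint
representation.

```python
import random


def one_hot(label: int, num_classes: int) -> list:
    """Encode a class label as a vector with a single 1.0 entry."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


def generator_input(noise, label: int, num_classes: int) -> list:
    """Noise vector followed by the condition Y (here a class label)."""
    return list(noise) + one_hot(label, num_classes)


def discriminator_input(sample, label: int, num_classes: int) -> list:
    """Real or generated sample followed by the same condition Y."""
    return list(sample) + one_hot(label, num_classes)


if __name__ == "__main__":
    rng = random.Random(0)
    z = [rng.uniform(-1.0, 1.0) for _ in range(100)]  # prior input noise
    g_in = generator_input(z, label=3, num_classes=10)
    print(len(g_in), g_in[-10:])
```

Because both networks see the same Y, the Discriminator can reject a sample that
looks realistic but does not match its label, which is what pushes the Generator
to respect the condition.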
shows how the proposed model can be used to learn a multimodal model and gives
examples of how it can be applied to image tagging, demonstrating how the method
can generate descriptive tags that are not part of the training labels.
Using GANs to train generative models has proved to be a promising alternative,
because they sidestep the difficulty of approximating a large number of
intractable probabilistic computations. Adversarial networks have the advantage
that they require neither Markov chains nor inference during learning, because
gradients are obtained using backpropagation, and a wide variety of factors and
interactions can easily be incorporated into the model. Furthermore, the networks
can produce realistic samples and state-of-the-art log-likelihood estimates. In a
generative model that has not been conditioned, there is no control over the
modes of the data being generated. However, by conditioning the model on some
additional information, the data generation process can be guided. The
conditioning could be based on class labels, on a portion of the data (for
inpainting), or on data from a different modality. The paper demonstrates
how to construct CGANs. For empirical results, two sets of experiments
are presented. The model was trained on the MNIST digit dataset conditioned on
class labels in one experiment and on the MIR Flickr 25,000 dataset for
multimodal learning in the other.
A GAN can be extended to a conditional model if the two networks, the
Generator and the Discriminator, are conditioned on some additional information,
say Y. Y may be information from other modalities, class labels, or
some other type of auxiliary data. To perform the conditioning, Y can be fed into
both the Generator and the Discriminator as an extra input layer. In the
Generator, the input noise and Y are combined in a joint hidden representation.
The adversarial training framework allows considerable flexibility in how this
hidden representation is composed. In the Discriminator, Y and the output from
the Generator are provided as inputs to a discriminative function [3].
i. CGANs: CGANs were introduced to condition the GAN model and give it the
ability to guide what it produces. This is accomplished by feeding the
conditioning data as an additional input layer to both the Generator
and the Discriminator models. One of the many benefits of CGAN is that
it allows for better one-to-many mapping representation, which means that
conditioning on a single class (e.g., cat) will synthesize a range of cats with
different colors and features [3].
ii. StackGANs: StackGAN is a two-stage GAN that uses text descriptions
to create photo-realistic images (proposed in 2016 by Zhang et al.). The
problem is broken down into two simpler “stages,” significantly improving
on previous approaches. The first stage (Stage-I GAN) sketches the primitive
shape and basic colors of the object described in the text, producing a
low-resolution image. The second stage (Stage-II GAN) takes both the text
description and the low-resolution image generated in Stage I as input. It adds
detail to the image to make it more photo-realistic and more faithful to the text
description [5].
iii. CycleGANs: CycleGANs rely on a cycle consistency loss, which encourages
an image translated from one domain to another and then back to the original
domain to match the image it started from. While cycle consistency loss
had previously been used in other areas, the CycleGAN was the first time
it was applied to GANs. The CycleGAN enables translation between two
separate image domains by training on two unpaired image sets, one for
each domain. This is possible because the CycleGAN assumes an underlying
relationship between the two domains, allowing for training without
paired training data [4].
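A cycle consistency loss can be sketched as follows. This is a minimal illustration under stated assumptions: images are flattened to plain lists of numbers, the L1 distance is used as the reconstruction error, and the two "generators" are toy functions rather than trained networks.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, g_yx):
    """Penalize the round trips X -> Y -> X and Y -> X -> Y for not returning to the start."""
    forward = l1_distance(g_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(g_yx(y)), y)
    return forward + backward


# Toy stand-ins for the two generators (hypothetical, not trained models).
def double(v):
    return [2.0 * t for t in v]


def halve(v):
    return [0.5 * t for t in v]


def shrink(v):
    return [0.4 * t for t in v]


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(cycle_consistency_loss(x, y, double, halve))   # exact inverses give a loss of zero
print(cycle_consistency_loss(x, y, double, shrink))  # an imperfect inverse is penalized
```

In CycleGAN training this term is added to the usual adversarial losses of the two Generator/Discriminator pairs.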
iv. Self-attention GANs (SAGANs): The SAGAN combines long-range dependency
modeling with self-attention to generate images in which scenes and objects
are related in a way that is consistent with realistic images (proposed in
2018 by Zhang et al.). Self-attention relates different positions of a
sequence to one another in order to compute a representation of each position.
The SAGAN is the first model to incorporate self-attention into GANs. In
general, GAN models employ convolutional layers in their architecture since
they are effective at modeling local dependencies, but CNNs fall short when it
comes to modeling long-range dependencies. By applying self-attention to both
networks, the SAGAN effectively models both local and global dependencies in
an image, ensuring consistency between highly detailed features in different
portions of the generated image [6].
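The core of self-attention can be sketched in a few lines. The sketch assumes that each spatial position of a feature map is represented by one vector and, for brevity, uses the feature vectors directly as queries, keys, and values; the learned projections of the actual SAGAN are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(features):
    """Return, for every position, a weighted mix of the features at all positions.

    `features` is a list of N vectors, one per spatial position. Query, key and
    value projections are the identity here (a simplifying assumption).
    """
    attended = []
    for query in features:
        weights = softmax([dot(query, key) for key in features])
        mixed = [
            sum(w * value[d] for w, value in zip(weights, features))
            for d in range(len(query))
        ]
        attended.append(mixed)
    return attended


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
for row in self_attention(features):
    print(row)
```

Because every output position is computed from all input positions, distant regions can influence each other in a single layer, which is the long-range behavior that convolutions lack.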
i. Feature matching: Changes the Generator’s objective so that it matches the
expected value of the features on an intermediate layer of the Discriminator.
ii. Minibatch discrimination: Deals with mode collapse by making the
Discriminator examine a minibatch of examples rather than a single
example, allowing it to detect whether the Generator is producing the same
outputs. This technique was found to outperform feature matching.
iii. Historical averaging: Keeps track of the historical average of the
parameters and penalizes parameter values that differ significantly from it.
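Two of the techniques listed above reduce to short penalty terms. The sketch below is illustrative only: it assumes that intermediate Discriminator features and model parameters are available as plain lists of numbers, and the function names are hypothetical.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equally sized vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def feature_matching_loss(real_features, fake_features):
    """Squared distance between the mean intermediate features of a real and a generated batch."""
    mean_real = mean_vector(real_features)
    mean_fake = mean_vector(fake_features)
    return sum((a - b) ** 2 for a, b in zip(mean_real, mean_fake))


def historical_average_penalty(params, history):
    """Squared distance between the current parameters and the mean of their past values."""
    average = mean_vector(history)
    return sum((p - a) ** 2 for p, a in zip(params, average))


real = [[1.0, 2.0], [3.0, 4.0]]
fake = [[0.0, 0.0], [0.0, 0.0]]
print(feature_matching_loss(real, fake))
print(historical_average_penalty([2.0, 1.0], [[0.0, 0.0], [2.0, 2.0]]))
```

Either term would be added to the corresponding network's loss during training; minibatch discrimination requires an extra layer inside the Discriminator and is not shown.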
A Framework for Estimation of Generative Models 129
before, the details to be shuttled must be chosen by taking into account the
object-shape deformation, as described by the discrepancy between P(xa) and P(xb).
To do so, the global deformation is broken down into a series of local affine
transformations, which are characterized by subsets of joints in P(xa) and P(xb). The
content of F is deformed using these local masks and affine transformations built
using the relevant joints, and then skip connections are used to copy the transformed
tensor, which is then concatenated with the corresponding tensor in the destination
layer [9].
7.3 METHODOLOGY
a. Convolutional layer: To generate a feature map, this layer applies a feature
detector, which is generally a 3 × 3 matrix of zeros and ones, to the image
matrix, which is also binary. The feature detector strides across the image
matrix until it has traversed it entirely. At each position, it multiplies its
entries element-wise with the overlapping image entries (equivalent to a
logical AND for binary values), adds up the results, and stores the sum in the
feature map. The resulting feature map is a much smaller image that discards
some data but highlights the essential features, which can be used in a more
streamlined manner later [10].
b. Pooling layer: This layer condenses the important features even further and
gives the neural network spatial invariance, so that the essential features can
be skewed, rotated, or shifted and the model will still be able to classify
them. In this case, a window with no trainable weights strides over the feature
map from the convolutional layer and generates a pooled feature map that is even
smaller in scale. After that, the output is flattened into a one-dimensional
vector of features [10].
c. Kernel size: The size of the kernel defines the field of view of the
convolution. A common choice for 2D convolutions is 3, that is, 3 × 3
pixels.
d. Padding: The padding specifies how the border of a sample is treated. If the
kernel is larger than 1, a (half) padded convolution keeps the spatial output
dimensions equal to those of the input, whereas unpadded convolutions crop
away some of the border, reducing the output size. Padding therefore helps to
control the output size.
e. Stride: The stride specifies the kernel’s step size when traversing the image.
The output size decreases as the stride increases. The default stride is
usually 1, but we can use a stride of 2 to down-sample an image in a way
similar to max-pooling.
f. Transpose convolution: A transpose convolution increases the spatial
resolution of its input, performing a convolution whose output is larger than
its input. It is used in the construction of the Generator [10].
g. Dense layer: This is the fully connected layer that takes the inputs, assigns
weights to them, and is responsible for the activation caused by the
features.
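The layer descriptions above can be made concrete with a small sketch. It assumes square inputs, a symmetric integer padding, and the standard output-size formulas for convolutions and transposed convolutions; the binary image and the feature detector are made-up examples, not data from this chapter.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1


def transposed_conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a transposed convolution (no output padding)."""
    return (n - 1) * stride - 2 * padding + kernel


def convolve2d(image, detector, stride=1):
    """Slide the feature detector over the image and sum the element-wise products."""
    k = len(detector)
    size = conv_output_size(len(image), k, stride)
    return [
        [
            sum(
                image[r * stride + i][c * stride + j] * detector[i][j]
                for i in range(k)
                for j in range(k)
            )
            for c in range(size)
        ]
        for r in range(size)
    ]


def max_pool2d(feature_map, window=2, stride=2):
    """Keep the largest value inside each window of the feature map."""
    size = conv_output_size(len(feature_map), window, stride)
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(window)
                for j in range(window)
            )
            for c in range(size)
        ]
        for r in range(size)
    ]


image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
detector = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = convolve2d(image, detector)
print(feature_map)                                   # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
print(max_pool2d(feature_map, window=2, stride=1))   # [[4, 4], [4, 4]]
print(conv_output_size(28, 3))                       # 26: unpadded convolution crops the border
print(conv_output_size(28, 3, padding=1))            # 28: half padding keeps the size
print(conv_output_size(28, 3, stride=2, padding=1))  # 14: a stride of 2 down-samples
print(transposed_conv_output_size(7, 4, stride=2, padding=1))  # 14: up-sampling
```

The last two calls show why strided convolutions suit the Discriminator (shrinking a 28 × 28 input) while transposed convolutions suit the Generator (growing a small feature map toward the image size).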
7.3.1 Working Intuition
GANs are trained using two models, a Discriminator and a Generator, that compete
against each other while learning and updating their weights. The
Generator’s task is to produce images that resemble the target images closely
enough to trick the Discriminator into thinking they are real images rather
than generated ones. The Discriminator’s goal is to determine whether the
images are real or generated. The Discriminator is trained on both authentic
and generated images (initially random noise). The parameters of both the Generator
and the Discriminator are updated after each iteration. The Generator is given the
parameters individually to correct its output rather than adding up the values and
calculating the cost function as a whole [11].
Both the Generator and the Discriminator are trained at the same time to prevent
one of the networks (typically the Discriminator) from becoming too strong for the
other, which would result in random outputs because the Generator would not know
which parameters to tweak to deceive the Discriminator.
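For reference, the competition described above is usually written as the minimax objective of the original GAN paper [1]. The notation is an assumption introduced here, since this section does not define its own symbols: $p_{\text{data}}$ is the data distribution, $p_z$ is the noise prior, $D(x)$ is the Discriminator's estimate that $x$ is real, and $G(z)$ is the image generated from noise $z$.

```latex
\[
\min_G \max_D V(D, G) =
\mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
\]
```

The Discriminator maximizes both terms, while the Generator can only influence the second term and tries to minimize it.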
7.3.2 Architecture
We start by creating a standalone Discriminator model that takes a 28 × 28 input
image. The first block pairs a convolutional layer (128 filters, a 3 × 3 kernel,
a stride of 2, and “same” padding) with a LeakyReLU layer with an alpha of 0.2.
This is followed by a LeakyReLU layer with an alpha of 0.2 and a
7.3.3 Training
We first need to create two instances of the Discriminator. One receives the
real images and should learn to output high values, meaning that it judges the
images to be real. The other receives the generated images and should learn to
output low values, meaning that it judges the images to be fake. To accomplish
this, we use the binary cross-entropy function [12].
The Generator tries to achieve the opposite goal: to make the Discriminator assign
high values to the fake images it creates.
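The two objectives can be written down with the binary cross-entropy function. The following is a minimal sketch, assuming that the Discriminator outputs probabilities between 0 and 1 and that the label 1 means "real"; the function names are hypothetical and no deep learning framework is used.

```python
import math


def binary_cross_entropy(predictions, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(predictions, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(predictions)


def discriminator_loss(d_real, d_fake):
    """Real images should score close to 1, generated images close to 0."""
    real_loss = binary_cross_entropy(d_real, [1.0] * len(d_real))
    fake_loss = binary_cross_entropy(d_fake, [0.0] * len(d_fake))
    return real_loss + fake_loss


def generator_loss(d_fake):
    """The Generator wants the Discriminator to score its images close to 1."""
    return binary_cross_entropy(d_fake, [1.0] * len(d_fake))


# Discriminator outputs for a batch of two real and two generated images (made-up values).
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]))  # low: the Discriminator is doing well
print(generator_loss([0.1, 0.2]))                  # high: the Generator is being caught
```

The same Discriminator outputs on generated images therefore pull the two losses in opposite directions, which is the competition described in Section 7.3.1.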
Both the Generator and the Discriminator need to be optimized, for which
two different optimizers are created. It is critical to specify the variables each
optimizer should change; otherwise, the Generator’s optimizer might alter the
Discriminator’s variables and vice versa.
Optimizers modify the weight parameters in order to minimize the loss function.
The loss function serves as a map of the terrain, indicating whether the optimizer
is heading in the right direction to reach the bottom of the valley, the global
minimum. The Adam optimizer is used here.
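The Adam optimizer follows this equation:
\[
\theta_{t+1} = \theta_t - \frac{\alpha \, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\]
where,
\[
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t},
\]
\[
m_t = (1 - \beta_1)\, g_t + \beta_1 m_{t-1} \quad \text{and} \quad
v_t = (1 - \beta_2)\, g_t^2 + \beta_2 v_{t-1}
\]
where,
α is the step size,
β1 and β2 are decay rates,
g_t is the gradient of the loss at timestep t,
ε is a small constant that prevents division by zero,
m_t is the first-moment vector,
v_t is the second-moment vector, and
t in the subscript is the timestep.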
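The update rule can be traced step by step with a small sketch. It assumes the parameters and gradients are plain lists of numbers and uses the default hyperparameters commonly quoted for Adam (step size 0.001, decay rates 0.9 and 0.999, and a small constant of 1e-8); the toy objective is an assumption chosen only for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new parameters and moment vectors."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = [beta1 * mi + (1.0 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1.0 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction for the zero-initialized moment vectors.
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]
    # Parameter update.
    theta = [
        th - alpha * mh / (math.sqrt(vh) + eps)
        for th, mh, vh in zip(theta, m_hat, v_hat)
    ]
    return theta, m, v


# Toy example: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = [1.0], [0.0], [0.0]
for t in range(1, 201):
    grad = [2.0 * x for x in theta]
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.1)
print(theta)
```

In a GAN, one such optimizer state (the vectors m and v) is kept for the Generator and a separate one for the Discriminator, so that each update touches only its own network's parameters.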
7.4 RESULTS
The battle in GANs is not about achieving the highest accuracy and lowest loss
for the Discriminator. That would have the exact opposite result to what the
model is aiming for. Instead, we want to train the Discriminator as well as
possible to distinguish real data (from the data distribution) from fake data
(generated by the Generator), and then train the Generator to reduce its loss as
much as possible. Both models are trained in parallel, with the idea that if one
of them is too intelligent for the other, we will end up with two dumb models
that cannot learn from each other.
The Generator’s loss during training is shown in Figure 7.1.
The graph in Figure 7.1 was obtained after 100 epochs with a batch size of
32 images. The model shows decent performance in creating images,
since animated images are not too graphically demanding to create.
7.5 CONCLUSION
In terms of image creation, the model performs admirably. Batch normalization
was not applied in the Generator at first, which reduced its effectiveness.
Learning improved significantly after batch normalization layers were added.
The size of the dense layer had to be balanced, because a layer that was too large
resulted in the model generating images that were too similar. Conversely, a layer
that was too small resulted in the Generator failing to learn and update its
weights effectively, resulting in distorted images. It must be ensured that the
Generator neither overfits nor underfits, and tuning must be meticulous.
REFERENCES
1. Generative Adversarial Nets by Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza,
Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.
https://arxiv.org/pdf/1406.2661.pdf
2. Unsupervised Representation Learning With Deep Convolutional Generative
Adversarial Networks by Alec Radford, Luke Metz, Soumith Chintala.
https://arxiv.org/pdf/1511.06434v2.pdf
3. Conditional Generative Adversarial Nets by Mehdi Mirza, Simon Osindero.
https://arxiv.org/pdf/1411.1784.pdf
CONTENTS
8.1 Introduction
8.2 Background
8.2.1 Procedural Content Generation (PCG)
8.2.1.1 Game Bits
8.2.1.2 Game Space
8.2.1.3 Game Systems
8.2.1.4 Game Scenarios
8.2.1.5 Game Design
8.2.1.6 Derived Content
8.2.2 Procedural Content Generation Using Machine Learning (PCGML)
8.2.3 Deep Learning in PCG
8.2.4 Generative Adversarial Networks
8.3 Overview of GAN in Video Games
8.3.1 Level/Map Generation
8.3.2 Height Map Generation
8.3.3 Texture Synthesis in Games
8.3.4 Characters or Face Generation
8.4 Overview of Datasets and Games
8.4.1 Popular Games
8.4.1.1 Super Mario Bros (1985)
8.4.1.2 The Legend of Zelda (1986)
8.4.1.3 DOOM (1993)
8.4.1.4 Pac-man (1980)
8.4.1.5 StarCraft (1998)
8.4.2 Popular Datasets
8.4.2.1 Video Game Level Corpus (VGLC)
8.4.2.2 Idgames Archive
8.1 INTRODUCTION
Procedural Content Generation (PCG) is the automation of content production
through algorithmic means. The content can be anything that would otherwise
require human involvement, from images to videos, poetry to music, and paintings
to architectural designs. Owing to its capability to augment human creativity with
limited or no human contribution, PCG has been an integral aspect of game
development and technical games research for years. The increasing prominence of
PCG in games is due to its potential to increase replay value, reduce
development costs and effort, optimize storage space, or simply improve the
aesthetics. With games as the prime focus, PCG refers to the creation of game
content, such as textures, maps, levels, quests, characters, stories, ecosystems,
sound effects, weapons, or even game mechanics, player interaction, and rules.
PCG for games is broadly divided into two categories, namely functional and
cosmetic. Functional PCG covers aspects such as game space, systems, and scenarios
[1], which include maps, ecosystems, levels, and stories. It mainly focuses
on player interaction and game mechanics to enable state-of-the-art game
experiences and to adapt games to the player. Cosmetic PCG,
on the other hand, focuses on graphics and visualization, vegetation, architecture,
textures, and sounds. It also includes the set of rules and mechanics of the game that
constitute the game design [1].
The most sought-after and elusive goals of PCG are the ability to
control the generated content (controllability), to express a variety of
non-repetitive content (expressivity), and to generate believable content
(believability) [2]. The techniques used in applications of PCG in games to
achieve these goals are broadly categorized as constructive (or traditional) and
artificial intelligence-based (AI-based). The simpler traditional approaches use
grammars, noise-based algorithms, fractals, and pseudo-random number generators.
The AI-based techniques are further
classified into search-based, solver-based, machine learning-based (ML-based),
and subsequently, deep learning-based (DL-based). The search-based methods [3]
employ evolutionary algorithms and stochastic and metaheuristic
search/optimization techniques, while solver-based methods [4] use the design
space approach. The need to be fine-tuned or even explicitly designed for specific
generation tasks is the primary flaw in standard/constructive PCG, and it gave
rise to the development and advancement of ML-based PCG (MLPCG). MLPCG aims to design
general algorithms that can create vast amounts of content using reasonable amounts
of data. ML-based methods offer a wide range of algorithms to choose from, such
as n-grams, Markov models, recurrent neural networks, and autoencoders, to name a
few. Some of the most famous algorithms deployed under DL methods are neural
Generative Adversarial Networks Based PCG for Games 139
8.2 BACKGROUND
This section reviews PCG along with its implementation via ML
and DL methodologies for several video games. The procedural generation
of game content (six classes) using different approaches is also analyzed.
A comprehensive study of these algorithms highlighted the recent exploration
of GANs and their implementation by several scholars. However, the literature
available is extremely limited and has not been structured properly for future
research in the same domain.
environment, road network, and ecosystems along with complex player interaction
with the virtual world that makes these games more lifelike.
have been implemented on Legend of Zelda for level and room creation. Researchers
have also worked on predicting resource locations on StarCraft II maps using neural
networks [25].
8.3.1 Level/Map Generation
Level generation is the process responsible for creating a playable and interesting
environment or scenario for gameplay and is the most important aspect of any
video game. Fabricating levels requires designers to have both artistic and technical
acumen. Generating arcade maps also requires extensive playtesting; hence, over the
past few years, designers have leaned toward exploiting ML for modeling
level designs. Video games either use the generated 2D levels directly or convert
them to 3D maps for more advanced games. Several techniques have been
implemented to create levels using PCG. However, GANs [5] have recently emerged as
the most popular deep generative method for arcade content generation.
In 2018, Giacomello et al. [27] applied GANs to generate new
and unique levels for the DOOM [28] video game series. Their approach implemented
a Wasserstein GAN with Gradient Penalty (WGAN-GP) in two ways. The
first, named the unconditional WGAN-GP, had only images and a noise vector as the
input to the model. The conditional WGAN-GP, on the other hand, had
an additional input of features extracted from existing DOOM levels and showed
better results than the unconditional model. Volz et al. [16] worked on
another popular game by Nintendo, Super Mario Bros [29], for stage creation. In
this research, the main intent was to generalize the generation process across
different games; hence, the authors divided the technique into two phases. In the
first phase, an unsupervised method is used to train the GAN model; the second
phase identifies ideal input vectors in the latent space. To avoid random sample
generation, evolutionary control in the form of the Covariance Matrix Adaptation
Evolution Strategy (CMA-ES) [30] is applied while exploring the inputs, followed
by fitness function evaluation. This procedure was extended by Giacomello et al.
[31], who used the CMA-ES technique on DOOM to generate novel levels according to the
features required in the new gameplay stage. Three main types of levels were
considered for this experiment: arena levels, labyrinth levels, and complex levels.
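The idea of searching a trained Generator's latent space with an evolutionary method can be sketched as follows. This is not CMA-ES: it is a much simpler evolution strategy with a fixed step size and no covariance adaptation, swapped in to keep the example short. The generator is a toy stand-in for a trained GAN, and the fitness function and every parameter value are hypothetical.

```python
import random


def toy_generator(latent):
    """Stand-in for a trained GAN Generator: maps a latent vector to a row of tiles."""
    return ["X" if value > 0.0 else "-" for value in latent]


def fitness(level, target_obstacles=3):
    """Higher is better: zero when the level has exactly the desired number of obstacles."""
    return -abs(level.count("X") - target_obstacles)


def evolve_latent(generator, fitness_fn, dim=8, population=20, parents=5,
                  generations=30, sigma=0.5, seed=0):
    """Search the latent space with a simple evolution strategy (not full CMA-ES)."""
    rng = random.Random(seed)
    best = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    best_score = fitness_fn(generator(best))
    mean = list(best)
    for _ in range(generations):
        # Sample candidate latent vectors around the current mean.
        candidates = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(population)
        ]
        # Rank candidates by the fitness of the levels they generate.
        ranked = sorted(candidates, key=lambda c: fitness_fn(generator(c)), reverse=True)
        elite = ranked[:parents]
        # Move the search distribution toward the best candidates.
        mean = [sum(column) / parents for column in zip(*elite)]
        top_score = fitness_fn(generator(ranked[0]))
        if top_score > best_score:
            best, best_score = ranked[0], top_score
    return best, best_score


best_latent, best_score = evolve_latent(toy_generator, fitness)
print("".join(toy_generator(best_latent)), best_score)
```

The Generator is never retrained in this loop; only the input vector changes, which is why the search can target features that were not labeled in the training data.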
The education domain is considered challenging for creating video games with
PCG because the games must fulfill learning objectives.
Therefore, there has been limited research in this field. However, Park et al. [32]
proposed a multistep DCGAN (deep convolutional GAN) for generating novel and
solvable levels. The model comprises two Generators, where the first one is responsible
for creating a large dataset of synthetic levels from a small set of example stages. The
generated data is then fed to the second Generator to produce levels with increased
solvability. Furthermore, Torrado et al. [33] proposed the conditional embedding
self-attention generative adversarial network (CESAGAN), in which a bootstrapping
technique was integrated to train the Generator and Discriminator efficiently.
The model combines the self-attention GAN (SAGAN) [34] with a conditional vector
to improve the diversity, uniqueness, and playability of the generated levels.
Gutierrez and Schrum [35] integrated a GAN and a graph grammar to generate
dungeon rooms for The Legend of Zelda [36] video game. In their work, the
model employs the GAN and the graph grammar together, which resulted in the
generation of interesting and playable dungeon layouts with obstacles and items
placed in them. Following this research, Schrum et al. [37] combined a GAN with a
Compositional Pattern Producing Network (CPPN) [38] to tackle the issue of
arranging structured levels from segments. The CPPN exploited the latent vectors
that are directly associated with game segments, which resulted in the generation
of complete levels.
Furthermore, Awiszus et al. [17] introduced TOAD-GAN, a methodology
for generating new Mario levels when data is scarce. The proposed technique
is based on the SinGAN architecture [39] and uses a single training level to
produce tile-based game stages. Bontrager and Togelius [40] have recently proposed
a notable methodology for a 2D dungeon crawling game by implementing Generative
Playing Networks (GPN). The model comprises a Generator and an agent that
communicate directly to generate gameplay levels, which reduces the amount
of training data required. Moreover, Constrained Adversarial Networks (CANs) by
Di Liello et al. [41] penalize the GAN for generating inappropriate and invalid
structures. The application of CANs has produced efficient results without
affecting the run-time of the model. Moreover, GameGAN, a recent development by
NVIDIA [20], has integrated deterministic and modeling algorithms on Pac-Man to
generate novel aspects of the gameplay while providing high visual consistency.
Apart from specific models, there has been research on developing a general
method for game content generation. The first was introduced by Irfan et al. [42],
where a DCGAN was trained on three different games: Colourescape, Zelda, and
Freeway. This research observed that the generated stages were playable and that
a large amount of data helped the DCGAN capture the required information in the
levels. Another approach was proposed by Kumaran et al. [43], who trained a
branched GAN that can generate levels for four different arcade games using a
single random vector as the input to the architecture while capturing the
variation present in the training data.
as modeling of level maps might require a height map, researchers have also explored
the domain of generating height maps from 2D images. Compared with level
generation, research resources are limited in the area of height map generation.
Broadly, there have been four works on producing height maps in video games
using GANs.
Beckham and Pal [44] introduced the first technique to generate
height and texture maps using GANs. They first trained a DCGAN on their data to
produce height maps, which were then processed using a pix2pix GAN to generate
the corresponding textures. In that research, they rendered the height and
texture maps to generate 3D terrain for video games. Furthermore, Wulff-Jensen
et al. [45] in 2018 worked on Digital Elevation Maps (DEM) of the Alps [46] to
generate 3D landscapes for enhancing gameplay interactivity. They trained a DCGAN
on the elevation maps and then converted the outputs to 3D maps with Unity3D. The
resulting maps were realistic and could be deployed when creating different arcade
games.
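Turning a generated height map into 3D terrain is largely a geometric step. The sketch below is one possible way to do it, not the pipeline used in [44] or [45]: it assumes the height map is a small grid of values, that the vertical axis is y, and that each grid cell is split into two triangles. The grid values and scale factors are made up.

```python
def heightmap_to_mesh(heights, cell_size=1.0, height_scale=1.0):
    """Convert a grid of heights into mesh vertices and triangle indices."""
    rows, cols = len(heights), len(heights[0])
    # One vertex per grid point: (x, y, z) with y as the vertical axis (assumption).
    vertices = [
        (c * cell_size, heights[r][c] * height_scale, r * cell_size)
        for r in range(rows)
        for c in range(cols)
    ]
    # Two triangles per grid cell, referring to vertices by index.
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            top_left = r * cols + c
            top_right = top_left + 1
            bottom_left = top_left + cols
            bottom_right = bottom_left + 1
            triangles.append((top_left, bottom_left, top_right))
            triangles.append((top_right, bottom_left, bottom_right))
    return vertices, triangles


heights = [
    [0.0, 0.5, 1.0],
    [0.2, 0.7, 0.9],
]
vertices, triangles = heightmap_to_mesh(heights, cell_size=2.0, height_scale=10.0)
print(len(vertices), len(triangles))  # 6 vertices and 4 triangles for a 2 x 3 grid
```

A game engine would then apply the generated texture map to this mesh; the vertex and index lists are the form in which most engines accept custom geometry.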
In 2019, Spick and Walker [47] worked on creating a model that could generate
several variants of a specific landscape region. For this purpose, they trained a
Spatial GAN (SPA-GAN) [48] with non-spatial feature learning. The height and texture
maps were rendered through a 3D game engine to produce realistic and endless
variants of any specific area.
Recently, there has been a novel approach to generating terrain for games from
rough sketches. Wang and Kurabayashi [49] proposed a methodology involving
a two-phase generative model. In the first phase, a conditional generative
adversarial network (cGAN) generates different variants of an elevation bitmap
from the sketch. The model in this phase generates maps corresponding to the terrain
data that was used for training. In the second phase, a deterministic algorithm
produces the actual terrain asset by interpreting the details of the elevation
bitmap. Hence, this approach can help designers create interactive game
maps by simply converting their rough sketches.
8.4.1 Popular Games
Games have always been favored by people of all generations, and hence this
area has steadily advanced with time. Earlier, games were created manually,
after which methods for automatic creation were examined, leading to
interest in PCG algorithms. In recent years, researchers have
become interested in integrating these algorithms with advanced techniques like
GANs for richer and enhanced content. Throughout this period, certain games have
been favored by researchers for experiments. Therefore, in this subsection, we
discuss some of those games.
TABLE 8.1
Main Contributions of the Above Explored Algorithms
consists of eight “worlds,” each world with its own set of four sub-levels called
“stages.”
Each game starts with the player having a certain number of lives, which may be
increased by defeating several enemies in a row with a
Koopa shell, picking up certain power-ups, collecting a certain number of coins,
or bouncing on enemies successively without touching the ground.
Picking up a Super Mushroom power-up converts Mario into big mode, and subsequent
contact with enemies reverts him to the regular state instead of killing him. The
lives count decreases if Mario takes damage while small, falls into a bottomless
pit, or exceeds the time limit. Once the lives count reaches zero, Mario dies and
the evaluation ends.
(cyan), Clyde (orange), Pinky (pink), and Blinky (red). Each ghost is given its own
unique and distinct “personality” and operates differently, albeit with the same goal
of catching the protagonist. The aim is to eat all the dots in the maze while
avoiding the ghosts. Successful consumption of all dots in a level advances the player
to the next, more challenging level.
The evaluation in this game begins with a certain number of lives allotted to the
player, and each contact with a ghost results in the loss of a life. The seemingly
routine play is enriched with energizers, warp tunnels, and bonus items. The
evaluation ends when the lives count is exhausted.
8.4.2 Popular Datasets
Data is the primary and essential element for research in any domain. In game
content creation, some popular corpora are employed frequently when PCG is integrated
with GANs. The following subsections give a brief overview of these datasets.
it still receives weekly entries for Doom games, which further assists research in
this domain. The database consists of several different types of training data for
Doom, including levels, textures or skins, music, and many more.
TABLE 8.2
Popular Datasets in Video Game Content Generation
Dataset                    Number of Games Available   Most Prominent Games                          Dataset Composition
VGLC [56]                  12                          Super Mario Bros, Doom, The Legend of Zelda   Tiles, Graph, Vector
Idgames Archive [57, 58]   1                           Doom                                          Levels, Skins, Music, etc.
GVGAI [59, 60]             No specific count           Random generation of data                     Two categories: level and rule generation
of progress monitoring, using DG, has been satisfactorily catered to. Parallel
Nash Memory (PNM) or the addition of fake uniform data has been used
as a guiding component for adversarial training [10]. Another suggestion has
been to optimize game-centric GANs by using mixed strategies, such as NE itself,
with gradient descent. So far, however, this is only of theoretical interest, as
there is no known DL (i.e., gradient-based) method yet for optimizing these mixed
strategies that can handle the large support sizes needed in games.
Yet another unexplored area in game-centric GAN development is disentanglement.
Disentanglement is learning to distinguish between the distinct, informative
factors of variation in data. Unlike Bayesian generative models with their
probabilistic framework, GANs lack a sample likelihood and posterior
inference for latent variables. Therefore, learning a factorized representation, or
especially disentangling the interclass variation, poses an obvious obstacle
to the advancement of GANs. Improving disentanglement for GANs used specifically
for games is essential to attain the goals of believability and expressivity that
PCG demands. Disentanglement methods for GANs are limited, and even
scarcer with respect to gaming requirements. They can be classified into supervised
and unsupervised methods. However, the theoretical impossibility of unsupervised
disentanglement learning without inductive biases [65] has been a topic of debate and
has led to increased interest in contrastive learning [66–68], another barely explored
area that needs little supervision, in contrast to completely unsupervised alternatives.
8.6 CONCLUSION
GAN-based PCG for games has promising potential to significantly
reduce the cost and development time of games and to improve replay value by
producing high-quality fresh content with compelling aesthetics, while also
optimizing memory use. A fairly modern research topic, it descends from ML-based
(DL-based) PCG and is yet to be fully explored; therefore, little intersection
of academic research and industrial development is found. This survey attempts to
provide a comprehensive account of how constructive PCG paved the path for AI-based
PCG, then MLPCG, and eventually PCG-GAN. It lists the various
challenges and potential of former PCG models: constructive, search-based,
solver-based, simple CNN-based, and other regenerative-based models, among others.
This chapter also explores how PCG-GAN can prove to be a futuristic attempt
at providing a rich, photorealistic gaming experience and playable, authentic
gaming content superior to previously generated content and comparable to
creations based on human augmentation. A multitude of PCG models is examined, each
with a detailed description, so that this survey can act as a one-stop reference
for a comprehensive overview of the work done in this field. Since the major point
under consideration is to demonstrate the potential of GANs for games, a variety of
GAN-based games are presented along with the datasets they utilize. This chapter also
sheds light on some of the ready-to-use, off-the-shelf tools that have already been
used and can be utilized further for advancement. The different models with which
various researchers have trained GANs for games have also been elaborated on to
understand and build upon.
Beyond the typical hardware and GPU-related issues faced in most
GAN applications, the specific problems associated with applying GANs
to games, such as the difficulty of disentanglement, the lack of exploration of
content creation beyond level generation, and the overpowering of networks, have
also been briefly discussed. Given the problems faced and the enormous scope for
future work, this chapter concludes with a discussion of potential future
work that can be carried out to advance the application of GANs
for games.
REFERENCES
1. Hendrikx, M., Meijer, S., Velden, J.V.D., & Iosup, A. Procedural Content Generation
for Games: A Survey. ACM Transactions on Multimedia Computing, Communications
and Applications (ACM TOMCCAP), 9, 1–22 (2013).
2. Barriga, N.A. A Short Introduction to Procedural Content Generation Algorithms for
Videogames. International Journal on Artificial Intelligence Tools, 28(11), 1–12 (2019),
1930001.
3. Togelius, J., Yannakakis, G.N., Stanley, K., & Browne, C. Search-Based Procedural
Content Generation: A Taxonomy and Survey. IEEE Transactions on Computational
Intelligence and AI in Games, 3, 172–186 (2011).
4. Smith, A.M., & Mateas, M. Answer Set Programming for Procedural Content Generation:
A Design Space Approach. IEEE Transactions on Computational Intelligence and AI in
Games, 3, 187–200 (2011).
5. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A.C., & Bengio, Y. Generative Adversarial Nets, NIPS, 2672–2680 (2014).
6. Kingma, D.P., & Welling, M. Auto-Encoding Variational Bayes. CoRR. abs/1312.6114
(2014).
7. Greff, K., Srivastava, R., Koutník, J., Steunebrink, B., & Schmidhuber, J. LSTM: A
Search Space Odyssey. IEEE Transactions on Neural Networks and Learning Systems,
28, 2222–2232 (2017).
8. Hochreiter, S., & Schmidhuber, J. LSTM can Solve Hard Long Time Lag Problems.
NIPS (1996).
9. Summerville, A., Snodgrass, S., Guzdial, M., Holmgård, C., Hoover, A.K., Isaksen,
A., Nealen, A., & Togelius, J. Procedural Content Generation via Machine Learning
(PCGML). IEEE Transactions on Games, 10, 257–270 (2018).
10. Oliehoek, F.A., Savani, R., Gallego-Posada, J., Pol, E.V., Jong, E.D., & Groß, R.
GANGs: Generative Adversarial Network Games. ArXiv. abs/1712.00679 (2017).
11. Kempka, M., Wydmuch, M., Runc, G., Toczek, J., & Jaśkowski, W. ViZDoom: A Doom-
based AI research platform for visual reinforcement learning. 2016 IEEE Conference
on Computational Intelligence and Games (CIG), 1–8 (2016).
12. Esparcia-Alcázar, A., García, A.M., Guervós, J.J., & García-Sánchez, P. Controlling
bots in a First-Person Shooter game using genetic algorithms. IEEE Congress on
Evolutionary Computation, 1–8 (2010).
13. Bojarski, S., & Congdon, C. REALM: A Rule-Based Evolutionary Computation Agent
that Learns to Play Mario. Proceedings of the 2010 IEEE Conference on Computational
Intelligence and Games, 83–90 (2010).
14. Rhalibi, A., & Merabti, M. A Hybrid Fuzzy ANN System for Agent Adaptation in
a First-Person Shooter. International Journal of Computer Games Technology, 2008,
432365:1–432365:18 (2008).
15. Karpov, I.V., Schrum, J., & Miikkulainen, R. Believable Bot Navigation via Playback
of Human Traces. Believable Bots (2012).
154 Deep Learning in Gaming and Animations
16. Volz, V., Schrum, J., Liu, J., Lucas, S., Smith, A., & Risi, S. Evolving Mario Levels in the
Latent Space of a Deep Convolutional Generative Adversarial Network. Proceedings of
the Genetic and Evolutionary Computation Conference (2018).
17. Awiszus, M., Schubert, F., & Rosenhahn, B. TOAD-GAN: Coherent Style Level
Generation from a Single Example. Proceedings of the AAAI Conference on Artificial
Intelligence and Interactive Digital Entertainment, 16(1), 10–16. ArXiv. abs/2008.01531
(2020).
18. Summerville, A., & Mateas, M. Super Mario as a String: Platformer Level Generation
via LSTMs. ArXiv. abs/1603.00930 (2016).
19. Sarkar, A., Yang, Z., & Cooper, S. Controllable Level Blending between Games using
Variational Autoencoders. ArXiv. abs/2002.11869 (2020).
20. Kim, S.W., Zhou, Y., Philion, J., Torralba, A., & Fidler, S. Learning to Simulate
Dynamic Environments With GameGAN. 2020 IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR), 1228–1237 (2020).
21. Dahlskog, S., Togelius, J., & Nelson, M.J. Linear Levels Through n-Grams. MindTrek,
200–206 (2014).
22. Jain, R., Isaksen, A., Holmgard, C., & Togelius, J. Autoencoders for level generation,
repair, and recognition. In Proceedings of the ICCC Workshop on Computational
Creativity and Games (2016).
23. Snodgrass, S., & Ontañón, S. Experiments in Map Generation Using Markov Chains.
FDG (2014).
24. Summerville, A., & Mateas, M. Sampling Hyrule: Multi-Technique Probabilistic Level
Generation for Action Role Playing Games. AIIDE (2015).
25. Lee, S., Isaksen, A., Holmgård, C., & Togelius, J. Predicting Resource Locations in
Game Maps Using Deep Convolutional Neural Networks. AAAI (2016).
26. Liu, J., Snodgrass, S., Khalifa, A., Risi, S., Yannakakis, G.N., & Togelius, J. Deep
Learning for Procedural Content Generation. Neural Computing and Applications. 33,
19–37 (2021).
27. Giacomello, E., Lanzi, P.L., & Loiacono, D. DOOM Level Generation Using Generative
Adversarial Networks. 2018 IEEE Games, Entertainment, Media Conference (GEM),
316–323 (2018).
28. Doom (franchise). This page was last edited on 13 September 2021, at 01:55 (UTC)
https://en.wikipedia.org/wiki/Doom (franchise).
29. Super Mario Bros. This page was last edited on 13 September 2021, at 17:08 (UTC)
https://en.wikipedia.org/wiki/Super_Mario_Bros.
30. Hansen, N., Müller, S., & Koumoutsakos, P. Reducing the Time Complexity of the
Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES).
Evolutionary Computation, 11, 1–18 (2003).
31. Giacomello, E., Lanzi, P.L., & Loiacono, D. Searching the Latent Space of a Generative
Adversarial Network to Generate DOOM Levels. 2019 IEEE Conference on Games
(CoG), 1–8 (2019).
32. Park, K., Mott, B., Min, W., Boyer, K., Wiebe, E., & Lester, J.C. Generating Educational
Game Levels with Multistep Deep Convolutional Generative Adversarial Networks.
2019 IEEE Conference on Games (CoG), 1–8 (2019).
33. Torrado, R., Khalifa, A., Green, M.C., Justesen, N., Risi, S., & Togelius, J. Bootstrapping
Conditional GANs for Video Game Level Generation. 2020 IEEE Conference on
Games (CoG), 41–48 (2020).
34. Zhang, H., Goodfellow, I.J., Metaxas, D.N., & Odena, A. Self-Attention Generative
Adversarial Networks. ICML (2019).
35. Gutierrez, J., & Schrum, J. Generative Adversarial Network Rooms in Generative Graph
Grammar Dungeons for The Legend of Zelda. 2020 IEEE Congress on Evolutionary
Computation (CEC), 1–8 (2020).
Generative Adversarial Networks Based PCG for Games 155
36. The Legend of Zelda. This page was last edited on 12 September 2021, at 15:53 (UTC)
https://en.wikipedia.org/wiki/The_Legend_of_Zelda.
37. Schrum, J., Volz, V., & Risi, S. CPPN2GAN: Combining Compositional Pattern
Producing Networks and GANs for Large-Scale Pattern Generation. Proceedings of
the 2020 Genetic and Evolutionary Computation Conference (2020).
38. Stanley, K. Compositional Pattern Producing Networks: A Novel Abstraction of
Development. Genetic Programming and Evolvable Machines, 8, 131–162 (2007).
39. Shaham, T.R., Dekel, T., & Michaeli, T. SinGAN: Learning a Generative Model From
a Single Natural Image. 2019 IEEE/CVF International Conference on Computer Vision
(ICCV), 4569–4579 (2019).
40. Bontrager, P., & Togelius, J. Fully Differentiable Procedural Content Generation
through Generative Playing Networks. ArXiv. abs/2002.05259 (2020).
41. Gobbi, J., Liello, L.D., Ardino, P., Morettin, P., Teso, S., & Passerini, A. Efficient
Generation of Structured Objects with Constrained Adversarial Networks. ArXiv.
abs/2007.13197 (2020).
42. Irfan, A., Zafar, A., & Hassan, S. Evolving Levels for General Games Using Deep
Convolutional Generative Adversarial Networks. 2019 11th Computer Science and
Electronic Engineering (CEEC), 96–101 (2019).
43. Kumaran, V., Mott, B., & Lester, J.C. Generating Game Levels for Multiple Distinct
Games With a Common Latent Space. AAAI, 16(1), 109–115 (2020).
44. Beckham, C., & Pal, C. A Step Towards Procedural Terrain Generation With GANs.
ArXiv. abs/1707.03383 (2017).
45. Wulff-Jensen, A., Rant, N.N., Møller, T.N., & Billeskov, J. Deep Convolutional
Generative Adversarial Network for Procedural 3D Landscape Generation Based on
DEM. ArtsIT/DLI, 85–94 (2017).
46. Ferranti, Jonathan de. Viewfinder Panoramas (2012).
47. Spick, R.R., & Walker, J. Realistic and Textured Terrain Generation using GANs.
European Conference on Visual Media Production (2019).
48. Jetchev, N., Bergmann, U., & Vollgraf, R. Texture Synthesis With Spatial Generative
Adversarial Networks. ArXiv. abs/1611.08207 (2016).
49. Wang, T., & Kurabayashi, S. Sketch2Map: A Game Map Design Support System
Allowing Quick Hand Sketch Prototyping. 2020 IEEE Conference on Games (CoG),
596–599 (2020).
50. Fadaeddini, A., Majidi, B., & Eshghi, M. A Case Study of Generative Adversarial
Networks for Procedural Synthesis of Original Textures in Video Games. 2018
2nd National and 1st International Digital Games Research Conference: Trends,
Technologies, and Applications (DGRC), 118–122 (2018).
51. Horsley, L., & Liebana, D.P. Building an Automatic Sprite Generator with Deep
Convolutional Generative Adversarial Networks. 2017 IEEE Conference on
Computational Intelligence and Games (CIG), 134–141 (2017).
52. Hong, S., Kim, S., & Kang, S. Game Sprite Generator Using a Multi Discriminator
GAN. KSII Transactions on Internet and Information Systems, 13, 4255–4269 (2019).
53. Serpa, Y.R., & Rodrigues, M.A. Towards Machine-Learning Assisted Asset Generation
for Games: A Study on Pixel Art Sprite Sheets. 2019 18th Brazilian Symposium on
Computer Games and Digital Entertainment (SBGames), 182–191 (2019).
54. Pac-Man. This page was last edited on 14 September 2021, at 00:21 (UTC) https://
en.wikipedia.org/wiki/Pac-Man.
55. Blizzard Entertainment. StarCraft (1998). https://starcraft2.com/en-us/.
56. Summerville, A., Snodgrass, S., Mateas, M., & Ontañón, S. The VGLC: The Video
Game Level Corpus. ArXiv. abs/1606.07487 (2016).
57. https://www.doomworld.com/idgames/.
58. idgames archive | Doom Wiki | Fandom, https://doom.fandom.com/wiki/Idgames_archive.
156 Deep Learning in Gaming and Animations
59. http://www.gvgai.net/.
60. Khalifa, A., Liebana, D.P., Lucas, S., & Togelius, J. General Video Game Level
Generation. Proceedings of the Genetic and Evolutionary Computation Conference
2016 (2016).
61. https://topex.ucsd.edu/WWW_html/srtm30_plus.html.
62. Mescheder, L.M., Geiger, A., & Nowozin, S. Which Training Methods for GANs do
actually Converge? ICML (2018).
63. Gidel, G., Hemmat, R.A., Pezeshki, M., Huang, G., Priol, R.L., Lacoste-Julien,
S., & Mitliagkas, I. Negative Momentum for Improved Game Dynamics. ArXiv.
abs/1807.04740 (2019).
64. Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN:
Interpretable Representation Learning by Information Maximizing Generative
Adversarial Nets. NIPS (2016).
65. Locatello, F., Bauer, S., Lucic, M., Gelly, S., Schölkopf, B., & Bachem, O. Challenging
Common Assumptions in the Unsupervised Learning of Disentangled Representations.
ArXiv. abs/1811.12359 (2019).
66. He, K., Fan, H., Wu, Y., Xie, S., & Girshick, R.B. Momentum Contrast for Unsupervised
Visual Representation Learning. 2020 IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR), 9726–9735 (2020).
67. Wu, Z., Xiong, Y., Yu, S., & Lin, D. Unsupervised Feature Learning via Non-parametric
Instance Discrimination. 2018 IEEE/CVF Conference on Computer Vision and Pattern
Recognition, 3733–3742 (2018).
68. Chen, T., Kornblith, S., Norouzi, M., & Hinton, G.E. A Simple Framework for
Contrastive Learning of Visual Representations. ArXiv. abs/2002.05709 (2020).
Index

A
Adventure Games and Puzzle Design 37, 44
AI for Adaptive Computer Games 37, 41
AI for Computer Games 37, 38
AI in Gaming 37, 60
AI in Video Games: Toward a Unified Framework 37, 41
AI Latest Techniques in Animation 19, 23
  AR Technology 19, 26
  VR Technology 19, 28
AI Reinvent 103, 106
AI's Role in Animation 19, 21
  How AI Replaces Animation 19, 21
  Various Agents in AI 19, 22
Alpha-Beta Pruning 8
Augmented Reality 110

C
CBGIR 91, 94
CGANs 127
Classification 65, 81
Complex Game AI 103, 115
Cyclical GANs 128

E
Early Game 103, 115
eHealth 65, 74
Extended Reality 103, 110

F
Finite State Machines 103, 116
Future Aspects of Animation with AI 19, 33
Futuristic Approach 37

G
Game Theory 3, 6
Gaming Experience 103, 107
Generative Adversarial Networks 123, 126

H
Heuristic Function 3, 4, 5
Home Automation 65, 74
Horror Genre and Video Games 37, 45
Hybrid Fuzzy ANN Systems 139

I
The Image Classification Theory 91, 96
Image Generation and Recognition 123, 127
Image Processing Using Deep Learning 91, 98
Important Architectures in Deep Learning 91, 94
Interactive Narrative 37, 44
Internet of Things (IoT) 65, 67
  Communication Block 65, 68
  Protocols 65, 69
  Sensing Block 65, 68
IoT and 5G Technology 65, 78
IoT Networks 83

L
Logistics 65, 75

N
Narrative Game Mechanics 37, 43
Narrative in Video Games 37, 42
Nondeterministic 105

P
PCGML 137, 141
Player Experience 37, 48
Procedural Content Generation (PCG) 137, 140

Q
Q-Algorithm for AI in Gaming 37, 60

R
Regression 65, 81

S
SAGANs 128
Search Tree 2, 3, 8
Simulation Based on Physics 103, 108
Smart Agriculture 65, 74
Stack GANs 128
Supervised Learning 65, 80
Systematic Analysis of Image Generation Using GANs 123, 131

T
Three-Dimensional Visualization Techniques 103, 108
TOAD-GANs 139
The Traditional and Modern Animation 19, 30

V
Virtual Reality 103, 109
Voice in Gaming 103, 107

Z
Zero-Sum Game 4, 6