Deep Reinforcement Learning in Unity: With Unity ML Toolkit
Abhilash Majumder
Pune, Maharashtra, India
About the Author
Abhilash Majumder is a natural language processing research engineer for HSBC (UK/India) and a technical mentor for Udacity (ML). He has also been associated with Unity Technologies, was a speaker at Unite India 2019, and has educated close to 1,000 students from EMEA and SEPAC (India) on Unity. He is an ML contributor and curator for open source Google Research, TensorFlow, and Unity ML Agents, and a creator of ML libraries published on the Python Package Index (PyPI). He speaks on NLP and deep learning for PyData Los Angeles, is an online educator for Udemy, and is a deep learning mentor for upGrad. He is a graduate of the National Institute of Technology, Durgapur (NIT-D), where he focused on NLP, machine learning, and applied mathematics. He can be reached via email at debabhi1396@gmail.com.
Abhilash is a former apprentice and student ambassador for Unity Technologies, where he educated corporate employees and students on using Unity for game development. He was a technical mentor (AI programming) for the Unity Ambassadors Community and Content Production, and has been associated with Unity Technologies for general education, with an emphasis on graphics and machine learning. He is a community moderator for machine learning (ML Agents) sessions organized by Unity Technologies (Unity Learn). He is one of the first content creators for Unity Technologies India (since 2017) and is responsible for the growth of the community in India under the guidance of Unity Technologies.
About the Technical Reviewer
Ansh Shah is pursuing an MSc in Physics and a BE in Mechanical Engineering at BITS Pilani, India. By day, he is a student; by night, he is a robotics and machine learning enthusiast. He is a core member of BITS ROBOCON, a technical team at the college, and is currently working on quadcopter and quadruped projects.
Acknowledgments
The amount of dedication and support that I have received in the making of this book
has left me amazed. First, I would like to thank my family, Mr. Abhijit Majumder and
Mrs. Sharbari Majumder, who have been instrumental in supporting me all the way.
I would also like to extend my heartfelt thanks to the entire Apress Team, without
whom this would not have been possible. Special thanks to Mrs. Spandana Chatterjee,
the Acquisition Editor, Mr. Shrikant Vishwakarma, the Coordinating Editor, and Laura
Berendson, the Development Editor, for their constant support and thorough reviews.
Ansh Shah, the Technical Reviewer of this book, has also played an important role and
I extend my thanks to him.
I would also like to share this space in thanking my mentor, Carl Domingo from
Unity Technologies, who has been so instrumental in guiding me from the beginning
of my journey with Unity. The Unity Machine Learning team deserves mention, as this
book would not have been possible without their constant efforts to make the ML Agents
platform amazing. I especially thank Dr. Danny Lange, whose sessions on machine
learning have been instrumental in understanding the framework and the concepts.
I am grateful to everyone who helped in the process of making this book, which I hope will help readers appreciate the beauty of deep reinforcement learning.
Introduction
Machine learning has been instrumental in shaping the scope of technology since its inception. ML has played an important role in the development of fields such as autonomous vehicles and robotics. Deep reinforcement learning is the field of learning in which agents learn with the help of rewards, an idea inspired by nature. Through this book, the author tries to present the diversity of reinforcement learning algorithms in game development as well as in scientific research. Unity, the cross-platform engine used in a plethora of tasks, from visual effects and cinematography to machine learning and high-performance graphics, is the primary tool used in this book. With the power of the Unity ML Agents Toolkit, the deep reinforcement learning framework built by Unity, the author tries to show the vast possibilities of this learning paradigm.
The book starts with an introduction to state-based reinforcement learning, from Markov processes to Bellman equations and Q-learning, which sets the ground for the successive sections. A plethora of diverse pathfinding algorithms, from Dijkstra's algorithm to sophisticated variants of A*, is presented along with simulations in Unity. The book also covers how navigation meshes work for automated pathfinding in Unity. An introduction to the ML Agents Toolkit, from the standard installation process to training an AI agent with a deep reinforcement learning algorithm (proximal policy optimization [PPO]), is provided as a starter. Along the course of this book, there is extensive usage of the TensorFlow framework along with OpenAI Gym environments for proper visualization of complex deep reinforcement learning algorithms in terms of simulations, robotics, and autonomous agents. Successive sections of the book involve an in-depth study of a variety of on- and off-policy algorithms, ranging from discrete SARSA/Q-learning to actor critic variants, deep Q-network variants, and PPO, along with their implementations using the Keras TensorFlow framework on Gym. These sections are instrumental in understanding how different simulations, such as the famous Puppo (Unity Berlin), Tiny Agents, and other ML Agents samples from Unity, are created and built. Sections are provided with detailed descriptions of how to build simulations in Unity using the C# software development kit for ML Agents and how to train them using soft actor critic (SAC), PPO, or imitation learning algorithms such as GAIL.
The latter part of this book provides insight into curriculum learning and adversarial networks, with an analysis of how AI agents are trained in games such as FIFA. Throughout these sections, detailed descriptions of the variants of neural networks (MLPs, convolutional networks, and recurrent networks, including long short-term memory [LSTM] and GRU), along with their implementations and performance, are provided. This is especially helpful, as these networks are used extensively in building the deep learning algorithms. The importance of convolutional networks for image sampling in Atari-style 2D games such as Pong is also covered. Knowledge of computer vision and deep reinforcement learning is combined to produce autonomous vehicles and driverless cars, which is also provided as an example template (game) for readers to build upon.
Finally, this book contains an in-depth review of the Obstacle Tower Challenge, which was organized by Unity Technologies to challenge state-of-the-art deep reinforcement learning algorithms. Sections on certain evolutionary algorithms, along with the Google Dopamine framework, are provided for understanding the vast field of reinforcement learning. Through this book, the author hopes to infuse enthusiasm and foster research among readers in the field of deep reinforcement learning.
CHAPTER 1
Introduction to Reinforcement Learning
Reinforcement learning (RL) is a paradigm of learning algorithms based on rewards and actions. This state-based learning paradigm differs from generic supervised and unsupervised learning, as it does not typically try to find structural inferences in collections of labeled or unlabeled data. Generic RL relies on finite state automata and decision processes that assist in finding an optimized reward-based learning trajectory. The field of RL relies heavily on goal-seeking, stochastic processes and decision-theoretic algorithms, and it remains an area of active research. With developments in higher-order deep learning algorithms, there have been huge advances toward creating self-learning agents that can achieve a goal by using gradient convergence techniques and sophisticated memory-based neural networks. This chapter focuses on the fundamentals of the Markov decision process (MDP), hidden Markov models (HMMs) and dynamic programming for state enumeration, Bellman's iterative algorithms, and a detailed walkthrough of value and policy algorithms. Throughout these sections, there are associated Python notebooks for better understanding of the concepts, as well as simulated games made with Unity (version 2018.x).
The fundamental components of RL are the agent(s) and the environment(s). An agent is an entity that uses learning algorithms to explore rewards step by step. The agent tries to find an optimal path toward a goal that maximizes the rewards and, in the process, tries to avoid punishing states. The environment is everything around an agent; this includes the states, obstacles, and rewards. The environment can be static as well as dynamic. Path convergence in a static environment is faster if the agent has sufficient buffer memory to retain the correct trajectory toward the goal as it explores different states. Dynamic environments pose a stronger challenge for agents, as there is no definite trajectory. The latter case requires deep memory network models, such as bidirectional long short-term memory (LSTM), to retain certain key observations that remain static within the dynamic environment. Generic reinforcement learning can be represented as shown in Figure 1-1.
The interaction between the agent and the environment is governed by the set of variables {state (S), reward (R), action (A)}; a minimal sketch of this interaction loop follows.
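To make this interaction concrete, the following is a minimal, self-contained sketch of the agent-environment loop. The LineWorld environment and the random action choice are illustrative inventions for this sketch, not part of any library:

import random

class LineWorld:
    """A toy environment: the agent walks a line and is rewarded at position 5."""
    def __init__(self):
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # initial state

    def step(self, action):  # action is -1 (left) or +1 (right)
        self.position += action
        done = self.position == 5            # goal state reached
        reward = 1.0 if done else -0.1       # goal reward vs. small step penalty
        return self.position, reward, done

env = LineWorld()
state = env.reset()
total_reward = 0.0
for t in range(1000):                        # cap the episode length
    action = random.choice([-1, 1])          # a purely random "agent"
    state, reward, done = env.step(action)
    total_reward += reward                   # accumulate the episode's reward
    if done:
        break
print("Total reward:", total_reward)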
The possible states, rewards, and actions in the CartPole environment, the running example of this chapter, include:
• Termination: if the cart shifts more than 2.4 units from the center or the pendulum inclines more than 15 degrees (a minimal check of this condition is sketched below)
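This termination condition amounts to a simple boundary check. The following is a minimal sketch using the thresholds quoted above; the function and parameter names are illustrative, not Gym's API:

def is_terminated(cart_position, pole_angle_degrees):
    # The episode ends if the cart drifts too far or the pole tilts too much.
    return abs(cart_position) > 2.4 or abs(pole_angle_degrees) > 15.0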
To run the Jupyter notebook, open an Anaconda Prompt or Command Prompt and run the following command:
jupyter notebook
Note TensorFlow has nightly builds that are released daily with a version number, and these can be viewed on the TensorFlow page of the Python Package Index (PyPI). These builds are generally referred to as tf-nightly and may have unstable compatibility with Unity ML Agents. Official releases are recommended for integration with ML Agents, while nightly builds can be used for general deep learning.
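For reference, the official release and the nightly build are installed under different package names. These are the standard pip commands; a specific TensorFlow version may need to be pinned to match the ML Agents release in use:

pip install tensorflow
pip install tf-nightly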
Once the installation is complete, we can dive into the CartPole environment and try
to gain more information on the environment, rewards, states, and actions.
import gym
import numpy as np
import matplotlib.pyplot as plt
from IPython import display as ipythondisplay
The next step involves setting up the dimensions of the display window to visualize
the environment in the Colab notebook. This uses the pyvirtualdisplay library.
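A minimal setup, assuming the pyvirtualdisplay package and a virtual framebuffer such as Xvfb are installed in the Colab runtime, might look like this; the 400×300 window size is an illustrative choice:

from pyvirtualdisplay import Display

# Start a virtual display so that env.render() can draw off-screen in Colab.
display = Display(visible=0, size=(400, 300))
display.start()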
Now, let us load the environment from Gym using the gym.make command and look into the states and the actions. Observation states refer to the environment variables that contain the key factors, such as cart position, cart velocity, pole angle, and pole angular velocity, and they form an array of size 4. The action space is discrete with two actions (pushing the cart left or right). The observation space also contains high and low values as boundary values for the problem.
env = gym.make("CartPole-v0")
#Action space->Agent
print(env.action_space)
#Observation Space->State and Rewards
print(env.observation_space)
print(env.observation_space.high)
print(env.observation_space.low)
After running, the details appear in the console: Discrete(2) for the action space and a Box observation space of shape (4,), together with its high and low boundary values.
Let us try to run the environment for 50 iterations and check the values of the rewards accumulated. This will simulate the environment for 50 iterations and provide insight into how long a randomly acting agent can keep the pole balanced.
env = gym.make("CartPole-v0")
env.reset()
prev_screen = env.render(mode='rgb_array')
plt.imshow(prev_screen)

for i in range(50):
    action = env.action_space.sample()
    # Get rewards and next states
    obs, reward, done, info = env.step(action)
    screen = env.render(mode='rgb_array')
    print(reward)
    plt.imshow(screen)
    ipythondisplay.clear_output(wait=True)
    ipythondisplay.display(plt.gcf())
    if done:
        break

ipythondisplay.clear_output(wait=True)
env.close()
The environment is reset initially with the env.reset() method. For each of the 50 iterations, the env.action_space.sample() method samples a random action (0 or 1) from the action space; no learning takes place in this rollout. A learning agent would instead select actions using tabular, discrete RL algorithms like Q-learning or continuous deep RL algorithms like the deep Q-network (DQN). (A discount factor, which down-weights future rewards, likewise only enters the picture once a learning algorithm is used; this random rollout simply prints each immediate reward.) The env.step(action) method applies the chosen action to the environment and returns the new observation, the reward for that step, a done flag, and diagnostic info. At the end of each action step, the display changes to render the new state of the pole. The loop breaks early if the episode terminates (done is True), or otherwise ends after the 50 iterations have been completed. The env.close() method closes the connection to the Gym environment.
This has helped us to understand how states and rewards affect an agent. Later we will get into an in-depth study of modeling a deep Q-learning algorithm to provide a faster, optimal reward-based solution to the CartPole problem. The environment's observation states can be discretized, after which the problem can be solved using tabular RL algorithms like Markov-based Q-learning or SARSA.
Deep learning removes the need for discretization by operating directly on the continuous state values and applying high-dimensional neural networks to converge the loss function toward a global minimum. This is the approach favored by algorithms like DQN, double deep Q-network (DDQN), dueling DQN, actor critic (AC), proximal policy optimization (PPO), deep deterministic policy gradient (DDPG), trust region policy optimization (TRPO), and soft actor critic (SAC). The latter section of the notebook contains a deep Q-learning implementation of the CartPole problem, which will be explained in later chapters. To highlight certain important aspects of this code: the deep learning layers are made with Keras, and for each iteration the collected state, action, and rewards are stored in a replay memory buffer. Based on the previous states held in the buffer memory and the rewards for the previous steps, the pole agent tries to optimize the Q-learning function over the Keras deep learning layers.
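For illustration, the following is a minimal sketch of a replay memory buffer of the kind described above; the class name, capacity, and batch size are illustrative choices rather than the notebook's exact implementation:

import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer storing (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity=2000):
        self.memory = deque(maxlen=capacity)  # oldest transitions are evicted automatically

    def remember(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniformly sample a mini-batch of past transitions for training.
        return random.sample(self.memory, min(batch_size, len(self.memory)))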
TensorBoard starts on port 6006. To include the episodes of training data inside the logs, a separate log directory is created at runtime, and a TensorBoard callback is attached to the Keras training call as follows:
tensorboard_callback = TensorBoard(
    log_dir='./log',
    histogram_freq=1,
    write_graph=True,
    write_grads=True,
    batch_size=agent.batch_size,
    write_images=True)

self.model.fit(np.array(x_batch), np.array(y_batch),
               batch_size=len(x_batch),
               verbose=1, callbacks=[tensorboard_callback])
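With the callback in place, TensorBoard can be launched from a terminal against the same log directory (it serves on port 6006 by default):

tensorboard --logdir ./log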