
Fading Intelligence Theory: A Theory on Keeping Artificial Intelligence Safety for the Future

Utku Kose
Computer Sciences App. and Res. Center
Usak University
Usak, Turkey
utku.kose@usak.edu.tr

Pandian Vasant
Faculty of Science and Information Tech.
Universiti Teknologi Petronas
Perak, Malaysia
vasantglobal@gmail.com

Abstract— As a result of the unstoppable rise of Artificial Intelligence, there has been a remarkable focus on the question "Will intelligent systems be safe for the humankind of the future?" Because of that, many researchers have started to direct their work towards problems that may cause Artificial Intelligence systems to behave out of control or to take positions dangerous for humans. Such research is currently included under the literature of Artificial Intelligence Safety and / or Future of Artificial Intelligence. In this context, this research paper proposes a theory on achieving safe intelligent systems by considering the life-time of an Artificial Intelligence based system according to some operational variables, and by eliminating – terminating an intelligent system that is 'old enough' in order to give a chance to new generations of systems, which seem safer. The paper makes a brief introduction to the theory and opens doors widely for further research on it.

Index Terms— fading intelligence theory, artificial intelligence safety, future of artificial intelligence, artificial intelligence
I. INTRODUCTION

Since its first steps into the scientific arena, Artificial Intelligence has improved greatly and influenced almost all fields of modern life. By combining theoretical and applied aspects of Computer Science and running them with the support of advanced technologies such as computers, electronics, and communication, Artificial Intelligence currently has a great power to overcome all kinds of real-world problems, even when they belong to different levels of complexity. It is clear that the flexible and open solution scope of Artificial Intelligence plays a remarkable role in improving the effectiveness and efficiency of solution approaches for real-world problems and in making life more comfortable for people. Especially the mathematical and logical approaches in the background have made it easier to adapt any intelligent problem solving approach to the unsolved problems of different fields. Here, the philosophical difference between Artificial Intelligence and any other scientific field has not been an obstacle to developing intelligent systems. This multidisciplinary characteristic makes Artificial Intelligence one of the strongest scientific fields of the future. But anxiety about new technological improvements – developments has always made people discuss possible scenarios that are dangerous or harmful for the existence of humankind, or at least for its stable living standards on the Earth. Finally, the field of Artificial Intelligence has encountered such anxiety, and that has caused a new sub-research field to appear: Artificial Intelligence Safety.

Also related to the ethics of making intelligent machines – systems, Artificial Intelligence Safety is focused on ensuring safe intelligent systems, which are not harmful to humankind and are effective within their problem solving scope. When we consider the associated literature, we can see that research studies on safety problems are addressed under research area concepts such as Artificial Intelligence / Machine Ethics, Future of Artificial Intelligence, Human-Compatible Artificial Intelligence, etc. [1-7]. All these research concepts deal with achieving safe intelligent systems and try to figure out general approaches, rules, strategies, and policies for developing the desired safe Artificial Intelligence oriented systems. In detail, Artificial Intelligence Ethics is about works dealing with the ethical dilemmas that intelligent systems may encounter [1]. Here, there has also been an alternative discussion on how to understand the 'ethics' concept from the perspective of intelligent systems, and another concept, Artificial Intelligence Safety Engineering, was proposed in the literature [8]. One of the most important milestones in the developments regarding Artificial Intelligence Safety may be the start of the Artificial Intelligence Safety Research program in 2015, funded primarily by Elon Musk and started by the Future of Life Institute [9]. On the other hand, the launch of the non-profit Artificial Intelligence research company Open AI, with a fund of 1 billion US dollars and supported by people like Elon Musk, Peter Thiel, and Sam Altman [10, 11], is also an important sign of how the scientific community has given the necessary emphasis to Artificial Intelligence safety and the associated developments. Nowadays, some of the remarkable research institutes / centers in which Artificial Intelligence Safety oriented research is done can be listed as follows:

• Future of Humanity Institute – University of Oxford,
• Center for Human-Compatible AI – UC Berkeley,
• Machine Intelligence Research Institute,
• Leverhulme Centre for the Future of Intelligence – University of Cambridge,



• Vector Institute for Artificial Intelligence – University of Toronto,
• Future of Life Institute,
• Open AI,
• Centre for the Study of Existential Risk.

When considering Artificial Intelligence Safety based works, research is generally designed around the existence of intelligent agents. In this research study, such agents are also called intelligent systems. A typical agent can consist of one or more Artificial Intelligence techniques to achieve its existential structure. But this factor is not remarkable, since the main point of Artificial Intelligence Safety oriented works is to solve problems on how well to train such systems or how well to control them.

There are already popular topics in this manner under the associated literature. Some of the remarkable ones in which researchers are generally interested nowadays are as follows:
• Inverse Reinforcement Learning / Reinforcement Learning [12-16],
• Interruptible Agents / Ignorant Agents / Inconsistent Agents / Bounded Agents [17-19],
• Corrigibility [20],
• Rationality [21-23],
• Super Intelligence [24-28].

In the context of the explanations so far, the objective of this research is to propose a theory on achieving safe intelligent agents / systems by considering the life-time of an Artificial Intelligence based system according to some operational variables, and by eliminating – terminating an intelligent agent / system that is 'old enough' in order to give a chance to new generations of systems, which seem safer. Called the 'Fading Intelligence Theory', it holds that an intelligent agent / system cannot be further trained when it reaches its top training capacity; otherwise it misses its objectives, which means it is not safe anymore. Also, sometimes one should stop training it, because further training with not-well distributed training data would lower its intelligence level. Finally, there should be some global indicators to define which intelligent systems to operate or to eliminate – terminate. So there should be a life-time for each Artificial Intelligence system. Briefly, the paper gives a brief introduction to the theory.

According to the subject of the research – paper, the remaining content is organized as follows: The next section is devoted to the details of the theory. It provides some mathematical – logical explanations on the theory and explains its philosophical aspects. After that section, the third section provides a representative evaluation with different kinds of Artificial Intelligence techniques (in this case Machine Learning techniques) to focus on what the Fading Intelligence Theory tries to explain. The third section is followed by the fourth section, which includes some final discussion, and the content is ended by expressing conclusions and future works under the last section.

II. FADING INTELLIGENCE THEORY

The Fading Intelligence Theory (FIT) is briefly about the employment state of an intelligent agent / system and, in this way, forms a general structure for its life-time. It is possible to examine the theory under two aspects (Fig. 1):
• Training State of the System,
• Life-Time State of the System.

Fig. 1. Two aspects of the FIT.

In general, we can derive two approaches under the theory by taking the related aspects into consideration:

A. Training State Approach

Theory FIT-a. Let P be a global set of problems that can be solved by an intelligent agent / system Ag. Also, let T be the global set of training data, which achieves solving ∀ p ∈ P by Ag with a success rate of 100%. By hypothesis, Ag cannot be trained further and it is called a 'complete intelligent agent / system'.

Proof FIT-a. Think about a new training datum t to be added to T. Because it cannot be guaranteed that the success rate will not change with the new set T, and because t affects the distribution of the T that makes ∀ p ∈ P solvable at a success rate of 100%, the current agent / system should be trained again to see the results. So, the complete Ag is not the same agent / system anymore, because the complete Ag is associated with the former T, which does not include t. One should rather consider a totally new training on the new set T with a new agent / system, which means the complete Ag cannot be trained further. Also, choosing to train the complete Ag causes it to lose its rank, lowers its success rate because of the not-well distributed new T, and finally leads to the 'fading of its intelligence'. Furthermore, keeping Artificial Intelligence Safety is associated with the total success rate of an agent / system. If an agent / system satisfies humans' needs on problem solving with a total rate of 100%, then any additional change in T causes 'butterfly effects' and the safety to be violated.
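To make the FIT-a argument concrete, the following small Python sketch (not part of the original paper; the data, the classifier choice, and every numeric value are illustrative assumptions) trains a classifier on a fixed training set T and then retrains it after appending new data t drawn from a shifted distribution. The success rate on the original problem set P is, in general, no longer the same and typically drops, which is the 'fading' the theory describes.

```python
# Minimal sketch of the FIT-a argument: retraining a "complete" agent on an
# enlarged, differently distributed training set generally changes (and here
# typically lowers) its success rate on the original problem set P.
# All data, the classifier choice, and every numeric value are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Original training set T and original problem (evaluation) set P
X_T = rng.normal(0.0, 1.0, size=(500, 4))
y_T = (X_T[:, 0] + X_T[:, 1] > 0).astype(int)
X_P = rng.normal(0.0, 1.0, size=(200, 4))
y_P = (X_P[:, 0] + X_P[:, 1] > 0).astype(int)

# "Complete" agent Ag trained on T and evaluated on P
ag = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_T, y_T)
print("Ag  on P:", accuracy_score(y_P, ag.predict(X_P)))

# New training data t drawn from a shifted distribution with a different rule
X_t = rng.normal(2.0, 1.0, size=(500, 4))
y_t = (X_t[:, 0] - X_t[:, 1] > 0).astype(int)

# Retraining on T + t changes the learned decision boundary, so the success
# rate on the original P is, in general, no longer the same.
ag_new = DecisionTreeClassifier(max_depth=5, random_state=0).fit(
    np.vstack([X_T, X_t]), np.concatenate([y_T, y_t]))
print("Ag' on P:", accuracy_score(y_P, ag_new.predict(X_P)))
```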
B. Life-Time State Approach

Theory FIT-b. By considering Theory FIT-a, it is possible to think about the life-time state of the agent / system Ag. Briefly, Theory FIT-a indicates that Ag cannot be trained further if it is a 'complete intelligent agent / system'. So, by hypothesis, any change in T and / or P causes Ag to be eliminated – terminated.

Proof FIT-b. Consider that the complete Ag could continue to show a success level of 100% even when T and P are changed. In this case, any not-well distributed training data addition should not affect the training results. But because this is not possible (at least not always), one cannot guarantee that the same Ag will always solve the updated Ps over the updated Ts. So, the agent / system which can solve the updated Ps over the updated Ts is not the same as the complete Ag, which means elimination – termination of the complete Ag for the new P and / or T. More generally, that means a new set P and / or a new set T requires a new agent / system to be employed.

Theory FIT-c. Let gsr be a global success rate stated by authorities as the best solution result obtained for a set P, over the same T, and with the same type of agent / system with changing parameters. Let Ag be a new agent / system which is currently in use for the related P over the related T. By hypothesis, Ag should have a life-time depending on the gsr, obtained through a typical calculation on its training – application times and some parameters, including also the gsr.

Proof FIT-c. Consider that there are many agents / systems, increasing in number day by day, applied over the same P and the same T. Also, let Agb be the agent / system having the gsr. In this case, 'employing' immortal agents / systems for a P over T that has already been solved better causes the safety to be violated. Continuing to design and use new agents / systems in a heuristic way, instead of considering using only Agb towards further improvements on solutions, just makes the already observed solution areas be discovered again and again. So, because employment of such agents / systems will cause many problems in terms of safety, one cannot deny that there should be a life-time depending on the gsr. Also, a simple calculation on, e.g., training – application times and parameters including the gsr can give an accurate value for the life-time.

Regarding the calculation of the life-time of an agent / system, typical optimization problems can be formed to determine when to eliminate – terminate it, apart from the other conditions indicated under FIT. This can be achieved, e.g.:
• Over a global optimization oriented minimization problem of the total error towards gsr and the training results experienced so far, to determine some variables including also the remaining use time of the agent / system.
• Over a global optimization oriented maximization problem of the success rate towards gsr and the weighted amount of above-average training sessions, to determine some variables including also the remaining use time of the agent / system.
• Over a combinatorial problem structure dealing with the parameters of past training sessions, to determine the optimum path leading to results near the gsr or to better global results, including also data on the remaining use time.
More detailed focus on this optimization problem of determining the exact life-time can be the subject of a further study; a rough sketch of one possible formulation is given below.
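As a purely illustrative sketch of the first (minimization) formulation above, a life-time could be picked as the number of further operating periods that minimizes the average gap between the agent's projected success rate and the gsr. The paper does not define this objective; the linear drift model, the parameter names, and every constant below are assumptions of this illustration, not part of FIT.

```python
# Purely illustrative life-time calculation for the minimization formulation:
# choose the number of further operating periods that minimises the average
# gap between the agent's projected success rate and the gsr. The linear
# drift model and every parameter / constant here are assumptions, not FIT.
import numpy as np

def remaining_use_time(gsr, current_success, drift_per_period=0.005,
                       max_periods=120):
    """Return the number of further operating periods before elimination."""
    best_t, best_cost = 0, float("inf")
    for t in range(1, max_periods + 1):
        # Projected success rates over the next t periods under the drift model
        projected = current_success - drift_per_period * np.arange(1, t + 1)
        # Average error towards gsr, with a small reward for keeping a
        # well-performing agent / system in service a little longer
        cost = np.mean(np.abs(gsr - projected)) - 0.001 * t
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

# Example: an agent currently at a 0.97 success rate, authority-stated gsr of 0.95
print(remaining_use_time(gsr=0.95, current_success=0.97))
```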

Taking the expressed FIT into consideration, a general scheme for the life-time of an intelligent agent / system can be represented as in Fig. 2.

Fig. 2. A general scheme for the life-time of an intelligent agent / system.

III. REPRESENTATIVE EVALUATION

In order to see if the FIT makes sense in a real case (for Theory FIT-a and Theory FIT-b, because of their applicability in at least the short term), some Machine Learning techniques have been trained with some data for some pre-defined sets of problems to be solved. Technical details regarding the parameters of the chosen techniques, the training data, and the applied problems are not included here, in order to focus on the evaluation findings. Readers interested in Machine Learning and the techniques are referred to, e.g., [29-31].

Within the representative evaluation process, four Machine Learning techniques, Artificial Neural Networks (ANN), Q-Learning (Q-L), Decision Trees (DT), and Naive Bayes Classifier (NBC), have been trained 30 times each to obtain the average error rate for each technique. Each technique has been applied to a set of ten problems. The obtained average error rates were accepted as the comparison value (like a success rate of 100%) to see if changes in the training or problem sets affect the error rates of the techniques. Changes in the training data set have been made by adding different amounts of new data to the set, while the change in the problem set has been made by adding two new problems.
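Since the paper omits the techniques' parameters, data, and problems, the following is only a hedged sketch of how such 'change rate in error' figures could be produced: average the error over 30 training runs (here each run resamples the training data, which is an assumption), then append a given share of new data and measure the relative change in the average error. The model choice and helper names are illustrative.

```python
# Hypothetical sketch of the representative evaluation protocol; the data,
# the resampling per run, and the model choice are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_error(X, y, X_test, y_test, runs=30):
    """Average test error over `runs` trainings on bootstrap resamples (NumPy arrays)."""
    errors = []
    for seed in range(runs):
        idx = np.random.default_rng(seed).choice(len(X), size=len(X), replace=True)
        model = DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx])
        errors.append(1.0 - model.score(X_test, y_test))
    return float(np.mean(errors))

def change_rate_in_error(X, y, X_new, y_new, X_test, y_test):
    """Relative change (%) of the average error after enlarging the training set."""
    base = average_error(X, y, X_test, y_test)
    grown = average_error(np.vstack([X, X_new]),
                          np.concatenate([y, y_new]), X_test, y_test)
    return 100.0 * (grown - base) / max(base, 1e-9)
```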
Findings taken from the process are represented briefly in Table I.

TABLE I. FINDINGS FROM THE REPRESENTATIVE EVALUATION.

ML Tech.   Change in Training Set                    Change in Problem Set
           Difference Rate    Change Rate in Error   Difference Rate    Change Rate in Error
ANN        3%                 5.3%                   20%                32.4%
           10%                11.4%
           55%                30.4%
           80%                77.6%
Q-L        5%                 10.2%                  20%                44.1%
           8%                 22.5%
           15%                56.6%
           65%                80.4%
DT         7%                 8.3%                   20%                46.7%
           10%                14.6%
           65%                43.5%
           90%                79.1%
NBC        5%                 11.1%                  20%                38.6%
           10%                34.8%
           75%                65.4%
           85%                82.7%
Error rate change scale: 0 – 100.

As can be seen from Table I, changes in the training and problem sets have remarkable effects (even exponential changes for the training data) on the error based performances of each technique. This 'butterfly effect' is an important sign of how a seemingly secure position can result in bigger problems. In practical terms, that means the necessary changes should be made in the set-up of the technique by eliminating – terminating the current model of the technique in favour of a newer model, in the context of Artificial Intelligence Safety.

IV. DISCUSSION

Regarding the findings within this research, it is possible to indicate the following remarkable points:
• The findings in the representative evaluation provide proof for the FIT. Of course, clearer results could be derived from values matching the details of the FIT exactly (i.e., a success rate of 100%). But changes relative to the accepted average rates have been taken into consideration, and this is accepted by the authors as enough to make objective comments.
• It can be understood that the FIT is a theory explaining that each intelligent agent / system is 'unique' and should be considered in this manner to achieve the desired Artificial Intelligence Safety in the future.
• Even small changes in the conditions can cause the FIT not to be evaluated objectively. So, the FIT seems robust in this manner.
• The background of the FIT and more proofs will become available in accordance with improvements that make Artificial Intelligence techniques easier to apply.
• There is a paradox in that even a more advanced Artificial Intelligence Safety oriented approach, method, or technique may need precautions for itself. Here, such issues are the subject of further research works on the FIT.
• The FIT should also be examined again in the case of new types of problems disturbing safety. At this point, the authors are focused on the remarkable topic of 'attacking via adversarial examples' [32].
• The concept of a 'complete agent / system' can be examined further to define some ranks for trained agents / systems. This will lead researchers to classify employed agents / systems in terms of safety.
• The use of the gsr (global success rate) is an important aspect for achieving globally accurate Artificial Intelligence based systems. This can be achieved by a central database keeping all data regarding all kinds of techniques on reported real-world problems; a hypothetical sketch of such a record is given below.
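As a hypothetical illustration only (the paper does not specify any schema), a record in such a central gsr database might keep, per reported technique, problem set, and training set, the best reported success rate and its provenance; the field and function names below are assumptions.

```python
# Hypothetical record layout for a central gsr database; all names are assumptions.
from dataclasses import dataclass
from datetime import date

@dataclass
class GsrRecord:
    technique: str          # e.g. "ANN", "Q-L", "DT", "NBC"
    problem_set_id: str     # identifier of the reported problem set P
    training_set_id: str    # identifier of the training set T used
    success_rate: float     # reported success rate for this combination
    reported_by: str        # reporting authority / institution
    reported_on: date       # date of the report

def gsr(records, technique, problem_set_id, training_set_id):
    """The gsr for a (technique, P, T) combination: the best reported success rate."""
    rates = [r.success_rate for r in records
             if (r.technique, r.problem_set_id, r.training_set_id)
             == (technique, problem_set_id, training_set_id)]
    return max(rates) if rates else None
```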
V. CONCLUSIONS AND FUTURE WORK

This paper has introduced the Fading Intelligence Theory, which can be taken into consideration for keeping Artificial Intelligence safety according to the life-time of intelligent systems. In detail, the theory deals with when to continue to train, when to eliminate – terminate, and when not to train an intelligent system, in order to avoid any undesired situations that may appear because of an 'old' intelligent system. It can be understood that this theory makes Artificial Intelligence based systems mortal, although one can consider all intelligent systems immortal because of their software oriented aspects, which allow them to be cloned, transferred, or recreated with appropriate approaches. Accepting Artificial Intelligence systems as mortal (having a life-time) is for the sake of ensuring general safety for the Artificial Intelligence of the future. On the other hand, the theory also tries to define a general framework for the life-time of intelligent agents / systems.

It is clear that the theory needs further observations in addition to the representative evaluation done within this research / paper. It is possible to state that a totally objective observation proving the Fading Intelligence Theory is associated with the proof of a trained intelligent agent / system with an accuracy of 100%, which seems impossible with the current scientific background. However, future work is based on more and more observations that will be done by the authors. The authors also plan the further works indicated especially under the later sections of the paper; additionally, any possible future extensions of the theory according to the obtained findings will be considered in further experimental works.
REFERENCES
[1] M. Anderson, and S. L. Anderson, (Eds.). Machine Ethics. Cambridge University Press, 2011.
[2] N. Bostrom, and E. Yudkowsky, "The ethics of artificial intelligence". The Cambridge Handbook of Artificial Intelligence, 316-334, 2014.
[3] H. Moravec, "Rise of the robots - the future of artificial intelligence". Scientific American, 23, 2009.
[4] D. L. Waltz, "Evolution, sociobiology, and the future of artificial intelligence". IEEE Intelligent Systems, 21(3), 66-69, 2006.
[5] A. Trabulsi, "Future of Artificial Intelligence". Quid Blog, 2015. Online: http://www.fujitsu.com/us/Images/Panel1_Andrew_Trabulsi.pdf
[6] UC Berkeley – Center for Human-Compatible AI. "About". CHAI – Web Site, 2017. Online: http://humancompatible.ai/about
[7] N. Soares, and B. Fallenstein, "Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda". Machine Intelligence Research Institute, 2014. Online: https://intelligence.org/files/TechnicalAgenda.pdf
[8] R. V. Yampolskiy, "Artificial intelligence safety engineering: Why machine ethics is a wrong approach". In Philosophy and Theory of Artificial Intelligence (pp. 389-396). Springer Berlin Heidelberg, 2013.
[9] Future of Life Institute, "AI Safety Research". Future of Life Institute Web Site. Online: https://futureoflife.org/ai-safety-research/
[10] Open AI, "Introducing Open AI". Open AI Web Site. Online: https://blog.openai.com/introducing-openai/
[11] Open AI, "Open AI – Sponsors". Open AI Web Site. Online: https://openai.com/about/#sponsors
[12] D. Abel, J. Salvatier, A. Stuhlmüller, and O. Evans, "Agent-Agnostic Human-in-the-Loop Reinforcement Learning". arXiv preprint arXiv:1701.04079, 2017.
[13] A. Y. Ng, and S. J. Russell, "Algorithms for inverse reinforcement learning". In ICML (pp. 663-670), 2000.
[14] P. Abbeel, and A. Y. Ng, "Inverse reinforcement learning". In Encyclopedia of Machine Learning (pp. 554-558). Springer US, 2011.
[15] P. Abbeel, and A. Y. Ng, "Apprenticeship learning via inverse reinforcement learning". In Proceedings of the Twenty-First International Conference on Machine Learning (p. 1). ACM, 2004.
[16] R. S. Sutton, and A. G. Barto, Introduction to Reinforcement Learning. Cambridge: MIT Press, 1998.
[17] L. Orseau, and S. Armstrong, "Safely interruptible agents". In Uncertainty in Artificial Intelligence: 32nd Conference (UAI 2016), edited by Alexander Ihler and Dominik Janzing (pp. 557-566), 2016.
[18] O. Evans, A. Stuhlmüller, and N. D. Goodman, "Learning the preferences of ignorant, inconsistent agents". arXiv preprint arXiv:1512.05832, 2015.
[19] O. Evans, and N. D. Goodman, "Learning the preferences of bounded agents". In NIPS 2015 Workshop on Bounded Optimality, 2015.
[20] N. Soares, B. Fallenstein, S. Armstrong, and E. Yudkowsky, "Corrigibility". In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
[21] T. L. Griffiths, F. Lieder, and N. D. Goodman, "Rational use of cognitive resources: Levels of analysis between the computational and the algorithmic". Topics in Cognitive Science, 7(2), 217-229, 2015.
[22] R. L. Lewis, A. Howes, and S. Singh, "Computational rationality: Linking mechanism and behavior through bounded utility maximization". Topics in Cognitive Science, 6(2), 279-311, 2014.
[23] S. Russell, "Rationality and intelligence: A brief update". In Fundamental Issues of Artificial Intelligence (pp. 7-28). Springer International Publishing, 2016.
[24] N. Bostrom, A. Dafoe, and C. Flynn, "Policy Desiderata in the Development of Machine Superintelligence", 2016. Online: https://www.fhi.ox.ac.uk/wp-content/uploads/Policy-Desiderata-in-the-Development-of-Machine-Superintelligence.pdf
[25] N. Bostrom, Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014.
[26] M. Brundage, "Taking superintelligence seriously: Superintelligence: Paths, dangers, strategies by Nick Bostrom (Oxford University Press, 2014)". Futures, 72, 32-35, 2015.
[27] K. E. Drexler, "MDL Intelligence Distillation: Exploring strategies for safe access to superintelligent problem-solving capabilities". Technical Report #2015-3, Future of Humanity Institute, Oxford University: pp. 1-17, 2015.
[28] N. Bostrom, "Ethical issues in advanced artificial intelligence". Science Fiction and Philosophy: From Time Travel to Superintelligence, 277-284, 2003.
[29] E. Alpaydin, Introduction to Machine Learning. MIT Press, 2014.
[30] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, (Eds.). Machine Learning: An Artificial Intelligence Approach. Springer Science & Business Media, 2013.
[31] T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.
[32] I. Goodfellow, N. Papernot, S. Huang, Y. Duan, P. Abbeel, and J. Clark, "Attacking Machine Learning with Adversarial Examples". Open AI – Blog Web Site, 2017. Online: https://blog.openai.com/adversarial-example-research/
