Actor-Critic for Multi-agent System with Variable Quantity of Agents

Wang, Guihong; Shi, Jinglun

doi:10.1007/978-3-030-14657-3_5

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 271)

Included in the following conference series:

International Conference on Internet of Things as a Service

Abstract

Reinforcement learning (RL) has recently been applied to many cooperative multi-agent systems. However, most research has been carried out on systems with a fixed quantity of agents. In reality, the quantity of agents in a system often changes over time, and the majority of multi-agent reinforcement learning (MARL) models cannot work robustly on such systems. In this paper, we propose a model, extended from the actor-critic framework, to handle systems with a variable quantity of agents. To deal with the variable quantity issue, we design a feature extractor to embed variable-length states. By employing a bidirectional long short-term memory (BLSTM) network in the actor network, which is capable of processing variable-length sequences, any number of agents can communicate and coordinate with each other. However, since a BLSTM is generally used to process sequences, we use the critic network as an importance estimator for all agents and organize them into a sequence. Experiments show that our model works well in the variable quantity situation and outperforms other models. Although our model may perform poorly when the quantity is too large, it can be fine-tuned without changing hyper-parameters to achieve acceptable performance in a short time.
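
The ordering-then-sequence idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear critic, the descending sort order, the scalar hidden state, and all function and parameter names are assumptions made for illustration, and a plain tanh recurrent cell stands in for the LSTM cell to keep the example short. It shows only that one set of parameters can serve any number of agents once the critic has arranged them into a sequence.

```python
import math


def critic_score(state, critic_w):
    """Hypothetical linear critic: one scalar importance estimate per agent."""
    return sum(s * w for s, w in zip(state, critic_w))


def rnn_pass(seq, w_in, w_rec):
    """One directional pass of a tanh recurrent cell with a scalar hidden state.

    Stand-in for an LSTM cell (assumption made for brevity).
    """
    h, outputs = 0.0, []
    for x in seq:
        h = math.tanh(sum(xi * wi for xi, wi in zip(x, w_in)) + w_rec * h)
        outputs.append(h)
    return outputs


def bidirectional_actor(states, critic_w, w_in_f, w_rec_f, w_in_b, w_rec_b):
    """Order agents by critic score, then run forward and backward passes.

    Returns the agent ordering and, for each agent (in original index order),
    a (forward, backward) pair of hidden values.
    """
    # Assumption: most important agent first; the abstract does not state the direction.
    order = sorted(range(len(states)),
                   key=lambda i: critic_score(states[i], critic_w),
                   reverse=True)
    seq = [states[i] for i in order]
    fwd = rnn_pass(seq, w_in_f, w_rec_f)
    # Run the backward pass over the reversed sequence, then realign positions.
    bwd = rnn_pass(seq[::-1], w_in_b, w_rec_b)[::-1]
    out = [None] * len(states)
    for pos, agent in enumerate(order):
        out[agent] = (fwd[pos], bwd[pos])
    return order, out


if __name__ == "__main__":
    params = dict(critic_w=[1.0, 1.0], w_in_f=[0.5, -0.3], w_rec_f=0.8,
                  w_in_b=[0.2, 0.4], w_rec_b=0.6)
    # The same parameters are reused for 3 agents and for 5 agents.
    for n in (3, 5):
        states = [[float(i), 1.0] for i in range(n)]
        order, out = bidirectional_actor(states, **params)
        print(n, order, out)
```

Because the recurrent parameters are shared across sequence positions, adding or removing agents changes only the sequence length, not the parameter count.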


Author information

Authors and Affiliations

South China University of Technology, Guangzhou, 510641, China
Guihong Wang & Jinglun Shi

Corresponding author

Correspondence to Jinglun Shi.

Editor information

Editors and Affiliations

Northwestern Polytechnical University, Xi'an, China
Bo Li
Northwestern Polytechnical University, Xi'an, China
Mao Yang
Shandong University, Jinan, Qinghai, China
Hui Yuan
Northwestern Polytechnical University, Xi'an, Shaanxi, China
Zhongjiang Yan

About this paper

Cite this paper

Wang, G., Shi, J. (2019). Actor-Critic for Multi-agent System with Variable Quantity of Agents. In: Li, B., Yang, M., Yuan, H., Yan, Z. (eds) IoT as a Service. IoTaaS 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 271. Springer, Cham. https://doi.org/10.1007/978-3-030-14657-3_5

DOI: https://doi.org/10.1007/978-3-030-14657-3_5
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14656-6
Online ISBN: 978-3-030-14657-3
eBook Packages: Computer Science, Computer Science (R0)
