ISBN-13: 9781119699033 / English / Hardcover / 2020 / 320 pp.
Preface
Acknowledgments
About the Authors

1 Introduction: Multi-agent Coordination by Reinforcement Learning and Evolutionary Algorithms
1.1 Introduction
1.2 Single Agent Planning
1.2.1 Terminologies Used in Single Agent Planning
1.2.2 Single Agent Search-Based Planning Algorithms
1.2.2.1 Dijkstra's Algorithm
1.2.2.2 A* (A-star) Algorithm
1.2.2.3 D* (D-star) Algorithm
1.2.2.4 Planning by STRIPS-Like Language
1.2.3 Single Agent RL
1.2.3.1 Multiarmed Bandit Problem
1.2.3.2 DP and Bellman Equation
1.2.3.3 Correlation Between RL and DP
1.2.3.4 Single Agent Q-Learning
1.2.3.5 Single Agent Planning Using Q-Learning
1.3 Multi-agent Planning and Coordination
1.3.1 Terminologies Related to Multi-agent Coordination
1.3.2 Classification of MAS
1.3.3 Game Theory for Multi-agent Coordination
1.3.3.1 Nash Equilibrium
1.3.3.2 Correlated Equilibrium
1.3.3.3 Static Game Examples
1.3.4 Correlation Among RL, DP, and GT
1.3.5 Classification of MARL
1.3.5.1 Cooperative MARL
1.3.5.2 Competitive MARL
1.3.5.3 Mixed MARL
1.3.6 Coordination and Planning by MAQL
1.3.7 Performance Analysis of MAQL and MAQL-Based Coordination
1.4 Coordination by Optimization Algorithm
1.4.1 PSO Algorithm
1.4.2 Firefly Algorithm
1.4.2.1 Initialization
1.4.2.2 Attraction to Brighter Fireflies
1.4.2.3 Movement of Fireflies
1.4.3 Imperialist Competitive Algorithm
1.4.3.1 Initialization
1.4.3.2 Selection of Imperialists and Colonies
1.4.3.3 Formation of Empires
1.4.3.4 Assimilation of Colonies
1.4.3.5 Revolution
1.4.3.6 Imperialistic Competition
1.4.4 Differential Evolution Algorithm
1.4.4.1 Initialization
1.4.4.2 Mutation
1.4.4.3 Recombination
1.4.4.4 Selection
1.4.5 Off-line Optimization
1.4.6 Performance Analysis of Optimization Algorithms
1.4.6.1 Friedman Test
1.4.6.2 Iman-Davenport Test
1.5 Summary
References

2 Improve Convergence Speed of Multi-Agent Q-Learning for Cooperative Task Planning
2.1 Introduction
2.2 Literature Review
2.3 Preliminaries
2.3.1 Single Agent Q-learning
2.3.2 Multi-agent Q-learning
2.4 Proposed MAQL
2.4.1 Two Useful Properties
2.5 Proposed FCMQL Algorithms and Their Convergence Analysis
2.5.1 Proposed FCMQL Algorithms
2.5.2 Convergence Analysis of the Proposed FCMQL Algorithms
2.6 FCMQL-Based Cooperative Multi-agent Planning
2.7 Experiments and Results
2.8 Conclusions
2.9 Summary
2.A More Details on Experimental Results
2.A.1 Additional Details of Experiment 2.1
2.A.2 Additional Details of Experiment 2.2
2.A.3 Additional Details of Experiment 2.4
References

3 Consensus Q-Learning for Multi-agent Cooperative Planning
3.1 Introduction
3.2 Preliminaries
3.2.1 Single Agent Q-Learning
3.2.2 Equilibrium-Based Multi-agent Q-Learning
3.3 Consensus
3.4 Proposed CoQL and Planning
3.4.1 Consensus Q-Learning
3.4.2 Consensus-Based Multi-robot Planning
3.5 Experiments and Results
3.5.1 Experimental Setup
3.5.2 Experiments for CoQL
3.5.3 Experiments for Consensus-Based Planning
3.6 Conclusions
3.7 Summary
References

4 An Efficient Computing of Correlated Equilibrium for Cooperative Q-Learning-Based Multi-Robot Planning
4.1 Introduction
4.2 Single-Agent Q-Learning and Equilibrium-Based MAQL
4.2.1 Single Agent Q-Learning
4.2.2 Equilibrium-Based MAQL
4.3 Proposed Cooperative MAQL and Planning
4.3.1 Proposed Schemes with Their Applicability
4.3.2 Immediate Rewards in Scheme-I and -II
4.3.3 Scheme-I-Induced MAQL
4.3.4 Scheme-II-Induced MAQL
4.3.5 Algorithms for Scheme-I and II
4.3.6 Constraint OmegaQL-I/OmegaQL-II (COmegaQL-I/COmegaQL-II)
4.3.7 Convergence
4.3.8 Multi-agent Planning
4.4 Complexity Analysis
4.4.1 Complexity of CQL
4.4.1.1 Space Complexity
4.4.1.2 Time Complexity
4.4.2 Complexity of the Proposed Algorithms
4.4.2.1 Space Complexity
4.4.2.2 Time Complexity
4.4.3 Complexity Comparison
4.4.3.1 Space Complexity
4.4.3.2 Time Complexity
4.5 Simulation and Experimental Results
4.5.1 Experimental Platform
4.5.1.1 Simulation
4.5.1.2 Hardware
4.5.2 Experimental Approach
4.5.2.1 Learning Phase
4.5.2.2 Planning Phase
4.5.3 Experimental Results
4.6 Conclusion
4.7 Summary
4.A Supporting Algorithm and Mathematical Analysis
References

5 A Modified Imperialist Competitive Algorithm for Multi-Robot Stick-Carrying Application
5.1 Introduction
5.2 Problem Formulation for Multi-Robot Stick-Carrying
5.3 Proposed Hybrid Algorithm
5.3.1 An Overview of ICA
5.3.1.1 Initialization
5.3.1.2 Selection of Imperialists and Colonies
5.3.1.3 Formation of Empires
5.3.1.4 Assimilation of Colonies
5.3.1.5 Revolution
5.3.1.6 Imperialistic Competition
5.4 An Overview of FA
5.4.1 Initialization
5.4.2 Attraction to Brighter Fireflies
5.4.3 Movement of Fireflies
5.5 Proposed ICFA
5.5.1 Assimilation of Colonies
5.5.1.1 Attraction to Powerful Colonies
5.5.1.2 Modification of Empire Behavior
5.5.1.3 Union of Empires
5.6 Simulation Results
5.6.1 Comparative Framework
5.6.2 Parameter Settings
5.6.3 Analysis on Explorative Power of ICFA
5.6.4 Comparison of Quality of the Final Solution
5.6.5 Performance Analysis
5.7 Computer Simulation and Experiment
5.7.1 Average Total Path Deviation (ATPD)
5.7.2 Average Uncovered Target Distance (AUTD)
5.7.3 Experimental Setup in Simulation Environment
5.7.4 Experimental Results in Simulation Environment
5.7.5 Experimental Setup with Khepera Robots
5.7.6 Experimental Results with Khepera Robots
5.8 Conclusion
5.9 Summary
5.A Additional Comparison of ICFA
References

6 Conclusions and Future Directions
6.1 Conclusions
6.2 Future Directions

Index
Arup Kumar Sadhu, PhD, received his doctorate in Multi-Robot Coordination by Reinforcement Learning from Jadavpur University, India, in 2017. He works as a scientist with Research & Innovation Labs, Tata Consultancy Services.

Amit Konar, PhD, received his doctorate from Jadavpur University, India, in 1994. He is a Professor in the Department of Electronics and Tele-Communication Engineering at Jadavpur University, where he serves as the Founding Coordinator of the M. Tech. program on intelligent automation and robotics.