
Computationally Intensive and Noisy Tasks: Co-Evolutionary Learning and Temporal Difference Learning on Backgammon

Motivation

Experimental Setup

A Benchmark and a Representation


Fitness Function and Other Parameters Measuring Genetic Diversity

Results

How Big a Population, How Many Games


Corollary: Small Population, More Games is Worse

Conclusions

Experimental Setup

Pubeval (trained by Temporal Difference learning) is the benchmark. The co-evolutionary system here also uses Pubeval's simple linear representation.

On a more sophisticated neural network architecture, this method created the world's best Backgammon computer, TD-Gammon.


Pubeval is two linear functions: one for the main part of a game of Backgammon, and one for the final racing stage, when pieces no longer have to pass the opponent's pieces. This racing stage is less interesting than the main part, because there is an algorithm to exactly solve the end game.

Co-evolution here only optimizes the first function; the final racing part of the game uses Pubeval's racing weights.
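As a rough sketch of this two-function design (the feature encoding and weight values below are placeholders, not Pubeval's actual numbers), the evaluation picks whichever weight vector matches the current phase and takes a weighted sum over the board features:

```python
def evaluate(features, contact_weights, race_weights, is_race):
    """Score a board position with the phase-appropriate linear function.

    `features` is a numeric encoding of the board; only the race weights
    are fixed (taken from Pubeval), while the contact weights are the
    part that co-evolution optimizes.
    """
    weights = race_weights if is_race else contact_weights
    return sum(w * f for w, f in zip(weights, features))
```

In play, the higher-scoring candidate move is chosen; the linear form keeps each evaluation cheap, which matters when fitness requires hundreds of games.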

Measuring Genetic Diversity

A popular measure of genetic diversity is the Shannon index. Given n different groups, each holding a fraction f_i of the total number of individuals, the Shannon index is

H = -sum_{i=1}^{n} f_i ln(f_i)
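The Shannon index is straightforward to compute from group counts; a minimal sketch:

```python
import math

def shannon_index(counts):
    """Shannon diversity index H = -sum(f_i * ln f_i) over group fractions f_i.

    `counts` gives the number of individuals in each group; empty groups
    contribute nothing (the limit of f*ln(f) as f -> 0 is 0).
    """
    total = sum(counts)
    fractions = [c / total for c in counts if c > 0]
    return -sum(f * math.log(f) for f in fractions)
```

H is maximized (H = ln n) when all n groups are equally sized, and falls to 0 when everyone is in one group, so a drop in H signals a loss of diversity.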

The question facing this paper is: for a given way to represent a solution, how can co-evolutionary learning obtain the highest ability from the least CPU time?

Results (How Big a Population, How Many Games)

A plausible answer is that, on this noisy task, more samples per evaluation would help. Do those extra games make any difference at all?

Sampling more games would more accurately discern the differences in ability among the members of the population. It turns out that the extra precision in those evaluations does indeed have an effect, but a negative one: more games reduce the behavioral diversity, which in turn requires even more evaluations to discern those smaller differences between players.
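The precision argument can be made concrete. A win rate estimated from G independent games is a binomial proportion, so its standard error shrinks only as 1/sqrt(G); distinguishing two players whose true win rates differ by d requires the error to be small relative to d. A minimal sketch (the function name is illustrative, not from the paper):

```python
import math

def win_rate_std_error(p, games):
    """Standard error of a win-rate estimate from `games` independent games,
    where p is the true probability of winning one game."""
    return math.sqrt(p * (1 - p) / games)
```

At p = 0.5, 100 games give an error of about 0.05, while 1600 games are needed to shrink it to 0.0125, a fourfold gain in precision for sixteen times the CPU time. If more games also compress the spread of abilities in the population, the precision needed to tell players apart grows even as it gets more expensive to obtain.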

Corollary: Small Population, More Games is Worse

Since we're interested in achieving a representation's peak ability from the least CPU time, it is reasonable to ask what happens when the population size is barely large enough, instead of generously large. Smaller populations use less CPU time. But a smaller population has more trouble maintaining diversity, and Figures 3 and 4 show that, with such a small population, playing more games per evaluation makes learning worse.

Co-Evolution, What Is It Good For?

The answer depends partly on your learning task and computational resources.

If your task is such that a small improvement doesn't count, and parallel hardware is unavailable, then temporal difference learning is attractive. If the task requires the best possible competitive advantage, and coming second means losing, then using much more CPU time for the best possible result may be worth it. If parallel hardware is available, co-evolution becomes attractive.

Conclusion

Use a generously large population: more noise requires a larger population. If you skimp on population size, more evaluations can make learning worse, not better, because of their tendency to reduce diversity. Use just enough evaluations that adding more does not improve learning; how many is enough depends on the task and your implementation. Here, each individual needs to take part in about 1600 games.

Computationally intensive noisy tasks are tractable to co-evolutionary learning on inexpensive parallel hardware, and, given enough computational power, co-evolution can create a solution comparable to Temporal Difference learning.
