Adaptive Behavior

 

Adaptive Behavior, Vol. 13, No. 2, 131-148 (2005)
DOI: 10.1177/105971230501300205

Actor–Critic Models of Reinforcement Learning in the Basal Ganglia: From Natural to Artificial Rats

Mehdi Khamassi

AnimatLab, LIP6, Paris, France; LPPA, CNRS–Collège de France, Paris, France

Loïc Lachèze

AnimatLab, LIP6, Paris, France

Benoît Girard

AnimatLab, LIP6, Paris, France; LPPA, CNRS–Collège de France, Paris, France

Alain Berthoz

LPPA, CNRS–Collège de France, Paris, France

Agnès Guillot

AnimatLab, LIP6, Paris, France

Since 1995, numerous Actor–Critic architectures for reinforcement learning have been proposed as models of dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. However, these models were usually tested on different tasks, which makes it difficult to compare their efficiency for an autonomous animat. We present here a comparison of four architectures in an animat as it performs the same reward-seeking task. This illustrates the consequences of different hypotheses about the management of different Actor sub-modules and Critic units, and their more or less autonomously determined coordination. We show that the classical method of coordinating modules by a mixture of experts, depending on each module’s performance, did not allow solving our task. We then address the question of which principle should be applied to combine these units efficiently. Improvements for Critic modeling and the accuracy of Actor–Critic models for a natural task are finally discussed in the perspective of our Psikharpax project, an artificial rat having to survive autonomously in unpredictable environments.

Key Words: animat approach • TD learning • Actor–Critic model • S–R task • taxon navigation




This article has been cited by other articles:


M. Witkowski
An Action-Selection Calculus
Adaptive Behavior, March 1, 2007; 15(1): 73-97.