Computational Cognitive Neuroscience Methods for Guided Reinforcement Learning

Members: Aleksandrs Ecins, Ashley Kleinhans, Adam McLeod, Christian Huyck, Ching Teo, Daniel B. Fasnacht, John Harris, Janelle Szary, Jonathan Tapson, Kailash Patil, Mounya Elhilali, Nicolas Oros, Michael Pfeiffer, Sergio Davies, Shih-Chii Liu, Timmer Horiuchi, Tobi Delbruck, Terry Stewart

- Organized by John Harris & David Noelle

John Harris      harris@…         26-Jun to 16-Jul
David Noelle     dnoelle@…        25-Jun to 17-Jul
Nicolas Oros     oros.le.russe@…  25-Jun to 6-Jul
Christian Huyck  C.Huyck@…        4-Jul to 10-Jul
Chris Kello      ckello@…         8-Jul to 16-Jul

BACKGROUND: Previous efforts in neuromorphic engineering have largely focused on perception. As accomplishments in this domain accumulate, there is an increasing interest in shifting these explorations to more abstract levels of representation. Rather than identifying features in patterns of light or sound, or cross-modal features that demand sensory integration, the focus would be on manipulating representations of more abstract entities, like situations, actions, constraints, rewards, and costs. This is the domain of situated learning and integrative generation of behavior. In short, this is the domain of cognition.

PROBLEM: The focus of the workshop will be on translating computational cognitive neuroscience models of human learning into artificial neuromorphic systems that display adaptation at the cognitive level. Of particular interest will be modern models of reinforcement learning based on interactions between dopamine nuclei and both the striatum and cortical areas. Models of interactions between the basal ganglia and the frontal cortex will also play a central role. These models provide neurocomputational methods for learning to generate sequences of actions so as to optimize expected reward. Integration with the frontal cortex provides mechanisms for explicitly guiding the learning process, offering a working memory for relevant past experiences and focusing attention on relevant features, with the option of such explicit guidance arriving as verbal instructions. The goal is to produce a biologically grounded reinforcement learning agent which can be guided by explicit, linguistically structured instructions. The capabilities of such an agent could be tested in a variety of task domains, including gambling tasks, game playing, search tasks, and routine sequential tasks (e.g., learning to prepare a cup of coffee).
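
As a minimal illustration of the learning mechanism described above, the sketch below implements a tabular actor-critic in Python, where the temporal-difference error delta plays the role of the phasic dopamine signal: it trains both the critic (a striatum-like value estimate) and the actor (action preferences). The toy environment, its size, and all parameters are illustrative assumptions, not part of the models discussed here.

```python
import numpy as np

n_states, n_actions = 5, 2
V = np.zeros(n_states)               # critic: value of each state ("striatum")
H = np.zeros((n_states, n_actions))  # actor: action preferences
alpha_v, alpha_h, gamma = 0.1, 0.1, 0.95
rng = np.random.default_rng(0)

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

def step(state, action):
    """Toy ring environment: reward for action 1 in the last state."""
    reward = 1.0 if (state == n_states - 1 and action == 1) else 0.0
    return (state + 1) % n_states, reward

state = 0
for t in range(20000):
    action = rng.choice(n_actions, p=softmax(H[state]))
    next_state, reward = step(state, action)
    delta = reward + gamma * V[next_state] - V[state]  # dopamine-like RPE
    V[state] += alpha_v * delta                        # critic update
    H[state, action] += alpha_h * delta                # actor update
    state = next_state
```

After training, the action preferences in the rewarded state strongly favor the rewarded action, and the value estimates rise for reward-predicting states, mirroring the forward shift of dopamine responses from reward delivery to reward-predicting cues.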

SOLUTION: The literature contains a number of computational cognitive neuroscience models of dopamine-based reinforcement learning and attentional modulation by circuits in the prefrontal cortex. Specifically, models of these processes, capturing the behavior of humans on laboratory tasks, have been produced using the Leabra modeling framework. We propose extending these models to demonstrate scaling beyond the previously modeled laboratory tasks, as well as integrating such cognitive models with neuromorphic sensory systems. Previously developed neuromorphic systems for vision and audition are to be incorporated, providing a means both for the resulting artificial agent to sense its environment and for the presentation of explicit instructions to the agent. Learning in the resulting system will be explored both at a “developmental” time scale, during which the system comes to learn the (restricted) language of instruction, and at a task-learning time scale, during which the system learns to perform a task both from explicit instruction and from the delivery of reward.

POSSIBLE STUDENT PROJECTS: The typical student project will either extend an existing Leabra cognitive model, demonstrating the ability of the model to scale beyond a simple laboratory task, or will integrate a cognitive model with the output of an existing neuromorphic sensory system. Possible topics include:

• associating instructional tokens with reward, allowing utterances to act as secondary reinforcers (see the first sketch after this list)

• associating instructional tokens with environmental features, allowing utterances to guide attention (see the second sketch after this list)

• learning to parse instructions as abstract rules, associating situation features with actions

• learning multi-step tasks in which the current state of the environment is fully observable, as well as tasks in partially observable environments in which the current state can only be determined with the help of memory (see the third sketch after this list)

• automaticity: reducing dependence on frontal cortex systems with extended practice
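
For the first project idea above, the sketch below shows how an instructional token could acquire reinforcing value through temporal-difference learning; the three-state Pavlovian sequence and all parameters are hypothetical placeholders.

```python
import numpy as np

# States of a trivial Pavlovian sequence: 0 = baseline, 1 = instructional
# token presented, 2 = primary reward delivered. Through TD learning the
# token state acquires value, so its onset can itself act as a reinforcer.
V = np.zeros(3)
alpha, gamma = 0.2, 0.9

for trial in range(200):
    delta = 0.0 + gamma * V[1] - V[0]   # baseline -> token (no primary reward)
    V[0] += alpha * delta
    delta = 1.0 + gamma * V[2] - V[1]   # token -> reward (primary reward = 1)
    V[1] += alpha * delta

print(V)  # V[1] -> 1.0: hearing the token now generates a positive
          # prediction error, reinforcing whatever action preceded it
```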
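
For the second idea, utterance-guided attention, one very simple reading is that a token selects a mask over input features, so that the value estimate (and hence learning) is driven only by the attended features. The token-to-mask mapping below is hand-wired and hypothetical; in the models discussed above it would itself be learned.

```python
import numpy as np

n_features = 4
w = np.zeros(n_features)                       # value weights over features
masks = {"color": np.array([1., 1., 0., 0.]),  # hypothetical token -> mask map
         "shape": np.array([0., 0., 1., 1.])}
alpha = 0.1

def learn(x, token, reward):
    gated = x * masks[token]           # the utterance gates the input features
    delta = reward - float(w @ gated)  # prediction error on attended features
    w[:] += alpha * delta * gated      # only attended weights are updated

x = np.array([1., 0., 0., 1.])  # object with color feature 0 and shape feature 3
learn(x, "color", reward=1.0)   # only the color weights move
```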
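
For the third idea, the sketch below handles a partially observable task by augmenting the (ambiguous) current observation with a one-slot working memory holding an earlier cue, a crude stand-in for the prefrontal gating mechanisms discussed above. The task structure and parameters are made up for illustration.

```python
import numpy as np

# At the choice point the immediate observation is identical across trials,
# so action selection must condition on a remembered cue instead.
rng = np.random.default_rng(1)
Q = np.zeros((2, 2))   # Q[remembered cue, action]
alpha, eps = 0.2, 0.1

for trial in range(2000):
    cue = rng.integers(2)          # observed at the start of the trial...
    memory = cue                   # ...and stored in working memory
    if rng.random() < eps:         # epsilon-greedy choice at the choice point
        action = rng.integers(2)
    else:
        action = int(np.argmax(Q[memory]))
    reward = 1.0 if action == cue else 0.0
    Q[memory, action] += alpha * (reward - Q[memory, action])
```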


The first name listed is the project "captain" who calls for meetings ...

• Humanoid: Adam, Brian, Sam A., Katie, Jay

Focus on getting all the necessary subsystems working

• Object Recognition: Sam S., Brian, Janelle, Roi (with Michael and Ashley in supporting roles)

  1. Point to or move to an object as indicated by speech
  2. Learn which objects should be approached

• Sentence Understanding: Harris (with Trent, Jay, and Nicolas on hypothesis generation)

  1. Simple
  2. Hypothesis generation

• Speech Recognition using images: Tapson, Brian, Shih-Chii, Katie, Kailash, Roi

  1. Simple
  2. Gender identification?
  3. Speaker ID
  4. Using Shih-Chii's spiking cochlea

• Virtual world example: David, Adam, Brian, Sam A.

  1. Single-agent
  2. Multi-agent


Results from our workgroup's three projects are described on this web page.