SLAM

Participants: Ralph Etienne-Cummings, Timmer Horiuchi, Nicolai Waniek, Ryad Benjamin Benosman, Michael Mathieu, Will Constable, Daniel Rasmussen, Christian Denk, Zafeirios Fountas, Amir Khosrowshahi, Garrick Orchard, Byron Galbraith

We are going to create an environment in which we can generate place, grid, and border cells as needed. This is based on a grid of velocity/direction-controlled oscillators whose constructive interference can be used to construct place cells. Place cells can also be constructed from perceptual cues, and they can be linked together into a cognitive map.

This project has several sub-projects, listed below.

neuroSLAM

Based on RatSLAM, we implemented spiking-neuron-based odometry. The system is composed of three layers:

* The first layer is composed of oscillators. Each oscillator has a base frequency of 8 Hz and is dephased according to the dot product of the instantaneous velocity v of the robot with the preferred direction d of the oscillator. More formally, for each oscillator we have dw/dt = Omega + dot(v, d), where Omega corresponds to the 8 Hz base frequency and v is the instantaneous velocity of the robot/rat. Each oscillator then has a bank of dephased outputs max(0, sin(w(t) + theta)), where theta typically takes 16 values between 0 and 2*pi. These outputs are the input currents to spiking neurons, one per (oscillator, theta) pair, so that the continuous outputs are transformed into spiking frequencies.

* The second layer is composed of spiking neurons, each taking at most one input per oscillator and spiking only when every connected oscillator spikes at the same time. These neurons are grid cells: their receptive fields (i.e., the robot locations that make the neuron fire) are periodic places on the map. The number of such cells and their connections are fixed before the experiment.

* The third layer is composed of place cells, which are created on the fly. Every time we need a new place cell, a new neuron is created, with input weights proportional to the activity of the currently firing grid cells. This cell aggregates the information from several grid cells, so its spatial period is much larger than that of the grid cells and its response is much more precise. (A simplified sketch of all three layers follows this list.)
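
A minimal, rate-based sketch of the three layers (all names, sizes, and thresholds are illustrative assumptions; the actual implementation converts the oscillator outputs into spikes, which is omitted here):

```python
import numpy as np

N_DIRS, N_PHASES = 8, 16
OMEGA = 2 * np.pi * 8.0  # 8 Hz base frequency, in rad/s

# Preferred directions d, evenly spaced on the circle.
angles = np.linspace(0, 2 * np.pi, N_DIRS, endpoint=False)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (N_DIRS, 2)
thetas = np.linspace(0, 2 * np.pi, N_PHASES, endpoint=False)

phase = np.zeros(N_DIRS)  # w(t), one phase per oscillator

def step_oscillators(v, dt):
    """Layer 1: dw/dt = Omega + dot(v, d). Returns the bank of dephased
    outputs max(0, sin(w + theta)), one per (oscillator, theta) pair."""
    global phase
    phase += (OMEGA + dirs @ np.asarray(v, float)) * dt
    return np.maximum(0.0, np.sin(phase[:, None] + thetas[None, :]))

def grid_cell_fires(outputs, taps, threshold=0.9):
    """Layer 2: coincidence detection. `taps` is a fixed list of
    (oscillator, theta) index pairs; the cell fires only when all of
    its tapped oscillator outputs are simultaneously high."""
    return all(outputs[i, j] > threshold for i, j in taps)

place_cells = []  # Layer 3: weight vectors, created on the fly

def drop_place_cell(grid_activity):
    """Create a new place cell whose input weights are proportional to
    the grid cells firing right now."""
    place_cells.append(np.asarray(grid_activity, float))
```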

During the experiments, we run a robot under several control paradigms (manual control, random walk with obstacle avoidance based on bump sensors, precoded map) and we drop place cells as we move (and special place cells, called border cells, when we hit an object, to mark the obstacle). The place cell spikes are then sent to a neuron-based mapping algorithm that links place cells when they fire simultaneously, effectively building a reachability graph between the place cells, as sketched below.
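
A hedged sketch of the co-firing rule that builds the reachability graph (spiking coincidence is abstracted here as a set of place cell ids active within the same time window; the names are assumptions):

```python
from collections import defaultdict

graph = defaultdict(set)  # place cell id -> ids of linked place cells

def link_cofiring(active_cells):
    """Link every pair of place cells that fire in the same time window."""
    for a in active_cells:
        for b in active_cells:
            if a != b:
                graph[a].add(b)  # symmetric link between co-active cells
```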

Mapping and Path Planning

The activation of place cells to represent a specific region within a given environment provides a good foundation for discovering, through exploration, how actions connect places. By rapidly “storing” the specific experience of traveling from one place to another (a triplet: starting position, action, and ending position), a network of connected place cells begins to form a map. In the research literature there are many variations on when these links can be initiated and what information they carry; we focused on storing the direction of travel (or “action”) between place cells in allocentric (i.e., “world”) coordinates. Each pair of place cells is thus associated with a pair of actions. Determining the specific action that links the activation of two place cells is challenging, because the results of actions must be evaluated across time when a creature does not travel precisely between place field centers and multiple places are simultaneously active. In this project, while we investigated various learning approaches to this problem, we ultimately hand-coded these actions. In addition to associating actions with pairs of place cells, we proposed relatively weak mutually excitatory synaptic connections that can spread activation when the place cells are not being driven by the underlying odometry system (the grid cell or oscillator system; see the previous project).
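
A sketch of what the hand-coded action storage amounts to, assuming the positions of the place field centers are known; the names and the unit-vector encoding of actions are assumptions, not the workshop code:

```python
import numpy as np

links = {}  # (start_id, end_id) -> allocentric unit direction of travel

def store_transition(start_id, end_id, start_pos, end_pos):
    """Store the experience triplet (start place, action, end place).
    The action is the allocentric (world-frame) direction of travel;
    each pair of place cells gets the pair of opposite actions."""
    d = np.asarray(end_pos, float) - np.asarray(start_pos, float)
    d = d / np.linalg.norm(d)
    links[(start_id, end_id)] = d
    links[(end_id, start_id)] = -d
```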

To find paths through the environment based on previous experience, the network of place cells is “disconnected” from the odometry system by suppressing (or otherwise blocking synaptic drive from) the grid cells or oscillators. To find paths within the stored network of place cells, the entire layer of place cells is uniformly biased and activity is stimulated in one or more locations, triggering rapidly spreading activation through the network. By attending to the place cell(s) representing the current location, the direction of the first wavefront of activity to arrive indicates the direction of the shortest path to the stimulated place cell(s).
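
In discrete form, the wavefront dynamics amount to a breadth-first search from the stimulated goal cell. A sketch, reusing the `graph` and `links` structures from the sketches above:

```python
from collections import deque

def first_arrival_direction(graph, links, goal, current):
    """Spread a wavefront from the stimulated goal cell; the neighbor
    through which the wavefront first reaches the current cell gives
    the direction of the shortest path."""
    frontier, parent = deque([goal]), {goal: None}
    while frontier:
        cell = frontier.popleft()
        if cell == current:
            nxt = parent[current]  # first wavefront arrived via this neighbor
            return None if nxt is None else links[(current, nxt)]
        for n in graph[cell]:
            if n not in parent:
                parent[n] = cell
                frontier.append(n)
    return None  # goal not reachable in the stored map
```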

An MPEG-4 video of the path planning dynamics is available below ('cheese2cat1_v3.m4v').

Shown above is an example of the network guiding the simulated motion of a (mouse) robot from a starting position to the closer of two locations with cheese. Shown below is a path through the network discovered by the (real) robot (called “Daisy”) after exploring a hallway environment.

RBM (Restricted Boltzmann Machine) SLAM

The purpose of this project is to determine whether a Restricted Boltzmann Machine or deep network can aid in navigation. The basic idea is to feed in activations from place cells, as well as a heading, and have the network learn to predict where it will be some number of time steps later. A standard DBN with two hidden layers, pretrained in an unsupervised way and then fine-tuned with backpropagation to predict the place cell activations ten time steps ahead, does remarkably well.
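
A rough sketch of the training setup using scikit-learn stand-ins (scikit-learn cannot jointly fine-tune a pretrained DBN, so the final backpropagation stage is approximated here by a regressor on the top-layer features; layer sizes and names are assumptions):

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM, MLPRegressor

def train_predictor(place_activity, headings, horizon=10):
    """place_activity: (T, n_cells) array in [0, 1]; headings: (T, k)."""
    X = np.hstack([place_activity[:-horizon], headings[:-horizon]])
    y = place_activity[horizon:]                    # targets: activations at t+10
    rbm1 = BernoulliRBM(n_components=256).fit(X)    # unsupervised hidden layer 1
    H1 = rbm1.transform(X)
    rbm2 = BernoulliRBM(n_components=128).fit(H1)   # unsupervised hidden layer 2
    H2 = rbm2.transform(H1)
    readout = MLPRegressor(hidden_layer_sizes=(128,), max_iter=500).fit(H2, y)
    return rbm1, rbm2, readout
```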

To determine the accuracy, a script was built that converts place cell activations back into (x, y) coordinates, yielding the visualization in the attached movie http://neuromorphs.net/nm/attachment/wiki/SLAM/rbmslam.avi (4.2 MB).
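
A minimal sketch of such a decoding step, assuming each cell's place field center is known: the reconstructed (x, y) is the activation-weighted centroid of the centers.

```python
import numpy as np

def decode_xy(activations, centers):
    """activations: (n_cells,) place cell activities; centers: (n_cells, 2)
    known place field centers. Returns the activation-weighted centroid."""
    w = np.maximum(np.asarray(activations, float), 0.0)
    return (w[:, None] * np.asarray(centers, float)).sum(axis=0) / w.sum()
```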

In the picture below, the movie is frozen at a point at which the network is predicting t+10 in blue, with the probabilistic heatmap on the left representing where it thinks it will be in ten time steps. The max of this heatmap yields the blue point.

In the figure below, green points are true (x,y) locations, red is reconstructed (x,y) from the place cell activations, and blue circles are the positions predicted at time t+1.

Gradient-Based Navigation (Nengo)

This work was intended to operate on the high-level aspects of the SLAM problem; in particular, how to navigate to a specified location, given a place cell-like representation of the space. The place cell representation was taken as given from other parts of the SLAM project, and here we focused on the navigation/planning problem. The model is written for the Nengo simulation environment and is implemented with spiking leaky integrate-and-fire neurons throughout.

The central feature of our solution is a gradient of activation, spreading out through the place cell network from the target location. The place cell connections are calculated so as to output a slightly decreased version of their input value; thus, when the target place cell is activated, the other cells in the network represent progressively smaller values the farther they are from the target.
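
A discrete-time stand-in for these dynamics (the decay factor and iteration count are illustrative assumptions; the real model settles continuously in spiking neurons):

```python
def settle_gradient(graph, target, decay=0.9, iters=50):
    """Each cell takes a slightly decreased copy of its highest-valued
    neighbor (a local winner-take-all over its inputs), so activation
    falls off with graph distance from the target."""
    value = {cell: 0.0 for cell in graph}
    value[target] = 1.0
    for _ in range(iters):
        for cell in graph:
            if cell != target:
                value[cell] = decay * max((value[n] for n in graph[cell]),
                                          default=0.0)
    return value
```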

In addition to establishing a gradient, the network needs to represent information about the direction of the gradient, to allow the agent to navigate to the target. This is accomplished by having each place cell represent a D-dimensional value, where D is the number of incoming connections to that place cell (note that we are using “cell” here in an abstract sense, to refer to a population of neurons associated with that location rather than a single neuron). This population is then recurrently connected in a mutually inhibitory fashion, so that each cell computes a local winner-take-all over its inputs. Thus each place cell represents the value of its highest-valued incoming connection, which indicates the direction up the gradient.

After calculating a gradient, the network includes an action selection component that allows the agent to use the gradient to navigate to the target. The basic idea is that each connection to a place cell is associated with an action, which simply indicates a direction of travel. Once the place cell has determined the direction of the gradient, it inhibits the actions associated with all of its connections except the selected one. Thus each place cell simultaneously indicates the action that leads up the gradient from its location. To narrow this down to one choice, the actions are all weighted by the current response of the place cell. Since place cells respond to the location of the agent, the action selected will be the one corresponding to the agent’s current location (or some weighted sum of that cell and its neighbours). As the agent follows that action it moves into the range of a different place cell, whose action is then selected, allowing the agent to continue following the gradient uphill from the new location. If a new target is selected, a new gradient is established and the agent automatically begins moving in the new direction.
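
A sketch of this weighted action blending, reusing the `graph`, `links`, and gradient `value` structures from the sketches above (the dictionary of place cell responses is an assumed stand-in for the population activity):

```python
import numpy as np

def select_action(graph, links, value, responses):
    """Each place cell votes for the action toward its highest-valued
    neighbor (the direction up the gradient), weighted by how strongly
    the cell responds to the agent's current position; the votes blend
    into a single continuous direction of travel."""
    move = np.zeros(2)
    for cell, r in responses.items():
        uphill = max(graph[cell], key=lambda n: value[n], default=None)
        if uphill is not None:
            move += r * links[(cell, uphill)]
    norm = np.linalg.norm(move)
    return move / norm if norm > 0 else move
```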

This method of navigation has a number of nice advantages. First, it is completely continuous: the agent does not use discrete planning/decision-making/action time steps; it is always smoothly combining action and navigation as it moves through the space. It also nicely interpolates between place cells, given a somewhat distributed place cell activation, allowing the agent to move in directions other than those encoded in any specific cell. Effectively, this method combines the navigational knowledge of all the nearby place cells, choosing the best overall direction of travel. These features also make the network very robust and flexible. The agent never builds a complex plan that can be disrupted by unexpected events; it is always acting on local information (although that local information encodes the direction to the global goal). Thus, if the network is disrupted, or the agent’s movements do not take it where it planned, it simply continues navigating based on the new circumstances with no interruption.

Integration with other projects: As mentioned at the beginning, this system represents just one high-level aspect of the overall SLAM problem. The goal at the beginning of the workshop was to integrate this system with the other projects to provide a complete SLAM story. To some extent this was successful. We were able to build a place cell map based on the output of Timmer’s place cell mapping system, and then navigate through that map using the system described here. We were also able to run this model on SpiNNaker, which in turn controlled a real robot’s actions based on the model’s output. However, there are some unresolved problems with the model’s translation onto SpiNNaker, which means that although the model works fairly well, it does not navigate as accurately as when simulated purely in Nengo.

The main aspect we didn’t integrate into this system is the place cell activation based on the agent’s current location. In the current system this activation is just artificially simulated based on the agent’s known location. However, in the future it would be interesting to replace this with a real place cell activation model based on odometry information, such as that developed by Ralph’s group.

A video of the system in action is attached (gradientdemo2.avi), as well as the source code (gradientaction.zip).

A video of the system running on SpiNNaker is attached (nef_spinnaker_robot.avi) and can be seen in the screenshot below.

NEF model running on the SpiNNaker Robot - screenshot

A variant of the planning problem is navigating through multiple goals. Such planning is necessary when goals are in different places and must be picked up sequentially to construct an end product. The solution is based on Timmer's place cell mapping system. Within the agent's internal map, each goal propagates a signal that reaches the agent's current location. Once the signal is received, the agent moves a step toward that goal. Neural mechanisms are built in to prevent the signals from two goals from confusing the agent about where to go. The agent reaches the nearest goal first, followed by the farther goals. Once a goal is reached, it no longer sends out signals.
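
A discrete sketch of this multi-goal behavior over the place cell reachability graph (breadth-first search plays the role of the propagating signals; the neural mechanisms that arbitrate between competing goals are not modeled):

```python
from collections import deque

def hops_and_first_step(graph, goal, current):
    """Wavefront from `goal`: return (first hop from `current`, path length)."""
    frontier, parent = deque([goal]), {goal: None}
    while frontier:
        cell = frontier.popleft()
        if cell == current:
            hops, c = 0, current
            while parent[c] is not None:
                c, hops = parent[c], hops + 1
            return parent[current], hops
        for n in graph[cell]:
            if n not in parent:
                parent[n] = cell
                frontier.append(n)
    return None, float("inf")

def visit_all(graph, start, goals):
    """Step toward whichever goal's signal arrives first (the nearest);
    a reached goal stops signaling."""
    pos, remaining, path = start, set(goals) - {start}, [start]
    while remaining:
        step, _ = min((hops_and_first_step(graph, g, pos) for g in remaining),
                      key=lambda sh: sh[1])
        if step is None:
            break  # remaining goals unreachable from here
        pos = step
        path.append(pos)
        remaining.discard(pos)  # reached goals no longer send signals
    return path
```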

An image of the completed path after travelling through the goals is given below.

Attachments