Classify MNIST digits recorded with the ATIS silicon retina sensor, using TrueNorth, trained with Caffe

Team: Paul Merolla, Kate Fischl, Garrick Orchard

Motivation

Neuromorphic sensors, such as the ATIS and DVS image sensors, convert images into a rich temporal spike code. Using these sensors as a front end for TrueNorth opens up the possibility of a fast, low-power object recognition system that operates entirely on spikes.

Our approach

Realizing the real-time digit classification system involves two main steps. The first is creating and training the object recognition model to run on TrueNorth. The second is integrating the ATIS sensor with TrueNorth for live, real-time operation.

Model Creation and Training

We used a publicly available spike-based conversion of the MNIST dataset, recorded with the ATIS sensor mounted on a pan-tilt unit while viewing MNIST digits on a computer monitor. Details of the dataset creation, as well as a download of the dataset itself, are available at: http://www.garrickorchard.com/datasets/n-mnist

The continuous spike stream was converted to static images for training by accumulating spikes over 10 ms windows. These static images were used to create a Lightning Memory-Mapped Database (LMDB), on which training was performed using the Caffe deep learning framework (modified to support TrueNorth). A simple single-layer neural network was used, with 100 neurons trained to respond to each of the 10 digits. The final output of the system is a histogram of the number of spikes output by the neurons representing each class. The class with the most spikes is taken as the classification result.
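The two ends of this pipeline, 10 ms spike accumulation on the input side and the spike-count histogram on the output side, can be sketched as follows. This is a minimal illustration, not the code used in the project: the function names, the microsecond timestamp unit, the (t, x, y) event layout, and the neuron_to_class lookup array are all assumptions made for the example, and the LMDB/Caffe training step is omitted.

```python
import numpy as np

def accumulate_frames(t_us, x, y, width, height, window_us=10_000):
    """Bin events into spike-count images, one image per time window.

    t_us, x, y are equal-length sequences (timestamps assumed to be in
    microseconds). The frame size is a parameter and must cover every
    event coordinate.
    """
    t_us = np.asarray(t_us, dtype=np.int64)
    x = np.asarray(x, dtype=np.intp)
    y = np.asarray(y, dtype=np.intp)
    if t_us.size == 0:
        return np.zeros((0, height, width), dtype=np.uint16)
    # Index of the window each event falls into, relative to the first event.
    idx = ((t_us - t_us.min()) // window_us).astype(np.intp)
    frames = np.zeros((int(idx.max()) + 1, height, width), dtype=np.uint16)
    # np.add.at accumulates correctly when the same pixel repeats.
    np.add.at(frames, (idx, y, x), 1)
    return frames

def classify_from_spikes(neuron_ids, neuron_to_class, n_classes=10):
    """Vote by output spike count: the class with the most spikes wins.

    neuron_to_class is a hypothetical lookup array giving the class label
    of each output neuron. With no spikes at all, class 0 is returned.
    """
    neuron_ids = np.asarray(neuron_ids, dtype=np.intp)
    labels = np.asarray(neuron_to_class, dtype=np.intp)[neuron_ids]
    counts = np.bincount(labels, minlength=n_classes)
    return int(np.argmax(counts)), counts
```

Ties in the histogram resolve to the lowest class index here, which is simply the behavior of np.argmax; the source does not say how ties were handled.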

Real-Time Operation

A laptop powers and interfaces with the ATIS sensor, which is mounted on a helmet worn by a user. This laptop performs simple noise filtering on the ATIS spikes and activity-based tracking of the MNIST digit on a screen. Spikes occurring within the tracked 28x28 pixel region of interest are remapped to target the corresponding cores and axons on TrueNorth (sometimes multiple axons per spike). The laptop accumulates spikes until 130 are available for classification, at which point all 130 spikes are sent to TrueNorth over UDP.
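The region-of-interest filtering and 130-spike batching described above could look roughly like the sketch below. Everything named here is hypothetical: the SpikeBatcher class, the flat pixel index used in place of the real core/axon remapping, and the little-endian 16-bit packet layout are stand-ins chosen for illustration, since the source does not describe the actual TrueNorth input format. Noise filtering and tracking are not shown.

```python
import socket
import struct

class SpikeBatcher:
    """Collect events inside a square ROI and release them in fixed-size batches."""

    def __init__(self, roi_x, roi_y, size=28, batch_size=130):
        self.roi_x, self.roi_y = roi_x, roi_y  # top-left corner of the ROI
        self.size = size
        self.batch_size = batch_size
        self.pending = []

    def set_roi(self, roi_x, roi_y):
        """Move the ROI, e.g. when the tracker updates the digit position."""
        self.roi_x, self.roi_y = roi_x, roi_y

    def push(self, x, y):
        """Add one event; return a full batch of ROI-local (x, y) pairs, else None."""
        lx, ly = x - self.roi_x, y - self.roi_y
        if not (0 <= lx < self.size and 0 <= ly < self.size):
            return None  # event falls outside the tracked region
        self.pending.append((lx, ly))
        if len(self.pending) < self.batch_size:
            return None
        batch, self.pending = self.pending, []
        return batch

def pack_batch(batch, size=28):
    """Serialize a batch as flat pixel indices (placeholder wire format)."""
    indices = [ly * size + lx for lx, ly in batch]
    return struct.pack("<%dH" % len(indices), *indices)

def send_batch(sock, address, batch):
    """Send one batch as a single UDP datagram to a (host, port) address."""
    sock.sendto(pack_batch(batch), address)
```

A caller would create the socket once with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) and call send_batch each time push returns a batch. With 130 two-byte indices, each datagram in this placeholder format is 260 bytes.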

The trained neural network runs on TrueNorth, and its output spikes are sent over UDP to a second laptop, which visualizes the results.

Results

On the spiking MNIST test set, we achieved 76%-80% accuracy at 100 classifications/sec. The classification rate is limited by the fact that we use 10 ms of data for each classification; TrueNorth itself is capable of performing 1000 such classifications per second. In the real-time system, the classifier on TrueNorth uses only 4 cores (0.1% of the chip). Temporally, utilization of these 4 cores is below 10% (i.e. the 4 cores are idle >90% of the time) when performing 100 classifications per second.
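The figures above follow from simple arithmetic, reproduced in the sketch below. The 4096 cores-per-chip value is an assumption taken from the published TrueNorth specification rather than from this report, and the function name is made up for the example. The temporal figure is an upper bound obtained by dividing the achieved rate by the stated peak rate.

```python
def utilization_summary(window_ms=10, peak_rate_hz=1000,
                        cores_used=4, cores_per_chip=4096):
    """Recompute the rate and utilization figures quoted in the results."""
    rate_hz = 1000 / window_ms  # one classification per 10 ms window
    return {
        "rate_hz": rate_hz,
        # Spatial utilization: fraction of the chip's cores in use.
        "core_fraction": cores_used / cores_per_chip,
        # Temporal utilization bound: achieved rate over peak rate.
        "temporal_bound": rate_hz / peak_rate_hz,
    }
```

With the defaults this gives 100 classifications per second, a core fraction of 4/4096 (just under 0.1%), and a temporal bound of 100/1000 = 10%, consistent with the numbers reported.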

Future directions

Other training schemes are possible, and we should explore them to see which achieves the best performance. Other possibilities for building the input are to accumulate a constant number of events (instead of a constant time window), or to use a more continuous approach that integrates incoming spikes with a decay.
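As a rough illustration of these two alternatives, the sketch below builds frames from a fixed number of events and, separately, integrates events into a surface that decays exponentially over time. Neither was implemented in this project; the function names, the exponential form of the decay, and the time constant tau_us are assumptions chosen for the example.

```python
import numpy as np

def constant_event_frames(x, y, width, height, events_per_frame):
    """Build one spike-count image per fixed-size group of events.

    Events left over after the last full group are dropped.
    """
    x = np.asarray(x, dtype=np.intp)
    y = np.asarray(y, dtype=np.intp)
    n_frames = len(x) // events_per_frame
    n_used = n_frames * events_per_frame
    frames = np.zeros((n_frames, height, width), dtype=np.uint16)
    idx = np.arange(n_used) // events_per_frame  # frame index of each event
    np.add.at(frames, (idx, y[:n_used], x[:n_used]), 1)
    return frames

def leaky_surface(t_us, x, y, width, height, tau_us):
    """Integrate events with exponential decay; return the final surface.

    Timestamps are assumed to be sorted in ascending order.
    """
    surface = np.zeros((height, width))
    last_t = None
    for ti, xi, yi in zip(t_us, x, y):
        if last_t is not None:
            # Decay the whole surface by the time elapsed since the last event.
            surface *= np.exp(-(ti - last_t) / tau_us)
        surface[yi, xi] += 1.0
        last_t = ti
    return surface
```

Constant-event frames adapt their duration to scene activity, while the leaky surface can be sampled at any moment without waiting for a window to close.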

Our approach can be scaled to more complex datasets. Ideally, we would like to use spatio-temporal data so that we can exploit more precise spike timing and achieve better performance than traditional frame-based systems.
