Convolution-Based Event-Driven Stereo Computation on Bernabe's Hardware

In this workgroup we wanted to experiment with multiple DVS cameras and event-based convolution modules for visualizing objects in 3D. For this, we brought a set of PCB hardware modules (the chips were developed by IMSE (http://www.imse-cnm.csic.es) and the PCBs jointly with ATC (http://www.atc.us.es)):

- A set of eight DVS event-driven cameras from IMSE-CNM-CSIC (Sevilla Microelectronics Institute).

2rets

These are enhanced-contrast-sensitivity DVS cameras (similar to those described in IEEE J. Solid-State Circuits, June 2011, pp. 1443-1455, but with low-power operation).

- Three in-house "Nodeboard" PCBs, each holding a Spartan-6 FPGA, four SATA connectors for 2.5 Gbps RocketI/O serial AER interfacing, and parallel AER pins.

nodeboard

This board can be used to program parallel and serial AER modules, such as mergers, splitters, convolution filters, and routers, as well as arrays of convolvers configured in a 2D mesh. These boards can be complemented with plug-in modules, like the one shown in the figure below, which holds a 64x64-pixel high-speed convolution ASIC (see IEEE J. Solid-State Circuits, Feb. 2012, pp. 504-517).

conv64
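As a point of reference for how such a chip operates, the per-event behavior of an event-driven convolution module can be modeled in a few lines of software. The sketch below (Python; the kernel, threshold, and reset scheme are illustrative assumptions, not the ASIC's actual parameters) adds the kernel around each incoming event's address and emits output events wherever a pixel accumulator crosses threshold.

```python
import numpy as np

def make_conv_module(kernel, threshold, size=64):
    """Per-event software model of an event-driven convolution module
    (64x64 pixel array, as in the ConvModule ASIC; parameters here are
    illustrative only)."""
    state = np.zeros((size, size), dtype=np.float32)
    kh, kw = kernel.shape
    rh, rw = kh // 2, kw // 2

    def on_event(x, y, polarity):
        """Accumulate the signed kernel centered on (x, y), then emit
        an output event for every accumulator that crossed threshold."""
        y0, y1 = max(0, y - rh), min(size, y + rh + 1)
        x0, x1 = max(0, x - rw), min(size, x + rw + 1)
        state[y0:y1, x0:x1] += polarity * kernel[y0 - (y - rh):y1 - (y - rh),
                                                 x0 - (x - rw):x1 - (x - rw)]
        fired = np.argwhere(state >= threshold)
        state[state >= threshold] = 0.0        # reset pixels that fired
        return [(int(px), int(py)) for (py, px) in fired]

    return on_event
```

Because output events are produced as part of processing each input event, the input and output flows are effectively simultaneous, which is the property exploited later for stereo matching.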

The Nodeboard can also, for example, be programmed to fuse the event flows from four DVS retinas into a single serial SATA AER flow, as shown in the figure below.

4rets
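In software terms, the fusion performed by this configuration amounts to interleaving several timestamp-sorted event streams while tagging each event with its source retina. A minimal sketch is given below (Python; the (timestamp, x, y, polarity) event tuple is an assumed representation).

```python
import heapq

def tag(stream, source_id):
    """Append a source id to each (timestamp, x, y, polarity) event,
    analogous to using spare address bits on the serial AER link."""
    for t, x, y, pol in stream:
        yield (t, x, y, pol, source_id)

def merge_retina_streams(*streams):
    """Merge timestamp-sorted AER streams from several retinas into a
    single flow; heapq.merge preserves global timestamp order without
    buffering entire streams."""
    return heapq.merge(*(tag(s, i) for i, s in enumerate(streams)))
```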

- We also brought a commercial Virtex-6 prototyping board from Xilinx (ML605), which can be programmed to hold a mesh of up to 8x8 convolution modules with routers (see C. Zamarreño-Ramos et al., "Multi-Casting Mesh AER: A Scalable Assembly Approach for Reconfigurable Neuromorphic Structured AER Systems. Application to ConvNets," IEEE Trans. BioCAS, in press).

The goal of this project was to perform stereo vision with DVS cameras, using the following cues to match events captured by separate cameras:

a) matched events have to appear within a narrow time window (for example, 1 ms);

b) matched events have to satisfy epipolar geometric constraints;

c) each retina's AER flow is sent to three separate Gabor filters for orientation detection. Event-driven convolution produces output events simultaneously with its input events; consequently, the detected orientations also have to match among the corresponding events of the different cameras. (A sketch of how such Gabor kernels can be generated follows this list.)
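The Gabor kernels for orientation detection can be computed offline and loaded into the convolution modules. The sketch below is a minimal version (Python; the kernel size, wavelength, and envelope parameters are assumptions, not the values programmed at the workshop) and produces three orientations per retina, which could be fed to the convolution model sketched earlier.

```python
import numpy as np

def gabor_kernel(theta, size=11, wavelength=6.0, sigma=3.0, gamma=0.5):
    """Real-valued Gabor kernel at orientation theta (radians);
    all shape parameters here are illustrative."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / wavelength)

# One bank of three orientations per retina, e.g. 0, 60 and 120 degrees.
kernels = [gabor_kernel(np.deg2rad(a)) for a in (0, 60, 120)]
```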

In order to achieve these goals, the hardware setup shown in the figure below was used. It contains two AER DVS cameras, a merger module, the ML605 Virtex-6 board, which implements the six required Gabor filters and fuses their outputs together with the two input retina flows, and a USBAERmini2 board that collects all events and sends them over USB 2.0 to a laptop for visualization and recording with jAER.

stereo_hw

With this setup we recorded moving objects, as shown in the two figures below (as well as in the attachment IMGP1694.AVI, available at the bottom of this page).

cube3d_a cube3d_b

The figure below shows a 40 ms capture, taken directly from jAER, of all the events from the two retinas and the six Gabor filters. The top three Gabor outputs correspond to Retina 2 and the bottom three to Retina 1.

jaer

After recording the timestamped events, we can display and analyze them in MATLAB. For example, the figure below displays the output events of the two retinas, together with their three Gabor filter output events, for a 5 ms time window. The captured events are marked with blue dots. For each Gabor filter output, the corresponding retina events have been superimposed using green (Retina 1) or magenta (Retina 2) circles. As can be seen, the Gabor orientation-detection events overlap in time with the retina events within this 5 ms window, highlighting the simultaneity of input and output event flows in event-driven convolution hardware.

jAER_matlab
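The same kind of offline analysis can be reproduced with a short script. The sketch below (Python rather than MATLAB; it assumes the AEDAT-2.0 layout written by jAER, i.e. big-endian 32-bit address plus 32-bit microsecond timestamp per event, and illustrative address bit masks, since the exact bit layout varies from camera to camera) extracts the events falling in a 5 ms window.

```python
import numpy as np

def load_aedat(path):
    """Read a jAER AEDAT-2.0 file: '#'-prefixed ASCII header lines,
    then big-endian records of 32-bit address + 32-bit timestamp (us)."""
    with open(path, 'rb') as f:
        while True:
            pos = f.tell()
            if not f.readline().startswith(b'#'):
                f.seek(pos)                 # rewind past the header
                break
        raw = np.frombuffer(f.read(), dtype='>u4')
    addr, ts = raw[0::2], raw[1::2].astype(np.int64)
    # Address decoding is camera-specific; these masks are illustrative.
    x = (addr >> 1) & 0x7F
    y = (addr >> 8) & 0x7F
    pol = addr & 0x1
    return ts, x, y, pol

ts, x, y, pol = load_aedat('stereo_recording.aedat')   # hypothetical file
window = (ts >= ts[0]) & (ts < ts[0] + 5000)           # 5 ms = 5000 us
print(f'{window.sum()} events in the first 5 ms')
```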

The computation starts by calibrating the stereo system and estimating the fundamental matrix. This is done by acquiring matched locations with a calibration pattern and feeding them to a RANSAC-type estimation.
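With OpenCV, this estimation is available directly once the matched locations have been collected. The sketch below uses synthetic placeholder matches and common default parameters (the 1-pixel threshold and 0.99 confidence level are assumptions, not the values used here).

```python
import numpy as np
import cv2

# Matched pixel locations from the calibration pattern. Synthetic
# placeholder data here: retina 2 sees retina 1's points shifted
# horizontally, which mimics a rectified stereo pair.
rng = np.random.default_rng(0)
pts1 = rng.uniform(0, 128, size=(50, 2)).astype(np.float32)
shift = np.hstack([rng.uniform(2, 10, (50, 1)), np.zeros((50, 1))])
pts2 = (pts1 + shift).astype(np.float32)

# RANSAC-based estimate of the fundamental matrix.
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
print('F =\n', F, '\ninliers:', int(inlier_mask.sum()))
```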

The idea is then to use the combination of epipolar geometry and the orientation of each edge to validate a match. Each incoming event is matched based on its arrival time, its distance to the active epipolar lines, and its orientation; a set of active events is kept for 1 ms. The orientation is shown in the figure above for each edge of the cube. Once a match is found, a disparity map is computed as shown below. Dark blue marks more distant points, while red marks closer ones.
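A software rendition of this matching rule might look like the sketch below (Python; the tolerances and the (t, x, y, orientation) event representation are assumptions). For each incoming event from Retina 1, it scans Retina 2's buffer of active events, rejects candidates outside the 1 ms window or with a different Gabor orientation label, and accepts the candidate closest to the epipolar line before computing a horizontal disparity.

```python
import numpy as np

DT_MAX = 1000    # coincidence window in microseconds (1 ms)
EPI_MAX = 2.0    # max distance to the epipolar line, pixels (assumed)

def epipolar_distance(F, p1, p2):
    """Distance from point p2 (image 2) to the epipolar line of p1."""
    a, b, c = F @ np.array([p1[0], p1[1], 1.0])
    return abs(a * p2[0] + b * p2[1] + c) / np.hypot(a, b)

def match_event(event, active2, F):
    """event = (t, x, y, orientation) from Retina 1; active2 holds
    Retina 2 events seen during the last millisecond. Returns the
    horizontal disparity of the best match, or None."""
    t, x, y, ori = event
    best, best_d = None, EPI_MAX
    for (t2, x2, y2, ori2) in active2:
        if abs(t - t2) > DT_MAX or ori != ori2:
            continue                  # fails the time or orientation cue
        d = epipolar_distance(F, (x, y), (x2, y2))
        if d < best_d:                # keep the closest epipolar candidate
            best, best_d = (x2, y2), d
    return None if best is None else x - best[0]
```

Each accepted disparity can then be written into a per-pixel map and color-coded as in the figures (red for near points, dark blue for far ones).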

The movie in the attachment below (stereo.wmv) shows the computed depth over time while observing a moving pen approaching and receding from the retinas.

Attachments