MONOCULAR VISUAL SLAM - Event-Based Features

Main contributors: Greg Cohen, Garrick Orchard, Xavier Lagorce


We split the SLAM problem into two sub-problems. The first is reliably detecting and localizing features in the event-based data. The second is determining the 3D locations of the features and the camera in space. The first problem is discussed below, while the details of the second can be found on the [Monocular Visual SLAM Results] page.


All Visual SLAM implementations rely on a robust and reliable feature detection mechanism. These algorithms typically use a single camera to both estimate the camera position and build a map of the surroundings, and are therefore entirely dependent on these features. When moving to event-based cameras, the existing frame-based approaches to feature detection no longer apply directly, and a new class of feature detectors is needed that can produce tracked features suitable for use in a SLAM system. The feature detection method below generates such features.

The problem involves two steps: detecting good features within the event stream, and reliably tracking those features as new events arrive. It is also important to note that calculating a descriptor for each feature is highly desirable, as it allows the SLAM engine to keep track of features that are no longer visible in the scene.
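As a concrete illustration of the state this pipeline has to carry, the following minimal Python sketch defines a per-feature record holding a location, a descriptor of the surrounding time-surface patch, and the time of the last successful match. The field names are illustrative only and are not taken from the original implementation.

from dataclasses import dataclass
import numpy as np

@dataclass
class Feature:
    x: float                # current image-plane location (pixels)
    y: float
    descriptor: np.ndarray  # descriptor of the time-surface patch at detection
    last_update_t: float    # timestamp of the last successful match (seconds)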

Feature Detection

The core of the feature detection algorithm we built is a modified Shi and Tomasi corner detector. The detector is only used when looking for new features in the stream of events coming from the camera, and is not active while features are being tracked. If the number of features being tracked is less than the desired number, the corner detector is invoked each time an event arrives. It operates on a patch of the exponential time surface centered on the address of the incoming event. Corners are desirable on the time surface as they represent a time signature that we can repeatedly detect and track.
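The sketch below illustrates this detection step under stated assumptions: an exponential time surface is maintained from the per-pixel timestamps of the most recent events, and the patch around each incoming event is scored with the Shi and Tomasi minimum-eigenvalue criterion. The decay constant, patch size, and corner threshold are placeholder values, not those used in the original work.

import numpy as np

TAU = 50e-3            # time constant of the exponential decay (s), assumed
PATCH_R = 7            # patch half-width around the event, assumed
MIN_EIG_THRESH = 0.05  # corner acceptance threshold, assumed

def time_surface_patch(last_ts, t_now, x, y, tau=TAU, r=PATCH_R):
    """Exponential time surface on the (2r+1)x(2r+1) patch centred on (x, y).

    last_ts holds, per pixel, the timestamp of the most recent event.
    Border handling is omitted for brevity."""
    patch_ts = last_ts[y - r:y + r + 1, x - r:x + r + 1]
    return np.exp(-(t_now - patch_ts) / tau)

def shi_tomasi_score(patch):
    """Minimum eigenvalue of the gradient structure tensor of the patch."""
    gy, gx = np.gradient(patch)
    ixx, iyy, ixy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    # Eigenvalues of the 2x2 matrix [[ixx, ixy], [ixy, iyy]]
    trace, det = ixx + iyy, ixx * iyy - ixy * ixy
    return trace / 2 - np.sqrt(max((trace / 2) ** 2 - det, 0.0))

def is_corner(last_ts, t_now, x, y):
    patch = time_surface_patch(last_ts, t_now, x, y)
    return shi_tomasi_score(patch) > MIN_EIG_THRESH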

Whenever a feature is detected, a descriptor is calculated for the time-surface patch. Many different types of descriptors were tested, including SIFT-like orientation-based histograms. It was found that a simple normalized histogram of the region provided a good compromise between speed and reliability.
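Continuing the sketch above, a normalized histogram descriptor of the time-surface patch might look as follows; the number of bins is an assumption.

def patch_descriptor(patch, n_bins=16):
    """Normalized histogram of the patch values (which lie in (0, 1])."""
    hist, _ = np.histogram(patch, bins=n_bins, range=(0.0, 1.0))
    hist = hist.astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist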

Feature Tracking

Unfortunately, it is not possible to simply keep searching for the stored features in the time surface generated by later events, as even small movements of the camera or a slight change in velocity can cause large variations in a feature's appearance over time. To counter this, the features are updated as often as possible by checking whether any incoming events are close to a given feature, and then testing the descriptor calculated on the time surface centered around the new event against the stored descriptor for that feature. If a match is found, the feature is updated to the new location. In addition, a new descriptor is calculated by blending the previous and new descriptors. This allows the feature to change gradually as it moves across the scene.
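Building on the earlier sketches, the per-event tracking update could be organized as below. The search radius, matching threshold, and blending weight are placeholders chosen for illustration, not values from the original implementation.

SEARCH_R = 5        # events within this distance of a feature are candidates
MATCH_THRESH = 0.3  # maximum descriptor distance accepted as a match
BLEND = 0.1         # weight given to the new descriptor when updating

def on_event(features, last_ts, t_now, x, y):
    """Try to update one of the tracked features with the incoming event."""
    new_desc = patch_descriptor(time_surface_patch(last_ts, t_now, x, y))
    for f in features:
        if abs(f.x - x) <= SEARCH_R and abs(f.y - y) <= SEARCH_R:
            # Compare the stored descriptor with the one at the event location.
            if np.linalg.norm(f.descriptor - new_desc) < MATCH_THRESH:
                f.x, f.y = float(x), float(y)
                f.last_update_t = t_now
                # Blend old and new descriptors so the feature can evolve
                # slowly as it moves across the scene.
                f.descriptor = (1 - BLEND) * f.descriptor + BLEND * new_desc
                return True
    return False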

Features are removed if they are not updated for a certain amount of time, at which point new features are found using the feature detection method described above.
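A corresponding pruning step, again only a sketch with assumed parameters, drops stale features and lets the detector above look for replacements whenever the pool of tracked features is below the desired number.

MAX_AGE = 0.5       # seconds without an update before a feature is dropped, assumed
MAX_FEATURES = 20   # desired number of tracked features, assumed

def prune_and_detect(features, last_ts, t_now, x, y):
    """Drop stale features, then try to detect a new one at the event location."""
    features[:] = [f for f in features if t_now - f.last_update_t < MAX_AGE]
    # Only look for new corners when the pool is not full (see detection above).
    if len(features) < MAX_FEATURES and is_corner(last_ts, t_now, x, y):
        desc = patch_descriptor(time_surface_patch(last_ts, t_now, x, y))
        features.append(Feature(float(x), float(y), desc, t_now))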

Test Data

The final algorithm for detecting and tracking features was evaluated on a real-world recording taken with the ATIS mounted on a helmet. The recording involved Xavier walking in and out of the side entrance doors of the school, starting from a position at the base of the stairs. He then made three loops of differing radii through the open doors into the courtyard outside and back inside through the second door. The recording therefore contains both indoor and outdoor lighting conditions. The input video can be seen below:


The results of the feature tracker can be seen in the video below:

The exponential time surface is displayed in the top left of the image, with the black dots marking the current locations of the features. At the bottom left is a plot of the features and their trajectories through the image. The two patches on the top right show the time surface around the current feature (left) and the newly matched feature (right). Below these are two plots showing the location of each feature and the trajectory it has followed. Underneath that is a line plot of the two descriptors. Finally, at the bottom right is a histogram showing the number of times each of the active features has been matched.