The torque operator

Participants: Cornelia Fermuller, Ching Teo, Yiannis Aloimonos


The idea is to define an attention mechanism that responds to image regions containing closed boundaries. We suggest that this can be realized by an operator that covers image regions of different sizes, computes edge measurements, and evaluates the ‘closedness’ of the boundaries within each region. We call it the torque operator, after the physical concept that it computes. In essence, the torque operator produces a new map that encodes connected regions of different sizes and shapes, and it finds long boundary contours. We first explain the concept and then show results.

The concept

We adopt the concept of torque from physics and define the torque in the image as illustrated in Figure 1a. Let 0 be the origin, r the vector from 0 to an image point, and e the oriented edge at that point (perpendicular to the image gradient). Then we define

τ(r) = r × e.

The value of the torque amounts to

τ(r) = |r| |e| sin(a),

where a is the angle between r and e. We are interested in the torque as a measure of the edges within patches (Figure 1b). Thus, for a given patch with center 0, we sum the torque values of all points in the patch, and we define the magnitude of the torque of a local patch P as this sum normalized by the area of the patch:

τ(P) = (1 / area(P)) Σ_{r ∈ P} τ(r).
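As a minimal sketch of this definition, the following Python snippet evaluates τ(P) for an explicit list of oriented edge points. The function names, the tuple layout of an edge point, and the synthetic circle are assumptions made for illustration and are not taken from the original implementation. Coordinates here are abstract (x to the right, y upward); the sign of the result depends on that choice.

```python
import math

def point_torque(r, e):
    """z-component of the 2D cross product r x e, i.e. |r| |e| sin(a)."""
    return r[0] * e[1] - r[1] * e[0]

def patch_torque(edge_points, center, area):
    """Sum of the point torques about `center`, divided by the patch area.

    edge_points: iterable of ((x, y), (ex, ey)) pairs (hypothetical layout).
    """
    cx, cy = center
    total = 0.0
    for (x, y), e in edge_points:
        total += point_torque((x - cx, y - cy), e)
    return total / area

# Synthetic test contour: unit edge vectors running counter-clockwise
# around a circle of radius 5 centered at the origin.
n, radius = 72, 5.0
circle = []
for k in range(n):
    t = 2.0 * math.pi * k / n
    position = (radius * math.cos(t), radius * math.sin(t))
    tangent = (-math.sin(t), math.cos(t))
    circle.append((position, tangent))

area = 12.0 * 12.0  # a 12 x 12 patch centered on the circle

closed = patch_torque(circle, (0.0, 0.0), area)             # whole contour
half_contour = patch_torque(circle[: n // 2], (0.0, 0.0), area)  # half of it

# Flip every second edge vector to mimic edges without a consistent order.
mixed_edges = [
    (p, e if k % 2 == 0 else (-e[0], -e[1])) for k, (p, e) in enumerate(circle)
]
mixed = patch_torque(mixed_edges, (0.0, 0.0), area)

print(closed, half_contour, mixed)  # approximately 2.5, 1.25 and 0
```

Each edge point on the consistently oriented circle contributes |r| = 5, so the full contour yields 72 · 5 / 144 = 2.5, half of the contour yields half of that, and the alternating orientations cancel.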

Figure 1: Illustration of the torque.

Note that in our definition edges are oriented. If border ownership is available, for example from motion or stereo, it naturally induces an orientation. For example, we can define the edge as always running counter-clockwise around the foreground object. In the absence of other cues, we define the edge vector from the brightness polarity: it is at +90 degrees from the gradient, such that the brighter side is on its right and the darker side on its left.

The intuition behind the torque is to obtain a measure that signals the presence of connected regions bounded by contours. The magnitude of the torque of a patch is larger when the edges in the patch are consistently ordered and enclose a contour. If, on the other hand, edge points are randomly distributed over the patch, as is the case for textures, the magnitude of the torque of the patch is small, because the individual values cancel each other. Patches with larger values of τ(P) tend to contain contours organized around the center of the patch. If the center of the patch is away from the center of a closed region, the value of τ(P) will be smaller, and if the patch contains only part of a closed region, the value of τ(P) will be smaller still.
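The polarity-based orientation and the cancellation argument can be tried out on synthetic images. The sketch below is illustrative only: the function names, the use of NumPy's finite-difference gradient, and the coordinate convention (x to the right, y downward, so that a visually counter-clockwise quarter turn maps (gx, gy) to (gy, -gx)) are assumptions. With this convention a dark region on a lighter background gives a negative torque.

```python
import numpy as np

def oriented_edges(image):
    """Return (ex, ey): the brightness gradient turned by +90 degrees.

    Assumption: image coordinates with x to the right and y downward, where a
    visually counter-clockwise quarter turn maps (gx, gy) to (gy, -gx).
    """
    gy, gx = np.gradient(image.astype(float))  # axis 0 is y (rows), axis 1 is x
    return gy, -gx

def patch_torque(image, cy, cx, half):
    """Torque of the square patch of half-width `half` centered at (cy, cx),
    divided by the nominal patch area (2 * half + 1) ** 2."""
    ex, ey = oriented_edges(image)
    y0, y1 = max(cy - half, 0), min(cy + half + 1, image.shape[0])
    x0, x1 = max(cx - half, 0), min(cx + half + 1, image.shape[1])
    ys, xs = np.mgrid[y0:y1, x0:x1]
    # r x e for every pixel of the patch, with r measured from the center
    tau = (xs - cx) * ey[y0:y1, x0:x1] - (ys - cy) * ex[y0:y1, x0:x1]
    return tau.sum() / (2 * half + 1) ** 2

# Synthetic inputs: a dark disk on a light background, its inverse,
# and a random texture without any closed contour.
yy, xx = np.mgrid[0:41, 0:41]
inside = (yy - 20) ** 2 + (xx - 20) ** 2 <= 10 ** 2
dark_disk = np.where(inside, 0.0, 1.0)
light_disk = 1.0 - dark_disk
texture = np.random.default_rng(0).random((41, 41))

t_dark = patch_torque(dark_disk, 20, 20, 15)
t_light = patch_torque(light_disk, 20, 20, 15)
t_texture = patch_torque(texture, 20, 20, 15)
print(t_dark, t_light, t_texture)
```

The two disks should give torques of equal magnitude and opposite sign, while the texture value should stay much closer to zero even though its local contrast is as high as that of the disks.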

Using the torque

We compute the torque at every image point over a range of patch sizes. Then we combine the different values into one map, which we call the torque value map, by selecting at every point the maximum value over the patch sizes. Figure 2 illustrates the process. Given the image in Figure 2a, we down-sampled it by a factor of two and extracted the table region (Fig. 2c). Figure 2b shows the torque values for three sizes of image patches, and Figure 2c shows the combined torque value map. Note that, since the torque value depends on the polarity of the edges, darker regions on a lighter background have negative torque values, and lighter regions have positive torque values. We then computed connected components on the negative torque values and selected the four largest connected components. Their centers of mass were selected as fixation points to be passed on to the segmentation algorithm (Fig. 2c).
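The pipeline described above can be sketched on a synthetic scene as follows. This is an illustration under stated assumptions, not the original implementation: the function names are hypothetical, "maximum value" is read as the signed value of largest magnitude, components are 4-connected, the center of mass is unweighted, and a `threshold` parameter below zero is introduced to keep weak negative responses out of the components. The brute-force loops are only suitable for small images.

```python
import numpy as np

def torque_map(image, half):
    """Torque at every pixel for square patches of half-width `half`,
    divided by the nominal patch area."""
    gy, gx = np.gradient(image.astype(float))
    ex, ey = gy, -gx  # assumed convention: edge at +90 degrees, y axis downward
    h, w = image.shape
    out = np.zeros((h, w))
    for cy in range(h):
        for cx in range(w):
            y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
            x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
            ys, xs = np.mgrid[y0:y1, x0:x1]
            tau = (xs - cx) * ey[y0:y1, x0:x1] - (ys - cy) * ex[y0:y1, x0:x1]
            out[cy, cx] = tau.sum() / (2 * half + 1) ** 2
    return out

def torque_value_map(image, halves):
    """Per pixel, keep the signed torque of largest magnitude over all sizes."""
    stack = np.stack([torque_map(image, half) for half in halves])
    strongest = np.abs(stack).argmax(axis=0)
    return np.take_along_axis(stack, strongest[None, :, :], axis=0)[0]

def fixation_points(value_map, threshold=0.0, count=4):
    """Centers of mass (row, col) of the `count` largest 4-connected
    components of pixels whose value lies below `threshold`."""
    mask = value_map < threshold
    seen = np.zeros_like(mask)
    components = []
    for seed in zip(*np.nonzero(mask)):
        if seen[seed]:
            continue
        seen[seed] = True
        stack, pixels = [seed], []
        while stack:  # flood fill from the seed pixel
            y, x = stack.pop()
            pixels.append((y, x))
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not seen[ny, nx]):
                    seen[ny, nx] = True
                    stack.append((ny, nx))
        components.append(pixels)
    components.sort(key=len, reverse=True)
    return [tuple(np.mean(c, axis=0)) for c in components[:count]]

# Synthetic scene: two dark disks of different size on a light background.
yy, xx = np.mgrid[0:48, 0:48]
scene = np.ones((48, 48))
scene[(yy - 12) ** 2 + (xx - 12) ** 2 <= 5 ** 2] = 0.0
scene[(yy - 32) ** 2 + (xx - 32) ** 2 <= 8 ** 2] = 0.0

value_map = torque_value_map(scene, halves=(6, 9, 12))
points = fixation_points(value_map, threshold=-0.3, count=2)
print(points)
```

On this scene the two returned points should lie close to the disk centers at (12, 12) and (32, 32). Setting `threshold=0.0` and `count=4` corresponds more directly to the description in the text, at the cost of merging weak negative responses into the components.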

Figure 2: Illustration of the torque mechanism.

Further examples from images processed during the demonstration are shown in Figure 3.

Figure 3: Torque value maps and selected fixation points.