Algorithm Support

Supporting corner detection algorithms for SLAM

Corner Detection

Corners are among the most useful features to track in an image. Since the goal is to match the same feature across different frames, the feature needs to be as recognisable as possible, and corners are regions with high gradient variation in the image. When viewed through a small window, shifting the window in any direction over a corner results in significant changes in intensity (1).

Harris Corner Detector

The Harris corner detector is a mathematical approach to this problem, published in 1988 and based on Moravec's already existing algorithm. Moravec's algorithm tests each pixel in the image to see if a corner is present, by considering how similar a patch (window) centred on the pixel is to nearby, largely overlapping patches. The similarity is measured by taking the sum of squared differences (SSD) between the corresponding pixels of two patches; a lower number indicates more similarity. If the pixel is in a region of uniform intensity, the nearby patches will look similar. If the pixel is on an edge, nearby patches in a direction perpendicular to the edge will look quite different, but nearby patches in a direction parallel to the edge will result in only a small change. If the pixel is on a feature with variation in all directions, then none of the nearby patches will look similar. The problem with this algorithm is that Moravec only considered shifts in discrete 45° steps, so an edge that is not horizontal, vertical or diagonal is incorrectly chosen as a feature (4, 5).
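This patch comparison can be written directly from the SSD definition. Below is a minimal sketch in Python with NumPy; the function names, the window size and the set of eight unit shifts are illustrative choices following the description above, not code from any of the cited sources:

    import numpy as np

    def ssd(image, x, y, u, v, half):
        # Sum of squared differences between the patch centred on (x, y)
        # and the patch shifted by (u, v); image is indexed as image[row, col].
        p1 = image[y - half:y + half + 1, x - half:x + half + 1].astype(float)
        p2 = image[y + v - half:y + v + half + 1,
                   x + u - half:x + u + half + 1].astype(float)
        return np.sum((p1 - p2) ** 2)

    def moravec_response(image, x, y, half=1):
        # Moravec-style cornerness: the minimum SSD over the eight
        # horizontal, vertical and diagonal unit shifts.
        shifts = [(1, 0), (-1, 0), (0, 1), (0, -1),
                  (1, 1), (1, -1), (-1, 1), (-1, -1)]
        return min(ssd(image, x, y, u, v, half) for u, v in shifts)

A low response means every nearby patch looks similar (a flat region, or an edge aligned with one of the tested directions); a high response means the patch changes under every shift, which is taken as a corner.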

Harris and Stephens improved this idea, creating an isotropic detector. Again, the sum of squared differences is used to compare image patches. Given a patch p with dimensions (p_w+1) \times (p_h+1), the SSD between it and a patch of the same size shifted by (u, v) along the image axes is defined as:

E(u,v) = \sum_{x, y}w(x,y)[I(x+u, y+v)-I(x,y)]^2

where w(x,y) is the window function, I(x,y) is the intensity (the grey-scale value) at image position (x,y), and I(x+u,y+v) is the shifted intensity. For nearly constant patches the squared term will be close to 0, whereas for very distinctive patches it will be large: the aim is therefore to maximise E(u,v), and in particular the term (1, 2):

\sum_{x, y}[I(x+u, y+v)-I(x,y)]^2

Using a first-order Taylor expansion, the shifted intensity I(x+u, y+v) can be approximated as:

I(x+u, y+v) \approx I(x,y) + \frac {\partial I}{\partial x} u + \frac {\partial I}{\partial y} v

Defining I_x = \frac {\partial I}{\partial x} and I_y = \frac {\partial I}{\partial y}, and substituting this approximation into E(u,v), where the I(x,y) terms cancel (1):

E(u,v) \approx \sum_{x, y}w(x,y)(uI_x + vI_y)^2

E(u,v) \approx \sum_{x,y}w(x,y)(u^2I_x^2 + 2uvI_xI_y + v^2I_y^2)

This can now be expressed in matrix form:

E(u,v) \approx (u,v) \begin{bmatrix} \sum w I_x^2 & \sum w I_xI_y \\ \sum w I_xI_y & \sum w I_y^2 \end{bmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = (u,v) \, \mathcal {M} \begin{pmatrix} u \\ v \end{pmatrix}

In the simplest approximation, as in Moravec's definition, the window function w(x,y) is binary: 1 inside the window and 0 outside it. To make the detector isotropic, i.e. equally sensitive in all directions, Harris and Stephens instead gave the window a Gaussian profile, which delimits a smooth, circular window:

w(x,y) = \exp \left( - \frac {x^2+y^2}{2 \sigma^2} \right)
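Putting the derivatives and the window together, the matrix \mathcal{M} for a single pixel can be sketched as follows in Python with NumPy and OpenCV; the window size and \sigma are illustrative choices, not values prescribed by the original paper:

    import cv2
    import numpy as np

    def structure_tensor(gray, x, y, half=2, sigma=1.0):
        # Image derivatives I_x and I_y computed with Sobel filters.
        Ix = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
        Iy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

        # Gaussian window w(x, y) = exp(-(x^2 + y^2) / (2 * sigma^2)),
        # evaluated over offsets from the centre pixel.
        offsets = np.arange(-half, half + 1)
        xx, yy = np.meshgrid(offsets, offsets)
        w = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))

        # Weighted sums of the derivative products over the window.
        ix = Ix[y - half:y + half + 1, x - half:x + half + 1]
        iy = Iy[y - half:y + half + 1, x - half:x + half + 1]
        return np.array([[np.sum(w * ix * ix), np.sum(w * ix * iy)],
                         [np.sum(w * ix * iy), np.sum(w * iy * iy)]])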

A corner - and in general an interest point - is characterised by a large variation of E in all directions of the shift vector (u,v). At an interest point the matrix \mathcal{M} will therefore have two large eigenvalues \lambda_1 and \lambda_2. Based on their values, the following inferences can be made (see the sketch after this list):

  1. if \lambda_1 \approx 0 and \lambda_2 \approx 0, the pixel (x,y) lies in a flat region with no feature of interest;
  2. if \lambda_1 \approx 0 and \lambda_2 has a large value, then it is an edge;
  3. if \lambda_1 and \lambda_2 have large values, then it is a corner (4).
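A minimal sketch of this classification, assuming the structure_tensor helper from the previous sketch and an arbitrarily chosen threshold:

    def classify(M, threshold=1e4):
        # Eigenvalues of the symmetric 2x2 matrix M, in ascending order.
        l1, l2 = np.linalg.eigvalsh(M)
        if l2 < threshold:
            return "flat"    # case 1: both eigenvalues small, no feature
        if l1 < threshold:
            return "edge"    # case 2: one large and one small eigenvalue
        return "corner"      # case 3: both eigenvalues large

For example, classify(structure_tensor(gray, 120, 80)) labels the pixel at column 120, row 80 (coordinates chosen arbitrarily).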

Plotting the image derivatives (I_x, I_y) of a window as 2D points gives the same picture: for a corner, the points show a large spread in both directions (2).

Since evaluating the eigenvalues directly is computationally expensive, each window is instead given a score R, defined as:

R = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2 = \det (\mathcal{M}) - k \, \mathrm{trace} (\mathcal{M})^2

where k is determined empirically and in the literature takes values between 0.04 and 0.15. A window with an R score greater than a certain threshold is considered a corner (1, 4).
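In practice this score is rarely computed by hand: OpenCV exposes it through cornerHarris, as described in the tutorial cited in (1). A minimal Python usage sketch, with illustrative parameter values and a placeholder file name:

    import cv2
    import numpy as np

    # Load a grey-scale image; cornerHarris expects a single-channel float32 or uint8 input.
    gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

    # blockSize: window size, ksize: Sobel aperture, k: the empirical constant (0.04-0.15).
    R = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

    # Keep pixels whose score exceeds a fraction of the strongest response.
    corners = np.argwhere(R > 0.01 * R.max())   # array of (row, col) positions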


Shi-Tomasi Corner Detection

In 1994, Shi and Tomasi made a small modification to the Harris algorithm in their paper Good Features to Track, which showed better results. They proposed the scoring function:

R = min(\lambda_1, \lambda_2)

This means that a feature is considered a corner only when both eigenvalues are greater than a threshold \lambda_{min} (3).
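OpenCV implements this criterion in goodFeaturesToTrack, as described in the tutorial cited in (3). A minimal Python usage sketch, again with illustrative parameter values and a placeholder file name:

    import cv2
    import numpy as np

    gray = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

    # maxCorners: return at most this many corners, strongest first,
    # qualityLevel: minimum accepted score as a fraction of the best corner's score,
    # minDistance: minimum spacing in pixels between returned corners.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=100,
                                      qualityLevel=0.01, minDistance=10)
    corners = corners.reshape(-1, 2).astype(int)   # (x, y) positions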

This image shows the classification of image points using the eigenvalues of \mathcal{M}, according to the Harris scoring function (left) and the Shi-Tomasi scoring function (right) (image created by Maurice Yap)

Sources

Reference list

  1. OpenCV. Harris corner detector — OpenCV 2.4.12.0 documentation. [Online] Available from: http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html [Accessed: 13th March 2016]
  2. Collins R. Lecture 06: Harris Corner detector. [Online] Available from: http://www.cse.psu.edu/~rtc12/CSE486/lecture06.pdf [Accessed: 14th March 2016]
  3. OpenCV. Shi-Tomasi corner detector & good features to track — OpenCV 3.0.0-dev documentation. [Online] Available from: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_shi_tomasi/py_shi_tomasi.html [Accessed: 13th March 2016]
  4. Albrecht S. An analysis of VISUAL MONO-SLAM. [Online] 2009 Oct. Available from: http://www-lehre.inf.uos.de/~svalbrec/documents/master_thesis.pdf [Accessed: 11th March 2016]
  5. Harris C, Stephens M. A combined corner and edge detector. [Online] 1988. Available from: https://www.cis.rit.edu/~cnspci/references/dip/feature_extraction/harris1988.pdf [Accessed: 13th March 2016]