Publications related to learning.
- T. Lourens. A Biologically Plausible Model for Corner-based Object Recognition from Color Images [5296 KB pdf]. Shaker Publishing B.V., Maastricht, The Netherlands, March 1998.
1.1 Outline of the thesis
The flow of visual information via the M or magno-cellular (from magnus = large) pathway is basically used to order the chapters in the thesis, with one exception: Chapter 7 is about color which belongs to the P or parvo-cellular (from parvus = small) pathway.
We know that our visual system detects and locates corners and edges accurately. In Chapter 2 we take a closer view of the main flow of visual information from eye to primary visual cortex. This information flow is known as early vision.
Chapter 3 gives a general overview of the most widely used models for different cells in early vision. The modeled properties of most of these cells form the basis for our approach to artificial vision. We model the so-called center-surround, simple, and complex cell types. In the visual system these cell types can differ in their spatial (and temporal) resolution. Some cells only respond to very small parts of the visual field and are involved with highly detailed vision, while other cells respond to large parts of the visual field. These cells interact at different levels of accuracy (scales). Interaction at different scales raises the questions of which scales are useful and how these scales can be ordered. In natural vision systems the spatial accuracy drops with eccentricity, but in most static artificial systems accuracy is uniform. We do pay attention to this phenomenon, however, for the sake of completeness. The decrease in spatial accuracy with eccentricity greatly reduces the amount of information. A reduction of information is important since it makes the system relatively fast when motion is involved. Motion is not included in this thesis but it will be added to the model in future research.
In Chapter 4 a corner operator based on responses of cortical end-stopped cells proposed by Heitger et al. [71] will be modeled (with some minor but important changes) and compared to six standard corner enhancing operators. We are aiming at a robust operator with respect to the position under different conditions (e.g. rotation of the image), since it forms the basis for the graph. Rotation and position at different scales are analyzed to determine the robustness of the operator.
Chapter 5 improves the end-stopped operator by using multiple scales. This is necessary since the operator is noise sensitive at small scales and does not respond to all different corners at a single scale. The response of the operator will be examined for different scales at convex corners, rounded corners, and several junctions. We discuss the choice of a proper operator to combine multiple scales and motivate the choice of the number of scales. Finally, the multiple scale corner operator will be compared with standard operators at both single and multiple scales.
Chapter 6 describes a line-segment extraction algorithm where the corners obtained with the end-stopped operator at multiple scales are used together with the “edge enhanced image”. Only edges between a pair of corner points are extracted; hence edges without two corner points are not detected. The content of this chapter is meant as an intermediate step towards a graph representation and should be regarded as preprocessing for graph matching (see Chapter 8).
Chapter 7 extends the model with color. We use two “color-opponent” channels which are found in natural color vision. In previous chapters we gave a model which is based on achromatic vision. In this chapter we use this model but apply it to two different opponent channels. Hence models for biologically plausible color-opponent cells are proposed: one opponent cell type which responds to edges of a preferred orientation and a type which responds to corners. In natural vision two opponent color channels are found; in combination with the achromatic channel, every color can be reconstructed. The three channels are combined to yield the final edge and corner detection model.
Chapter 8 gives a graph matching algorithm which searches for copies of different known objects in the input graph. The representation of these objects accomplishes scale, rotation, and translation invariance. The matching is based on a standard back-tracking algorithm, which is time consuming since the underlying problem is NP-hard. Hence angle and length ratio attributes, which are found in every two-dimensional graph, are added to speed up the search.
Appendix A describes some functions used in linear filtering, such as the Gaussian, Laplacian of a Gaussian, and the difference of Gaussians. This appendix aims at the reader who is interested in the differences and properties of these functions.
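As a rough illustration of the three functions named for Appendix A, the following sketch uses their common textbook forms. The normalisation, the parameter names (`sigma`, `sigma_center`, `sigma_surround`) and the function names are assumptions made for this example; the thesis may parametrise them differently.

```python
import math


def gaussian(x: float, y: float, sigma: float) -> float:
    """Isotropic 2-D Gaussian, normalised so that it integrates to 1."""
    r2 = x * x + y * y
    return math.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def laplacian_of_gaussian(x: float, y: float, sigma: float) -> float:
    """Laplacian of the Gaussian above: ((r^2 - 2 sigma^2) / sigma^4) * G."""
    r2 = x * x + y * y
    return (r2 - 2.0 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)


def difference_of_gaussians(x: float, y: float,
                            sigma_center: float,
                            sigma_surround: float) -> float:
    """Narrow (center) Gaussian minus wide (surround) Gaussian."""
    return gaussian(x, y, sigma_center) - gaussian(x, y, sigma_surround)


if __name__ == "__main__":
    # Sample the three profiles along the x-axis.
    for x in (0.0, 1.0, 2.0, 3.0):
        print(x,
              gaussian(x, 0.0, 1.0),
              laplacian_of_gaussian(x, 0.0, 1.0),
              difference_of_gaussians(x, 0.0, 1.0, 2.0))
```

With a narrow center and a wide surround, the difference of Gaussians is positive near the origin and negative further out, which is the shape usually associated with center-surround receptive fields.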
Parts of Chapter 3 and Appendix A have been published in [120, 121], parts of Chapters 4, 5, and 7 in [198], and parts of Chapters 6 and 8 in [122].
1.2 Contributions
In this section we give the contributions of this thesis.
- An overview of early vision from a computational point of view is given. With this overview an artificial vision system based on line and corner enhancement can be constructed (Chapter 3).
- The corner detecting qualities of the model of end-stopped cells, proposed by Heitger et al. [71], are assessed (Chapter 4).
- We propose a new corner detector by a multi-scale combination of the modeled end-stopped cells (Chapter 5), which yields:
  - a physiological model for the percept of a corner and
  - a useful corner operator for computer vision.
- Edge and corner enhancement algorithms are generalized to color channels. We use the properties of the complex and end-stopped cells and assume that these cells respond excitatory to one color and inhibitory to another color (Chapter 7).
- We develop a line detection algorithm, based on the assumption that corners are more stable than lines, and use it to extract line-segments by following enhanced edges from one corner to another (Chapter 6).
- We develop an attributed graph format for views of objects, which is suited for objects in which all edges are spanned by corners.
- A graph matching algorithm will be used for object recognition (Chapter 8), where the choice of attributes leads to:
  - invariance under translation, rotation and scaling,
  - robustness under small perspective changes and undetected lines, and
  - reduction of evaluations from N! to less than N^3.
Points 4-6 apply also to physiologically motivated color channels and complete color images. This is still done rarely in computer vision.
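The invariance under translation, rotation and scaling listed above can be illustrated with a small sketch. It assumes, for the purpose of this example only, that a corner is labelled with the angle between two incident edges and with the ratio of their lengths; the names `corner_attributes` and `similarity` are hypothetical and the thesis's exact attribute definitions may differ.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def corner_attributes(corner: Point, a: Point, b: Point) -> Tuple[float, float]:
    """Return (angle, length ratio) of the edges corner->a and corner->b."""
    ux, uy = a[0] - corner[0], a[1] - corner[1]
    vx, vy = b[0] - corner[0], b[1] - corner[1]
    len_u = math.hypot(ux, uy)
    len_v = math.hypot(vx, vy)
    # Unsigned angle between the two edges, in [0, pi].
    angle = abs(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    # Shorter length over longer length, so the ratio lies in (0, 1].
    ratio = min(len_u, len_v) / max(len_u, len_v)
    return angle, ratio


def similarity(p: Point, scale: float, theta: float, shift: Point) -> Point:
    """Rotate p by theta, scale it uniformly, then translate it."""
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * p[0] - s * p[1]) + shift[0],
            scale * (s * p[0] + c * p[1]) + shift[1])


if __name__ == "__main__":
    corner, a, b = (0.0, 0.0), (2.0, 0.0), (0.0, 1.0)
    print(corner_attributes(corner, a, b))
    # The same corner after rotation, scaling and translation.
    moved = [similarity(p, 3.0, 0.7, (5.0, -2.0)) for p in (corner, a, b)]
    print(corner_attributes(*moved))
```

Both calls give the same attribute pair up to floating-point rounding. Labels of this kind let a back-tracking matcher reject incompatible corner pairs before exploring them, which is the general idea behind the reduction in evaluations.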
- T. Lourens and R. P. Würtz. Object Recognition by Matching Symbolic Edge Graphs [587 KB pdf]. In R. Chin and T. C. Pong, editors, Proceedings of the Third Asian Conference on Computer Vision, ACCV ’98, volume 1352 of Lecture Notes in Computer Science, pages 193-200. Springer-Verlag, January 1998.
Abstract
We present an object recognition system based on symbolic graphs with object corners as vertices and object outlines as edges. Corners are determined in a robust way by a multiscale combination of an operator modeling cortical end-stopped cells. Graphs are constructed by line-following from corner to corner. Model matching is then done by finding subgraph isomorphisms in the image graph. The complexity is reduced by adding labels to corners and edges. The choice of labels makes the recognition system invariant under translation, rotation, and scaling.
- N. Petkov and T. Lourens. Interacting cortical filters for object recognition [150 KB pdf]. In K. Sugihara, editor, Proceedings of Asian Conference on Computer Vision, ACCV ’93, pages 583-586, Nov. 23-25 1993.
Abstract
It is shown how cortical filters can be used for image analysis and object recognition. Similarly to previous work in this area, we compute functional inner products of a two-dimensional input signal (image) with a set of two-dimensional Gabor functions which fit the receptive fields of simple cells in the primary visual cortex of mammals. We propose a method in which these inner products become the subject of thresholding, orientation competition and lateral inhibition. Each of the resulting cortical images contains only edge lines of a particular orientation and a particular light-to-dark transition direction. In this way, the information which is present in the original image is split into different channels and we show how this splitting can be used for object recognition. The method discriminates between simple geometrical figures, e.g. polygons with different numbers of edges, with a reliability of 100%, and a recognition rate of 99% has been achieved when the method was applied to a large database of face images.
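The pipeline described in this abstract (inner products with Gabor functions, followed by thresholding and orientation competition) might be sketched as follows. This is only an illustration under stated assumptions: the Gabor parametrisation, the winner-take-all competition rule, all function names and all parameter values are chosen for the example and are not taken from the paper, and lateral inhibition is left out.

```python
import math
from typing import List


def gabor(x: float, y: float, wavelength: float, theta: float,
          sigma: float, phase: float) -> float:
    """2-D Gabor function: Gaussian envelope times a cosine carrier."""
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma ** 2))
    return envelope * math.cos(2.0 * math.pi * xr / wavelength + phase)


def inner_product(image: List[List[float]], cx: int, cy: int, radius: int,
                  wavelength: float, theta: float, sigma: float,
                  phase: float) -> float:
    """Inner product of the image with a Gabor function centred at (cx, cy)."""
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = cy + dy, cx + dx
            if 0 <= yy < len(image) and 0 <= xx < len(image[0]):
                total += image[yy][xx] * gabor(dx, dy, wavelength, theta,
                                               sigma, phase)
    return total


def orientation_competition(responses: List[float],
                            threshold: float) -> List[float]:
    """Assumed rule: keep only the strongest response, if above threshold."""
    winner = max(range(len(responses)), key=lambda i: responses[i])
    return [r if i == winner and r > threshold else 0.0
            for i, r in enumerate(responses)]


if __name__ == "__main__":
    # Vertical step edge: dark on the left, light on the right.
    image = [[1.0 if x > 4 else 0.0 for x in range(9)] for _ in range(9)]
    thetas = [0.0, math.pi / 2.0, math.pi]
    # phase = -pi/2 turns the carrier into a sine (antisymmetric filter).
    responses = [inner_product(image, 4, 4, 3, 8.0, t, 2.0, -math.pi / 2.0)
                 for t in thetas]
    print(orientation_competition(responses, threshold=0.1))
```

In this toy setting only the filter whose orientation and polarity match the dark-to-light step survives; the filter rotated by 180 degrees responds negatively and is removed, which mirrors the separation by transition direction mentioned in the abstract.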
- N. Petkov, P. Kruizinga, and T. Lourens. Orientation Competition in Cortical Filters - an Application to Face Recognition [468 KB pdf]. In H.A. Wijshoff, editor, Proceedings of Computing Science in The Netherlands, CSN ’93, pages 285-296, Nov. 9-10 1993.
Abstract
A biologically motivated, computationally intensive approach to computer vision is developed and applied to the problem of automatic face recognition. The approach is based on the use of two-dimensional Gabor functions which model the receptive field functions of simple cells in the primary visual cortex of mammals. The convolutions of an input image with a set of antisymmetric visual receptive field functions (imaginary parts of Gabor functions) become the subject of thresholding and orientation competition. The developed cortical filters deliver highly structured information which is used for efficient feature extraction and representation in a lower dimension space. Applied to face recognition, the method gives a recognition rate of 98.5% on a large database of face images.
- N. Petkov, P. Kruizinga, and T. Lourens. Face Recognition on the Connection Machine CM-5 [519 KB pdf]. In G.R. Joubert, D. Trystram, F.J. Peters, and D.J. Evans, editors, Parallel Computing: Trends and Applications, Proceedings of the International Conference on Parallel Computing ’93, volume 9 of Advances in Parallel Computing, pages 185-192, Grenoble, France, Sept. 7-10, 1993. Elsevier Science Publishers B.V. Amsterdam.
Abstract
A biologically motivated compute intensive approach to computer vision is developed and applied to the problem of face recognition. The approach is based on the use of two-dimensional Gabor functions that fit the receptive fields of simple cells in the primary visual cortex of mammals. A descriptor set that is robust against translations is extracted and used for a search in an image database. The method was applied on a database of 205 face images of 30 persons and a recognition rate of 94% was achieved. The final version of the paper will report on the results obtained by applying a set of 1024 Gabor functions on a database of 1000 face images of 150 persons and on the implementation on a Connection Machine CM-5 parallel supercomputer to be installed at our university until the end of 1992.
- N. Petkov and T. Lourens. Natural vision simulations – an application to face recognition [272 KB pdf]. In H. Dedieu, editor, Circuit Theory and Design 93, Proceedings of the 11th Conference on Circuit Theory and Design, pages 821-826, Davos, Switzerland, Aug. 30 – Sept. 3 1993. Elsevier Science Publishers B.V. Amsterdam.
Abstract
A method is proposed in which the convolutions of an input image with a set of visual receptive field functions become the subject of thresholding, orientation competition and lateral inhibition. The developed cortical filters deliver highly structured information which can be used for efficient feature extraction. The method is applied to face recognition achieving a recognition rate of 99% on a large database of face images.
- N. Petkov, P. Kruizinga, and T. Lourens. Biologically motivated approach to face recognition [512 KB pdf]. In J. Mira, J. Cabestany, and A. Prieto, editors, New Trends in Neural Computation, Proceedings of the International Workshop on Artificial Neural Networks, IWANN ’93, volume 686 of Lecture Notes in Computer Science, pages 68-77. Springer-Verlag, June 9-11 1993.
Abstract
A biologically motivated compute intensive approach to computer vision is developed and applied to the problem of face recognition. The approach is based on the use of two-dimensional Gabor functions that fit the receptive fields of simple cells in the primary visual cortex of mammals. A descriptor set that is robust against translations is extracted by a global reduction operation and used for a search in an image database. The method was applied on a database of 205 face images of 30 persons and a recognition rate of 94% was achieved.