An efficient solution to inexact graph matching with application to computer vision
Graph matching is often regarded as the recognition step in computer vision. Our work uses attributed relational graphs (ARGs) for image representation and analysis, a structural description that we enhance with complete and coherent spatial knowledge of the source image.

In chapter 1 we identify a trend in which researchers incorporate ever more spatial knowledge into relational graphs. The ultimate realization of such knowledge in graphs is what we term a "spatially coherent attributed relational graph", whose connectivity is such that any degree of connectivity can be derived from any other. We argue that selectively pruning or thresholding the connectivity of a graph therefore amounts to projecting a solution into a problem instance. This is our first contribution. The same trend undermines most popular matching methods, since these rely on graphs being sparsely connected and typically collapse as connectivity increases.

In part 1 we introduce our second contribution: an inexact graph matcher whose performance improves with the connectivity of the graphs. Although the method is designed for our spatially coherent graphs, early research shows that it can also be applied to more traditional relational graphs. Our graph matcher extends the ideas of semilocal constraints so that they hold as global constraints. To solve intermediate instances of the assignment problem, we propose a very simple two-pass method that performs with sufficient accuracy. We report all runtimes as the number of comparisons performed on vertices and edges in a problem instance, since this measure is independent of processor time, a figure biased by implementation skill, processor architecture and operating system. Our method performs the minimum possible number of vertex comparisons and only a tiny fraction of the upper bound on edge comparisons, and its memory footprint scales gracefully with graph size.
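The abstract does not spell out the two-pass assignment heuristic, so the following is only a plausible sketch of how such a method might look: in a first pass every vertex of one graph greedily claims its best-scoring counterpart, and in a second pass conflicts are resolved in favour of the strongest claimant, with losers falling back to their best remaining option. The function name and score matrix are illustrative assumptions, not the thesis's actual interface.

```python
# Hypothetical sketch of a simple two-pass assignment heuristic; the thesis
# does not describe its method in the abstract, so details here are assumed.
def two_pass_assign(score):
    """score[i][j] = similarity of vertex i (graph A) to vertex j (graph B).
    Returns a dict mapping rows to distinct columns (greedy, not optimal)."""
    n_rows, n_cols = len(score), len(score[0])
    # Pass 1: every row claims its highest-scoring column; conflicts allowed.
    claims = {i: max(range(n_cols), key=lambda j: score[i][j])
              for i in range(n_rows)}
    # Pass 2: resolve conflicts. Strongest claimant keeps its column;
    # losers fall back to their best still-unassigned column.
    assigned, result = set(), {}
    for i in sorted(claims, key=lambda i: score[i][claims[i]], reverse=True):
        j = claims[i]
        if j in assigned:
            free = [c for c in range(n_cols) if c not in assigned]
            if not free:
                continue  # more rows than columns: leave row unmatched
            j = max(free, key=lambda c: score[i][c])
        result[i] = j
        assigned.add(j)
    return result
```

Such a heuristic runs in roughly quadratic time per pass, which is consistent with the abstract's claim of "sufficient accuracy" at low cost rather than optimality (which an exact solver like the Hungarian algorithm would provide at higher cost).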
Finally, in part 2 we present our third and last contribution: an object matcher capable of combining a set of graph matching results derived from multiple vision domains.
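The abstract does not say how the object matcher fuses results across vision domains; one minimal reading is a weighted combination of per-domain match scores, sketched below. The function, domain names and weighting scheme are illustrative assumptions only.

```python
# Hypothetical illustration: fuse per-domain graph-matching scores by
# weighted sum and pick the best overall candidate. The combination rule
# is assumed, not taken from the thesis.
def combine_matches(domain_scores, weights=None):
    """domain_scores: {domain: {candidate: score}}.
    Returns the candidate with the highest combined score."""
    weights = weights or {d: 1.0 for d in domain_scores}
    combined = {}
    for domain, scores in domain_scores.items():
        for candidate, s in scores.items():
            combined[candidate] = combined.get(candidate, 0.0) + weights[domain] * s
    return max(combined, key=combined.get)
```

A candidate that scores moderately well in every domain can then beat one that excels in a single domain, which is the usual motivation for combining evidence across domains.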