A method for the evaluation of similarity measures on graphs and network-structured data
- Authors: Naude, Kevin Alexander
- Date: 2014
- Subjects: Graph theory , Network analysis (Planning)
- Language: English
- Type: Thesis , Doctoral , DPhil
- Identifier: vital:10494
- Description: Measures of similarity play a subtle but important role in a large number of disciplines. For example, a researcher in bioinformatics may devise a new computed measure of similarity between biological structures, and use its scores to infer bio-logical association. Other academics may use related approaches in structured text search, or for object recognition in computer vision. These are diverse and practical applications of similarity. A critical question is this: to what extent can a given similarity measure be trusted? This is a difficult problem, at the heart of which lies the broader issue: what exactly constitutes good similarity judgement? This research presents the view that similarity measures have properties of judgement that are intrinsic to their formulation, and that such properties are measurable. The problem of comparing similarity measures is one of identifying ground-truths for similarity. The approach taken in this work is to examine the relative ordering of graph pairs, when compared with respect to a common reference graph. Ground- truth outcomes are obtained from a novel theory: the theory of irreducible change in graphs. This theory supports stronger claims than those made for edit distances. Whereas edit distances are sensitive to a configuration of costs, irreducible change under the new theory is independent of such parameters. Ground-truth data is obtained by isolating test cases for which a common outcome is assured for all possible least measures of change that can be formulated within a chosen change descriptor space. By isolating these specific cases, and excluding others, the research introduces a framework for evaluating similarity measures on mathematically defensible grounds. The evaluation method is demonstrated in a series of case studies which evaluate the similarity performance of known graph similarity measures. The findings of these experiments provide the first general characterisation of common similarity measures over a wide range of graph properties. The similarity computed from the maximum common induced subgraph (Dice-MCIS) is shown to provide good general similarity judgement. However, it is shown that Blondel's similarity measure can exceed the judgement sensitivity of Dice-MCIS, provided the graphs have both sufficient attribute label diversity, and edge density. The final contribution is the introduction of a new similarity measure for graphs, which is shown to have statistically greater judgement sensitivity than all other measures examined. All of these findings are made possible through the theory of irreducible change in graphs. The research provides the first mathematical basis for reasoning about the quality of similarity judgments. This enables researchers to analyse similarity measures directly, making similarity measures first class objects of scientific inquiry.
- Full Text:
- Date Issued: 2014
- Authors: Naude, Kevin Alexander
- Date: 2014
- Subjects: Graph theory , Network analysis (Planning)
- Language: English
- Type: Thesis , Doctoral , DPhil
- Identifier: vital:10494
- Description: Measures of similarity play a subtle but important role in a large number of disciplines. For example, a researcher in bioinformatics may devise a new computed measure of similarity between biological structures, and use its scores to infer bio-logical association. Other academics may use related approaches in structured text search, or for object recognition in computer vision. These are diverse and practical applications of similarity. A critical question is this: to what extent can a given similarity measure be trusted? This is a difficult problem, at the heart of which lies the broader issue: what exactly constitutes good similarity judgement? This research presents the view that similarity measures have properties of judgement that are intrinsic to their formulation, and that such properties are measurable. The problem of comparing similarity measures is one of identifying ground-truths for similarity. The approach taken in this work is to examine the relative ordering of graph pairs, when compared with respect to a common reference graph. Ground- truth outcomes are obtained from a novel theory: the theory of irreducible change in graphs. This theory supports stronger claims than those made for edit distances. Whereas edit distances are sensitive to a configuration of costs, irreducible change under the new theory is independent of such parameters. Ground-truth data is obtained by isolating test cases for which a common outcome is assured for all possible least measures of change that can be formulated within a chosen change descriptor space. By isolating these specific cases, and excluding others, the research introduces a framework for evaluating similarity measures on mathematically defensible grounds. The evaluation method is demonstrated in a series of case studies which evaluate the similarity performance of known graph similarity measures. The findings of these experiments provide the first general characterisation of common similarity measures over a wide range of graph properties. The similarity computed from the maximum common induced subgraph (Dice-MCIS) is shown to provide good general similarity judgement. However, it is shown that Blondel's similarity measure can exceed the judgement sensitivity of Dice-MCIS, provided the graphs have both sufficient attribute label diversity, and edge density. The final contribution is the introduction of a new similarity measure for graphs, which is shown to have statistically greater judgement sensitivity than all other measures examined. All of these findings are made possible through the theory of irreducible change in graphs. The research provides the first mathematical basis for reasoning about the quality of similarity judgments. This enables researchers to analyse similarity measures directly, making similarity measures first class objects of scientific inquiry.
- Full Text:
- Date Issued: 2014
Network analysis of trophic linkages in two sub-tropical estuaries along the South-East coast of South Africa
- Authors: Vosloo, Mathys Christiaan
- Date: 2012
- Subjects: Estuaries -- South Africa -- Eastern Cape , Estuarine ecology , Network analysis (Planning)
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: vital:10708 , http://hdl.handle.net/10948/d1010966 , Estuaries -- South Africa -- Eastern Cape , Estuarine ecology , Network analysis (Planning)
- Description: Estuaries are some of the most productive yet threatened ecosystems in the world. Despite their importance they face significant threats through changes to river flow, eutrophication, rapid population growth long the caost and harvesting of natural resources. A number of international studies have been conducted investigating the structure and functioning of an array of ecosystems using ecological network analysis. Energy flow networks have been contsructed for coastal, lagoonal, intertidial and, most notably, permantently open estuaries. Despite the valualble insights contributed by these and other studies, a lack of information on the majority of estuarine ecosystems exists.
- Full Text:
- Date Issued: 2012
- Authors: Vosloo, Mathys Christiaan
- Date: 2012
- Subjects: Estuaries -- South Africa -- Eastern Cape , Estuarine ecology , Network analysis (Planning)
- Language: English
- Type: Thesis , Doctoral , PhD
- Identifier: vital:10708 , http://hdl.handle.net/10948/d1010966 , Estuaries -- South Africa -- Eastern Cape , Estuarine ecology , Network analysis (Planning)
- Description: Estuaries are some of the most productive yet threatened ecosystems in the world. Despite their importance they face significant threats through changes to river flow, eutrophication, rapid population growth long the caost and harvesting of natural resources. A number of international studies have been conducted investigating the structure and functioning of an array of ecosystems using ecological network analysis. Energy flow networks have been contsructed for coastal, lagoonal, intertidial and, most notably, permantently open estuaries. Despite the valualble insights contributed by these and other studies, a lack of information on the majority of estuarine ecosystems exists.
- Full Text:
- Date Issued: 2012
- «
- ‹
- 1
- ›
- »