Please use this identifier to cite or link to this item:
http://hdl.handle.net/2328/9631
| Title: | A unifying semantic distance model for determining the similarity of attribute values |
| Authors: | Roddick, John Francis; de Vries, Denise Bernadette; Hornsby, Kathleen |
| Issue Date: | 2003 |
| Publisher: | Australian Computer Society |
| Citation: | Roddick, J.F., Hornsby, K., & de Vries, D.B., 2003. A unifying semantic distance model for determining the similarity of attribute values. Computer Science 2003: Proceedings of the Twenty-Sixth Australasian Computer Science Conference, 111-118. |
| Abstract: | The relative difference between two data values is of interest in a number of application domains, including temporal and spatial applications, schema versioning, data warehousing (particularly data preparation), internet searching, validation and error correction, and data mining. Moreover, consistency across systems in determining such distances, and the robustness of such calculations, are essential in some domains and useful in many. Despite this, there is no generally adopted approach to determining such distances and no accommodation of distance within SQL or any commercially available DBMS.
For non-numeric data values, calculating the difference between values often requires application-specific support, but even for numeric values the practical distance between two values may not simply be their numeric difference or Euclidean distance.
In this paper, a model of semantic distance is developed in which a graph-based approach is used to quantify the distance between two data values. The approach facilitates a notion of distance, both as a simple traversal distance and as weighted arcs. Transition costs, as an additional expense of passing through a node, are also accommodated. Furthermore, multiple distance measures can be incorporated, and a method of ‘localisation’ is discussed which allows relevant information to take precedence over less relevant information. Some results from our investigations, including our SQL-based implementation, are presented. |
| URI: | http://hdl.handle.net/2328/9631 |
| ISBN: | 0909925941 |
| Appears in Collections: | 0801 - Artificial Intelligence and Image Processing
|
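The graph-based distance described in the abstract (weighted arcs plus a transition cost for passing through a node) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract describes as SQL-based. It assumes Dijkstra's algorithm as the search method and assumes that a transition cost is charged only at intermediate nodes, not at the two endpoints. The names `semantic_distance` and `undirected`, the colour values, and all weights are hypothetical and chosen purely for illustration.

```python
import heapq


def undirected(pairs):
    """Build a symmetric adjacency map from (a, b, weight) triples."""
    arcs = {}
    for a, b, weight in pairs:
        arcs.setdefault(a, {})[b] = weight
        arcs.setdefault(b, {})[a] = weight
    return arcs


def semantic_distance(arcs, source, target, transition_cost=None):
    """Return the cheapest path cost between two values in the graph.

    arcs: dict mapping node -> {neighbour: non-negative arc weight}.
    transition_cost: dict mapping node -> non-negative cost charged when
        a path passes *through* that node (assumption: endpoints are free).
    Returns float("inf") when the target cannot be reached.
    """
    transition_cost = transition_cost or {}
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        # Continuing past a non-source node incurs its transition cost.
        through = transition_cost.get(node, 0.0) if node != source else 0.0
        for neighbour, weight in arcs.get(node, {}).items():
            candidate = dist + through + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")


# Hypothetical attribute values and arc weights.
ARCS = undirected([
    ("red", "orange", 1.0),
    ("orange", "yellow", 1.0),
    ("red", "yellow", 3.0),
    ("yellow", "green", 1.0),
])

print(semantic_distance(ARCS, "red", "yellow"))                   # 2.0
print(semantic_distance(ARCS, "red", "yellow", {"orange": 2.0}))  # 3.0
```

With unit weights on every arc the function reduces to a simple traversal distance. In the second call, the transition cost on `orange` makes the two-step route more expensive than the direct arc, so the direct arc is chosen.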