About

Many systems exist for matching ontologies or datasets. Most of these systems use string similarity measures, structural measures, or both to determine the similarity between a pair of resources. However, little is known about how to incorporate the domain of the data into the matching process.
For ImREAL, knowledge about the domain could lead to a higher-quality alignment between simulator data and external data on the Web, as in the project's use cases in the domain of interpersonal communication. We have performed a first exploration into the use of domain knowledge for ontology alignment. We developed a bootstrap method that automatically derives the needed domain knowledge from an initial set of high-confidence matches, and compared it to a baseline method without any domain knowledge and to an 'oracle' method with perfect domain knowledge. To explore the generalizability of the derived domain knowledge, we also performed an evaluation in which the domain knowledge is derived from one dataset and used to find matches in another dataset.
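The bootstrap idea can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the use of `difflib.SequenceMatcher` as the string similarity, the category labels as the form of "domain knowledge", the threshold, and the boost value are all assumptions made for illustration, and the toy data is unrelated to the project's datasets.

```python
from collections import Counter
from difflib import SequenceMatcher


def string_sim(a, b):
    """Baseline measure: plain string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def derive_domain(sources, candidates, threshold=0.95, top_k=2):
    """Bootstrap step (hypothetical): collect the categories that occur
    most often among the high-confidence string matches."""
    counts = Counter()
    for source in sources:
        for label, categories in candidates:
            if string_sim(source, label) >= threshold:
                counts.update(categories)
    return {category for category, _ in counts.most_common(top_k)}


def best_match(source, candidates, domain, boost=0.2):
    """Pick the candidate with the highest score, where candidates that
    share a category with the domain receive a fixed bonus (assumed)."""
    def score(candidate):
        label, categories = candidate
        bonus = boost if domain & set(categories) else 0.0
        return string_sim(source, label) + bonus
    return max(candidates, key=score)


# Toy data: source labels and (label, categories) target candidates.
sources = ["Jupiter", "Saturn", "Mercury"]
candidates = [
    ("Jupiter", ["planet"]),
    ("Saturn", ["planet"]),
    ("Mercury (god)", ["mythology"]),
    ("Mercury (planet)", ["planet"]),
]

domain = derive_domain(sources, candidates)        # bootstrapped domain
baseline = best_match("Mercury", candidates, set())  # no domain knowledge
aware = best_match("Mercury", candidates, domain)    # domain-aware
print(domain, baseline[0], aware[0])
```

On this toy data the exact matches for "Jupiter" and "Saturn" yield the domain, which then tips the ambiguous "Mercury" towards the candidate in that domain, whereas the baseline prefers the shorter label. An 'oracle' variant would simply pass a hand-made `domain` set instead of the bootstrapped one.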

Example scenario

The problem of matching data between two or more datasets is illustrated with an example from the Polimedia project (www.polimedia.nl):

  • Both the primary and the secondary datasets consist of structured documents (in XML format) that include some metadata and raw text.
  • The purpose of the project is to find descriptions of events that appear in both the primary and the secondary datasets, and then to create links between the matched elements.
  • To achieve this, named entities and topics (created using topic modeling methods) are extracted from the primary dataset to automatically generate queries. These queries are then used to retrieve data from the secondary datasets, with similarity measures used to rank the retrieved documents.
  • The queries contain not just the named entities but also the context in which those entities appear in the domain documents.
[Figure: Context in a query]
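The query-generation and ranking steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Polimedia pipeline: named entities are assumed to be already extracted, "context" is approximated by a fixed window of surrounding tokens, the similarity measure is cosine similarity over raw term counts, and all function names and example texts are hypothetical. Topic modeling is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_query(text, entities, window=2):
    """Build a query from each entity mention plus `window` tokens of
    context on either side (an assumed notion of context)."""
    tokens = tokenize(text)
    query = Counter()
    for entity in entities:
        ent = tokenize(entity)
        n = len(ent)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == ent:
                lo = max(0, i - window)
                hi = min(len(tokens), i + n + window)
                query.update(tokens[lo:hi])
    return query


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, documents):
    """Return (score, document) pairs, best match first."""
    scored = [(cosine(query, Counter(tokenize(d))), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example texts.
primary = "Minister Jansen defended the housing budget in parliament"
secondary = [
    "Jansen visited a school in Utrecht",
    "Minister Jansen defended his plans on television",
]

query = build_query(primary, ["Jansen"])
ranking = rank(query, secondary)
print(ranking[0][1])
```

Both secondary documents mention the entity, so a query containing only "Jansen" could not separate them; the context terms ("minister", "defended") are what push the second document to the top of the ranking.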

Publications

  1. Kristian Slabbekoorn, Laura Hollink, Geert-Jan Houben. Domain-aware Matching of Events to DBpedia. In DeRiVE workshop on Detection, Representation, and Exploitation of Events in the Semantic Web at ISWC, Bonn, Germany, 2011.
  2. Damir Juric, Laura Hollink, Geert-Jan Houben. Bringing parliamentary debates to the Semantic Web. In DeRiVE workshop on Detection, Representation, and Exploitation of Events in the Semantic Web at ISWC, Boston, USA, 2012.
