What is ontology engineering

Ontologies

Basics

The term ontology has a long tradition in philosophy and linguistics, but it was not introduced into computer science as a technical term for artificial intelligence until the 1990s (cf. [Staab, Studer 2004]). The original motivation was:

  • on the one hand, to provide largely application-independent domain models for reuse in the area of expert systems (XPS) [Heijst et al. 1997; Studer et al. 1998];
  • on the other hand, to create a common communication basis for the exchange of messages as speech acts in multi-agent systems (MAS) and distributed knowledge-based systems, by means of explicit models of the domain-specific vocabulary [Gruber 1995].

A current definition that is as complete as possible includes the following aspects:

An ontology is:

  • an explicit, formal specification
  • of the conceptualization of a delimited discourse area for a defined purpose
  • on which a group of actors has agreed.

Some of these terms are explained further:

  • The agreement aspect stems from the fact that an ontology is meant to enable or simplify direct or indirect communication between actors. This works because the actors agree on a "common language", the ontology. This is obvious when ontologies are used for the exchange of messages in multi-agent systems, but it also holds for more recent uses such as intelligent information integration from heterogeneous sources (ontologies as source, target or intermediate formats help with fully or partially automatic schema transformation) or search in information portals and retrieval systems in knowledge management (the ontology represents the uniform structural view of knowledge organization shared between those who index documents and those who search for them).
  • A conceptualization comprises the terms and relationships that structure a discourse area. Conceptualizations in ontologies usually include a taxonomy of concepts (also: classes), which possess attributes (also: characteristics, properties) and can be linked to one another through relationships (relations). Axioms can describe cardinalities, value ranges or default values of relations or attributes, as well as properties of relations (symmetry, transitivity, reflexivity, inverse relations); some ontology languages also offer the possibility of arranging relations taxonomically. Often, the standard languages can also be used to formulate constraints, or further derivation knowledge can be used to describe implicit relationships.
    In contrast to an object-oriented conceptual data schema plus an expressive rule formalism, an ontology can also contain instances (objects) of concepts together with their attribute values and instantiated relationships, i.e. it can relate instances to each other. Often a lexical layer is described as well, which (for one or more natural languages) contains the natural-language words with which certain terms, relationships or instances of the conceptual level are referenced in texts.
  • An explicit specification means first of all that the conceptualization is written down in a form that is as unambiguous as possible and leaves as little room for misinterpretation as possible. In certain applications, semi-structured texts for human use may already be sufficient for this. Normally, however, special ontology representation languages are used, the semantics of which should be specified completely and unambiguously; this is generally done using mathematical logic. The first generation of ontology languages took up the modeling approaches of semantic networks, frames and description logics. Today, ideally, at least the core of a knowledge representation is based on largely machine-processable languages standardized by the World Wide Web Consortium (W3C), such as the Resource Description Framework (RDF) and RDF Schema for web-based metadata and simple ontologies, the Web Ontology Language (OWL) for more complex ontologies, and the Rule Interchange Format (RIF) for rules. In the area of rules there are also other common languages, such as SWRL as a combination of OWL and rules, or F-Logic, a rule language in the style of logic programming.
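The building blocks named above (a taxonomy of concepts, instances, relations, and axioms about relation properties) can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions, not an implementation of RDF, OWL or any other language mentioned here; the class `Ontology`, its methods, and the example names (`LungDisease`, `case_17`, `part_of`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy in-memory ontology (hypothetical API, for illustration only)."""
    parents: dict = field(default_factory=dict)      # concept -> direct super-concepts
    instance_of: dict = field(default_factory=dict)  # instance -> its concept
    facts: set = field(default_factory=set)          # (subject, relation, object) triples
    symmetric: set = field(default_factory=set)      # relations declared symmetric
    transitive: set = field(default_factory=set)     # relations declared transitive

    def add_concept(self, name, *supers):
        """Register a concept and its direct super-concepts in the taxonomy."""
        self.parents.setdefault(name, set()).update(supers)
        for s in supers:
            self.parents.setdefault(s, set())

    def superconcepts(self, concept):
        """Return all direct and indirect super-concepts of a concept."""
        seen, stack = set(), list(self.parents.get(concept, ()))
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.add(c)
                stack.extend(self.parents.get(c, ()))
        return seen

    def is_a(self, instance, concept):
        """Check class membership, following the taxonomy upwards."""
        direct = self.instance_of[instance]
        return concept == direct or concept in self.superconcepts(direct)

    def closure(self):
        """Derive implicit triples from the symmetry and transitivity axioms."""
        derived = set(self.facts)
        changed = True
        while changed:
            new = set()
            for (s, r, o) in derived:
                if r in self.symmetric:
                    new.add((o, r, s))
                if r in self.transitive:
                    for (s2, r2, o2) in derived:
                        if r2 == r and s2 == o:
                            new.add((s, r, o2))
            changed = not new <= derived
            derived |= new
        return derived


# Example data (all names are made up for this sketch).
onto = Ontology()
onto.add_concept("Disease")
onto.add_concept("LungDisease", "Disease")
onto.add_concept("Pneumonia", "LungDisease")
onto.instance_of["case_17"] = "Pneumonia"
onto.transitive.add("part_of")
onto.facts |= {
    ("alveolus", "part_of", "lung"),
    ("lung", "part_of", "respiratory_system"),
}

print(onto.is_a("case_17", "Disease"))
print(("alveolus", "part_of", "respiratory_system") in onto.closure())
```

Both checks succeed although neither statement was entered explicitly: the first follows from the taxonomy, the second from the transitivity axiom. A real ontology language additionally fixes the semantics of such constructs formally, which this sketch does not attempt.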

Types and contents of ontologies

There are many types and also classifications of ontologies, see [Gomez-Perez et al. 2004]. One of the most widely cited classifications goes back to Guarino (cf. [Guarino 1998]) and distinguishes (1) Top-level ontologies (very general concepts that basically structure the world); (2) Generic ontologies (often reusable models of important aspects of the world such as time, space, etc.); (3) Domain ontologies (describe an application-independent discourse area, e.g. mechanical engineering or lung diseases; instantiate and refine (1) and use (2)); (4) Task ontologies (describe domain-independent task types and their contexts of knowledge, e.g. diagnosis; instantiating (1) and using (2)); (5) Application ontologies (merge (3) and (4) for a knowledge-based system).

Another useful classification goes back to van Heijst (cf. [van Heijst et al. 1997]) and distinguishes (A) Terminological ontologies (describe in particular how people talk about a domain; thus form the link to thesauri, information retrieval, etc.); (B) Information ontologies (describe the structure and meta-properties of information sources, so they could be used in particular in information integration); (C) Knowledge ontologies (represent very detailed and formally "clean" knowledge models, as required, for example, in expert systems).

Areas with already very extensive and successful "ontological penetration" are, for example, genetics and genome research, or medicine and the life sciences as a whole (see e.g. http://www.geneontology.org/). Well-known top-level ontologies are DOLCE [Gangemi et al. 2002] and the IEEE Suggested Upper Merged Ontology (SUMO, see http://suo.ieee.org/SUO/SUMO/). In the area of rather lightweight ontologies, the FOAF format (friend-of-a-friend, see http://www.foaf-project.org/) for describing people, their Internet presences and their social networks has become widespread. Important information ontologies concern, for example, the description of web services with metadata (see below; examples are OWL-S, WSMO or WSDL-S, cf. [Studer et al. 2007]).

Ontology usage

Today one can identify the main fields of application of ontologies in the semantic web and in knowledge management:

  • Semantic search: When searching complex, confusing or even unknown information stocks, ontologies can deliver a wide range of added value depending on the usage scenario, content and degree of formalization. In query formulation, ontological structural knowledge can be used to disambiguate queries, to generalize or specialize them depending on the result, and to adapt or correct them in a context- or user-specific manner. In query processing, ontological derivation and definition knowledge can be used to bridge large discrepancies between query formulation and document representation, and background knowledge can also be used for similarity-based searches. In document representation, ontologically structured metadata, or metadata about ontologically described domains, can describe complex document contents, capture document contexts or links between informal (multimedia) and formal content (database content), or give meta descriptions of documents (e.g. about content quality). Finally, in the presentation of results, user- or context-specific presentation or sorting rules formulated via ontologies can be applied, knowledge structure relationships can serve as the basis for information visualization, or knowledge-based evaluations can be carried out on the retrieval results.
  • Intelligent information integration: As very expressive conceptual schema description languages, ontology languages can help, when integrating information from different sources, to formulate translation and transformation rules conveniently or to derive them fully or partially automatically; they can facilitate consistency checks with ontological definition knowledge; they also offer convenient options for enriching the information model on the query side with additional concepts that can arise through the merging and intelligent processing of source data.
  • Knowledge-based advice and assistance systems: In the tradition of expert systems, the aim of which was originally to solve difficult tasks (such as planning, diagnosis, configuration, process control, tutoring, ...) fully automatically, there are now other approaches in the sense of a "human-computer tandem" to find useful combinations of human problem-solving and partial machine automation. Typical examples would be intelligent search and reference functions in call centers, where human employees try to solve customer problems, but should be efficiently supplied with information; or so-called "critiquing" components that, for example, "observe" an engineer while designing and check compliance with certain regulations or the observance of optimality goals.
  • Semantic infrastructures: Approaches such as Semantic Web Services, Semantic Peer-to-Peer, Semantic Grid or Semantic Software Engineering in general rely on known paradigms of distributed computing, but use ontology-based metadata for resource description (of services, peers, grid resources, software modules, etc.). This opens up all sorts of combinations of the benefits of the three categories of use listed above: for example, semantic web services or semantically described software modules (i.e. modules described with reference to ontologies) can be found more easily in repositories or registries using methods of semantic search, can be combined or adapted automatically more easily using rule-based mechanisms, and can have their interfaces coordinated with one another through ontology mapping as known from information integration, etc.
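The query-formulation idea from the semantic search item above (specializing a query with ontological structural knowledge and with words from a lexical layer) can be sketched as follows. This is a deliberately naive illustration: the functions `expand_query` and `search`, the tiny taxonomy and the sample documents are assumptions made for this sketch and do not reproduce any particular retrieval system.

```python
def expand_query(term, narrower, synonyms):
    """Return the term plus all transitively narrower concepts and their synonyms."""
    concepts, stack = [], [term]
    while stack:
        c = stack.pop()
        if c not in concepts:
            concepts.append(c)
            stack.extend(narrower.get(c, []))
    terms = set()
    for c in concepts:
        terms.add(c)
        terms.update(synonyms.get(c, []))  # words from the lexical layer
    return terms


def search(query, documents, narrower, synonyms):
    """Return ids of documents that mention any term of the expanded query."""
    terms = {t.lower() for t in expand_query(query, narrower, synonyms)}
    return [doc_id for doc_id, text in documents.items()
            if any(t in text.lower() for t in terms)]


# Hypothetical taxonomy (concept -> narrower concepts) and lexical layer.
NARROWER = {"lung disease": ["pneumonia", "asthma"]}
SYNONYMS = {"pneumonia": ["lung inflammation"]}

DOCS = {
    "d1": "Treatment of pneumonia in adults",
    "d2": "Asthma and exercise",
    "d3": "Gear design in mechanical engineering",
    "d4": "A case of lung inflammation",
}

# Plain keyword search: no document contains the literal query string.
plain_hits = [d for d, text in DOCS.items() if "lung disease" in text.lower()]

# Ontology-supported search: narrower concepts and synonyms are matched too.
semantic_hits = search("lung disease", DOCS, NARROWER, SYNONYMS)

print(plain_hits)
print(semantic_hits)
```

The literal query finds nothing, whereas the expanded query retrieves the three documents about narrower concepts. Generalizing a query would work the same way in the opposite direction of the taxonomy.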

In addition, there are all sorts of current research questions and special approaches, such as the idea of the semantic desktop, the use of ontological background knowledge in data mining, or the combination of ontology-based approaches with methods of social software. In principle, ontology-based methods are a good choice when very refined evaluations are required or when a very complex or confusing area of application calls for convenient modeling methods as additional levels of abstraction; the subject area of ambient intelligence or pervasive computing ("Internet of Things") poses challenges in both respects. Further ontology-based applications can be found in [Cardoso et al. 2007].

Ontology creation, methods and tools

There are a number of widely used methods for creating ontologies ("ontology engineering"), for example as a continuation of earlier modeling approaches for products and processes (IDEF5, http://www.idef.com/IDEF5.html) or on the basis of business process analyses (e.g. DECOR [Abecker 2004]), as well as "native" ontology modeling methods, e.g. On-To-Knowledge [Sure 2003] or METHONTOLOGY [Gomez-Perez et al. 2004]. Newer approaches place more emphasis on reaching agreement among the actors (e.g. DILIGENT [Vrandecic et al. 2005]), accentuate networked and modular ontologies as well as the ontology life cycle and ontology evolution over time, or integrate newer ideas from computer science and software engineering, such as design patterns or collaborative approaches.

There are also many commercial and non-commercial tools for ontology modeling (cf. [Gomez-Perez et al. 2004]; especially Protégé, see http://protege.stanford.edu/), for the storage and management of ontologies (e.g. KAON or the NeOn Toolkit, see http://neon-toolkit.org/) or for supporting ontology engineering with methods of machine learning and text analysis [Cimiano 2006]. Furthermore, there is of course a wide range of downstream application software based on ontologies; this often includes:

  • Annotation tools for ontology-based metadata generation for documents or information sources in semantic search [Uren et al. 2006]
  • Tools for the so-called "Ontology mapping", in which different ontologies are mapped to one another semi-automatically or fully automatically for the purposes of message translation or information integration [Euzenat, Shvaiko 2007]
  • Reasoners for automatic reasoning with ontologies (e.g. Pellet, FaCT++, KAON2, RacerPro; for a comparison of some reasoners see e.g. [Volz 2008])
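To give a rough idea of what semi-automatic ontology mapping can mean, the following sketch proposes correspondences between the concept labels of two ontologies purely by string similarity and leaves the final decision to a human reviewer. This is an assumption-laden toy, not the method of any tool or publication cited here; the function `propose_mappings`, the threshold value and the example labels are hypothetical, and real matchers also exploit structure, instances and background knowledge.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive string similarity between two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def propose_mappings(labels_a, labels_b, threshold=0.8):
    """For each label in A, propose the most similar label in B if it is similar enough."""
    proposals = []
    for a in labels_a:
        best = max(labels_b, key=lambda b: similarity(a, b))
        score = similarity(a, best)
        if score >= threshold:
            proposals.append((a, best, round(score, 2)))
    return proposals


# Concept labels of two hypothetical ontologies to be aligned.
ONTOLOGY_A = ["Person", "Organisation", "Document"]
ONTOLOGY_B = ["person", "Organization", "Publication"]

proposals = propose_mappings(ONTOLOGY_A, ONTOLOGY_B)
for source, target, score in proposals:
    # In a semi-automatic workflow a human would confirm or reject each line.
    print(f"{source} -> {target} (similarity {score})")
```

In this example "Document" receives no proposal, because no label in the second ontology is lexically close enough, even though "Publication" might be a sensible match. This is exactly the kind of case where purely lexical techniques fall short and human review or richer matching methods are needed.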

In operational practice, an ontology as a model is an engineering artifact designed for a specific purpose, in which one tries to reduce the possibilities of misinterpretation and to increase the possibilities of meaningful machine interpretation and processing through a detailed declarative description of the design decisions of the modeling. Basically, modeling in operational practice of course involves a trade-off between the so-called "sharing scope" and the degree of application independence on the one hand and the modeling effort or the temporal stability of an ontology on the other (cf. also [Elst, Abecker 2002]). Another trade-off concerns, for example, the degree of formality and completeness of the modeling: higher degrees enable more automated services, but increase the effort and difficulty of ontology creation and maintenance (see [Hepp 2007] for a discussion of such practical considerations in ontology engineering).

Current challenges in the area of creating and using ontologies include ergonomics in dealing with large ontologies, cost-benefit considerations in ontology creation and evolution, practical application scenarios, scalable reasoning and much more.

Literature

Abecker, Andreas: Business-Process Oriented Knowledge Management: Concepts, Methods, and Tools. Dissertation, Institute AIFB, University of Karlsruhe (TH), 2004.

Cardoso, Onelio Jorge; Hepp, Martin; Lytras, Miltiadis (Eds.): Real-world Applications of Semantic Web Technology and Ontologies. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2007.

Cimiano, Philipp: Ontology Learning and Population from Text: Algorithms, Evaluation and Applications. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2006.

Elst, Ludger van; Abecker, Andreas: Ontologies for Information Management: Balancing Formality, Stability, and Sharing Scope, Expert Systems with Applications, 23 (4): 357-366, Elsevier, 2002.

Euzenat, Jérôme; Shvaiko, Pavel: Ontology Matching. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2007.

Gangemi, Aldo; Guarino, Nicola; Masolo, Claudio; Oltramari, Alessandro; Schneider, Luc: Sweetening Ontologies with DOLCE. In: EKAW-2002, pp. 166-181. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2002.

Gomez-Perez, Asuncion; Fernandez-Lopez, Mariano; Corcho-Garcia, Oscar: Ontological Engineering. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2004.

Guarino, Nicola (Ed.): Formal Ontology in Information Systems. Amsterdam, Berlin, Oxford: IOS Press, 1998.

Gruber, Thomas R.: Toward Principles for the Design of Ontologies Used for Knowledge Sharing. International Journal of Human-Computer Studies 43 (5-6): 907-928, Elsevier, 1995.

Heijst, Gertjan van; Schreiber, Guus; Wielinga, Bob: Using Explicit Ontologies in KBS Development. International Journal of Human-Computer Studies 46 (2-3): 183-292, Elsevier, 1997.

Hepp, Martin: Possible Ontologies: How Reality Constrains the Development of Relevant Ontologies. IEEE Internet Computing 11 (1): 90-96, 2007.

Staab, Steffen; Studer, Rudi (Eds.): Handbook on Ontologies in Information Systems. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2004.

Studer, Rudi; Benjamins, V. Richard; Fensel, Dieter: Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering 25 (1-2): 161-197, Elsevier, 1998.

Studer, Rudi; Grimm, Stephan; Abecker, Andreas: Semantic Web Services - Concepts, Technologies and Applications. Berlin, Heidelberg, New York, Tokyo: Springer-Verlag, 2007.

Sure, York: Methodology, Tools and Case Studies for Ontology based Knowledge Management. Dissertation, Institute AIFB, University of Karlsruhe (TH), 2003.

Uren, Victoria; Cimiano, Philipp; Iria, José; Handschuh, Siegfried; Vargas-Vera, Maria; Motta, Enrico; Ciravegna, Fabio: Semantic Annotation for Knowledge Management: Requirements and a Survey of the State of the Art. Journal of Web Semantics 4 (1): 14-28, Elsevier, 2006.

Volz, Raphael (Ed.): Semantics at Work: Ontology Management - Tools and Techniques. eBook, available at: http://www.lulu.com/content/1969742. Last accessed: September 01, 2008.

Vrandecic, Denny; Pinto, Sofia; Sure, York; Tempich, Christoph: The DILIGENT Knowledge Processes. Journal of Knowledge Management 9 (5): 85-96, Emerald, 2005.

Author

Prof. Dr. Rudi Studer, University of Karlsruhe (TH), Institute for Applied Computer Science and Formal Description Procedures - AIFB, 76128 Karlsruhe
