Definition of the CIDOC Conceptual Reference Model

 

Version 3.4.9


This is the introductory page of the Definition of the CIDOC object-oriented Conceptual Reference Model and Cross-reference Manual.


Contents:

  1. Initial Page
  2. Introduction
    1. Objectives of the CIDOC CRM
    2. Scope of the CIDOC CRM
    3. Compatibility with the CRM
    4. Applied Form
      1. Terminology
      2. Property Quantifiers
      3. Naming Conventions
    5. Modelling Principles
      1. Monotonicity
      2. Minimality
      3. Shortcuts
      4. Disjointness
      5. About Types
      6. Extensions
      7. Coverage
    6. Examples
  3. The Entity and Property List
  4. APPENDIX

Introduction

 

This document is the formal definition of the CIDOC Conceptual Reference Model (“CRM”), a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. The CRM is the culmination of more than a decade of standards development work by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). Work on the CRM itself began in 1996 under the auspices of the ICOM-CIDOC Documentation Standards Working Group. Since 2000, development of the CRM has been officially delegated by ICOM-CIDOC to the CIDOC CRM Special Interest Group, which collaborates with the ISO working group ISO/TC46/SC4/WG9 to bring the CRM to the form and status of an International Standard.

Objectives of the CIDOC CRM

The primary role of the CRM is to enable information exchange and integration between heterogeneous sources of cultural heritage information. It aims at providing the semantic definitions and clarifications needed to transform disparate, localised information sources into a coherent global resource, be it within a larger institution, in intranets or on the Internet.

Its perspective is supra-institutional and abstracted from any specific local context. This goal determines the constructs and level of detail of the CRM.

 

More specifically, it defines and is restricted to the underlying semantics of database schemata and document structures used in cultural heritage and museum documentation in terms of a formal ontology. It does not define any of the terminology appearing typically as data in the respective data structures; however it foresees the characteristic relationships for its use. It does not aim at proposing what cultural institutions should document. Rather it explains the logic of what they actually currently document, and thereby enables semantic interoperability.

 

It intends to provide an optimal analysis of the intellectual structure of cultural documentation in logical terms. As such, it is not optimised for implementation-specific storage and processing aspects. Rather, it provides the means to understand the effects of such optimisations on the semantic accessibility of the respective contents.

 

The CRM aims to support the following specific functionalities:

 

Users of the CRM should be aware that the definition of data entry systems requires support for community-specific terminology, guidance on what should be documented and in which sequence, and application-specific consistency controls. The CRM does not provide such notions.

 

By its very structure and formalism, the CRM is extensible and users are encouraged to create extensions for the needs of more specialized communities and applications.

Scope of the CIDOC CRM

The overall scope of the CIDOC CRM can be summarised in simple terms as the curated knowledge of museums.

 

However, a more detailed and useful definition can be articulated by defining both the Intended Scope, a broad and maximally inclusive definition of general application principles, and the Practical Scope, which is expressed by the overall scope of a reference set of specific, identifiable museum documentation standards and practices that the CRM aims to encompass, restricted in its details, however, to the limits of the Intended Scope.

 

The Intended Scope of the CRM may be defined as all information required for the exchange and integration of heterogeneous scientific documentation of museum collections. This definition requires further elaboration:

 

 

The Practical Scope [2] of the CRM is expressed in terms of the current reference standards for museum documentation that have been used to guide and validate the CRM’s development. The CRM covers the same domain of discourse as the union of these reference standards; this means that data correctly encoded according to any of these museum documentation standards can be expressed in a CRM-compatible form, without any loss of meaning.

 

Compatibility with the CRM

Users intending to take advantage of the semantic interoperability offered by the CRM may want to make parts of their data structures compatible with the CRM. The respective parts should pertain either to the associations by which users would like their data to be accessible in an integrated environment, or to contents intended for transport to other environments, so that the meaning encoded by its structure is preserved in another target system.

 

In that sense, the CRM is not aimed at proposing a complete matching of user documentation structures with the CRM, nor that a user should always implement all CRM concepts and associations; rather it is intended to leave room for all kinds of extensions to capture the richness of cultural information, but also for simplifications for reasons of economy.

 

Further, the CRM is a means to interpret structured information in such a way that large amounts of data can be transformed or mediated automatically. As a consequence, the CRM does not aim at resolving free text information into a formal logical form. In other words, it does not intend to provide more structuring than the users have done before, and free text information does not fall under the scope of compatibility considerations. The CRM does, however, foresee the associations needed to transport such information in relation to structured information.

The CRM is a formal ontology, expressible in terms of logic or a suitable knowledge representation language. Its concepts can be instantiated as sets of statements that form models of the assumed reality referred to in a structured document. Any encoding of CRM instances in a formal language that preserves the relations to the CRM classes, properties and inheritance rules among them is regarded as a “CRM-compatible form”.

 

A part of a documentation structure is compatible with the CRM if a deterministic logical algorithm can be found that transforms any data correctly encoded in this structure into a CRM-compatible form without loss of meaning. No assumptions are made about the nature of this algorithm. It may in particular draw on other formal ontologies expressing background knowledge, such as thesauri. The algorithm itself can only be found and verified intellectually by understanding the meaning intended by the designer of the data structure and the CRM concepts. By the term “correctly encoded” we mean that the data are encoded so that the meaning intended by the designer of the data structure is correctly applied to the intended meaning of the data.

 

Information system implementers may choose to provide facilities for exporting selected data into a CRM-compatible form. They may further choose to provide a service to access selected data by querying with CRM concepts. It is not regarded as a loss of compatibility if certain subclasses and subproperties of the CRM are not supported in such a service. In that case it is regarded as essential that the service publishes the set of CRM concepts it supports.

 

Applied Form

The CRM is a domain ontology in the sense used in computer science. It has been expressed as an object-oriented semantic model, in the hope that this formulation will be comprehensible to both documentation experts and information scientists, while at the same time being readily converted to machine-readable formats such as RDF Schema, KIF, DAML+OIL, OWL, STEP, etc. It can be implemented in any relational or object-oriented schema. CRM instances can also be encoded in RDF, XML, DAML+OIL, OWL and others.

 

Although the definition of the CRM provided here is complete, it is an intentionally compact and concise presentation of the CRM’s 81 classes and 132 unique properties. It does not attempt to articulate the inheritance of properties by subclasses throughout the class hierarchy (this would require the declaration of several thousand properties, as opposed to 132). However, this definition does contain all of the information necessary to infer and automatically generate a full declaration of all properties, including inherited properties.

 

Terminology

The following definitions of key terminology used in this document are provided both as an aid to readers unfamiliar with object-oriented modelling terminology, and to specify, for the purpose of this document, the precise usage of terms that are sometimes applied inconsistently across the object-oriented modelling community. Where applicable, the editors have tried to consistently use terminology that is compatible with that of the Resource Description Framework (RDF) [3], a recommendation of the World Wide Web Consortium. The editors have tried to find a language which is comprehensible to the non-computer expert and precise enough for the computer expert so that both understand the intended meaning.

 

Class

A class is a category of items that share one or more common traits serving as criteria to identify the items belonging to the class. These traits need not be explicitly formulated in logical terms, but may be described in a text (here called a scope note) that refers to a common conceptualisation of domain experts. The sum of these traits is called the intension of the class. A class may be the domain or range of none, one or more properties formally defined in a model. The formally defined properties need not be part of the intension of their domains or ranges: such properties are optional. An item that belongs to a class is called an instance of this class. A class is associated with an open set of real life instances, known as the extension of the class. Here “open” is used in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). Therefore a class cannot be defined by enumerating its instances. A class plays a role analogous to a grammatical noun, and can be completely defined without reference to any other construct (unlike properties, which must have an unambiguously defined domain and range). In some contexts, the terms individual class, entity or node are used synonymously with class.

For example:

Person is a class. To be a Person may actually be determined by DNA characteristics, but we all know what a Person is. A Person may have the property of being a member of a Group, but it is not necessary to be member of a Group in order to be a Person. We shall never know all Persons of the past. There will be more Persons in the future.

 

subclass

A subclass is a class that is a specialization of another class (its superclass). Specialization or the IsA relationship means that:

  1. all instances of the subclass are also instances of its superclass,
  2. the intension of the subclass extends the intension of its superclass, i.e. its traits are more restrictive than those of its superclass, and
  3. the subclass inherits the definition of all of the properties declared for its superclass without exceptions (strict inheritance), in addition to having none, one or more properties of its own.

 

A subclass can have more than one immediate superclass and consequently inherits the properties of all of its superclasses (multiple inheritance). The IsA relationship or specialization between two or more classes gives rise to a structure known as a class hierarchy. The IsA relationship is transitive and may not be cyclic. In some contexts (e.g. the programming language C++) the term derived class is used synonymously with subclass.

 

For example:

Every Person IsA Biological Object, or Person is a subclass of Biological Object.

Also, every Person IsA Actor. A Person may die. However, other kinds of Actors, such as companies, don’t die (cf. 2).

Every Biological Object IsA Physical Object. A Physical Object can be moved. Hence a Person can be moved also (cf. 3).
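
The transitivity and acyclicity of the IsA relationship can be illustrated with a short sketch. The Python fragment below is an illustration only and not part of the CRM definition: it records the immediate superclasses named in the examples above in a dictionary (the dictionary layout and the function names are assumptions made for this sketch) and derives all indirect superclasses from them.

```python
# Immediate superclasses, taken from the examples in the text.
# The dictionary layout is an assumption made for this sketch.
SUPERCLASSES = {
    "Person": ["Biological Object", "Actor"],
    "Biological Object": ["Physical Object"],
    "Actor": [],
    "Physical Object": [],
}


def all_superclasses(cls, graph=SUPERCLASSES):
    """Return every direct and indirect superclass of cls (IsA is transitive)."""
    found = set()
    stack = list(graph.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(graph.get(parent, []))
    return found


def is_acyclic(graph=SUPERCLASSES):
    """Return True if no class is its own direct or indirect superclass."""
    return all(cls not in all_superclasses(cls, graph) for cls in graph)


def is_a(cls, other, graph=SUPERCLASSES):
    """Return True if every instance of cls is also an instance of other."""
    return cls == other or other in all_superclasses(cls, graph)
```

Under these assumptions, `is_a("Person", "Physical Object")` holds even though Physical Object is not an immediate superclass of Person, which is the reasoning behind "a Person can be moved also".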

 

superclass

A superclass is a class that is a generalization of one or more other classes (its subclasses), which means that it subsumes all instances of its subclasses, and that it can also have additional instances that do not belong to any of its subclasses. The intension of the superclass is less restrictive than any of its subclasses. This subsumption relationship or generalization is the inverse of the IsA relationship or specialization.

In some contexts (e.g. the programming language C++) the term parent class is used synonymously with superclass.

 

For example:

“Biological Object subsumes Person” is synonymous with “Biological Object is a superclass of Person”. It needs fewer traits to identify an item as a Biological Object than to identify it as a Person.

 

intension

The intension of a class or property is its intended meaning. It consists of one or more common traits shared by all instances of the class or property. These traits need not be explicitly formulated in logical terms, but may just be described in a text (here called a scope note) that refers to a conceptualisation common to domain experts. In particular the so-called primitive concepts, which make up most of the CRM, cannot be further reduced to other concepts by logical terms.

 

extension

The extension of a class is the set of all real life instances belonging to the class that fulfil the criteria of its intension. This set is “open” in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). An information system may at any point in time refer to some instances of a class, which form a subset of its extension.

 

scope note

A scope note is a textual description of the intension of a class or property.

Scope notes are not formal modelling constructs, but are provided to help explain the intended meaning and application of the CRM’s classes and properties. Basically, they refer to a conceptualisation common to domain experts and disambiguate between different possible interpretations. Illustrative example instances of classes and properties are also regularly provided in the scope notes for explanatory purposes.

 

instance

An instance of a class is an item that has the traits that match the criteria of the intension of the class. 

For example:

The painting known as “The Mona Lisa” is an instance of the class Physical Man Made Object.

 

An instance of a property is a factual relation between an instance of the domain and an instance of the range of the property that matches the criteria of the intension of the property.

 

For example:

“The Louvre is current owner of The Mona Lisa” is an instance of the property “is current owner of”.

 

property

A property serves to define a relationship of a specific kind between two classes. The property is characterized by an intension, which is conveyed by a scope note. A property plays a role analogous to a grammatical verb, in that it must be defined with reference to both its domain and range, which are analogous to the subject and object in grammar (unlike classes, which can be defined independently). It is arbitrary which class is selected as the domain, just as the choice between active and passive voice in grammar is arbitrary. In other words, a property can be interpreted in both directions, with two distinct but related interpretations. Properties may themselves have properties that relate to other classes (this feature is used in this model only in order to describe dynamic subtyping of properties). Properties can also be specialized in the same manner as classes, resulting in IsA relationships between subproperties and their superproperties.

In some contexts, the terms attribute, reference, link, role or slot are used synonymously with property.

 

For example:

“Physical Man-Made Stuff depicts CRM Entity” is equivalent to “CRM Entity is depicted by Physical Man-Made Stuff”.
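
The two reading directions of a property can be sketched in a few lines. The fragment below is only an illustration: the `PropertyName` type and the `read_both_ways` function are hypothetical names introduced here, and the sketch simply pairs a forward name with its inverse name.

```python
from typing import NamedTuple


class PropertyName(NamedTuple):
    """A property name with its forward and inverse readings (hypothetical type)."""
    forward: str   # read from domain to range
    inverse: str   # read from range to domain


def read_both_ways(domain_item, prop, range_item):
    """Return the same statement phrased in both directions."""
    return (
        f"{domain_item} {prop.forward} {range_item}",
        f"{range_item} {prop.inverse} {domain_item}",
    )


depicts = PropertyName(forward="depicts", inverse="is depicted by")
forward, inverse = read_both_ways("Physical Man-Made Stuff", depicts, "CRM Entity")
```

Both strings describe the same relationship; only the choice of grammatical subject differs.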

 

subproperty

 

A subproperty is a property that is a specialization of another property (its superproperty). Specialization or IsA relationship means that:

  1. all instances of the subproperty are also instances of its superproperty,
  2. the intension of the subproperty extends the intension of the superproperty, i.e. its traits are more restrictive than those of its superproperty,
  3. the domain of the subproperty is the same as the domain of its superproperty or a subclass of that domain,
  4. the range of the subproperty is the same as the range of its superproperty or a subclass of that range,
  5. the subproperty inherits the definition of all of the properties declared for its superproperty without exceptions (strict inheritance), in addition to having none, one or more properties of its own.

 

A subproperty can have more than one immediate superproperty and consequently inherits the properties of all of its superproperties (multiple inheritance). The IsA relationship or specialization between two or more properties gives rise to the structure we call a property hierarchy. The IsA relationship is transitive and may not be cyclic.

Some object-oriented languages, such as C++, have no equivalent to the specialization of properties.
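
Conditions 3 and 4 above can be checked mechanically once the class hierarchy is known. The following Python sketch assumes a tiny hierarchy and two properties that are purely illustrative ("is associated with" is a hypothetical superproperty, and the statement that Group IsA Actor is an assumption of this sketch); it models properties as plain data rather than as language-level constructs.

```python
from dataclasses import dataclass

# Illustrative hierarchy; "Group IsA Actor" is an assumption of this sketch.
SUPERCLASSES = {"Person": {"Actor"}, "Group": {"Actor"}, "Actor": set()}


def is_a(cls, other):
    """Return True if cls is the same class as other or one of its subclasses."""
    if cls == other:
        return True
    return any(is_a(parent, other) for parent in SUPERCLASSES.get(cls, ()))


@dataclass(frozen=True)
class Property:
    name: str
    domain: str
    range: str


def may_specialize(sub, sup):
    """Check conditions 3 and 4: domain and range of sub must be the same as,
    or subclasses of, the domain and range of sup."""
    return is_a(sub.domain, sup.domain) and is_a(sub.range, sup.range)


# Hypothetical property declarations used only for illustration.
associated_with = Property("is associated with", "Actor", "Actor")
member_of = Property("is member of", "Person", "Group")
```

The check covers only the formal conditions on domain and range. Whether the intension of one property really extends that of another remains an intellectual judgement.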

 

superproperty

 

A superproperty is a property that is a generalization of one or more other properties (its subproperties), which means that it subsumes all instances of its subproperties, and that it can also have additional instances that do not belong to any of its subproperties. The intension of the superproperty is less restrictive than any of its subproperties. The subsumption relationship or generalization is the inverse of the IsA relationship or specialization.

 

domain

The domain is the class for which a property is formally defined. This means that instances of the property are applicable to instances of its domain class. A property must have exactly one domain, although the domain class may always contain instances for which the property is not instantiated. The domain class is analogous to the grammatical subject of the phrase for which the property is analogous to the verb. It is arbitrary which class is selected as the domain and which as the range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition, the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.

 

range

The range is the class that comprises all potential values of a property. That means that instances of the property can link only to instances of its range class. A property must have exactly one range, although the range class may always contain instances that are not the value of the property. The range class is analogous to the grammatical object of a phrase for which the property is analogous to the verb. It is arbitrary which class is selected as the domain and which as the range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition, the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.

 

inheritance

Inheritance of properties from superclasses to subclasses means that if an item x is an instance of a class A, then

  1. all properties that must hold for the instances of any of the superclasses of A must also hold for item x, and
  2. all optional properties that may hold for the instances of any of the superclasses of A may also hold for item x.
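
How a full declaration of properties, including inherited ones, might be generated from a compact definition can be sketched as follows. The fragment is a minimal illustration, not the procedure used by the editors: the class names come from the examples in this document, while the three property names and their domains are hypothetical.

```python
# Immediate superclasses, following the examples in the text.
SUPERCLASSES = {
    "Person": ["Biological Object", "Actor"],
    "Biological Object": ["Physical Object"],
    "Actor": [],
    "Physical Object": [],
}

# Hypothetical property declarations: property name -> domain class.
DECLARED = {
    "moved to": "Physical Object",
    "has contact point": "Actor",
    "died in": "Person",
}


def ancestors_and_self(cls):
    """Return cls followed by all of its direct and indirect superclasses."""
    result = [cls]
    for parent in SUPERCLASSES[cls]:
        for candidate in ancestors_and_self(parent):
            if candidate not in result:
                result.append(candidate)
    return result


def full_declaration(cls):
    """List every property applicable to cls, declared or inherited."""
    classes = ancestors_and_self(cls)
    return sorted(name for name, domain in DECLARED.items() if domain in classes)
```

With these assumptions, `full_declaration("Person")` lists all three properties although only one of them is declared for Person directly.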

 

strict inheritance

Strict inheritance means that there are no exceptions to the inheritance of properties from superclasses to subclasses. For instance, some systems may declare that elephants are grey, and regard a white elephant as an exception. Under strict inheritance it would hold that: if all elephants were grey, then a white elephant could not be an elephant. Obviously not all elephants are grey. To be grey is not part of the intension of the concept elephant but an optional property. The CRM applies strict inheritance as a normalization principle.

 

multiple inheritance

Multiple inheritance means that a class A may have more than one immediate superclass. The extension of a class with multiple immediate superclasses is a subset of the intersection of all extensions of its superclasses. The intension of a class with multiple immediate superclasses extends the intensions of all its superclasses, i.e. its traits are more restrictive than any of its superclasses. If multiple inheritance is used, the resulting “class hierarchy” is a directed graph and not a tree structure. If it is represented as an indented list, there are necessarily repetitions of the same class at different positions in the list.

For example, Person is both an Actor and a Biological Object.
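
The remark about repetitions in an indented list can be made concrete with a small sketch. The Python fragment below prints a class hierarchy as an indented list; the dictionary of immediate subclasses and the function name are assumptions of this illustration.

```python
# Immediate subclasses, following the examples in the text.
SUBCLASSES = {
    "Physical Object": ["Biological Object"],
    "Biological Object": ["Person"],
    "Actor": ["Person"],
    "Person": [],
}


def indented_list(roots, graph=SUBCLASSES, depth=0):
    """Render the hierarchy below the given roots as indented lines."""
    lines = []
    for cls in roots:
        lines.append("  " * depth + cls)
        lines.extend(indented_list(graph[cls], graph, depth + 1))
    return lines


listing = indented_list(["Physical Object", "Actor"])
```

Because Person has two immediate superclasses, it appears twice in `listing`: once below Biological Object and once below Actor.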

 

instance

An instance of a class is a real world item that fulfils the criteria of the intension of the class. Note that the number of instances declared for a class in an information system is typically less than the total in the real world. For example, you are an instance of Person, but you are not mentioned in all information systems describing Persons.

 

endurant, perdurant

“The difference between enduring and perduring entities (which we shall also call endurants and perdurants) is related to their behaviour in time. Endurants are wholly present (i.e., all their proper parts are present) at any time they are present. Perdurants, on the other hand, just extend in time by accumulating different temporal parts, so that, at any time they are present, they are only partially present, in the sense that some of their proper temporal parts (e.g., their previous or future phases) may be not present. E.g., the piece of paper you are reading now is wholly present, while some temporal parts of your reading are not present any more. Philosophers say that endurants are entities that are in time, while lacking however temporal parts (so to speak, all their parts flow with them in time). Perdurants, on the other hand, are entities that happen in time, and can have temporal parts (all their parts are fixed in time).”
(Gangemi et al. 2002, pp. 166-181).

 

shortcut

A shortcut is a formally defined single property that represents a deduction or join of a data path in the CRM. The scope notes of all properties characterized as shortcuts describe in words the equivalent deduction. Shortcuts are introduced for the cases where common documentation practice refers only to the deduction rather than to the fully developed path. For example, museums often only record the dimension of an object without documenting the Measurement Event that observed it. The CRM allows shortcuts as cases of less detailed knowledge, while preserving in its schema the relationship to the full information.
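
The relationship between a shortcut and its fully developed path can be sketched as a join over statements. In the fragment below, statements are written as (subject, property, object) triples. The property names "measured", "observed dimension" and "has dimension" and the sample items are illustrative assumptions of this sketch and are not quoted from the property list.

```python
# Fully developed path: a measurement event links an object to a dimension.
# Property names and items are illustrative assumptions.
FULL_PATH = [
    ("measurement 1", "measured", "vase 42"),
    ("measurement 1", "observed dimension", "height 30 cm"),
]

# Less detailed knowledge: only the shortcut is recorded.
SHORTCUT_ONLY = [
    ("vase 42", "has dimension", "height 30 cm"),
]


def derive_has_dimension(statements):
    """Join 'measured' and 'observed dimension' on the shared event to
    deduce the corresponding shortcut statements."""
    measured = {(s, o) for s, p, o in statements if p == "measured"}
    observed = {(s, o) for s, p, o in statements if p == "observed dimension"}
    return {
        (obj, "has dimension", dim)
        for event, obj in measured
        for other_event, dim in observed
        if event == other_event
    }
```

The deduction works in one direction only: the shortcut can be derived from the full path, but the measurement event cannot be reconstructed from the shortcut alone.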

 

monotonic reasoning

Monotonic reasoning is a term from knowledge representation. A reasoning form is monotonic if an addition to the set of propositions making up the knowledge base never reduces the set of conclusions that may be derived from the knowledge base via inference rules. In practical terms, if experts successively enter correct statements into an information system, the system should not regard any results derived from the earlier statements as invalid when a new one is entered. The CRM is designed for monotonic reasoning and so enables conflict-free merging of huge stores of knowledge.
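
A minimal sketch may help to show what monotonicity means in practice. The fragment below uses a single inference rule (membership in a class implies membership in its superclasses) over a toy knowledge base; the fact format and the item name are assumptions of this illustration. Adding a fact can only enlarge the set of conclusions.

```python
def conclusions(facts):
    """Derive all (item, class) memberships that follow from the facts,
    using only the rule that instances of a class are instances of its
    superclasses."""
    isa = {(a, b) for kind, a, b in facts if kind == "isa"}
    derived = {(a, b) for kind, a, b in facts if kind == "instance"}
    changed = True
    while changed:
        changed = False
        for item, cls in list(derived):
            for sub, sup in isa:
                if sub == cls and (item, sup) not in derived:
                    derived.add((item, sup))
                    changed = True
    return derived


# Toy knowledge base; the fact format is an assumption of this sketch.
knowledge_base = [
    ("isa", "Person", "Actor"),
    ("instance", "item x", "Person"),
]
before = conclusions(knowledge_base)

# A further correct statement is entered later.
after = conclusions(knowledge_base + [("isa", "Person", "Biological Object")])
```

Here `before` is a subset of `after`: nothing concluded earlier has to be withdrawn when the new statement arrives.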

 

disjoint

Classes are disjoint if the intersection of their extensions is an empty set. In other words, they have no common instances in any possible world.

 

primitive

The term primitive as used in knowledge representation characterizes a concept that is declared and its meaning is agreed upon, but that is not defined by a logical deduction from other concepts. For example, mother may be described as a female human with child. Then mother is not a primitive concept. Event however is a primitive concept.

Most of the CRM is made up of primitive concepts.

 

Open World

The “Open World Assumption” is a term from knowledge base systems. It characterizes knowledge base systems that assume the information stored is incomplete relative to the universe of discourse they intend to describe. This incompleteness may be due to the inability of the maintainer to provide sufficient information or due to more fundamental problems of cognition in the system’s domain. Such problems are characteristic of cultural information systems. Our records about the past are necessarily incomplete. In addition, there may be items that cannot be clearly assigned to a given class.

In particular, absence of a certain property for an item described in the system does not mean that this item does not have this property. For example, if one item is described as Biological Object and another as Physical Object, this does not imply that the latter may not be a Biological Object as well. Therefore complements of a class with respect to a superclass cannot be concluded in general from an information system using the Open World Assumption. For example, one cannot list “all Physical Objects known to the system that are not Biological Objects in the real world”, but one may of course list “all items known to the system as Physical Objects but that are not known to the system as Biological Objects”.
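
The difference between the two listings can be sketched in code. The fragment below is an illustration under the assumption that a system stores, for each item, the set of classes it is known to belong to; the item names and function names are hypothetical.

```python
# What the system knows about each item (illustrative data).
KNOWN = {
    "item a": {"Physical Object", "Biological Object"},
    "item b": {"Physical Object"},
}


def known_as(cls):
    """Items the system knows to be instances of cls."""
    return {item for item, classes in KNOWN.items() if cls in classes}


def known_physical_not_known_biological():
    """Items known as Physical Objects but not known as Biological Objects.
    This is a statement about the system's knowledge, not about the world."""
    return known_as("Physical Object") - known_as("Biological Object")


def is_known_instance(item, cls):
    """Return True if membership is recorded, otherwise None (unknown).
    Under the Open World Assumption the answer is never False."""
    return True if cls in KNOWN.get(item, set()) else None
```

The query returns `{"item b"}`, yet `is_known_instance("item b", "Biological Object")` yields `None` rather than `False`: the system simply does not know.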

 

complement

The complement of a class A with respect to one of its superclasses B is the set of all instances of B that are not instances of A. Formally, it is the set-theoretic difference of the extension of B minus the extension of A. Compatible extensions of the CRM should not declare any class with the intension of being the complement of one or more other classes. Doing so will normally conflict with the aim of describing an Open World. For example, for all possible cases of human gender, male should not be declared as the complement of female or vice versa. What if someone is both or even of another kind?

 

query containment

Query containment is a problem from database theory: A query X contains another query Y if, for each possible population of a database, the answer set to query X also contains the answer set to query Y. If queries X and Y were classes, then X would be a superclass of Y.
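
Containment is defined over every possible population, so it cannot be established by running queries on sample data. A sample-based check can, however, refute it. The sketch below is such a check and nothing more: the two queries, the sample populations and the function names are assumptions of this illustration, and the samples presuppose that every Person is also recorded as an Actor.

```python
def persons(db):
    """Query Y: all items recorded as Person."""
    return {item for item, classes in db.items() if "Person" in classes}


def actors(db):
    """Query X: all items recorded as Actor."""
    return {item for item, classes in db.items() if "Actor" in classes}


def contained_on_samples(query_x, query_y, populations):
    """Return True if, on every sample population, the answers of query_x
    include the answers of query_y. A False result refutes containment;
    a True result does not prove it."""
    return all(query_y(db) <= query_x(db) for db in populations)


# Sample populations (illustrative data).
SAMPLES = [
    {},
    {"item a": {"Person", "Actor"}},
    {"item a": {"Person", "Actor"}, "item b": {"Actor"}},
]
```

On these samples the Actor query contains the Person query, while the reverse is refuted by the third population.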

 

interoperability

Interoperability means the capability of different information systems to communicate some of their contents. In particular, it may mean that

  1.  two systems can exchange information, and/or
  2.  multiple systems can be accessed with a single method.

 

Generally, syntactic interoperability is distinguished from semantic interoperability. Syntactic interoperability means that the information encoding of the involved systems and the access protocols are compatible, so that information can be processed as described above without error. However, this does not mean that each system processes the data in a manner consistent with the intended meaning. For example, one system may use a table called “Actor” and another one called “Agent”. With syntactic interoperability, data from both tables may only be retrieved as distinct, even though they may have exactly the same meaning. To overcome this situation, semantic interoperability has to be added. The CRM relies on existing syntactic interoperability and is concerned only with adding semantic interoperability.
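
The “Actor” and “Agent” example can be sketched as follows. The fragment assumes two systems whose tables can already be read (syntactic interoperability) and adds a mapping of each table to a shared concept; the system names, the records and the mapping are hypothetical.

```python
# Two systems with differently named tables (illustrative data).
SYSTEM_ONE = {"Actor": ["actor record 1", "actor record 2"]}
SYSTEM_TWO = {"Agent": ["agent record 1"]}

# Hypothetical mapping of (system, table) to a shared concept.
SHARED_CONCEPT = {
    ("system one", "Actor"): "Actor",
    ("system two", "Agent"): "Actor",
}


def merge(systems):
    """Merge the rows of all tables that map to the same shared concept."""
    merged = {}
    for system_name, tables in systems.items():
        for table, rows in tables.items():
            concept = SHARED_CONCEPT[(system_name, table)]
            merged.setdefault(concept, []).extend(rows)
    return merged


result = merge({"system one": SYSTEM_ONE, "system two": SYSTEM_TWO})
```

Without the mapping, the two tables could only be retrieved as distinct; with it, their rows appear under one concept.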

 

semantic interoperability

Semantic interoperability means the capability of different information systems to communicate information consistent with the intended meaning. In more detail, the intended meaning encompasses

  1. the data structure elements involved,
  2. the terminology appearing as data and
  3. the identifiers used in the data for factual items such as places, people, objects etc.

 

Obviously communication about data structure must be resolved first. In this case consistent communication means that data can be transferred between data structure elements with the same intended meaning or that data from elements with the same intended meaning can be merged. In practice, the different levels of generalization in different systems do not allow the achievement of this ideal. Therefore semantic interoperability is regarded as achieved if elements can be found that provide a reasonably close generalization for the transfer or merge. This problem is being studied theoretically as the query containment problem. The CRM is only concerned with semantic interoperability on the level of data structure elements.

 

property quantifiers

We use the term property quantifiers for the declaration of the allowed number of instances of a certain property that an instance of its range or domain may have. These declarations are ontological, i.e. they refer to the nature of the real world described and not to our current knowledge. For example, each person has exactly one father, but collected knowledge may refer to none, one or many.

 

universal

The fundamental ontological distinction between universals and particulars can be informally understood by considering their relationship with instantiation: particulars are entities that have no instances in any possible world; universals are entities that do have instances. Classes and properties (corresponding to predicates in a logical language) are usually considered to be universals. (after Gangemi et al. 2002, pp. 166-181).

 

Property Quantifiers

Quantifiers for properties are provided for the purpose of semantic clarification only, and should not be treated as implementation recommendations. The CRM has been designed to accommodate alternative opinions and incomplete information, and therefore all properties should be implemented as optional and repeatable for their domain and range (“many to many (0,n:0,n)”). Therefore the term “cardinality constraints” is avoided here, as it typically pertains to implementations.
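
The numeric notation can be read mechanically, which may help when quantifiers are kept as annotations rather than as constraints. The sketch below assumes, in line with the table entries in this section, that the pair before the colon gives the minimum and maximum for the domain and the pair after it gives those for the range, with “n” meaning unbounded. The type and function names are hypothetical, and nothing here enforces a quantifier on stored data.

```python
import re
from typing import NamedTuple, Optional


class Quantifier(NamedTuple):
    domain_min: int
    domain_max: Optional[int]  # None stands for "n" (unbounded)
    range_min: int
    range_max: Optional[int]   # None stands for "n" (unbounded)


_PATTERN = re.compile(r"\((\d+),(\d+|n):(\d+),(\d+|n)\)")


def parse_quantifier(text):
    """Parse a numeric quantifier such as "(0,n:0,1)"."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a property quantifier: {text!r}")
    dmin, dmax, rmin, rmax = match.groups()

    def bound(value):
        return None if value == "n" else int(value)

    return Quantifier(int(dmin), bound(dmax), int(rmin), bound(rmax))


def describe(quantifier):
    """Explain a quantifier in the plain words used in this section."""
    def side(minimum, maximum):
        need = "necessary" if minimum >= 1 else "optional"
        repeat = "repeatable" if maximum is None or maximum > 1 else "not repeatable"
        return f"{need} and {repeat}"

    return {
        "domain": side(quantifier.domain_min, quantifier.domain_max),
        "range": side(quantifier.range_min, quantifier.range_max),
    }
```

For example, `describe(parse_quantifier("(0,n:0,n)"))` reports both sides as optional and repeatable, which is how all properties should be implemented regardless of their declared quantifier.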

 

The following table lists all possible property quantifiers occurring in this document by their notation, together with an explanation in plain words. In order to provide optimal clarity, two widely accepted notations are used redundantly in this document, a verbal and a numeric one. The verbal notation uses phrases such as “one to many”, and the numeric one, expressions such as “(0,n:0,1)”. While the terms “one”, “many” and “necessary” are quite intuitive, the term “dependent” denotes a situation where a range instance cannot exist without an instance of the respective property. In other words, the property is “necessary” for its range.

 

 

many to many (0,n:0,n)

Unconstrained: An individual domain instance and range instance of this property can have zero, one or more instances of this property. In other words, this property is optional and repeatable for its domain and range.

one to many (0,n:0,1)

 

An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is optional for its domain and range, but repeatable for its domain only. In some contexts this situation is called a “fan-out”.

many to one (0,1:0,n)

An individual domain instance of this property can have zero or one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is optional for its domain and range, but repeatable for its range only. In some contexts this situation is called a “fan-in”.

 

many to many, necessary (1,n:0,n)

An individual domain instance of this property can have one or more instances of this property, but an individual range instance can have zero, one or more instances of this property. In other words, this property is necessary and repeatable for its domain, and optional and repeatable for its range.

 

one to many, necessary (1,n:0,1)

 

An individual domain instance of this property can have one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is necessary and repeatable for its domain, and optional but not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

many to one, necessary (1,1:0,n)

An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and optional and repeatable for its range. In some contexts this situation is called a “fan-in”.

 

one to many, dependent (0,n:1,1)

 

An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is optional and repeatable for its domain, but necessary and not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

one to many, necessary, dependent (1,n:1,1)

An individual domain instance of this property can have one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is necessary and repeatable for its domain, and necessary but not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

many to one, necessary, dependent (1,1:1,n)

An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and necessary and repeatable for its range. In some contexts this situation is called a “fan-in”.

 

one to one (1,1:1,1)

An individual domain instance and range instance of this property must have exactly one instance of this property. In other words, this property is necessary and not repeatable for its domain and for its range.

 

The CRM defines some properties as being necessary for their domain, or as having a dependent range, following the definitions in the table above. Note that if such a property is not specified for an instance of the respective domain or range, this means that the property exists but the value on one side of the property is unknown. In the case of optional properties, the methodology proposed by the CRM does not distinguish between a value being unknown and the property not being applicable at all. For example, one may know that an object has an owner, but not who the owner is. In a CRM instance this case cannot be distinguished from the case in which the object has no owner at all. Of course, such details can always be specified by a textual note.
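As a reading aid, the numeric notation can be checked mechanically. The following Python sketch assumes that the notation is read as (domain minimum, domain maximum : range minimum, range maximum), as in the table above. The function names parse_quantifier and violations are illustrative and are not part of the CRM. In keeping with the remarks above, the sketch is meant for semantic clarification only, not as a constraint to be enforced by an implementation.

```python
from collections import Counter


def parse_quantifier(text):
    """Parse '(0,n:0,1)' into ((dom_min, dom_max), (rng_min, rng_max)).

    An upper bound of 'n' (unbounded) is returned as None.
    """
    domain_part, range_part = text.strip("() ").split(":")

    def bounds(part):
        low, high = part.split(",")
        return int(low), (None if high.strip() == "n" else int(high))

    return bounds(domain_part), bounds(range_part)


def violations(quantifier, domain_items, range_items, links):
    """List (side, item, count) for items whose link count is outside the quantifier.

    `links` holds one (domain_item, range_item) pair per property instance.
    """
    (d_min, d_max), (r_min, r_max) = parse_quantifier(quantifier)
    per_domain = Counter(d for d, _ in links)
    per_range = Counter(r for _, r in links)
    found = []
    for side, items, counts, low, high in (
        ("domain", domain_items, per_domain, d_min, d_max),
        ("range", range_items, per_range, r_min, r_max),
    ):
        for item in items:
            count = counts[item]
            if count < low or (high is not None and count > high):
                found.append((side, item, count))
    return found


# Two hypothetical domain instances reference the same range instance
# under "one to many (0,n:0,1)".
print(violations("(0,n:0,1)", ["a", "b"], ["x"], [("a", "x"), ("b", "x")]))
# prints [('range', 'x', 2)]
```

For a property that is necessary for its domain, an item reported with a count of zero is better read as "value unknown" than as an error, for the reason given in the preceding paragraph.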

Naming Conventions

The following naming conventions have been applied throughout the CRM:

 

·       Classes are identified by numbers preceded by the letter “E” (historically classes were sometimes referred to as “Entities”), and are named using noun phrases (nominal groups) using title case (initial capitals). For example, E63 Beginning of Existence.

·       Properties are identified by numbers preceded by the letter “P,” and are named in both directions using verbal phrases in lower case. Properties with the character of states are named in the present tense, such as “has type”, whereas properties related to events are named in past tense, such as “carried out.” For example, P126 employed (was employed by).

·       Property names should be read in their non-parenthetical form for the domain-to-range direction, and in parenthetical form for the range-to-domain direction.

·       Properties with a range that is a subclass of E59 Primitive Value (such as E1 CRM Entity. P3 has note: E62 String, for example) have no parenthetical name form, because reading the property name in the range-to-domain direction is not regarded as meaningful.

·       Properties that have identical domain and range are either symmetric or transitive. Instantiating a symmetric property implies that the same relation holds for both the domain-to-range and the range-to-domain directions. An example of this is E53 Place. P122 borders with: E53 Place. The names of symmetric properties have no parenthetical form, because reading in the range-to-domain direction is the same as the domain-to-range reading. Transitive asymmetric properties, such as E4 Period. P9 consists of (forms part of): E4 Period, have a parenthetical form that relates to the meaning of the inverse direction.

·       The choice of the domain of properties, and hence the order of their names, is established in accordance with the following priority list:

·       Temporal Entity and its subclasses

·       Stuff and its subclasses

·       Actor and its subclasses

·       Other
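The naming conventions above are regular enough to be read by a program. The following Python sketch splits a declaration written in the style used in this document ("domain class. property name (inverse name): range class") into its parts and reads it in either direction. The names PropertyDeclaration, parse_declaration and read, as well as the regular expression, are assumptions made for illustration and are not defined by the CRM.

```python
import re
from typing import NamedTuple, Optional


class PropertyDeclaration(NamedTuple):
    domain_class: str
    identifier: str
    name: str
    inverse_name: Optional[str]  # None when there is no parenthetical form
    range_class: str


# "E4 Period. P9 consists of (forms part of): E4 Period"
_PATTERN = re.compile(
    r"^(E\d+ [^.]+)\. (P\d+) ([^(:]+?)(?: \(([^)]+)\))?: (E\d+ .+)$"
)


def parse_declaration(text):
    """Split a declaration string into domain, identifier, names and range."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a property declaration: {text!r}")
    return PropertyDeclaration(*match.groups())


def read(declaration, direction="forward"):
    """Read a declaration domain-to-range ('forward') or range-to-domain ('inverse')."""
    d = declaration
    if direction == "forward":
        return f"{d.domain_class} {d.name} {d.range_class}"
    if d.inverse_name is not None:
        return f"{d.range_class} {d.inverse_name} {d.domain_class}"
    if d.domain_class == d.range_class:
        # Treated as symmetric: the same name reads in both directions.
        return f"{d.range_class} {d.name} {d.domain_class}"
    raise ValueError(f"{d.identifier} has no meaningful range-to-domain reading")


period = parse_declaration("E4 Period. P9 consists of (forms part of): E4 Period")
print(read(period, "inverse"))  # prints E4 Period forms part of E4 Period
```

The sketch treats any property without a parenthetical form and with identical domain and range as symmetric, which follows the convention stated above but is not checked against the property definitions themselves.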

 

Modelling Principles

 

The following modelling principles have guided and informed the development of the CIDOC CRM.

Monotonicity

Because the CRM’s primary role is the meaningful integration of information in an Open World, it aims to be monotonic in the sense of Domain Theory. That is, the existing CRM constructs and the deductions made from them must always remain valid and well-formed, even as new constructs are added by extensions to the CRM.

 

For example:

One may add a subclass of E7 Activity to describe the practice of an instance of E74 Group of using a certain name for a place over a certain time-span. This extension compromises no existing IsA relationships or property inheritances.

 

In addition, the CRM aims to enable the formal preservation of monotonicity when augmenting a particular CRM compatible system. That is, existing CRM instances, their properties and deductions made from them, should always remain valid and well-formed, even as new instances, regarded as consistent by the domain expert, are added to the system.

 

For example:

If someone describes correctly that an item is an instance of E19 Physical Object, and later it is correctly characterized as an instance of E20 Biological Object, the system should not stop treating it as an instance of E19 Physical Object.

 

In order to formally preserve monotonicity for the frequent cases of alternative opinions, all formally defined properties should be implemented as unconstrained (many:many) so that conflicting instances of properties are merely accumulated. Thus knowledge integrated following the CRM serves as a research base, accumulating relevant alternative opinions around well-defined entities, whereas conclusions about the truth are the task of open-ended scientific or scholarly hypothesis building.

 

For example:

El Greco and even King Arthur should always remain instances of E21 Person and be dealt with as existing within the sense of our discourse, once they are entered into our knowledge base. Alternative opinions about properties, such as their birthplaces and their living places, should be accumulated without validity decisions being made during data compilation.
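The two aspects of monotonicity described in this section can be illustrated with a small add-only store. The following Python sketch is a minimal illustration under stated assumptions: the class KnowledgeBase, its method names, the property label "birthplace" and the place values are hypothetical, and the superclass table contains only the one IsA relationship used in the example above.

```python
class KnowledgeBase:
    """An add-only store: statements are accumulated and never retracted."""

    def __init__(self, superclasses):
        self._superclasses = superclasses  # class -> list of direct superclasses
        self._types = {}                   # item -> set of asserted classes
        self._statements = set()           # (subject, property, value) triples

    def assert_type(self, item, cls):
        self._types.setdefault(item, set()).add(cls)

    def assert_property(self, subject, prop, value):
        self._statements.add((subject, prop, value))

    def classes_of(self, item):
        """All asserted classes of an item plus their superclasses."""
        pending = list(self._types.get(item, ()))
        seen = set()
        while pending:
            cls = pending.pop()
            if cls not in seen:
                seen.add(cls)
                pending.extend(self._superclasses.get(cls, ()))
        return seen

    def values(self, subject, prop):
        """All recorded values, including mutually conflicting ones."""
        return sorted(v for s, p, v in self._statements if s == subject and p == prop)


kb = KnowledgeBase({"E20 Biological Object": ["E19 Physical Object"]})
kb.assert_type("item 1", "E19 Physical Object")
kb.assert_type("item 1", "E20 Biological Object")  # a later, more specific classification
print("E19 Physical Object" in kb.classes_of("item 1"))  # prints True

# Alternative opinions (hypothetical values) are accumulated, not resolved.
kb.assert_property("person 1", "birthplace", "Place A")
kb.assert_property("person 1", "birthplace", "Place B")
print(kb.values("person 1", "birthplace"))  # prints ['Place A', 'Place B']
```

The store deliberately offers no method for deleting or overwriting statements, so anything that could be deduced before an addition can still be deduced after it.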

Minimality

Although the scope of the CRM is very broad, the model itself is constructed as economically as possible.

 

·       A class is not declared unless it is required as the domain or range of a property not appropriate to its superclass, or it is a key concept in the practical scope.

·       CRM classes and properties that share a superclass are non-exclusive by default. For example, an object may be both an instance of E20 Biological Object and E22 Man-made Object.

·       CRM classes and properties are either primitive, or they are key concepts in the practical scope.

·       Complements of CRM classes are not declared.

Shortcuts

Some properties are declared as shortcuts of longer, more comprehensively articulated paths that connect the same domain and range classes as the shortcut property via one or more intermediate classes. For example, the property E18 Physical Stuff. P52 has current owner (is current owner of): E39 Actor, is a shortcut for a fully articulated path from E18 Physical Stuff through E8 Acquisition to E39 Actor. An instance of the fully articulated path always implies an instance of the shortcut property. However, the converse does not hold in general: an instance of the fully articulated path cannot always be inferred from an instance of the shortcut property.
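The one-directional inference described above can be sketched as follows. This Python example is a minimal illustration, not a definitive implementation: the function name implied_shortcuts is hypothetical, and the two path labels "changed ownership through" and "transferred title to" are assumed names for the steps through E8 Acquisition. The sketch also ignores time, so with several successive acquisitions it would not tell which owner is the current one.

```python
def implied_shortcuts(triples, first_step, second_step, shortcut):
    """Derive (a, shortcut, c) for every path a -first_step-> b -second_step-> c."""
    targets = {}
    for subject, prop, obj in triples:
        if prop == second_step:
            targets.setdefault(subject, set()).add(obj)
    return {
        (subject, shortcut, target)
        for subject, prop, intermediate in triples
        if prop == first_step
        for target in targets.get(intermediate, ())
    }


# A fully articulated path through a hypothetical acquisition event.
full_path = {
    ("painting 1", "changed ownership through", "acquisition 1"),
    ("acquisition 1", "transferred title to", "museum 1"),
}
print(implied_shortcuts(
    full_path, "changed ownership through", "transferred title to",
    "P52 has current owner",
))
# prints {('painting 1', 'P52 has current owner', 'museum 1')}
```

Starting from the shortcut triple alone, the same function returns an empty set: nothing in the shortcut identifies the intermediate acquisition, which is why the full path cannot be reconstructed from it.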

 

The class E13 Attribute Assignment allows for the documentation of how the assignment of any property came about, and whose opinion it was, even in cases of properties not explicitly characterized as “shortcuts”.

Disjointness

Classes are disjoint if they share no common instances in any possible world. There are many examples of disjoint classes in the CRM.

 

A comprehensive declaration of all possible disjoint class combinations afforded by the CRM has not been provided here; it would be of questionable practical utility, and may easily become inconsistent with the goal of providing a concise definition. However, there are two key examples of disjoint class pairs that are fundamental to effective comprehension of the CRM:

 

·       E2 Temporal Entity is disjoint from E77 Persistent Item. Instances of the class E2 Temporal Entity are perdurants, whereas instances of the class E77 Persistent Item are endurants. Even though instances of E77 Persistent Item have a limited existence in time, they are fundamentally different in nature from instances of E2 Temporal Entity, because they preserve their identity between events. Declaring endurants and perdurants as disjoint classes is consistent with the distinctions made in data structures that fall within the CRM’s practical scope.

·       E18 Physical Stuff is disjoint from E28 Conceptual Object. The distinction is between material and immaterial items, the latter being exclusively man-made. Instances of E18 Physical Stuff and E28 Conceptual Object differ in many fundamental ways; for example, the production of instances of E18 Physical Stuff implies the incorporation of physical material, whereas the production of instances of E28 Conceptual Object does not. Similarly, instances of E18 Physical Stuff cease to exist when destroyed, whereas an instance of E28 Conceptual Object perishes when it is forgotten or its last physical carrier is destroyed.
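The two disjoint pairs above lend themselves to a simple consistency check on class assignments. The following Python sketch assumes a small, partial excerpt of the class hierarchy (only the IsA links needed for the example); the names SUPERCLASSES, DISJOINT_PAIRS, ancestors and conflicts are illustrative and not part of the CRM.

```python
# Partial hierarchy, assumed for illustration only.
SUPERCLASSES = {
    "E5 Event": ["E4 Period"],
    "E4 Period": ["E2 Temporal Entity"],
    "E19 Physical Object": ["E18 Physical Stuff"],
}

# The two key disjoint pairs named in the text.
DISJOINT_PAIRS = [
    ("E2 Temporal Entity", "E77 Persistent Item"),
    ("E18 Physical Stuff", "E28 Conceptual Object"),
]


def ancestors(cls):
    """The class itself together with all of its superclasses."""
    pending, seen = [cls], set()
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(SUPERCLASSES.get(current, ()))
    return seen


def conflicts(asserted_classes):
    """Disjoint pairs that a single item would fall under simultaneously."""
    closure = set()
    for cls in asserted_classes:
        closure |= ancestors(cls)
    return [pair for pair in DISJOINT_PAIRS if pair[0] in closure and pair[1] in closure]


print(conflicts({"E5 Event", "E77 Persistent Item"}))
# prints [('E2 Temporal Entity', 'E77 Persistent Item')]
```

An item classified only as E19 Physical Object yields an empty list, since none of its superclasses in this excerpt belongs to a disjoint pair together with another of its classes.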

About Types

Virtually all structured descriptions of museum objects begin with a unique object identifier and information about the “type” of the object, often in a set of fields with names like “Object Type,” “Object Name,” “Category,” “Classification,” etc. All these fields are used for terms that declare that the object is a member of a particular class or category of items, and are described by the CRM as instances of E55 Type. Since the instances of this class are themselves classes, E55 Type is in fact a metaclass.

 

The class E1 CRM Entity is the domain of the property P2 has type (is type of), which has the range E55 Type. Consequently, every class in the CRM, with the exception of E59 Primitive Value, inherits the property P2 has type (is type of). This provides a general mechanism for refining the classification of CRM instances to any level of detail, by linking to external vocabulary sources, thesauri, classification schema or ontologies that function as extensions to the CRM class and property hierarchies. The external vocabularies do not themselves fall within the scope of the CRM.

 

The class E55 Type also serves as the range of properties that relate to categorical knowledge commonly found in cultural documentation. For example, the property P125 used object of type (was type of object used in) enables the CRM to express statements such as “this casting was produced using a mould”, meaning that there has been an unknown or unmentioned instance of “mould” that was actually used. This enables the specific instance of the casting to be associated with the entire class of manufacturing devices known as moulds. Further, the objects of type “mould” would be related via P2 has type (is type of) to this term. This indirect relationship may actually help in detecting the unknown object in an integrated environment. On the other hand, some castings may refer directly to a known mould via P16 used specific object (was used for). A statistical question such as how many objects in a certain collection were made with moulds can therefore be answered correctly by following both paths: P16 used specific object (was used for) followed by P2 has type (is type of), and P125 used object of type (was type of object used in). This consistent treatment of categorical knowledge significantly enhances the CRM’s ability to integrate cultural knowledge.
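The statistical question in the preceding paragraph can be sketched as a query over simple triples. In this Python example the subjects are taken to be casting activities, and linking each activity to the object it produced is left out for brevity. The function name used_object_of_type and all instance identifiers are hypothetical; only the three property names come from the text.

```python
def used_object_of_type(triples, type_term):
    """Subjects that used an object of the given type, found by either path."""
    typed = {(s, o) for s, p, o in triples if p == "P2 has type"}
    found = set()
    for subject, prop, obj in triples:
        if prop == "P125 used object of type" and obj == type_term:
            # Direct statement about the type of an unknown object.
            found.add(subject)
        elif prop == "P16 used specific object" and (obj, type_term) in typed:
            # Known object whose type is recorded separately.
            found.add(subject)
    return found


triples = [
    ("casting A", "P125 used object of type", "mould"),
    ("casting B", "P16 used specific object", "mould 17"),
    ("mould 17", "P2 has type", "mould"),
    ("casting C", "P16 used specific object", "ladle 3"),
    ("ladle 3", "P2 has type", "ladle"),
]
print(sorted(used_object_of_type(triples, "mould")))
# prints ['casting A', 'casting B']
```

Following only one of the two paths would undercount: "casting A" is reachable only through P125, and "casting B" only through P16 followed by P2.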

 

Some properties in the CRM are associated with an additional property. These are numbered in the CRM documentation with a ".1" extension. These do not appear in the property hierarchy list but are included as part of the property declarations and referred to in the class declarations. For example, P62.1 mode of depiction: E55 Type is associated with E24 Physical Man-made Stuff. P62 depicts (is depicted by): E1 CRM Entity. The range of these properties of properties always falls within the type hierarchy E55 Type. Their purpose is to allow dynamic extensions to their parent property through the use of property subtypes declared as instances of E55 Type. This function is analogous to that of the P2 has type (is type of) property, which all CRM classes inherit from E1 CRM Entity. System implementations and schemas that do not support properties of properties may use dynamic subtyping of the parent properties instead.

 

Finally, types play a central role in the history of human understanding; they are intellectual products, and documentation about the history and justification by physical evidence of types (particularly in disciplines such as archaeology and natural history) falls squarely within the intended scope of the CRM. Therefore types are modelled as “conceptual objects,” in parallel to their structural role as metaclasses. This approach elegantly addresses the dual nature of types in a manner consistent with material culture and natural history documentation.

Extensions

Since the intended scope of the CRM is a subset of the “real” world and is therefore potentially infinite, the model has been designed to be extensible through the linkage of compatible external type hierarchies.

 

Compatibility of extensions with the CRM means that data structured according to an extension must also remain valid as a CRM instance. In practical terms, this implies query containment: any queries based on CRM concepts should retrieve a result set that is correct according to the CRM’s semantics, regardless of whether the knowledge base is structured according to the CRM’s semantics alone, or according to the CRM plus compatible extensions. For example, a query such as “list all events” should recall 100% of the instances deemed to be events by the CRM, regardless of how they are classified by the extension.

 

A sufficient condition for the compatibility of an extension with the CRM is that CRM classes subsume all classes of the extension, and all properties of the extension are either subsumed by CRM properties, or are part of a path for which a CRM property is a shortcut. Obviously, such a condition can only be tested intellectually.
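Query containment under the sufficient condition above can be illustrated as follows. In this Python sketch, "X1 Excavation" is a hypothetical extension class declared as a subclass of E7 Activity, and the function names is_subsumed_by and instances_of are illustrative. A query for E5 Event should then also retrieve the item classified only by the extension.

```python
def is_subsumed_by(cls, ancestor, superclasses):
    """True if `cls` equals `ancestor` or reaches it through IsA links."""
    pending, seen = [cls], set()
    while pending:
        current = pending.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(superclasses.get(current, ()))
    return False


def instances_of(cls, asserted, superclasses):
    """All items whose asserted class is subsumed by `cls`."""
    return {
        item for item, item_cls in asserted.items()
        if is_subsumed_by(item_cls, cls, superclasses)
    }


superclasses = {
    "E7 Activity": ["E5 Event"],
    "X1 Excavation": ["E7 Activity"],  # hypothetical extension class
}
asserted = {
    "battle 1": "E5 Event",
    "dig 1": "X1 Excavation",
    "vase 1": "E19 Physical Object",
}
print(sorted(instances_of("E5 Event", asserted, superclasses)))
# prints ['battle 1', 'dig 1']
```

If the extension class were not subsumed by any CRM class, "dig 1" would be missing from the result of the CRM-level query, and the extension would fail the compatibility condition in this respect.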

Coverage

Of necessity, some concepts covered by the CRM are less thoroughly elaborated than others: E39 Actor and E30 Right, for example. This is a natural consequence of staying within the CRM’s clearly articulated practical scope in an intrinsically unlimited domain of discourse. These ‘underdeveloped’ concepts can be considered as hooks for compatible extensions.

 

The CRM provides a number of mechanisms to ensure that coverage of the intended scope is complete:

  1. Existing high level classes can be extended, either structurally as subclasses or dynamically using the type hierarchy.
  2. Existing high level properties can be extended, either structurally as subproperties, or in some cases, dynamically, using properties of properties which allow subtyping.
  3. Additional information that falls outside the semantics formally defined by the CRM can be recorded as unstructured data using E1 CRM Entity. P3 has note: E62 String.

 

In mechanisms 1 and 2 the CRM concepts subsume and thereby cover the extensions.

 

In mechanism 3, the information is accessible at the appropriate point in the respective knowledge base. This approach is preferable when detailed, targeted queries are not expected; in general, only those concepts used for formal querying need to be explicitly modelled.

 

Examples

 

fig. 1 reasoning about spatial information

 

The diagram above shows a partial view of the CRM, representing reasoning about spatial information. Five of the main hierarchy branches are included in this view: E39 Actor, E51 Contact Point, E41 Appellation, E53 Place, and E70 Stuff. The relationships between these main classes and their subclasses are shown as branching lines. Properties between classes are shown as green ovals. A ‘shortcut’ property is included in this view: P59 has section (is located on or within) between E53 Place and E19 Physical Object is a shortcut of the path through E46 Section Definition. In some cases the order of priority for property names has been modified in order to facilitate reading the diagram from left to right.

 

As can be seen, an instance of E53 Place is identified by an instance of E44 Place Appellation, which may be an instance of E45 Address, E47 Spatial Coordinates, E48 Place Name, or E46 Section Definition such as ‘basement’, ‘prow’, or ‘lower left-hand corner.’ An instance of E53 Place may consist of or form part of another instance of E53 Place, thereby allowing a hierarchy of physical ‘containers’ to be constructed.

 

An instance of E45 Address can be considered both as an E44 Place Appellation (a way of referring to an E53 Place) and as an E51 Contact Point for an E39 Actor. An E39 Actor may have any number of instances of E51 Contact Point. E18 Physical Stuff is found at locations as a consequence of being created there or being moved there. Therefore the properties P53 has former or current location (is former or current location of) and P55 has current location (currently holds) are regarded as shortcuts of the fully articulated paths through the respective events. P55 has current location (currently holds) is a subproperty of P53 has former or current location (is former or current location of). P53 thus serves as a container for location information in the absence of knowledge about time of validity and related events.

 

An interesting aspect of the model is the P58 has section definition (defines section) property between E46 Section Definition and E18 Physical Stuff (and the corresponding shortcut from E53 Place to E19 Physical Object). This allows an instance of E53 Place to be defined as a section of an instance of E19 Physical Object. For example, we may know that Nelson fell at a particular spot on the deck of H.M.S. Victory, without knowing the exact position of the vessel in geospatial terms at the time of the fatal shooting of Nelson. Similarly, a signature or inscription can be located “in the lower right corner of” a painting, regardless of where the painting is hanging.

 

 

fig. 2 reasoning about temporal information

 

This second example shows how the CRM handles reasoning about temporal information. Four of the main hierarchy branches are included in this view: E2 Temporal Entity, E52 Time-Span, E77 Persistent Item and E53 Place.

 

The E2 Temporal Entity class is an abstract class (i.e. it has no direct instances) that serves to group together all classes with a temporal component, such as E4 Period, E5 Event and E3 Condition State.

 

An instance of E52 Time-Span is simply a temporal interval that does not make any reference to cultural or geographical contexts (unlike instances of E4 Period, which took place at a particular instance of E53 Place). Instances of E52 Time-Span are sometimes identified by instances of E49 Time Appellation, often in the form of E50 Date.

 

Both E52 Time-Span and E4 Period have transitive properties. E52 Time-Span has the transitive property P86 falls within (contains), denoting a purely incidental inclusion, whereas E4 Period has the transitive property P9 consists of (forms part of) that supports the decomposition of instances of E4 Period into their constituent parts. For example, the E52 Time-Span during which a building is constructed might fall within the E52 Time-Span of a particular government, although there is no causal or contextual connection between the two instances of E52 Time-Span; conversely, the E4 Period of the Chinese Song Dynasty consists of the Northern Song Period and the Southern Song Period.
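What transitivity adds in practice can be shown with a small closure computation. The following Python sketch is only an illustration: the function name transitive_closure is hypothetical, and the sub-period of the Northern Song Period is an invented placeholder used to show a derived statement.

```python
def transitive_closure(pairs):
    """All (a, c) pairs derivable by chaining (a, b) and (b, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted instances of P9 consists of (forms part of).
consists_of = {
    ("Song Dynasty", "Northern Song Period"),
    ("Song Dynasty", "Southern Song Period"),
    ("Northern Song Period", "sub-period X (hypothetical)"),
}
derived = transitive_closure(consists_of) - consists_of
print(derived)
# prints {('Song Dynasty', 'sub-period X (hypothetical)')}
```

The same function applies unchanged to instances of P86 falls within (contains); only the interpretation of the pairs differs.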

 

Instances of E52 Time-Span are related to their outer bounds (i.e. their indeterminacy interval) by the property P82 at some time within, and to their inner bounds via the property P81 ongoing throughout. The range of these properties is the E61 Time Primitive class, instances of which are treated by the CRM as application or system specific date intervals that are not further analysed.
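The relationship between outer and inner bounds can be made concrete with a small data structure. The following Python sketch assumes that a time primitive is encoded as a pair of years, which is one possible application-specific choice and not something the CRM prescribes. The class TimeSpan, its field and method names, and the years used are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed encoding of a time primitive: (start_year, end_year), inclusive.
Interval = Tuple[int, int]


@dataclass(frozen=True)
class TimeSpan:
    at_some_time_within: Interval                  # P82: outer bounds
    ongoing_throughout: Optional[Interval] = None  # P81: inner bounds

    def __post_init__(self):
        outer, inner = self.at_some_time_within, self.ongoing_throughout
        if outer[0] > outer[1]:
            raise ValueError("outer bounds are reversed")
        if inner is not None and not (outer[0] <= inner[0] <= inner[1] <= outer[1]):
            raise ValueError("inner bounds must lie within the outer bounds")

    def possibly_covers(self, year):
        """The year lies within the indeterminacy interval."""
        start, end = self.at_some_time_within
        return start <= year <= end

    def certainly_covers(self, year):
        """The year lies within the interval known to be covered."""
        if self.ongoing_throughout is None:
            return False
        start, end = self.ongoing_throughout
        return start <= year <= end


span = TimeSpan(at_some_time_within=(1500, 1520), ongoing_throughout=(1505, 1510))
print(span.possibly_covers(1502), span.certainly_covers(1502))  # prints True False
print(span.possibly_covers(1507), span.certainly_covers(1507))  # prints True True
```

A year outside the outer bounds is excluded, a year inside the inner bounds is confirmed, and anything in between remains undecided on the available evidence.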

Class & Property Hierarchies

Although they do not provide comprehensive definitions, compact monohierarchical presentations of the class and property IsA hierarchies have been found to significantly aid comprehension and navigation of the CRM, and are therefore provided below.

 

The class hierarchy presented below has the following format:

 

 

The property hierarchy presented below has the following format:

 

 


Copyright © 2003 International Council of Museums



[1] The ICOM Statutes provide a definition of the term “museum” at http://icom.museum/statutes.html#2

[2] The Practical Scope of the CIDOC CRM, including a list of the relevant museum documentation standards, is discussed in more detail on the CIDOC CRM website at http://cidoc.ics.forth.gr/scope.html

[3] Information about the Resource Description Framework (RDF) can be found at http://www.w3.org/RDF/