CIDOC CRM Mappings, Specializations and Data Examples
See also the mapping technology page.
Introduction
Data integration is one of the key problems in the development of modern information systems. The exponential growth of the web and the extended use of database management systems have brought to the fore the need for seamless interconnection of diverse and large numbers of information sources. In order to provide uniform access to heterogeneous and autonomous data sources, complex query and integration mechanisms have to be designed and implemented.
An essential matter in heterogeneous database integration is the mapping process. We define the mapping of two schemata as a specification sufficient to transform each instance of schema 1 into an instance of schema 2 with the same meaning, as shown in Figure 1.
Fig 1
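The definition above can be illustrated with a minimal sketch, assuming a flat source record and a list of field-to-property rules. The source field names ("title", "inventory_no") and the rule format are hypothetical and chosen only for illustration; the target labels borrow two CIDOC CRM property names. A real mapping to the CIDOC CRM would produce a graph of entities rather than a flat dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    """One rule of a hypothetical mapping: source field -> target property."""
    source_field: str
    target_property: str


def apply_mapping(record: dict, rules: list) -> dict:
    """Transform one schema-1 instance into a schema-2 instance."""
    target = {}
    for rule in rules:
        if rule.source_field in record:
            # Whitespace is normalised; the meaning of the value is unchanged.
            target[rule.target_property] = record[rule.source_field].strip()
    return target


# Hypothetical rules (assumption: these source fields exist in schema 1).
RULES = [
    FieldRule("title", "P102 has title"),
    FieldRule("inventory_no", "P1 is identified by"),
]

source_instance = {"title": " Epitaphios ", "inventory_no": "GE3460"}
target_instance = apply_mapping(source_instance, RULES)
```

The point of the sketch is that the rule list alone is sufficient to transform any instance of the source schema, which is what the definition asks of a mapping.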
Here we use as the example target schema the CIDOC Conceptual Reference Model (CIDOC CRM), which provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation. Although the CIDOC CRM was created for cultural heritage information, it is adequate for modelling other domains as well.
This page contains information about how to define mappings, tools to support the mapping process and particular mapping definitions between relevant data structures and the CIDOC CRM.
How to define mappings
The following documents contain information on how to define mappings between relevant data structures and the CIDOC CRM.
Haridimos Kondylakis, Martin Doerr, Dimitris Plexousakis, Mapping Language for Information Integration, Technical Report 385, ICS-FORTH, December 2006
Available: pdf file (163 kb)
Martin Doerr, Mapping a Data Structure to the CIDOC Conceptual Reference Model, Heraklion, Crete, April 2, 2002
Available: ppt file (205 kb)
Mapping Tools
XML2RDF Data Transformation Tool: This generic data transformation tool maps XML data files to RDF files, given a schema matching definition based on this Mapping Language Schema (available: xsd file [9 Kb]). The tool was developed by Mary Koutraki and is based on Mapping Language for Information Integration, Technical Report 385, ICS-FORTH, 2006 (see "How to define mappings" above). Based on this tool we have created a mapping to transform the LIDO format (XML files) to the CIDOC CRM (RDF files).
Available as a Java application (with source code):
- latest version on GitHub: XML2RDF-DataTransformation-MappingTool (version 3.3), licensed under an LGPL licence
- previous version: XML2RDF Data Transformation Tool (version 1.1), licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License; it includes the LIDO to CIDOC-CRM mapping as an example
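The general idea of turning XML records into RDF statements under a schema matching definition can be sketched as follows. This is not the XML2RDF tool and does not reproduce its Mapping Language Schema or the LIDO format: the source element names, the MATCHING table, the base URI and the choice of E22 Man-Made Object as the subject class are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source record; element names are illustrative, not LIDO.
SOURCE_XML = """
<record id="obj1">
  <title>Epitaphios</title>
  <description>Embroidered liturgical textile</description>
</record>
"""

# Hypothetical schema matching: source element -> CIDOC CRM property label.
MATCHING = {
    "title": "P102_has_title",
    "description": "P3_has_note",
}


def xml_to_triples(xml_text, matching, base="http://example.org/"):
    """Return (subject, predicate, object) tuples for one source record."""
    root = ET.fromstring(xml_text)
    subject = base + root.get("id")
    # Assumption: every record describes an instance of E22 Man-Made Object.
    triples = [(subject, "rdf:type", "E22_Man-Made_Object")]
    for element_name, crm_property in matching.items():
        for element in root.findall(element_name):
            triples.append((subject, crm_property, (element.text or "").strip()))
    return triples


triples = xml_to_triples(SOURCE_XML, MATCHING)
```

Keeping the matching table separate from the transformation code mirrors the division the tool description makes between the generic transformer and the schema matching definition it is given.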
The following document gives advice on how to formulate schema matching definitions from any schema to the CIDOC CRM:
Martin Doerr, Mapping format for data structures of the CIDOC CRM, July 2001. Available: rtf file [39 Kb], pdf file [135 Kb].
For practical applications you may use the following mapping utility (compatible only with CIDOC CRM version 3.4), which supports manual schema matching definition from any schema to the CIDOC CRM: mapping tool (for a better understanding of the purpose of mappings, see the Data Transformations). The mappings generated by this tool follow this dtd.
Particular Mapping Definitions
On this page we publish schema matching definitions from schemata of cultural information systems to the CIDOC CRM.
The EDM Model
The Europeana (European Digital Library) is currently finalizing its schema, the so-called EDM Model, for the next release to be implemented in 2011. EDM includes some concepts from ORE and from Dublin Core. It denotes its own namespace as "ens:". In its own namespace it includes a series of concepts from the CIDOC CRM, and generalizations over CIDOC CRM concepts for the purpose of highly general queries against a large body of data. We have created, in graphical form, a first draft of a mapping from CRM-FRBRoo to EDM and the Dublin Core properties it reuses. This mapping is complete except for some CRM properties about structuring time and space, which EDM does not deal with or has not yet developed.
Martin Doerr, 02 November 2010
Available: CRM-EDM_FRBR.ppt (116 Kb), CRM-EDM_FRBR properties.ppt (136 Kb)
See also: The Europeana Data Model (EDM)
Updated graphical representation of the harmonized EDM-CRM-FRBRoo-DC-ORE models, Martin Doerr, September 2011
Available: EDM-DC-ORE-CRM-FRBR_Integration.ppt (195 Kb)
Graphical representation of the harmonized EDM-CRM-FRBRoo-DC-ORE-CRMdig models, Martin Doerr, September 2011
Available: EDM-DC-ORE-CRM-FRBR-CRMdig_Integration.ppt file [222Kb]
The LIDO model
The following documents describe the mapping of LIDO-Lightweight Information Describing Objects v0.7 to the CIDOC CRM v5.0.1.
Mary Koutraki, Martin Doerr, Mapping LIDO v0.7 to CIDOC-CRM v5.0.1, FORTH-ICS, March 2010.
Available: doc file (184 Kb), mapping xml file [51 Kb], schema for the mapping language xsd file [9 Kb], LIDO v0.7 xsd file [92 Kb]
The Arachne project
The Arachne project in Cologne is mapping archaeology data to the CIDOC-CRM.
The UNIMARC Bibliographic format
The following documents contain a mapping from the UNIMARC Bibliographic format (2nd edition, Update 4, 2002) to the semantic model CIDOC CRM (version 4.0, April 2004).
Patrick Le Boeuf, Mapping from UNIMARC Bibliographic to CIDOC CRM, April 2006
Available: zip file (105 Kb)
The MIDAS Standard
The following document describes a set of rules on how to map the MIDAS standard to the CIDOC CRM.
Available: doc file (34 kb)
The following document describes the mapping of the MIDAS element set to the CIDOC CRM.
Available: xls file (99 kb)
The Dublin Core Element Set
The following document describes the mapping of the DC-types to the CIDOC CRM version 4.2.2.
Konstantia Kakali, Martin Doerr, Christos Papatheodorou, Thomais Stasinopoulou, "DC.type mapping to CIDOC/CRM", Technical Report, DELOS WP5-Task 5.5 "Ontology Driven Interoperability", January 2007
Available: doc file (440 Kb)
The mapping of the DC-types to the CIDOC CRM version 4.2.2 was also presented at: C. Kakali, I. Lourdi, T. Stasinopoulou, L. Bountouri, C. Papatheodorou, M. Doerr, M. Gergatsoulis, "Integrating Dublin Core Metadata for Cultural Heritage Collections Using Ontologies", Proceedings of the 7th International Conference on Dublin Core and Metadata Applications, DC-2007, Singapore, August 2007, pp. 128-139.
This paper also presents a mapping from the Dublin Core Collection Application Profile (DCCAP) to the CRM.
Available at: http://www.dcmipubs.org/ojs/index.php/pubs/article/view/16/11
The following document describes the mapping of the Dublin Core Element Set to the CIDOC CRM version 2.3 and the Agios Pavlos extensions.
Martin Doerr, Mapping of the Dublin Core Metadata Element Set to the CIDOC CRM, Technical Report FORTH-ICS/TR-274, July 2000. Available: pdf file (184 Kb), rtf file (103 Kb).
The AMICO Data Model
The following document describes the mapping of the AMICO data model to the CIDOC CRM version 3.0 and the Agios Pavlos extensions.
Martin Doerr, Mapping of the AMICO data dictionary to the CIDOC CRM, Technical Report FORTH-ICS/TR-288, June 2000. Available: pdf file (142 Kb), rtf file (81 Kb).
The EAD
The following document describes an updated version of the mapping of the EAD to the CIDOC CRM version 4.2.2.
Thomais Stasinopoulou, Martin Doerr, Christos Papatheodorou, Konstantia Kakali, "EAD mapping to CIDOC/CRM", Technical Report, DELOS WP5-Task 5.5 "Ontology Driven Interoperability"
Available: doc file (1.95 Mb)
The above mapping was also presented at:
Th. Stasinopoulou, L. Bountouri, C. Kakali, I. Lourdi, C. Papatheodorou, M. Doerr, M. Gergatsoulis, "Ontology-based Metadata Integration in the Cultural Heritage Domain", in Proceedings of the 10th International Conference on Asian Digital Libraries, ICADL-2007, Hanoi, Vietnam, December 2007, Lecture Notes in Computer Science (LNCS) No. 4822: Springer-Verlag, 2007, pp. 165-175.
Available at: http://www.springerlink.com/content/k252223528n55127/
The following document describes the mapping of the EAD to the CIDOC CRM version 3.0 and the Agios Pavlos extensions.
Maria Theodoridou & Martin Doerr, Mapping of the Encoded Archival Description DTD Element Set to the CIDOC CRM, Technical Report FORTH-ICS/TR-289, June 2001. Available: pdf file (226 Kb), rtf file (1.2 Mb).
Science Museum of London
The following document describes transformations of sample data of the Science Museum of London to the CIDOC CRM:
Martin Doerr & Iraklis Karvasonas, Converting object documentation into a CRM-compatible XML form using Data Junction 7.5, ICS-FORTH, Heraklion, Greece, May 2001. Available: word file (128 Kb).
Harmony + CIMI Tests
In the framework of the Harmony + CIMI Collaboration: Interoperability and metadata vocabularies (Call for Participation, 3 October 2000), the extended CIDOC CRM has been the basis for data transfer experiments from museum data of 4 organisations to the CIDOC CRM (National Museum of Denmark, AMOL, RLG, The John Clayton Herbarium of The Natural History Museum, London). ICS-FORTH is assisting Harmony in this collaboration with respect to the use of the CIDOC CRM.
- Data sample from the National Museum of Denmark (NMD)
The current schema of the NMD database GENREG is shown graphically in the following images created with ACCESS:
1. 2.
This file shows the ACCESS representation of the data dictionary of the NMD database, with comments about the mapping to the CIDOC CRM (word file 682 Kb). The GENREG model is event-centric. As is reasonable in a relational implementation, it keeps the number of tables small. Therefore the fine-grained distinctions between events made in the CIDOC CRM are expressed by types of events and types of roles. Naturally, there is no built-in mechanism to constrain event types to the allowed roles. Such a service could be implemented using the CIDOC CRM. In this mapping we have not analyzed in depth all types of events used in the NMD database in order to achieve an optimal mapping to the corresponding CIDOC CRM subclasses of Event. The idea of this test was to demonstrate feasibility. It demonstrates that in general a mapping is based on both the input schema and the type definitions used in the input data.
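A type-driven mapping of this kind, together with the role-constraint service mentioned above, might be sketched as follows. The event-type values are hypothetical and are not taken from the GENREG data dictionary; the class and property labels are CIDOC CRM names, but the table of allowed roles is a deliberately small assumption that ignores property inheritance between classes.

```python
# Hypothetical event-type values (not from GENREG) -> CIDOC CRM classes.
EVENT_TYPE_TO_CLASS = {
    "acquisition": "E8 Acquisition",
    "production": "E12 Production",
}
DEFAULT_CLASS = "E5 Event"  # fallback when the event type was not analyzed

# Assumed, incomplete table of role properties allowed per class.
ALLOWED_ROLES = {
    "E8 Acquisition": {"P22 transferred title to", "P23 transferred title from"},
    "E12 Production": {"P14 carried out by", "P108 has produced"},
    "E5 Event": {"P11 had participant"},
}


def crm_class_for(event_type):
    """Choose the CRM class from the event type found in the input data."""
    return EVENT_TYPE_TO_CLASS.get(event_type.strip().lower(), DEFAULT_CLASS)


def role_allowed(event_type, role_property):
    """Check a role against the class implied by the event type."""
    return role_property in ALLOWED_ROLES[crm_class_for(event_type)]
```

For example, `role_allowed("production", "P108 has produced")` is accepted, while the same role on an acquisition event is rejected. The lookup depends on values in the data as well as on the schema, which is the point made in the paragraph above.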
A peculiarity of the NMD data is the default event of classification and measurement: unless otherwise specified, classification is implied in the "use event" (Brug), and measurement in the acquisition event. We have traced these cases and interpreted those events as multiple instantiations of both implied CIDOC CRM classes. Note that we regard a collection as a physical object, similar to a set of chessmen, a bikini, or a set of plates. The argument is that a collection has a total weight, can be destroyed, and shares a common life-cycle. The coming and going of parts is not unusual for other objects either; just look at your computer.
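The default-event rule can be sketched as a function that returns one or two classes per event. This is an illustration under assumptions: the event-type keys other than "brug" are hypothetical, treating the use event as E7 Activity is a guess, and the class labels follow later CIDOC CRM versions rather than the version used in the test.

```python
# Assumed primary class per event type (keys other than "brug" are hypothetical).
PRIMARY_CLASS = {
    "brug": "E7 Activity",            # "use event"; class choice is an assumption
    "acquisition": "E8 Acquisition",
    "classification": "E17 Type Assignment",
    "measurement": "E16 Measurement",
}

# Event type -> (event type it stands in for, class it then also instantiates).
IMPLIED = {
    "brug": ("classification", "E17 Type Assignment"),
    "acquisition": ("measurement", "E16 Measurement"),
}


def crm_classes(event_type, other_event_types):
    """Return all CRM classes one event instantiates.

    other_event_types: types of the other events recorded for the same object.
    """
    key = event_type.strip().lower()
    classes = [PRIMARY_CLASS.get(key, "E5 Event")]
    if key in IMPLIED:
        implied_type, implied_class = IMPLIED[key]
        # The implied class applies only if not otherwise specified.
        if implied_type not in other_event_types:
            classes.append(implied_class)
    return classes
```

When an object has an explicit classification event, the use event keeps a single class; otherwise it is instantiated under both classes.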
Here is the result of the transformation, data from the ethnographic collection of the NMD, with embedded images. This file must be viewed with an XSL-enabled viewer.
NMD sample in CIDOC CRM form: (xml file 273 Kb)*
- Data sample from the Australian Museums On-Line (AMOL)
The schema of the Australian Museums On-Line (AMOL) database is a flat list of attributes shown in: AMOL schema (word file 59 Kb).
The fields of the AMOL schema have only a loose semantic connection to the data in them. They are more on the level of a document structure than of a conceptual model. Therefore a direct mapping of AMOL field semantics to CIDOC CRM notions is not possible, except for a few fields. The Clayton example below is just the opposite: all data fields can be interpreted with high precision, but they provide little structuring. They do, however, allow for completely automatic data transformation to the CIDOC CRM. In a first step we have mapped all such structuring fields of the AMOL data to the CIDOC CRM "has note" property, and the interpretable fields to the respective CIDOC CRM properties.
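The first-step strategy can be sketched as follows: fields with a known interpretation go to a specific property, and everything else is kept as a note. The field names and the INTERPRETABLE table are hypothetical and are not taken from the AMOL schema; "P3 has note", "P102 has title" and "P1 is identified by" are CIDOC CRM property labels.

```python
# Hypothetical interpretable fields (not the actual AMOL field names).
INTERPRETABLE = {
    "title": "P102 has title",
    "object_id": "P1 is identified by",
}


def map_flat_record(record):
    """Map one flat record to (property, value) statements."""
    statements = []
    for field, value in record.items():
        if not value:
            continue  # skip empty fields
        crm_property = INTERPRETABLE.get(field)
        if crm_property is None:
            # Fallback: keep the text as a note, prefixed with its field name
            # so that no information from the source is lost.
            statements.append(("P3 has note", f"{field}: {value}"))
        else:
            statements.append((crm_property, value))
    return statements


statements = map_flat_record(
    {"title": "Gold pan", "object_id": "A123", "history": "Used near Ballarat", "notes": ""}
)
```

The fallback guarantees a complete automatic transformation, at the cost of leaving the free text uninterpreted.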
Here is the result of the automatic transformation, data from the Australian Museums On-Line (AMOL), with embedded images. This file must be viewed with an XSL-enabled viewer.
AMOL sample in CIDOC CRM form, part 1: (xml file 83 Kb)*
AMOL sample in CIDOC CRM form, part 2: (xml file 86 Kb)*
AMOL sample in CIDOC CRM form, part 3: (xml file 95 Kb)*
AMOL sample in CIDOC CRM form, part 4: (xml file 95 Kb)*
In a second step we have analyzed one AMOL record by hand in order to demonstrate that the meanings referred to in these records are completely covered by the CIDOC CRM. A satisfactory automatic transformation of the AMOL data to the CIDOC CRM could be achieved by the use of text parsers, based on heuristics and on comparison with place name and person name authorities, as is usual in data mining and citation index generation. This was, however, beyond the resources we could assign to this test. The complexity of such an analysis could be greatly reduced if a certain discipline of separating person names, organisation names and place names were applied. The field "subject" exhibits a certain object-type-dependent polysemy, which we could have analyzed better. The notion of modelling "subject" in the librarians' sense is an issue still under discussion in the CIDOC CRM.
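As a rough illustration of the heuristic approach described above, which was not carried out in this test, the following sketch finds capitalised word sequences in free text and compares them with small name authorities. The authority contents, the sample sentence and the regular expression are assumptions; E21 Person and E53 Place are CIDOC CRM class labels.

```python
import re

# Toy authority lists; a real system would consult maintained authority files.
PERSON_AUTHORITY = {"jane smith"}
PLACE_AUTHORITY = {"sydney", "melbourne"}

# Heuristic: a name candidate is a run of capitalised words.
CANDIDATE = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def tag_names(text):
    """Return (name, CRM class) pairs for candidates found in an authority."""
    found = []
    for candidate in CANDIDATE.findall(text):
        key = candidate.lower()
        if key in PERSON_AUTHORITY:
            found.append((candidate, "E21 Person"))
        elif key in PLACE_AUTHORITY:
            found.append((candidate, "E53 Place"))
    return found


tags = tag_names("Made by Jane Smith in Sydney about 1900")
```

A candidate such as "Made" is discarded because it matches no authority. Adjacent names without separating words would be merged into one candidate, which shows why a discipline of separating names in the source data would simplify the analysis.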
Here is the result of the transformation by hand of one record from the Australian Museums On-Line (AMOL), with embedded image. This file must be viewed with an XSL-enabled viewer.
AMOL sample in CIDOC CRM form, complete mapping: (xml file 4 Kb)*
- Data sample from The John Clayton Herbarium of The Natural History Museum, London
The schema of the Clayton database is a flat list of attributes shown in: "Clayton schema".
This transformation constitutes the first test of natural history data with the CIDOC CRM. The Clayton example consists of reasoning about object types, their names and types, classification and prototypicality of specimens. The only aspect we could identify that is not already covered by the CIDOC CRM is the "Type Specimen", which could be generalized as a property: E55 Type: has prototype (is prototype of). Otherwise, the events of classification, the distinction between names and types, and the recent (Agios Pavlos Extensions) subordination of E55 Type to CIDOC CRM Entity seemed to us satisfactory to capture this reasoning. We kindly ask natural history experts, and particularly the curators of the Clayton collection, to provide us with feedback on our interpretation of these data.
Here is the first result of the transformation, data from the John Clayton Herbarium, with embedded images. This file must be viewed with an XSL-enabled viewer.
Clayton sample in CIDOC CRM form, part 1: (xml file 77 Kb)*
Clayton sample in CIDOC CRM form, part 2: (xml file 76 Kb)*
Clayton sample in CIDOC CRM form, part 3: (xml file 77 Kb)*
Clayton sample in CIDOC CRM form, part 4: (xml file 81 Kb)*
Specializations of the CIDOC CRM
- Centre for Archaeology (CfA)
Keith May, Ontological modelling and Revelation, The Newsletter of the Historic Environment Records Forum, Issue 6, July 2005
Paul Cripps, Anne Greenhalgh, Dave Fellows, Keith May, David Robinson, Ontological Modelling of the work of the Centre for Archaeology, September 2004
Available: pdf file (207 Kb)
Also available: The CRM Diagram, pdf file (65 Kb)
- FRBR
This is a draft definition of the FRBRoo Model and mapping to the FRBR-ER (version 0.8.1). Editors: Martin Doerr, Patrick Le Boeuf. Contributors: Trond Aalberg, Jérôme Barthélémy, Chryssoula Bekiari, Guillaume Boutard, Günther Görz, Dolores Iorizzo, Max Jacob, Carlos Lamsfus, Mika Nyman, Christian Emil Ore, Allen H. Renear, Richard Smiraglia, Stephen Stead, Maja Žumer. May 2007.
Available: doc file (1.81 Mb), pdf file (807 Kb)
Data Examples
The following examples are based on a representation of CIDOC CRM instances encoded by a simple DTD. This DTD only implements the CIDOC CRM properties, whereas classes are represented as data. It does not enforce correct use of the CIDOC CRM classes and properties.
If it is used correctly, this representation is equivalent to an RDF representation and can be automatically transformed into one. An RDF instance of the CIDOC CRM can be formally validated against the CIDOC CRM model.
For viewing CIDOC CRM instances encoded with this DTD use this XSL file.
- A comprehensive data example for the CRM from the Benaki Museum, with rich comments about form and contents. It employs a non-formalized indented notation that corresponds to version 4.2 of the CIDOC CRM in RDFS.
Ifigenia Dionissiadou & Martin Doerr, Data Example of the CIDOC Conceptual Reference Model - Epitaphios GE3460, September 2007. Available: html file (106 Kb), word file (405 Kb), xml file (17 Kb), pdf file (17 Kb), rdf file (493 Kb; the rdf file is a valid instance of cidoc_v4.2.rdfs).
- This is a data example that is a valid instance of version 5.0.1 of the CIDOC CRM in RDFS.
Ifigenia Dionissiadou & Martin Doerr, Data Example of the CIDOC Conceptual Reference Model - Epitaphios GE3460, September 2007. Available: html file (106 Kb), word file (405 Kb), xml file (17 Kb), pdf file (17 Kb), rdf file (13 Kb).