Mapping Technology
Investment in making cultural and scientific information publicly available on the Web is increasing worldwide. An earlier phase encouraged each information provider from the archival, library, museum and scientific communities to create its own "Web presence". A new phase has now started, in which rich metadata are integrated into very large aggregation services such as the National Digital Library of Taiwan, the European Digital Library "Europeana", the Mellon Foundation funded ResearchSpace Project, the British CLAROS Project, and many more. These recent efforts go well beyond OAI-PMH harvesting of minimal metadata. Ultimately, these services should bring the dream of a global network of knowledge [1] closer to us.
Behind this development stands an architecture in which "information providers" actively deliver content and metadata in various heterogeneous formats. A mapping toolset, often called a "Submission Information Package Creator" after the OAIS reference model for digital preservation, transforms these into a normalized data model such as the CIDOC CRM, and the result is subsequently ingested into an aggregation service.
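The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two provider records, the field names, the http://example.org/ namespace and the in-memory "aggregation service" are all hypothetical, and the CRM identifiers (E22_Man-Made_Object, P102_has_title, E35_Title) are used purely as example target terms.

```python
import xml.etree.ElementTree as ET

# Hypothetical provider records in two different source formats.
MUSEUM_ROW = {"inv_no": "A-17", "object_title": "Bronze helmet"}
LIBRARY_XML = "<record><id>B-42</id><title>Atlas of Crete</title></record>"

BASE = "http://example.org/"  # placeholder namespace, not a real service


def to_triples(identifier, title):
    """Express one record in a small CRM-flavoured normalized form."""
    subject = f"{BASE}object/{identifier}"
    title_node = f"{subject}/title"
    return [
        (subject, "rdf:type", "crm:E22_Man-Made_Object"),
        (subject, "crm:P102_has_title", title_node),
        (title_node, "rdf:type", "crm:E35_Title"),
        (title_node, "rdfs:label", title),
    ]


def map_museum_row(row):
    """Mapping for the (assumed) tabular museum format."""
    return to_triples(row["inv_no"], row["object_title"])


def map_library_xml(xml_text):
    """Mapping for the (assumed) XML library format."""
    root = ET.fromstring(xml_text)
    return to_triples(root.findtext("id"), root.findtext("title"))


def ingest(store, triples):
    """Stand-in for the aggregation service: a plain set of triples."""
    store.update(triples)


aggregate = set()
ingest(aggregate, map_museum_row(MUSEUM_ROW))
ingest(aggregate, map_library_xml(LIBRARY_XML))
```

Both providers end up in the same target form, so the aggregator needs no knowledge of either source format.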
Numerous projects have created such mapping tools again and again, often with public funding, each time claiming to have solved the problem. Up to now, no comprehensive toolset exists that would allow for a mapping service of industrial quality, one that takes into account the quality criteria, the scale, and the social constellations and roles under which current aggregation services aim to run. As a consequence, the integrated data are much poorer than the providers' internal data, fall well short of what could be provided, and are rarely or never updated. The transformed data contain numerous mapping errors and suffer from other quality shortcomings. The mapping process itself causes immense costs.
The main reason for this global failure of tool development is a complete underestimation of the complexity of such a toolset, together with the lack of a reference model that would make generic requirements and a suitable architecture widely known. A complete mapping service consists of quite a number of necessary and optional subservices, each of which can be implemented at widely varying levels of sophistication. Current solutions suffer from the following:
- They are monolithic. Each implementation covers a different subset of the possible functionality, with no chance of integration.
- They do not represent the schema matching information in a way a domain expert could verify.
- They do not allow for switching between XML, RDF and RDBMS support on the source side, and XML and RDF on the target side.
- They do not support incremental changes of source schema, target schema and URI generation policies.
- They do not maintain an automated communication for data cleaning with the provider.
- They do not foresee collaborative work of experts with different roles on the mapping process.
Therefore, the 27th CRM-SIG Meeting in Amersfoort, in November 2012, proposed the creation of a Mapping Reference Model. The model describes the comprehensive functionality in terms of individual components/subservices, which can be implemented independently and communicate with each other through minimal open interfaces. Subservices may be necessary or optional, and each may be developed at different levels of sophistication. A common GUI framework should allow for plugging in and freely combining subservices. We envisage that Open Source tool developers will coordinate their efforts to realize all subservices foreseen by this model.
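To make the idea of independently implemented, pluggable subservices concrete, here is a minimal sketch. It is not the reference model itself: the slot names (data_cleaner, schema_mapper, uri_generator), the record shape and the http://example.org/ namespace are hypothetical, and the "minimal open interface" is assumed to be a plain function from record to record.

```python
from typing import Callable, Dict, List, NamedTuple

Record = Dict[str, str]
Subservice = Callable[[Record], Record]  # assumed minimal interface


class Slot(NamedTuple):
    name: str
    required: bool  # necessary vs. optional subservice


# Hypothetical slots; a real model would define its own components.
SLOTS: List[Slot] = [
    Slot("data_cleaner", required=False),
    Slot("schema_mapper", required=True),
    Slot("uri_generator", required=True),
]


class Pipeline:
    def __init__(self, slots: List[Slot]) -> None:
        self.slots = slots
        self.plugins: Dict[str, Subservice] = {}

    def plug(self, slot_name: str, service: Subservice) -> None:
        """Register an implementation for a known slot."""
        if slot_name not in {slot.name for slot in self.slots}:
            raise KeyError(f"unknown slot: {slot_name}")
        self.plugins[slot_name] = service

    def run(self, record: Record) -> Record:
        """Pass the record through every plugged subservice in slot order."""
        for slot in self.slots:
            service = self.plugins.get(slot.name)
            if service is None:
                if slot.required:
                    raise RuntimeError(f"missing required subservice: {slot.name}")
                continue  # optional slot left empty
            record = service(record)
        return record


pipeline = Pipeline(SLOTS)
pipeline.plug("schema_mapper", lambda r: {"id": r["id"], "P102_has_title": r["title"]})
pipeline.plug("uri_generator", lambda r: {**r, "uri": "http://example.org/object/" + r["id"]})

source = {"id": "A-17", "title": " Bronze helmet "}
without_cleaner = pipeline.run(source)

# An optional subservice can be added later without touching the others.
pipeline.plug("data_cleaner", lambda r: {k: v.strip() for k, v in r.items()})
with_cleaner = pipeline.run(source)
```

Because every subservice shares the same small interface, a simple implementation can later be swapped for a more sophisticated one without changing the rest of the pipeline.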
It was further decided that this activity will be organized by the CRM-SIG and will be open to any interested participant. Comments and expressions of interest should be sent to crm-sig@ics.forth.gr.
On this site we publish all documents related to the progress of this activity.
Minutes available: doc file (87 Kb)
CIDOC CRM Mapping memory
We publish schema matching definitions from schemata of cultural information systems to the CIDOC CRM here.
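As an illustration of what a schema matching definition might look like when it is kept as data that a domain expert can read, and kept separate from the URI generation policy, consider the following sketch. It does not reproduce the format of the definitions published here: the Rule structure, the source paths, both URI policies and the http://example.org/ namespace are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source_path: str  # path in the provider's (hypothetical) schema
    domain: str       # CRM class of the described entity
    prop: str         # CRM property
    range_: str       # CRM class of the value node


RULES = [
    Rule("/object/title", "E22_Man-Made_Object", "P102_has_title", "E35_Title"),
    Rule("/object/inventory_no", "E22_Man-Made_Object", "P1_is_identified_by", "E42_Identifier"),
]


def describe(rule):
    """Render a rule as one line a domain expert can check."""
    return f"{rule.source_path}  =>  {rule.domain} -- {rule.prop} --> {rule.range_}"


def readable_uri(crm_class, local_id):
    """Policy 1: human-readable URIs."""
    return f"http://example.org/{crm_class}/{local_id}"


def opaque_uri(crm_class, local_id):
    """Policy 2: opaque, hash-based URIs."""
    digest = hashlib.sha1(f"{crm_class}:{local_id}".encode()).hexdigest()[:12]
    return f"http://example.org/id/{digest}"


def apply_rules(record_id, values, rules, uri_policy):
    """Apply the matching rules; URIs come only from the given policy."""
    triples = []
    for rule in rules:
        if rule.source_path not in values:
            continue  # path absent in this record
        subject = uri_policy(rule.domain, record_id)
        node = uri_policy(rule.range_, f"{record_id}-{rule.prop}")
        triples.append((subject, rule.prop, node))
        triples.append((node, "rdfs:label", values[rule.source_path]))
    return triples


VALUES = {"/object/title": "Bronze helmet", "/object/inventory_no": "A-17"}
readable = apply_rules("A-17", VALUES, RULES, readable_uri)
opaque = apply_rules("A-17", VALUES, RULES, opaque_uri)

for rule in RULES:
    print(describe(rule))
```

Switching from one URI policy to the other changes the generated identifiers but leaves the matching rules untouched, which is the kind of incremental change a mapping tool has to support.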
List of Contributions
- Component Design for Data Mapping Pipeline
Gerald de Jong, DELVING, 20 October 2013
Available: pdf file (480 Kb)
- A Reference Model for Data Mapping tools
Draft Update August 2013
Available: pdf file (162 Kb)
- Mapping Data to CIDOC-CRM
Gerald de Jong, DELVING, June 2013
Available: pdf file (403 Kb)
- Mapping Process Model
Martin Doerr, FORTH, June 2013
Available: pptx file (107 Kb)
- Mapping from Flat or Hierarchical Metadata Schemas
to a Semantic Web Ontology
Poznan Supercomputing and Networking center
Justyna Walkowska, Marcin Werla
Available: pdf file (1.179 Kb)
- A Reference Model for Data Mapping tools
November 2012
Martin Doerr, Achille Felicetti
Available: docx file (27 Kb)
- STAR - STELLAR Project tools
Hypermedia Research Unit, University of Glamorgan
Available: docx file (14 Kb)
- STELLAR Introduction
Hypermedia Research Unit, University of Glamorgan
Ceri Binding, Douglas Tudhope
Available: ppt file (2.595 Kb)
- OmNom, DM2E's Ingestion Platform
CONCEPT FOR THE RDFIZATION FRAMEWORK DEVELOPED BY WORK PACKAGE 2 OF DM2E
Revision 1, 30.09.2012
Konstantin Baierer
Available: pdf file (626 Kb)
- 3D-COFORM Mapping Tool
Achille Felicetti
VAST-LAB, PIN S.c.R.L., Università degli Studi di Firenze
Available: pdf file (6.226 Kb)