Copyright 2017 Datariver s.r.l.
All rights reserved P. IVA 03276860362
How it works

MOMIS (Mediator EnvirOnment for Multiple Information Sources) is a framework for extracting and integrating information from both structured and semistructured data sources. It also provides query management facilities that accept incoming queries and process them (see the video tutorials).

The MOMIS system is based on a conventional wrapper/mediator architecture, and provides methods and open tools for data management in distributed information systems. The system is developed in Java and the GUI is based on the Eclipse RCP (Rich Client Platform) framework.

[Figure: MOMIS architecture]

The framework consists of a language and several semi-automatic tools:

  • ODLI3 is an object-oriented language with an underlying Description Logic; it is derived from the ODMG standard.
  • Information integration is performed in a semi-automatic way, by exploiting the knowledge in a Common Thesaurus (defined by the framework) and ODLI3 descriptions of source schemas with a combination of clustering techniques and Description Logics. This integration process gives rise to a virtual integrated view of the underlying sources (the Global Schema, also called GS) for which mapping rules and integrity constraints are specified to handle heterogeneity.
  • The Query Manager is the coordinated set of functions that take an incoming query expressed on the Global Schema, decompose it according to the mapping of the GS onto the local data sources relevant to the query, send the subqueries to these data sources, collect the local answer sets, fuse them, perform any residual filtering as necessary, and finally deliver the answer set to the requesting user or application.

GS Generation

The Global Schema is a schema that represents the structure of all the sources and is used to query and integrate their data.
The MOMIS methodology makes it possible to discover semantic relationships among the classes and attributes of the data sources to be integrated. On the basis of these semantic relationships, it is possible to identify similar classes, that is, classes that describe the same or semantically related concepts in different sources.
To this end, affinity coefficients are evaluated for all possible pairs of classes, based on the relationships in the Common Thesaurus, properly strengthened.
Affinity coefficients measure how closely two classes match, based on their names (Name Affinity coefficient) and their structures (Structural Affinity coefficient); the two are combined into the Global Affinity coefficient.
Global affinity coefficients are then used by a hierarchical clustering algorithm to classify classes according to their degree of affinity.
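
The affinity-based clustering described above can be illustrated with a minimal sketch. It is not the MOMIS implementation: the class names, the thesaurus strengths, the use of attribute overlap as a stand-in for structural affinity, the equal weighting of the two coefficients, and the single-link merge rule with a fixed threshold are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical source classes: class name -> set of attribute names.
CLASSES = {
    "S1.Person": {"name", "email", "dept"},
    "S2.Employee": {"name", "email", "salary"},
    "S3.Product": {"code", "price"},
}

# Hypothetical thesaurus: strength of the relationship between two class names.
THESAURUS = {
    frozenset({"S1.Person", "S2.Employee"}): 0.8,
}


def name_affinity(a, b):
    """Strength of the thesaurus relationship between two classes (0 if none)."""
    return THESAURUS.get(frozenset({a, b}), 0.0)


def structural_affinity(a, b):
    """Jaccard overlap of attribute sets, an assumed stand-in for structure."""
    attrs_a, attrs_b = CLASSES[a], CLASSES[b]
    return len(attrs_a & attrs_b) / len(attrs_a | attrs_b)


def global_affinity(a, b, w_name=0.5):
    """Weighted combination of the two coefficients (weights are assumed)."""
    return w_name * name_affinity(a, b) + (1 - w_name) * structural_affinity(a, b)


def link(cluster_a, cluster_b):
    """Single-link affinity: the best affinity between any two members."""
    return max(global_affinity(a, b) for a in cluster_a for b in cluster_b)


def cluster(names, threshold=0.5):
    """Merge the two closest clusters until no pair reaches the threshold."""
    clusters = [frozenset({n}) for n in names]
    while len(clusters) > 1:
        best = max(combinations(clusters, 2), key=lambda pair: link(*pair))
        if link(*best) < threshold:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters


if __name__ == "__main__":
    for group in cluster(CLASSES):
        print(sorted(group))
```

In this toy data the two classes that share a thesaurus relationship and most of their attributes end up in one cluster, while the unrelated class stays on its own. Each resulting cluster is a candidate for a single global class in the Global Schema.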

The Query Manager

The Query Manager is the coordinated set of functions that take an incoming query expressed on the Global Schema, decompose it according to the mapping of the GS onto the local data sources relevant to the query, send the subqueries to these data sources, collect the local answer sets, fuse them, perform any residual filtering as necessary, and finally deliver the answer set to the requesting user or application.

Processing a query expressed on the GS (a global query) consists of the following steps:

  • Query rewriting: the global query is rewritten as an equivalent set of queries expressed on the local sources (local queries).
  • Local query execution: the local queries are sent to the local sources and executed there.
  • Data Fusion and Data Reconciliation: the local answers are fused into the global answer and data conflicts are resolved.
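
The three steps above can be sketched as a small pipeline. This is an illustrative sketch only, not the MOMIS Query Manager: the in-memory sources, the attribute mapping, the join key `name`, and the "first non-null value wins" conflict rule are all hypothetical, and residual filtering is left out.

```python
# Hypothetical local sources, each with its own attribute names.
SOURCES = {
    "hr_db": [{"full_name": "Ada", "mail": "ada@example.org"}],
    "crm": [
        {"name": "Ada", "phone": "555-0100"},
        {"name": "Bob", "phone": "555-0101"},
    ],
}

# Hypothetical mapping of the GS onto the sources:
# source -> {global attribute: local attribute}.
MAPPING = {
    "hr_db": {"name": "full_name", "email": "mail"},
    "crm": {"name": "name", "phone": "phone"},
}


def rewrite(global_attrs):
    """Query rewriting: build one local query per source that can contribute."""
    local_queries = {}
    for source, attr_map in MAPPING.items():
        wanted = {g: attr_map[g] for g in global_attrs if g in attr_map}
        if wanted:
            local_queries[source] = wanted
    return local_queries


def execute(source, wanted):
    """Local execution: project each row and rename to global attribute names."""
    return [
        {g_attr: row[l_attr] for g_attr, l_attr in wanted.items()}
        for row in SOURCES[source]
    ]


def fuse(answers, key):
    """Data fusion: merge rows sharing a key; the first non-null value wins."""
    fused = {}
    for rows in answers:
        for row in rows:
            target = fused.setdefault(row[key], {})
            for attr, value in row.items():
                if target.get(attr) is None:
                    target[attr] = value
    return list(fused.values())


def answer(global_attrs, key="name"):
    """Run a global query: rewrite, execute locally, then fuse the answers."""
    attrs = list(dict.fromkeys([key, *global_attrs]))  # always fetch the key
    local_queries = rewrite(attrs)
    answers = [execute(src, wanted) for src, wanted in local_queries.items()]
    return fuse(answers, key)


if __name__ == "__main__":
    for record in answer(["name", "email", "phone"]):
        print(record)
```

Here the record for "Ada" is assembled from both sources, while "Bob" is known only to one of them and therefore has no `email` attribute in the fused answer. A real reconciliation step would apply explicit resolution rules instead of the simple first-value rule assumed here.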

 


