Data integration is a problem faced by large enterprises and
organizations that need integrated access to distributed data sources. In
recent years several architectures have been proposed to solve this problem,
including federated databases, the mediator architecture and peer
database management systems (PDBMS). The general principle of these solutions is
that, in order to offer integrated access to distributed data, they must
provide semantic mappings between the data sources.
The development of data integration systems, regardless of the architecture
they are based on (e.g. mediation systems, P2P architectures or GRID
infrastructures), poses two major problems: evaluating the quality of the
information offered by these systems and maintaining the semantic
mappings that connect the data sources. Without techniques for quality
evaluation and for the maintenance of the semantic mappings, data integration
systems can become inoperative and obsolete.
Without data quality evaluation, the data offered by these systems
will not be useful as support for decision making. Information about the
freshness and precision of the data is crucial for this task. Moreover, the
environment over which a data integration system is built is not static; it may
evolve frequently. Consequently, to keep the data integration
system alive, it is necessary to dynamically reconsider the semantic links and
adapt them to the changes. Otherwise, the data integration system becomes
progressively useless. The additional cost of this dynamic maintenance of
semantic links may increase dramatically with the volume and frequency of
change events. Data quality evaluation and evolution management are not
independent processes: the evolution of the data sources may frequently cause
changes in data quality.
Therefore, in this project these two problems will be considered together and
integrated solutions will be proposed.
The overall objective of the project is the development of techniques,
algorithms and tools to support evolution and quality management in data
integration systems. Different types of data integration systems will be
considered, including well-structured ones, such as mediation
systems, and less structured ones, such as peer data management systems. Besides the
technical and scientific results, this project will be of fundamental
importance for strengthening the collaboration among the partners and for
fostering new partnerships.
Participants
Laboratoire PRiSM - Université de Versailles - France.
Mokrane Bouzeghoub, Zoubida Kedad.
Centro de Informática – Universidade Federal de Pernambuco - Brazil
Ana Carolina Salgado.
Laboratoire LSIS, Université Paul Cézanne (Aix-Marseille) - France
Omar Boucelma.
LIA – Universidade Federal do Ceará - Brazil
Bernadette Farias.
Instituto de Computación – Facultad de Ingeniería – Universidad de la
República - Uruguay
Raul Ruggia (International Coordinator), Adriana Marotta, Verónika
Peralta.