MATERIAL FOR THE INTEROPERABILITY COURSE
 
 

This page collects links to project pages relevant to the Interoperability course.

Additional information can be found at http://meta2.stanford.edu/people/duschka/projects.html

Any other information is welcome. Please send it to rmotz@fing.edu.uy



 

PROJECTS


HERMES: A Heterogeneous Reasoning and Mediator System

HERMES is a system for semantically integrating different and possibly heterogeneous information sources and reasoning systems. This is accomplished by executing programs, called mediators, written in the HERMES system. Mediators, first proposed by Wiederhold, are guidelines for how information from different sources will be combined and integrated. The HERMES system is based on the theory of Hybrid Knowledge Bases, due to Lu, Nerode and Subrahmanian. In this framework, external information sources are abstracted as domains which execute certain functions with pre-specified input and output types. These domains are accessed in mediators using a logic-based declarative language. This language is based on Annotated Logics, due to Kifer and Subrahmanian, and it provides a powerful and extensible programming environment. The system also provides a uniform environment for the easy addition of new external sources to existing mediators. The system currently runs on Sun Sparc stations (under Unix), as well as on the IBM-PC platform under DOS/Windows 3.1. A graphical user interface has been built on both platforms.
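
As a rough illustration of the mediator idea described above, the following Python sketch treats each external source as a "domain" that exposes functions with fixed input and output types, and a mediator as a program that combines their answers. This is not the HERMES language (which is logic-based and declarative); the domain names, function names and sample data are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical "domain": an external source seen only through named
# functions with a fixed input type (str) and output type (List[str]).
Domain = Dict[str, Callable[[str], List[str]]]


def make_relational_domain() -> Domain:
    # Stand-in for a relational database of departments (invented data).
    employees = {"sales": ["ana", "luis"], "research": ["marta"]}
    return {"members": lambda dept: employees.get(dept, [])}


def make_text_domain() -> Domain:
    # Stand-in for a loosely structured phone list (invented data).
    phones = {"ana": ["555-0101"], "marta": ["555-0199"]}
    return {"phones": lambda name: phones.get(name, [])}


def mediator(dept: str, domains: Dict[str, Domain]) -> Dict[str, List[str]]:
    """Combine two domains: phone numbers of every member of a department."""
    result: Dict[str, List[str]] = {}
    for name in domains["db"]["members"](dept):
        result[name] = domains["text"]["phones"](name)
    return result


domains = {"db": make_relational_domain(), "text": make_text_domain()}
directory = mediator("sales", domains)
print(directory)
```

Adding a new source in this sketch only means registering another entry in `domains`; the mediator itself is the only place that knows how the answers are combined.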

Recommended publication> http://www.cs.umd.edu//projects/hermes/publications/postscripts/tois.ps
 

The SIMS Group of Projects at ISI

The SIMS Group consists of several related research projects in ISI's Intelligent Systems Division that are investigating different aspects of the problem of retrieving and integrating data distributed over multiple heterogeneous information sources. The group began by working with well-structured databases, and has expanded over the years to deal also with more loosely structured text sources and Web pages.

Recommended publication>

José Luis Ambite and Craig A. Knoblock
  Reconciling Distributed Information Sources
    Working Notes of the AAAI Spring Symposium on Information Gathering in
    Distributed Heterogeneous Environments, Palo Alto, CA, 1995.

 
TSIMMIS

As an acronym, TSIMMIS stands for "The Stanford-IBM Manager of Multiple Information Sources." In addition,
TSIMMIS is a Yiddish word for a stew with "heterogeneous" fruits and vegetables integrated into a surprisingly tasty whole.

Recommended publications> see http://www-db.stanford.edu/tsimmis/publications.html

J. Hammer, H. Garcia-Molina, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. "Information Translation, Mediation, and Mosaic-Based Browsing in the TSIMMIS System". In Exhibits Program of the Proceedings of the ACM SIGMOD International Conference on Management of Data, page 483, San Jose, California, June 1995.

J. Hammer, M. Breunig, H. Garcia-Molina, S. Nestorov, V. Vassalos, R. Yerneni. "Template-Based Wrappers in the TSIMMIS System". In Proceedings of the Twenty-Sixth SIGMOD International Conference on Management of Data, Tucson, Arizona, May 12-15, 1997.

Chen Li, Ramana Yerneni, Vasilis Vassalos, Hector Garcia-Molina, Yannis Papakonstantinou, Jeffrey Ullman, Murty Valiveti. "Capability Based Mediation in TSIMMIS". SIGMOD 98 Demo, Seattle, June 1998.
 
 

On Answering Queries in the Presence of Limited Access Patterns

 
 
Stanford Digital Library Technologies

The Stanford Digital Library Technologies Project was initiated in July as part of the federally funded Digital Library Initiative Phase 2. The goal of this project is to design and implement the infrastructure and services needed for collaboratively creating, disseminating, sharing and managing information in a digital library context.

Recommended publications>

Mind Your Vocabulary: Query Mapping Across Heterogeneous Information Sources
  Chen-Chuan K. Chang, Hector Garcia-Molina
  ACM SIGMOD 1999 (12 pp)

Approximate Query Translation Across Heterogeneous Information Sources
  Chen-Chuan K. Chang, Hector Garcia-Molina
  Stanford University Technical Report No. SIDL-WP-1999-011
  (38 pp) (Extended version of "Mind Your Vocabulary...")

 
 
The Garlic Project

 
 

The goal of Garlic is to enable large-scale multimedia information systems: large scale in that they involve lots of data, with multimedia taken as broadly as possible to mean data of many types. We are particularly concerned about situations in which there is enough data of sufficiently specialized types that users have already made decisions about how to manage it, and have stored it in separate repositories that are specifically adapted to data of that type.

Recommended publications>

Data Engineering Bulletin '99: Transforming Heterogeneous Data with Database Middleware: Beyond Integration.

VLDB '99: Cost Models DO Matter: Providing Cost Information for Diverse Data Sources in a Federated System (postscript ~284k)

(The publications are available on the project page)

 
  • The ARANEUS Project

    The project aims at developing tools for the management of data coming from the World Wide Web. The proposed techniques are based on database technology: Web sites are described using a formal data model; based on the model, we have developed tools and methodologies for wrapping, querying, integrating, designing and implementing Web sites.

    Recommended publications> http://www.dia.uniroma3.it/Araneus/articles.html

    G. Mecca, P. Atzeni, A. Masci, P. Merialdo, G. Sindoni. From Databases to Web-Bases: The ARANEUS Experience. Technical Report n. 34-1998, Dipartimento di Informatica e Automazione, Universita' di Roma Tre, May 1998.

    S. Grumbach, G. Mecca. In Search of the Lost Schema. In Proceedings of Intern. Conference on Database Theory (ICDT'99), 1999. (Topic: wrapper generation)

     
  • DISCO (INRIA, 1996-1999) - Prototype in Project Caravel 

    Daniela Florescu, Olga Kapitskaia, Hubert Naacke, Patrick Valduriez, Louiqa Raschid, Anthony Tomasic
    DISCO is an I³-compliant information integration system written in Java. It allows the integration of object-oriented and relational databases. Sources are described using SQL3. Its approach is similar to that of multidatabase systems: schema transformations between the sources and the global schema are stored in multidatabase catalogs.
    One highlight of DISCO is that it tries to handle source failures. If a source is not available at the time of query execution, a modified query is generated that tries to return a preliminary result to the user from the results of the other sources. So-called metawrappers are responsible for providing information about alternative concepts across source boundaries.
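
The failure-handling behaviour described above can be sketched in a few lines of Python. This is only an illustration of the general idea (answer with what is available and remember what is still missing), not DISCO's actual mechanism or API; the exception class, the function names and the sample sources are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


class SourceUnavailable(Exception):
    """Hypothetical error raised by a wrapper whose source cannot be reached."""


def query_all(sources: Dict[str, Callable[[], List[Row]]]) -> Tuple[List[Row], List[str]]:
    """Return the rows from reachable sources plus the names of the sources
    that still have to be queried to complete the answer."""
    rows: List[Row] = []
    pending: List[str] = []
    for name, fetch in sources.items():
        try:
            rows.extend(fetch())
        except SourceUnavailable:
            pending.append(name)  # keep going; the result is preliminary
    return rows, pending


def unreachable() -> List[Row]:
    raise SourceUnavailable("object_db is down")


# Invented sample sources: two respond, one fails.
sources = {
    "relational": lambda: [{"id": 1, "name": "ana"}],
    "object_db": unreachable,
    "files": lambda: [{"id": 3, "name": "luis"}],
}
partial_rows, pending_sources = query_all(sources)
print(partial_rows, pending_sources)
```

A caller can show `partial_rows` immediately and re-run the query against `pending_sources` once they come back online.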

               Homepage of Project Caravel

    Recommended publications>

    Daniela Florescu, Marc Friedman, Zachary Ives, Alon Levy, Dan Weld, "An Adaptive Query Execution Engine for Data Integration", Proc. of ACM SIGMOD Conf. on Management of Data, Philadelphia, PA, 1999, at http://www-rodin.inria.fr/Fpubsbyyear.html

    "A Data Model and Query Processing Techniques for Scaling Access to Distributed Heterogeneous Databases in Disco." Tomasic, Anthony and Raschid, Louiqa and Valduriez, Patrick. Invited paper in the IEEE Transactions on Computers, special issue on Distributed Computing Systems, 1997, at http://www.umiacs.umd.edu/labs/CLIP/im.html

     
     
    The Miro-Web Project (successor to the IRO-DB and DISCO projects)

    The MIROWeb Esprit project has developed a unique technology to integrate multiple data sources through an object-relational model with semistructured data types. It addresses the problem of integrating irregular Web sources and regular relational databases through a mediated architecture based on a hybrid model, supporting relational, object and semistructured features.

    Project description at http://www.prism.uvsq.fr/~duc/MiroWeb.htm and at
    http://www.darmstadt.gmd.de/oasys/projects/miro/mirowe.html


    The Information Manifold (IM) is a tool for navigation, retrieval, and organization of information distributed across heterogeneous networked information sources. IM incorporates a deductive knowledge representation system to enable efficient management of a large, poorly structured information space, as characterized by the refractory nature of the Internet. Our approach emphasizes putting more intelligence on the client side of this information world, under control of the user.

    Recommended publications> http://portal.research.bell-labs.com/orgs/ssr/people/levy/paper-abstracts.html#iga

    Querying Heterogeneous Information Sources Using Source Descriptions by Alon Y. Levy, Anand Rajaraman and Joann J. Ordille. In the Proceedings of the 22nd International Conference on Very Large Databases,
    VLDB-96, Bombay, India, September, 1996

    Knowledge Based Access to  Information on the World Wide Web
    Thomas Kirk

    Information Manifold Overview (slides)


     
  • W4F

    W4F is a toolkit for writing wrappers for Web sources in a couple of minutes.
    http://db.cis.upenn.edu/~sahuguet/WAPI/
    http://db.cis.upenn.edu/W4F/
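
For readers unfamiliar with the term, a wrapper turns a Web page into structured records. The sketch below shows the idea in Python using only the standard library's `html.parser`; it does not use W4F or its extraction language, and the class name, function name and sample page are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ListItemWrapper(HTMLParser):
    """Minimal wrapper: extracts the text of every <li> element."""

    def __init__(self) -> None:
        super().__init__()
        self.items: List[str] = []
        self._in_li = False
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_li:
            self._buffer.append(data)  # also collects text of nested tags

    def handle_endtag(self, tag):
        if tag == "li" and self._in_li:
            self.items.append("".join(self._buffer).strip())
            self._in_li = False


def wrap(html: str) -> List[str]:
    """Run the wrapper over an HTML string and return the extracted records."""
    parser = ListItemWrapper()
    parser.feed(html)
    parser.close()
    return parser.items


# Invented sample page.
page = "<ul><li>HERMES</li><li>TSIMMIS <b>(Stanford)</b></li></ul>"
records = wrap(page)
print(records)
```

A hand-written wrapper like this has to be revised whenever the page layout changes, which is the kind of effort wrapper toolkits aim to reduce.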

    Note> It would be very interesting if whoever studies this tool designed a small application to evaluate its actual performance.

    ARTICLES
     

    SchemaSQL - A Language for Interoperability in Relational Multi-database Systems
    VLDB Conf. 1996

    Semantic Integration in Heterogeneous Databases Using Neural Networks.
    Wen-Syan Li, Chris Clifton  VLDB 1994: 1-12

    An Approach to Integration of Web Information Source Search and Web Information Retrieval
    Y. Iizuk, M. Tsunakawa, S. Seo, T. Ikeda
    In SAC 2000, Como, Italy

    nD-SQL: A Multidimensional Language for Interoperability and OLAP
    F. Gingras and L.V.S. Lakshmanan
    VLDB 98, pp. 134-145
     

    Cost Models DO Matter: Providing Cost Information for Diverse Data Sources in a Federated System
    Mary Tork Roth, Fatma Özcan and Laura M. Haas
    VLDB 99
     

    CHAOS: An Active Security Mediation System (this one is about security)
    David Liu, Kincho Law and Gio Wiederhold
    (14 pp)
    CAiSE 2000, June, Stockholm.