Seminar

Study of Workflow models and techniques, with the objective of defining a process for loading and maintaining Data Warehouses

The seminar is part of the activities of the Information Systems Conception group (CSI; page in Spanish) of the Computer Institute (Engineering Faculty (page in Spanish), University of the Republic, Uruguay (page in Spanish), UdelaR).

This page is also available in Spanish.

Dynamics

Two hours per week are reserved for participants to present the activities they have carried out. These activities consist of:
individual presentations of specific articles related to the seminar.
dissemination of important activities and results.

Whiteboard

Current status: The pending presentations have concluded; we are now awaiting new proposals.

Completed and Planned Activities

All slides and links on this page are in Spanish.

 
Date | Topic | Presenter
Day 1: 30/4/2001 | Article: [BFM99] Modeling Data Warehouse Refreshment Process as a Workflow Application (Presentation, also available in HTML, PDF and PostScript), (Comments, also available in HTML, PDF and PostScript) | Raúl Ruggia
Day 2: 7/5/2001 | Examples of Extraction, Transformation and Loading (ETL) tools with some workflow (WF) characteristics. Presentation of Hummingbird Genio (Presentation, also available in HTML, PDF and PostScript) | Alejandro Gutiérrez
Day 3: 14/5/2001 | Examples of Extraction, Transformation and Loading (ETL) tools with some workflow (WF) characteristics. Presentation of Microsoft DTS (Presentation, also available in HTML, PDF and PostScript), (Comments, also available in HTML, PDF and PostScript) | Alejandro Gutiérrez
Day 4: 28/5/2001 | Presentation of the work on defining primitives exclusively with SQL (Presentation, also available in HTML, PDF and PostScript). First draft of the document (Word, also available in HTML, PDF and PostScript) | Ignacio Larrañaga
Day 5: 1/6/2001 | Continuation: presentation of the work on defining primitives exclusively with SQL. | Ignacio Larrañaga
Day 6: 4/6/2001 | A review of the WFMC standard (Presentation, also available in HTML, PDF and PostScript) | Pablo Morales
Day 7: 11/6/2001 | Article: [GFSE2000] Declaratively cleaning your data using AJAX (Presentation, also available in HTML, PDF and PostScript) and conclusions of the Primitives presentation. | Ignacio Larrañaga
Day 8: 22/6/2001 (~13:30) | Presentation of Verónika's master's thesis. | Verónika Peralta
Day 9: 25/6/2001 | Continuation of the presentation of Verónika's master's thesis. | Verónika Peralta
Day 10: 29/6/2001 (~11:00) | Continuation of the presentation of Verónika's master's thesis. | Verónika Peralta
Day 11: 2/7/2001 | Continuation of the presentation of Verónika's master's thesis. | Verónika Peralta
Day 12: 9/7/2001 | Conclusions on Verónika's master's thesis (Presentation, also available in HTML, PDF and PostScript) | Verónika Peralta
Day 13: 3/8/2001 (13:00) | Article by Garcia-Molina et al. [GMER] (Presentation, also available in HTML, PDF and PostScript) | Pablo Morales
Day 14: 09/8/2001 | Article by Garcia-Molina et al. (second part) | Pablo Morales
Day 15: 10/8/2001 | Presentation of the prototype of a parallel ETL process developed in PVM (Presentation, also available in HTML, PDF and PostScript). First draft of the final document (document, also available in PDF and PostScript). | Ignacio Larrañaga
Day 16: 17/8/2001 | Presentation on CWM. | Raúl Ruggia
Day 17: 30/11/2001 | Presentation on the Information System for Teaching Evaluation (Presentation, also available in HTML, PDF and PostScript). | Sandro Moscatelli
Day 18: 28/12/2001 (13:30) | Presentation of the results of the conceptualization study of what was done in the Teaching DW project (Presentation, also available in HTML, PDF and PostScript; additional material: system manual, DW tables, general description; Access databases: DW, intermediate layer, auxiliary tables). | Ignacio Larrañaga
Day 19: 25/2/2002 | Presentation of the results of the study of the Teaching DW project (Presentation, also available in HTML, PDF and PostScript). | Adriana Marotta
Day 20: 22/3/2002 | Analysis of the methodology presented in Verónika's master's thesis, to gain a better understanding of it (Presentation, also available in HTML, PDF and PostScript). | Alejandro Gutiérrez
Day 21: 5/4/2002 | Analysis of the methodology presented in Verónika's master's thesis, to gain a better understanding of it (Part II). | Alejandro Gutiérrez
Day 22: 12/4/2002 | Analysis of the methodology presented in Verónika's master's thesis, to propose a mechanism for initial load and refresh with DWD (where DWD is the mechanism based on Primitives) (Part III, Final Presentation, also available in HTML, PDF and PostScript). | Alejandro Gutiérrez
Day 23: 19/4/2002 | Presentation of Adriana's work on DW integration (presentation not yet available). | Adriana Marotta
Day 24: 26/4/2002 | Presentation of the work done by Verónika in France (Presentation, also available in HTML, PDF and PostScript), Technical Report. | Verónika Peralta
Day 25: 21/6/2002 | Presentation of "Modeling ETL Activities as Graphs" (by Vassiliadis et al.) (Presentation, also available in HTML, PDF and PostScript). | Adriana Marotta
? | Applying the methodology presented in Verónika's master's thesis to the Teaching DW problem. | Ignacio Larrañaga
? | A more detailed description of the classes, above all those associated with the transformation process. This requires studying the basic classes of CWM and MOF. A more detailed description of the specialized transformations (Map, Tree, ...). A more detailed description of how the load flow is built (the Activity transformation based on the transformation step connection). Examine what the cycle problem is, the use of constraints, etc. Illustrate with examples. | ?
? | An assessment of how CWM responds to: (a) the requirements of other articles (Mokrane, Garcia-Molina, etc.); (b) our discussions about other workflow models; (c) comparison with the old Metadata Coalition standard (OIM), based on the article located at haddock_csi/doctecnica/metadata/DW (CWMvsOIM_SigmRec_Set2000.pdf); (d) the parallel/distributed execution of the load process (see Ignacio's HPC work?) | ?
? | Article: ConTract Model | ?
? | Schema transformation operations of DWDesigner. Simplified where WF is involved? | ?
? | Evolution of source schemas in a web warehouse. Simplified where WF is involved? | ?
? | Article: Strobe algorithms | ?

Propose calendar changes, comments or other information not yet recorded from here.

Bibliography

[BFM99] M. Bouzeghoub, F. Fabret, M. Matulovic-Broqué. Modeling Data Warehouse Refreshment Process as a Workflow Application. Proc. DMDW 1999. (Available at: http://www.dbnet.ece.ntua.gr/~dwq/publications.html)
[GFSE2000] H. Galhardas, D. Florescu, D. Shasha, E. Simon. Declaratively cleaning your data using AJAX. (Available at: http://citeseer.nj.nec.com/309494.html)
[GFSE1999] H. Galhardas, D. Florescu, D. Shasha, E. Simon. An Extensible Framework for Data Cleaning. (Available at: http://citeseer.nj.nec.com/galhardas00extensible.html)
[Dts] Microsoft. Microsoft SQL Server 7.0 online help. 1999.
[PSqlDTS] Mark Chaffin, Brian Knight, Todd Robinson. Professional SQL Server 2000 DTS (Data Transformation Services). (Book in the library, donated by Microsoft).
[MSqlDTS] Timothy Peterson. Microsoft SQL Server 2000 Data Transformation Services (DTS). (Book in the library, donated by Microsoft).
[Genio] Hummingbird. Hummingbird Genio 3.1 manuals. 1999.
[GMER] Garcia-Molina et al. Efficient Resumption of Interrupted Warehouse Loads.
[CWM2001] Object Management Group (OMG). Common Warehouse Metamodel (CWM) Specification, Version 1.0.

 

Internal Work

Modeling of the refresh process as a Workflow process.
  Presentation (also available in HTML, PDF and PostScript).
ETL tools.
  Genio (also available in HTML, PDF and PostScript).
  DTS (also available in HTML, PDF and PostScript).
Cleaning tools.
  Presentation (also available in HTML, PDF and PostScript).
WFMC Standard.
  Presentation (also available in HTML, PDF and PostScript).
Verónika Peralta, master's thesis.
  Presentation (also available in HTML, PDF and PostScript).
Resumption of load processes [GMER].
  Presentation (also available in HTML, PDF and PostScript).
First progress on viewing Primitives as SQL.
  Presentation (also available in HTML, PDF and PostScript).
  First draft of the document (also available in HTML, PDF and PostScript).
Work on parallelism in ETL processes (in particular aggregation).
  Presentation (also available in HTML, PDF and PostScript).
  First draft of the final document (also available in PDF and PostScript).
Work on the Teaching DW project.
  Presentation (also available in HTML, PDF and PostScript).
Conceptualization of the Teaching DW project.
  Presentation (also available in HTML, PDF and PostScript).
  Additional material:
    System manual
    DW tables
    General description
  Access databases:
    DW
    Intermediate layer
    Auxiliary tables
Study of the Teaching DW.
  Presentation (also available in HTML, PDF and PostScript).
Analysis of the methodology presented in Verónika's master's thesis.
  Presentation (also available in HTML, PDF and PostScript).
  Final Presentation (also available in HTML, PDF and PostScript).
Verónika's work in France.
  Technical Report
  Presentation (also available in HTML, PDF and PostScript).
"Modeling ETL Activities as Graphs" (by Vassiliadis et al.)
  Presentation (also available in HTML, PDF and PostScript).

 

Evaluation mechanism

Links

Data Warehouse bibliography (Verónika Peralta), http://www.fing.edu.uy/~vperalta/bibliografiaDW.htm
Foundations of Data Warehouse Quality (DWQ), ESPRIT Project 22469, http://www.dbnet.ece.ntua.gr/~dwq
Course: Advanced Topics in Data Warehouse, http://www.fing.edu.uy/~csi/Cursos/cursos_posg/2001/AdvTopicsDW.html

 

Last Update: 15/7/2002.

Suggestions: please excuse my limited English, and do not hesitate to contact me if any word, expression or paragraph is unclear.