
Wednesday, October 31, 2007

Integrating ERD and UML Concepts When Teaching Data Modeling

In this paper, we describe a teaching approach that evolved from our experience teaching traditional database and systems analysis classes, as well as from a number of semesters spent team-teaching an object-oriented systems development course. Fundamentally, we argue that existing knowledge of structured systems development can and should inform how we teach object-oriented systems development techniques. We draw on an anecdotal industry example provided by one of our former students to illustrate the value of this approach, given our perception that practitioners today need to shift easily from structured to object-oriented thinking.

There seems to be growing interest in adopting the Unified Modeling Language (UML) within information systems (IS) curricula, and some authors of database texts have expressed interest in changing their widely adopted books to include the UML notation when representing data models. One might argue that notation is only syntax; therefore, a change in notation should not require a change in the content or approach used in teaching data modeling techniques. However, the interest in UML suggests consideration of a more fundamental question: Should we rethink the processes taught in our database courses to more closely align the way we think about data with the way applications are developed?

Currently, most database courses use entity-relationship diagram (ERD) techniques for data modeling. The traditional ERD has a rich theoretical basis and is specifically intended for modeling relational database structures (Chen 1976, 1977; Date 1986; Martin 1982). Clear guidance exists in many academic and practitioner books about how to use this method to develop conceptual models and transition them to logical forms (including normalization practices) and physical forms that are focused on tuning for performance (Chen 1977; Hoffer, Prescott, and McFadden 2005; Martin 1982). Further, some empirical studies suggest ERDs are often more correct and easier to develop than corresponding object-oriented (OO) schemas (Shoval and Shiran 1997).

Advocates of the UML suggest that the class diagram should replace the ERD notation and approach to data modeling. Class diagrams provide the same opportunity to document data and their relationships as ERDs do. In addition, class diagrams provide for the capture of operations. This allows for the modeling of relational data but also provides rich support for object-oriented implementations in the form of OO programming languages (e.g., Java) as well as a more component-based approach (e.g., J2EE). Moreover, the UML includes mechanisms for modeling behavior, and the acceptance of the UML as an Object Management Group (OMG) standard provides wide support in industry for using the UML, especially for the design of object-oriented software (Halpin and Bloesch 1999). While some might treat ERD and UML as contrasting methodological perspectives, we see advantages in teaching both methods to information systems students in an integrated fashion.
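
To make the contrast concrete, the following is a minimal sketch of what a class diagram captures beyond an ERD. The Order and OrderLine classes, their attributes, and their operations are hypothetical illustrations and do not come from the paper. The attributes and the one-to-many association are the parts an ERD would also record; the methods are the operations that only the class diagram documents.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrderLine:
    # Attributes: the part an ERD entity would also document.
    product: str
    quantity: int
    unit_price: float

    # Operation: captured by a class diagram, absent from an ERD.
    def subtotal(self) -> float:
        return self.quantity * self.unit_price


@dataclass
class Order:
    order_id: int
    # One-to-many association (one Order has many OrderLines),
    # which an ERD would show as a relationship between two entities.
    lines: List[OrderLine] = field(default_factory=list)

    def add_line(self, product: str, quantity: int, unit_price: float) -> None:
        self.lines.append(OrderLine(product, quantity, unit_price))

    def total(self) -> float:
        return sum(line.subtotal() for line in self.lines)
```

Removing the three methods leaves roughly the information an ERD would hold: two entities, their attributes, and the relationship between them.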

Understanding the implications of adopting UML notation in the database class requires consideration of the other courses in the IS curriculum as well as indications of the future of practice. In a recent panel discussion focused on database course content (Vician et al. 2004), the interrelationships among courses in the IS curriculum were discussed. Specifically, the panelists advised that when the core systems analysis course uses an object-oriented analysis and design approach, using a more traditional approach in the database class may lead to inconsistencies that hinder student learning.

A review of the trade literature suggests that transitions to object-oriented databases may be on the horizon, with Oracle and IBM indicating willingness to "go beyond classical relational dogma" (Monash 2005, p. 26). Some organizations have adopted object-oriented databases for storing important corporate data (Lai 2005). Further, as more applications are developed in OO languages such as Java or C++ while utilizing data stored in a relational database, IT professionals are left to write custom code to access the data or rely on object-relational mapping (Walsh 2005). Growing interest among firms such as Oracle, together with the ubiquity of relational databases, suggests that mapping OO applications to relational data storage is the likely immediate experience current MIS students can expect in practice (Krill 2005).
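
The "custom code" mentioned above can be sketched as a small hand-written mapping between objects and relational rows. This is only an illustration under stated assumptions: the Customer class, the customer table, and the helper functions are hypothetical, and Python's standard sqlite3 module stands in for whichever relational database an organization actually uses.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    # Hypothetical application object; not taken from the paper.
    customer_id: int
    name: str


def create_schema(conn: sqlite3.Connection) -> None:
    # Relational side: one table whose columns mirror the object's attributes.
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def save_customer(conn: sqlite3.Connection, customer: Customer) -> None:
    # Object -> row: flatten the object's attributes into column values.
    conn.execute(
        "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
        (customer.customer_id, customer.name),
    )


def load_customers(conn: sqlite3.Connection) -> List[Customer]:
    # Row -> object: rebuild application objects from query results.
    rows = conn.execute(
        "SELECT customer_id, name FROM customer ORDER BY customer_id"
    ).fetchall()
    return [Customer(customer_id=row[0], name=row[1]) for row in rows]
```

An object-relational mapping tool would generate or hide functions like save_customer and load_customers; writing them by hand shows the translation step that students are likely to meet in practice.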

Should we rethink the processes taught in our database courses to more closely align the way we think about data with the way applications are developed? This is our fundamental question. In this paper, we draw from our experience in the classroom and in practice to make recommendations for how the UML notation can be integrated with ERD when teaching data modeling.