

Tuesday, August 21, 2007

Data Modeling Education: The Changing Technology

Data modeling is a difficult topic for students to learn. Worse yet, practitioners who look to academia for methods and techniques to perform such model building have found little on which to standardize, although many techniques exist. Entity relationship (ER) modeling was developed to help database developers visualize their (relational) database design with its data stores and internal relationships. This technique was certainly an important step forward, yet data collected over the past 11 years indicate that database developers are still having difficulty learning, assimilating, and using design techniques (cf. Blaha, 2004). Confounding the issue is the arrival of the object-oriented paradigm. The Unified Modeling Language (UML) was introduced to speed, simplify, and clarify the design of systems. Portions of the UML are derived from ER modeling and are useful in merging the front-end portion of the system with the back-end data storage, so that the designer can view a picture of the entire system. While providing functionality that ER modeling lacks, the UML approach to data modeling also leaves some developers indecisive and confused as to which technique to use in practice. The same indecision appears to haunt the academic world.

So how should data modeling be taught? To shed light on this question, we asked contributors to focus on whether this new system of modeling (the UML) yields a better understanding of database design, to the extent that better database designs result. We detected a buzz in the literature and in the IT world that a dichotomy of opinion exists over this question, and so this special issue was born. Educators need to air their opinions, facts, and results and discuss this controversial topic to encourage refinement in this important area. We hope that research ideas can be generated and that practitioners will be informed that this topic is being addressed in academia.

As expected, the contributors to this issue provided a dichotomy of opinion, but, surprisingly, their experiences and opinions moved the issue in a direction far different from what we could have predicted. We now provide you with insight into this poignant topic by presenting this special issue.

Chen (1976) developed the entity relationship diagram (ERD) as a set of tools to assist the database designer in visualizing the internal workings of a complex design. Chen laid out the basic data model with its symbols, and other extensions were added later. The extended ER model was developed by Teorey, Yang and Fry (1986) to include the concept of generalization (inheritance) in the modeling technique. No clear standard has emerged, and several notations are in use, including the so-called crow's foot model (Information Engineering) and the Integrated Definition for Information Modeling (IDEF1X), which was adopted as a Federal Information Processing Standard (FIPS) in 1993.

Object-oriented (OO) techniques began to infiltrate the database world because it seemed that this natural way of processing software "objects" in OO programming languages could be extended to the database. In fact, it seemed like a perfect match, since each record in a database table fits the definition of an object as defined in the OO paradigm. Database designers rushed to market their new product, the OODBMS, because at the time the writing was (or at least seemed to be) clearly on the wall: all software would eventually be subsumed under the OO umbrella. Growth rates of these products were predicted to be so high (~50%) that one wonders how manufacturers could have kept up. The harsh reality was that this growth rate was never achieved, and the OODBMS approach has never completely caught on. Pockets of use may exist, but only because the technology is suited to specialized applications such as CAD/CAM, multimedia systems, and network management systems. The OODBMS has not effectively competed with the advantages of relational databases, particularly their strong roots in mathematical theory. Added to this is the extent of the infrastructure built around the relational products, which makes it extremely doubtful that they will be supplanted by the (pure) OODBMS in the foreseeable future.

In spite of many misgivings, the OO movement did force relational database designers to add object-oriented extensions to their products. Primarily due to the need to process complex objects, the object-relational database is now able to store and perform searches within audio, geographic data, telecommunications data, and other complex data types. The OO movement also brought new data modeling tools. The Unified Modeling Language (UML) became the standard for OO systems and included tools for database design. The UML class diagram is probably the strongest contender in this realm, since it maps directly into a relational (logical) design and is able to convey even more system information than just entities and their relationships. The use of UML diagrams by relational database designers is somewhat controversial and not entirely accepted. We detected this controversy among database instructors and systems analysis instructors and decided that looking further into this issue would help to place the arguments for both sides on the table and foster a healthy academic discussion. Our goal for this issue was to advance the body of knowledge on the use of modeling techniques by airing this controversy and promoting cogent discussion of the topic. We hope this goal has been accomplished.
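
To make the remark about the class diagram mapping directly into a relational design more concrete, the following Python sketch may help. It is an illustration under simplifying assumptions, not a method taken from the special issue: each class becomes one table with a surrogate key, each attribute becomes a column, and each many-to-one association becomes a foreign key. The names `UmlClass`, `Attribute`, `Association`, `to_ddl`, and `build_schema`, as well as the Department/Employee model, are hypothetical and exist only for this example.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


# Hypothetical, heavily simplified stand-ins for class-diagram elements.
@dataclass
class Attribute:
    name: str
    sql_type: str


@dataclass
class Association:
    # Many-to-one: each row of the owning class refers to one row of `target`.
    role: str
    target: str


@dataclass
class UmlClass:
    name: str
    attributes: List[Attribute]
    associations: List[Association] = field(default_factory=list)


def to_ddl(cls: UmlClass) -> str:
    """Map one class to one table: surrogate key, attribute columns, foreign keys."""
    table = cls.name.lower()
    columns = [f"{table}_id INTEGER PRIMARY KEY"]
    columns += [f"{a.name} {a.sql_type}" for a in cls.attributes]
    for assoc in cls.associations:
        target = assoc.target.lower()
        columns.append(
            f"{assoc.role}_id INTEGER REFERENCES {target}({target}_id)"
        )
    return f"CREATE TABLE {table} (\n  " + ",\n  ".join(columns) + "\n);"


def build_schema(classes: List[UmlClass]) -> sqlite3.Connection:
    """Create the generated tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    for cls in classes:
        conn.execute(to_ddl(cls))
    return conn


# Example model (assumed for illustration): an Employee works in one Department.
department = UmlClass("Department", [Attribute("name", "TEXT")])
employee = UmlClass(
    "Employee",
    [Attribute("name", "TEXT"), Attribute("salary", "REAL")],
    [Association("department", "Department")],
)
```

Calling `print(to_ddl(employee))` shows the generated `CREATE TABLE` statement, and `build_schema([department, employee])` checks that the statements are accepted by SQLite. The sketch deliberately omits generalization, many-to-many associations, and operations, which are the parts of a class diagram that do not map onto tables so directly.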