A MODELING APPROACH TO SEAMLESS
OBJECT-ORIENTED SYSTEMS DEVELOPMENT
David W. Embley
Scott N. Woodfield
Department of Computer Science

Stephen W. Liddle
School of Accountancy and Information Systems

Brigham Young University
Provo, UT 84602

INTRODUCTION

Existing approaches to object-oriented systems development are poorly integrated in several ways. These include (1) unwanted paradigm shifts in the software development lifecycle and the models, languages, and tools used to develop software, (2) impedance mismatches in implementation languages, and (3) the inaccessibility or unavailability both of meta-information and of high-level, abstract objects. This inadequate integration is ubiquitous and causes numerous inefficiencies in the object-oriented development process.

These problems can be addressed by abandoning typical object-oriented models in favor of a single seamless system model. By using a seamless model, such as the one we propose, we can overcome the integration inefficiencies to which we allude, raise the level of abstraction for object-oriented system implementation, and enable same-paradigm system evolution.

The foundation for our approach is a formally defined logical model that is seamless in three ways.

(1) It maintains a single software development paradigm across analysis, specification, design, and implementation, and consequently also provides for software evolution in the same paradigm.
(2) It resolves the impedance mismatch between structural and behavioral components of the language, between imperative and declarative programming paradigms, and between visual and textual representation.
(3) It makes meta-level information fully accessible and modifiable and makes all high-level abstractions first-class components.

INTEGRATION PRINCIPLES

Three principles guide this integration: unifying the development lifecycle, resolving impedance mismatches, and reifying abstract objects.

Development Lifecycle Unification

The greatest objection to a unified paradigm may be the question of whether a single model is appropriate for all phases of software development. Our answer to this question is a qualified ``yes.'' A single logical model can address the needs of analysis, specification, design, and high-level implementation. Low-level implementation, however, may require fine-tuning to achieve required levels of efficiency. As the database community has shown with its principle of data independence, we can focus our optimization efforts on the most critical parts of a system to achieve acceptable performance while still implementing the system at a higher level of abstraction. Thus, our strategy is to create a single model that is appropriate for systems modeling and high-level implementation, and then to concentrate on optimization research so that system performance becomes acceptable.

Impedance Mismatch Resolution

There are several kinds of impedance mismatches including associative/iterative access to persistent/non-persistent data, imperative/declarative programming-language paradigms, and textual/visual specification. To resolve these mismatches, we propose that implementation be done using a higher level language whose model for persistent objects and behavior protocols is precisely the same model used for analysis, specification, and design. This implementation language is a level of abstraction above current programming languages and paradigms, and it allows for various types of data access using multiple language paradigms expressed in possibly mixed textual and graphical form.

Reification of Abstract Objects

Reification makes abstract objects concrete, available, and first class. We consider the reification of both application information and specification constructs. To reify application information, we have three major levels of information, all of which are concrete, available and first class. These include a data instance, a model instance, and a meta-model instance. A data instance consists of the information about instantiated objects in a system and conforms to a scheme, called a model instance. The model instance consists of structural information, such as object classes and relationship sets, and behavior information, such as states, transitions, and object interactions. The model instance conforms to another model instance, called the meta-model, which describes a valid model instance. The key to achieving reification is to describe the meta-model using the model itself; for then the same mechanism can be used for accessing and modifying both the data instance and the model instance, which allows the elements of the meta-model to be first-class elements in the system.
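
As an illustration only, the three levels and their conformance links can be sketched in Python (not Melody). The entry names, the tuple layout, and the functions `lookup` and `conformance_chain` are hypothetical; the sketch shows only that a self-describing meta-model lets one access path serve every level.

```python
# Hypothetical sketch: names and layout are illustrative, not OSM's storage format.
facts = [
    # (level, name, conforms_to)
    ("meta-model", "ObjectClass", "ObjectClass"),   # the meta-model describes itself
    ("model", "Person", "ObjectClass"),             # a model-instance element
    ("data", "person#1", "Person"),                 # a data-instance object
]


def lookup(name):
    """One access path for every level: data, model, and meta-model alike."""
    for level, entry, conforms_to in facts:
        if entry == name:
            return level, conforms_to
    raise KeyError(name)


def conformance_chain(name):
    """Follow 'conforms to' links until reaching a self-describing entry."""
    chain = [name]
    while True:
        _, parent = lookup(chain[-1])
        if parent == chain[-1]:
            return chain
        chain.append(parent)


print(conformance_chain("person#1"))  # ['person#1', 'Person', 'ObjectClass']
```

Because the meta-model entry conforms to itself, the chain terminates without a separate mechanism for the top level.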

The reification of specification constructs allows high-level, abstract objects to be treated the same as low-level atomic objects. For example, an object could be high level, in the sense that it contains objects and relationships and conceals lower level detail, but is, itself, considered as a single object. By reifying high-level abstractions in this way, we introduce scalability and abstraction into the system model. A high-level object could represent something as simple as a database record, or something as complex as a global climate system. In either case, we can treat these high-level objects as first-class objects in the system.

THE OSM MODEL

Having discussed the underlying principles, we now briefly describe our model and programming language, which satisfy these principles.

OSM is an object-oriented model for systems analysis, specification, design, implementation, and evolution. The OSM model includes an object-relationship model, an object-behavior model, and an object-interaction model [6].

The structural components of OSM are object classes and relationship sets. An object has unique identity, may be lexical or non-lexical, is active concurrently with other objects, and may simultaneously have several active threads of behavior. Each object class has a state net, which is a template for the behavior of objects in an object class. A state net consists of states, which may be on or off, and transitions that, when triggered, perform actions and move objects among states in the state net. Based on state nets, an object may also synchronize and interact with other objects via interactions.
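
The state-net description above can be made concrete with a minimal Python sketch. It is not OSM notation or Melody: the classes `ActiveObject`, `Transition`, and `StateNet`, the order-processing states, and the use of a string as a stand-in for an action are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveObject:
    identity: int
    on_states: set = field(default_factory=set)   # several states may be on at once
    log: list = field(default_factory=list)


@dataclass
class Transition:
    trigger: str            # event that triggers the transition
    prior: frozenset        # states that must all be on beforehand
    subsequent: frozenset   # states turned on afterwards
    action: str             # stand-in for the action performed


class StateNet:
    """Template for the behavior of all objects in one object class."""

    def __init__(self, transitions):
        self.transitions = list(transitions)

    def fire(self, obj, event):
        fired = False
        for t in self.transitions:
            if t.trigger == event and t.prior <= obj.on_states:
                obj.on_states -= t.prior        # leave the prior states
                obj.log.append(t.action)        # perform the action
                obj.on_states |= t.subsequent   # enter the subsequent states
                fired = True
        return fired


order_net = StateNet([
    Transition("accept", frozenset({"Open"}),
               frozenset({"Awaiting payment", "Packing"}), "reserve stock"),
    Transition("pay", frozenset({"Awaiting payment"}),
               frozenset({"Paid"}), "record payment"),
])
order = ActiveObject(identity=1, on_states={"Open"})
order_net.fire(order, "accept")
order_net.fire(order, "pay")
print(sorted(order.on_states))  # ['Packing', 'Paid']
print(order.log)                # ['reserve stock', 'record payment']
```

After the first transition the object has two states on at once, which is how this sketch approximates an object with several active threads of behavior.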

OSM supports several kinds of high-level components, including high-level object classes, relationship sets, states, transitions, and interactions. These high-level components give OSM scalable abstraction capabilities. A high-level state, for example, may contain other states and transitions, but can also be treated in exactly the same way as a non-high-level state in the sense that, like an atomic state, it may be on or off and may serve as a precondition for triggering actions.
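
A short Python sketch (again, not Melody) shows how a high-level state can expose the same interface as an atomic one. The class names, the `enabled` helper, and in particular the rule that a high-level state is on whenever any contained state is on are assumptions of this sketch, not part of the OSM definition.

```python
class AtomicState:
    def __init__(self, name):
        self.name = name
        self.on = False

    def is_on(self):
        return self.on


class HighLevelState:
    """Contains other states yet answers the same question as an atomic state."""

    def __init__(self, name, substates):
        self.name = name
        self.substates = list(substates)

    def is_on(self):
        # Assumption for this sketch: on whenever any contained state is on.
        return any(s.is_on() for s in self.substates)


def enabled(prior_states):
    """A transition's precondition: every prior state, atomic or high-level, is on."""
    return all(s.is_on() for s in prior_states)


picking, boxing = AtomicState("Picking"), AtomicState("Boxing")
fulfilling = HighLevelState("Fulfilling", [picking, boxing])
approved = AtomicState("Approved")
approved.on = True
print(enabled([approved, fulfilling]))  # False
picking.on = True
print(enabled([approved, fulfilling]))  # True
```

The `enabled` check never needs to know which kind of state it was given, which is the sense in which the high-level state is first class here.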

As described above, an OSM model instance has three levels: a meta-model, a model instance, and a data instance. All three levels are stored together in an OSM storage facility. Querying the data in a data instance allows a user to obtain the current state of the database, whereas querying the ``data'' in a meta-model allows a user to obtain the current state of the system. A user can find out, for example, how abstract components relate, how busy computational resources are, and how far along some transaction has progressed. Changing the data in a data instance updates the database, whereas changing the ``data'' in the meta-model evolves the model instance. In addition to evolving the structure, as is traditional in schema evolution, model-instance evolution also allows for controlled behavior evolution. Thus, program and data can evolve together.
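
The difference between updating the database and evolving the model instance can be illustrated with a toy store in Python. The `Store` class, its `insert` and `select` methods, and the object names are hypothetical and do not describe the actual OSM storage facility; behavior evolution is not modeled.

```python
class Store:
    """Hypothetical single store holding the meta-model, model, and data together."""

    def __init__(self):
        # The meta-level class "ObjectClass" lists every known object class,
        # including itself, and is stored exactly like ordinary data.
        self.members = {"ObjectClass": {"ObjectClass"}}

    def insert(self, object_class, obj):
        if object_class not in self.members["ObjectClass"]:
            raise KeyError(f"unknown object class: {object_class}")
        self.members.setdefault(object_class, set()).add(obj)

    def select(self, object_class):
        return sorted(self.members.get(object_class, set()))


store = Store()
store.insert("ObjectClass", "Customer")   # meta-level change: the model evolves
store.insert("Customer", "customer#7")    # data-level change: the database is updated
print(store.select("ObjectClass"))        # ['Customer', 'ObjectClass']
print(store.select("Customer"))           # ['customer#7']
```

Both changes go through the same `insert` call; only the target object class distinguishes a schema change from a data change.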

An important aspect of OSM is that it has been formally defined using a first-order, temporal logic language called OSM-Logic [1,2].

Every OSM model instance can be converted to a set of OSM-Logic formulas. We then formally interpret these formulas by mapping the language's symbols to objects, points in time, functions, and relations in a mathematical structure. An interpretation for a set of formulas is valid if the formulas are true for the mathematical structure. Given an OSM model instance, we formally define its semantics as the set of all valid interpretations for the set of formulas resulting from the conversion of the model instance to OSM-Logic.
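
To make the notion of a valid interpretation concrete, the following Python sketch checks two illustrative formulas against a small mathematical structure. The predicates `Person` and `PersonHasName`, the time points, and the formulas themselves are assumptions chosen for the example; they are not taken from the OSM-Logic definition, whose syntax is not reproduced here.

```python
def every_person_has_exactly_one_name(s):
    # forall x, t: Person(x, t) => exactly one n with PersonHasName(x, n, t)
    return all(
        sum(1 for (y, _, u) in s["PersonHasName"] if (y, u) == (x, t)) == 1
        for (x, t) in s["Person"]
    )


def names_belong_to_persons(s):
    # forall x, n, t: PersonHasName(x, n, t) => Person(x, t)
    return all((x, t) in s["Person"] for (x, _, t) in s["PersonHasName"])


FORMULAS = [every_person_has_exactly_one_name, names_belong_to_persons]


def is_valid(structure):
    """An interpretation is valid if every formula is true for the structure."""
    return all(formula(structure) for formula in FORMULAS)


good = {
    "Person": {("p1", 0), ("p1", 1)},
    "PersonHasName": {("p1", "Ann", 0), ("p1", "Ann", 1)},
}
bad = {
    "Person": {("p1", 0)},
    "PersonHasName": {("p1", "Ann", 0), ("p1", "Anne", 0)},
}
print(is_valid(good))  # True
print(is_valid(bad))   # False
```

Each relation carries a time point as its last component, mirroring the temporal nature of the logic in a very simplified way.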

One benefit of this formal foundation is that we have a formally defined execution model for OSM. An execution model is a mechanism for generating a sequence of valid interpretations of a model instance. With a formal execution model, we have a mechanism for directly simulating, prototyping, and executing OSM model instances. As a consequence we immediately resolve the lifecycle integration problem because analysis, specification, and design models are all executable. This lets us move to implementation without changing models, and thus also lets us evolve the system without changing models.

Although formally defined, OSM allows its formalism to be ``tunable'' [3].

Portions of OSM model instances need not be fully formal. Triggers, actions, and several types of constraints can be written informally using a natural language, and low-level detail may be omitted. Informal statements, of course, are not executable and informal constraints are not enforceable. To make them formal, we need to rewrite them either in OSM-Logic or in some other language that maps to OSM-Logic, such as Melody, which we describe in the next section. By allowing various levels of formal completion in an OSM model instance, OSM becomes appropriate for all levels of users, from theoreticians to practitioners, and from analysts to programmers.

MELODY

Melody is a high-level language created to implement OSM model instances. It has a number of interesting features, including advanced ones such as a uniform model of persistence, multiple database query capabilities, an interesting model for concurrency control, and active object behavior [11].

We mention here, however, only those features specifically related to our integration discussion.

Model-Driven Implementation Language. We consider the underlying model to be more important than the language with which a software system is constructed. It is the model that should drive the language, not vice versa. For this reason, the structural and behavioral models for Melody come directly from OSM and there is a one-to-one mapping between them. In contrast, most programming languages stand alone, defining their own data model and execution semantics.

Graphical and Textual Representation. In support of our system-representation design goal, Melody supports various representations. An OSM model instance can be written using OSM's graphical notation, or it can be written entirely in text. The graphical notation is particularly useful for promoting high-level understanding, whereas the textual notation is useful for providing implementation detail.

Declarative and Procedural Paradigms. Melody supports both procedural and declarative program specifications. Typical first-order logic rules are provided that can be unified and resolved in a traditional logic-programming fashion. Predicates in these rules represent object classes and relationship sets. Melody also provides procedural statements such as would be found in Ada, C++, or Smalltalk. For example, Melody has traditional control statements like IF ... THEN and WHILE ... DO. To make the integration smooth, we ensure that these traditional statements correspond precisely with particular state-net control patterns and that logic rules are seamlessly integrated to provide for conditions and actions.
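
The combination of a logic rule with procedural control can be sketched in Python; this is not Melody syntax. The relationship set `manages`, the derived predicate `supervises`, and the `notify_reports` procedure are hypothetical. The rule is evaluated by naive bottom-up fixpoint iteration, which is a substitute for the unification and resolution described above, chosen to keep the sketch short.

```python
def supervises(manages):
    """Derive Supervises from the relationship set Manages.

    Rules (illustrative):
      Supervises(x, z) :- Manages(x, z).
      Supervises(x, z) :- Manages(x, y), Supervises(y, z).
    """
    derived = set(manages)
    changed = True
    while changed:
        changed = False
        for (x, y) in manages:
            for (y2, z) in list(derived):
                if y == y2 and (x, z) not in derived:
                    derived.add((x, z))
                    changed = True
    return derived


def notify_reports(manages, boss, on_leave=frozenset()):
    """Procedural control whose data comes from the declarative rule."""
    pending = sorted(e for (m, e) in supervises(manages) if m == boss)
    notified = []
    while pending:                      # procedural WHILE ... DO
        person = pending.pop(0)
        if person not in on_leave:      # procedural IF ... THEN
            notified.append(person)
    return notified


manages = {("ann", "bob"), ("bob", "cy"), ("bob", "dee")}
print(notify_reports(manages, "ann"))                   # ['bob', 'cy', 'dee']
print(notify_reports(manages, "ann", on_leave={"cy"}))  # ['bob', 'dee']
```

The predicate plays the role of a relationship set, and the loop consumes its derived tuples as ordinary data.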

Implementation Status. Our implementation of OSM is gaining momentum. Using C++ on HP 700 series workstations running under HP-UX 9.01, we have written an OSM model instance diagram editor that lets us create OSM model instances. We have also implemented a storage facility that not only stores OSM model instances, data instances, and meta-model instances, but also checks all model-specified constraints over these instances. Based on these foundation tools, a graphical query language OSM-QL has been implemented [5], and we have also created a rapid prototyping tool that executes model instances with both formal and informal triggers, actions, constraints, and interactions [10].

Our prototyping tool also fully executes a subset of Melody. The subset includes basic features such as persistence, active behavior, and a limited set of logic statements that are embedded in state nets. It does not, however, support transaction processing, multiple threads of control, or optimization. We are currently enhancing our tools and working on a more complete implementation of Melody.

SPECIFIC RESEARCH PROJECTS

Within our OSM integration framework we are working on or wish to work on several specific research projects. These include:

A complete Melody implementation. This includes not only a full prototype implementation of the language, but also a prototype implementation of concurrency control and crash recovery.

Implementations of additional query languages. In addition to our graphical query language, OSM-QL, we would like to implement both a logic data language, OSM-LDL, and an SQL-like language, OSM-SQL.

Specification tools. We would like to turn our initial rapid prototyping tool for OSM [10] into an ``industrial-strength'' CASE tool for specification. Part of this effort would be to allow default interfaces, which we currently generate automatically, to be turned into ``look & feel'' quality user interfaces. Another part of this effort would be to allow specification documents to be semi-automatically generated in a form required by specific clientele.

Design tools. Much research has been accomplished on designing relational database systems from OSM model instances [7], and we are beginning work on a prototype for a near ``industrial-strength'' CASE tool for relational database design. We have also done some work on the theory behind designing object-oriented database systems from OSM, including the development of a new nested normal form (NNF) [12]. We wish to eventually use this work as the basis for a design tool for Melody.

Secondary storage management. To efficiently implement Melody, we need high quality clustering mechanisms. We are able to use our theoretical work on NNF to balance excessive pointer chasing and redundant data storage [12]. In addition, we have done some initial work on on-line, dynamic reclustering.

Support environment. We have begun to investigate distributed object servers [4] as the underlying basis for our OSM system. There is much to do in this area, and we have not come very far along this road. Even now, however, we envision the possibility that a ground-up effort may be advantageous.

System evolution. We have made some initial steps in defining evolution for OSM systems. In OSM system evolution, object behavior as well as object schemes can evolve under the control of the system. Here, the seamless meta-model definition and formal foundation provide the key to guaranteeing the evolution properties we desire for both schemes and behavior.

Reverse engineering and reengineering. We have designed some initial algorithms to reverse engineer SQL schemes into OSM model instances. Once we can reverse-engineer a system into an OSM model instance (which is no small task), we can use the principles of our seamless lifecycle to reengineer systems in the same way as we engineer OSM systems in the first place. We would also like to investigate the reengineering of functioning systems while they continue to be used.

Legacy code. Over the last several decades, an inordinate number of lines of code has been written, and many useful software systems have been developed. Some of this software should be reused, rather than reengineered or redeveloped. Our strong adherence to the principle of data independence and our desire for efficiency have forced us to consider a means to provide hooks to external code and thus to provide a means of software reuse [8]. To make this work, however, we need to develop interface requirements to legacy code and to implement a prototype system for software reuse.

Applications. We have considered application areas in general as targets for OSM, but those of particular interest are large applications that require active, persistent objects. One preliminary study we have done considers the use of OSM for emergency response to possible disasters in the transportation of nuclear waste [9].

We would be happy to receive financial support for any, or all, of these research projects.

REFERENCES

[1] S.W. Clyde, D.W. Embley, and S.N. Woodfield, The Complete Formal Definition for the Syntax and Semantics of OSA, Technical Report, Computer Science Department, Brigham Young University, 1992.

[2] S.W. Clyde, An Initial Theoretical Foundation for Object-Oriented Systems Analysis and Design, Ph.D. Dissertation, Computer Science Department, Brigham Young University, 1993.

[3] S.W. Clyde, D.W. Embley, and S.N. Woodfield, Tunable Formalism in Object-oriented Systems Analysis: Meeting the Needs of Both Theoreticians and Practitioners, OOPSLA '92 Conference Proceedings, Vancouver, British Columbia, Canada, October 1992, 452-465.

[4] S.W. Clyde, D.W. Embley, and S.N. Woodfield, Dynamic Distribution of Object Fragments, Technical Report, Department of Computer Science, Brigham Young University, November, 1993.

[5] B.D. Czejdo, R.P. Tucci, D.W. Embley, and S.W. Liddle, Graphical Query Specification with Cardinality Constraints, Proceedings of the Fifth International Conference on Computing and Information, Sudbury, Ontario, Canada, May 1993, 433-437.

[6] D.W. Embley, B.D. Kurtz, and S.N. Woodfield, Object-Oriented Systems Analysis: A Model-Driven Approach, Prentice-Hall, Yourdon Press Series, Englewood Cliffs, New Jersey, 1992.

[7] D.W. Embley and T.W. Ling, Synergistic Database Design with an Extended Entity-Relationship Model, Proceedings of the Eighth International Conference on Entity-Relationship Approach, Toronto, Canada, October 1989, 118-135.

[8] D.W. Embley and S.N. Woodfield, A Knowledge Structure for Reusing Abstract Data Types, Proceedings of the 9th Annual International Conference on Software Engineering, Monterey, California, March/April 1987, 360-368, also reprinted in Software Reuse - Emerging Technology, W. Tracz (ed.), September 1988, 309-317.

[9] D.W. Embley, G. Nagy, and R.R. Souleyrette, An Intelligent Geographic Information System for Radioactive Waste Transportation and Emergency Response, Technical Report, Brigham Young University (Provo, Utah) & Rensselaer Polytechnic Institute (Troy, New York) & University of Nevada (Las Vegas, Nevada), September 1992.

[10] R.B. Jackson, D.W. Embley, and S.N. Woodfield, Automated Support for the Development of Formal Object-oriented Requirements Specifications, The 6th Conference on Advanced Information Systems Engineering, Utrecht, The Netherlands, 6-10 June 1994. (in press)

[11] S.W. Liddle, D.W. Embley, and S.N. Woodfield, Melody Language Specification, Technical Report, Department of Computer Science, Brigham Young University, Provo, Utah. (in preparation)

[12] W.Y. Mok, D.W. Embley, and K-Y. Ng, Theoretical and Practical Implications of a New Definition For Nested Normal Form, Technical Report, Department of Computer Science, Brigham Young University, February, 1994.