Degrees of Data Abstraction
If you ask 10 database designers what a data model is, you will likely get 10 different answers,
each depending on the degree of data abstraction the designer has in mind.
To illustrate the meaning of data abstraction, consider the example of automotive design. A car
designer begins by drawing the concept of the car to be produced. Next, engineers design the details
that help transfer the basic concept into a structure that can be produced. Finally, the engineering
drawings are translated into production specifications to be used on the factory floor. As you can
see, the process of producing the car begins at a high level of abstraction and proceeds to an ever-
increasing level of detail. The factory floor process cannot proceed unless the engineering details are
properly specified, and the engineering details cannot exist without the basic conceptual framework
created by the designer.
Using levels of abstraction can also be very helpful in integrating multiple (and sometimes
conflicting) views of data at different levels of an organization.
The ANSI/SPARC architecture defines three levels of data abstraction: external, conceptual, and internal.
The figure below presents the external schemas for two Tiny College business units: student
registration and class scheduling. Each external schema includes the appropriate entities,
relationships, processes, and constraints imposed by the business unit. Also note that although the
application views are isolated from each other, each view shares a common entity with the other
view. For example, the registration and scheduling external schemas share the entities CLASS and
COURSE.
The use of external views that represent subsets of the database has some important advantages:
• It is easy to identify specific data required to support each business unit’s operations.
• It makes the designer’s job easier by providing feedback about the model’s adequacy. Specifically,
the model can be checked to ensure that it supports all processes as defined by their external
models, as well as all operational requirements and constraints.
• It helps to enforce security constraints in the database design. Damaging an entire database is
more difficult when each business unit works with only a subset of data.
• It makes application program development much simpler.
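The idea of external views as subsets of one shared database can be sketched in code. The following is a minimal illustration using Python's built-in sqlite3 module, not the Tiny College design itself: the table layouts, column names, view names, and sample rows are all hypothetical. It defines one view per business unit and shows that each exposes only the data that unit needs, while both still share CLASS and COURSE data.

```python
import sqlite3

# Hypothetical, simplified schema; every name and value here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course  (crs_code TEXT PRIMARY KEY, crs_title TEXT);
CREATE TABLE class   (class_code INTEGER PRIMARY KEY,
                      crs_code TEXT REFERENCES course(crs_code),
                      class_time TEXT, room_code TEXT);
CREATE TABLE student (stu_num INTEGER PRIMARY KEY, stu_name TEXT);
CREATE TABLE enroll  (class_code INTEGER REFERENCES class(class_code),
                      stu_num INTEGER REFERENCES student(stu_num));

INSERT INTO course  VALUES ('C-101', 'Sample Course');
INSERT INTO class   VALUES (1, 'C-101', 'MWF 9:00', 'R-1');
INSERT INTO student VALUES (100, 'Sample Student');
INSERT INTO enroll  VALUES (1, 100);

-- External view for the student registration unit.
CREATE VIEW registration_view AS
SELECT s.stu_num    AS stu_num,
       s.stu_name   AS stu_name,
       c.class_code AS class_code,
       c.crs_code   AS crs_code
FROM enroll e
JOIN student s ON s.stu_num = e.stu_num
JOIN class c   ON c.class_code = e.class_code;

-- External view for the class scheduling unit.
CREATE VIEW scheduling_view AS
SELECT c.class_code AS class_code,
       c.crs_code   AS crs_code,
       k.crs_title  AS crs_title,
       c.class_time AS class_time,
       c.room_code  AS room_code
FROM class c
JOIN course k ON k.crs_code = c.crs_code;
""")


def view_columns(connection, view_name):
    """Return the column names a view exposes to its users."""
    cursor = connection.execute(f"SELECT * FROM {view_name}")
    return [description[0] for description in cursor.description]


registration_cols = view_columns(conn, "registration_view")
scheduling_cols = view_columns(conn, "scheduling_view")
shared_cols = sorted(set(registration_cols) & set(scheduling_cols))
```

In this sketch the registration view never exposes room assignments and the scheduling view never exposes student names, which mirrors the security and simplicity advantages listed above.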
The conceptual model integrates the external views into a single global view of the data. The most
widely used conceptual model is the ER model. The conceptual model yields some important advantages:
• It provides a bird’s-eye (macro-level) view of the data environment that is relatively easy to
understand.
• The conceptual model is independent of both software and hardware. Software independence
means that the model does not depend on the DBMS software used to implement the model.
Hardware independence means that the model does not depend on the hardware used in the
implementation of the model. Therefore, changes in either the hardware or the DBMS software
will have no effect on the database design at the conceptual level. Generally, the term logical
design refers to the task of creating a conceptual data model that could be implemented in any
DBMS.
The internal model is the representation of the database as seen by the DBMS. Because the internal
model depends on specific database software, it is said to be software
dependent. Therefore, a change in the DBMS software requires that the internal model be changed
to fit the characteristics and requirements of the implementation database model. When you can
change the internal model without affecting the conceptual model, you have logical independence.
However, the internal model is still hardware independent because it is unaffected by the type of
computer on which the software is installed. Therefore, a change in storage devices or even a
change in operating systems will not affect the internal model.
Early data models forced the database designer to take the details of the physical model’s data
storage requirements into account. However, the now-dominant relational model is aimed largely at
the logical level rather than at the physical level; therefore, it does not require the physical-level
details common to its predecessors.
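A small experiment can illustrate why the relational model lets the designer ignore physical-level details. The sketch below uses Python's built-in sqlite3 module; the table, its rows, and the index name are hypothetical, and the index is used here only as a convenient stand-in for a storage-level change. The query names what data is wanted, not how to reach it, so its text and its result stay the same after the storage structures change.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate the point.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE class (class_code INTEGER PRIMARY KEY, crs_code TEXT, room_code TEXT)"
)
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?)",
    [(1, "A-100", "R-1"), (2, "B-200", "R-2"), (3, "A-100", "R-3")],
)

# The query is written purely in logical terms: tables, columns, conditions.
QUERY = "SELECT class_code FROM class WHERE crs_code = ? ORDER BY class_code"

before = conn.execute(QUERY, ("A-100",)).fetchall()

# Change how the data can be accessed on disk; the logical schema is untouched.
conn.execute("CREATE INDEX idx_class_crs ON class (crs_code)")

after = conn.execute(QUERY, ("A-100",)).fetchall()

same_result = before == after
```

The DBMS may choose a different access path once the index exists, but the application code that issues the query does not have to know or care.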