CHP 9
CHP 9
CHP 9
Chapter 9
Database Design
Discussion Focus
What is the relationship between a database and an information system, and how does this
relationship have a bearing on database design?
An information system performs three sets of services:
It provides for data collection, storage, and retrieval.
It facilitates the transformation of data into information.
It provides the tools and conditions to manage both data and information.
Basically, a database is a fact (data) repository that serves an information system. If the database is
designed poorly, one can hardly expect that the data/information transformation will be successful, nor is
it reasonable to expect efficient and capable management of data and information.
The transformation of data into information is accomplished through application programs. It is
impossible to produce good information from poor data; and, no matter how sophisticated the
application programs are, it is impossible to use good application programs to overcome the effects of
bad database design. In short: Good database design is the foundation of a successful information
system.
Database design must yield a database that:
Does not fall prey to uncontrolled data duplication, thus preventing data anomalies and the
attendant lack of data integrity.
Is efficient in its provision of data access.
Serves the needs of the information system.
The last point deserves emphasis: even the best-designed database lacks value if it fails to meet
information system objectives. In short, good database designers must pay close attention to the
information system requirements.
Systems design and database design are usually tightly intertwined and are often performed in parallel.
Therefore, database and systems designers must cooperate and coordinate to yield the best possible
information system.
What is the relationship between the SDLC and the DBLC?
The SDLC traces the history (life cycle) of an information system. The DBLC traces the history (life
cycle) of a database system. Since we know that the database serves the information system, it is not
surprising that the two life cycles conform to the same basic phases.
Suggestion: Use Figure 9.13 as the basis for a discussion of the parallel activities.
41
What basic database design strategies exist, and how are such strategies executed?
Suggestion: Use Figure 9.14 as the basis for this discussion.
There are two basic approaches to database design: top-down and bottom-up.
Top-down design begins by identifying the different entity types and the definition of each entity's
attributes. In other words, top-down design:
starts by defining the required data sets and then
defines the data elements for each of those data sets.
Bottom-up design:
first defines the required attributes and then
groups the attributes to form entities.
Although the two methodologies tend to be complementary, database designers who deal with small
databases with relatively few entities, attributes, and transactions tend to emphasize the bottom-up
approach. Database designers who deal with large, complex databases usually find that a primarily
top-down design approach is more appropriate.
In spite of the frequent arguments concerning the best design approach, perhaps the top -down vs.
bottom-up distinction is quite artificial. The text's note is worth repeating:
NOTE
Even if a generally top-down approach is selected, the normalization process that revises
existing table structures is (inevitably) a bottom-up technique. E-R models constitute a topdown process even if the selection of attributes and entities may be described as bottom-up.
Since both the E-R model and normalization techniques form the basis for most designs, the
top-down vs. bottom-up debate may be based on a distinction without a difference.
42
43
D
A
T
A
Conceptual Model
System
Processes
Data
Constraints
44
D
I
C
T
I
O
N
A
R
Y
NOTE
Use the texts Figures 9.15 and 9.16 to contrast the two design approaches, then use Figure
9.6 to show the procedure flows; demonstrate that such procedure flows are independent of
the degree of centralization.
In contrast, when company operations are spread across multiple operational sites or when the
database has multiple entities that are subject to complex relations, the best approach is often based
on the decentralized design.
Typically, a decentralized design requires that the design task be divided into multiple modules, each
one of which is assigned to a design team. The design team activities are coordinated by the lead
designer, who must aggregate the design teams' efforts.
Since each team focuses on modeling a subset of the system, the definition of boundaries and the
interrelation between data subsets must be very precise. Each team creates a conceptual data model
corresponding to the subset being modeled. Each conceptual model is then verified individually
against the user views, processes, and constraints for each of the modules. After the verification
process has been completed, all modules are integrated in one conceptual model.
Since the data dictionary describes the characteristics of all the objects within the conceptual data
model, it plays a vital role in the integration process. Naturally, after the subsets have been aggregated
into a larger conceptual model, the lead designer must verify that the combined conceptual model is
still able to support all the required transactions. Thus the decentralized design activities may be
summarized as shown in Figure Q8.6B.
45
DATA COMPONENT
Conceptual
Models
Verification
Subset A
Subset B
Subset C
Views,
Processes,
Constraints
Views,
Processes,
Constraints
Views,
Processes,
Constraints
Aggregation
D
A
T
A
D
I
C
T
I
O
N
A
R
Y
Keep in mind that the aggregation process requires the lead designer to assemble a single model in
which various aggregation problems must be addressed:
synonyms and homonyms. Different departments may know the same object by different names
(synonyms), or they may use the same name to address different objects (homonyms.) The
object may be an entity, an attribute, or a relationship.
entity and entity subclasses. An entity subset may be viewed as a separate entity by one or more
departments. The designer must integrate such subclasses into a higher-level entity.
Conflicting object definitions. Attributes may be recorded as different types (character,
numeric), or different domains may be defined for the same attribute. Constraint definitions, too,
may vary. The designer must remove such conflicts from the model.
6. What is the minimal data rule in conceptual design? Why is it important?
The minimal data rule specifies that all the data defined in the data model are actually required to fit
present and expected future data requirements. This rule may be phrased as All that is needed is
46
NOTE
Do keep in mind that an ERD cannot always include all the applicable business rules. For
example, although constraints are often crucial, it is often not possible to model them. For
instance, there is no way to model a constraint such as no pilot may be assigned to flight
duties more than ten hours during any 24-hour period.
It is also worth emphasizing that the description of (company) operations must be done in
almost excruciating detail and it must be verified and re-verified. An inaccurate description
of operations yields inaccurate business rules that lead to database designs that are destined
to fail.
9. What is the data dictionary's function in database design?
A good data dictionary provides a precise description of the characteristics of all the entities and
attributes found within the database. The data dictionary thus makes it easier to check for the
existence of synonyms and homonyms, to check whether all attributes exist to support required
reports, to verify appropriate relationship representations, and so on. The data dictionary's contents
are both developed and used during the six DBLC phases:
DATABASE INITIAL STUDY
The basic data dictionary components are developed as the entities and attributes are defined during
this phase.
DATABASE DESIGN
The data dictionary contents are used to verify the database design components: entities, attributes,
and their relationships. The designer also uses the data dictionary to check the database design for
homonyms and synonyms and verifies that the entities and attributes will support all required query
47
ACTIVITY
Identify, analyze, and refine the business rules.
Identify the main entities, using the results of Step 1.
Define the relationships among the entities, using the results of Steps 1 and 2.
Define the attributes, primary keys, and foreign keys for each of the entities.
Normalize the entities. (Remember that entities are implemented as tables in an RDBMS.)
Complete the initial ER diagram.
Have the main end users verify the model in Step 6 against the data, information, and
processing requirements.
Modify the ER diagram, using the results of Step 7.
Point out that some of the steps listed in Table 9.1 take place concurrently. And some, such as the
normalization process, can generate a demand for additional entities and/or attributes, thereby causing
the designer to revise the ER model. For example, while identifying two main entities, the designer
might also identify the composite bridge entity that represents the many-to-many relationship between
those two main entities.
11. List and briefly explain the activities involved in the verification of an ER model.
48
ACTIVITY
Identify the ER models central entity.
Identify each module and its components.
Identify each modules transaction requirements:
Internal: Updates/Inserts/Deletes/Queries/Reports
External: Module interfaces
Verify all processes against the ER model.
Make all necessary changes suggested in Step 4.
Repeat Steps 25 for all modules.
Keep in mind that the verification process requires the continuous verification of business transactions
as well as system and user requirements. The verification sequence must be repeated for each of the
systems modules.
12. What factors are important in a DBMS software selection?
The selection of DBMS software is critical to the information systems smooth operation.
Consequently, the advantages and disadvantages of the proposed DBMS software should be carefully
studied. To avoid false expectations, the end user must be made aware of the limitations of both the
DBMS and the database.
Although the factors affecting the purchasing decision vary from company to company, some of the
most common are:
Cost. Purchase, maintenance, operational, license, installation, training, and conversion costs.
DBMS features and tools. Some database software includes a variety of tools that facilitate
the application development task. For example, the availability of query by example (QBE),
screen painters, report generators, application generators, data dictionaries, and so on, helps
to create a more pleasant work environment for both the end user and the application
programmer. Database administrator facilities, query facilities, ease of use, performance,
security, concurrency control, transaction processing, and third-party support also influence
DBMS software selection.
Underlying model. Hierarchical, network, relational, object/relational, or object.
Portability. Across platforms, systems, and languages.
DBMS hardware requirements. Processor(s), RAM, disk space, and so on.
Problem Solutions
49
The answer to this question may vary slightly from one designer to the next, depending on the
selected design methodology and even on personal designer preferences. Yet, in spite of such
differences, it is possible to develop a common design methodology to permit the development of a
basic decision-making process and the analysis required in designing an information system.
Whatever the design philosophy, a good designer uses a specific and ordered set of steps through
which the database design problem is approached. The steps are generally based on three phases:
analysis, design, and implementation. These phases yield the following activities:
ANALYSIS
1. Interview the shop manager
2. Interview the mechanics
3. Obtain a general description of company operations
4. Create a description of each system process
DESIGN
5. Create a conceptual model, using E-R diagrams
6. 8. Draw a data flow diagram and system flow charts
50
51
The Inventory module will include the Parts and Purchasing sub-modules. The Payroll Module
will handle all employee and payroll information. The Work order module keeps track of the car
maintenance history and all work orders for maintenance done on a car. The Customer module
keeps track of the billing of the work orders to the customers and of the payments received from
those customers.
c. How will a data dictionary help you develop the system? Give examples.
We have addressed the role of the data dictionary within the DBLC in detail in the answer to
review question 10. Remember that the data dictionary makes it easier to check for the existence
of synonyms and homonyms, to check whether all attributes exist to support required reports, to
verify appropriate relationship representations, and so on. Therefore, the data dictionary's
contents will help us to provide consistency across modules and to evaluate the system's ability to
generate the required reports. In addition, the use of the data dictionary facilitates the creation of
system documentation.
52
53
PLANNING
Initial assessment
Feasibility study
ANALYSIS
User requirements
Study of existing systems
Logical system design
IMPLEMENTATION
Coding, testing, debugging
Installation, fine-tuning
MAINTENANCE
Evaluation
Maintenance
Enhancements
b. What do you envision the DBLC to be?
As is true for the SDLC, the DBLC is not a function of the kind and extent of the collected
information. Thus, the DBLC phases and their activities remain as shown:
DATABASE DESIGN
Create the conceptual design
Create the logical design
create the physical design
OPERATION
Produce the required information flow
Also, as is true for any organization, the system scope and constraints may be very different for
different systems. Therefore, designers may opt to use different techniques at different stages. For
example, the database initial study phase may include separate studies carried out by separate design
teams at several geographically distant locations. Each of the findings of the design teams will later be
integrated to identify the main problems, solutions, and opportunities that will guide the design and
development of the system.
4. Using the same procedures and concepts employed in Problem 1, how would you create an
information system for the Tiny College example in Chapter 4?
Tiny College is a medium-sized educational institution that uses many database-intensive operations,
such as student registration, academic administration, inventory management, and payroll. To create
an information system, first perform an initial database study to determine the information system's
objectives.
Next, study Tiny College's operations and processes (flow of data) to identify the main problems,
constraints, and opportunities. A precise definition of the main problems and constraints will enable
the designer to make sure that the design improves Tiny College's operational efficiency. An
improvement in operational efficiency is likely to create opportunities to provide new services that
will enhance Tiny College's competitive position.
After the initial database study is done and the alternative solutions are presented, the end users
ultimately decide which one of the probable solutions is most appropriate for Tiny College. Keep in
mind that the development of a system this size will probably involve people who have quite different
backgrounds. For example, it is likely that the designer must work with people who play a managerial
role in communications and local area networks, as well as with the "troops in the trenches" such as
programmers and system operators. The designer should, therefore, expect that there will be a wide
range of opinions concerning the proposed system's features. It is the designer's job to reconcile the
many (and often conflicting) views of the "ideal" system.
Once a proposed solution has been agreed upon, the designer(s) may determine the proposed system's
scope and boundaries. We are then able to begin the design phase. As the design phase begins, keep
in mind that Tiny College's information system is likely to be used by many users (20 to 40 minimum)
who are located on distant sites across campus. Therefore, the designer must consider a range of
communication issues involving the use of such technologies as local area networks. These
technologies must be considered as the database designer(s) begin to develop the structure of the
database to be implemented.
The remaining development work conforms to the SDLC and the DBLC phases. Special attention
must be given to the system design's implementation and testing to ensure that all the system modules
interface properly.
Finally, the designer(s) must provide all the appropriate system documentation and ensure that all
appropriate system maintenance procedures (periodic backups, security checks, etc.) are in place to
ensure the system's proper operation.
Keep in mind that two very important issues in a university-wide system are end-user training and
56
57