Possible Solution of In-Class Exercise 1: PART A: Short Questions (4 Marks Each)
Possible Solution of In-Class Exercise 1: PART A: Short Questions (4 Marks Each)
Possible Solution of In-Class Exercise 1: PART A: Short Questions (4 Marks Each)
Possible Solution:
Both an EDW and a data mart are data warehouses. An EDW (Enterprise Data
Warehouse) is an all-encompassing DW that covers all subject areas of interest to the
entire organization. A data mart is a smaller DW designed around one problem,
organizational function, topic, or other suitable focus area.
Dependent data mart is a subset that is created directly from a data warehouse
Independent data mart is a small data warehouse designed for a strategic business unit or
a department.
Reference:
p.5-6 of Data Warehousing
Possible Solution:
1. Define the problem
2. Classify the problem into a standard category
3. Construct a model that describes the real-world problem
4. Identify possible solutions to the modeled problem and evaluate the solutions
5. Compare, choose, and recommend a potential solution to the problem
Reference
p.22 of Decision Support Systems and Business Intelligence
3. Use Simon’s four phases of decision making to describe how to decide the portfolio of an
investment.
Possible Solution:
Intelligence consists of gathering information by examining reality, then identifying and
defining the problem. In this phase problem ownership should also be established.
Example: collect information of stocks, bonds, precious metals…
Design consists of determining alternatives and evaluating them. If the evaluation will
require construction of a model, that is done in this phase as well.
Example: go through different portfolio template
The choice phase consists of selecting a tentative solution and testing its validity.
Example: choice the one that gives high return and is robust.
Implementation of the decision consists of putting the selected solution into effect.
Example: purchase.
Possible Solution:
Predictive analytics is the use of statistical techniques and data mining to determine what
is likely to happen in the future. Businesses use predictive analytics to forecast whether
customers are likely to switch to a competitor, what customers are likely to buy, how
likely customers are to respond to a promotion, and whether a customer is creditworthy.
Sports teams have used predictive analytics to identify the players most likely to
contribute to a team’s success.
Composite Key:
CustID, TheaterHallID, ReserveDate,
Calculation:
* PaidAmt = IncomePerNight x NoOfNights
Customer
CustID
CustAddr
CustCity Income
CustCountry
Recommender
ContactDate FK
FK Composite Key
TheaterHall FK
Auxiliary info
TheaterHallID Measure
TheaterLocation
HallNo Time
HallType TimeID
Year
Month
Date
Attribute hierarchies:
Date CustAddr
Customer
(b) A manufacturing company produces products. Each product contains information as
follows: product name, product ID and quantity on hand. Each product is made up of
many components and each component can be used to build various products. Each
component can be supplied by one or more suppliers and a supplier can provide
multiple components. The following component information is kept: component ID,
name, and description. Create an ERD to show how you would track this
information. Show entity names, primary keys, attributes for each entity,
relationships between the entities and cardinality. (5 marks)
Possible Solution:
(c) Consider the Health History Report of an animal hospital as shown in Fig. 2a. Fig.
2b shows the related database schema that is used to generate the report. Note that
the PROCEDURE column of the Health History Report is composed of the
attributes procedure_no and procedure_name.
(i) Which normal form is the schema in? Explain your answer. (3 marks)
(ii) What alternative schema would you suggest? (2 marks)
(a)
(b)
Fig. 2 (a) Sample of the Health History Report. (b) Database schema that is used
to generate the Health History Report
Possible Solution:
(i)
The schema is in the first normal form (1NF).
The schema is not in the second normal form because we can find a partial FD s.t.
procedure_no -> procedure_name and procedure_no is a subset of the primary key
of PET_VISIT.
(ii)
PET (pet_id, Pet_Name, pet_type, pet_age, owner)
PET_VISIT (pet_id, visit_date, procedure_no)
PROCEDURE(procedure_no, procedure_name)