0% found this document useful (0 votes)
20 views15 pages

Database NOTE

Uploaded by

Quang Nguyễn
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views15 pages

Database NOTE

Uploaded by

Quang Nguyễn
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 15

CHAPTER 1: THE DATABASE ENVIRONMENT AND

DEVELOPMENT PROCESS
I. Basic concepts and definitions
- Database: organized collection of logically related data 🡪 A database may be of any size and
complexity
- Data: stored representations of meaningful objects and events
o Structured: numbers, text, dates
o Unstructured: images, video, documents
- Information: data processed to increase knowledge in the person using the data
- Metadata: data that describe the properties or characteristics of end-user data and the context
of those data 🡪 Context helps users understand data
- Graphical displays turn data into useful information that managers can use for decision making
and interpretation
- Descriptions of the properties or characteristics of the data, including data types, field sizes,
allowable values, and data context 🡪 Metadata is not data per se. Instead, it is a description of
how data is to be stored and organized into a database.

II. Disadvantages of file processing


- Program-Data Dependence: All programs maintain metadata for each file they use
o Database application program (or set of related programs) that is used to perform a
series of database activities (create, read, update, and delete) on behalf of database
users
o Each application programmer must maintain his/her own data
o Each application program needs to include code for the metadata of each file
o Each application program must have its own processing routines for reading, inserting,
updating, and deleting data
o Lack of coordination and central control
o Non-standard file formats
- Duplication of Data: Different systems/programs have separate copies of the same data
- Limited Data Sharing: No centralized control of data
- Lengthy Development Times: Programmers must design their own file formats
- Excessive Program Maintenance: 80% of information systems budget
- Problems with data redundancy:
o Waste of space to have duplicate data
o Causes more maintenance headaches
o The biggest problem:
▪ Data changes in one file could cause inconsistencies (dữ liệu không nhất quán)

▪ Compromises in data intergrity (dữ liệu toàn vẹn) - refers to ensuring the
validity, security, and availability of a company’s data
III. Solution: The database approach
- Central repository of shared data
- Data is managed by a controlling agent
- Stored in a standardized, convenient form

🡪 Requires a Database Management System (DBMS)

- DBMS is a software system that is used to create, maintain, and provide controlled access to
user databases 🡪 DBMS manages data resources like an operating system manages hardware
resources
- Most current DBMSs are in the form of relational databases, which represent data as a
collection of tables in which all data relationships are represented by common values in related
tables. We will explore relational databases in detail throughout this course.
- Note that because all the data is shared in a central database, there is no longer the need for
separate systems and programs to maintain their own copy of the data. This reduces duplication
and increases integrity.

IV. Database Management System


- Data models
o Graphical diagram capturing nature and relationship of data
o Enterprise Data Model–high-level entities and relationships for the organization
o Project Data Model–more detailed view, matching data structure in database or data
warehouse
- Entities
o Noun form describing a person, place, object, event, or concept
o Composed of attributes
- Relationships
o Between entities
o Usually one-to-many (1:M) or many-to-many (M:N), but could also be one-to-one (1:1)
- Relational Databases
o Database technology involving tables (relations) representing entities and
primary/foreign keys representing relationships
- Associative entity: represent an association between two other entities
- Comparison of enterprise and project level data models
o The enterprise model doesn’t contain some details, including the attributes of the
entities, as well as the associative entities
o the project-level model which is much more detailed

V. Advantage of the database approach


- Program-data independence
- Planned data redundancy
- Improved data consistency
- Improved data sharing
- Increased application development productivity
- Enforcement of standards
- Improved data quality
- Improved data accessibility and responsiveness
- Reduced program maintenance
- Improved decision support

VI. Costs and risks of the database approach


- New, specialized personnel
- Installation and management cost and complexity
- Conversion costs
- Need for explicit backup and recovery
- Organizational conflict

VII. Components of the database encironment


- Data modeling and design tools -- automated tools used to design databases and application
programs
- Repository–centralized storehouse of metadata
- Database Management System (DBMS) –software for managing the database
- Database–storehouse of the data
- Application Programs–software using the data
- User Interface–text, graphical displays, menus, etc. for user
- Data/Database Administrators–personnel responsible for maintaining the database
- System Developers–personnel responsible for designing databases and software
- End Users–people who use the applications and databases

VIII. Enterprise data model


- First step in the database development process
- Specifies scope and general content
- Overall picture of organizational data at high level of abstraction
- Entity-relationship diagram
- Descriptions of entity types
- Relationships between entities
- Business rules
- One type of matrix matches business functions with the data entity types they need; this is
called a function-to-data-entity matrix.

IX. Two approaches to database and IS development


- SDLC (Systems development life cycle) 🡪 called a “waterfall” approach:
o System Development Life Cycle
o Detailed, well-planned development process
o Time-consuming, but comprehensive
o Long development cycle

Step Purpose Deliverable Database activity

Planning Preliminary Request for study Enterprise modeling and early


understanding conceptual data modeling
Analysis Thorough Functional system Thorough and integated couceptual data
requirements analysis specification modeling
and structuring

Logical Design Information Detailed sedign Logical database design (transactions,


requirements specifications forms, displays, views, data integrity and
elicitation and security)
structure

Physical Design Deevelop tachnoligy Program/ data Physical database design (define
and organizational structures, technology database to DBMS, physical data
specification purchases, organization, database processing
organization redesigns programs)

Implementation Programming, testing, Opetational programs, Database implementation, including


training, instrallation, documentation, coded programs, documentation,
documenting training meterials installation and conversion

Maintenance Monitor, repair, Periodic audits Database maintenance, performance


enhance analysis and tuning, error corrections

- Prototyping 🡪 a classical Rapid Application Development (RAD) approach:


o Rapid application development (RAD)
o Cursory attempt at conceptual data modeling
o Define database during development of initial prototype
o Repeat implementation and maintenance activities with new prototype versions
o Less formal and more ad hoc, involving iterations of coding and using and evolving

X. Other rapid application (RAD) approaches


- Agile – emphasizes “individuals and interactions over processes and tools, working software
over comprehensive documentation, customer collaboration over contract negotiation, and
response to change over following a plan.”
- Examples of agile programming methodologies
o eXtreme programming
o Scrum
o DSDM Consortium
o Feature-driven development

XI. Database schema


- External Schema 🡪 different people have different views ò the database
o User Views
o Subsets of Conceptual Schema
o Can be determined from business-function/data entity matrices
o DBA determines schema for different users
- Conceptual Schema
o E-R models–covered in Chapters 2 and 3
- Internal Schema 🡪 the underlying design and implementation
o Logical structures–covered in Chapter 4
o Physical structures–covered in Chapter 5

CHAPTER 2: MODELING DATA IN THE ORGANIZATION


I. E-R Model Constructs
- Entities:
o Entity instance–person, place, object, event, concept (often corresponds to a row in a
table)
o Entity Type–collection of entities (often corresponds to a table)
- Relationships:
o Relationship instance–link between entities (corresponds to primary key-foreign key
equivalencies in related tables)
o Relationship type–category of relationship…link between entity types
- Attributes:
o Properties or characteristics of an entity or relationship type (often corresponds to a
field in a table)
- Basic E-R notation
o Entity: Strong; Weak; Associative
o Relationships:
▪ Relationship degrees: Unary; Binary; Temary 🡪 specify number of entity types
involved
▪ Relationship cardinality: Mandatory one; Mandatory many; Optional one;
Optional many 🡪 specify how many of each entity type is allowed
II. Business Rules
- Are statements that define or constrain some aspect of the business
- Are derived from policies, procedures, events, functions
- Assert business structure
- Control/influence business behavior
- Are expressed in terms familiar to end users
- Are automated through DBMS software

III. A good business rule is:


- Declarative–what, not how
- Precise–clear, agreed-upon meaning
- Atomic–one statement
- Consistent–internally and externally
- Expressible–structured, natural language
- Distinct–non-redundant
- Business-oriented–understood by business people

IV. A good data name is:


- Related to business, not technical, characteristics
- Meaningful and self-documenting
- Unique
- Readable
- Composed of words from an approved list
- Repeatable
- Written in standard syntax

V. Data definitions
- Explanation of a term or fact:
o Term–word or phrase with specific meaning
o Fact–association between two or more terms
- Guidelines for good data definition
o A concise description of essential data meaning
o Gathered in conjunction with systems requirements
o Accompanied by diagrams
o Achieved by consensus, and iteratively refined

VI. Entities
- Entity – a person, a place, an object, an event, or a concept in the user environment about which
the organization wishes to maintain data
- Entity type – a collection of entities that share common properties or characteristics
- Entity instance – A single occurrence of an entity type
- An entity:
o Should be:
▪ An object that will have many instances in the database

▪ An object that will be composed of multiple attributes

▪ An object that we are trying to model


o Should not be:
▪ A user of the database system

▪ An output of the database system (e.g., a report)


- Inappropriate entities: System user and System output

- Strong entity:
o exists independently of other types of entities
o has its own unique identifier
o identifier underlined with single line
- Weak entity
o dependent on a strong entity (identifying owner)…cannot exist on its own
o does not have a unique identifier (only a partial identifier)
o entity box and partial identifier have double lines
- Identifying relationship: links strong entities to weak entities
VII. Guidelines for naming and defining entities
- Names:
o Singular noun
o Specific to organization
o Concise, or abbreviation
o For event entities, the result not the process
o Name consistent for all diagrams
- Definitions:
o “An X is…”
o Describe unique characteristics of each instance
o Explicit about what is and is not the entity
o When an instance is created or destroyed
o Changes to other entity types
o History that should be kept

VIII. Attributes
- Attribute–property or characteristic of an entity or relationship type
- Classifications of attributes:
o Required versus Optional Attributes
o Simple versus Composite Attribute
o Single-Valued versus Multivalued Attribute
o Stored versus Derived Attributes
o Identifier Attributes
- Required attributes: must have a value for every entity (or relationship) instance with which it is
associated
- Optional attributes: may not have a value for every entity (or relationship) instance with which it
is associated
- Composite attribute: An attribute that has meaningful component parts (attributes)
- Multivalued attributes – may take on more than one value for a given entity (or
relationship) instance
- Derived attributes– values can be calculated from related attribute values (not physically
stored in the database)

IX. Identifiers (Keys)


- Identifier (Key)–an attribute (or combination of attributes) that uniquely identifies individual
instances of an entity type
- Simple versus Composite Identifier
- Candidate Identifier–an attribute that could be an identifier…satisfies the requirements for
being an identifier
- Criteria for identifiers
o Choose Identifiers that
▪ Will not change in value

▪ Will not be null


o Avoid intelligent identifiers (e.g., containing locations or people that might change)
o Substitute new, simple keys for long, composite keys
- The identifier is boldfaced and underlined
- Simple and composite identifier attributes
- Naming attributes:
o Name should be a singular noun or noun phrase
o Name should be unique
o Name should follow a standard format
▪ e.g. [Entity type name { [ Qualifier ] } ] Class
o Similar attributes of different entity types should use the same qualifiers and classes
- Defining attributes:
o State what the attribute is and possibly why it is important
o Make it clear what is and is not included in the attribute’s value
o Include aliases in documentation
o State source of values
o State whether attribute value can change once set
o Specify required vs. optional
o State min and max number of occurrences allowed
o Indicate relationships with other attributes

X. Modeling relationships
- Relationship Types vs. Relationship Instances
o The relationship type is modeled as lines between entity types…the instance is between
specific entity instances
- Relationships can have attributes
o These describe features pertaining to the association between the entities in the
relationship
- Two entities can have more than one type of relationship between them (multiple relationships)
- Associative Entity–combination of relationship and entity
- Relationship types and instances
- Degree of relationships:
o Degree of a relationship is the number of entity types that participate in it
▪ Unary Relationship: One entity related to another of the same entity type

▪ Binary Relationship: Entities of two different types related to each other

▪ Ternary Relationship: Entities of three different types related to each other

- Cardianlity of relationships:
o One-to-One: Each entity in the relationship will have exactly one related entity
o One-to-Many: An entity on one side of the relationship can have many related entities,
but an entity on the other side will have a maximum of one related entity
o Many-to-Many: Entities on both sides of the relationship can have many related entities
on the other side
- Note: a relationship can have attributes of its own
- Cardianlity constraints:
o Cardinality Constraints—the number of instances of one entity that can or must be
associated with each instance of another entity
o Minimum Cardinality: If zero, then optional / If one or more, then mandatory
o Maximum Cardinality: The maximum number
- Entities can be related to one another in more than one way
- If min cardinality constraint is 2. At least two professors must be qualified to teach each course.
Each professor must be qualified to teach at least one course.

- Multivalued attributes can be represented as relationships

XI. Associative entities


- An entity–has attributes
- A relationship–links entities together
- When should a relationship with attributes instead be an associative entity?
o All relationships for the associative entity should be many
o The associative entity could have meaning independent of the other entities
o The associative entity preferably has a unique identifier, and should also have other
attributes
o The associative entity may participate in other relationships other than the entities of
the associated relationship
o Ternary relationships should be converted to associative entities
- A binary relationship with an attribute 🡪 the date completed attribute pertains specifically to
the employee’s completion of a course…it is an attribute of the relationship.

- An associative entity (CERTIFICATE): is like a relationship with an attribute, but it is also


considered to be an entity in its own right.

- Note that the many-to-many cardinality between entities in Figure 2-11a has been replaced by
two one-to-many relationships with the associative entity.
- This could just be a relationship with attributes…it’s a judgment call.
- Time stamp – a time value that is associated with a data value, often indicating when some
event occurred that affected the data value
- Modeling time-dependent data has become more important due to regulations such as HIPAA
and Sarbanes-Oxley. 🡪 The Assignment associative entity shows the date range of a product’s
assignment to a particular product line.

- Different modeling software tools may have different notation for the same constructs.

You might also like