The document discusses the relational data model, which organizes data into tables with rows and columns. The relational model is based on mathematical relations and set theory. Each table represents a relation, with columns representing attributes and rows representing tuples. Relations have a schema that defines the attributes and constraints. Key constraints include superkeys, candidate keys, and primary keys to uniquely identify rows. Entity integrity requires primary keys to be non-null. Referential integrity constraints define relationships between rows in different tables through foreign keys.
The document discusses the relational data model, which organizes data into tables with rows and columns. The relational model is based on mathematical relations and set theory. Each table represents a relation, with columns representing attributes and rows representing tuples. Relations have a schema that defines the attributes and constraints. Key constraints include superkeys, candidate keys, and primary keys to uniquely identify rows. Entity integrity requires primary keys to be non-null. Referential integrity constraints define relationships between rows in different tables through foreign keys.
The document discusses the relational data model, which organizes data into tables with rows and columns. The relational model is based on mathematical relations and set theory. Each table represents a relation, with columns representing attributes and rows representing tuples. Relations have a schema that defines the attributes and constraints. Key constraints include superkeys, candidate keys, and primary keys to uniquely identify rows. Entity integrity requires primary keys to be non-null. Referential integrity constraints define relationships between rows in different tables through foreign keys.
The document discusses the relational data model, which organizes data into tables with rows and columns. The relational model is based on mathematical relations and set theory. Each table represents a relation, with columns representing attributes and rows representing tuples. Relations have a schema that defines the attributes and constraints. Key constraints include superkeys, candidate keys, and primary keys to uniquely identify rows. Entity integrity requires primary keys to be non-null. Referential integrity constraints define relationships between rows in different tables through foreign keys.
Design • Think “logically” when design a database, need some kind of framework to design the database • Like designing a data structure in some programming language, a data model is like a type system, but is abstract • Organize the data into tables in the relational data model • Initially no need to worry about how these tables are implemented Relational Model Concepts • The relational model of data is based on the concept of a mathematical relation • A relation is a mathematical concept based on set theory • The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations Relational Model Concepts • The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970
The above paper caused a major revolution in the
field of Database management and earned Ted Codd the coveted Turing Award What is a Relational Database? Routes RId RName Grade Rating Height 1 Last Tango II 12 100 2 Garden Path I 2 60 3 The Sluice I 8 60 4 Picnic III 3 400 Climbers Climbs CId Cname Skill Age CId RId Date Duration 123 Edmund EXP 80 123 1 10/10/88 5 214 Arnold BEG 25 123 3 11/08/87 1 313 Bridget EXP 33 313 1 12/08/89 5 212 James MED 27 214 2 08/07/92 2 313 3 06/07/94 3 Why is the Database Like This? • Each route has an id, a name, a grade (an estimate of the time needed), a rating (how difficult it is), and a height • Each climber has an id, a name, a skill level and an age • A climb records who climbed what route on what date and how long it took (duration) • The data values in these tables are all “simple” Describing Relations • Relations are described by a schema which can be expressed in various ways • A database schema is usually expressed in a data definition language (DDL) Routes(RId:int, RName:string, Grade:string, Rating:int, Height:int) Climbers(CId:int, CNname:string, Skill:string, Age:int) Climbs(CId:int, RId:int, Date:date, Duration:int) Expressing Constraints • In SQL, constraints are defined as follows:
CREATE TABLE Climbers CREATE TABLE Climbs
(CId INTEGER, (CId INTEGER, CName CHAR(20), RId INTEGER, Skill CHAR(4), Date DATE, Age INTEGER, Duration INTEGER, PRIMARY KEY (CId), PRIMARY KEY (CId, RId, Date), UNIQUE (CName,Age)) FOREIGN KEY (CId) REFERENCES Climbers, FOREIGN KEY (RId) REFERENCES Routes) Informal Definition • RELATION: A table of values – A relation: a set of rows – A relation: alternately a set of columns – Each row represents a fact that corresponds to a real- world entity or relationship – Each row has a value of an item or set of items that uniquely identifies that row in the table – Sometimes row-ids or sequential numbers are assigned to identify the rows in the table – Each column: called by its column name or column header or attribute name Relational Model Terminology • Table = relation • Column headers = attributes • Row = tuple • Possible values of each attribute = domain – E.g., the domain of CName is string and that for Rating is real • Relation schema = relation name + attributes + other structure info., – E.g., keys, other constraints • Relation instance is the current set of rows for a relation schema • Database schema = collection of relation schemas Schema Definition • The schema of a Relation: R (A1, A2, ..... An ) • Relation schema R is defined over attributes A1, A2, .....An – R: Name – Degree: # of attributes n – Cardinality: # of tuples (rows) Example CUSTOMER (Cust-id, Cust-name, Address, Phone#)
• CUSTOMER is a relation defined over four attributes
Cust-id, Cust-name, Address, Phone# • Each attribute has a domain or a set of valid values – E.g., the domain of Cust-id could be 6 digit numbers • A tuple is an ordered set of values – <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"> – A relation may be regarded as a set of tuples (rows) Formal Definition • A relation is a subset of the Cartesian product of the domains – Domain is used in a specific role conveyed by the attribute name – E.g., attribute Cust-name is defined over the domain of strings of 26 characters. The role these strings play in the CUSTOMER relation is the name of customers • Formally, Given R(A1, A2, .........., An) r(R) Ì dom (A1) X dom (A2) X ....X dom(An) R: schema of the relation (intension) r(R): a specific "value" or population of R (extension) Example • Let D1 = {0,1} • Let D2 = {a, b, c} • Let r(R) Ì D1 X D2 • For example: r(R) = {<0,a> , <0,b> , <1,c>} is one possible “state” or “population” or “extension” r of the relation R, defined over domains D1 and D2 Definition Summary Informal Terms Formal Terms Table Relation Column name Attribute Row Tuple Values in a column Domain Table Definition Schema of a Relation Populated Table Extension Example Characteristics of Relations • Ordering of tuples in a relation – Not ordered (like a set) • Ordering of attributes and their values in a relation – Not important as long as the correspondence between attributes and values is maintained Characteristics of Relations • Values in a tuple: – A set of atomic values – Each value in the domain is indivisible – Composite and multi-valued attributes are not allowed – null value: unknown or inapplicable to certain tuples Some Relational Model Notation
• t[Ai] = vi (the value of attribute Ai for tuple
t) • t[Au, Av, ..., Aw]: the tuple of t containing the values of attributes Au, Av, ..., Aw Example Relational Data Model Relational Integrity Constraints • Constraints are conditions that must hold on all valid relation instances • Three main constraints: 1. Key constraints (single relation) 2. Entity integrity constraints (single relation) 3. Referential integrity constraints (two relations) Key Constraints • Superkey: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK – For any distinct tuples t1 and t2 in r(R), t1[SK] ¹ t2[SK] • Candidate Key: A "minimal" superkey – A superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey Example 1 CAR(State, Reg#, SerialNo, Make, Model, Year)
• Two candidate keys:
– Key1 = {State, Reg#} – Key2 = {SerialNo} – Are Key1 and Key2 superkeys? • {SerialNo, Make}, candidate key or superkey? • Primary key: If a relation has several candidate keys, one is chosen arbitrarily to be the primary key – The primary key attributes are underlined – Implies “not null” Example 2 • The CAR relation, with two candidate keys: License-number and Engine_serial_number Entity Integrity • Relational Database Schema: A set of relation schemas that belong to the same database – S is the name of the database S = {R1, R2, ..., Rn} • Entity Integrity: The PK (primary key) attributes of each relation schema R in S cannot have null values in any tuple of r(R) – t[PK] ¹ null for any tuple t in r(R) – Why? • Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key Referential Integrity Constraint • If a tuple in R1 refers to R2, it must refer to an existing tuple in that relation • E.g., Dno in every Employee tuple must match Dnumber value of some tuple in the Department relation Referential Integrity (Foreign Key Constraints) • A constraint involving two relations • Specify a relationship among tuples in two relations: – The referencing relation (R1) and the referenced relation (R2) – FK (foreign key attributes) of R1 reference PK (primary key attributes) or candidate key of R2 • Displayed in a relational database schema as a directed arc from R1.FK to R2.PK or R2.CK Foreign Key • A set of attributes in R1 is a foreign key of R1 that references relation R2 if it satisfies the following two rules: – The attributes in FK have the same domain(s) as the primary key (PK) attributes or candidate key of R2 – A value in the foreign key column (or columns) of the the referencing relation R1 can be either: 1. a value of an existing primary key or candidate key value in the referenced relation R2 2. is null • In case (2), the FK in R1 should not be a part of its own primary key Example
• If no row exists in R2 – violation of referential integrity
• Not all rows in R2 need to be referenced • Value of a foreign key might not be specified • Names of K1 and K2 need not to be the same Foreign Key (Cont.) • Foreign key can refer to its own relation • E.g. SuperSSN is a FK that refers to Employee relation itself Example 1 • Schema diagram for the COMPANY relational database schema, the primary keys are underlined Example 1 (Cont.) • One possible relational database state corresponding to the COMPANY schema Example 1 (Cont.) • Referential integrity constraints displayed on the COMPANY relational database schema diagram Example 2 Example 2 (Cont.) Other Types of Constraints • Domain constraints: values of each attribute A must be an atomic value from the domain for that attribute, dom(A) • Semantic Integrity Constraints: - Based on application semantics and cannot be expressed by the model - E.g., “one student cannot register for more than 9 credits” - A constraint specification language may have to be used to express these - SQL-99 uses triggers and ASSERTIONS to allow for some of these constraints Exercise Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course: STUDENT(SSN, Name, Major, Bdate) COURSE(Course#, Cname, Dept) ENROLL(SSN, Course#, Quarter, Grade) BOOK_ADOPTION(Course#, Quarter, Book_ISBN) TEXT(Book_ISBN, Book_Title, Publisher, Author) Add foreign key constraints to the schema above Update Operations on Relations • INSERT a tuple – Domain, key, entity and referential integrity constraints can be violated • DELETE a tuple – Only referential integrity constraint can be violated • MODIFY a tuple – Modify PK: delete one tuple and insert another in its place – Modify FK: make sure the new value refers to an existing tuple in the referenced relation (or is null) – Neither PK nor FK: only check new value is of the correct domain and data type Update Operations on Relations (Cont.) • Integrity constraints should not be violated by the update operations • Several update operations may have to be grouped together • Updates may propagate to cause other updates automatically Update Operations on Relations (Cont.) • In case of integrity violation, several actions can be taken: – Cancel the operation that causes the violation (REJECT option) – Perform the operation but inform the user of the violation – Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option) – Execute a user-specified error-correction routine Example • The instances below satisfy some constraints Climbers: Climbs: CId CName Skill Age CId RId Date Duration 123 Edmund EXP 80 123 1 10/10/88 5 214 Arnold BEG 25 123 3 11/08/87 1 313 Bridget EXP 33 313 1 12/08/89 5 212 James MED 27 214 2 08/07/92 2 313 1 06/07/94 3
• Insert (123, Jeremy, MED, 16) into Climbers?
• Insert (456, 2, 09/13/98, 3) into Climbs? • Delete (313, Bridget, EXP, 33) from Climbers? • Modify 123 to 456 in Climbers?