Bài giảng Database systems - Relational data mode
Physical database design
The process of producing a description of the
implementation of the database on secondary storage.
Describes the base relations, file organizations,
and indexes design used to achieve efficient
access to the data, and any associated integrity
constraints and security measures.
48 trang |
Chia sẻ: vutrong32 | Lượt xem: 1700 | Lượt tải: 0
Bạn đang xem trước 20 trang tài liệu Bài giảng Database systems - Relational data mode, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
DATABASE SYSTEMS
Nguyen Ngoc Thien An
RELATIONAL DATA MODEL
Spring 2014
Contents
2
Overview of Relational Data Model
ER- & EER-to-Relational Mapping
Relational Algebra
Overview of Relational Data Model 3
Contents
4
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
Reading Suggestion: [1] Chapter 3
Introduction
5
Hierarchical and network models
Preceded the relational model
Relational model
The model was first introduced by Ted Codd of IBM
Research in 1970 in the following paper:
"A Relational Model for Large Shared Data Banks”,
Communications of the ACM, June 1970.
First commercial implementations available in early 1980s.
Has been implemented in a large number of commercial
systems.
The relational model of data is based on the concept of a
relation.
A relation is a mathematical concept based on the ideas of
sets.
Contents
6
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
Basic Concepts (1)
7
Relational data model
Represents a database in the form of relations.
A database may contain one or more relations.
Relation
A 2-dimensional table with rows and columns of data.
Row (Tuple): Represents a collection of related data values which typically
corresponds to a real-world entity or relationship.
Described by a relation schema.
Relation schema: 𝑅(𝐴1, 𝐴2, , 𝐴𝑛)
𝑅: Relation name.
𝐴1, 𝐴2, , 𝐴𝑛: List of attributes.
Each attribute 𝐴𝑖 is the name of a role played by some domain D in the
relation schema R.
E.g.: STUDENT (Name, SSN, HomePhone, Address, OfficePhone, Age,
GPA).
Basic Concepts (2)
8
Basic Concepts (3)
9
Degree (or arity) of a relation: the number of
attributes n of its relation schema.
Cardinality of a relation: the number of tuples in a
relation.
Domain D of 𝑨𝒊
Denoted by dom(𝐴𝑖).
Be a set of atomic values and a set of integrity
constraints.
Logical definitions.
Data type.
Format.
Additional information (units,).
Basic Concepts (4)
10
• Degree = ??
• Cardinality = ??
• dom(GPA) = ??
• Format of SSN ??
Basic Concepts (5)
11
• Degree = 7
• Cardinality = 5
• dom(GPA): from 0 to 5.0 (integrity constraint)
• Format of SSN: xxx-yy-zzzz
Basic Concepts (6)
12
Relation (or relation state, relation instance)
Relation schema: 𝑅(𝐴1, 𝐴2, , 𝐴𝑛).
Relation r of 𝑅, denoted by 𝑟(𝑅) , is a set of n-tuples
𝑟 = {𝑡1, 𝑡2, , 𝑡𝑚}.
Each n-tuple t: ordered list of n values 𝑡 =.
Each value 𝑣𝑖 , 𝑖 = 1. . 𝑛: an element of dom(𝐴𝑖) or a special NULL
value.
The ith value in tuple t, which corresponds to the attribute 𝐴𝑖, is
referred to as 𝑡[𝐴𝑖].
𝑟(𝑅): Mathematical relation of degree n on the domains.
𝑟(𝑅) ⊆ (𝑑𝑜𝑚 𝐴1 × 𝑑𝑜𝑚 𝐴2 × × 𝑑𝑜𝑚 𝐴𝑛) .
(the Cartesian product)
Current relation state
Relation state at a given time.
Reflects only the valid tuples that represent a particular state of
the real world.
Basic Concepts (7)
13
Relational database schema S
Set of relation schemas 𝑆 = {𝑅1, 𝑅2, , 𝑅𝑚}.
Set of integrity constraints 𝐼𝐶.
Relational database state
Set of relation states 𝐷𝐵 = {𝑟1, 𝑟2, , 𝑟𝑚}.
Each 𝑟𝑖 is a state of 𝑅𝑖.
𝑟𝑖 relation states satisfy integrity constraints
specified in 𝐼𝐶.
Invalid state: not obey all the integrity constraints.
Valid state: satisfies all the constraints in 𝐼𝐶.
Basic Concepts (8)
14
Basic Concepts (9)
15
Relational data model
Database schema
Relation schema
Relation
Tuple
Attribute
Basic Concepts (10)
16
Alternative Terminology for Relational Model
Formal Term Alternative 1 Alternative 2
Relation Table File
Tuple Row Record
Attribute Column (Header) Field
Domain All possible olumn Values
Schema of a Relation Table Definition
State of the Relation Populated Table
Contents
17
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
Characteristics of Relations (1)
18
Ordering in a relation
Relation defined as a set of tuples All tuples
must be distinct and have no order among them.
The order of attributes and their values is not that
important as long as the correspondence
between attributes and values is maintained.
Each attribute has a distinct name, and is always
referenced by that name, never by its position.
Characteristics of Relations(2)
19
Null value
Represents value for an attribute that is currently
unknown or inapplicable for tuple.
Deals with incomplete or exceptional data.
Meanings for NULL values:
Value unknown.
Value exists but is not available.
Value undefined (The attribute does not apply to this
tuple).
Represents the absence of a value and is not the
same as zero or spaces, which are values.
Characteristics of Relations(3)
20
Two different tuple orders of the relation STUDENT
Contents
21
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
Relational Integrity Constraints (1)
22
Constraints
Restrictions on the actual values in a database state.
Derived from the rules in the miniworld.
Three main categories:
Inherent model-based constraints or implicit constraints
Inherent in the data model.
Be the mentioned characteristics of relations.
Schema-based constraints or explicit constraints
Can be directly expressed in schemas (typically using DDL).
Application-based constraints or semantic constraints or
business rules
Relate to the meaning and behavior of attributes.
Cannot be directly expressed in schemas.
Expressed and enforced by application programs.
Relational Integrity Constraints (2)
23
Another important categories: data
dependencies
Functional dependencies
Multivalued dependencies
Discuss in Chapter 6 !!!
Schema-based Constraints (1)
24
Include:
Domain constraints
Constraints on NULLs
Key constraints
Entity integrity constraints
Referential integrity constraints
Schema-based Constraints (2)
25
Domain constraints
Specify that within each tuple, the value of each attribute A must
be an atomic value from the domain dom(A).
Typically include:
Standard numeric data types for integers and real numbers.
Characters, fixed-length strings, variable-length strings.
Booleans.
Date, time, timestamp.
Money.
Other special data types.
Subrange of values from a data type.
An enumerated data type.
Constraints on NULLs
Specifies whether an attribute may have NULL values or not.
Schema-based Constraints (3)
26
Key constraints
Superkey
A set of attributes SK of R such that no two distinct tuples in any valid
relation instance r(R) can have the same value for SK.
For any distinct tuples 𝑡1 and 𝑡2 in 𝑟(𝑅), 𝑡1[𝑆𝐾] ≠ 𝑡2[𝑆𝐾].
Key
A minimal superkey: a superkey K such that removal of any attribute
from K results in a set of attributes that is not a superkey.
Candidate key
Relation schema may have more than one key. Each of the keys is
called a candidate key.
Primary key
Chosen among candidate keys Used to identify tuples in the
relation.
Underline attributes of the primary key.
Other candidate keys are designated as unique keys (not
underlined).
Schema-based Constraints (4)
27
E.g.:
CAR(State, Reg#, SerialNo, Make, Model, Year)
Candidate keys
{State, Reg#}
{SerialNo} chosen as the primary key.
Super key
{SerialNo, Make} is a superkey but not a key.
{State, Reg#, Model} is a superkey but not a key.
Schema-based Constraints (5)
28
Entity integrity constraints
No primary key attribute value can be NULL:
t[PK] ≠ null for any tuple t in r(R).
Because primary key values are used to identify
the individual tuples.
Note: Other attributes of R may be similarly
constrained to disallow NULL values, even
though they are not members of the primary
key.
Schema-based Constraints (6)
29
Referential integrity constraints
A constraint involving two relations (the previous
constraints involve a single relation).
Maintains consistency among tuples in two relations:
the referencing relation and the referenced relation.
Tuples in the referencing relation 𝑅1 have attributes
FK (called foreign key attributes) that reference the
primary key attributes PK of the referenced relation
𝑅2.
A tuple 𝑡1 in 𝑅1 is said to reference a tuple 𝑡2 in 𝑅2 if
𝑡1 𝐹𝐾 = 𝑡2 PK
Displayed as a directed arc from 𝑅1. 𝐹𝐾 to 𝑅2.
Schema-based Constraints (7)
30
Schema-based Constraints (8)
31
Foreign key
The attributes in FK have the same domain(s) as
the primary key attributes PK.
Value of FK in a tuple 𝑡1 of the current state
𝑟1(𝑅1) either occurs as a value of PK for some
tuple 𝑡2 in the current state 𝑟2(𝑅2) or is NULL.
In case (2), the FK in 𝑅1 should not be a part
of its own primary key.
Schema-based Constraints (9)
32
Semantic Integrity Constraints
33
Semantic integrity constraints
Based on application semantics and cannot be
expressed by the model.
E.g.:
The max. no. of hours per employee for all projects he or
she works on is 56 hrs per week.
The salary of an employee should not exceed the salary of
the employee’s supervisor.
Can be specified and enforced
Using a general-purpose constraint specification
language triggers and assertions.
Within the application programs that update the
database more common.
Relational Integrity Constraints
34
State/Static constraints
Define the constraints that a valid state of the
database must satisfy.
All mentioned above constraints belong to this
type.
Transition/Dynamic constraints
Defined to deal with state changes in the
database.
E.g.: The salary of an employee can only
increase.
Contents
35
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
Operations on Relations
36
The relational model
Retrieval operations: can be specified by
Relational algebra Discussed in Part 3 of Chapter 4
Relational calculus Introduced briefly in Part 3 of Chapter
4.
Update (Modification) operations
Insert: insert one or more new tuples in a relation.
Delete: delete one or more tuples from a relation.
Update (Modify): change the values of some attributes in
existing tuples.
Integrity constraints should not be violated by the update
operations.
Update Operations on Relations (1)
37
Insertion
Can cause violations of integrity constraints:
An attribute value is given that does not appear in the
corresponding domain or is not of the appropriate data
type.
A key value in the new tuple already exists in another tuple
in the relation.
The value of a prime attribute (i.e. the key attribute) is
NULL.
The value of a foreign key (if any) must refer to an existing
tuple in the corresponding relation.
The value of any foreign key in a new tuple refers to a tuple
that does not exist in the referenced relation.
Options if the constraints are violated: Read [1] 3.3.1 !!!
Update Operations on Relations (2)
38
Deletion
Can violate only referential integrity:
The tuple being deleted is referenced by foreign keys from other
tuples in the database.
Options if the constraints are violated: Read [1] 3.3.2 !!!
Modification (Update)
Updating an attribute that is neither part of a primary key nor of a
foreign key:
Usually causes no problems.
Only check to confirm that the new value is of the correct data type
and domain.
Modifying a primary key value is similar to deleting one tuple and
inserting another in its place see Insert and Delete.
If a foreign key attribute is modified, the new value must refers to
an existing tuple in the referenced relation (or is set to NULL).
Options if the constraints are violated: Read [1] 3.3.3 !!!
Update Operations on Relations (3)
39
In case of integrity violation, several actions
can be taken:
Cancel the operation that causes the violation
(REJECT option).
Perform the operation but inform the user of the
violation.
Trigger additional updates so the violation is
corrected (CASCADE option, SET NULL option,
SET DEFAULT option).
Execute a user-specified error-correction routine.
More details: Read [1] 3.3 !!!
Contents
40
Introduction
Basic Concepts
Characteristics of Relations
Relational Integrity Constraints
Operations on Relations
Main Phases of Database Design
41
Three main phases
Conceptual database design.
Logical database design.
Physical database design.
Main Phases of Database Design (1)
42
Main Phases of Database Design (2)
43
Conceptual database design
The process of constructing a model of the data
used in an enterprise.
Independent of all physical considerations.
Comprise entity types, relationship types,
attributes and attribute domains, primary and
alternate keys, structural and integrity constraints.
Use ERD & EERD.
Main Phases of Database Design (3)
44
Result of the
conceptual
design for the
COMPANY
database:
the ERD
45
Logical database design
The process of constructing a model of the data
used in an enterprise.
Based on a specific data model (e.g. relational),
Independent of a particular DBMS and other
physical considerations.
ER- & EER-to-Relational Mapping.
Normalization (in Chapter 6).
Main Phases of Database Design (4)
46
Main Phases of Database Design (5)
Result of the
logical design
for the
COMPANY
database:
the Relational
Schema
(ER- & EER-
to-Relational
Mapping)
47
Physical database design
The process of producing a description of the
implementation of the database on secondary
storage.
Describes the base relations, file organizations,
and indexes design used to achieve efficient
access to the data, and any associated integrity
constraints and security measures.
Main Phases of Database Design (6)
Q & A
48
Các file đính kèm theo tài liệu này:
- 4_relational_data_model_section_1_825.pdf