Lecture: Database System - Chapter 2: The Relational Data Model & SQL

Chapter 2
The Relational Data Model & SQL
Copyright © 2004 Pearson Education, Inc.

Outline
Relational Model Concepts
Relational Model Constraints and Relational Database Schemas
Update Operations and Dealing with Constraint Violations
Basic SQL

Relational Model Concepts
The relational model of data is based on the concept of a relation.
A relation is a mathematical concept based on the ideas of sets.
The strength of the relational approach to data management comes from the formal foundation provided by the theory of relations.
We review the essentials of the relational approach in this chapter.

Relational Model Concepts
The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970.
This paper caused a major revolution in the field of database management and earned Ted Codd the coveted ACM Turing Award.

INFORMAL DEFINITIONS
RELATION: A table of values.
A relation may be thought of as a set of rows.
A relation may alternatively be thought of as a set of columns.
Each row represents a fact that corresponds to a real-world entity or relationship.
Each row has a value of an item or set of items that uniquely identifies that row in the table.
Sometimes row-ids or sequential numbers are assigned to identify the rows in the table.
Each column is typically referred to by its column name, column header, or attribute name.

FORMAL DEFINITIONS
A relation may be defined in multiple ways.
The schema of a relation: R(A1, A2, ..., An)
Relation schema R is defined over attributes A1, A2, ..., An.
For example: CUSTOMER(Cust-id, Cust-name, Address, Phone#)
Here, CUSTOMER is a relation defined over the four attributes Cust-id, Cust-name, Address, Phone#, each of which has a domain or a set of valid values.
For example, the domain of Cust-id is the set of 6-digit numbers.

FORMAL DEFINITIONS
A tuple is an ordered set of values.
Each value is derived from an appropriate domain.
Each row in the CUSTOMER table may be referred to as a tuple in the table and would consist of four values.
A list of four values, one each for Cust-id, Cust-name, Address, and Phone#, is a tuple belonging to the CUSTOMER relation.
A relation may be regarded as a set of tuples (rows).
Columns in a table are also called attributes of the relation.

FORMAL DEFINITIONS
A domain has a logical definition: e.g., "USA_phone_numbers" is the set of 10-digit phone numbers valid in the U.S.
A domain may have a data type or a format defined for it. USA_phone_numbers may have the format (ddd)-ddd-dddd, where each d is a decimal digit. Dates, for example, have various formats such as "monthname, date, year", yyyy-mm-dd, or "dd mm,yyyy".
An attribute designates the role played by the domain. E.g., the domain Date may be used to define the attributes "Invoice-date" and "Payment-date".

FORMAL DEFINITIONS
The relation is formed over the Cartesian product of the sets; each set has values from a domain; that domain is used in a specific role, which is conveyed by the attribute name.
For example, attribute Cust-name is defined over the domain of strings of 25 characters. The role these strings play in the CUSTOMER relation is that of the name of customers.
Formally, given R(A1, A2, ..., An):
r(R) ⊆ dom(A1) X dom(A2) X ... X dom(An)
R: schema of the relation
r of R: a specific "value" or population of R
R is also called the intension of a relation.
r is also called the extension of a relation.

FORMAL DEFINITIONS
Let S1 = {0,1}
Let S2 = {a,b,c}
Let R ⊆ S1 X S2
Then, for example, r(R) = {<0,a>, <0,b>, <1,c>} is one possible "state" or "population" or "extension" r of the relation R, defined over domains S1 and S2.
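The idea that a relation state is a subset of a Cartesian product can be checked mechanically. The following is a minimal sketch, not part of the original slides: it enumerates S1 X S2 with Python's `itertools.product` and tests that a chosen state lies inside it. The variable names (`s1`, `s2`, `cartesian`, `state`) and the three tuples are illustrative assumptions.

```python
from itertools import product

# Domains from the slide: S1 = {0, 1}, S2 = {a, b, c}
s1 = {0, 1}
s2 = {"a", "b", "c"}

# The Cartesian product S1 X S2 contains every possible tuple (2 * 3 = 6).
cartesian = set(product(s1, s2))

# One possible relation state: any subset of the product qualifies.
# These three tuples are an illustrative choice.
state = {(0, "a"), (0, "b"), (1, "c")}

is_valid_state = state <= cartesian  # subset test
print(len(cartesian), len(state), is_valid_state)  # 6 3 True
```

Any other subset of the six tuples, including the empty set, would be an equally valid state of R.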
This state r(R) has three tuples.

DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute/Domain
Row                   Tuple
Values in a column    Domain
Table Definition      Schema of a Relation
Populated Table       Extension

CHARACTERISTICS OF RELATIONS
Ordering of tuples in a relation r(R): The tuples are not considered to be ordered, even though they appear to be in tabular form.
Ordering of attributes in a relation schema R (and of values within each tuple): We will consider the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2, ..., vn> to be ordered. (However, a more general alternative definition of relation does not require this ordering.)
Values in a tuple: All values are considered atomic (indivisible). A special null value is used to represent values that are unknown or inapplicable to certain tuples.

CHARACTERISTICS OF RELATIONS
Notation: We refer to component values of a tuple t by t[Ai] = vi (the value of attribute Ai for tuple t). Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes Au, Av, ..., Aw, respectively.

Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances. There are three main types of constraints:
Key constraints
Entity integrity constraints
Referential integrity constraints

Key Constraints
Superkey of R: A set of attributes SK of R such that no two tuples in any valid relation instance r(R) will have the same value for SK. That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].
Key of R: A "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
Example: The CAR relation schema
CAR(State, Reg#, SerialNo, Make, Model, Year)
has two keys, Key1 = {State, Reg#} and Key2 = {SerialNo}, which are also superkeys.
{SerialNo, Make} is a superkey but not a key.
If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.

Entity Integrity
Relational Database Schema: A set S of relation schemas that belong to the same database. S is the name of the database.
S = {R1, R2, ..., Rn}
Entity Integrity: The primary key attributes PK of each relation schema R in S cannot have null values in any tuple of r(R). This is because primary key values are used to identify the individual tuples.
t[PK] ≠ null for any tuple t in r(R)
Note: Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.

Referential Integrity
A constraint involving two relations (the previous constraints involve a single relation).
Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation.
Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R2. A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1.FK to R2.

Referential Integrity Constraint
Statement of the constraint: the value in the foreign key column (or columns) FK of the referencing relation R1 can be either:
(1) a value of an existing primary key PK in the referenced relation R2, or
(2) a null.
In case (2), the FK in R1 should not be a part of its own primary key.

Other Types of Constraints
Semantic Integrity Constraints:
Based on application semantics and cannot be expressed by the model per se.
E.g., "the maximum number
of hours per employee for all projects he or she works on is 56 hours per week."
A constraint specification language may have to be used to express these.
SQL-99 provides triggers and ASSERTIONS to express some of these.

Case study: Company Database

Example COMPANY Database
Requirements of the company (oversimplified for illustrative purposes):
The company is organized into DEPARTMENTs. Each department has a name, a number, and an employee who manages the department. We keep track of the start date of the department manager.
Each department controls a number of PROJECTs. Each project has a name and a number, and is located at a single location.

Example COMPANY Database
We store each EMPLOYEE's social security number, address, salary, sex, and birthdate. Each employee works for one department but may work on several projects. We keep track of the number of hours per week that an employee currently works on each project. We also keep track of the direct supervisor of each employee.
Each employee may have a number of DEPENDENTs. For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee.

Update Operations on Relations
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Integrity constraints should not be violated by the update operations.
Several update operations may have to be grouped together.
Updates may propagate to cause other updates automatically.
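Two of these points, an update being refused because it would violate a constraint and an update propagating automatically, can be sketched in a few lines. This is an illustrative sketch only, not taken from the slides: it assumes SQLite (through Python's standard `sqlite3` module) as the DBMS, uses pared-down DEPARTMENT and PROJECT tables, and the sample rows (including the project named 'Ghost') are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Assumed, simplified tables; PROJECT.DNUM references DEPARTMENT.DNUMBER.
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, DNAME VARCHAR(10) NOT NULL);
CREATE TABLE PROJECT (
    PNUMBER INTEGER PRIMARY KEY,
    PNAME   VARCHAR(15),
    DNUM    INTEGER REFERENCES DEPARTMENT (DNUMBER) ON DELETE CASCADE
);
INSERT INTO DEPARTMENT VALUES (5, 'Research'), (4, 'Admin');
INSERT INTO PROJECT VALUES (1, 'ProductX', 5), (2, 'ProductY', 5), (10, 'Reorg', 4);
""")

# Violation: department 99 does not exist, so the insert is refused.
try:
    conn.execute("INSERT INTO PROJECT VALUES (20, 'Ghost', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Propagation: deleting department 5 also deletes the projects it controls.
conn.execute("DELETE FROM DEPARTMENT WHERE DNUMBER = 5")
remaining = conn.execute("SELECT PNUMBER FROM PROJECT ORDER BY PNUMBER").fetchall()

print(rejected, remaining)  # True [(10,)]
```

Without the ON DELETE CASCADE clause, the DELETE itself would be refused in this setup, because projects 1 and 2 would be left referencing a department that no longer exists.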
This propagation may be necessary to maintain integrity constraints.

Update Operations on Relations
In case of an integrity violation, several actions can be taken:
Cancel the operation that causes the violation (REJECT option)
Perform the operation but inform the user of the violation
Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option)
Execute a user-specified error-correction routine

In-Class Exercise
(Taken from Exercise 5.15)
Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign keys for this schema.

Basic SQL
SQL Data Definition & Data Types
Specifying Constraints in SQL
Basic Retrieval Queries in SQL
INSERT, DELETE, UPDATE

SQL developments: an overview
In 1986, ANSI and ISO published an initial standard for SQL: SQL-86 or SQL1.
In 1992, the first major revision to the ISO standard occurred, referred to as SQL2 or SQL-92.
In 1999, SQL-99 (SQL3) was released with support for object-oriented data management.
In late 2003, SQL-2003 was released.
Most recently, SQL-2006 was published.

SQL
DDL: Create, Alter, Drop
DML: Select, Insert, Update, Delete
DCL: Commit, Rollback, Grant, Revoke

CREATE SCHEMA
Introduced with SQL-92.
An SQL schema groups together tables and other constructs that belong to the same database application.
    CREATE SCHEMA SchemaName AUTHORIZATION AuthorizationIdentifier

CREATE TABLE
Specifies a new base relation by giving it a name and specifying each of its attributes and their data types (INTEGER, FLOAT, DECIMAL(i,j), CHAR(n), VARCHAR(n)).
A NOT NULL constraint may be specified on an attribute.
    CREATE TABLE DEPARTMENT (
        DNAME VARCHAR(10) NOT NULL,
        DNUMBER INTEGER NOT NULL,
        MGRSSN CHAR(9),
        MGRSTARTDATE CHAR(9) );

CREATE TABLE
    CREATE TABLE Company.TableName
or
    CREATE TABLE TableName

CREATE TABLE
    CREATE TABLE TableName
    {(colName dataType [NOT NULL] [UNIQUE]
      [DEFAULT defaultOption]
      [CHECK searchCondition] [,...]}
    [PRIMARY KEY (listOfColumns),]
    {[UNIQUE (listOfColumns),] [,]}
    {[FOREIGN KEY (listOfFKColumns)
      REFERENCES ParentTableName [(listOfCKColumns)],
      [ON UPDATE referentialAction]
      [ON DELETE referentialAction ]] [,]}
    {[CHECK (searchCondition)] [,] })

Data Types
Numeric: INT or INTEGER, FLOAT or REAL, DOUBLE PRECISION
Character string: fixed length CHAR(n), varying length VARCHAR(n)
Bit string: BIT(n), e.g. B'1001'
Boolean: true, false or NULL
DATE: made up of year-month-day in the format yyyy-mm-dd
TIME: made up of hour:minute:second in the format hh:mm:ss
TIME(i): made up of hour:minute:second plus i additional digits specifying fractions of a second; the format is hh:mm:ss:ii...i
TIMESTAMP: has both DATE and TIME components

Data Types
A domain can be declared and used with the attribute specification.
    CREATE DOMAIN DomainName AS DataType [CHECK conditions];

Specifying Constraints in SQL
Specifying Attribute Constraints and Attribute Defaults
Default values: DEFAULT can be specified for an attribute.
If no default clause is specified, the default value is NULL for attributes that do not have the NOT NULL constraint.
CHECK clause: restricts attribute or domain values.
    DNUMBER INT NOT NULL CHECK (DNUMBER > 0 AND DNUMBER < ...)
The same kind of range check can be placed on a domain such as D_NUM:
    CHECK (D_NUM > 0 AND D_NUM < ...)

Basic form of the SQL SELECT statement:
    SELECT <attribute list>
    FROM <table list>
    WHERE <condition>
<attribute list> is a list of attribute names whose values are to be retrieved by the query.
<table list> is a list of the relation names required to process the query.
<condition> is a conditional (Boolean) expression that identifies the tuples to be retrieved by the query.

Relational Database Schema [figure]

Populated Database [figure]

Simple SQL Queries
All subsequent examples use the COMPANY database.
Example of a simple query on one relation:
Query 0: Retrieve the birthdate and address of
the employee whose name is 'John B. Smith'.
    Q0: SELECT BDATE, ADDRESS
        FROM EMPLOYEE
        WHERE FNAME='John' AND MINIT='B' AND LNAME='Smith'
The SELECT-clause specifies the projection attributes and the WHERE-clause specifies the selection condition.
The result of the query may contain duplicate tuples.

Simple SQL Queries (cont.)
Query 1: Retrieve the name and address of all employees who work for the 'Research' department.
    Q1: SELECT FNAME, LNAME, ADDRESS
        FROM EMPLOYEE, DEPARTMENT
        WHERE DNAME='Research' AND DNUMBER=DNO
(DNAME='Research') is a selection condition.
(DNUMBER=DNO) is a join condition.

Simple SQL Queries (cont.)
Query 2: For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birthdate.
    Q2: SELECT PNUMBER, DNUM, LNAME, BDATE, ADDRESS
        FROM PROJECT, DEPARTMENT, EMPLOYEE
        WHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION='Stafford'
In Q2, there are two join conditions.
The join condition DNUM=DNUMBER relates a project to its controlling department.
The join condition MGRSSN=SSN relates the controlling department to the employee who manages that department.

Aliases, * and DISTINCT, Empty WHERE-clause
In SQL, we can use the same name for two (or more) attributes as long as the attributes are in different relations.
A query that refers to two or more attributes with the same name must qualify the attribute name with the relation name by prefixing the relation name to the attribute name.
Example: EMPLOYEE.LNAME, DEPARTMENT.DNAME

ALIASES
Some queries need to refer to the same relation twice.
In this case, aliases are given to the relation name.
Query 8: For each employee, retrieve the employee's name, and the name of his or her immediate supervisor.
    Q8: SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
        FROM EMPLOYEE E, EMPLOYEE S
        WHERE E.SUPERSSN=S.SSN
In Q8, the alternate relation names E and S are called aliases or tuple variables for the EMPLOYEE relation.
We can think of E and S as two different copies of EMPLOYEE; E represents employees in the role of supervisees and S represents employees in the role of supervisors.

ALIASES (cont.)
Aliasing can also be used in any SQL query for convenience.
The AS keyword can also be used to specify aliases:
    Q8: SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
        FROM EMPLOYEE AS E, EMPLOYEE AS S
        WHERE E.SUPERSSN=S.SSN

UNSPECIFIED WHERE-clause
A missing WHERE-clause indicates no condition; hence, all tuples of the relations in the FROM-clause are selected.
This is equivalent to the condition WHERE TRUE.
Query 9: Retrieve the SSN values for all employees.
    Q9: SELECT SSN
        FROM EMPLOYEE
If more than one relation is specified in the FROM-clause and there is no join condition, then the CARTESIAN PRODUCT of tuples is selected.

UNSPECIFIED WHERE-clause (cont.)
Example:
    Q10: SELECT SSN, DNAME
         FROM EMPLOYEE, DEPARTMENT
It is extremely important not to overlook specifying any selection and join conditions in the WHERE-clause; otherwise, incorrect and very large relations may result.

USE OF *
To retrieve all the attribute values of the selected tuples, a * is used, which stands for all the attributes.
Examples:
    Q1C: SELECT *
         FROM EMPLOYEE
         WHERE DNO=5
    Q1D: SELECT *
         FROM EMPLOYEE, DEPARTMENT
         WHERE DNAME='Research' AND DNO=DNUMBER

USE OF DISTINCT
SQL does not treat a relation as a set; duplicate tuples can appear.
To eliminate duplicate tuples in a query result, the keyword DISTINCT is used.
For example, the result of Q11 may have duplicate SALARY values whereas Q11A does not have any duplicate values.
    Q11: SELECT SALARY
         FROM EMPLOYEE
    Q11A: SELECT DISTINCT SALARY
          FROM EMPLOYEE

SUBSTRING COMPARISON
The LIKE comparison operator is used to compare partial strings.
'%' (or '*' in some
implementations) replaces an arbitrary number of characters.
'_' replaces a single arbitrary character.

SUBSTRING COMPARISON (cont.)
Query 25: Retrieve all employees whose address is in Houston, Texas. Here, the value of the ADDRESS attribute must contain the substring 'Houston,TX'.
    Q25: SELECT FNAME, LNAME
         FROM EMPLOYEE
         WHERE ADDRESS LIKE '%Houston,TX%'

SUBSTRING COMPARISON (cont.)
Query 26: Retrieve all employees who were born during the 1950s. Here, '5' must be the 8th character of the string (according to our format for date), so the BDATE value is '_______5_', with each underscore as a placeholder for a single arbitrary character.
    Q26: SELECT FNAME, LNAME
         FROM EMPLOYEE
         WHERE BDATE LIKE '_______5_'
The LIKE operator allows us to get around the fact that each value is considered atomic and indivisible; hence, in SQL, character string attribute values are not atomic.

ARITHMETIC OPERATIONS
The standard arithmetic operators '+', '-', '*', and '/' can be applied to numeric values in an SQL query result.
Query 27: Show the effect of giving all employees who work on the 'ProductX' project a 10% raise.
    Q27: SELECT FNAME, LNAME, 1.1*SALARY
         FROM EMPLOYEE, WORKS_ON, PROJECT
         WHERE SSN=ESSN AND PNO=PNUMBER AND PNAME='ProductX'

Specifying Updates in SQL
There are three SQL commands to modify the database: INSERT, DELETE, and UPDATE.

INSERT
Adds one or more tuples to a relation.
Attribute values should be listed in the same order as the attributes were specified in the CREATE TABLE command.

INSERT (cont.)
Example:
    U1: INSERT INTO EMPLOYEE
        VALUES ('Richard','K','Marini', '653298653', '30-DEC-52', '98 Oak Forest,Katy,TX', 'M', 37000,'987654321', 4 )
An alternate form of INSERT specifies explicitly the attribute names that correspond to the values in the new tuple.
Attributes with NULL values can be left out.
Example: Insert a tuple for a new EMPLOYEE for whom we only know the FNAME, LNAME, and SSN attributes.
    U1A: INSERT INTO EMPLOYEE (FNAME, LNAME, SSN)
         VALUES ('Richard', 'Marini', '653298653')

INSERT (cont.)
Important Note: Only the constraints specified in the DDL commands are automatically enforced by the DBMS when updates are applied to the database.
Another variation of INSERT allows insertion of multiple tuples resulting from a query into a relation.

INSERT (cont.)
Example: Suppose we want to create a temporary table that has the name, number of employees, and total salaries for each department. A table DEPTS_INFO is created by U3A, and is loaded with the summary information retrieved from the database by the query in U3B.
    U3A: CREATE TABLE DEPTS_INFO
         (DEPT_NAME VARCHAR(10),
          NO_OF_EMPS INTEGER,
          TOTAL_SAL INTEGER);
    U3B: INSERT INTO DEPTS_INFO (DEPT_NAME, NO_OF_EMPS, TOTAL_SAL)
         SELECT DNAME, COUNT (*), SUM (SALARY)
         FROM DEPARTMENT, EMPLOYEE
         WHERE DNUMBER=DNO
         GROUP BY DNAME ;

INSERT (cont.)
Note: The DEPTS_INFO table may not be up-to-date if we change the tuples in either the DEPARTMENT or the EMPLOYEE relations after issuing U3B.
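The staleness described in this note can be observed directly. The sketch below is illustrative rather than part of the slides: it assumes SQLite (via Python's standard `sqlite3` module), keeps only the DEPARTMENT and EMPLOYEE columns that U3B needs, and uses made-up sample rows. U3A and U3B are run as written on the slide, and a new employee is added afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Pared-down tables with hypothetical sample rows, then U3A and U3B.
cur.executescript("""
CREATE TABLE DEPARTMENT (DNAME VARCHAR(10) NOT NULL, DNUMBER INTEGER NOT NULL);
CREATE TABLE EMPLOYEE (SSN CHAR(9) NOT NULL, SALARY INTEGER, DNO INTEGER);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Admin', 4);
INSERT INTO EMPLOYEE VALUES ('111111111', 30000, 5),
                            ('222222222', 40000, 5),
                            ('333333333', 25000, 4);
CREATE TABLE DEPTS_INFO (DEPT_NAME VARCHAR(10), NO_OF_EMPS INTEGER, TOTAL_SAL INTEGER);
INSERT INTO DEPTS_INFO (DEPT_NAME, NO_OF_EMPS, TOTAL_SAL)
    SELECT DNAME, COUNT(*), SUM(SALARY)
    FROM DEPARTMENT, EMPLOYEE
    WHERE DNUMBER = DNO
    GROUP BY DNAME;
""")

summary_sql = "SELECT DEPT_NAME, NO_OF_EMPS, TOTAL_SAL FROM DEPTS_INFO ORDER BY DEPT_NAME"
before = cur.execute(summary_sql).fetchall()

# A later change to EMPLOYEE is not reflected in DEPTS_INFO.
cur.execute("INSERT INTO EMPLOYEE VALUES ('444444444', 50000, 5)")
after = cur.execute(summary_sql).fetchall()
live_count = cur.execute("SELECT COUNT(*) FROM EMPLOYEE WHERE DNO = 5").fetchone()[0]

print(before)                     # [('Admin', 1, 25000), ('Research', 2, 70000)]
print(after == before, live_count)  # True 3
```

DEPTS_INFO still reports two employees for Research while EMPLOYEE now holds three, because U3B copied the summary once instead of recomputing it on demand.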
We have to create a view (see later) to keep such a table up to date.

DELETE
Removes tuples from a relation.
Includes a WHERE-clause to select the tuples to be deleted.
Tuples are deleted from only one table at a time (unless CASCADE is specified on a referential integrity constraint).
A missing WHERE-clause specifies that all tuples in the relation are to be deleted; the table then becomes an empty table.
The number of tuples deleted depends on the number of tuples in the relation that satisfy the WHERE-clause.
Referential integrity should be enforced.

DELETE (cont.)
Examples:
    U4A: DELETE FROM EMPLOYEE
         WHERE LNAME='Brown'
    U4B: DELETE FROM EMPLOYEE
         WHERE SSN='123456789'
    U4C: DELETE FROM EMPLOYEE
         WHERE DNO IN (SELECT DNUMBER
                       FROM DEPARTMENT
                       WHERE DNAME='Research')
    U4D: DELETE FROM EMPLOYEE

UPDATE
Used to modify attribute values of one or more selected tuples.
A WHERE-clause selects the tuples to be modified.
An additional SET-clause specifies the attributes to be modified and their new values.
Each command modifies tuples in the same relation.
Referential integrity should be enforced.

UPDATE (cont.)
Example: Change the location and controlling department number of project number 10 to 'Bellaire' and 5, respectively.
    U5: UPDATE PROJECT
        SET PLOCATION = 'Bellaire', DNUM = 5
        WHERE PNUMBER=10

UPDATE (cont.)
Example: Give all employees in the 'Research' department a 10% raise in salary.
    U6: UPDATE EMPLOYEE
        SET SALARY = SALARY *1.1
        WHERE DNO IN (SELECT DNUMBER
                      FROM DEPARTMENT
                      WHERE DNAME='Research')
In this request, the modified SALARY value depends on the original SALARY value in each tuple.
The reference to the SALARY attribute on the right of = refers to the old SALARY value before modification.
The reference to the SALARY attribute on the left of = refers to the new SALARY value after modification.

Summary of SQL Queries
A query in SQL can consist of up to six clauses, but only the first two, SELECT and FROM, are mandatory.
The clauses are specified in the following order:
    SELECT <attribute list>
    FROM <table list>
    [WHERE <condition>]
    [GROUP BY <grouping attribute(s)>]
    [HAVING <group condition>]
    [ORDER BY <attribute list>]
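A single query that uses all six clauses in this order may make the summary more concrete. The sketch below is an illustration, not a query from the slides: it assumes SQLite (via Python's standard `sqlite3` module), a three-column EMPLOYEE table, and made-up sample rows. It reports, for each department with at least two employees earning more than 26000, the number of such employees and their average salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical three-column EMPLOYEE table with made-up rows.
conn.executescript("""
CREATE TABLE EMPLOYEE (LNAME VARCHAR(15), SALARY INTEGER, DNO INTEGER);
INSERT INTO EMPLOYEE VALUES
    ('Smith', 30000, 5), ('Wong', 40000, 5), ('Narayan', 38000, 5),
    ('Zelaya', 25000, 4), ('Wallace', 43000, 4),
    ('Borg', 55000, 1);
""")

query = """
SELECT   DNO, COUNT(*) AS NUM_EMPS, AVG(SALARY) AS AVG_SAL  -- attributes to return
FROM     EMPLOYEE                                           -- relation to read
WHERE    SALARY > 26000                                     -- keep qualifying tuples
GROUP BY DNO                                                -- one group per department
HAVING   COUNT(*) >= 2                                      -- keep qualifying groups
ORDER BY DNO                                                -- sort the result
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(5, 3, 36000.0)]
```

Only department 5 survives: the WHERE-clause removes Zelaya before grouping, which leaves departments 4 and 1 with a single tuple each, and the HAVING-clause then discards those groups.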
