Lecture notes: Database systems - Database system concepts & architecture


PDF, 59 pages | Shared by: vutrong32 | Date: 17/10/2018
DATABASE SYSTEMS
Nguyen Ngoc Thien An
DATABASE SYSTEM CONCEPTS & ARCHITECTURE
Spring 2014

Contents
• Introduction
• File-based Approach & Shared File Approach
• Database Approach
• Three-Schema Architecture & Data Independence
• Database Languages, Data Models, Database Schemas & Database States
• Classification of DBMS
• Data Management Systems Framework
• Reading Suggestion: [1] Chapter 1, Chapter 2

Introduction (1)
• Traditional database applications: store textual or numeric information.
• Multimedia databases: store images, audio clips, and video streams digitally.
• Geographic information systems (GIS): store & analyze maps, weather data, and satellite images.

Introduction (2)
• Data warehouses & online analytical processing (OLAP) systems: extract and analyze useful business information from very large databases; support decision making.
• Real-time and active database technology: control industrial and manufacturing processes.

File-based Approach (1)
• Data is stored in one or more separate computer files.
• Data is then processed by computer programs (applications).

File-based Approach (3)
• Problems/Limitations
  • Data redundancy
  • Data inconsistency

File-based Approach (4)
[Figure: two panels relating applications to files. File-based approach: each application keeps its own copies of the files it needs (Customer Orders: Customer File, Stock File, Order File; Customer Invoicing: Customer File, Stock File, Order File; Purchase Orders: Stock File, Supplier File; Stock Control: Stock File, Order File). Shared file approach: the same four applications share a single Customer File, Stock File, Order File, and Supplier File.]

Shared File Approach
• Data (files) is shared between different applications.
• The data redundancy problem is alleviated.
• The data inconsistency problem across different versions of the same file is solved.
• Other problems:
  • Rigid data structure: if applications have to share files, the file structure that suits one application might not suit another.
  • Physical data dependency: if the structure of the data file needs to be changed in some way, the change must be reflected in all application programs that use that data file.
  • No support for concurrency control: while a data file is being processed by one application, the file is not available to other applications or for ad hoc queries.

Contents - Database Approach
• Database Approach
  • Overview
  • Data, Database & DBMS
  • Actors on the Scene
  • Workers behind the Scene
  • Characteristics of Database Approach
  • Advantages of Database Approach
  • History of Database Systems
  • When Not to Use a DBMS?

Overview of Database Approach (1)
• Arose because:
  • The definition of data was embedded in application programs, rather than being stored separately and independently.
  • There was no control over access and manipulation of data beyond that imposed by application programs.
• Result:
  • The database and the Database Management System (DBMS).

Overview of Database Approach (2)
[Figure]

Data
• Known facts that can be recorded and that have implicit meaning.
• Information? Knowledge?
• More: www.whatis.com

Database
• Shared collection of logically related data and a description of this data.
  • Logically related data comprises entities, attributes, and relationships of an organization’s information.
  • System catalog (metadata) provides a description of the data to enable program–data independence.
• Miniworld or universe of discourse (UoD):
  • Represents some aspect of the real world.
  • Changes to the miniworld must be reflected in the database as soon as possible.
• Designed to meet the information needs of an organization.
• Example: Amazon.com

DBMS – Definitions
• DataBase Management System (DBMS)
  • Definition 1: a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications.
  • Definition 2: a software system that enables users to define, create, maintain, and control access to the databases.

DBMS – Functions (1)
• Defining a database
  • Specify the data types, structures, and constraints of the data to be stored.
  • Meta-data: database definition or descriptive information, stored by the DBMS in the form of a database catalog or dictionary.
• Constructing a database
  • Store the data on some storage medium that is controlled by the DBMS.
• Manipulating a database
  • Query the database and update it to reflect changes in the miniworld.
  • Generate reports.

DBMS – Functions (2)
• Sharing a database
  • Allow multiple users and programs to access the database simultaneously.
• Protecting a database
  • System protection against hardware or software malfunction (or crashes).
  • Security protection against unauthorized or malicious access.
• Maintaining a database
  • Allow the system to evolve as requirements change over time.

DBMS – Other Concepts
• Application program
  • Accesses the database by sending queries to the DBMS.
• Query
  • Causes some data to be retrieved.
• Transaction
  • May cause some data to be read and some data to be written into the database.
• Controlled access to the database may include:
  • A security system
  • An integrity system
  • A concurrency control system
  • A recovery control system
  • A user-accessible catalog
• Database System = the Database + DBMS software.

Database System Environment
[Figure]

Example of Database: University
[Figure]

Examples of Queries & Updates
• Examples of queries:
  • Retrieve the transcript.
  • List the names of students who took the section of the ‘Database’ course offered in fall 2008 and their grades in that section.
  • List the prerequisites of the ‘Database’ course.
• Examples of updates:
  • Change the class of ‘Smith’ to sophomore.
  • Create a new section for the ‘Database’ course for this semester.
  • Enter a grade of ‘A’ for ‘Smith’ in the ‘Database’ section of last semester.
  • Delete a canceled ‘E-commerce’ course in this semester.

Designing a Database
• Phases for designing a database:
  • Requirements specification and analysis
  • Conceptual design
  • Logical design
  • Physical design

Actors on the Scene (1)
• Database Administrator (DBA)
  • Authorize access to the DB.
  • Coordinate and monitor its use.
  • Acquire software and hardware resources.
• Database Designers
  • Identify the data to be stored in the DB.
  • Choose appropriate structures to represent and store this data.
• End Users
  • People whose jobs require access to the database.
  • Types:
    • Casual end users
    • Naive or parametric end users
    • Sophisticated end users
    • Standalone users

Actors on the Scene (2)
• System Analysts
  • Determine the requirements of end users.
• Application Programmers
  • Implement these specifications as programs.
• More details: see [1] - 1.4

Actors on the Scene (3)
[Figure]

Workers behind the Scene
• DBMS system designers and implementers
  • Design and implement the DBMS modules and interfaces as a software package.
• Tool developers
  • Design and implement tools.
• Operators and maintenance personnel
  • Responsible for running and maintaining the hardware and software environment for the database system.
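Returning to the University example, two of the queries and one of the updates listed earlier can be sketched with Python's built-in sqlite3 module. The slide that shows the actual University schema is a figure that is not reproduced here, so the table names, column names, course numbers, and sample rows below are all assumptions made for illustration, and class 2 is assumed to mean sophomore.

```python
import sqlite3

# Hypothetical, simplified University schema and sample data (assumed, not from the slides).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, class INTEGER);
CREATE TABLE course (course_number TEXT PRIMARY KEY, course_name TEXT);
CREATE TABLE prerequisite (course_number TEXT, prerequisite_number TEXT);
CREATE TABLE section (section_id INTEGER PRIMARY KEY, course_number TEXT,
                      semester TEXT, year INTEGER);
CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT);

INSERT INTO student VALUES (17, 'Smith', 1), (8, 'Brown', 2);
INSERT INTO course VALUES ('CS3380', 'Database'), ('CS3320', 'Data Structures');
INSERT INTO prerequisite VALUES ('CS3380', 'CS3320');
INSERT INTO section VALUES (135, 'CS3380', 'Fall', 2008);
INSERT INTO grade_report VALUES (17, 135, 'B'), (8, 135, 'A');
""")

# Query: names and grades of students who took the 'Database' section of fall 2008.
took_database = conn.execute("""
    SELECT s.name, g.grade
    FROM student s
    JOIN grade_report g ON g.student_number = s.student_number
    JOIN section sec ON sec.section_id = g.section_id
    JOIN course c ON c.course_number = sec.course_number
    WHERE c.course_name = 'Database' AND sec.semester = 'Fall' AND sec.year = 2008
    ORDER BY s.name
""").fetchall()

# Query: prerequisites of the 'Database' course.
prereqs = conn.execute("""
    SELECT p.course_name
    FROM prerequisite pr
    JOIN course c ON c.course_number = pr.course_number
    JOIN course p ON p.course_number = pr.prerequisite_number
    WHERE c.course_name = 'Database'
""").fetchall()

# Update: change the class of 'Smith' to sophomore (assumed to be class 2).
conn.execute("UPDATE student SET class = 2 WHERE name = 'Smith'")
smith_class = conn.execute(
    "SELECT class FROM student WHERE name = 'Smith'").fetchone()[0]

print(took_database, prereqs, smith_class)
```

Both queries are declarative: they state which rows are wanted and leave the choice of access path to the DBMS.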
Database Approach - Characteristics (1)
Main characteristics of the database approach:
• Self-describing nature of a database system
• Insulation between programs and data, and data abstraction
• Support of multiple views of the data
• Sharing of data and multiuser transaction processing

Database Approach - Characteristics (2)
• Self-describing nature of a DB system
  • The database system contains a complete definition of structure and constraints.
  • Meta-data
    • Describes the structure of the database.

Database Approach - Characteristics (3)
• Insulation between programs and data
  • Program-data independence: the structure of data files is stored in the DBMS catalog separately from access programs.
  • Program-operation independence
  • Data abstraction = Program-data independence + Program-operation independence
  • Conceptual representation of data: does not include details of how data is stored or how operations are implemented.
  • Data model: type of data abstraction used to provide the conceptual representation.

Database Approach - Characteristics (4)
• Support of multiple views of the data
  • View
    • Subset of the database.
    • May contain virtual data that is derived from the database files but is not explicitly stored.
  • Multiuser DBMS
    • Users have a variety of distinct applications.
    • Must provide facilities for defining multiple views.

Database Approach - Characteristics (5)
• Sharing of data and multiuser transaction processing
  • Allow multiple users to access the database at the same time.
  • Concurrency control software
    • Ensures that several users trying to update the same data do so in a controlled manner.
  • Online transaction processing (OLTP) application.
  • Transaction
    • Central to many database applications.
    • Executing program or process that includes one or more database accesses.
    • Isolation property
      • Each transaction appears to execute in isolation from other transactions.
    • Atomicity property
      • Either all the database operations in a transaction are executed or none are.
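The atomicity property can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The account table, the names in it, and the transfer function are hypothetical and not part of the lecture; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction (illustrative helper)."""
    with conn:  # commits if the block succeeds, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
        if balance < 0:
            # The debit above has already run; raising here undoes it.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

transfer(conn, "alice", "bob", 30)       # both updates are applied
after_ok = balances(conn)

try:
    transfer(conn, "alice", "bob", 500)  # fails after the debit, so nothing is applied
except ValueError:
    pass
after_failed = balances(conn)

print(after_ok, after_failed)
```

The sketch runs on a single connection, so it says nothing about isolation between concurrent transactions; that would need a second connection and is left out to keep the example short.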
Database Approach – Advantages (1)
• Controlling redundancy
  • Data normalization
  • Denormalization
    • Sometimes necessary to use controlled redundancy to improve the performance of queries.
• Restricting unauthorized access
• Providing persistent storage for program objects
  • Impedance mismatch problem
• Providing storage structures and search techniques for efficient query processing
  • Indexes.
  • Buffering and caching.
  • Query processing and optimization.
• Providing backup and recovery
• Providing multiple user interfaces

Database Approach – Advantages (2)
• Representing complex relationships among data
• Enforcing integrity constraints
  • Referential integrity constraint
  • Key or uniqueness constraint
  • Business rules
  • Inherent rules of the data model
• Permitting inferencing and actions using rules
  • Deductive database systems
  • Trigger
  • Stored procedures
• Reduced application development time
• Flexibility
• Availability of up-to-date information
• Economies of scale

History of Database Systems (1)
• First generation: Hierarchical & Network Databases
• Second generation: Relational Databases
  • Providing data abstraction and application flexibility
• Third generation: Object-Relational & Object-Oriented Databases
  • Used in specialized applications: engineering design, multimedia publishing, and manufacturing systems.
• Others:
  • Interchanging data on the Web for e-commerce using XML
  • Extending database capabilities for new applications
    • Extensions to better support specialized requirements for applications
    • Enterprise resource planning (ERP)
    • Customer relationship management (CRM)
  • Information retrieval (IR)
    • Deals with books, manuscripts, and various forms of library-based articles
• More details: see [1] - 1.7

History of Database Systems (2)
[Figure]

When Not to Use a DBMS?
• More desirable to use regular files for:
  • Simple, well-defined database applications not expected to change at all.
  • Stringent, real-time requirements that may not be met because of DBMS overhead.
  • Embedded systems with limited storage capacity.
  • No multiple-user access to data.
• More details: see [1] - 1.8

Objectives
Objectives of the three-schema architecture:
• All users should be able to access the same data.
• Users should not need to know physical database storage details.
• The DBA should be able to change database storage structures without affecting the users’ views.
• The internal structure of the database should be unaffected by changes to physical aspects of storage.
• The DBA should be able to change the conceptual structure of the database without affecting all users.

Three-Schema Architecture (1)
[Figure: three-schema architecture. End users work with several views at the external level; the external/conceptual mapping connects the views to the conceptual schema at the conceptual level; the conceptual/internal mapping connects the conceptual schema to the internal schema at the internal level, which describes the physical data organization of the stored database.]

Three-Schema Architecture (2)
• External Level
  • Users’ view of the database.
  • Describes the part of the database that a particular user group is interested in.
• Conceptual Level
  • Describes the structure of the whole database for a community of users.
  • Describes what data is stored in the database and the relationships among the data.
• Internal Level
  • Physical representation of the database on the computer.
  • Describes the physical storage structure of the database (how the data is stored in the database).

Three-Schema Architecture (3)
[Figure]

Data Independence (1)
• Data independence is the capacity to change the schema at one level of a database system without having to change the schema at the next higher level.
• Logical Data Independence
  • Refers to immunity of external schemas to changes in the conceptual schema.
  • Conceptual schema changes (e.g. addition/removal of entities) should not require changes to external schemas or rewrites of application programs.
• Physical Data Independence
  • Refers to immunity of the conceptual schema to changes in the internal schema.
  • Internal schema changes (e.g. using different file organizations, storage structures/devices) should not require changes to conceptual or external schemas.

Data Independence (2)
[Figure]

Database Languages (1)
• Data Definition Language (DDL) allows the DBA or user to describe and name entities, attributes, and relationships required for the application plus any associated integrity and security constraints.
  • In most DBMSs, the DDL is used to define both conceptual and external schemas.
• Data Manipulation Language (DML) provides basic data manipulation operations (retrieval, insertion, deletion, modification).
• Data Control Language (DCL) defines activities that are not in the categories of those for the DDL and DML, such as granting privileges to users, and defining when proposed changes to a database should be irrevocably made.

Database Languages (2)
• Data Manipulation Language (DML)
  • Procedural DML: allows the user to tell the system exactly how to manipulate data (e.g., network and hierarchical DMLs).
  • Non-procedural DML (declarative language): allows the user to state what data is needed rather than how it is to be retrieved (e.g., SQL, QBE).
• Fourth Generation Languages (4GLs)
  • Non-procedural languages: SQL, QBE, etc.
  • Application generators, report generators, etc.
• See more in [1] - 2.3 for:
  • Storage definition language (SDL).
  • View definition language (VDL).
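As a rough illustration of DDL, DML, and the two kinds of data independence, the sketch below uses Python's built-in sqlite3 module. The employee table, the employee_directory view, the index name, and the sample rows are assumptions made for this example. A view plays the role of an external schema, the base table stands in for the conceptual schema, and an index stands in for an internal-level storage choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the base table (conceptual level) and a view (external level).
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary INTEGER)")
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

# DML: insert rows into the base table.
conn.executemany(
    "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
    [(1, "An", 900), (2, "Binh", 1200)])
conn.commit()


def directory(conn):
    """What an application written against the external view sees."""
    return conn.execute(
        "SELECT id, name FROM employee_directory ORDER BY id").fetchall()


before = directory(conn)

# Conceptual schema change: add an attribute. The view definition and the
# application query stay untouched (logical data independence).
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")

# Internal schema change: add an access path. Neither the table definition
# nor the view changes (physical data independence).
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

after = directory(conn)
print(before, after)
```

SQLite has no GRANT statement, so privilege-related DCL is not shown; the conn.commit() call is the closest analogue here to deciding when changes become permanent.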
Data Models (1)
• Data Model: an integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organization.
  • Provides means to achieve data abstraction.
• Basic operations
  • Specify retrievals and updates on the database.
• Dynamic aspect or behavior of a database application
  • Allows the database designer to specify a set of valid operations allowed on database objects.

Data Models (2)
• Categories of data models include:
  • Object-based (Conceptual/High-level)
    • Close to the way many users perceive data.
    • E.g.: Entity-Relationship model, Object-Oriented model
  • Record-based (Representational)
    • Easily understood by end users.
    • Also similar to how data is organized in computer storage.
    • E.g.: Relational, Network, Hierarchical
  • Physical (Low-level)
    • Used to describe data at the internal level.
    • Describes how data is stored as files in the computer.
    • E.g.: Access path, Index
• Object-based and record-based models describe data at the conceptual & external levels.

Database Schemas (1)
• Database Schema: the description of a database, which is specified during database design and is not expected to change frequently.
  • Defining a new database means specifying its database schema to the DBMS.
  • Schema Diagram: displays selected aspects of a schema.
• Database State (Snapshot): the data in the database at a particular moment in time.
  • Initial state: the database is populated or loaded with the initial data.
  • Valid state: satisfies the structure and constraints specified in the schema.
• Schema Evolution
  • Changes applied to the schema as application requirements change.
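The difference between a database schema and a database state can be sketched with Python's built-in sqlite3 module. The student table, its CHECK constraint, and the sample rows are assumptions for illustration only. The schema is declared once, the state changes as data is loaded, and an update that would produce an invalid state is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: structure plus constraints, specified once at design time.
conn.execute("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        class INTEGER NOT NULL CHECK (class BETWEEN 1 AND 5)
    )
""")


def schema(conn):
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def state(conn):
    """Snapshot of the data at this moment."""
    return conn.execute("SELECT * FROM student ORDER BY student_number").fetchall()


schema_before = schema(conn)
empty_state = state(conn)            # defined but not yet populated

conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)", [(17, "Smith", 1), (8, "Brown", 2)])
conn.commit()
initial_state = state(conn)          # populated with the initial data

try:
    # This would lead to an invalid state, so the DBMS refuses it.
    conn.execute("UPDATE student SET class = 9 WHERE name = 'Smith'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()

print(schema_before, empty_state, initial_state, rejected)
```

Throughout the run the schema stays the same while the state moves from empty to loaded; schema evolution would correspond to a later ALTER TABLE.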
Database Schemas (2)
[Figure]

Classification of DBMS
• Data model
  • Relational
  • Object
  • Hierarchical & network
  • Native XML DBMS
• Number of users
  • Single-user
  • Multiuser
• Number of sites
  • Centralized
  • Distributed
    • Homogeneous
    • Heterogeneous
• Cost
  • Open source
  • Different types of licensing
• Types of access path options
• General or special-purpose

Data Management Systems Framework (1)
• Where are we?
• Application Layer
  • Visualization, Collaborative Computing, Mobile Computing, Knowledge-based Systems
• Data Management Layer
  • Layer 3: information extraction & sharing
    • Data Warehousing, Data Mining, Internet DBs, Collaborative, P2P & Grid Data Management
  • Layer 2: interoperability & migration
    • Heterogeneous DB Systems, Client/Server DBs, Multimedia DB Systems, Migrating Legacy DBs
  • Layer 1: DB technologies
    • DB Systems, Distributed DB Systems
• Supporting Layer
  • Networking, Mass Storage, Agents, Grid Computing Infrastructure, Parallel & Distributed Processing, Distributed Object Management

Data Management Systems Framework (2)
• Extending database capabilities for new applications:
  • Example applications: storage and retrieval of images, videos, data mining (large amounts of data need to be stored and analyzed), spatial databases, time series applications.
  • More complex data structures than relational representation.
  • New data types beyond the basic numeric and character string types.
  • New operations and query languages for new data types.
  • New storage and retrieval methods.
  • New security mechanisms.

Read more:
• The Database System Environment: [1] - 2.4
• Centralized and Client/Server Architectures for DBMSs: [1] - 2.5

Q & A
