Oracle Data Warehouse Management Secrets of Oracle Data Warehousing

Notice . ii Publication Information . iii Table Of Contents iv Introduction . 1 Hour 1: . 2 Conceptual Overview 2 Objectives: . 2 Data Systems Architectures . 2 Data Warehouse Concepts . 7 Objectives: . 7 Data Warehouse Terminology . 8 Data Warehouse Storage Structures 10 Data Warehouse Aggregate Operations . 11 Data Warehouse Structure . 11 Objectives: . 11 Schema Structures For Data Warehousing 11 Oracle and Data Warehousing . 15 Hour 2: . 15 Oracle7 Features 15 Objectives: . 15 Oracle7 Data Warehouse related Features . 15 Oracle8 Features 19 Objectives: . 19 Partitioned Tables and Indexes 20 Oracle8 Enhanced Parallel DML . 22 Oracle8 Enhanced Optimizer Features 24 Oracle8 Enhanced Index Structures . 25 Oracle8 Enhanced Internals Features 25 Backup and Recovery Using RMAN . 26 COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE V Data Warehousing 201 27 Hour 1: . 27 Oracle8i Features . 27 Objectives: . 27 Oracle8i SQL Enhancements for Data Warehouses 27 Oracle8i Data Warehouse Table Options 31 Oracle8i and Tuning of Data Warehouses using Small Test Databases 36 Procedures in DBMS_STATS . 38 Stabilizing Execution Plans in a Data Warehouse in Oracle8i 62 Oracle8i Materialized Views, Summaries and Data Warehousing 68 The DBMS_SUMMARY Package in Oracle8i . 74 DIMENSION Objects in Oracle8i . 81 Managing CPU Utilization for Data Warehouses in Oracle8i . 84 Restricting Access by Rows in an Oracle8i Data Warehouse 103 DBMS_RLS Package 108 Hour 2: . 112 Data Warehouse Loading 112 IMPORT-EXPORT . 115 Data Warehouse Tools . 118 An Overview of Oracle Express Server . 118 An Overview of Oracle Discoverer . 120 Summary 121

pdf13 trang | Chia sẻ: tlsuongmuoi | Lượt xem: 1921 | Lượt tải: 0download
Bạn đang xem nội dung tài liệu Oracle Data Warehouse Management Secrets of Oracle Data Warehousing, để tải tài liệu về máy bạn click vào nút DOWNLOAD ở trên
Rampant TechPress Oracle Data Warehouse Management Secrets of Oracle Data Warehousing Mike Ault ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE II Notice While the author & Rampant TechPress makes every effort to ensure the information presented in this white paper is accurate and without error, Rampant TechPress, its authors and its affiliates takes no responsibility for the use of the information, tips, techniques or technologies contained in this white paper. The user of this white paper is solely responsible for the consequences of the utilization of the information, tips, techniques or technologies reported herein. COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE III Oracle Data Warehouse Management Secrets of Oracle Data Warehousing By Mike Ault Copyright © 2003 by Rampant TechPress. All rights reserved. Published by Rampant TechPress, Kittrell, North Carolina, USA Series Editor: Don Burleson Production Editor: Teri Wade Cover Design: Bryan Hoff Oracle, Oracle7, Oracle8, Oracle8i, and Oracle9i are trademarks of Oracle Corporation. Oracle In-Focus is a registered Trademark of Rampant TechPress. Many of the designations used by computer vendors to distinguish their products are claimed as Trademarks. All names known to Rampant TechPress to be trademark names appear in this text as initial caps. The information provided by the authors of this work is believed to be accurate and reliable, but because of the possibility of human error by our authors and staff, Rampant TechPress cannot guarantee the accuracy or completeness of any information included in this work and is not responsible for any errors, omissions, or inaccurate results obtained from the use of information or scripts in this work. Visit www.rampant.cc for information on other Oracle In-Focus books. ISBN: 0-9740716-4-1 COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE IV Table Of Contents Notice............................................................................................. ii Publication Information............................................................... iii Table Of Contents ........................................................................ iv Introduction ................................................................................... 1 Hour 1: ........................................................................................... 2 Conceptual Overview .......................................................................... 2 Objectives:........................................................................................................... 2 Data Systems Architectures................................................................................. 2 Data Warehouse Concepts................................................................... 7 Objectives:........................................................................................................... 7 Data Warehouse Terminology............................................................................. 8 Data Warehouse Storage Structures .................................................................. 10 Data Warehouse Aggregate Operations............................................................. 11 Data Warehouse Structure ................................................................. 11 Objectives:......................................................................................................... 11 Schema Structures For Data Warehousing ........................................................ 11 Oracle and Data Warehousing ................................................... 15 Hour 2: ......................................................................................... 15 Oracle7 Features ................................................................................ 15 Objectives:......................................................................................................... 15 Oracle7 Data Warehouse related Features......................................................... 15 Oracle8 Features ................................................................................ 19 Objectives:......................................................................................................... 19 Partitioned Tables and Indexes.......................................................................... 20 Oracle8 Enhanced Parallel DML....................................................................... 22 Oracle8 Enhanced Optimizer Features .............................................................. 24 Oracle8 Enhanced Index Structures................................................................... 25 Oracle8 Enhanced Internals Features ................................................................ 25 Backup and Recovery Using RMAN................................................................. 26 COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE V Data Warehousing 201................................................................ 27 Hour 1: ......................................................................................... 27 Oracle8i Features ............................................................................... 27 Objectives:......................................................................................................... 27 Oracle8i SQL Enhancements for Data Warehouses .......................................... 27 Oracle8i Data Warehouse Table Options .......................................................... 31 Oracle8i and Tuning of Data Warehouses using Small Test Databases ............ 36 Procedures in DBMS_STATS........................................................................... 38 Stabilizing Execution Plans in a Data Warehouse in Oracle8i .......................... 62 Oracle8i Materialized Views, Summaries and Data Warehousing.................... 68 The DBMS_SUMMARY Package in Oracle8i ................................................. 74 DIMENSION Objects in Oracle8i..................................................................... 81 Managing CPU Utilization for Data Warehouses in Oracle8i ........................... 84 Restricting Access by Rows in an Oracle8i Data Warehouse.......................... 103 DBMS_RLS Package ...................................................................................... 108 Hour 2: ....................................................................................... 112 Data Warehouse Loading ................................................................ 112 IMPORT-EXPORT ......................................................................... 115 Data Warehouse Tools..................................................................... 118 An Overview of Oracle Express Server........................................................... 118 An Overview of Oracle Discoverer ................................................................. 120 Summary.......................................................................................... 121 COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE VI COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE 1 Introduction I am Michael R. Ault, a Senior Technical Management Consultant with TUSC, an Oracle training, consulting and remote monitoring firm. I have been using Oracle since 1990 and had several years of IT experience prior to that going back to 1979. During the 20 odd years I have been knocking around in the computer field I have seen numerous things come and go. Some were good such as the PC and all it has brought to the numerous languages which have come, flared briefly and then gone out. Data warehousing is a concept that really isn't new. The techniques we will discuss today have their roots back in the colossal mainframe systems that were the start of the computer revolution in business. The mainframes represented a vast pool of data, with historical data provided in massive tape libraries that could be tape searched if one had the time and resources. Recent innovations in CPU and storage technologies have made doing tape searches a thing (thankfully) of the past. Now we have storage that can be as large as we need, from megabytes to terabytes and soon, petabytes. Not to mention processing speed. It wasn't long ago when a 22 mghz system was considered state-of-the-art, now unless you are talking multi-CPU each at over 400 mghz you might as well not even enter into the conversation. The systems we used to think where massive with a megabyte of RAM now have gigabytes of memory. This combination of large amounts of RAM, high processor speed and vast storage arrays has led to the modern data warehouse where we can concentrate on designing a properly architected data structure and not worry what device we are going to store it on. This set of lessons on data warehousing architecture and Oracle is designed to get you up to speed on data warehousing topics and how they relate to Oracle. Initially we will cover generalized data warehousing topics and then Oracle features prior to Oracle8i. A majority of time will be spent on Oracle8 and Oracle8i features as they apply to data warehousing. COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE 2 Hour 1: Conceptual Overview Objectives: The objectives of this section on data warehouse concepts are to: 1. Provide the student with a grounding in data systems architectures 2. Discuss generic tuning issues associated with the various data systems architectures. Data Systems Architectures Using the proper architecture can make or break a data warehouse project. OLTP Description and Use OLTP Stands for On-Line Transaction Processing. In an OLTP system the transaction size is generally small affecting single or few rows at a time. OLTP systems generally have large numbers of users that are generally not skilled in query usage and access the system through an application interface. Generally OLTP systems are designed as normalized where every column in a tuple is related to the unique identifier and only the unique identifier. OLTP systems use the primary-secondary key relationship to relate entities (tables) to each other. OLTP systems are usually created for a specific use such as order processing, ticket tracking, or personnel file systems. Sometimes multiple related functions a re performed in a single unified OLTP structure such as with Oracle Financials. OLTP Tuning OLTP tuning is usually based around a few key transactions. Small range queries or single item queries are the norm and tuning is to speed retrieval of single rows. The major tuning methods consist of indexing at the database level COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE 3 and using pre-tuned queries at the application level. Disk sorts are minimized and shared code is maximized. In many cases closely related tables may be merged (denormalized) for performance reasons. A fully normalized database usually doesn't perform as well as a slightly de- normalized system. Usually if tables are constantly accessed together they are denormalized into a single table. While denormalization may require careful application construction to avoid insert/update/delete anomalies, usually the performance gain is worth the effort. OLAP Description and Use An OLAP database, which is an On-line Analytical Processing database, is used to perform data analysis. An OLAP database is based on dimensions a dimension is a single detail record about a data item. For example, a product can have a quantity, a price, a time of sale and a place sold. These four items are the dimensions of the item product in this example. Where the dimensions of an object intersect is a single data item, for example, the sales of all apples in Atlanta Georgia for the month of May, 1999 at a price greater than 59 cents a pound. One problem with OLAP databases is that the cubes formed by the relations between items and their dimensions can be sparse, that is, not all intersections contain data. This can lead to performance problems. There are two versions of OLAP at last count, MOLAP and ROLAP. MOLAP stands for Multidimensional OLAP and ROLAP stands for Relational OLAP. The problem with MOLAP is that there is a physical limit on the size of data cube which can be easily specified. ROLAP allows the structure to be extended almost to infinity (petabytes in Oracle8i). In addition to the space issues a MOLAP uses mathematical processes to load the data cube, which can be quite time intensive. The time to load a MOLAP varies with the amount of data and number of dimensions. In the situation where a data set can be broken into small pieces a MOLAP database can perform quite well, but the larger and more complex the data set, the poorer the performance. MOLAPs are generally restricted to just a few types of aggregation. In a ROLAP the same performance limits that apply to a large OLTP come into play. ROLAP is a good choice for large data sets with complex relations. Data loads in a ROLAP can be done in parallel so they can be done quickly in comparison to a MOLAP which performs the same function. Some applications, such as Oracle Express use a combination of ROLAP and MOLAP. The primary purpose of OLAP architecture is to allow analysis of data whether comes from OLTP, DSS or Data warehouse sources. COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE 4 OLAP Tuning OLAP tuning involves pre-building the most used aggregations and then tuning for large sorts (combination of disk and memory sorts) as well as spreading data across as many physical drives as possible so you get as many disk heads searching data as is possible. Oracle parallel query technology is key to obtaining the best performance from an OLAP database. Most OLAP queries will be ad-hoc in nature, this makes tuning problematic in that shared code use is minimized and indexing may be difficult to optimize. DSS Description and Use In a DSS system (Decision Support System) the process of normalization is abandoned. The reason normalization is abandoned in a DSS system is that data is loaded and not updated. The major problem with non-normalized data is maintaining data consistency throughout the data model. An example would be a person's name that is stored in 4 places, you have to update all storage locations or the database soon becomes unusable. DSS systems are LOUM systems (Load Once – Use Many) any refresh of data is usually global in nature or is done incrementally a full record set at a time. The benefits of an DSS database is that a single retrieval operation brings back all data about an item. This allows rapid retrieval and reporting of records, as long as the design is identical to what the user wants to see. Usually DSS systems are used for specific reporting or analysis needs such as sales rollup reporting. The key success factor in a DSS is its ability to provide the data needed by its users, if the data record denormalization isn't right the users won't get the data they desire. A DSS system is never complete, users data requirements are always evolving over time. DSS Tuning Generally speaking DSS systems require tuning to allow for full table scans and range scans. The DSS system is not generally used to slice and dice data (that is the OLAP databases strength) but only for bulk rollup such as in a datamart situation. DSS systems are usually refreshed in their entirety or via bulk loads of data that correlate to specific time periods (daily, weekly, monthly, by the quarter, etc.). Indexing will usually be by dates or types of data. Data in a DSS system is generally summarized over a specific period for a specific area of a company such as monthly by division. This partitioning of data by discrete time and geographic locale leads to the ability to make full use of partition by range provided by Oracle8 as a tuning method. COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED. ROBO BOOKS MONOGRAPH DATA WAREHOUSING AND ORACLE8I PAGE 5 DWH A DWH, data warehouse, database is usually summarized operational data that spans the entire enterprise. The data is loaded through a clean-up and aggregation process on a predetermined interval such as daily, monthly or quarterly. One of the key concepts in data warehousing is the concept that the data is stored along a timeline. A data warehouse must support the needs of a large variety of users. A DWH may have to contain summarized, as well as atomic data. A DWH may combine the concepts of OLTP, OLAP and DSS into one physical data structure. The major operation in a DWH is usually reporting with a low to medium level of analytical processing. A data warehouse contains detailed, nonvolatile, time-based information. Usually data marts are derived from data warehouses. A data warehouse design should be straight forward since many users will query the data warehouse directly (however, only 10% of the queries in a DWH are usually ad-hoc in nature with 90% being canned query or reports). Data warehouse design and creation is an interative process, it is never "done". The user community must be intimately involved in the data warehouse from design through implementation or else it will fail. Generally data warehouses are denormalized structures. A normalized database stores the greatest amount of data in the smallest amount of space, in a data warehouse we sacrifice storage space for speed through denormalization. A dyed in the wool OLTP designer may have difficulty in crossing over to the dark side of data warehousing design. Many of the time-honored concepts are bent or completely broken when designing a data warehouse. In fact, it may be impossible for a great OLTP designer to design a great DWH! Many object- related concepts can be brought to bear on a DWH design so you may find a source for DWH designers in a pool of OO developers. DWH Tuning DHW tuning is a complex topic. The database must be designed with a DWH multi-functional profile in mind. Tuning must be for OLTP type queries as well as bulk reporting and bulk loading operations being performed as well. Usually these tuning requirements require two or more different set of initialization parameters. One set of initialization parameters may be optimized for OLTP type operations and be used when the database is in use during normal work hours, then the database is shutdown and a new set is used for the nightly batch reporting and loading operations. COPYRIGHT © 2003 RAMPANT TECHPRESS. ALL RIGHTS RESERVED.

Các file đính kèm theo tài liệu này:

  • pdframpant_techpress_oracle_data_warehouse_management00001_407.pdf
Tài liệu liên quan