一本面向 Oracle 运行的好书《Oracle 24x7 Tips and Techniques》(Venkat S. Devraj, Osborne/McGraw-Hill, November 1

来源:百度文库 编辑:神马文学网 时间:2024/04/27 16:52:02

这本99年的书,作者根据自己咨询经验,总结提炼了很多 Tips(窍门),如果你生活在高可用运行环境中,你会碰到那些作者提到的问题,作者不保守,乐于分享知识

图书英文目录 (Table of contents)

Part I: Introduction.  

Chapter 1. Identifying our Uptime Requirements.

    Understanding What 24X7 Means for Your Organization

    Working Toward an "Ideal" 24X7 System

    Tips for Semi-Technical Managers and Supervisors

    Summary

Chapter 2. Understanding and Dealing with Emergencies.

    What is an Emergency?

    Summary

Part II: Understanding Your Environment. 

Chapter 3. Hardware Configuration. 

    Hardware Configuration

        Know real well what's inside the box

        choose your disk-array size with caution

        Do Not Use Read-Ahead Caches for OLTP Applications

        Do Not Rely on Write Caches to Eliminate I/O Hotspots

        Use multi-level RAIDs

        Ensure Your Stripe-Sizes in line with Your OS and Database Block-Sizes

        Ensure that your disk and tape I/O sizes match

    Architectural Alternatives for System Redundancy and Performance

    Summary

Chapter 4. Operating Systems. 

    Oracle on Unix Versus Oracle on Windows NT

    Kernel and Block Size

        Learn to customize your kernel

        Know your OS logical and physical block sizes

    Raw Devices

        ......

        Pre-create abundant raw partitions

        Choose multiple standard sizes for raw partitions

        Do not use cylinder 0 of the disk for creating raw partitions

        Create symbolic links for all raw devices

        Be aware of other file system options

    System Capacity and Bottlenecks

        Proactively ensure that system capacity is not reached at any time

        Distribute all resource-intensive applications among multiple servers

        Set up limits on CPU and memory use for non-database servers

        Discourage non-production jobs during peak production hours

        Avoid resource-intensive commands that compete with Oracle processes

        Do not set priority levels for Oracle-related processes

        Do not use processor affinity on the database server

        Check frequently for memory leaks

        Set swap space between two and four times larger than  physical memory

        Distribute swap area across your fast disks

        Know whether your OS can address more than 2GB of RAM

        If possible, lock your shared-memory area in physical memory

        Understand logical and physical drives mapping

        Enable file system journaling on all production boxes

        Check the availability of free disk space on a regular basis

        Keep your file systems and directories slim

        Maintain a spare root file system

        If available, turn on large file support

        Look at important OS logs at regular intervals

        Invest in automated tools to monitor system bottlenecks

Chapter 5. Networks. 

    Managing your network

        ......

        Ensure that your network is not being overloaded

        Configure your network to use sub-networking effectively

    Customizing SQL*Net and Net8 for Your Network Configuration

        Increase the network queue size

        Disable the NAGLE algorithm

        Set the SQL*Net or Net8 packet size to match the protocol MTU

        Deploy homogeneous servers throughout your enterprise

        Tips for Semi-Technical Managers and Supervisors

Chapter 6. Applications and Data.

    Applications

        ......

        Make application code independent of future schemetic changes

        Evaluate using transaction splitting to clone data

        Evaluate using pro*C or OCI for all mission-critical applications

        Implement failover capabilities within all mission-critical applications

        Be familiar with various automatic failover options

        Have MAXEXTENTS unlimited for all volatile application segments

        Expedite your data loads (加速数据装载过程)

        Version control your application source code

        Manage your indexes efficiently

        Use threads,rather than processes,in your applications

        Use shared libraries in your applications

        Invest time in physical database design

        Some quick application-realted tips

    Data

        Take time to understand the schema model

        Categorize your data

Part III: Database Setup and Configuration. 

Chapter 7. Installing, Configuring and Customizing our Database Environment. 

    Server Configuration

        Follow the OFA standard

        Always use a config.ora file

        Use the crdb_SID.sql and crdb2_SID.sql creation scripts

        Ensure sufficient disk space for at least two weeks' worth of trace files and alert-log

        Consider OS block size and application characteristics when setting the database block size

        Ensure that contiguous(相邻) blocks per write are set high

        For a large number of data-files in Oracle7.x, enable CKPT

       Use LOG_CHECKPOINT_TIMEOUT and/or LOG_CHECKPOINT_INTERVAL toensure instance-recovery times specified in your SLA are met

        Create more than three mirrored online redo-log groups for high-actvity databases

        Place redo-logs such that ARCH and LGWR do not contend with each other

        Configure adequate redo latches to minimize contention

        Ensure that dbwr can keep up with database loads

        Partition your buffer cache into multiple buffer pools according to segment usage patterns

        Use process slaves to alleviate I/O bottlenecks during high database activity

        Enable vector posting

        Set up dedicated temporary tablespaces

        Set up sort area efficiently

        Use direct writes for sorts

        Use profiles to prevent runaway processes from hogging all resources

        Evaluate using vectorized reads for cooked file systems

        Enable direct I/O when using cooked file systems

        Use intimate shared memory(ISM)for databases with high concurrent user access

        Use post-wait drivers for high IPC thrughout

        Pre-page and "lock" the SGA in main memory

        Always install SQL*Plus help

    Other Self-Explanatory(Well,Almost)Server Configuration Tips

    SQL*Net/Net8 Configuration

        Use Oracle names when dealing with a large number of clients

        Use out-of-band breaks with SQL*Net/Net8 or set poll frequency high when out-of-band breaks are unavailable        

        Keep dead-connection detection to a minimum

        Manage large user populations effectively via SQL*Net and Net8 features

        Pre-spawn dedicated server processes

Chapter 8. Database Upgrades, Downgrades, Reorganizations, and Migrations. 

    Upgrades

    Downgrades

    Reorganizations

    Migrations

    When to Upgrade

        Practice the migration to a test environment

        Analyze each step and construct strategies to expedite it 

        Gotchas to watch for while upgrading from Oracle7 to Oracle8/8i

        Quick tips to enhance load/unload performance

    Real-Life Migration Case Study

Chapter 9. Backup, Recover, and Archiving Policies and Procedures. 

    Backups

        Select at least two backup methods after adequate analysis

        Perform hot backups during period of low DML activity

        Do not place all tablespaces in hot backup mode simultaneously

        If possible, back up to disk first and then to tape

        Do not back up online redo-logs during OS hot backups

        Evaluate the use of tripe mirrors

        Make sure that hot backup commands are executed synchronously

        Check the backup logs regularly

        Avoid performing backups manually

        Ensure that security is not compromised when using backup tools

        With RMAN, periodically resynchronize the recovery catalog

        Ensure that your backup strategy accounts for all Oracle file-types

        Back up the control-files immediately after all database changes

        Address export-related comcerns before you start performing exports

        Form "export-groups" for inter-segment consistency

        Plan sufficient space for export dumps

        Perform exports using direct path, rather than conventional path

        Ensure compatibility and usability of all backed-up export-dumps

        Perform a backup immediately after any system clock changes

        Check for database corruption during and after backups

        Set optimal I/O size during backups

        Make use of incremental backups available in Oracle8

    Archiving

        Evaluate whether ARCHIVELOG is necessary for your environment

        Implement antomatic archival as the primary archival mode

        Allocate adequate space for ARCHIVE_LOG_DEST

        ........

        Take a complete backup as soon as ARCHIVELOG mode is enabled

        Use ARCHIVE LOG CURRENT rather than a variation

        Do not create holes in the archived log sequence

        If possible, keep dual copies of archived logs on disk

        Create more online redo-log groups when archiving to offline media

        Be aware of special archiving considerations, when using OPS

    Recovery

        Understand the need for recovery and take proactive steps to avoid failure

        Understand factors affecting recovery time

        Ensure that recovery timings are consistent with your SLA MTTR specifications

        Consider alternative backup/recovery approaches for point-in-time recovery

        Ensure that checkpoints occur at sufficient intervals and redo-logs are sized appropriately to prevent recovery delays

        Always consider the lowest unit of recovery

        Always examine v$datafile prior to recovery

        Maintain a special init.ora file for recovery operations

        Always use RESETLOGS as the last resort

        Evaluate all UNRECOVERABLE and NOLOGGING operations

        Catergorize all segments and tablespaces for recovery

        Keep tablespaces without write activity READONLY to expedite backup and recovery

        Set AUTORECOVERY on during recovery

    Tips for semi-technical managers and supervisors

        .......

        Document all database structural changes

Chapter 10. Startup and Shutdown Procedures.

    Startup and Shutdown Procedures

        Always Automate Database Startup and Shutdown

        Use OEM for Starting UP/Shutting Down Multiple Databases

        For Urgent Database Bounce Requirements, Consider a "Shutdown Abort"

        Just Prior to a SHUTDOWN ABORT do an ALTER SYSTEM CHECKPOINT

Part IV: Database Maintenance.

Chapter 11. General Maintenance. 

    Proactive and Reactive Maintenance

        Implement robust security at all possible levels

        Understand the "ORA-1555: Snapshot too old" error and take steps to avoid it

        Understand, prevent, and resolve latch and enqueue contention

        Be familiar with different wait events and take steps to reduce them

        Periodically watch for and resolve locking conflicts

        Monitor shared pool effectiveness and tune, if necessary

    Other Critical Areas That Your General Maintenance Routines Need to Address

Chapter 12. Space and Growth Management. 

    Understanding and managing space and growth

        Understand the business drivers influencing segment growth

        Understand the units of space consumption

        Understand the various segments that comprise your database

        Be familiar with the errors and effects that result from inadequate space management

        Enable the AUTOEXTEND feature for all application tablespaces and SYSTEM

        Use standardized file-sizes

        Accommodate the data-file header block in your standardized file-sizes

        Use standardized storage clauses for each segment-type to reduce fragmentation and use space more optimally

        Set INITIAL and NEXT keeping DB_BLOCK_SIZE in mind

        Set PCTINCREASE optimally to reduce fragmentation

        Understand FREELISTS and set them optimally to reduce latch contention

        Periodically coalesce your application tablespaces

        Understand the effect of an erroneous high-water mark

        Periodically deallocate unused blocks

        Avoid dropping data-files explicitly

        Learn the "tablespace shuffle" for emergency unanticipated space requirements

        In a DSS environment, do not blindly accept unrestricted segment growth

Part V: Troubleshooting.

Chapter 13. Interpreting the Alert-Log and Trace-Files. 

    The Alert-Log

        Whenever possible, continually tail the alert-log

        Be familiar with events/messages in the alert-log

        Edit repetitive error messages to be listed just once

        Archive and trim the alert-log once every two weeks to three months

    Trace-Files

        Know what to look for in a trace-file

        Know how to identify trace-file belonging to a specific event/session

        Know what causes trace-files to be generated

    Tracing

    SQL*Net and Net8 Tracing

    PL/SQL Tracing

    ODBC Tracing

    Core Dumps

    Oracle Debugging Tools

        Whenever necessary, make your trace-files readable by developers - 让开发人员拥有读到跟踪文件的权限

Chapter 14. Identifying and Repairing Database Corruption.

    Understanding, Preventing, and Repairing Corruption

        Understand corruption and the ways it can manifest itself

        Proactively check for and detect media corruption

        Understand what you can do to prevent media corruption

        Be familiar with techniques to deal with memory corruption

        Be familiar with techniques to deal with logical corruption

        Identify and list all components susceptible to corruption

        Identify and list measures to prevent, detect and repair corruption in your environment

        Retain relevant information regarding the corruption

        Train your administrators to handle corruption

Part VI: High Availability Solutions.

Chapter 15. Standby Options. 

    Standby Options from Oracle

        Consider using a standby database for disaster recovery

        Consider using standby instances in environments where true 24X7 uptime is not mandatory

        Consider combining multiple solution to form a comprehensive one - 多方案组合

    Strategies for availability During Routine Maintenance and Emergencies  - 例行维护和紧急情况下的可用性策略

        Evaluate using a Custom Standby Database(CSD)

        Consider using standby tables to cater to segment-level failures

        Be aware of caveats with the custom solutions desccribed here

Chapter 16. Oracle Parallel Server. 

    Understanding and managing an OPS environment

        Know what OPS is and how it enhences availability

        Understand, detect, and eliminate false pinging

        Partitioning applications prior to deploying them on OPS

Chapter 17. Advanced Replication. 

    Understanding AR

        Be familiar with AR operations and basic functioning

        Understand your options for highly available AR implementations

        Be aware of possible AR architectural alternatives

Chapter 18. Third-Party HA Solutions.

    Hardware/OS Based

        Consider EMC SRDF for disaster-recovery purposes

        Evaluate using EMC TIMEFINDER for higher return on investment

    Database-Based

        Consider Quest Software's Shareplex for Oracle for higher standby flexibility and concurrent reporting access

Part VII: Building A Real-World Arsenal.

Chapter 19. The 24x7 Toolkit.

    Using Scripts to Be Alerted to Disruptive Problems

        Monitor for specific failures by running poll-scripts periodically

        Be familiar with techniques for application failover when calling PL/SQL

Part VIII: New Features.

Chapter 20. New HA Features in Oracle8i. 

    Maintenance-Oriented Features

        Consider transportable tablespaces for quick movement of data subsets

        Consider locally managed tablespaces to prevent fragmentation

        Be familiar with new change management features

        Create, rebuild and defragment indexes online

        Be familiar with non-partitioned table reorganization features

        Drop columns, when necessary, with greater ease

    Troubleshooting Features

        Be familiar with new techniques to repair corruption

    Backup and Recovery-Oriented Features

        Utilize methods to facilitate and expedite RMAN backups

        Control and expedite recovery time

    Miscellaneous HA Features

        Use newer listener failover and load-balancing services

        Be familiar with LogMiner functionality

        Be aware of an option to eliminate "fat" OS overhead

A: References.

Index