[Loney05] Chapter 9. Using STATSPACK


You can use the STATSPACK utility to monitor the performance of your database. In this chapter, you will see how to install STATSPACK, how to manage it, and how to run and interpret the reports generated.

As of Oracle 10g, you can use the Automatic Workload Repository to gather and analyze statistics. The STATSPACK utility provides an ad hoc capability to analyze database statistics in a similar fashion—by taking snapshots of the database statistics at different times and generating reports based on the differences.


9.1. Installing STATSPACK

STATSPACK must be installed in every database to be monitored. The installation script, named spcreate.sql, is found in the /rdbms/admin subdirectory under the Oracle software home directory. The spcreate.sql script creates a user named PERFSTAT and creates a number of objects under that schema.

NOTE

You should allocate at least 100MB for the initial creation of the PERFSTAT schema's objects.

To start the spcreate.sql script, change your directory to the ORACLE_HOME/rdbms/admin directory and log into SQL*Plus using an account with SYSDBA privileges:

SQL> connect system/manager as SYSDBA
SQL> @spcreate

During the installation process, you will be prompted for a password for the PERFSTAT user and a default tablespace for the PERFSTAT user (a list of available tablespaces will be displayed along with this prompt). You will also be asked to specify a temporary tablespace for the user. Once you have provided the default and temporary tablespaces, the PERFSTAT account will be created, and the installation script will log in as PERFSTAT and continue to create the required objects. If there is not sufficient space to create the PERFSTAT objects in the specified default tablespace, the script will return an error.

NOTE

Although you start the installation script while logged in as a SYSDBA-privileged user, the conclusion of the installation script will leave you logged in as the PERFSTAT user.

If you want to drop the PERFSTAT user at a later date, you can run the spdusr.sql script located in the ORACLE_HOME/rdbms/admin directory.

9.1.1. Security of the PERFSTAT Account

The PERFSTAT account is created with the password you specify during STATSPACK installation. You can change the PERFSTAT account password at any time.

The PERFSTAT account is granted the SELECT_CATALOG_ROLE role and SELECT access on a large number of V$ views, along with several system privileges (CREATE/ALTER SESSION, CREATE TABLE, CREATE/DROP PUBLIC SYNONYM, CREATE SEQUENCE, and CREATE PROCEDURE). Any user who can access your PERFSTAT account can select from all the dictionary views. For example, such a user could query all the database account usernames from DBA_USERS, all the segment owners from DBA_SEGMENTS, and the currently logged-in sessions from V$SESSION. The PERFSTAT account, if left unprotected, provides a security hole that allows intruders to browse through your data dictionary and select targets for further intrusion.

In addition to the privileges it receives during the installation process, the PERFSTAT account will also have any privileges that have been granted to PUBLIC. If you use PUBLIC grants instead of roles for application privileges, you must secure the PERFSTAT account. You can lock database accounts and unlock them as needed; see Chapter 10 for details.
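For example, a simple safeguard is to lock the account except when statistics are being gathered. This is only a sketch; run it from a DBA-privileged session, and note that new_password is a placeholder:

```sql
-- Lock the PERFSTAT account while it is not in use
alter user PERFSTAT account lock;

-- Unlock it (and optionally reset the password) before the next collection
alter user PERFSTAT account unlock;
alter user PERFSTAT identified by new_password;
```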

9.1.2. Post-installation

Once the installation process is complete, the PERFSTAT account will own tables, indexes, a sequence, and a package. You will use the package, named STATSPACK, to manage the statistics-collection process and the data in the tables. The collection tables, whose names all begin with "STATS$", will have column definitions based on the V$ view definitions. For example, the columns in STATS$WAITSTAT are the ones found in V$WAITSTAT, with three identification columns added at the top:

desc stats$waitstat

Name                     Null?    Type
------------------------ -------- ------------
SNAP_ID                  NOT NULL NUMBER(6)
DBID                     NOT NULL NUMBER
INSTANCE_NUMBER          NOT NULL NUMBER
CLASS                    NOT NULL VARCHAR2(18)
WAIT_COUNT                        NUMBER
TIME                              NUMBER

The Class, Wait_Count, and Time columns are based on the Class, Count, and Time columns from V$WAITSTAT. STATSPACK has added the following three identification columns:

SNAP_ID An identification number for the collection. Each collection is called a "snapshot" and is assigned an integer value.

DBID A numeric identifier for the database.

INSTANCE_NUMBER A numeric identifier for the instance, for Real Application Clusters (RAC) installations.

Each collection you perform is given a new Snap_ID value that is consistent across the collection tables. When you run a STATSPACK report, you will see a list of all available snapshots.
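You can also list the snapshots yourself; a sketch, assuming the standard STATS$SNAPSHOT table created by spcreate.sql:

```sql
-- Show each snapshot and the time it was taken
select SNAP_ID, SNAP_TIME
  from PERFSTAT.STATS$SNAPSHOT
 order by SNAP_ID;
```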


9.2. Gathering Statistics

Each collection of statistics is called a snapshot. Snapshots are a point-in-time collection of the statistics available via the V$ views, and they are given a Snap_ID value to identify them. You can generate reports on the changes in the statistics between any two snapshots.

NOTE

As with the UTLBSTAT/UTLESTAT reports, the STATSPACK report will only be valid if the database was not shut down and restarted between the snapshots evaluated.

Between snapshots, STATSPACK performs no processing and adds no burden to your database performance. STATSPACK only impacts performance when you take snapshots and when you generate reports on the changes in statistics between two snapshots.

NOTE

Be sure the TIMED_STATISTICS database initialization parameter is set to TRUE prior to gathering statistics.
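You can verify and, if necessary, change the setting from SQL*Plus; a sketch (TIMED_STATISTICS is dynamically modifiable):

```sql
-- Check the current setting
show parameter timed_statistics

-- Enable timed statistics for the instance
alter system set TIMED_STATISTICS = TRUE;
```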

To generate a snapshot of the statistics, execute the SNAP procedure of the STATSPACK package, as shown in the following listing. You must be logged in as the PERFSTAT user to execute this procedure.

execute STATSPACK.SNAP;

PL/SQL procedure successfully completed.

When the SNAP procedure is executed, Oracle populates your STATS$ tables with the current statistics. You can then query those tables directly, or you can use the standard STATSPACK report (to see the change in statistics between snapshots).

Snapshots should be taken for the following reasons:

  • To evaluate performance during specific tests of the system. For these tests, you can execute the SNAP procedure manually, as shown in the prior example.

  • To evaluate performance changes over a long period of time. To establish a baseline of the system performance, you may generate statistics snapshots on a scheduled basis. For these snapshots, you should schedule the SNAP procedure execution via Oracle's internal DBMS_JOB scheduler or via an operating system scheduler. You can use the spauto.sql script in the $ORACLE_HOME/rdbms/admin directory to schedule snapshots.
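Under the covers, spauto.sql submits a DBMS_JOB entry. A minimal sketch of an hourly schedule, assuming you are connected as PERFSTAT and the JOB_QUEUE_PROCESSES initialization parameter is set to a nonzero value:

```sql
-- Submit a job that takes a snapshot at the top of every hour
variable jobno number;
begin
  DBMS_JOB.SUBMIT(:jobno,
                  'statspack.snap;',
                  trunc(SYSDATE + 1/24, 'HH'),       -- first run: next full hour
                  'trunc(SYSDATE + 1/24, ''HH'')');  -- repeat interval: hourly
  commit;
end;
/
print jobno
```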

For the snapshots related to specific tests, you may wish to increase the collection level, which lets you gather more statistics. As noted in the "Managing the STATSPACK Data" section, later in this chapter, each snapshot has a cost in terms of space usage and query performance. Avoid generating thousands of rows of statistical data with each snapshot unless you plan to use them.

To support differing collection levels, STATSPACK provides the level parameter. By default, level is set to a value of 5. Prior to changing the level value, you should generate several snapshots and evaluate the reports generated. The default level value is adequate for most reports. Alternative level values are listed in the following table:

Level   Description
-----   ----------------------------------------------------------------
0       General performance statistics on all memory areas, latches,
        pools, and events
5       Same statistics from the lower levels, plus high-resource-usage
        SQL statements
6       Same statistics from the lower levels, plus SQL plan and SQL
        plan usage data
7       Same statistics from the lower levels, plus segment-level
        statistics, including logical and physical reads, row lock, and
        buffer-busy waits
10      Same statistics from the lower levels, plus parent/child latch
        data

The greater the collection level, the longer the snapshot will take. The default value (5) offers a significant degree of flexibility during the queries for the most resource-intensive SQL statements. The parameters used for the resource-intensive SQL portion of the snapshot are stored in a table named STATS$STATSPACK_PARAMETER. You can query STATS$STATSPACK_PARAMETER to see the settings for the different thresholds during the process of SQL statement gathering. Its columns include Snap_Level (the snapshot level), Executions_Th (threshold value for the number of executions), Disk_Reads_Th (threshold value for the number of disk reads), and Buffer_Gets_Th (threshold value for the number of buffer gets).
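A sketch of such a query, using the column names from a default installation:

```sql
-- Review the snapshot level and the SQL collection thresholds
select SNAP_LEVEL, EXECUTIONS_TH, DISK_READS_TH,
       PARSE_CALLS_TH, BUFFER_GETS_TH
  from PERFSTAT.STATS$STATSPACK_PARAMETER;
```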

For a level 5 snapshot using the default thresholds, SQL statements are stored if they meet any of the following criteria:

  • The SQL statement has been executed at least 100 times.

  • The number of disk reads performed by the SQL statement exceeds 1000.

  • The number of parse calls performed by the SQL statement exceeds 1000.

  • The number of buffer gets performed by the SQL statement exceeds 10,000.

  • The sharable memory used by the SQL statement exceeds 1MB.

  • The version count for the SQL statement exceeds 20.

When evaluating the snapshot's data and the performance report, keep in mind that the SQL threshold parameter values are cumulative. A very efficient query, if executed enough times, will exceed 10,000 buffer gets. Compare the number of buffer gets and disk reads to the number of executions to determine the activity each time the query is executed.
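One way to make that comparison is directly against V$SQL. The following sketch reports per-execution activity for statements already past the default buffer-gets threshold (the SQL_ID column assumes Oracle 10g or later):

```sql
-- Per-execution buffer gets and disk reads for heavily used statements
select SQL_ID, EXECUTIONS,
       round(BUFFER_GETS / greatest(EXECUTIONS, 1)) GETS_PER_EXEC,
       round(DISK_READS  / greatest(EXECUTIONS, 1)) READS_PER_EXEC
  from V$SQL
 where BUFFER_GETS > 10000
 order by GETS_PER_EXEC desc;
```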

To modify the default settings for the thresholds, use the MODIFY_STATSPACK_PARAMETER procedure of the STATSPACK package. Specify the snapshot level via the i_snap_level parameter, along with the parameters to change. Table 9-1 lists the available parameters for the MODIFY_STATSPACK_PARAMETER procedure.

Table 9-1. Modification Parameters
Parameter Name       Range of Values            Default   Description
------------------   ------------------------   -------   ---------------------------------------
i_snap_level         0, 5, 6, 7, 10             5         Snapshot level
i_ucomment           Any text                   blank     Comment for the snapshot
i_executions_th      Integer >= 0               100       Threshold for the cumulative number of
                                                          executions
i_disk_reads_th      Integer >= 0               1000      Threshold for the cumulative number of
                                                          disk reads
i_parse_calls_th     Integer >= 0               1000      Threshold for the cumulative number of
                                                          parse calls
i_buffer_gets_th     Integer >= 0               10000     Threshold for the cumulative number of
                                                          buffer gets
i_session_id         Valid SID from V$SESSION   0         Session ID of an Oracle session, if you
                                                          wish to gather session-level statistics
i_modify_parameter   True or False              False     Set to True if you wish to save your
                                                          changes for future snapshots

To increase the Buffer_Gets threshold for a level 5 snapshot to 100,000, issue the following command:

execute STATSPACK.MODIFY_STATSPACK_PARAMETER -
(i_snap_level=>5, i_buffer_gets_th=>100000);

If you plan to run the SNAP procedure more frequently than hourly, you should pin the STATSPACK package in the shared pool following database startup. The following listing shows a trigger that will be executed each time the database is started. The KEEP procedure of the DBMS_SHARED_POOL package pins the package in the shared pool.

create or replace trigger PIN_ON_STARTUP
after startup on database
begin
  DBMS_SHARED_POOL.KEEP('PERFSTAT.STATSPACK', 'P');
end;
/

9.3. Running the Statistics Report

If you have generated more than one snapshot, you can report on the statistics for the period between the two snapshots. The database must not have been shut down between the times the two snapshots were taken. When you execute the report, you will need to know the Snap_ID values for the snapshots. If you run the report interactively, Oracle will provide a list of the available snapshots and the times they were created.

To execute the report, go to the /rdbms/admin directory under the Oracle software home directory. Log into SQL*Plus as the PERFSTAT user and run the spreport.sql file found there, like so:

SQL> @spreport

Oracle will display the database and instance identification information from V$INSTANCE and V$DATABASE and will then call a second SQL file, sprepins.sql. The sprepins.sql file generates the report of the changes in the statistics during the snapshot time interval. The available snapshots will be listed, and you will be prompted to enter a beginning and ending snapshot ID. Unless you specify otherwise, the output will be written to a file named sp_beginning_ending.lst (sp_1_2.lst for a report between the Snap_ID values 1 and 2).

The first portion of the report output provides an overview of the cache areas and their usage. The following listing provides sample output for this section, showing the cache sizes and the load profile:

Cache Sizes (end)
~~~~~~~~~~~~~~~~~
    Buffer Cache:     1,536M      Std Block Size:         8K
Shared Pool Size:     1,648M          Log Buffer:    10,240K

Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:             85,921.59              1,924.45
              Logical reads:             23,431.10                524.80
              Block changes:                520.99                 11.67
             Physical reads:                457.96                 10.26
            Physical writes:                 61.19                  1.37
                 User calls:              2,000.92                 44.82
                     Parses:                430.78                  9.65
                Hard parses:                  0.04                  0.00
                      Sorts:                 57.11                  1.28
                     Logons:                  0.53                  0.01
                   Executes:                699.70                 15.67
               Transactions:                 44.65

  % Blocks changed per Read:    2.22    Recursive Call %:    31.02
 Rollback per transaction %:    0.06    Rows per Sort:       64.62

The load profile helps to identify the type of activity being performed. In this example, the activity recorded includes both queries and transactions. On average, this database supports 44.65 transactions per second while also supporting logical reads and sort operations. The database is actively performing physical reads and writes. In a read-intensive application, the number of physical writes per second should be far below the number of physical reads per second.

Note that the load profile shows the per-second averages in the database; however, if your STATSPACK reporting interval is too large, there may not be such a thing as an average second. For example, if your interval includes both your data-loading processes and your online user activity, the averages would reflect the combined impact of both types of usage, obscuring the true load profile of each separate activity.

NOTE

When using the load profile statistics, remember that these are presented as "per second" statistics—the rest of the STATSPACK report values are presented as values for the full duration of the snapshot.

The next sections of the report show the instance efficiency percentages (such as the buffer hit ratio and library cache hit ratio) followed by the shared pool statistics. The shared pool statistics show the percentage of the shared pool in use and the percentage of SQL statements that have been executed multiple times (as desired). The following listing shows sample shared pool statistics from the report:

Shared Pool Statistics          Begin     End
                                ------   ------
           Memory Usage %:      100.00   100.00
  % SQL with executions>1:       71.68    70.60
% Memory for SQL w/exec>1:       64.63    62.85

Based on the data in the preceding listing, at the time of the second snapshot, 100 percent of the shared pool's memory was in use. Of the statements in the shared pool, only 70 percent had been executed more than once, indicating a potential need to improve cursor sharing in the application. Because the shared pool is completely used, you should consider increasing its size.

The next portion of the generated report shows the top five wait events, the full list of wait events, and the background wait events. Identifying major wait events may help to target your tuning efforts.

Let's consider common wait events: db file scattered reads (waits encountered during multiblock reads such as during full table scans) and db file sequential reads (for single-block reads). For this same database, the statistics (truncated for display here) were as follows:

Event                               Waits   Timeouts   Time (s)
---------------------------- ------------ ---------- ----------
db file sequential read         1,410,528          0      3,631
db file scattered read             20,503          0         36

Adding them together, we find that for the interval there were a total of 1,431,031 waits during reads from datafiles—single-block reads of indexes, single-block reads of tables, plus multiblock reads. How many waits were there per second? In this report, the time interval was one hour—3600 seconds—so the number of waits per second is

1,431,031 waits / 3600 seconds = 397.5 waits per second

How does that compare with the number of reads? From the load profile, we already know the number of physical reads per second:

~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
             Physical reads:                457.96                 10.26

So, in an average second, how many waits occur per physical read? To answer this, divide the waits per second by the reads per second:

397.5 waits per second / 457.96 physical reads per second = 0.868

Based on these statistics, roughly 87 percent of all reads during the interval encountered wait events. That is a very high percentage of waits per read, and you should examine the I/O environment to see if there are opportunities to improve its performance. Don't look for full table scans as the culprits—remember the source of the waits:

Event                               Waits   Timeouts   Time (s)
---------------------------- ------------ ---------- ----------
db file sequential read         1,410,528          0      3,631
db file scattered read             20,503          0         36

Single-block reads accounted for 98.5 percent—that is, 1,410,528/(1,410,528+20,503)—of the waits. Look for inefficient indexes that are being scanned repeatedly. If you eliminate all full table scans in the database, you will reduce the number of physical waits by at most 1.5 percent.

The most resource-intensive SQL statements in the database are listed in the next section of the report, in descending order of buffer gets. Because the buffer gets statistic is cumulative, the query with the most buffer gets may not be the worst-performing query in the database; it may just have been executed enough times to earn the highest ranking. Compare the cumulative number of buffer gets to the cumulative number of disk reads for the queries; if the numbers are close, you should evaluate the explain plan for the query.

NOTE

If the shared pool is flushed between the execution times of the two snapshots, the SQL portion of the output report will not necessarily contain the most resource-intensive SQL executed during the period.

The SQL statements are listed three separate times—ordered by buffer gets, then by physical reads, then by executions. It is common to find resource-intensive SQL commands that show up in only one or two of these three listings. For example, if your application constantly executes a query such as

select TRUNC(SYSDATE) from DUAL;

then it will not generate many physical reads. However, the number of executions of this command may cause it to be one of the most expensive queries in the database. Some applications run queries like this millions of times per day—to the point at which one or more CPUs are constantly busy doing nothing more than returning the current system date to the user. Even if the commands are efficient in terms of physical I/O by themselves, consider their number of executions and their buffer gets—and the corresponding CPU burdens they incur.

Following the SQL statement listing, you will see the list of changes to statistics from V$SYSSTAT, titled "Instance Activity Stats." The V$SYSSTAT statistics are useful for identifying performance issues not shown in the prior sections. For example, you should compare the number of sorts performed on disk to the number performed in memory; increase the PGA_AGGREGATE_TARGET value to reduce disk sorts. If there is a significant number of full table scans of large tables, evaluate the most-used queries. The following listing shows four rows from this section of the report:

Statistic                         Total  per Second   per Trans
--------------------------- ----------- ----------- -----------
sorts (disk)                         13         0.0         0.0
sorts (memory)                  205,651        57.1         1.3
table scans (long tables)            23         0.0         0.0
table scans (short tables)      602,813       167.4         3.8

In this case, there are full table scans of long tables (those with more than five blocks), but they are a minority of the full table scans performed. You should check to see if the small tables are being properly cached in memory (such as in the KEEP pool). The sorts to disk are small in number, but every sort to disk is wasted effort. You should increase the sort area size and avoid writing to the temporary tablespace unless absolutely necessary.
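You can check the same ratio at any time; a sketch against V$SYSSTAT (values are cumulative since instance startup):

```sql
-- Compare disk sorts to in-memory sorts
select NAME, VALUE
  from V$SYSSTAT
 where NAME in ('sorts (disk)', 'sorts (memory)');
```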

The next section of the report provides the I/O statistics by tablespace and by datafile. If the I/O is not properly distributed among your files, you may encounter performance bottlenecks during periods of high activity. You can use this section of the report to identify such bottlenecks and to measure how effectively you have resolved those problems. See Chapter 4 for further details on I/O distribution across files.

Following the I/O statistics, the report lists the buffer cache statistics by pool (DEFAULT, KEEP, and RECYCLE), instance recovery statistics (the number of redo blocks), and the buffer pool advisory. The buffer pool advisory shows an estimated physical read factor so you can judge how much impact increasing the data block buffer cache will have on the number of physical reads required. The buffer pool section is followed by an advisory on the PGA aggregate settings.

The next sections of the report show the buffer statistics related to waits during write events:

                                 Tot Wait        Avg
Class                   Waits    Time (s)   Time (ms)
------------------ ----------- ----------- ----------
data block                 119           0          1
undo header                 86           0          0
undo block                   6           0          0

In general, using Automated Undo Management will reduce or eliminate waits for undo headers and undo blocks. Data block waits may be reduced by using asynchronous I/O or by improving the efficiency of the I/O environment. In this example, a few milliseconds are being spent waiting for data blocks.
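Automated Undo Management is enabled via initialization parameters; a sketch. UNDO_MANAGEMENT is a static parameter, so the change must go to the spfile and takes effect at the next startup, and UNDOTBS1 is a placeholder tablespace name:

```sql
-- Switch to Automated Undo Management at the next instance startup
alter system set UNDO_MANAGEMENT = AUTO scope = spfile;
alter system set UNDO_TABLESPACE = UNDOTBS1 scope = spfile;  -- placeholder name
```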

The next section of the report lists the enqueue waits, including TX locks for individual rows, TC locks for thread checkpoints, and others. Enqueue waits can be caused by application architecture decisions and database configuration issues, so evaluating their cause may require additional testing.

After those sections, the report provides undo segment statistics. First, it lists the activity in the undo segments (writes, wraps, shrinks, extends) and the waits encountered. Following that, the report shows the number of undo blocks written per time interval.

Latch activity and dictionary cache statistics are then presented, followed by the library cache activity. If your "Pct Miss" value is high, you may need to improve cursor sharing in your application or increase the size of the shared pool.
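Outside the report, V$LIBRARYCACHE shows the same miss information per namespace; a sketch:

```sql
-- Library cache hit ratio by namespace; a low GETHITRATIO or frequent
-- reloads suggests poor cursor sharing or an undersized shared pool
select NAMESPACE, GETS,
       round(GETHITRATIO * 100, 2) GET_HIT_PCT,
       RELOADS, INVALIDATIONS
  from V$LIBRARYCACHE;
```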

A shared pool advisory is displayed in the next section of the report. In the advisory, Oracle estimates the number of library object hits expected with each incremental increase in the size of the shared pool.

Following an SGA memory summary (from V$SGA) and a listing of the memory changes during the snapshot interval, the report lists the database initialization parameters in use at the beginning and end of the report.

Taken as a whole, the report generates a significant amount of data, allowing you to develop a profile of the database and its usage. Based on the initialization, file I/O, and SGA data, you can develop an understanding of the major components in the database configuration. Because this report generates so much data, you should be careful not to generate more statistics than you plan to use. The next section of this chapter addresses the management of the gathered data.

9.4. Managing the STATSPACK Data

You should manage the data generated by STATSPACK to guarantee that the space usage and performance of the STATSPACK application meets your requirements as the application data grows. Managing STATSPACK data includes the following steps:

  1. Regularly analyze the STATSPACK data. At a minimum, you should analyze the STATSPACK tables prior to running the spreport.sql report:

    execute DBMS_UTILITY.ANALYZE_SCHEMA('PERFSTAT','COMPUTE');

  2. Purge old data. Because you cannot generate valid interval reports across database shutdown/startup actions, data prior to the last database startup may not be as useful as the most current data. When the data is no longer needed, purge it from the tables. Oracle provides a script, sppurge.sql, to facilitate purges. The sppurge.sql script, located in the /rdbms/admin directory under the Oracle software home directory, lists the currently stored snapshots and prompts you for two input parameters: the beginning and ending snapshot numbers for the purge. The related records in the STATS$ tables will then be deleted. The sppurge script prompts you to back up your old statistics before purging them. You can back up the data by exporting the PERFSTAT schema. You may wish to keep old statistics for baseline measurements.

  3. Truncate the STATSPACK tables when the data is not needed. Old statistical data may no longer be relevant, or you may have imported the old statistics during database migrations or creations. To truncate the old tables, execute the sptrunc.sql SQL*Plus script from within the PERFSTAT account. The script is located in the /rdbms/admin directory under the Oracle software home directory.

9.5. Deinstalling STATSPACK

Because STATSPACK includes public synonyms as well as private objects, you should remove the application via a SYSDBA-privileged account. Oracle provides a script, spdrop.sql, to automate the deinstallation process. From within the /rdbms/admin directory under the Oracle software home directory, log into SQL*Plus and execute the script as shown here:

SQL> connect system/manager as SYSDBA
SQL> @spdrop

The spdrop.sql script calls the scripts that drop the tables, package, public synonyms, and the PERFSTAT user. To reinstall STATSPACK, execute the spcreate.sql script as shown earlier in the chapter.