INSIDE THE AWR REPORT: HOW TO ANALYZE
Part 2.
By Donatien MBADI OUM, OCP, OCE, Oracle Instructor
1. Reminder of Part 1
A database operation is one or more SQL or PL/SQL statements executed by one or more sessions. These operations can cause many performance problems, and problems can arise at any time. A proactive DBA watches for problems and corrects them before they are noticed by users.
Oracle has provided many performance-gathering and reporting tools over the years. Oracle8i introduced the Statspack functionality, which Oracle9i extended. Since 10g, Oracle Database automatically collects and stores workload information in the Automatic Workload Repository (AWR).
In the Part 1 article, we presented the concepts behind the AWR components, such as snapshots, baselines and statistics. We also presented the different types of AWR reports used for standalone or RAC instances, and the related scripts for SQL*Plus, Enterprise Manager, and other tools such as Toad and SQL Developer.
In this article, we will describe the different parts of an AWR report. To start, let's mention some high-level tips regarding AWR.
- Collect multiple AWR reports: It is beneficial to have two AWR reports, one for a good period and one for when performance is poor. Alternatively, create three reports (before/meantime/after) around the time frame in which the problem was experienced and compare them.
- Stick to a particular time: You need the specific time when the database was slow, so that you can choose a shorter time frame and get a more precise report.
- Split large AWR reports into smaller reports: Instead of one report covering a long period, such as one report for 3 hours, it is better to have 3 reports of one hour each. This helps to isolate the problem.
- For RAC, take each instance's individual report: In a RAC environment, generate the report separately for each instance in the cluster. This helps you see whether the instances are balanced the way they should be.
- Use ASH as well: Use AWR to identify the troublesome areas, then use ASH to confirm them.
- Increase the retention period: On instances where you get more performance issues, increase the retention time so that you have more historical data to compare against.
2. Important AWR Scripts
- AWR (Automatic Workload Repository) report for a single instance: awrrpt.sql (the most used)
- AWR global report for a RAC database: awrgrpt.sql
- ADDM (Automatic Database Diagnostic Monitor) report: addmrpt.sql
- AWR single SQL statement report: awrsqrpt.sql
- AWR diff report: awrddrpt.sql
- ASH (Active Session History) report: ashrpt.sql
- AWR global diff report: awrgdrpt.sql
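All of these scripts ship with the database under $ORACLE_HOME/rdbms/admin and are run from SQL*Plus as a privileged user. As a minimal illustration (the '?' is SQL*Plus shorthand for $ORACLE_HOME):

-- single-instance AWR report (the most common case)
SQL> @?/rdbms/admin/awrrpt.sql
-- ASH report, useful to confirm what AWR points to
SQL> @?/rdbms/admin/ashrpt.sql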
3. AWR Report Generation
- Log in to the database.
- Run the command to generate the report: SQL>@$ORACLE_HOME/rdbms/admin/awrrpt.sql. You will then see output like the following:
On this screen, you will see the DBID, DB name, instance number and instance name. You will then choose the report format; HTML is the default. If you press Enter without giving any input, an HTML-format report is generated.
The next input you need to provide is the number of days (num_days) for which to list the snapshots generated. The result lists all snapshot IDs and details for the given number of days. In the example, '2' is given as input to list all snapshots generated in the last 2 days.
After getting the list of snapshots, you need to choose the period for which you want to generate the report. The 'Snap Started' column shows the timestamp of each snapshot, and the associated snapshot ID appears in the 'Snap Id' column of the same row. Select a Snap Id from the list and provide it as the input for 'Enter value for begin_snap'.
Note: You have to choose the specific Snap Id that corresponds to the period you need to analyze. For example, if the performance issue or a particular database load occurred between 4:00 AM and 7:00 AM, you can choose 1699 as the begin_snap. Then press Enter.
Now you need to specify the end_snap. In this step, remember the best practice mentioned above: instead of choosing a large period to analyze, it is advisable to generate multiple AWR reports. In our case, we will choose 1700 as the end_snap and press Enter. We can then generate other AWR reports from 1700 to 1701 and from 1701 to 1702.
Now you need to enter your report name. The default name suggested at the prompt comprises the 'awrrpt' keyword, an index number, the begin snap ID, the end snap ID and the report format, separated by underscores (_). You can provide a name that suits the report better, or simply press Enter.
The report is generated on the database server where you ran the script. Once the report is generated, enter exit and then copy the report to your computer.
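If you prefer to script report generation instead of answering the interactive prompts, the same result can be obtained directly from the DBMS_WORKLOAD_REPOSITORY package. A minimal sketch, where the DBID (1234567890) is a placeholder and the snapshot IDs are the ones from the example above:

-- take a manual snapshot, e.g. to bracket a short test window
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

-- look up your real DBID and instance number first
SELECT d.dbid, i.instance_number FROM v$database d, v$instance i;

-- spool the HTML report for the snapshot pair used in the example
SPOOL awrrpt_1_1699_1700.html
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_HTML(1234567890, 1, 1699, 1700));
SPOOL OFF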
4. Steps to Analyze the AWR Report
Now, let us walk through the report. The AWR report is huge, and the areas to look into depend on the problem. Each section of the AWR report contains useful information; however, the following sections are the ones most frequently examined:
- Report Header: information about the database and environment, along with the snapshot IDs and times.
- Report Summary: Top N Timed Foreground Events and Host/Instance CPU.
- Load Profile: average active sessions, DB CPU, logical and physical reads, user calls, executions, parses, hard parses, logons, rollbacks, transactions.
- Wait Event Statistics: Foreground Wait Events and Wait Event Histogram.
- SQL Statistics: SQL ordered by Elapsed Time, CPU Time, Gets, Reads; Complete List of SQL Text.
- Advisory Statistics: Buffer Pool Advisory, PGA Memory Advisory, Shared Pool Advisory, SGA Target Advisory.
- Segment Statistics: Segments by Logical Reads, Physical Reads, Physical Writes, DB Block Changes, Row Lock Waits, ITL Waits, Buffer Busy Waits.
4.1. Summary statistics
4.1.1. Report Header
This section contains information about the database and environment, along with the snapshot IDs and times. An important thing to verify is that the configuration (such as CPU and memory) did not change when performance degraded.
If the report shows RAC=YES, it is a RAC (Real Application Clusters) database. You also have information about the database role (in case of a Data Guard configuration). If the report shows CDB=YES, it is a Container Database.
The table below gives the significance of the Report Header components:

DB Time
Description: Time spent in the database during the elapsed time.
Analysis: DB Time > Elapsed Time means that sessions were active on the database concurrently. DB Time / Elapsed Time = 4,232.93 / 60.3 = 70.1978, so the database load (average active sessions) is about 70. It may mean that roughly 70 users were active on the database for the whole elapsed time, or 140 users were active for Elapsed Time/2 each, or 280 users were active for Elapsed Time/4 each, and so on. A high DB Time value means that database activity/sessions were high during the AWR window. Here it means that for every minute of elapsed time, about 70 minutes of work are done in the database.

Elapsed Time
Description: The time duration over which this AWR report was generated.
Analysis: The elapsed time should contain the issue duration. Take manual snapshots if required.

CPUs
Description: The logical CPU (thread) count; it is not the 'actual' physical CPU count.

Startup Time
Description: Database startup time.

RAC/Instances
Analysis: If you have more than one node, take AWR reports from all nodes when you do not know on which node the issues are happening. In this case we have a RAC database with 4 instances.

Host Configuration
Description: This gives you the host name, platform, CPU, socket and RAM details.
Analysis: The important thing to notice is the number of cores in the system. In this example there are 36 cores in each of 2 single-thread sockets (72 CPUs in total) and 500 GB of memory on the node.
4.1.2. Top 10 Foreground Events by Total Wait Time
The Top 10 Foreground Events by Total Wait Time section is a very good starting point for diagnosing performance problems. Its purpose is to highlight the most significant issues: it shows the database events that might constitute the bottleneck for the system.

DB CPU
Description: Time running on CPU (time waiting in the run queue is not included).
Analysis: Here 31.6% is the % DB Time for this event, which is acceptable (DB Time was 4,232.93 minutes and DB CPU is 1,337.6 minutes).

Sum of % DB Time
Analysis: The sum should be approximately 100%. If it is way below 100%, it may mean that the wait events were irrelevant, or that the server is overloaded. In our case it is around 70%.

enq: TX – row lock contention
Description: Sessions waited for rows locked by other sessions.

db file sequential read
Description: This event generally indicates a single-block read (an index read, for example). A large number of waits here could indicate poor join orders of tables or unselective indexing. It is normal for this number to be large on a high-transaction, well-tuned system, but it can indicate problems in some circumstances. You should correlate this wait statistic with other known issues within the AWR report, such as inefficient SQL.

db file scattered read
Description: This generally indicates waits related to full table scans. As full table scans are pulled into memory, they rarely fall into contiguous buffers but instead are scattered throughout the buffer cache. A large number here indicates that your tables may have missing or suppressed indexes.

enq: UL – contention
Description: This enqueue wait occurs when an application explicitly takes user-defined locks (for example, through DBMS_LOCK).

enq: TM – contention
Description: This usually happens due to an unindexed foreign key column on a table that is part of a DML operation.
A high level of DB CPU is not usually a reason for concern, unless it is accompanied by a high level of CPU usage or a few SQLs with high CPU time (and potentially sub-optimal execution plans).
Look at the Host CPU and Instance CPU sections. The key things to look for are the values '% Idle' in the Host CPU section and '% Total CPU' in the Instance CPU section.
If '% Idle' is low and '% Total CPU' is high, the instance could have a CPU bottleneck. Otherwise, high DB CPU usage just means that the database is spending a lot of time on CPU (processing) compared to I/O and other events.
In either case (whether CPU is a bottleneck or not), there could be individual expensive SQLs with high CPU time, which could indicate sub-optimal execution plans, especially if accompanied by high buffer gets.
However, the wait events can give us an indication of performance issues. The issues could be due to poor execution plans on individual SQLs, which can be resolved by SQL tuning, or they could be general performance issues that can be resolved by changes to the database or system configuration, or to system resources.
First check the wait class. If the wait class is User I/O, System I/O, Other, etc., this could be fine; but if the wait class is 'Concurrency', there could be a serious problem. Next, look at Total Wait Time (sec), which shows how long the database was waiting in this class, and then at Wait Avg (ms). If Total Wait Time (sec) is high but Wait Avg (ms) is low, you can usually ignore it. If both are high, or Wait Avg (ms) is high, it has to be investigated further.
The direct path write temp wait event occurs when a process writes buffers directly from the PGA to the temporary files (done by the shadow Oracle process).
db file sequential read signifies that the user process is reading buffers into the SGA buffer cache and waiting for a physical I/O call to return.
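As suggested in the tips above, ASH is the natural tool to confirm which SQL is behind a suspicious wait event. A hedged sketch against V$ACTIVE_SESSION_HISTORY (the event name is only an example; FETCH FIRST requires 12c or later):

-- which SQL_IDs spent the most samples on this event in the last hour?
SELECT sql_id, COUNT(*) AS samples
FROM   v$active_session_history
WHERE  event = 'db file sequential read'
AND    sample_time > SYSDATE - 1/24
GROUP  BY sql_id
ORDER  BY samples DESC
FETCH FIRST 10 ROWS ONLY;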
4.1.3. Load Profile
The Load Profile shows average active sessions, DB CPU, logical and physical reads, user calls, executions, parses, hard parses, logons, rollbacks and transactions. Check whether the numbers are consistent with each other and with the general database profile (OLTP/DWH/mixed):
- Pay most attention to physical reads, physical writes, the hard parse to parse ratio, and the executes to transactions ratio.
- The ratio of hard parses to parses tells you how often SQL is being fully parsed; full parsing of SQL statements has a negative effect on performance (see the sketch after this list).
- High hard parse ratios (>2–3 percent) indicate probable bind variable issues or maybe versioning problems.
- Rows per sort can also be reviewed here to see whether large sorts are occurring.
- This section can help in load testing for application releases: you can compare it between the baseline and the high-load situation.
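For the hard parse to parse ratio mentioned above, a quick hedged sketch against V$SYSSTAT; note that these are instance-lifetime counters, whereas AWR shows the delta between two snapshots:

SELECT ROUND(100 * hard.value / NULLIF(total.value, 0), 2) AS hard_parse_pct
FROM   v$sysstat hard,
       v$sysstat total
WHERE  hard.name  = 'parse count (hard)'
AND    total.name = 'parse count (total)';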
Redo Size (Bytes)
Description: The main sources of redo are (in roughly descending order) INSERT, UPDATE and DELETE statements.
Analysis: Not a very scary number in our report, but we are on a data warehouse database of 300 TB: 10,595,298.0 bytes/sec is around 10 MB/sec. High redo figures mean that either lots of new data is being saved into the database, or existing data is undergoing lots of changes. How high is high? Databases are not created equal, so there is no universal standard. However, it is useful to multiply redo per second by 86,400 (the number of seconds in a day) and compare the result to the size of the database: if the numbers are within the same order of magnitude, be curious. Is the database doubling in size every few days? Is it modifying almost every row on a daily basis? Or is something going on that you don't know about? Keep an eye open for any suspicious DML activity: unusual statements, usual statements processed more often than usual, or statements producing more rows per execution than usual. Also take a good look at the segment statistics sections (segments by physical writes, segments by DB block changes, etc.) to see if there are any clues there.

DB CPU
Description: The amount of CPU time spent on user calls. Like DB Time, it does not include background processes. The value is in microseconds.
Analysis: We have 36 cores per socket, so we can potentially use 72 seconds of CPU time per second. Here, DB CPU (s): 22.3 per second means the system is using 22.3 seconds of CPU out of the potential 72 seconds/second it can use. We are not CPU bound.

Logical Reads
Description: Logical Reads = consistent gets + db block gets. Logical and physical reads combined measure how many I/Os (physical and logical) the database is performing. Oracle first checks whether the data is available in the buffer cache (SGA); if it is, the logical read counter is incremented. To explain a bit further: if Oracle gets the data in a block that is consistent with a given point in time, a counter named 'consistent gets' is incremented; if the data is found in current mode, that is, the most up-to-date copy of the block as it is right now, a different counter named 'db block gets' is incremented. A logical read is therefore calculated as the total number of consistent gets plus the total number of db block gets. These two values can be observed in the Instance Activity Stats section.
Analysis: If this is high, go to the 'SQL ordered by Gets' section; that may help point out which SQL is doing the most logical reads.

User Queries
Description: Number of user queries executed.

Parses
Description: The total of all parses, hard and soft.

Hard Parses
Description: The parses requiring a completely new parse of the SQL statement. These consume both latches and shared pool area.
Analysis: How much hard parsing is acceptable? A hard parse occurs when a SQL statement is executed and is not currently in the shared pool. A hard parse rate greater than 100/second could indicate that bind variables are not being used effectively, that the CURSOR_SHARING initialization parameter should be used, or that you have a shared pool sizing problem. If you suspect that excessive parsing is hurting your database's performance: 1) check the Time Model Statistics section (hard parse elapsed time, parse time elapsed, etc.); 2) see if there are any signs of library cache contention among the top events; 3) see if CPU is an issue.

Soft Parses
Description: Soft parses are not listed but are derived by subtracting hard parses from parses. A soft parse reuses a previous hard parse; hence it consumes far fewer resources.

Physical Reads
Description: If Oracle does not find the data in the buffer cache, it reads it from the physical block and increments the physical read counter. Clearly, a buffer get is less expensive than a physical read, because for a physical read the database has to work harder (and longer) to get the data: the time it would have taken if the data were available in the buffer cache, plus the time actually taken to read it from the physical block.
Analysis: If this is high, go to the 'SQL ordered by Reads' section; that may help point out which SQL is doing the most physical reads. In our case, we have 2,471,197.5 physical reads/second, and we notice that most of them come from statistics gathering and from a database link from another database.

User Calls
Description: The number of calls from a user process into the database: things like 'parse', 'fetch', 'execute' and 'close'.
Analysis: This is an extremely useful piece of information, because it sets the scale for other statistics (such as commits, hard parses, etc.). In particular, when the database is executing many times per user call, this could be an indication of excessive context switching (e.g. a PL/SQL function in a SQL statement called too often because of a bad plan). In such cases, looking into 'SQL ordered by Executions' is the logical next step.

Logons
Description: The number of logons; it really means what it says.
Analysis: Establishing a new database connection is expensive (and even more expensive in case of auditing or logon triggers). 'Logon storms' are known to create very serious performance problems. If you suspect that a high number of logons is degrading your performance, check 'connection management call elapsed time' in the Time Model Statistics section.

Sorts
Description: Sort operations consume resources. Also, expensive sorts may cause your SQL to fail because of running out of TEMP space. So obviously, the less you sort, the better (and when you do sort, you should sort in memory). However, sort statistics are rarely useful on their own: normally, if expensive sorts are hurting your SQL's performance, you will notice it elsewhere first.

DB Time
Analysis: The average number of active sessions is simply DB Time per second.

Block Changes
Description: The number of blocks modified during the sample interval.
4.1.4. Instance Efficiency Percentages
Always minimize the number of hard parses; this reduction minimizes the CPU overhead spent performing costly parse work. Every ratio here should ideally be close to 100%.

In Memory Sort %
Description: Shows the percentage of sort operations that happened in memory rather than on disk (temporary tablespace).
Analysis: In-memory sort being low (in the high 90s or lower) indicates PGA_AGGREGATE_TARGET or SORT_AREA_SIZE issues.

Soft Parse %
Description: Shows the percentage of times the SQL in the shared pool was reused: how often sessions issued a SQL statement that was already in the shared pool and could use an existing version of that statement.
Analysis: Soft parsing being low indicates bind variable and versioning issues. With 97.39% soft parse, about 2.61% (100 − soft parse) of parses are hard parses. A quite low hard parse rate is good for us.

% Non-Parse CPU
Description: Oracle should utilize the CPU mostly for statement execution, not for parsing.
Analysis: If this value is near 100%, most of the CPU resources are used for operations other than parsing, which is good for database health. 97.88% in our case is not bad at all.

Execute to Parse %
Description: Shows how often parsed SQL statements are reused without re-parsing.
Analysis: The way this ratio is computed, it will be near 100 percent when the application executes a given SQL statement many times over but has parsed it only once. If the number of parse calls is near the number of execute calls, the ratio trends toward zero. If the number of executes increases while parse calls remain the same, the ratio trends up. When this number is low, parsing is consuming CPU and shared pool latches.

Parse CPU to Parse Elapsed %
Description: Gives the ratio of CPU time to elapsed time spent parsing SQL statements.
Analysis: If the value is low, there could be a parsing problem: parse calls are spending time waiting rather than working. You may need to look at bind variable issues or a shared pool sizing issue. We would start by reviewing library cache contention and contention on shared pool latches. You may need to increase the shared pool. 7.87% in our case is a very bad value.

Buffer Hit %
Description: Measures how many times a required block was found in memory rather than having to execute an expensive read operation on disk to get the block.

Buffer Nowait %
Description: Indicates the percentage of times data buffers were accessed directly without any wait.
Analysis: This ratio relates to requests that a server process makes for a specific buffer: it is the percentage of those requests in which the requested buffer is immediately available. All buffer types are included in this statistic. If the ratio is low, check the Buffer Wait Statistics section of the report for more detail on which type of block is being contended for. Most likely, additional RAM will be required.

Library Hit %
Description: Shows the percentage of times SQL and PL/SQL were found in the shared pool.
Analysis: Library Hit % is great when it is near 100%. If it were under 95%, we would investigate the size of the shared pool. If this ratio is low, we may need to: increase the SHARED_POOL_SIZE init parameter; set CURSOR_SHARING to FORCE; increase SHARED_POOL_RESERVED_SIZE, which may be too small; address inefficient sharing of SQL, PL/SQL or Java code; or address insufficient use of bind variables.

Latch Hit %
Description: Shows the percentage of times latches are acquired without having to wait.
Analysis: If Latch Hit % is below 99%, you may have a latch problem. Tune latches to reduce cache contention.

Redo Nowait %
Description: Shows whether the redo log buffer has sufficient size.
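These ratios can be recomputed from the underlying counters with the usual formulas (Soft Parse % = 1 − hard parses / total parses; Execute to Parse % = 1 − parses / executions). A minimal sketch over the instance-lifetime values in V$SYSSTAT:

SELECT ROUND(100 * (1 - hard.value  / NULLIF(total.value, 0)), 2) AS soft_parse_pct,
       ROUND(100 * (1 - total.value / NULLIF(execs.value, 0)), 2) AS execute_to_parse_pct
FROM   v$sysstat hard, v$sysstat total, v$sysstat execs
WHERE  hard.name  = 'parse count (hard)'
AND    total.name = 'parse count (total)'
AND    execs.name = 'execute count';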
4.2. Wait Event Statistics
4.2.1. Time Model Statistics
This is a detailed breakdown of system resource consumption. The statistics are ordered by Time (s) and % of DB Time; they tell you where processing time was spent during the snapshot interval.
The sum of all % of DB Time values may exceed 100%, because these times are cumulative: for example, 'sql execute elapsed time' includes sub-parts such as 'parse time elapsed', which in turn includes 'hard parse elapsed time'. So check whether 'hard parse elapsed time' takes a large percentage; if it does, parsing deserves your attention.

SQL Execute Elapsed Time
Description: Time spent executing SQL statements.
Analysis: Out of all the DB Time, which is 180,752.18 seconds, 71.17% (128,641.33 seconds) was spent executing SQL queries, so our attention will be on finding out which SQLs took so much DB Time.

DB CPU
Description: DB CPU represents time spent on the CPU resource by foreground user processes. This time does not include time waiting for CPU.
Analysis: DB Time and DB CPU define two important timescales: wait times should be measured against DB Time, while CPU consumption during a certain activity (e.g. CPU time spent parsing) should be measured against DB CPU. The percentage shown for DB CPU may not be the only figure to focus on; you should also compute the DB CPU usage as a share of the CPU power available: DB CPU usage (% of CPU power available) = CPU time / NUM_CPUS / elapsed time, where NUM_CPUS is found in the Operating System Statistics section. Of course, if there are other major CPU consumers on the system, the formula must be adjusted accordingly; to check that, look at OS CPU usage statistics directly in the OS.

Parse Time Elapsed
Description: Represents time spent on syntax and semantic checks.

Hard Parse Elapsed Time
Description: Represents time spent on syntax and semantic checks plus the time spent optimizing the SQL and generating the execution plan.

% DB Time
Description: In the time model statistics hierarchy, a child statistic may be counted under more than one parent, which is why the sum of the percentages can exceed 100.

Soft Parse Time
Description: Can be obtained by subtracting the hard parse time from the total parse time.
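The same statistics can be inspected live in V$SYS_TIME_MODEL, whose values are cumulative since instance startup and expressed in microseconds. A minimal sketch, ordered the way the AWR section orders them:

SELECT stat_name,
       ROUND(value / 1e6, 2) AS seconds
FROM   v$sys_time_model
ORDER  BY value DESC;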
4.2.2. Foreground Wait Events
This section is useful because there could be time-consuming wait events that do not appear in the 'Top N Timed Foreground Events'.

SQL*Net message from client
Description: An idle wait event.
Analysis: We can find the average number of inactive sessions from this wait event. This does not mean user sessions as such, but the number of such connections from the application server connection pool.

direct path read/write temp
Analysis: Shows excessive sorting/hashing/global temporary table/bitmap activity going to your temporary tablespace. Review the PGA_AGGREGATE_TARGET setting. Even if it looks big enough, getting multiple small sorts to disk could mean your user load is over-utilizing it.

SQL*Net message to client
Analysis: SQL*Net message to client waits almost always indicate network contention.

SQL*Net more data from client
Analysis: If this is very low, it indicates that the Oracle Net session data unit (SDU) size is likely set correctly.

db file sequential read
Description: A high number of waits for this event indicates possible problems with SQL join operations or the use of non-selective indexes. This wait is caused by many single-block reads.
Analysis: Usually indicates memory starvation; look at the db cache analysis and at buffer busy waits along with cache latch issues.

db file scattered read
Description: This wait is usually caused by full table scans. This is normal for a DSS (Decision Support System) but critical for OLTP (Online Transaction Processing). The DBA should consider caching small tables to eliminate file reads.
Analysis: Usually indicates excessive full table scans; look at the AWR segment statistics for tables that are fully scanned.

buffer busy waits
Description: This wait is caused by concurrent access to buffers in the buffer cache. This statistic should be correlated with the Buffer Waits section of the AWR report.

log file sync
Analysis: Log-file-related waits: look at excessive log switches, excessive commits or slow I/O subsystems.

free buffer waits
Description: This wait indicates that Oracle waited many times for a free buffer in the buffer cache. This could be caused by a small buffer cache, or by a large number of reads that populated the buffer cache with unnecessary data.
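To see the current cumulative picture of non-idle waits outside an AWR window, a hedged sketch against V$SYSTEM_EVENT (the row-limiting clause requires 12c or later):

SELECT event, wait_class, total_waits,
       ROUND(time_waited_micro / 1e6, 1) AS time_waited_s
FROM   v$system_event
WHERE  wait_class <> 'Idle'
ORDER  BY time_waited_micro DESC
FETCH FIRST 10 ROWS ONLY;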
4.3. Advisory Statistics
4.3.1. Buffer Pool Advisory
This section of the AWR report shows the estimates from the buffer pool advisory, which are computed based on the I/O activity that occurred during the snapshot interval. The advisory section is read row by row, starting from the row where the third field, 'Size Factor', equals 1.0.
The row with 'Size Factor' = 1.0 shows the approximate actual size of the buffer cache during the time interval of the AWR report. The size itself is shown in the second column, 'Size for Estimate (M)': 1,024 MB. It has been allocated either automatically (if SGA_TARGET > 0) or manually.
The sixth field of this row, 'Estimated Physical Reads', shows the estimated physical reads for a 1,024 MB default buffer cache. Note also that 'Est Physical Read Factor' = 1.00 in the same row.

First column ('P')
Description: Apart from the default buffer cache pool (or sub-pool), which is always present, the buffer cache may have other sub-pools. The Buffer Pool Advisory section then has a separate subsection for each of those sub-pools, distinguished by a letter in the leftmost column: 'D' – default buffer cache (always present); 'K' – keep buffer cache (if the db_keep_cache_size parameter is defined); 'R' – recycle buffer cache (if the db_recycle_cache_size parameter is defined); <N> – caches for non-default block sizes (if defined with the db_<N>k_cache_size parameters).

Size Factor
Description: Shows the ratio of the proposed size of the buffer cache (increased or decreased) to the approximate actual size currently in use, found in the row with 'Size Factor' = 1.0.

Estimated Phys Read Factor
Description: Shows the ratio of the estimated number of physical reads for the proposed (increased or decreased) size of the buffer cache to the number of physical reads calculated for the current size, found in the row with 'Estimated Phys Read Factor' = 1.00.
Analysis: In our case, we can see that increasing the buffer cache from 32 to 64 GB would not really help us significantly.
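The numbers in this section come from the same machinery exposed live in V$DB_CACHE_ADVICE. A minimal sketch for the default pool; the 8 KB block size is an assumption, adjust it to your database:

SELECT size_for_estimate  AS size_mb,
       size_factor,
       estd_physical_read_factor,
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    block_size = 8192
ORDER  BY size_for_estimate;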
4.3.2. PGA Memory Advisory
The preferred and easiest way of monitoring and setting the PGA_AGGREGATE_TARGET instance parameter is by examining the PGA Memory Advisory section in an AWR report.

Size Factr
Description: Indicates the size factor of the PGA estimates.
Analysis: A value of 1 indicates the current PGA size.

PGA Target Est (MB)
Description: Shows the PGA size.
Analysis: The 'PGA Target Est (MB)' value in the row where Size Factr = 1 shows your current PGA size: 51 GB.

Estd Extra W/A MB Read/Written to Disk
Description: When you go down or up the advisory section from the row with 'Size Factr' = 1.0, you get estimates of disk usage, in the column 'Estd Extra W/A MB Read/Written to Disk', for bigger or smaller settings of PGA_AGGREGATE_TARGET. The lower the disk usage figure in this column, usually the better: a lower value means fewer work areas have to be spilled to disk, enhancing the performance of the Oracle instance.

Estd PGA Overalloc Count
Description: Shows how many times database instance processes would need to request more PGA memory at the OS level than the amount shown in the 'PGA Target Est (MB)' value of the respective row. Ideally this field should be 0 (indicating that the PGA is correctly sized and no overallocations should take place); that is your equally important second goal. In the given example this goal is achieved with a PGA_AGGREGATE_TARGET of even 12,800 MB.
Analysis: So, our PGA allocation is way too high.
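Between snapshots, the same advisory can be read live from V$PGA_TARGET_ADVICE; a minimal sketch:

SELECT ROUND(pga_target_for_estimate / 1024 / 1024) AS target_mb,
       pga_target_factor,
       ROUND(estd_extra_bytes_rw / 1024 / 1024)     AS estd_extra_mb_rw,
       estd_overalloc_count
FROM   v$pga_target_advice
ORDER  BY pga_target_for_estimate;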
4.3.3. Shared Pool Advisory
The Shared Pool Advisory report helps in right-sizing the Oracle shared pool. Much like the PGA Memory Advisory or the Buffer Pool Advisory, it provides insight into what would happen should you add memory to, or remove memory from, the shared pool. This can help you reclaim much-needed memory if you have over-allocated the shared pool, and it can significantly improve performance if you have not allocated enough memory to it.

SP Size Factr
Description: Indicates the shared pool size factor.
Analysis: The starting point here is 'SP Size Factr' = 1.0, which gives the current memory allocation for the shared pool.

Shared Pool Size
Description: Indicates the shared pool size.
Analysis: In this case approximately 112 GB is allocated to the shared pool (corresponding to SP Size Factr = 1.0).

Est LC Time Saved
Description: LC means library cache.
Analysis: Shows how much of this time you could save by increasing the shared pool (not really applicable in our case).
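The live counterpart of this section is V$SHARED_POOL_ADVICE; a minimal sketch:

SELECT shared_pool_size_for_estimate AS size_mb,
       shared_pool_size_factor       AS size_factor,
       estd_lc_time_saved,
       estd_lc_time_saved_factor
FROM   v$shared_pool_advice
ORDER  BY shared_pool_size_for_estimate;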
4.3.4. SGA Target Advisory
The SGA Target Advisory report is somewhat of a summation of all the advisory reports previously presented in the AWR report. It helps you determine the impact of changing the SGA target size on overall database performance. The report uses DB Time as the measure of the increase or decrease in performance relative to the memory change made; it also summarizes an estimate of the physical reads associated with each listed SGA setting.
Start at a 'Size Factor' of 1 (the current size of the SGA). If the 'Est DB Time (s)' decreases significantly as the 'Size Factor' increases, then increasing the SGA will significantly reduce physical reads and improve performance. In our example, however, the Est DB Time does not decrease much as the SGA increases, so increasing the SGA will not be beneficial.
When the SQL
requires a large volume of data access, increasing the SGA_TARGET size can
reduce the amount of disk I/O and improve the SQL performance.
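The corresponding live view is V$SGA_TARGET_ADVICE; a minimal sketch showing the estimated DB time and physical reads per proposed SGA size:

SELECT sga_size,
       sga_size_factor,
       estd_db_time,
       estd_db_time_factor,
       estd_physical_reads
FROM   v$sga_target_advice
ORDER  BY sga_size;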
4.4. Segment Statistics
4.4.1. Segments by Logical Reads
This statistic displays segment details based on the logical reads that occurred, sorted in descending order of the 'Logical Reads' column. It provides information about the segments on which the most logical reads are happening. Most of the corresponding SQLs can be found under SQL Statistics -> SQL ordered by Gets.
These reports can help you find the "hot" objects in the database. You may want to review those objects and determine why they are hot, and whether there are any tuning opportunities available on them (e.g. partitioning) or on the SQL accessing them.
When segments suffer from high logical I/O, they are listed here. When a table has high logical reads and its index has relatively few logical reads, there is a high possibility that some SQL is using the index inefficiently, creating a throw-away issue on the table: find the columns of the conditions evaluated on the table side and move them into the index. When the index has high logical reads, the index is being scanned over an excessively wide range: reduce the range with an additional filtering condition whose columns are in the same index.
If a SQL is suboptimal, this section can indicate the tables and indexes where the workload or throwaway occurs and where the performance issue lies. It can be particularly useful if there are no actual statistics available elsewhere (e.g., no Row Source Operation counts (STAT lines) in the SQL trace, and no actuals in the SQLT/Display Cursor report).
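The same ranking can be pulled live from V$SEGMENT_STATISTICS (swap the statistic name for 'physical reads' to anticipate the next section); a hedged sketch using 12c+ row limiting:

SELECT owner, object_name, object_type, value AS logical_reads
FROM   v$segment_statistics
WHERE  statistic_name = 'logical reads'
ORDER  BY value DESC
FETCH FIRST 10 ROWS ONLY;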
4.4.2. Segments by Physical Reads
If there are a high number of physical read waits (db file scattered read, db file sequential read and direct path read), this section can indicate on which segments (tables or indexes) the issue occurs.
This can help identify suboptimal execution
plan lines. It can also help identify changes to tablespace and storage
management that will improve performance.
When SQLs perform excessive physical reads on particular segments, this section lists them. You need to check whether some SQLs are doing unnecessary full scans or wide range scans.
This statistic displays segment details based on the physical reads that occurred, sorted in descending order of the 'Physical Reads' column. It provides information about the segments on which the most physical reads are happening.
Queries using these segments should be analyzed to check whether any full table scan (FTS) is happening on them. If so, proper indexes should be created to eliminate the FTS. Most of these SQLs can be found under SQL Statistics -> SQL ordered by Reads.
For example, if an object is showing up on the
physical reads report, it may be that an index is needed on that object.