![]() | ![]() |
Home |
|
|
Performance and Tuning Guide Volumes 1 - 3 (Online Only) |
|
| Chapter 20 Advanced Optimizing Tools |
Chapter 20
This chapter describes query processing options that affect the optimizer's choice of join order, index, I/O size and cache strategy.
Being familiar with the information presented in the Basics volume helps to understand the material in this chapter. Use caution, as the tools allow you to override the decisions made by Adaptive Server's optimizer and can have an extreme negative effect on performance if misused. You should understand the impact on the performance of both your individual query and the possible implications for overall system performance.
Adaptive Server's advanced, cost-based optimizer produces excellent query plans in most situations. But there are times when the optimizer does not choose the proper index for optimal performance or chooses a suboptimal join order, and you need to control the access methods for the query. The options described in this chapter allow you that control.
In addition, while you are tuning, you may want to see the effects of a different join order, I/O size, or cache strategy. Some of these options let you specify query processing or access strategy without costly reconfiguration.
Adaptive Server provides tools and query clauses that affect query optimization and advanced query analysis tools that let you understand why the optimizer makes the choices that it does.
This chapter suggests workarounds for certain optimization problems. If you experience these types of problems, please call Sybase Technical Support.
Adaptive Server lets you specify these optimization choices by including commands in a query batch or in the text of the query:
The order of tables in a join
The number of tables evaluated at one time during join optimization
The index used for a table access
The I/O size
The cache strategy
The degree of parallelism
In a few cases, the optimizer fails to choose the best plan. In some of these cases, the plan it chooses is only slightly more expensive than the "best" plan, so you need to weigh the cost of maintaining forced options against the slower performance of a less than optimal plan.
The commands to specify join order, index, I/O size, or cache strategy, coupled with the query-reporting commands like statistics io and showplan, can help you determine why the optimizer makes its choices.
Warning!
Use the options described in this chapter with caution. The forced query plans may be inappropriate in some situations and may cause very poor performance. If you include these options in your applications, check query plans, I/O statistics, and other performance data regularly.
These options are generally intended for use as tools for tuning and experimentation, not as long-term solutions to optimization problems.
Adaptive Server optimizes join orders to minimize I/O. In most cases, the order that the optimizer chooses does not match the order of the from clauses in your select command. To force Adaptive Server to access tables in the order they are listed, use:
set forceplan [on|off]
The optimizer still chooses the best access method for each table. If you use forceplan, specifying a join order, the optimizer may use different indexes on tables than it would with a different table order, or it may not be able to use existing indexes.
You might use this command as a debugging aid if other query analysis tools lead you to suspect that the optimizer is not choosing the best join order. Always verify that the order you are forcing reduces I/O and logical reads by using set statistics io on and comparing I/O with and without forceplan.
If you use forceplan, your routine performance maintenance checks should include verifying that the queries and procedures that use it still require the option to improve performance.
You can include forceplan in the text of stored procedures.
set forceplan forces only join order, and not join type. There is no command for specifying the join type; you can disable merge joins at the server or session level.
See "Enabling and disabling merge joins" for more information.
Forcing join order has these risks:
Misuse can lead to extremely expensive queries. Always test the query thoroughly with statistics io, and with and without forceplan.
It requires maintenance. You must regularly check queries and stored procedures that include forceplan. Also, future versions of Adaptive Server may eliminate the problems that lead you to incorporate index forcing, so you should check all queries using forced query plans each time a new version is installed.
Before you use forceplan:
Check showplan output to determine whether index keys are used as expected.
Use dbcc traceon(302) to look for other optimization problems.
Run update statistics on the index.
Use update statistics to add statistics for search arguments on unindexed search clauses in the query, especially for search arguments that match minor keys in compound indexes.
If the query joins more than four tables, use set table count to see if it results in an improved join order.
See "Specifying the number of tables considered by the optimizer".
Adaptive Server optimizes joins by considering permutations of two to four tables at a time, as described in "Costing and optimizing joins". If you suspect that an inefficient join order is being chosen for a join query, you can use the set table count option to increase the number of tables that are considered at the same time. The syntax is:
set table count int_value
Valid values are 0 though 8; 0 restores the default behavior.
For example, to specify 4-at-a-time optimization, use:
set table count 4
dbcc traceon(310) reports the number of tables considered at a time. See "dbcc traceon(310) and final query plan costs" for more information.
As you decrease the value, you reduce the chance that the optimizer will consider all the possible join orders. Increasing the number of tables considered at one time during join ordering can greatly increase the time it takes to optimize a query.
Since the time it takes to optimize the query is increased with each additional table, the set table count option is most useful when the execution savings from improved join order outweighs the extra optimizing time. Some examples are:
If you think that a more optimal join order can shorten total query optimization and execution time, especially for stored procedures that you expect to be executed many times once a plan is in the procedure cache
When saving abstract plans for later use
Use statistics time to check parse and compile time and statistics io to verify that the improved join order is reducing physical and logical I/O.
If increasing the table count produces an improvement in join optimization, but increases the CPU time unacceptably, rewrite the from clause in the query, specifying the tables in the join order indicated by showplan output, and use forceplan to run the query. Your routine performance maintenance checks should include verifying that the join order you are forcing still improves performance.
You can specify the index to use for a query using the (index index_name) clause in select, update, and delete statements. You can also force a query to perform a table scan by specifying the table name. The syntax is:
select select_list
from table_name [correlation_name]
(index {index_name | table_name } )
[, table_name ...]
where ...delete table_name
from table_name [correlation_name]
(index {index_name | table_name }) ... update table_name set col_name = value
from table_name [correlation_name]
(index {index_name | table_name})...For example:
select pub_name, title
from publishers p, titles t (index date_type)
where p.pub_id = t.pub_id
and type = "business"
and pubdate > "1/1/93"Specifying an index in a query can be helpful when you suspect that the optimizer is choosing a suboptimal query plan. When you use this option:
Always check statistics io for the query to see whether the index you choose requires less I/O than the optimizer's choice.
Ttest a full range of valid values for the query clauses, especially if you are tuning queries:
Tuning queries on tables that have skewed data distribution
Performing range queries, since the access methods for these queries are sensitive to the size of the range
Use this option only after testing to be certain that the query performs better with the specified index option. Once you include an index specification in a query, you should check regularly to be sure that the resulting plan is still better than other choices made by the optimizer.
If a nonclustered index has the same name as the table, specifying a table name causes the nonclustered index to be used. You can force a table scan using select select_list from tablename (0).
Specifying indexes has these risks:
Changes in the distribution of data could make the forced index less efficient than other choices.
Dropping the index means that all queries and procedures that specify the index print an informational message indicating that the index does not exist. The query is optimized using the best alternative access method.
Maintenance increases, since all queries using this option need to be checked periodically. Also, future versions of Adaptive Server may eliminate the problems that lead you to incorporate index forcing, so you should check all queries using forced indexes each time you install a new version.
Before specifying an index in queries:
Check showplan output for the "Keys are" message to be sure that the index keys are being used as expected.
Use dbcc traceon(302) to look for other optimization problems.
Run update statistics on the index.
If the index is a composite index, run update statistics on the minor keys in the index, if they are used as search arguments. This can greatly improve optimizer cost estimates. Creating statistics for other columns frequently used for search clauses can also improve estimates.
If your Adaptive Server is configured for large I/Os in the default data cache or in named data caches, the optimizer can decide to use large I/O for:
Queries that scan entire tables
Range queries using clustered indexes, such as queries using >, <, > x and < y, between, and like "charstring %"
Queries that scan a large number of index leaf pages
If the cache used by the table or index is configured for 16K I/O, a single I/O can read up to eight pages simultaneously. Each named data cache can have several pools, each with a different I/O size. Specifying the I/O size in a query causes the I/O for that query to take place in the pool that is configured for that size. See the System Administration Guide for information on configuring named data caches.
To specify an I/O size that is different from the one chosen by the optimizer, add the prefetch specification to the index clause of a select, delete, or update statement. The syntax is:
select select_list
from table_name
( [index {index_name | table_name} ]
prefetch size)
[, table_name ...]
where ... delete table_name from table_name
( [index {index_name | table_name} ]
prefetch size)
... update table_name set col_name = value
from table_name
( [index {index_name | table_name} ]
prefetch size)
...The valid prefetch size depends on the page size. If no pool of the specified size exists in the data cache used by the object, the optimizer chooses the best available size.
If there is a clustered index on au_lname, this query performs 16K I/O while it scans the data pages:
select *
from authors (index au_names prefetch 16)
where au_lname like "Sm%"If a query normally performs large I/O, and you want to check its I/O performance with 2K I/O, you can specify a size of 2K:
select type, avg(price)
from titles (index type_price prefetch 2)
group by typeReference to Large I/Os are on a 2K logical page size server. If you have an 8K page size server, the basic unit for the I/O is 8K. If you have a 16K page size server, the basic unit for the I/O is 16K.
When you specify an I/O size with prefetch, the specification can affect both the data pages and the leaf-level index pages. Table 20-1 shows the effects.
Access method | Large I/O performed on |
Table scan | Data pages |
Clustered index | Data pages only, for allpages-locked tables Data pages and leaf-level index pages for data-only-locked tables |
Nonclustered index | Data pages and leaf pages of nonclustered index |
showplan reports the I/O size used for both data and leaf-level pages.
See "I/O Size Messages" for more information.
In most cases, when you specify an I/O size in a query, the optimizer incorporates the I/O size into the query's plan. However, there are times when the specification cannot be followed, either for the query as a whole or for a single, large I/O request.
Large I/O cannot be used for the query if:
The cache is not configured for I/O of the specified size. The optimizer substitutes the best size available.
sp_cachestrategy has been used to disable large I/O for the table or index.
Large I/O cannot be used for a single buffer if
Any of the pages included in that I/O request are in another pool in the cache.
The page is on the first extent in an allocation unit. This extent holds the allocation page for the allocation unit, and only seven data pages.
No buffers are available in the pool for the requested I/O size.
Whenever a large I/O cannot be performed, Adaptive Server performs 2K I/O on the specific page or pages in the extent that are needed by the query.
To determine whether the prefetch specification is followed, use showplan to display the query plan and statistics io to see the results on I/O for the query. sp_sysmon reports on the large I/Os requested and denied for each cache.
By default, a query uses large I/O whenever a large I/O pool is configured and the optimizer determines that large I/O would reduce the query cost. To disable large I/O during a session, use:
set prefetch off
To reenable large I/O, use:
set prefetch on
If large I/O is turned off for an object using sp_cachestrategy, set prefetch on does not override that setting.
If large I/O is turned off for a session using set prefetch off, you cannot override the setting by specifying a prefetch size as part of a select, delete, or insert statement.
The set prefetch command takes effect in the same batch in which it is run, so you can include it in a stored procedure to affect the execution of the queries in the procedure.
For queries that scan a table's data pages or the leaf level of a nonclustered index (covered queries), the Adaptive Server optimizer chooses one of two cache replacement strategies: the fetch-and-discard (MRU) strategy or the LRU strategy.
See "Overview of cache strategies" for more information about these strategies.
The optimizer may choose the fetch-and-discard (MRU) strategy for:
Any query that performs table scans
A range query that uses a clustered index
A covered query that scans the leaf level of a nonclustered index
An inner table in a nested-loop join, if the inner table is larger than the cache
The outer table of a nested-loop join, since it needs to be read only once
Both tables in a merge join
You can affect the cache strategy for objects:
By specifying lru or mru in a select, update, or delete statement
By using sp_cachestrategy to disable or reenable mru strategy
If you specify MRU strategy, and a page is already in the data cache, the page is placed at the MRU end of the cache, rather than at the wash marker.
Specifying the cache strategy affects only data pages and the leaf pages of indexes. Root and intermediate pages always use the LRU strategy.
You can use lru or mru (fetch-and-discard) in a select, delete, or update command to specify the I/O size for the query:
select select_list
from table_name (index index_name prefetch size [lru|mru])
[, table_name ...]
where ... delete table_name from table_name (index index_name
prefetch size [lru|mru]) ... update table_name set col_name = value from table_name (index index_name
prefetch size [lru|mru]) ...This query adds the LRU replacement strategy to the 16K I/O specification:
select au_lname, au_fname, phone
from authors (index au_names prefetch 16 lru)For more information about specifying a prefetch size, see "Specifying I/O size in a query".
Status bits in the sysindexes table identify whether a table or an index should be considered for large I/O prefetch or for MRU replacement strategy. By default, both are enabled. To disable or reenable these strategies, use sp_cachestrategy. The syntax is:
sp_cachestrategy dbname , [ownername.]tablename
[, indexname | "text only" | "table only"
[, { prefetch | mru }, { "on" | "off"}]]This command turns off the large I/O prefetch strategy for the au_name_index of the authors table:
sp_cachestrategy pubtune,
authors, au_name_index, prefetch, "off"
This command reenables MRU replacement strategy for the titles table:
sp_cachestrategy pubtune,
titles, "table only", mru, "on"
Only a System Administrator or the object owner can change or view the cache strategy status of an object.
To see the cache strategy that is in effect for a given object, execute sp_cachestrategy, with the database and object name:
sp_cachestrategy pubtune, titles
object name index name large IO MRU ---------------- ---------------- -------- -------- titles NULL ON ON
showplan output shows the cache strategy used for each object, including worktables.
By default, merge joins are not enabled at the server level. When merge joins are disabled, the server only costs nested-loop joins, and merge joins are not considered. To enable merge joins server-wide, set enable sort-merge joins and JTC to 1. This also enables join transitive closure.
The command set sort_merge on overrides the server level to allow use of merge joins in a session or stored procedure.
To enable merge joins, use:
set sort_merge on
To disable merge joins, use:
set sort_merge off
For information on configuring merge joins server-wide see the System Administration Guide.
By default, join transitive closure is not enabled at the server level, since it can increase optimization time. You can enable join transitive closure at a session level with set jtc on. The session-level command overrides the server-level setting for the enable sort-merge joins and JTC configuration parameter.
For queries that execute quickly, even when several tables are involved, join transitive closure may increase optimization time with little improvement in execution cost. For example, with join transitive closure applied to this query, the number of possible joins is multiplied for each added table:
select * from t1, t2, t3, t4, ... tN where t1.c1 = t2.c1 and t1.c1 = t3.c1 and t1.c1 = t4.c1 ... and t1.c1 = tN.c1
For joins on very large tables, however, the additional optimization time involved in costing the join orders added by join transitive closure may result in a join order that greatly improves the response time.
You can use set statistics time to see how long it takes to optimize the query. If running queries with set jtc on greatly increases optimization time, but also improves query execution by choosing a better join order, check the showplan or dbcc traceon(302, 310) output. Explicitly add the useful join orders to the query text. You can run the query without join transitive closure, and get the improved execution time, without the increased optimization time of examining all possible join orders generated by join transitive closure.
You can also enable join transitive closure and save abstract plans for queries that benefit. If you then execute those queries with loading from the saved plans enabled, the saved execution plan is used to optimize the query, making optimization time extremely short.
See Chapter 29, "Introduction to Abstract Plans," for more information on using abstract plans.
For information on configuring join transitive closure server-wide see the System Administration Guide.
The parallel and degree_of_parallelism extensions to the from clause of a select command allow users to restrict the number of worker processes used in a scan.
For a parallel partition scan to be performed, the degree_of_parallelism must be equal to or greater than the number of partitions. For a parallel index scan, specify any value for the degree_of_parallelism.
The syntax for the select statement is:
select...
[from {tablename}
[(index index_name
[parallel [degree_of_parallelism | 1]]
[prefetch size] [lru|mru])],
{tablename} [([index_name]
[parallel [degree_of_parallelism | 1]
[prefetch size] [lru|mru])] ...Table 20-2 shows how to combine the index and parallel keywords to obtain serial or parallel scans.
To specify this type of scan: | Use this syntax: |
Parallel partition scan | (index tablename parallel N) |
Parallel index scan | (index index_name parallel N) |
Serial table scan | (index tablename parallel 1) |
Serial index scan | (index index_name parallel 1) |
Parallel, with the choice of table or index scan left to the optimizer | (parallel N) |
Serial, with the choice of table or index scan left to the optimizer | (parallel 1) |
When you specify the parallel degree for a table in a merge join, it affects the degree of parallelism used for both the scan of the table and the merge join.
You cannot use the parallel option if you have disabled parallel processing either at the session level with the set parallel_degree 1 command or at the server level with the parallel degree configuration parameter. The parallel option cannot override these settings.
If you specify a degree_of_parallelism that is greater than the maximum configured degree of parallelism, Adaptive Server ignores the hint.
The optimizer ignores hints that specify a parallel degree if any of the following conditions is true:
The from clause is used in the definition of a cursor.
parallel is used in the from clause of an inner query block of a subquery, and the optimizer does not move the table to the outermost query block during subquery flattening.
The table is a view, a system table, or a virtual table.
The table is the inner table of an outer join.
The query specifies exists, min, or max on the table.
The value for the max scan parallel degree configuration parameter is set to 1.
An unpartitioned clustered index is specified or is the only parallel option.
A nonclustered index is covered.
The query is processed using the OR strategy.
For an explanation of the OR strategy, see "Access Methods and Costing for or and in Clauses".
The select statement is used for an update or insert.
To specify the degree of parallelism for a single query, include parallel after the table name. This example executes in serial:
select * from titles (parallel 1)
This example specifies the index to be used in the query, and sets the degree of parallelism to 5:
select * from titles
(index title_id_clix parallel 5)
where ...To force a table scan, use the table name instead of the index name.
For data-only-locked tables of 15 pages or fewer, Adaptive Server does not consider a table scan if there is a useful index on the table. Instead, it always chooses the cheapest index that matches any search argument that can be optimized in the query. The locking required for an index scan provides higher concurrency and reduces the chance of deadlocks, although slightly more I/O may be required than for a table scan.
If concurrency on small tables is not an issue, and you want to optimize the I/O instead, you can disable this optimization with sp_chgattribute. This command turns off concurrency optimization for a table:
sp_chgattribute tiny_lookup_table,
"concurrency_opt_threshold", 0With concurrency optimization disabled, the optimizer can choose table scans when they require fewer I/Os.
You can also increase the concurrency optimization threshold for a table. This command sets the concurrency optimization threshold for a table to 30 pages:
sp_chgattribute lookup_table,
"concurrency_opt_threshold", 30The maximum value for the concurrency optimization threshold is 32,767. Setting the value to -1 enforces concurrency optimization for a table of any size. It may be useful in cases where a table scan is chosen over indexed access, and the resulting locking results in increased contention or deadlocks.
The current setting is stored in systabstats.conopt_thld and is printed as part of optdiag output.
Concurrency optimization affects only data-only-locked tables. Table 20-3 shows the effect of changing the locking scheme.
Changing locking scheme from | Effect on stored value |
Allpages to data-only | Set to 15, the default |
Data-only to allpages | Set to 0 |
One data-only scheme to another | Configured value retained |
|
|