Last week I had a rather pleasant experience tuning an SQL query—I was able to reduce the time needed to run the query to 62% of the original time. One of my clients had written a query for an ad-hoc report, and before running it on the production system he sent it to me so that I could have a look. Would that all database users were so considerate! He did kick it off on the test system, where it took 55 minutes to run. The test database had about half as much data as the production database. He wanted to be sure there were no major blunders in the query before he ran it against production. The query, as he sent it to me, looked like this:
SELECT TO_CHAR(uabpymt_pymt_date,'yyyy'), utrsrat_glcl_code,
       utrglcl_desc, SUM(uarappl_amount)
  FROM uabpymt, uarappl, utrsrat, utrglcl, uabopen
 WHERE uabpymt_pymt_date >= '09-JUL-1999'
   AND uabpymt_ar_trans = uarappl_from_ar_trans
   AND uarappl_to_ar_trans = uabopen_ar_trans
   AND uabopen_srat_code = utrsrat_srat_code
   AND uabopen_scat_code = utrsrat_scat_code
   AND (utrsrat_effect_date <= uabpymt_pymt_date
        AND utrsrat_nchg_date >= uabpymt_pymt_date)
   AND utrglcl_code = utrsrat_glcl_code
 GROUP BY TO_CHAR(uabpymt_pymt_date,'yyyy'),
          utrsrat_glcl_code, utrglcl_desc;
Nothing in the query itself leapt out at me as a potential problem. The joins were all straightforward, and no functions were wrapped around database columns in the WHERE clause, so existing indexes could still be used. A check of the indexes on the tables involved in the query showed that all the join columns were indexed. I did an EXPLAIN PLAN, and the execution plan for the query looked like this:
0 SELECT STATEMENT Cost =
1   SORT GROUP BY
2     NESTED LOOPS
3       NESTED LOOPS
4         NESTED LOOPS
5           NESTED LOOPS
6             TABLE ACCESS BY ROWID UABPYMT
7               INDEX RANGE SCAN UABPYMT_BATCH_INDEX
8             TABLE ACCESS BY ROWID UARAPPL
9               INDEX RANGE SCAN UARAPPL_KEY_INDEX
10          TABLE ACCESS BY ROWID UABOPEN
11            INDEX UNIQUE SCAN UABOPEN_TRANS_INDEX
12        TABLE ACCESS BY ROWID UTRSRAT
13          INDEX RANGE SCAN UTRSRAT_KEY_INDEX
14      TABLE ACCESS BY ROWID UTRGLCL
15        INDEX UNIQUE SCAN UTRGLCL_KEY_INDEX
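For readers who want to reproduce this kind of listing, the classic recipe is to run EXPLAIN PLAN SET STATEMENT_ID = 'pymt' FOR followed by the query text, and then format the PLAN_TABLE contents as an indented tree. Here is a sketch; the statement label 'pymt' is an arbitrary name chosen for this example, and it assumes a PLAN_TABLE created by Oracle's utlxplan.sql script:

```sql
-- Display an explained plan as an indented tree (pre-DBMS_XPLAN style).
-- Assumes the query was first explained with:
--   EXPLAIN PLAN SET STATEMENT_ID = 'pymt' FOR <the query above>;
SELECT LPAD(' ', 2 * (LEVEL - 1)) || operation
       || ' ' || options || ' ' || object_name AS query_plan
  FROM plan_table
 WHERE statement_id = 'pymt'
CONNECT BY PRIOR id = parent_id AND statement_id = 'pymt'
 START WITH id = 0 AND statement_id = 'pymt';
```

In later Oracle releases the same listing can be produced more conveniently with the DBMS_XPLAN package, but the PLAN_TABLE query above works everywhere EXPLAIN PLAN does.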
This really didn't look too bad, and I almost told my client to run it as-is. There were no tablescans, and all the joins were NESTED LOOPS joins. The UABPYMT table was the driving table, which made sense because the query was concerned with summarizing all payments made between two dates. The plan looked like it would scale linearly: the test run took 55 minutes, and the production database held about twice as much data, so I figured roughly 2 hours would be needed for the run against production. We could easily accommodate those 2 hours in our batch schedule, so there didn't seem to be any value to be gained by burning up lots of billable time tuning this once-off query. I did, however, have one inspiration that seemed worth pursuing.
The point of the query was to summarize payments made between two dates. Payments were recorded in the UABPYMT table, which was the driving table for the query. Knowing the history of this system, I knew that the date range included the bulk of the data in the UABPYMT table. According to the execution plan, this table was being accessed via an index range scan. Oracle was reading almost the entire index in addition to retrieving all the rows in the table. The query might run faster if a full tablescan was done on UABPYMT instead, avoiding all that index I/O. To that end, I added a FULL hint to the query:
SELECT /*+FULL(uabpymt) */ ...
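Written out in full, by combining the hint with the original query shown earlier, the modified query would read:

```sql
-- The original query with a FULL hint, directing Oracle to tablescan
-- UABPYMT instead of driving through its index.
SELECT /*+ FULL(uabpymt) */
       TO_CHAR(uabpymt_pymt_date,'yyyy'), utrsrat_glcl_code,
       utrglcl_desc, SUM(uarappl_amount)
  FROM uabpymt, uarappl, utrsrat, utrglcl, uabopen
 WHERE uabpymt_pymt_date >= '09-JUL-1999'
   AND uabpymt_ar_trans = uarappl_from_ar_trans
   AND uarappl_to_ar_trans = uabopen_ar_trans
   AND uabopen_srat_code = utrsrat_srat_code
   AND uabopen_scat_code = utrsrat_scat_code
   AND (utrsrat_effect_date <= uabpymt_pymt_date
        AND utrsrat_nchg_date >= uabpymt_pymt_date)
   AND utrglcl_code = utrsrat_glcl_code
 GROUP BY TO_CHAR(uabpymt_pymt_date,'yyyy'),
          utrsrat_glcl_code, utrglcl_desc;
```

Note that the hint names the table exactly as it appears in the FROM clause; had the table been given an alias, the hint would have to use the alias instead.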
Adding this FULL hint made a dramatic difference in the execution plan, which now came out looking like this:
0 SELECT STATEMENT Cost = 33386
1   SORT GROUP BY
2     HASH JOIN
3       HASH JOIN
4         TABLE ACCESS FULL UTRSRAT
5         HASH JOIN
6           HASH JOIN
7             TABLE ACCESS FULL UABPYMT
8             TABLE ACCESS FULL UARAPPL
9           TABLE ACCESS FULL UABOPEN
10      TABLE ACCESS FULL UTRGLCL
Initially I was a bit stunned to see all the NESTED LOOPS joins replaced by HASH joins. Then I realized that the use of a hint must have forced the use of the cost-based optimizer. A quick check of Rich Niemiec's book, Oracle Performance Tuning, confirmed that hints do indeed cause the cost-based optimizer to be used. No matter; I talked with the client, and we decided to give this new version of the query a try, still on the test system of course.
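One practical consequence of landing on the cost-based optimizer is that it is only as good as its statistics. If you are going to hint a query onto the CBO, it is worth confirming that the tables involved have actually been analyzed. In the Oracle releases of that era this meant something like the following sketch; the sample sizes here are arbitrary choices, not recommendations:

```sql
-- Gather optimizer statistics for the tables in the query.
-- COMPUTE reads every row; ESTIMATE samples, which is cheaper on big tables.
ANALYZE TABLE uabpymt ESTIMATE STATISTICS SAMPLE 20 PERCENT;
ANALYZE TABLE uarappl ESTIMATE STATISTICS SAMPLE 20 PERCENT;
ANALYZE TABLE uabopen ESTIMATE STATISTICS SAMPLE 20 PERCENT;
ANALYZE TABLE utrsrat COMPUTE STATISTICS;
ANALYZE TABLE utrglcl COMPUTE STATISTICS;
```

Newer releases provide the DBMS_STATS package for the same job, but ANALYZE was the common idiom at the time. Without statistics the CBO falls back on built-in defaults, and the cost figures it reports can be badly off.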
The new version of the query, based on full tablescans and hash joins, ran in just 34 minutes, which was 62% of the 55 minutes needed for the original query. We were very happy with that improvement, and decided to go ahead and run the query against production. Why didn't we try to optimize further? We could have played around with the join order, the join types (nested loops versus hash), or the use of indexes, and perhaps realized even further gains. Why not do that? The answer is that in the real world you have to weigh the costs against the benefits. It was a Friday. The client was eager to schedule his report to run, and then to leave for the weekend. I was billing my time by the hour. This was a once-off report. We had achieved a "good enough" level of optimization, and we left it at that.