Logging queries for performance monitoring

Before making any performance improvements, one need to measure it either you are trying to improve computer applications or DW queries, etc.  In one of the Data Warehouse subject area where I run more than 120K (120,00) queries everyday in an ETL environment, I log all queries to a table and track over time to see which queries are showing performance issue(s). With the following table, I was able to improve the performance by more than 50% some time.  For a query that repeatedly runs with different parameters in where clause, a small improvement adds up quickly and other times due to data skewing you can clearly see any changes needed to improve performance.   Other times any mistakes in indexes (DBA dropped it :) or application user modified their query, etc. you will have a reference point to check against why a particular query or queries are slow.


Query Performance Monitoring table definition
Column Null? Type Comment
ID No int (11) Auto Increment
RUN_NUMBER No tinyint Useful when job is recovered or re-ran
APPLICATION No varchar(128) Application name to track
DOMAIN Yes varchar(128) Application domain or subject area
QUERY No varchar(4096) Actual query
DURATION No decimal(10,5) How long did query run?
ROW_COUNT Yes int Number of rows affected
RESULT No varchar(32) Query result type - succeeded or failed?
COMMENT Yes varchar(128) User comment that can be logged with each query
CREATED_DT No date Query run date
CREATED_DT_TM No timestamp/ datetime Query run date and time
CREATED_BY Yes varchar(64) Query user name




In the above table, one can use CREATED_DT_TM wherever CREATED_DT is needed while doing query performance analysis but most of these analyzes are done at date level (as most of DW ETL and summarizations are daily processes). By having date column and indexed, the query analysis queries are much faster as there is no need to apply date function on each row.

More interesting analysis including job failures, errors and query result set (ROW_COUNT) can be done with this data. You can also analyze specific database query performances. For example when a same query runs against different (mysql) databases (say, more than 100), some databases show performance getting worse due to table size or load, etc.

Once you know that there is some query performance issue, you can take remedial action.  Let me know how you have implemented query performance monitoring?

Cheers,
Shiva

No comments:

Post a Comment