Introduction and using MySQL Explain

Like in any database query optimization is critical for MySQL data warehouse environment and having a better understanding of the "Explain plan" helps the database application developer avoiding issues with query  performance. Also, DBAs will like your queries and they would be more than happy to help you optimize them.

MySQL's "Explain" statement provides details on query parsing and execution steps and outputs 10 fields -
id, select_type, table, type, possible_keys, key, key_len, ref, rows and Extra columns.

You run the statement by issuing
explain <sql query> #-- Insert your query between < and >. 

Shown below is the Toad's output of explain plan of a self-joined table's query with where clause. The table has around 20million rows. The query took less than 0.6seconds to index through ~70K rows and with "const" and a func (date_add function) to compare between one days data to its previous day data.

SELECT a.ad_date, a.unit_id, a.max_cpc, 
               SUM(coalesce(a.max_cpc, a.max_cpc) - coalesce(b.max_cpc, a.max_cpc)) diff_cpc
       FROM  sem_kw_summary a
       LEFT JOIN  sem_kw_summary b
           ON  date_add(a.ad_date, INTERVAL -1 day)  = b.ad_date
           AND a.unit_id = b.unit_id
       WHERE date_add(CURRENT_DATE, INTERVAL -1 day) = a.ad_date
         AND a.engine = 'google'
       GROUP BY a.ad_date,  a.unit_id






The most important columns that you should look are "type", "key_len", "ref" and "rows".  They quickly help you concentrate your tuning effort.

Column
Column details
id
An identifier and a sequential number of each selection within the query.
selecty_type
A "type of select" with 9 possible values from simple to more complex type like derived, uncachable subquery, etc. Correlated subqueries are very costly in MySQL and avoid them.
table
Name of the table or alias it is referring.
type
A join or data access type. Can have 12 values like const, system, ref, eq_ref, etc. Full table scan happens when the value is "ALL" and try avoiding it. For fast performing queries you should see "const or system or eq_ref or ref".
possible_keys
Many possible keys that could be of use in this query (join).
key
One of the many keys from possible_keys. A key can be composite key.
key_len
Length of each key (in bytes). Smaller the key_len better it is for performance.
ref
The columns that will be compared to the index in key column above.
rows
Number of rows engine has to examine. It is an estimate. I wish it was a percentage of table size in rows. Obviously, smaller number is better.
Extra
Can have many values and provides more details on the explain plan. Typically, "using index, using where" are good and don't be confused with when you see "using filesort" - it does not mean MySQL will use file transferring data between main memory and disk. Filesort is a type of sort with an extra pass needed to retrieve the data.

For more details see http://dev.mysql.com/doc/refman/5.0/en/using-explain.html

Cheers,
Shiva

No comments:

Post a Comment