Check and set analyze statistics on tables-columns in Vertica

It is important to update statistics in any database for better query runs or query optimizations.  Vertica provides ANALYZE_STATISTICS and ANALYZE_HISTOGRAM to collect stats and in big data warehouse environment it is sometime possible that some tables may not have stats collected.  In that case you can run following query to check and add the tables/ columns to collect stats.


SELECT T1.table_schema, T1.table_name, T2.row_ct, T2.wos_ct, T2.ros_ct FROM ( -- Get list of tables that don't have 'FULL' stats SELECT distinct table_schema, table_name FROM projection_columns WHERE (statistics_type = 'ROWCOUNT' or statistics_type = 'NONE') -- and table_schema not like 'v_%' AND (table_schema = 'schema_1' OR table_schema = 'schema_2' ) ORDER BY 1,2 ) T1 JOIN ( -- Get list of tables with storage details SELECT anchor_table_schema, anchor_table_name, sum(row_count) row_ct, sum(wos_row_count) wos_ct, sum(ros_row_count) ros_ct FROM projection_storage WHERE (anchor_table_schema = 'schema_1' OR anchor_table_schema = 'schema_2' ) GROUP BY 1,2 ) T2 ON T1.table_schema = T2.anchor_table_schema AND T1.table_name = T2.anchor_table_name ORDER BY 3 desc;

From the list generated above add those tables/ columns to your regular stats collecting script which you can run during off peak hours.

Notes:
Vertica's ANALYZE_HISTOGRAM returns similar data as ANALYZE_STATISTICS except that one can set the percent of data that is sampled to generate statistics metrics.  In the case of latter it uses fixed 10% sampling.

WOS: Write Optimized Store is a memory resident data structure used to support fast data loads without data compression and indexing.
ROS: Read Optimized Store is disk-based storage with data compressed and sorted.  Tuple Mover moves data out of WOS and into ROS in batches.

References: