The time data is in UTC (unix timestamp and first second of each hour. For example, 1262307600 => 2010-01-01 01:00:01) summarized with other dimensions and metrics. The simplified fact table looks like
ts_begin_hr, dim_1, dim_2, metric_1, metric_2, etc.
The query used to pull required report data.
SELECT 'dim1' dim, concat(D_H.d, ' ', D_H.hr) pst,
CASE WHEN P.met1 IS NULL THEN 0.0 ELSE P.met1 end METRIC
FROM
(
SELECT d.d, h.hr
FROM
(SELECT '00' hr UNION SELECT '01' UNION SELECT '02' UNION SELECT '03' UNION SELECT '04'
UNION SELECT '05' UNION SELECT '06' UNION SELECT '07' UNION SELECT '08'
UNION SELECT '09' UNION SELECT 10 UNION SELECT 11 UNION SELECT 12
UNION SELECT 13 UNION SELECT 14 UNION SELECT 15 UNION SELECT 16
UNION SELECT 17 UNION SELECT 18 UNION SELECT 19 UNION SELECT 20
UNION SELECT 21 UNION SELECT 22 UNION SELECT 23
) h
CROSS JOIN
(SELECT full_date d FROM DW_DB.dim_date
WHERE full_date BETWEEN date_add(current_date(), interval -6 day) AND current_date()
) d -- 1
) D_H
LEFT OUTER JOIN
(
SELECT 'dim1' dim,
, date(convert_tz(from_unixtime(ts_begin_hr), 'GMT', 'America/Los_Angeles')) d
, hour(convert_tz(from_unixtime(ts_being_hr), 'GMT', 'America/Los_Angeles')) hr
, sum(met1) met1
FROM DW_DB.FACT_TABLE FT
JOIN DW_DB.DIM_TABLE DM
ON FT.dim_id = DM.dim_id
WHERE from_unixtime(ts_begin_hr) >= date_add(now(), interval -7 day)
GROUP BY dim, d, hr
) P -- 2
ON D_H.d = P.d and D_H.hr = P.hr
WHERE D_H.d < date(convert_tz(now(), 'GMT', 'America/Los_Angeles'))
OR (D_H.d = date(convert_tz(now(), 'GMT', 'America/Los_Angeles'))
AND D_H.d <= hour(convert_tz(now(), 'GMT', 'America/Los_Angeles'))
)
ORDER BY D_H.d, D_H.hr
;";
Subquery 1 (in blue) gets a cross-joined table of 24 hours (0..23) with last 6 dates resulting in output like YYYY-MM-DD HH. Single digit hours are converted to two digits in string format (SELECT '0n'). The system and mysql were running in UTC timezone while business users needed them in PST tz. Subquery 2 (in brown) generates required metric in PST and joined with subquery 1. With where clause any extra hours (later than current hour) is filtered out in left outer join.
Subquery 1 is joined with left outer join for possibility that during a given hour there might not be metric (say, dollar amount) available because there might not be any transaction during that hour for a given dimension.
The result set is dumped to XML file for Fusion chart consumption resulting in above output through php wrapper. The chart is updated every hour.
No comments:
Post a Comment