源表数据如下:
time | province | order_cnt |
---|---|---|
20200601 | shandong | 100 |
20200601 | jiangsu | 200 |
20200601 | zhejiang | 300 |
20200602 | shandong | 200 |
20200602 | jiangsu | 300 |
20200602 | zhejiang | 400 |
20200603 | shandong | 400 |
20200603 | jiangsu | 500 |
20200603 | zhejiang | 600 |
需求是按照省份和时间分类,每天累计求和,需求的结果如下:
(每天的数据都要按照省份将之前的订单数目进行累加)
time | province | order_cnt_sum |
---|---|---|
20200601 | shandong | 100 |
20200601 | jiangsu | 200 |
20200601 | zhejiang | 300 |
20200602 | shandong | 300 |
20200602 | jiangsu | 500 |
20200602 | zhejiang | 700 |
20200603 | shandong | 700 |
20200603 | jiangsu | 1000 |
20200603 | zhejiang | 1300 |
需要借助clickhouse的array系列函数实现: ArrayFunction
SELECT
time,
province,
arraySum(arraySlice(order_cnts, 1, i)) AS order_cnt_sum
FROM
(
SELECT groupArray(order_cnt) AS order_cnts,
groupArray(time) AS times,
groupArray(province) AS provinces
FROM
(
SELECT *
FROM test_table order by time
) group by province
)
ARRAY JOIN
times as time,
provinces AS province,
arrayEnumerate(order_cnts) AS i;