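Some small differences
MySQL's GROUP BY does not have to list every non-aggregated column in the select list; it groups by whichever columns are written (this requires the ONLY_FULL_GROUP_BY SQL mode to be off; the mode is on by default since MySQL 5.7.5). PostgreSQL's GROUP BY must list all of them.
MySQL versions below 8.0 have no window functions
Hive: older versions (before 0.13) do not support IN / NOT IN with a subquery; use LEFT SEMI JOIN, or LEFT JOIN ... IS NULL, instead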
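Date and time
Adding and subtracting time
MySQL: timestamp + interval n minute/hour/day/month/year, e.g.: '2022-04-20 00:00:00' + interval 1 hour
PostgreSQL: timestamp + interval '1 month' (the shorthand timestamp + '+1 month' also works), e.g.: to_date(to_char(data_date::timestamp + interval '1 month', 'yyyy-mm-dd'), 'yyyy-mm-dd'), which can be shortened to (data_date::timestamp + interval '1 month')::date
Spark SQL: date_add, add_months, etc.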
Converting a time to a string
MySQL: a timestamp can be used directly as a string (it is converted implicitly in string context)
PostgreSQL: to_char(date, 'YYYY')
Converting a string to a time
PostgreSQL: to_date(cast("datecode" as varchar), 'yyyyMMdd') as "datecode"
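Normalizing the time format
MySQL: date_format(timestamp, '%Y-%m-%d %H:%i:%s')
PostgreSQL: to_char(date, 'YYYY')
date_trunc('month', now()) returns the first instant of the current month (the first day at 00:00:00)
extract(month from now()) = date_part('month', now()): both return the month number of the current date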
Generating a date series
PostgreSQL: select generate_series('20220101'::date, '20221001'::date, interval '1 month') dd;
Aggregate functions
Ranking
PostgreSQL: row_number() over (partition by A order by B asc|desc)
MySQL: versions below 8.0 have no ranking functions; use a self left join instead, e.g.:
select t1.*, COUNT(t2.A) as m -- could count over anything from t2, doesn't have to be A
from Test1 t1
left join Test1 t2
  on t2.A = t1.A and t2.B = t1.B and t2.C <= t1.C
group by t1.A, t1.B, t1.C
order by t1.A, t1.B, t1.C
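The self-join above can be tried out with a small script. This is only a sketch: it runs the query in SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL below 8.0, and the sample rows are made up. It assumes the rows are distinct on (A, B, C); duplicate rows would collapse in the GROUP BY and inflate the count.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL < 8.0 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test1 (A TEXT, B TEXT, C INTEGER)")
# Made-up sample rows: two (A, B) groups, inserted out of order.
conn.executemany(
    "INSERT INTO Test1 VALUES (?, ?, ?)",
    [("x", "p", 30), ("x", "p", 10), ("x", "p", 20), ("y", "q", 5), ("y", "q", 7)],
)

# For each row t1, count the rows t2 in the same (A, B) group whose C is
# less than or equal to t1.C. That count is the row's rank within the group.
RANK_SQL = """
SELECT t1.A, t1.B, t1.C, COUNT(t2.C) AS m
FROM Test1 t1
LEFT JOIN Test1 t2
  ON t2.A = t1.A AND t2.B = t1.B AND t2.C <= t1.C
GROUP BY t1.A, t1.B, t1.C
ORDER BY t1.A, t1.B, t1.C
"""

ranked = conn.execute(RANK_SQL).fetchall()
for row in ranked:
    print(row)
```

Here m plays the role of row_number() over (partition by A, B order by C asc); flipping the comparison to t2.C >= t1.C would rank in descending order.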
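Concatenating values from multiple rows into a single string
PostgreSQL: array_to_string(array_agg(DISTINCT "titleCh" order by "titleCh" desc), '') as titleChs
MySQL: GROUP_CONCAT(live_room_id order by live_room_id SEPARATOR ', ') as live_room_id; must be used together with GROUP BY
Spark SQL: concat_ws(',', collect_set(ID)); collect_set gathers the values into an array (dropping duplicates) and concat_ws joins the array into a string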
Splitting one row into multiple rows on a delimiter
PostgreSQL: regexp_split_to_table(column, delimiter)
Spark: explode(split(column, delimiter))
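Window functions
PostgreSQL: aggregate_function over (partition by __ order by __)
MySQL: versions below 8.0 have no window functions. Workaround 1: combine GROUP_CONCAT + SUBSTRING_INDEX + GROUP BY to sort within each group and pick the extreme value. Note that this returns only one extreme value per group, and that GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default). E.g.: SUBSTRING_INDEX(GROUP_CONCAT(d.monitor_time ORDER BY d.monitor_time asc), ',', 1) with group by d.date, d.hour returns the earliest monitor_time in each hour
Workaround 2: aggregate outside the main table, then join the result back to it (see the CSDN blog post "MySQL实现over partition by(分组后对组内数据排序)" by MrCao杰罗尔德)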
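A minimal sketch of the aggregate-then-join workaround, written here rather than taken from the blog post. It again uses SQLite through Python's sqlite3 module as a stand-in for MySQL below 8.0; the table name monitor, the value column, and the sample rows are hypothetical, while date, hour and monitor_time follow the example above.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL < 8.0 (assumption).
conn = sqlite3.connect(":memory:")
# Hypothetical table: one reading per row, with its date and hour bucket.
conn.execute(
    "CREATE TABLE monitor (date TEXT, hour INTEGER, monitor_time TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO monitor VALUES (?, ?, ?, ?)",
    [
        ("2022-04-20", 0, "2022-04-20 00:05:00", 1.5),
        ("2022-04-20", 0, "2022-04-20 00:40:00", 2.5),
        ("2022-04-20", 1, "2022-04-20 01:30:00", 3.5),
        ("2022-04-20", 1, "2022-04-20 01:10:00", 4.5),
    ],
)

# Step 1 (subquery f): find the earliest monitor_time per (date, hour).
# Step 2: join f back to the main table to recover the full matching rows.
FIRST_PER_HOUR_SQL = """
SELECT d.date, d.hour, d.monitor_time, d.value
FROM monitor d
JOIN (
    SELECT date, hour, MIN(monitor_time) AS first_time
    FROM monitor
    GROUP BY date, hour
) f
  ON f.date = d.date AND f.hour = d.hour AND f.first_time = d.monitor_time
ORDER BY d.date, d.hour
"""

first_rows = conn.execute(FIRST_PER_HOUR_SQL).fetchall()
for row in first_rows:
    print(row)
```

Unlike workaround 1, this returns whole rows rather than a single column value, but if several rows in a group share the same minimum monitor_time, all of them are returned.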