我们在查询数据时,经常会使用distinct去重,但使用distinct只能去除所有查询列都相同的记录,如果所查询列中有一个字段值不同,distinct是无法去重的。但我们还想要实现这样的效果,这时我们可以用partition by。
1.例如,我们新建一张学生成绩表。
DROP TABLE IF EXISTS STU_SCORE_INFO;
CREATE TABLE `STU_SCORE_INFO` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT '主键',
`student_id` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL COMMENT '学生编号',
`student_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT '学生姓名',
`subject_code` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL COMMENT '科目(0001-语文,0002-数学,0003-英语)',
`student_score` decimal(3,1) DEFAULT NULL COMMENT '成绩',
PRIMARY KEY (`id`)
) ENGINE=InnoDB COMMENT='学生成绩表';
2.插入一些测试数据。
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120201', '张三', '001', '90');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120202', '李四', '001', '95');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120203', '小花', '001', '85');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120204', '小明', '001', '85');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120201', '张三', '002', '80');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120202', '李四', '002', '88');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120203', '小花', '002', '60');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120204', '小明', '002', '60');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120201', '张三', '003', '50');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120202', '李四', '003', '96');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120203', '小花', '003', '78');
INSERT INTO `test`.`stu_score_info` (`student_id`, `student_name`, `subject_code`, `student_score`) VALUES ('120204', '小明', '003', '96');
3.例如我们需要查询每个科目不同的分数,这时候可以用到distinct:
select DISTINCT subject_code, student_score from STU_SCORE_INFO;
4.但是我们把需求再加一点,需要查询每个科目排名第一的学生信息,这时候就需要用到partition by:
select * from
(select s.*,row_number() over(partition by s.subject_code order by s.student_score desc) AS rn from stu_score_info s) t
WHERE t.rn = 1;
PS:MySQL5.6不支持partition by
此时我们发现,并且第一的小明同学的英语成绩没有被查询出来,接着优化:
select * from
(select s.*,rank() over(partition by s.subject_code order by s.student_score desc) AS rn from stu_score_info s) t
WHERE t.rn = 1;