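The KITTI stereo matching benchmark comes in two versions: 2012 and 2015.
In the 2012 version, the ground truth is built by using ICP to register the point clouds from the five frames before and after the frame of interest, which yields the relative poses and an accumulated point cloud. This point cloud is projected back into the image using the camera parameters, and regions with specular reflections are then removed manually.
When compared with other benchmarks, such as the Middlebury stereo dataset, some local patch-based methods turn out to perform very well on Middlebury but very poorly on KITTI. The KITTI paper analyzes this and attributes it to the many textureless regions in KITTI scenes, such as road surfaces, walls, and shadows. Unlike the richly textured Middlebury scenes, these regions make it hard for local methods to perform well.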
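The projection step can be sketched roughly as follows. This is only an illustration, not the actual KITTI pipeline: it assumes the accumulated points are already expressed in the rectified left camera frame, and the function name `project_to_disparity` and the parameters `f`, `cx`, `cy`, and `baseline` (a simple pinhole stereo model) are made up for this example.

```python
def project_to_disparity(points, f, cx, cy, baseline, width, height):
    """Project 3D points (camera frame, metres) into a sparse disparity map.

    Pixels that receive no point stay at 0.0, meaning "no ground truth".
    All parameters are illustrative assumptions (pinhole stereo model).
    """
    disp = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:  # point lies behind the camera, skip it
            continue
        # Pinhole projection to pixel coordinates.
        u = int(round(f * x / z + cx))
        v = int(round(f * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            d = f * baseline / z  # disparity is inversely proportional to depth
            # If several points land on one pixel, keep the closest one
            # (largest disparity).
            if d > disp[v][u]:
                disp[v][u] = d
    return disp


points = [(0.0, 0.0, 10.0), (0.0, 0.0, 5.0), (1.0, 0.0, 10.0), (0.0, 0.0, -1.0)]
disp = project_to_disparity(points, f=100.0, cx=2.0, cy=1.0,
                            baseline=0.5, width=5, height=3)
```

Because the laser points are sparse, most pixels stay empty, which is why the ground truth is only semi-dense.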
Our evaluation table ranks all methods according to the number of non-occluded erroneous pixels at the specified disparity / end-point error threshold. All methods providing less than 100 % density have been interpolated using simple background interpolation as explained in the corresponding header file in the development kit. For each method we show:
Out-Noc: Percentage of erroneous pixels in non-occluded areas
Out-All: Percentage of erroneous pixels in total
Avg-Noc: Average disparity / end-point error in non-occluded areas
Avg-All: Average disparity / end-point error in total
Density: Percentage of pixels for which ground truth has been provided by the method
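As a rough sketch of how the four error metrics above could be computed for one image (this is not the development kit's code): the function name `stereo_metrics`, the convention that a ground-truth value of 0 means "no ground truth", the boolean non-occlusion mask, and the default threshold of 3 pixels are all assumptions made for this example.

```python
def stereo_metrics(est, gt, noc_mask, threshold=3.0):
    """Compute Out-Noc, Out-All, Avg-Noc, Avg-All for one disparity map.

    est, gt:   2D lists of disparities; gt == 0 is assumed to mean "no ground truth".
    noc_mask:  2D list of bools, True where the pixel is non-occluded.
    threshold: error (in pixels) above which a pixel counts as erroneous.
    """
    # Per region: [number of erroneous pixels, number of valid pixels, error sum]
    stats = {"noc": [0, 0, 0.0], "all": [0, 0, 0.0]}
    for est_row, gt_row, noc_row in zip(est, gt, noc_mask):
        for e, g, noc in zip(est_row, gt_row, noc_row):
            if g <= 0:  # no ground truth at this pixel
                continue
            err = abs(e - g)
            # Non-occluded pixels count towards both regions.
            for key in (("noc", "all") if noc else ("all",)):
                s = stats[key]
                s[0] += int(err > threshold)
                s[1] += 1
                s[2] += err

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "Out-Noc": 100.0 * ratio(stats["noc"][0], stats["noc"][1]),
        "Out-All": 100.0 * ratio(stats["all"][0], stats["all"][1]),
        "Avg-Noc": ratio(stats["noc"][2], stats["noc"][1]),
        "Avg-All": ratio(stats["all"][2], stats["all"][1]),
    }


est = [[10, 20, 0], [5, 8, 3]]
gt = [[10, 25, 0], [5, 10, 7]]
noc = [[True, True, True], [True, False, True]]
metrics = stereo_metrics(est, gt, noc)
```

In this toy example five pixels have ground truth and four of them are non-occluded; two pixels exceed the threshold, and both are non-occluded.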
In 2015, another KITTI paper (Object Scene Flow for Autonomous Vehicles) introduced new stereo and optical flow benchmarks, which contain more data with dynamic objects.