What and Why?

Uncertainty analysis does not “build a better model”. It indicates how well a given model captures the data.

Recent papers

20181019-Safe Reinforcement Learning with Model Uncertainty Estimates-MIT

DNNs tend to be overconfident in predictions on unseen data and can give unpredictable results for far-from-distribution test data. This paper uses MC-Dropout and Bootstrapping to give computationally tractable and parallelizable uncertainty estimates. The result is a collision avoidance policy that knows what it does not know and cautiously avoids pedestrians that exhibit unseen behavior.

The main contributions of this work are i) an algorithm that identifies novel pedestrian observations and ii) avoids them more cautiously and safely than an uncertainty-unaware baseline, iii) an extension of an existing uncertainty-aware reinforcement learning framework [29] to more complex dynamic environments with exploration-aiding methods, and iv) a demonstration in a simulation environment. This work is another step towards opening up the vast capabilities of deep neural networks for application in safety-critical tasks.

20181029-Principled Uncertainty Estimation for Deep Neural Networks

In this paper, we examine three types of uncertainty: model capacity uncertainty, intrinsic data uncertainty, and open set uncertainty, and review techniques that have been derived to address each one. We then introduce a unified hierarchical model, which combines methods from Bayesian inference, invertible latent density inference, and discriminative classification in a single end-to-end deep neural network topology to yield efficient per-sample uncertainty estimation.

20181214-Combating Uncertainty with Novel Losses for Automatic Left Atrium Segmentation
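Author: Xin Yang, The Chinese University of Hong Kong. Average Dice on the left atrium = 92.24% on 20 testing volumes.
Combines an overlap loss and a focal positive loss to combat classification uncertainty.
Widening the gap between the foreground and background predictions suppresses uncertainty at the boundary. The overlap loss measures this gap.

The overlap loss represents the region where the foreground and background predictions overlap; its optimum is 0.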
Accuracy, Uncertainty, and Adaptability of Automatic Myocardial ASL Segmentation using Deep CNN
Author: Hung P. Do, Canon Medical Systems USA
Uses MC dropout to measure the uncertainty of a U-Net, drawing N=1115 MC samples, and introduces two quantitative metrics:
Dice uncertainty: the standard deviation of the N Dice scores
MC uncertainty: the sum of all pixel values of the uncertainty map, normalized by the area of the predicted mask.
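The two metrics above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the Dice scores are computed against a reference mask (for example the ground truth), the uncertainty map is taken to be the per-pixel standard deviation across the MC samples, and the predicted mask is the thresholded mean prediction.

```python
import numpy as np

def dice(a, b, eps=1e-8):
    """Dice overlap between two binary masks."""
    a = a.astype(bool)
    b = b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def mc_segmentation_uncertainty(prob_samples, reference_mask, threshold=0.5):
    """prob_samples: (N, H, W) foreground probabilities from N stochastic passes."""
    # Dice uncertainty: standard deviation of the N per-sample Dice scores.
    sample_masks = prob_samples > threshold
    dice_scores = np.array([dice(m, reference_mask) for m in sample_masks])
    dice_uncertainty = dice_scores.std()

    # Assumption: the uncertainty map is the per-pixel std across samples.
    uncertainty_map = prob_samples.std(axis=0)
    predicted_mask = prob_samples.mean(axis=0) > threshold
    area = max(int(predicted_mask.sum()), 1)  # guard against an empty mask
    mc_uncertainty = uncertainty_map.sum() / area
    return dice_uncertainty, mc_uncertainty

# Synthetic stand-in for MC dropout outputs: a square object plus noise.
rng = np.random.default_rng(0)
reference = np.zeros((32, 32), dtype=bool)
reference[8:24, 8:24] = True
base = np.where(reference, 0.9, 0.1)
samples = np.clip(base + rng.normal(0.0, 0.2, size=(50, 32, 32)), 0.0, 1.0)

dice_u, mc_u = mc_segmentation_uncertainty(samples, reference)
print("Dice uncertainty:", dice_u)
print("MC uncertainty:", mc_u)
```

In a real setting `samples` would come from N forward passes of the network with dropout kept active at test time.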
201812-Leveraging (Bayesian) uncertainty information: Opportunities and failure modes, Dr. Christian Leibig, NeurIPS 18 Bayesian DL workshop
- Bayesian uncertainty is practical in a medical setting
e.g. uncertainty-informed decision referral: if the CNN outputs have high uncertainty, refer the data and the decision to a physician
- Uncertainty tends to be high for “difficult” samples
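This can be used to improve performance by combining humans and ML.
- Uncertainty tends to be high in extrapolation directions; e.g. useful for active learning
- Uncertainty-based out-of-distribution detection does not always work
OoD samples do not necessarily have high uncertainty.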
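The decision-referral idea from the first bullet can be illustrated with a small simulation. This is a sketch on synthetic data, not the setup from the talk: `referral_accuracy` is a hypothetical helper, and the data are generated so that the error rate grows with the uncertainty score, which is exactly the assumption that makes referral useful.

```python
import numpy as np

def referral_accuracy(correct, uncertainty, refer_fraction):
    """Accuracy on the cases the model keeps after referring the most uncertain ones."""
    n = len(correct)
    n_keep = n - int(round(refer_fraction * n))
    keep = np.argsort(uncertainty)[:n_keep]  # lowest-uncertainty cases stay automated
    return float(np.mean(correct[keep]))

# Synthetic data (assumption): the chance of an error rises with uncertainty.
rng = np.random.default_rng(0)
uncertainty = rng.random(5000)
correct = rng.random(5000) > 0.5 * uncertainty

acc_all = referral_accuracy(correct, uncertainty, 0.0)
acc_referred = referral_accuracy(correct, uncertainty, 0.3)
print("accuracy without referral:", acc_all)
print("accuracy after referring the 30% most uncertain cases:", acc_referred)
```

If uncertainty were unrelated to the errors, the two accuracies would be about the same, which is one way to check whether an uncertainty estimate is useful for referral.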

Courses & resources

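DeepBayes2018: a six-day crash video course on Bayesian deep learning
NIPS workshop on Bayesian Deep Learning
New deep learning models that take advantage of Bayesian techniques, as well as Bayesian models that incorporate deep learning elements
201705 Jun Zhu (Tsinghua University) explains ZhuSuan in detail: a GPU library for Bayesian deep learning
From a historical perspective, David MacKay's dissertation (1991) is worth reading
Conference on Uncertainty in Artificial Intelligence (UAI) 2018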

Code

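Reading notes

1. Types of uncertainty

An uncertainty measure reflects the amount of dispersion of a random variable, i.e. it quantifies how random the variable is. There are many ways to express uncertainty, such as variance and entropy. Keep in mind, however, that a single scalar cannot capture the whole picture of the randomness!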
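A small worked example of why one scalar is not enough: the two discrete distributions below (chosen here purely for illustration) have the same variance but different entropy, so neither number alone tells the two apart in every respect.

```python
import numpy as np

def variance(values, probs):
    """Variance of a discrete random variable."""
    mean = np.sum(values * probs)
    return float(np.sum(probs * (values - mean) ** 2))

def entropy(probs):
    """Shannon entropy in nats, ignoring zero-probability outcomes."""
    probs = probs[probs > 0]
    return float(-np.sum(probs * np.log(probs)))

# Two-point distribution on {-1, +1}.
values_a, probs_a = np.array([-1.0, 1.0]), np.array([0.5, 0.5])
# Three-point distribution concentrated at 0 with rare outcomes at -2 and +2.
values_b, probs_b = np.array([-2.0, 0.0, 2.0]), np.array([0.125, 0.75, 0.125])

var_a, var_b = variance(values_a, probs_a), variance(values_b, probs_b)
ent_a, ent_b = entropy(probs_a), entropy(probs_b)
print("variance:", var_a, var_b)  # identical
print("entropy: ", ent_a, ent_b)  # different
```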

1.1 Aleatoric Uncertainty

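Describes the randomness in the data-generating process; this kind of randomness cannot be removed by collecting more data. Consider a simple model $y=5x$ with $x \sim N(0,1)$. Then $y \sim N(0, 5^2)$, i.e. $y$ has standard deviation 5, so the aleatoric uncertainty of $y$ can be described as $\sigma=5$. The aleatoric uncertainty of the input data propagates to the model's predictions.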
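The $y=5x$ example can be checked numerically. The sketch below (with a hypothetical helper `empirical_std`) draws increasingly large samples: the estimate of the spread becomes more precise, but the spread itself stays near 5 instead of shrinking, which is the sense in which more data does not remove aleatoric uncertainty.

```python
import numpy as np

def empirical_std(n, seed=0):
    """Sample x ~ N(0, 1), map it through y = 5x, and return the std of y."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = 5.0 * x
    return float(y.std())

for n in (100, 10_000, 1_000_000):
    print(n, empirical_std(n))
```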
How can aleatoric uncertainty be captured?

1.2 Epistemic uncertainty

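Describes the uncertainty in the model's knowledge, i.e. how confident the model is in its output. This kind of uncertainty can be reduced by collecting more data (the more the model has seen, the more it knows). A good way to estimate epistemic uncertainty is model ensembling. In a bootstrap ensemble, for example, M training subsets are randomly drawn from a training set of size N and one model is trained on each; the predictions of these M models form an empirical predictive distribution.
Another way is to add dropout during network training to approximate a model ensemble; however, this degrades the performance of the individual model.
Therefore, if compute allows, the first option is recommended. The Deep Ensembles paper notes that training with different random initializations is already enough to introduce a diverse set of models, so a bootstrap ensemble is not required.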
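A minimal sketch of the bootstrap ensemble described above, using cubic polynomial fits as stand-ins for the M models. Everything here is illustrative: the helper names, the toy data, and the choice to resample with replacement at the full training-set size (the standard bootstrap) are assumptions, not details from the source.

```python
import numpy as np

def fit_bootstrap_ensemble(x, y, n_models=20, degree=3, seed=0):
    """Fit one polynomial per bootstrap resample of the training set."""
    rng = np.random.default_rng(seed)
    coeffs = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        coeffs.append(np.polyfit(x[idx], y[idx], degree))
    return coeffs

def predict_with_uncertainty(coeffs, x_query):
    """Mean prediction and spread (epistemic uncertainty) across the ensemble."""
    preds = np.stack([np.polyval(c, x_query) for c in coeffs])
    return preds.mean(axis=0), preds.std(axis=0)

# Toy training data on [-1, 1].
rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = np.sin(3.0 * x_train) + rng.normal(0.0, 0.1, size=200)

ensemble = fit_bootstrap_ensemble(x_train, y_train)
x_query = np.array([0.0, 3.0])  # inside vs. far outside the training range
mean_pred, epistemic_std = predict_with_uncertainty(ensemble, x_query)
print("ensemble std at x=0:", epistemic_std[0])
print("ensemble std at x=3:", epistemic_std[1])
```

The members agree where training data is dense and disagree when extrapolating, so the spread across the ensemble serves as the empirical predictive distribution mentioned above.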

1.3 Out of distribution (OoD) errors

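Determining whether the input data is valid is very important when deploying ML models in practice. There are two ways to handle OoD inputs:
1) Build watchdogs that catch OoD data before it reaches the model, for example by building a density model of normal data;
2) If the model's output looks strange, the corresponding input is probably problematic. Epistemic uncertainty, for example, can be used for this.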

Who Will Watch the Watchdogs? The first approach decouples the OoD problem from the uncertainty estimation problem, which is easier from an engineering perspective. Recent research suggests that the epistemic uncertainty of likelihood models is a very good OoD detector. By bridging epistemic uncertainty with density estimation, we can use ensembles of likelihood models to protect machine learning models against OoD inputs in a model-agnostic way.
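As a toy illustration of an ensemble of likelihood models, the sketch below fits several one-dimensional Gaussians on bootstrap resamples of "normal" data and scores a query by the mean and the variance of its log-likelihood across the ensemble. The Gaussian density model, the helper names, and the scoring rule are all assumptions made for this example; real systems would use far richer density models.

```python
import numpy as np

def fit_gaussian_ensemble(data, n_models=10, seed=0):
    """Fit one Gaussian (mean, std) per bootstrap resample of the data."""
    rng = np.random.default_rng(seed)
    params = []
    for _ in range(n_models):
        sample = rng.choice(data, size=len(data), replace=True)
        params.append((sample.mean(), sample.std()))
    return params

def log_likelihoods(params, x):
    """Log-density of each query under each ensemble member: (n_models, len(x))."""
    x = np.asarray(x, dtype=float)
    return np.stack([
        -0.5 * np.log(2.0 * np.pi * s ** 2) - (x - m) ** 2 / (2.0 * s ** 2)
        for m, s in params
    ])

def ood_scores(params, x):
    """Mean log-likelihood (watchdog score) and its variance across members."""
    ll = log_likelihoods(params, x)
    return ll.mean(axis=0), ll.var(axis=0)

rng = np.random.default_rng(1)
normal_data = rng.normal(0.0, 1.0, size=500)
ensemble = fit_gaussian_ensemble(normal_data)

mean_ll, var_ll = ood_scores(ensemble, [0.0, 8.0])  # typical vs. far-away input
print("mean log-likelihood:", mean_ll)
print("variance across ensemble:", var_ll)
```

The far-away input gets both a lower average likelihood and a much larger disagreement between members; thresholding either quantity gives a watchdog that sits in front of the downstream model regardless of what that model is.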

Conclusion

Bayesian uncertainty is practical in a medical setting

e.g. uncertainty-informed decision referral: if the CNN outputs have high uncertainty, refer the data and the decision to a physician

Uncertainty tends to be high for “difficult” samples

This can be used to improve performance by combining humans and ML.

Uncertainty-based out-of-distribution detection does not always work

from: leveraging (Bayesian) uncertainty information: opportunities and failure modes, by Dr Christian Leibig

Calibration is important, but it has not yet received enough attention from the research community. Researchers are not performing model selection by deploying the model in repeated identical experiments and measuring calibration error, so unsurprisingly, our models tend to be poorly calibrated [1]. A much more powerful way to "prove our models understand the world correctly" (in a statistical sense) is to test them for statistical calibration.

[1] On Calibration of Modern Neural Networks (paper, video, code)
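One common way to measure calibration error is the expected calibration error (ECE), the binned metric reported in [1]. The sketch below is a minimal version on synthetic data: the function name, the equal-width binning, and the simulated "calibrated" and "overconfident" classifiers are assumptions made for illustration.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and confidence over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=20_000)

# Calibrated: the prediction is right with probability equal to its confidence.
correct_calibrated = rng.random(conf.size) < conf
# Overconfident: the true accuracy is 0.2 below the stated confidence.
correct_overconfident = rng.random(conf.size) < conf - 0.2

ece_cal = expected_calibration_error(conf, correct_calibrated)
ece_over = expected_calibration_error(conf, correct_overconfident)
print("ECE, calibrated model:   ", ece_cal)
print("ECE, overconfident model:", ece_over)
```

A well-calibrated model gives an ECE near zero, while the overconfident one gives a value close to the 0.2 gap built into the simulation.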

Uncertainty estimation in DL

Key researchers and papers

Yarin Gal

Alex Kendall

Inbar Naor