PointNet Paper Translation (Part 2)

3. Problem Statement

We design a deep learning framework that directly consumes unordered point sets as inputs. A point cloud is represented as a set of 3D points {Pi | i = 1, ..., n}, where each point Pi is a vector of its (x, y, z) coordinate plus extra feature channels such as color, normal etc. For simplicity and clarity, unless otherwise noted, we only use the (x, y, z) coordinate as our point’s channels. For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud. Our proposed deep network outputs k scores for all the k candidate classes. For semantic segmentation, the input can be a single object for part region segmentation, or a sub-volume from a 3D scene for object region segmentation. Our model will output n × m scores for each of the n points and each of the m semantic subcategories.

3. 问题陈述

我们设计了一个深度学习框架,直接以无序点集作为输入。点云表示为一组三维点{Pi | i = 1, ..., n},其中每个点Pi是由其(x,y,z)坐标加上颜色、法线等额外特征通道构成的向量。为了简单明了,除非另有说明,我们只使用(x,y,z)坐标作为点的通道。对于对象分类任务,输入点云要么直接从形状中采样,要么从场景点云中预分割得到。我们提出的深度网络为全部k个候选类别输出k个分数。对于语义分割,输入可以是用于部件区域分割的单个对象,也可以是用于对象区域分割的三维场景中的子体积。我们的模型将为n个点中的每个点、m个语义子类别中的每一类输出共n×m个分数。
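The input and output contract described above can be sketched in a few lines of plain Python. This is only an illustration of the shapes involved: the sizes `n`, `k`, `m` and the two `dummy_*` functions are hypothetical stand-ins, not the networks from the paper.

```python
import random

# Hypothetical sizes, chosen only for illustration.
n, k, m = 8, 4, 3   # points, object classes, per-point semantic categories

random.seed(0)
# A point cloud: n points, each an (x, y, z) tuple. The list order carries no meaning.
cloud = [(random.random(), random.random(), random.random()) for _ in range(n)]

def dummy_classifier(points, num_classes):
    """Stand-in for the classification network: one score per candidate class."""
    return [0.0] * num_classes

def dummy_segmenter(points, num_categories):
    """Stand-in for the segmentation network: one row of scores per point."""
    return [[0.0] * num_categories for _ in points]

class_scores = dummy_classifier(cloud, k)   # k scores for the whole cloud
point_scores = dummy_segmenter(cloud, m)    # n rows, each with m scores
```

Classification yields one vector for the whole set, while segmentation yields one vector per point, which is why the segmentation branch needs per-point features in addition to a global summary.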

4. Deep Learning on Point Sets

The architecture of our network (Sec 4.2) is inspired by the properties of point sets in R^n (Sec 4.1).

4.1. Properties of Point Sets in R^n

Our input is a subset of points from a Euclidean space. It has three main properties:

• Unordered. Unlike pixel arrays in images or voxel arrays in volumetric grids, a point cloud is a set of points without a specific order. In other words, a network that consumes a set of N 3D points needs to be invariant to the N! permutations of the input set in data feeding order.

• Interaction among points. The points are from a space with a distance metric. It means that points are not isolated, and neighboring points form a meaningful subset. Therefore, the model needs to be able to capture local structures from nearby points, and the combinatorial interactions among local structures.

• Invariance under transformations. As a geometric object, the learned representation of the point set should be invariant to certain transformations. For example, rotating and translating points all together should not modify the global point cloud category nor the segmentation of the points.

4. 点集的深度学习

我们的网络架构(第4.2节)是受R^n中点集的性质(第4.1节)启发的。

4.1. R^n中点集的性质

我们的输入是欧氏空间中点的子集。它有三个主要特性:

·无序性。与图像中的像素阵列或体积网格中的体素阵列不同,点云是一组没有特定顺序的点。换句话说,以N个三维点组成的点集为输入的网络,需要对输入集合在数据输入顺序上的N!种排列保持不变。

·各点之间的相互作用。这些点来自具有距离度量的空间。这意味着点不是孤立的,相邻的点构成有意义的子集。因此,模型需要能够从附近点捕捉局部结构,以及局部结构之间的组合相互作用。

·变换下的不变性。作为一个几何对象,点集的学习表示应该对某些变换保持不变。例如,对所有点一起进行旋转和平移,既不应改变点云的全局类别,也不应改变各点的分割结果。
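The unordered property can be made concrete with a toy example. The sketch below enumerates all N! feeding orders of a three-point set and compares an order-sensitive view (flattening) with an order-insensitive summary (per-coordinate maximum). The point values and the helper names `flatten` and `channel_max` are made up for illustration.

```python
import itertools
import math

points = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 1.5, 1.5)]  # N = 3

def flatten(pts):
    """Order-sensitive view: concatenate coordinates in feeding order."""
    return [c for p in pts for c in p]

def channel_max(pts):
    """Order-insensitive summary: per-coordinate maximum over all points."""
    return tuple(max(p[i] for p in pts) for i in range(3))

# Every possible feeding order of the same set: N! of them.
orderings = list(itertools.permutations(points))

# Flattening produces a different vector for every ordering...
distinct_flat = {tuple(flatten(o)) for o in orderings}
# ...while the per-coordinate maximum is the same for all of them.
distinct_max = {channel_max(o) for o in orderings}
```

A network fed the flattened vector would have to learn that all N! variants mean the same thing, whereas a summary built from a symmetric operation is identical for every ordering by construction.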

4.2. PointNet Architecture

Our full network architecture is visualized in Fig 2, where the classification network and the segmentation network share a great portion of structures. Please read the caption of Fig 2 for the pipeline. Our network has three key modules: the max pooling layer as a symmetric function to aggregate information from all the points, a local and global information combination structure, and two joint alignment networks that align both input points and point features. We will discuss our reason behind these design choices in separate paragraphs below.

Symmetry Function for Unordered Input

In order to make a model invariant to input permutation, three strategies exist: 1) sort input into a canonical order; 2) treat the input as a sequence to train an RNN, but augment the training data by all kinds of permutations; 3) use a simple symmetric function to aggregate the information from each point. Here, a symmetric function takes n vectors as input and outputs a new vector that is invariant to the input order. For example, + and ∗ operators are symmetric binary functions.

While sorting sounds like a simple solution, in high dimensional space there in fact does not exist an ordering that is stable w.r.t. point perturbations in the general sense. This can be easily shown by contradiction. If such an ordering strategy exists, it defines a bijection map between a high-dimensional space and a 1D real line. It is not hard to see that requiring an ordering to be stable w.r.t. point perturbations is equivalent to requiring that this map preserves spatial proximity as the dimension reduces, a task that cannot be achieved in the general case. Therefore, sorting does not fully resolve the ordering issue, and it’s hard for a network to learn a consistent mapping from input to output as the ordering issue persists. As shown in experiments (Fig 5), we find that applying an MLP directly on the sorted point set performs poorly, though slightly better than directly processing an unsorted input.

4.2. PointNet架构

我们的完整网络架构如图2所示,其中分类网络和分割网络共享很大一部分结构。具体流程请阅读图2的说明文字。我们的网络有三个关键模块:作为对称函数、用于聚合所有点信息的最大池化层,一个局部与全局信息组合结构,以及两个联合对齐网络,分别对齐输入点和点特征。我们将在下面的单独段落中讨论这些设计选择背后的原因。
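As a rough illustration of what a "local and global information combination structure" can look like, the sketch below pools per-point features into one global vector and appends it to every point's own feature. The feature values and the function name `combine_local_global` are hypothetical, and this section of the text does not specify the exact layer sizes, so treat it as one plausible reading rather than the network itself.

```python
def max_pool(features):
    """Element-wise maximum over a list of equally sized feature vectors."""
    return [max(column) for column in zip(*features)]

def combine_local_global(point_features):
    """Append the pooled global vector to every per-point feature vector."""
    global_feature = max_pool(point_features)
    return [list(local) + global_feature for local in point_features]

# Made-up per-point features: 3 points, 2 channels each.
point_features = [[0.1, 0.9], [0.7, 0.2], [0.4, 0.5]]
combined = combine_local_global(point_features)
```

After this step each point carries both its own local description and a summary of the whole set, which is the kind of input a per-point segmentation head needs.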

无序输入的对称函数

为了使模型对输入排列保持不变,有三种策略:1)将输入按规范顺序排序;2)将输入看作序列来训练RNN,但通过各种排列来扩充训练数据;3)使用一个简单的对称函数聚合每个点的信息。这里,对称函数以n个向量作为输入,并输出一个对输入顺序不变的新向量。例如,+和∗运算符是对称二元函数。

虽然排序听起来像是一个简单的解决方案,但在高维空间中,实际上并不存在一般意义下对点扰动保持稳定的排序。这很容易用反证法证明。如果存在这样的排序策略,它就定义了高维空间与一维实数轴之间的一个双射映射。不难看出,要求排序对点扰动保持稳定,等价于要求该映射在降维时保持空间邻近性,而这在一般情况下是无法做到的。因此,排序(sorting)并不能完全解决顺序问题(ordering issue),而只要顺序问题存在,网络就很难学习从输入到输出的一致映射。如实验所示(图5),我们发现在排序后的点集上直接应用MLP效果很差,不过略好于直接处理未排序的输入。
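The instability of sorting can be seen even in two dimensions. The sketch below uses lexicographic sorting as the canonical order (one possible choice, picked here only for illustration) and shows that nudging a single point by 0.0002 swaps the sorted order, so the flattened vector an MLP would receive changes by a large amount. The points and `eps` are made-up values.

```python
def sort_and_flatten(pts):
    """Canonicalize by lexicographic sort (x first, then y), then concatenate."""
    return [c for p in sorted(pts) for c in p]

def l2(u, v):
    """Euclidean distance between two equally sized vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

a, b = (1.0000, 5.0), (1.0001, -5.0)
eps = 0.0002
a_moved = (a[0] + eps, a[1])             # nudge one point slightly along x

before = sort_and_flatten([a, b])        # a sorts first
after = sort_and_flatten([a_moved, b])   # b now sorts first

input_change = l2(a, a_moved)            # size of the perturbation itself
sorted_change = l2(before, after)        # size of the change after sorting
```

The perturbation is tiny, but the two points trade places in the sorted list, so nearby inputs map to distant vectors. Any fixed ordering of a space with more than one dimension has such discontinuities somewhere, which is the point of the argument above.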

The idea to use RNN considers the point set as a sequential signal and hopes that by training the RNN with randomly permuted sequences, the RNN will become invariant to input order. However, in “OrderMatters” [25] the authors have shown that order does matter and cannot be totally omitted. While RNN has relatively good robustness to input ordering for sequences with small length (dozens), it’s hard to scale to thousands of input elements, which is the common size for point sets. Empirically, we have also shown that a model based on RNN does not perform as well as our proposed method (Fig 5). Our idea is to approximate a general function defined on a point set by applying a symmetric function on transformed elements in the set:

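使用RNN的思路是将点集看作序列信号,并希望通过用随机置换的序列训练RNN,使RNN对输入顺序保持不变。然而,在“OrderMatters”[25]中,作者已经表明顺序确实重要,不能被完全忽略。虽然RNN对长度较小(几十个元素)的序列的输入顺序具有较好的鲁棒性,但它很难扩展到数千个输入元素,而这正是点集的常见规模。在实验上,我们也表明基于RNN的模型的性能不如我们提出的方法(图5)。我们的思路是,通过对集合中经过变换的元素应用对称函数,来逼近定义在点集上的一般函数:

$$f(\{x_1, \dots, x_n\}) \approx g(h(x_1), \dots, h(x_n)),$$

where $f : 2^{\mathbb{R}^N} \to \mathbb{R}$, $h : \mathbb{R}^N \to \mathbb{R}^K$, and $g : \mathbb{R}^K \times \dots \times \mathbb{R}^K \to \mathbb{R}$ (n factors) is a symmetric function.

其中 $f : 2^{\mathbb{R}^N} \to \mathbb{R}$,$h : \mathbb{R}^N \to \mathbb{R}^K$,$g : \mathbb{R}^K \times \dots \times \mathbb{R}^K \to \mathbb{R}$(共n个因子)是一个对称函数。
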
Empirically, our basic module is very simple: we approximate h by a multi-layer perceptron network and g by a composition of a single variable function and a max pooling function. This is found to work well by experiments. Through a collection of h, we can learn a number of f’s to capture different properties of the set. While our key module seems simple, it has interesting properties (see Sec 5.3) and can achieve strong performance (see Sec 5.1) in a few different applications. Due to the simplicity of our module, we are also able to provide theoretical analysis as in Sec 4.3.

在经验上,我们的基本模块非常简单:我们用多层感知器网络来逼近h,用单变量函数与最大池化函数的复合来逼近g。实验表明这种方法效果很好。通过一组h,我们可以学习多个f来捕捉集合的不同属性。虽然我们的关键模块看起来很简单,但它有一些有趣的性质(参见5.3节),并能在几个不同的应用中取得很强的性能(参见5.1节)。由于我们模块的简单性,我们还可以提供理论分析,如第4.3节所示。
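The basic module described above (a shared per-point MLP for h, then max pooling followed by a single-variable function for g) can be sketched in plain Python. The weights are random and untrained, and the layer sizes (3 to 16 to 8, then 8 to 4) and the names `make_layer`, `dense_relu`, and `gamma_layer` are assumptions made for this illustration, not values from the paper.

```python
import random

random.seed(0)

def make_layer(n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense_relu(layer, x):
    """Apply one fully connected layer followed by ReLU."""
    weights, biases = layer
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

# h: the same small MLP is applied to every point independently.
h_layers = [make_layer(3, 16), make_layer(16, 8)]

def h(point):
    out = list(point)
    for layer in h_layers:
        out = dense_relu(layer, out)
    return out

def max_pool(features):
    """Symmetric aggregation: element-wise max over all per-point features."""
    return [max(column) for column in zip(*features)]

# The single-variable function applied after pooling.
gamma_layer = make_layer(8, 4)

def f(points):
    """Approximate a set function as gamma(max over i of h(x_i))."""
    return dense_relu(gamma_layer, max_pool([h(p) for p in points]))

cloud = [(random.random(), random.random(), random.random()) for _ in range(32)]
shuffled = cloud[:]
random.shuffle(shuffled)   # same set of points, different feeding order
```

Because h sees each point in isolation and the maximum ignores order, `f(cloud)` and `f(shuffled)` are identical without any data augmentation. Each pooled channel plays the role of one learned f, so a wider h captures more properties of the set.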
