Deep Learning: A Rapidly Advancing Frontier of Machine Learning and Artificial Intelligence
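Outline
Deep learning (DL) and its application frontier
Lessons from applying DL in computer vision (CV)
Key algorithms
Perceptron and its learning algorithm
MLP and its BP algorithm
Auto-Encoder
CNN and its main variants
Thoughts and discussion on DL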
Institute of Computing Technology, Chinese Academy of Sciences
Basic tasks of machine learning
Class label (classification): {dog, cat, horse, …}, e.g. object recognition
Vector (estimation): e.g. super resolution, from a low-resolution image to a high-resolution image
Origins: inspiration from biological nervous systems
Neurons are connected to one another through synapses
Hierarchical receptive fields; learning strengthens or weakens synaptic connections, or even eliminates them
Hubel, D. H. & Wiesel, T. N. (1962)
First-generation neural networks
Frank Rosenblatt (1957), The Perceptron--a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.
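First-generation neural networks
Limitations of the single-layer perceptron model
Minsky & Papert's monograph Perceptrons (1969)
It can only classify linearly separable patterns
It cannot solve the XOR problem
This all but sentenced this class of models to death and led to a slump in NN research for many years afterwards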
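The XOR limitation can be checked numerically. The sketch below is only an illustration: the function name `train_perceptron`, the {-1, +1} label encoding, the learning rate and the epoch cap are assumptions, not taken from the slides. It trains a single-layer perceptron on AND (linearly separable) and on XOR (not linearly separable).

```python
import numpy as np

def train_perceptron(X, y, epochs=100, lr=1.0):
    """Perceptron rule with labels in {-1, +1}. Returns (w, b, converged)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # A point is a mistake if it is on the wrong side or on the boundary.
            if yi * (w @ xi + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
                mistakes += 1
        if mistakes == 0:          # a full pass without errors: separated
            return w, b, True
    return w, b, False             # epoch cap reached without separating

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([-1, -1, -1, 1])
y_xor = np.array([-1, 1, 1, -1])

w_and, b_and, and_converged = train_perceptron(X, y_and)
w_xor, b_xor, xor_converged = train_perceptron(X, y_xor)
print("AND separated:", and_converged)
print("XOR separated:", xor_converged)
```

No choice of epoch cap changes the XOR outcome, because no line in the plane puts (0,1) and (1,0) on one side and (0,0) and (1,1) on the other.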
2nd Generation Neural Networks
Multi-layer Perceptron (MLP)
More than one hidden layer (layers whose correct outputs are unknown)
BP algorithm [Rumelhart et al., 1986]
Compute the error signal;
then back-propagate the error signal to get derivatives for learning
David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams (1986). "Learning representations by back-propagating errors". Nature 323 (6088): 533–536
Error Backpropagation
W is the parameter of the network; J is the objective function
Feedforward operation
Back error propagation
[Figure: network diagram with input layer, hidden layers, output layer, and target values]
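The feedforward and back-propagation steps named on this slide can be sketched for a network with one hidden layer. This is a minimal illustration, not the authors' implementation: the tanh hidden units, the squared-error objective J, the layer sizes and all function names are assumptions. The analytic gradients from back-propagation are compared against finite differences as a sanity check.

```python
import numpy as np

def forward(params, x):
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ x + b1)          # hidden layer activations
    y = W2 @ h + b2                   # linear output layer
    return h, y

def loss(params, x, t):
    _, y = forward(params, x)
    return 0.5 * np.sum((y - t) ** 2)  # objective J (assumed squared error)

def backprop(params, x, t):
    W1, b1, W2, b2 = params
    h, y = forward(params, x)          # feedforward operation
    delta2 = y - t                     # error signal at the output layer
    delta1 = (W2.T @ delta2) * (1.0 - h ** 2)  # propagated back through tanh
    return [np.outer(delta1, x), delta1, np.outer(delta2, h), delta2]

def numerical_grad(params, x, t, eps=1e-6):
    """Central finite differences, used only to check backprop."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            j_plus = loss(params, x, t)
            p[idx] = old - eps
            j_minus = loss(params, x, t)
            p[idx] = old
            g[idx] = (j_plus - j_minus) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.normal(size=(4, 3)), rng.normal(size=4),
          rng.normal(size=(2, 4)), rng.normal(size=2)]
x, t = rng.normal(size=3), rng.normal(size=2)

analytic = backprop(params, x, t)
numeric = numerical_grad(params, x, t)
max_diff = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
print("max |analytic - numeric| =", max_diff)
```

A gradient-descent step would then subtract a small multiple of each analytic gradient from the matching entry of W.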
2nd Generation Neural Networks
In theory, more layers are better
Two layers of weights are enough to approximate any continuous function mapping
Unfortunately, training is difficult
It requires labeled training data
Almost all data is unlabeled.
The learning time does not scale well
It is very slow in networks with multiple hidden layers.
It can get stuck in poor local optima
These are often quite good, but for deep nets they are far from optimal.
More popular from 1990 to 2006…
Specific methods for specific tasks
Hand-crafted features (SIFT, LBP, HOG)
ML methods
SVM
Kernel tricks
Boosting
AdaBoost
kNN
Decision tree
Kruger et al. TPAMI’13
A Breakthrough Back to 2006
In 2006, layer-wise unsupervised pre-training finally made it possible to train deep network architectures
A Breakthrough Back to 2006
Hinton, G. E., Osindero, S. and Teh, Y., A fast learning algorithm for deep belief nets. Neural Computation 18:1527-1554, 2006
Hinton, G. E. and Salakhutdinov, R. R. (2006) Reducing the dimensionality of data with neural networks. Science, Vol. 313. no. 5786, pp. 504 - 507, 28 July 2006
Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle, Greedy Layer-Wise Training of Deep Networks, Advances in Neural Information Processing Systems 19 (NIPS 2006)
Marc’Aurelio Ranzato, Christopher Poultney, Sumit Chopra and Yann LeCun. Efficient Learning of Sparse Representations with an Energy-Based Model, Advances in Neural Information Processing Systems (NIPS 2006)
Unsupervised learning of representations is used to (pre-)train each layer.
Unsupervised training of one layer at a time, on top of the previously trained ones. The representation learned at each level is the input for the next layer.
Use supervised training to fine-tune all the layers (in addition to one or more additional layers that are dedicated to producing predictions).
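The first two steps above (unsupervised training of one layer at a time, each on the representation of the previous one) can be sketched with stacked auto-encoders. This is a rough illustration under several assumptions: sigmoid encoders with linear decoders, squared reconstruction error, full-batch gradient descent, random toy data, and hypothetical function names. The supervised fine-tuning step is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, rng, epochs=200, lr=0.1):
    """Train one auto-encoder layer; return encoder weights and the loss curve."""
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (d, n_hidden)); b = np.zeros(n_hidden)  # encoder
    V = rng.normal(0.0, 0.1, (n_hidden, d)); c = np.zeros(d)         # decoder
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)            # encode
        R = H @ V + c                     # decode (linear reconstruction)
        E = R - X
        losses.append(0.5 * np.mean(np.sum(E ** 2, axis=1)))
        dR = E / n                        # gradient of the mean loss w.r.t. R
        dV, dc = H.T @ dR, dR.sum(axis=0)
        dZ = (dR @ V.T) * H * (1.0 - H)   # back through the sigmoid
        dW, db = X.T @ dZ, dZ.sum(axis=0)
        W -= lr * dW; b -= lr * db
        V -= lr * dV; c -= lr * dc
    return (W, b), losses

def pretrain_stack(X, layer_sizes, rng):
    """Greedy layer-wise pre-training: each layer is trained on the output
    of the layers already trained below it."""
    encoders, curves, inp = [], [], X
    for n_hidden in layer_sizes:
        (W, b), losses = train_autoencoder(inp, n_hidden, rng)
        encoders.append((W, b))
        curves.append(losses)
        inp = sigmoid(inp @ W + b)        # input for the next layer
    return encoders, inp, curves

rng = np.random.default_rng(0)
X = rng.random((200, 8))                  # toy unlabeled data
encoders, features, loss_curves = pretrain_stack(X, [6, 4], rng)
print(features.shape, [round(c[-1], 4) for c in loss_curves])
```

The returned `encoders` would initialize the lower layers of a network that is then fine-tuned with labels.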
In fact, there was an exception: CNN
Convolutional neural network (CNN)
K. Fukushima, “Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position,” Biological Cybernetics, vol. 36, pp. 193–202, 1980
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551, 1989
Y. Le Cun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998
In fact, there was an exception: CNN
Neocognitron 1980
Local Connection
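Local connection means each output unit looks only at a small patch of the input. The sketch below illustrates this with a plain 2-D valid cross-correlation; sharing one kernel across all positions (as in later CNNs), the 6x6 input, the 3x3 filter and the function name are assumptions made for the example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Each output unit is connected only to a kh x kw patch of the input,
    and every position reuses the same kernel weights."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 input
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # simple horizontal-difference filter
fmap = conv2d_valid(image, kernel)

shared_params = kernel.size                        # 9 weights for the whole map
dense_params = image.size * fmap.size              # fully connected alternative
print(fmap.shape, shared_params, dense_params)
```

For this toy input the locally connected, weight-shared layer needs 9 weights, while a fully connected layer producing the same 4x4 output would need 36 x 16 = 576.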
Exception: CNN for digit recognition
Exception: CNN for object detection and recognition
Moreover, the enabling conditions mattered just as much
Big data
Speech, images, video
Computing power
Parallel computing platforms
Large-scale GPU deployment
Open community
Open source, open data
Speech recognition (2011)
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech)]
Major progress in computer vision in 2012
On the ImageNet object classification task
Object classification task: 1,000 classes, 1,431,167 images
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012]
Rank  Name          Error rate (top-5)  Description
1     U. Toronto                        Deep learning
2     U. Tokyo                          Hand-crafted features and learning models. Bottleneck.
3     U. Oxford
4     Xerox/INRIA
ImageNet with Deep CNN
Method: a large-scale CNN
650K neurons, 60M parameters
Trained with BP on GPU
Used a variety of tricks plus dropout
ReLU, data augmentation, contrast normalization, ...
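Two of the tricks listed above, ReLU and dropout, are small enough to sketch directly. The function names are illustrative, and the "inverted" dropout variant used here (rescaling during training) is an assumption chosen for simplicity; the cited paper instead rescales outputs at test time.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x) element-wise."""
    return np.maximum(0.0, x)

def dropout(x, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale the survivors so the expected value is unchanged."""
    if not training or p_drop == 0.0:
        return x
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

rng = np.random.default_rng(0)
activations = relu(np.array([-1.0, 0.0, 2.0, 3.0]))
dropped = dropout(activations, p_drop=0.5, rng=rng)
kept_at_test = dropout(activations, p_drop=0.5, rng=rng, training=False)
print(activations, dropped, kept_at_test)
```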
Brought into Google (Jan 2013)
Google+ Photo Tagging
A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” NIPS, 2012.
ImageNet object classification (2013)
1,000 classes, 1,431,167 images, top-5 error rate
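The top-5 error rate counts a prediction as correct when the true class is among the five highest-scoring classes. A minimal sketch, with an assumed function name and a toy score matrix rather than real ImageNet outputs:

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k top-scoring classes."""
    topk = np.argsort(-scores, axis=1)[:, :k]          # indices of the k best classes
    hit = (topk == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()

# Toy example: 3 samples, 10 classes, class 9 always scores highest.
scores = np.tile(np.arange(10.0), (3, 1))
labels = np.array([9, 5, 0])
top5 = top_k_error(scores, labels, k=5)   # classes 9..5 are in the top 5
top1 = top_k_error(scores, labels, k=1)
print(top5, top1)
```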
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2013]
Rank  Name    Error rate (top-5)  Description
1     NYU                         Deep learning
2     NUS                         Deep learning
3     Oxford                      Deep learning
MIT Tech Review could no longer sit still
ImageNet object classification (2014)
1,000 classes, 1,431,167 images, top-5 error rate
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2013, 2014]
Rank  Name    Error rate (top-5)  Description
1     Google                      Deep learning
2     Oxford                      Deep learning
3     MSRA                        Deep learning
ImageNet object classification (2014)
GoogLeNet [CVPR2015]
22 layers deep (counting only layers with parameters)
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions. CVPR 2015
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2013, 2014]
ImageNet object classification (2010-2014)
Continued progress on the ImageNet top-5 error rate
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2013, 2014]
ImageNet object detection task
200 classes, 456,567 images, detection rate
Traditional method: SIFT+BOW+SPM
Deep method: R-CNN+GoogLeNet
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2013, 2014]
Rapid progress in object segmentation and semantic labeling
Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR2015
How hot is DL?
Deep Learning for Vision
Of 602 papers, 87 have "Deep" in the title alone, 47 have "Convolution", 40 have "Neural", 51 have "Network", and 7 have "Recurrent"
Going deeper, optimization, unsupervised and autonomous learning…
Fully Convolutional Network (for segmentation, etc.)
Vision and Language (for image captioning; Google, Fei-Fei, Microsoft, UCB)
RNN with LSTM (for sequence processing)
Deep Learning for X (detection, metric learning, attribute, hash, …)
Major progress in computer vision
Vision and Language (Google, Microsoft, UCB): describing images in words, the homework Minsky assigned 60 years ago
Show and Tell: A Neural Image Caption Generator (a work from Google)
From Captions to Visual Concepts and Back (a work from Microsoft)
Long-term Recurrent Convolutional Networks for Visual Recognition and Description(a work from UTA/UML/UCB)
Progress in face recognition
Accuracy % [, X. Cao, F. Wen, J. Sun, CVPR13]
Accuracy % [Y. Taigman, M. Yang, M. Ranzato, L. Wolf, CVPR14]
Accuracy % [Y. Sun, X. Wang, and X. Tang, CVPR14]
Accuracy % [F. Schroff, D. Kalenichenko, and J. Philbin, CVPR15]
[Figure: timeline of milestones: 1986 (BP), 2006 (DBN, Science), 2011 (Speech), 2012, 2014, 2015, with face recognition marked]
On LFW, over the past two years the number of misjudged pairs fell from 300 (a 5% error rate) to 30
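Progress in face recognition
Labeled Faces in the Wild (LFW)
Face recognition under unconstrained conditions
Data collected from the Internet
International celebrities, from Yahoo News
A widely known test protocol
Training set: unrestricted
Test set for the verification task
6,000 image pairs in total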
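With 6,000 test pairs, a count of wrong pairs converts directly into an error rate, which is how the 300-pair and 30-pair figures quoted for LFW relate to percentages. The sketch below also shows one common way to score a verification task, thresholding a distance between two face representations; that thresholding scheme, the toy distances and the function names are assumptions, not part of the LFW definition given on the slide.

```python
import numpy as np

def verification_error(distances, same, threshold):
    """Fraction of pairs whose same/different decision is wrong when a pair
    is declared 'same person' if its distance is below the threshold."""
    predicted_same = distances < threshold
    return float(np.mean(predicted_same != same))

def error_rate_from_counts(n_wrong, n_pairs=6000):
    """Convert a number of wrong pairs into an error rate."""
    return n_wrong / n_pairs

# Toy pairs: distance between the two faces, and whether they are the same person.
distances = np.array([0.2, 0.9, 0.4, 0.7])
same = np.array([True, False, False, True])
toy_error = verification_error(distances, same, threshold=0.5)

before = error_rate_from_counts(300)   # 300 wrong pairs out of 6,000
after = error_rate_from_counts(30)     # 30 wrong pairs out of 6,000
print(toy_error, before, after)
```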
Huang G B, Ramesh M, Berg T, et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report, University of Massachusetts, Amherst, 2007.
Progress in face recognition
2014: DeepFace [1] (Facebook)
Big data: 4K people, images
[1] Taigman Y, Yang M, Ranzato M A, et al. Deepface: Closing the gap to human-level performance in face verification. CVPR, 2014.
[2] Sun Y, Wang X, Tang X. Deeply learned face representations are sparse, selective, and robust. arXiv preprint, 2014.
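Progress in face recognition
DeepID2+ from the Chinese University of Hong Kong
A separate CNN is trained on each of 25 face patches (4 convolutional layers, 4 fully connected layers, 4 verification loss signals and 1 identification loss signal)
Training data: 10K people, 202K celebrity images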
[Y. Sun, X. Wang, and X. Tang, CVPR14]
Progress in face recognition
Google's latest FaceNet
Deep network (22 layers) + massive data (8 million people, 200 million images) + triplet loss (no extra GPU memory required)
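A triplet loss pulls an anchor face toward a positive (same person) and pushes it away from a negative (different person) by at least a margin. The sketch below is a generic NumPy version under stated assumptions: squared Euclidean distance, a margin of 0.2, an illustrative function name, and toy 2-D embeddings. It is not the FaceNet training code.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean hinge loss over triplets: max(0, d(a, p) - d(a, n) + margin),
    with d the squared Euclidean distance between embeddings."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

a = np.array([[1.0, 0.0]])
same_person = np.array([[1.0, 0.0]])
other_person = np.array([[0.0, 1.0]])

good = triplet_loss(a, same_person, other_person)   # positive close, negative far
bad = triplet_loss(a, other_person, same_person)    # roles swapped
print(good, bad)
```

A well-separated triplet contributes zero loss, so only triplets that violate the margin produce gradients.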
[F. Schroff, D. Kalenichenko, and J. Philbin, CVPR15]
Outline
Deep learning (DL) and its application frontier
Lessons from applying DL in CV
Key algorithms
BP algorithm
Auto-Encoder
CNN
Main CNN variants
Thoughts and discussion on DL
Vision processing methods before DL
The philosophy behind step-by-step processing
Divide and conquer
Knowledge-driven
Hand-crafted feature
I think it should be solved by methods like…
Vision processing methods in the DL era and after
Learns low-level, mid-level and high-level features close to what is expected
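Vision processing methods before DL
Task
Hand-design F (partially learn F)
Domain knowledge: step-by-step processing
Filters, local features (SIFT), BoW, histograms, max/sum pooling, discriminant analysis, kernel tricks, piecewise linearity, manifold learning, metric learning…
Output: class label (classification problem) or vector (regression/estimation)
Preprocessing → feature design → feature dimensionality reduction → classification/regression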
Vision processing methods in the DL era
Task
Hand-design F (partially learn F)
Learn F end-to-end (learn every step)
Representation learning
Feature learning
Nonlinear transform learning
Output: discrete class label (classification problem) or continuous vector (regression/estimation)
Credit to Dr. Xiaogang Wang
Vision processing methods in the DL era
Collect data → Preprocessing 1 → Preprocessing 2 → … → Feature design → Classifier → Evaluation
vs.
Collect data → Deep neural network (Feature transform → Feature transform → … → Classifier) → Evaluation
Credit to Dr. Xiaogang Wang
Vision processing methods in the DL era
Changes in methodology
From divide and conquer to joint learning
From multiple steps to end-to-end learning
More broadly
Detection and recognition
Segmentation and recognition
…
Outline
Deep learning (DL) and its application frontier
Lessons from applying DL in CV
Key algorithms
Perceptron algorithm
BP algorithm
Auto-Encoder
CNN and its main variants
Thoughts and discussion on DL
Perceptron
Frank Rosenblatt (1957), The Perceptron--a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.
Perceptron algorithm
F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386-408, 1958
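For reference, the perceptron decision rule and its mistake-driven update can be written as follows. The notation is an assumption introduced here, since the slide's own formulas did not survive extraction: labels $y_i \in \{-1, +1\}$, weight vector $\mathbf{w}$, bias $b$, and learning rate $\eta > 0$.

```latex
\hat{y} = \operatorname{sign}\left(\mathbf{w}^{\top}\mathbf{x} + b\right)

\text{if } y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \le 0:\qquad
\mathbf{w} \leftarrow \mathbf{w} + \eta\, y_i\, \mathbf{x}_i, \qquad
b \leftarrow b + \eta\, y_i
```

The weights change only on misclassified samples; when the training set is linearly separable, the number of such updates is finite.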
BP learning algorithm for feedforward neural networks
David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams (1986). "Learning representations by back-propagating errors". Nature 323 (6088): 533–536
(separate slides)
Convolutional neural networks and their variants
(separate slides)
Outline
Deep learning (DL) and its application frontier
Lessons from applying DL in CV
Key algorithms
Perceptron and its learning algorithm
MLP and its BP algorithm
Auto-Encoder
CNN and its main variants
Thoughts and discussion on DL
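Further discussion on DL
The conceptual shift brought by DL
Is DL a brain-like information processing method?
Does DL have a theory?
What can DL not do?
Does data-driven learning no longer need domain knowledge?
Is industry taking academia's job?
Are CV researchers being reduced to lab assistants for ML researchers?
Future work on DL?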
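The conceptual shift brought by DL
From hand-crafted domain knowledge to data-driven learning
Small data: control model complexity to avoid overfitting
Big data: raise model complexity to avoid underfitting
"Big data + simple model" is wrong!
From the curse of dimensionality (reduce dimensions) to high dimensionality being beneficial (raise dimensions)
From step-by-step, divide-and-conquer thinking to joint learning
End-to-end learning of the whole process
Better co-design of software and hardware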
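Is DL a brain-like information processing method?
DL is inspired by how the brain processes information
Hierarchical, layer-by-layer abstraction
"Gabor-wavelet-like coding" in early visual neurons
But it is not really "brain-like"
Fundamentally, the brain's computational "mechanism" is still unclear
The brain's connections are more diverse and more complex
Top-down and feedback mechanisms
The brain's learning does not necessarily require large amounts of data
Innate: the result of biological evolution (long-term training on big data)
Acquired learning: more deductive reasoning and transfer learning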
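Does DL have a theory?
DL lacks theory
Convergence, bounds
Local optima; initialization matters a lot
Complexity theory
But it is not entirely a black box
Its relationship to the traditional "step-by-step" approach
More "explicit" than kernels
Layer-wise visualization provides many clues
Layer-by-layer abstraction, or layered "nonlinearity"?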
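What can DL not do?
It is most successful when used for "feature learning" or "nonlinear transformation"
The learned features generalize well across tasks
Traditional classifiers or regressors still seem usable on top of them
It relies heavily on big data; deep learning on small data is unreliable
Domain knowledge and transfer learning of deep models need to be brought in
It struggles with deductive reasoning
DL is inductive learning; it can hardly generalize from a single example, let alone learn without a teacher
Some simple problems may not need deep learning
The face recognition example
Current DL does not learn its "own structure"
Tuning experience is very important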
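Does data-driven learning need no domain knowledge?
Big-data-driven learning has indeed reduced the reliance on domain knowledge
The success of CNNs in CV itself demonstrates the importance of domain knowledge
Convolution and pooling operations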
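Pooling is one place where domain knowledge about images (small shifts should not change the answer) is built into the architecture. The sketch below is illustrative only: non-overlapping 2x2 max pooling, a hypothetical function name and toy inputs are assumed. It shows that moving a feature within one pooling window leaves the pooled output unchanged.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over size x size windows."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size]                 # drop any ragged border
    return x.reshape(H2, size, W2, size).max(axis=(1, 3))

a = np.zeros((4, 4)); a[0, 0] = 1.0               # feature at the top-left pixel
b = np.zeros((4, 4)); b[1, 1] = 1.0               # same feature shifted by one pixel
pooled_a = max_pool2d(a)
pooled_b = max_pool2d(b)
print(pooled_a)
print(np.array_equal(pooled_a, pooled_b))
```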
Domain knowledge is especially important when data is scarce
Data is king, and DL is queen?
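Is industry taking academia's job?
Does industry look down on academia?
Industry emphasizes big-data collection and parallel implementation
Academia emphasizes theory and new models
The CV academic community should be smarter and more forward-looking
New model design
Learning the network structure
Optimization methods
Training speedup
Smarter data collection
Efficient use of large, dirty, messy and low-quality data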
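Are CV researchers being reduced to lab assistants for ML researchers?
The danger is real
CV itself lacks a theoretical framework
The "step-by-step" approach has dominated CV for too many years
CV still has opportunities
In fact, ML has also benefited a great deal from CV
A learning-based theory of CV?
CV researchers should interact more with ML
Learnability of "geometry" and "structure"?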
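Future work in DL
DL theory
Learning the network structure itself
DL under small-data conditions
Embedding domain knowledge
Deep networks with feedback
DL on large, dirty, messy and low-quality data
Transfer and adaptation of deep models
DL models for video analysis
More sources of "nonlinearity"
New optimization and training algorithms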
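Summary and warnings!
Lessons from the rise and fall of neural networks
This is a revival, not an innovation
History often repeats itself
Compared with the fervor for DL in application areas such as CV, the ML community is quite calm
"Let him be strong: the breeze still brushes the hills;
let him be fierce: the bright moon still shines on the great river."
Suggestions
Know DL, but do not know only DL
From experience- and knowledge-driven, to data-driven, to hybrid-driven
Thank you!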