This article presents an experimental analysis of two data preprocessing methods: Principal Components Analysis (PCA) and whitening. Theory reference: http://deeplearning.stanford.edu/wiki/index.php/PCA. Exercise data: http://deeplearning.stanford.edu/wiki/index.php/Exercise:PCA_and_Whitening.

Principal components analysis is a way of reducing the dimensionality of the input data: the input is mapped into another multi-dimensional coordinate system by an orthogonal transformation matrix, and that orthogonal matrix is exactly the set of eigenvectors of the covariance matrix of the input data. The mapping is simply the projection of the data onto each eigenvector. Dimensionality reduction is then achieved by keeping only the first k eigenvectors, which correspond to the first k principal directions of variation in the data.
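
As a quick illustration of this projection, the sketch below runs PCA on a random placeholder matrix; X, k and the other variable names here are assumptions for illustration only and are separate from the exercise code further down.

% Minimal PCA sketch (illustrative placeholder data, not the exercise code below)
X = randn(144, 10000);               % one example per column, assumed zero-mean
sigma = X * X' ./ size(X, 2);        % covariance matrix of the input
[U, S, V] = svd(sigma);              % columns of U: eigenvectors, i.e. the principal directions
xRot = U' * X;                       % projection of the data onto the eigenbasis
k = 50;                              % keep only the first k principal directions
xTilde = U(:, 1:k)' * X;             % k-dimensional (dimension-reduced) representation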

Whitening addresses redundancy in the input data. When the input consists of natural images, each input feature (pixel) is strongly correlated with its neighbouring features; the goal of whitening is to reduce the correlation between features and to give every feature the same variance. Decorrelation is already handled by PCA; to give all features the same variance, each feature can be rescaled as x_PCAwhite,i = x_rot,i / sqrt(λ_i + ε), where λ_i is the i-th eigenvalue of the covariance matrix of the input data, i indexes the feature dimension, and ε is a regularisation term added because very small eigenvalues would otherwise make the division blow up.
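
Continuing the sketch above, PCA whitening is just this rescaling applied to each rotated component (the epsilon value and variable names are assumed for illustration):

% PCA whitening sketch: give every rotated component (approximately) unit variance
epsilon = 1e-5;                                   % small regularisation term (assumed value)
lambda = diag(S);                                 % eigenvalues of the covariance matrix
xPCAWhite = diag(1 ./ sqrt(lambda + epsilon)) * xRot;
% cov(xPCAWhite') should now be close to the identity matrix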

The whitening described above is PCA whitening. ZCA (zero-phase component analysis) whitening is obtained by simply rotating the PCA-whitened result back into the original basis. Data processed this way stays much closer to the original data.
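
In code, this rotation back is one extra line on top of the PCA whitening sketch above:

% ZCA whitening sketch: rotate the PCA-whitened data back with U,
% so the result stays close to the original data
xZCAWhite = U * xPCAWhite;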

For a comparison of ZCA whitening and PCA whitening, see: http://stats.stackexchange.com/questions/117427/what-is-the-difference-between-zca-whitening-and-pca-whitening

The experiment code is as follows:

%%================================================================
clc, clear, close all;
%% Step 0a: Load data
%  Here we provide the code to load natural image data into x.
%  x will be a 144 * 10000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data from the kth 12x12 image patch sampled.
%  You do not need to change the code below.
x = sampleIMAGESRAW();
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.
% -------------------- YOUR CODE HERE --------------------
avg = mean(x, 1);
x = x - repmat(avg, size(x, 1), 1);

%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.
% -------------------- YOUR CODE HERE --------------------
xRot = zeros(size(x)); % You need to compute this
[U, S, V] = svd(x * x' ./ size(x, 2));
xRot = U' * x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).
% -------------------- YOUR CODE HERE --------------------
covar = zeros(size(x, 1)); % You need to compute this
covar = cov(xRot');

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.
% -------------------- YOUR CODE HERE --------------------
k = 0; % Set k accordingly
covariance_k = cumsum(diag(S)) ./ sum(diag(S));
k = min(find(covariance_k >= 0.99));

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 144, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.
% -------------------- YOUR CODE HERE --------------------
xHat = zeros(size(x));  % You need to compute this
xHat = U(:, 1 : k) * xRot(1 : k, :);

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
%  retains only 90% of the variance.
figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
% For comparison, retains only 90% of the variance
k1 = min(find(covariance_k >= 0.90));
xHat1 = zeros(size(x));  % You need to compute this
xHat1 = U(:, 1 : k1) * xRot(1 : k1, :);
figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k1, size(x, 1)),'']);
display_network(xHat1(:,randsel));

figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite.
epsilon = [0.01, 0.1, 1];
for i = 1 : length(epsilon)
xPCAWhite = zeros(size(x));
% -------------------- YOUR CODE HERE --------------------
xPCAWhite = diag(1 ./ sqrt(diag(S) + epsilon(i))) * xRot;

%%================================================================
%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.
% -------------------- YOUR CODE HERE --------------------
covar = cov(xPCAWhite');

% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);
title(['epsilon = ' num2str(epsilon(i))]);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.
xZCAWhite = zeros(size(x));
% -------------------- YOUR CODE HERE --------------------
xZCAWhite = U * xPCAWhite;
% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
title(['epsilon = ' num2str(epsilon(i))]);
end
figure('name','Raw images');
display_network(x(:,randsel));

The results are as follows:

Original image set:

Image set after zero-meaning:

PCA-reconstructed image set after retaining 99% of the variance:

PCA-reconstructed image set after retaining 90% of the variance:

Image set after ZCA whitening (for different epsilon values):

Compared with the original image set:

It can be seen that adding the ε term acts as a low-pass filter (denoising), but ε should not be made too large, otherwise edges (features) will be blurred away.
