
Dimensionality Reduction In Machine Learning | Machine Learning Homework Help

realcode4you

Objectives

  • Understand the dimensionality reduction problem

  • Use principal component analysis to solve the dimensionality reduction problem

Throughout this lecture we will be using the MNIST dataset. The MNIST dataset consists of 70,000 images of handwritten digits from 0 to 9 (60,000 for training and 10,000 for testing). The dataset is a standard benchmark in machine learning. Here is how to get the dataset from the tensorflow library:


# Import some basic libraries
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_context('paper')

# Import tensorflow
import tensorflow as tf
# Download the data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

The dataset comes with inputs (images of digits) and labels (the digit that each image shows). We will not use the labels to train a model in this lecture, because we will be doing unsupervised learning. Let's look at the dimensions of the training dataset:


x_train.shape

The training dataset is a 3D array. The first dimension is 60,000. This is the number of different images that we have. Then each image consists of 28x28 pixels. Here is the first image in terms of numbers:

x_train[0]

Each number is a pixel intensity between 0 and 255. With the reversed grayscale colormap used in the plot, 0 is drawn as a white pixel and 255 as a black pixel. Values between 0 and 255 correspond to some shade of gray. Here is how to visualize the first image:

plt.imshow(x_train[0], cmap=plt.cm.gray_r, interpolation='nearest')

In this handout, I want to work with just images of threes. So, let me just keep all the threes and throw away all other data:

threes = x_train[y_train == 3]
threes.shape
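
The selection above relies on NumPy boolean-mask indexing. Here is a minimal sketch of the same idea on a tiny made-up array (the names `images`, `labels`, and `selected` are illustrative and not part of the MNIST code):

```python
import numpy as np

# Made-up stand-ins: 6 tiny "images" of 2x2 pixels and one label per image
images = np.arange(6 * 2 * 2).reshape(6, 2, 2)
labels = np.array([3, 1, 3, 7, 3, 0])

# Comparing the label array to 3 gives one True/False entry per image
mask = labels == 3

# Indexing with the mask keeps only the images whose label is 3
selected = images[mask]
print(selected.shape)
```

The mask acts on the first axis only, so every selected image keeps its 2D shape.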

We have 6,131 threes. That's enough. Now, each image is a 28x28 matrix. We do not like that. We would like to have vectors instead of matrices. So, we need to vectorize the matrices. That's easy to do. We just have to reshape them.


vectorized_threes = threes.reshape((threes.shape[0], threes.shape[1] * threes.shape[2]))
vectorized_threes.shape
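
To see what the reshape does to the pixels, here is a small sketch on a made-up array (`fake_images` is a hypothetical stand-in for `threes`). It assumes NumPy's default row-major order, in which the rows of each image are laid out one after another:

```python
import numpy as np

# Made-up stand-in for `threes`: 5 "images" of 28x28 pixels
fake_images = np.arange(5 * 28 * 28).reshape(5, 28, 28)

# Flatten every image into a vector; -1 lets NumPy infer 28 * 28 = 784
flat = fake_images.reshape((fake_images.shape[0], -1))
print(flat.shape)

# Pixel (row r, column c) of image i ends up in column r * 28 + c
i, r, c = 2, 3, 7
print(flat[i, r * 28 + c] == fake_images[i, r, c])

# The operation is reversible, which is handy for plotting vectors as images
restored = flat.reshape(-1, 28, 28)
```

Because the reshape is reversible, any 784-dimensional vector can be turned back into a 28x28 image before passing it to `plt.imshow`.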

Okay. You see that we now have 6,131 vectors each with 784 dimensions. That is our dataset. Let's apply PCA to it to reduce its dimensionality. We are going to use the PCA class of scikit-learn. Here is how to import the class:


from sklearn.decomposition import PCA

And here is how to initialize the model and fit it to the data:


pca = PCA(n_components=0.98, whiten=True).fit(vectorized_threes)

For the complete definition of the inputs to the PCA class, see its documentation. The particular parameters that I define above have the following effect:

  • n_components: If you set this to an integer, PCA will keep exactly that many components. If you set it to a number between 0 and 1, say 0.98, then PCA will keep the smallest number of components needed to capture more than 98% of the variance of the data. I use the second type of input.

  • whiten: This ensures that the projections have unit variance. If you don't specify this then their variance will be the corresponding eigenvalue. Setting whiten=True is consistent with the theory developed in the video.
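
The effect of the two parameters can be checked on synthetic data. The following is a sketch under assumptions: the array `X` and its per-column `scales` are made up for illustration and have nothing to do with MNIST.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic data: 500 samples in 10 dimensions with very different spreads
scales = np.array([10.0, 5.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01])
X = rng.normal(size=(500, 10)) * scales

# Fractional n_components: keep enough components for 98% of the variance
pca_frac = PCA(n_components=0.98, whiten=True).fit(X)
# Integer n_components: keep exactly 3 components, no whitening
pca_int = PCA(n_components=3).fit(X)

Z_white = pca_frac.transform(X)
Z_plain = pca_int.transform(X)

print(pca_frac.n_components_)         # number of components actually kept
print(Z_white.var(axis=0, ddof=1))    # close to 1 for every component
print(Z_plain.var(axis=0, ddof=1))    # matches pca_int.explained_variance_
```

With whitening, each projected coordinate of the training data has unit sample variance. Without it, the sample variance of each coordinate equals the corresponding eigenvalue.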

Okay, so now that the model is trained let's investigate it. First, we asked PCA to keep enough components so that it can describe 98% of the variance. How many did it actually keep? Here is how to check this:


pca.n_components_

It kept 227 components. Compared with the original 784 dimensions this doesn't look very impressive, but we will take it for now.
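
One way to see where such a number comes from is to look at the cumulative explained variance ratio. The sketch below uses synthetic data (`X` is a made-up array, not the threes) and assumes that the selection rule is "smallest number of components whose cumulative ratio exceeds the threshold":

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Synthetic data: 400 samples in 20 dimensions with decreasing spread
X = rng.normal(size=(400, 20)) * np.linspace(5.0, 0.1, 20)

# Fit PCA with all components to get the full variance spectrum
full = PCA().fit(X)
cumulative = np.cumsum(full.explained_variance_ratio_)

# Smallest k such that the first k components explain more than 98%
k = int(np.searchsorted(cumulative, 0.98, side="right") + 1)

# Let PCA pick the number of components itself
auto = PCA(n_components=0.98).fit(X)
print(k, auto.n_components_)
```

Plotting `cumulative` against the component index is a common way to judge how quickly the variance is captured.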

Now, let's focus on the eigenvalues of the covariance matrix. Here is how to get them:
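
In scikit-learn, a fitted PCA model stores the eigenvalues of the covariance matrix that belong to the kept components in its `explained_variance_` attribute, sorted from largest to smallest. Here is a minimal sketch on synthetic data, where `X` is a hypothetical stand-in for `vectorized_threes`, that cross-checks the attribute against a direct eigendecomposition:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)

# Synthetic correlated data: 300 samples in 6 dimensions
X = rng.normal(size=(300, 6)) @ rng.normal(size=(6, 6))

# Keep the 4 leading components
pca_demo = PCA(n_components=4).fit(X)
eigenvalues = pca_demo.explained_variance_

# Reference: eigenvalues of the sample covariance matrix, largest first
cov = np.cov(X, rowvar=False)
reference = np.linalg.eigvalsh(cov)[::-1]

print(eigenvalues)
print(reference[:4])
```

For the model fitted to the threes, the same attribute would be read as `pca.explained_variance_`.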



© 2023 IT Services provided by Realcode4you.com
