
Random Sampling, Stratified Sampling, K-Means (Elbow), MDS In Machine Learning | ML Assignment Help

realcode4you


House Price Prediction


Import Libraries


%matplotlib inline
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.manifold import MDS



Random Sampling


We use a random sampling approach (i.e. train_test_split) with a test size of 30% of the data and a random_state of 42.


# X --> feature set, y --> target variable
x = df1.drop(['id', 'price'], axis=1)
y = df1['price']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=42)
print('shapes of training and test set')
x_train.shape, x_test.shape
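As a quick sanity check (not in the original snippet), you can confirm the split really is roughly 70/30:

# the fractions should come out near 0.70 and 0.30
print('train fraction:', len(x_train) / len(x))
print('test fraction:', len(x_test) / len(x))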


Stratified Sampling


target = 'price'
X = df1.drop(target, axis='columns', inplace=False)
Y = df1[target]


# method 2: keep only prices that occur more than twice, so every
# stratum has enough members to be split between train and test
df2 = df1[df1[target].isin(df1[target].value_counts()[df1[target].value_counts() > 2].index)]
y2 = df2[target]
X2 = df2.drop(target, axis='columns').fillna(0)  # drop the target from the features


X2_train, X2_test, y2_train, y2_test = train_test_split(
    X2, y2, test_size=0.33, random_state=42, stratify=y2)


X2_train.shape, X2_test.shape
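Since price is continuous, stratifying on the raw values only works after the filtering above. A common alternative, not used in the original code, is to bin the target into quantiles and stratify on the bins. A minimal sketch, assuming df1 is the loaded house-price DataFrame (the choice of 5 bins is an assumption):

# Bin the continuous target into 5 quantile buckets; each bucket has
# plenty of rows, so stratification is always well defined.
price_bins = pd.qcut(df1['price'], q=5, labels=False, duplicates='drop')

Xs = df1.drop(['id', 'price'], axis='columns')
ys = df1['price']

Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    Xs, ys, test_size=0.33, random_state=42, stratify=price_bins)

# each quantile bucket contributes ~33% of its rows to the test set
print(Xs_train.shape, Xs_test.shape)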


K-Means (Elbow Method)


The Elbow Method is a very popular technique: run k-means clustering for a range of cluster counts k (say, 1 to 10) and, for each value, compute the sum of squared distances from each point to its assigned center (the distortion, exposed by scikit-learn as inertia_).


from matplotlib import style
from sklearn.cluster import KMeans


# k-means needs numeric input, so drop the non-numeric date column
df1 = df1.drop('date', axis='columns', inplace=False)


distortions = []
K = range(1, 11)
for k in K:
    kmeanModel = KMeans(n_clusters=k)
    kmeanModel.fit(df1)
    distortions.append(kmeanModel.inertia_)


plt.figure(figsize=(8, 2))
plt.plot(K, distortions, 'bx-')
plt.xlabel('k')
plt.ylabel('Distortion')
plt.title('The Elbow Method showing the optimal k')
plt.show()
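The elbow is usually read off the plot by eye. As a rough programmatic check (not part of the original code), you can look at how much each extra cluster reduces the distortion; the elbow is where the marginal gain collapses. A minimal sketch using the distortions list computed above (the 10% cutoff is an assumption, not a rule):

# marginal reduction in distortion for each extra cluster
gains = -np.diff(distortions)

for k, gain in zip(K[1:], gains):
    print(f'k={k}: distortion drops by {gain:.2f}')

# crude heuristic: pick the first k whose drop falls below
# 10% of the first drop
candidates = [k for k, g in zip(K[1:], gains) if g < 0.1 * gains[0]]
print('suggested k (heuristic):', candidates[0] if candidates else K[-1])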



Dimensionality reduction on the original data and two reduced variants using PCA


# import libraries
from sklearn.decomposition import PCA

model = PCA()


# fit the model
model.fit(df1)


transformed = model.transform(df1)
print('Principal components: ', model.components_)


# PCA variance
from sklearn.preprocessing import StandardScaler

# standardise the features, then look at the variance of each PCA component
scaler = StandardScaler()
df1 = scaler.fit_transform(df1)

pca = PCA()
pca.fit_transform(df1)
pca_variance = pca.explained_variance_

plt.bar(range(pca.n_components_), pca_variance)
plt.xlabel('PCA feature')
plt.ylabel('variance')
plt.show()
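The imports at the top pull in sklearn.manifold.MDS, but it is never used in the snippets above. As a hedged sketch of how it could fit here: MDS embeds the points so that pairwise distances are preserved, giving a 2-D view to compare with PCA. Note that after the scaling step df1 is now a NumPy array, and MDS is quadratic in the number of samples, so the sub-sample size of 500 below is an assumption to keep it tractable:

# MDS scales poorly, so embed a random subset of rows (assumed size)
rng = np.random.default_rng(42)
idx = rng.choice(len(df1), size=min(500, len(df1)), replace=False)

mds = MDS(n_components=2, random_state=42)
embedded = mds.fit_transform(df1[idx])

plt.figure(figsize=(8, 4))
plt.scatter(embedded[:, 0], embedded[:, 1], s=5)
plt.title('MDS embedding of a sample of the scaled data')
plt.show()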


Intrinsic dimension


PCA identifies the intrinsic dimension of a dataset, however many features the samples have.

The intrinsic dimension is the number of PCA features with significant variance.

To choose the intrinsic dimension in practice, try several candidate values and keep the one that gives the best accuracy.


# color_list = ['black', 'gray']
pca = PCA(n_components=3)
pca.fit(df1)
transformed = pca.transform(df1)
transformed.shape
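The snippet above hard-codes n_components=3. To tie this back to the intrinsic-dimension idea, scikit-learn also accepts a variance fraction for n_components, keeping however many components are needed to explain that share of the variance. A minimal sketch (the 95% threshold is an assumption, not from the original):

# keep enough components to explain 95% of the variance
pca95 = PCA(n_components=0.95)
reduced = pca95.fit_transform(df1)
print('components kept for 95% variance:', pca95.n_components_)

# equivalently, inspect the cumulative explained-variance ratio directly
full = PCA().fit(df1)
cumvar = np.cumsum(full.explained_variance_ratio_)
print('cumulative variance by component:', np.round(cumvar, 3))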


I hope this helps you understand the basic flow of these data science concepts. If you face any other issue or need assignment-related help, send us your requirements and we will help you as soon as we can.

You can send your quote directly to the email given below:


"realcode4you@gmail.com"


or


Submit your requirement details at realcode4you.com.


 
 
 
