
Predict Wine Quality With Random Forest Using Wine Dataset

Import Libraries

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn import preprocessing
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.tree import export_graphviz
from sklearn.metrics import f1_score
import matplotlib.pyplot as plt
from subprocess import call
import numpy as np
import pandas as pd

Read Dataset

data = pd.read_csv('winequality-red.csv')
data.describe()

Select Features and Target Variable

x = data.drop('quality', axis=1)
y = data['quality']

Split the Dataset

X_train, X_test, y_train, y_test = train_test_split(x, y, 
                                                    test_size=0.2, 
                                                    random_state=123, 
                                                    stratify=y)
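
The stratify argument keeps the share of each quality class roughly the same in the training and test sets. The sketch below checks this on synthetic data; the demo_* names and the random labels are assumptions for illustration and are not part of the wine dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the wine data: 1000 rows with imbalanced labels.
rng = np.random.default_rng(0)
demo_X = pd.DataFrame(rng.normal(size=(1000, 3)), columns=["a", "b", "c"])
demo_y = pd.Series(rng.choice([5, 6, 7], size=1000, p=[0.5, 0.4, 0.1]),
                   name="quality")

# Same split settings as the tutorial, applied to the synthetic data.
demo_X_train, demo_X_test, demo_y_train, demo_y_test = train_test_split(
    demo_X, demo_y, test_size=0.2, random_state=123, stratify=demo_y
)

# Share of each class in both subsets; with stratify they should nearly match.
train_share = demo_y_train.value_counts(normalize=True).sort_index()
test_share = demo_y_test.value_counts(normalize=True).sort_index()
print(pd.DataFrame({"train": train_share, "test": test_share}))
```

Running the same comparison on y_train and y_test from the split above would show whether the rare quality scores are represented in both subsets.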

Standardize the Dataset

scaler = preprocessing.StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
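
The scaler is fitted on the training set only, and the same training mean and standard deviation are then applied to the test set. A minimal sketch of what that means, using synthetic arrays (the demo_* names and values are assumptions for illustration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two features with mean near 10 and spread near 3.
rng = np.random.default_rng(1)
demo_train = rng.normal(loc=10.0, scale=3.0, size=(200, 2))
demo_test = rng.normal(loc=10.0, scale=3.0, size=(50, 2))

# Fit on the training data only, then reuse the fitted statistics.
demo_scaler = StandardScaler().fit(demo_train)
demo_train_scaled = demo_scaler.transform(demo_train)
demo_test_scaled = demo_scaler.transform(demo_test)

# Training columns end up with mean 0 and standard deviation 1.
print(demo_train_scaled.mean(axis=0).round(6), demo_train_scaled.std(axis=0).round(6))
# Test columns are only approximately centred, since they use training statistics.
print(demo_test_scaled.mean(axis=0).round(3), demo_test_scaled.std(axis=0).round(3))
```

Tree-based models such as random forests split on thresholds, so they are largely insensitive to this kind of rescaling; the step matters more if the same data is later fed to a scale-sensitive model.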

Fit the Model

forest = RandomForestClassifier(n_estimators=100,
                                bootstrap=True,
                                max_features=10,
                                n_jobs=1,
                                criterion='entropy')

forest.fit(X_train_scaled, y_train)

Output:

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='entropy', max_depth=None, max_features=10, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=1, oob_score=False, random_state=None, verbose=0, warm_start=False)



Mean Absolute Error

pred = forest.predict(X_test_scaled)
errors = abs(pred - y_test)
print('Mean Absolute Error:', round(np.mean(errors), 2))

Output:

Mean Absolute Error: 0.34
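
The mean absolute error here treats the quality classes as numbers and averages how far each prediction is from the true score. As a small check, the manual calculation can be compared with scikit-learn's mean_absolute_error on a toy example; the demo_* arrays below are made-up values, not results from the wine model.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical true and predicted quality scores.
demo_true = np.array([5, 6, 5, 7, 6])
demo_pred = np.array([5, 5, 5, 6, 6])

# Manual version, same idea as the tutorial's abs(pred - y_test).
manual_mae = np.mean(np.abs(demo_pred - demo_true))

# Library version; argument order is (y_true, y_pred).
sklearn_mae = mean_absolute_error(demo_true, demo_pred)

print(manual_mae, sklearn_mae)
```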


Accuracy Score

# scikit-learn metrics take the true labels first, then the predictions
print("Random Forest Accuracy Score -> ", accuracy_score(y_test, pred) * 100)
print(classification_report(y_test, pred))
print(f1_score(y_test, pred, average='weighted'))
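
With average='weighted', the F1 score is computed per class and then averaged using each class's number of true samples as the weight. The sketch below reproduces that by hand on toy labels; the demo_* arrays are assumptions for illustration, and zero_division=0 is set so a class that is never predicted scores 0 without a warning.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical labels: class 7 is never predicted.
demo_true = np.array([5, 5, 5, 6, 6, 7])
demo_pred = np.array([5, 5, 6, 6, 6, 6])

labels = np.unique(demo_true)

# One F1 value per class, in the order given by labels.
per_class_f1 = f1_score(demo_true, demo_pred, average=None,
                        labels=labels, zero_division=0)

# Support = number of true samples of each class.
support = np.array([(demo_true == label).sum() for label in labels])

# Support-weighted mean of the per-class scores.
manual_weighted = np.sum(per_class_f1 * support) / support.sum()

# Library result for comparison.
weighted_f1 = f1_score(demo_true, demo_pred, average="weighted", zero_division=0)

print(per_class_f1, manual_weighted, weighted_f1)
```

Because the weights follow class frequency, a poor score on a rare quality class has only a small effect on the weighted figure; the per-class rows of the classification report show those cases more directly.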

Visualize

col = x.columns
importances = forest.feature_importances_  # separate name so the target y is not overwritten
fig, ax = plt.subplots(figsize=(6.5, 4.5))  # set the size when the figure is created
width = 0.4  # the width of the bars
ind = np.arange(len(importances))  # the y locations for the bars
ax.barh(ind, importances, width, color="green")
ax.set_yticks(ind)
ax.set_yticklabels(col, minor=False)
ax.set_title('Feature importance in RandomForest Classifier')
ax.set_xlabel('Relative importance')
ax.set_ylabel('Feature')
plt.show()
