
Creating a Decision Tree Using min_impurity_decrease | What Is min_impurity_decrease in a Decision Tree?

Before you run the Python files, open “Anaconda Prompt” in the same location as “Spyder”. “Anaconda Prompt” is a command-line window.


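# Libraries used throughout this example
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import tree

# Read the data and select the feature columns and the target column
my_data1 = pd.read_csv('purchase2.csv')
X = my_data1[['Age', 'Income', 'Year-of-Education']]
y = my_data1['Favorite']

# Part 1: fit a tree with the default settings and plot it
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)
fig = plt.figure(figsize=(16, 14))
# Pass the column names as a plain list; newer scikit-learn versions reject a pandas Index
tree.plot_tree(clf, feature_names=list(X.columns), fontsize=12, filled=True)
plt.show()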

# Part 2: refit with a threshold; a node is split only if the split's
# weighted impurity decrease is greater than or equal to this value
clf = tree.DecisionTreeClassifier(min_impurity_decrease=0.0674)
clf.fit(X, y)
fig = plt.figure(figsize=(10, 10))
tree.plot_tree(clf, feature_names=list(X.columns), fontsize=12, filled=True)
plt.show()


# Part 3: hold out 20% of the rows as a test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Try min_impurity_decrease = 0.00, 0.01, ..., 0.09 and print the test accuracy of each tree
for decrease in np.arange(0, 0.1, 0.01):
    clf = tree.DecisionTreeClassifier(min_impurity_decrease=decrease)
    clf.fit(X_train, y_train)
    print("min_impurity_decrease=%f score=%f" % (decrease, clf.score(X_test, y_test)))

The first part creates a tree with the default settings, so it keeps splitting until the leaves are pure.



The second part creates another tree, which shows the effect of setting min_impurity_decrease=0.0674. The impurity decrease for the right node in the third row is 0.0673, which is below this threshold, so the tree stops splitting at this node.
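The comparison against the threshold can be sketched by hand. The snippet below is a minimal illustration, assuming the Gini criterion (the scikit-learn default) and the weighted impurity decrease formula given in the scikit-learn documentation. The helper names and the class counts are hypothetical and are not taken from purchase2.csv.

```python
def gini(class_counts):
    """Gini impurity of a node, given the number of samples in each class."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def impurity_decrease(n_total, parent_counts, left_counts, right_counts):
    """Weighted impurity decrease of one candidate split:
    N_t / N * (impurity - N_t_R / N_t * right_impurity - N_t_L / N_t * left_impurity)
    """
    n_parent = sum(parent_counts)
    n_left = sum(left_counts)
    n_right = sum(right_counts)
    return (n_parent / n_total) * (
        gini(parent_counts)
        - (n_left / n_parent) * gini(left_counts)
        - (n_right / n_parent) * gini(right_counts)
    )


# Hypothetical node: 20 training samples overall, 6 of them reach this node
# (4 of one class, 2 of the other). The candidate split sends [3, 0] to the
# left child and [1, 2] to the right child.
decrease = impurity_decrease(20, [4, 2], [3, 0], [1, 2])
threshold = 0.0674  # the min_impurity_decrease value discussed in the text

print("impurity decrease = %.4f" % decrease)
# A node is split only if the decrease is greater than or equal to the threshold
print("split kept" if decrease >= threshold else "split rejected")
```

With these made-up counts the decrease comes to about 0.0667, just under the threshold, so the split would be rejected and the node would stay a leaf.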



The third part shows the test-set scores (accuracy) of trees trained with min_impurity_decrease running from 0 to 0.09 in steps of 0.01.


min_impurity_decrease=0.000000 score=0.500000
min_impurity_decrease=0.010000 score=0.500000
min_impurity_decrease=0.020000 score=0.500000
min_impurity_decrease=0.030000 score=0.500000
min_impurity_decrease=0.040000 score=0.500000
min_impurity_decrease=0.050000 score=0.500000
min_impurity_decrease=0.060000 score=0.500000
min_impurity_decrease=0.070000 score=0.250000
min_impurity_decrease=0.080000 score=0.250000
min_impurity_decrease=0.090000 score=0.250000


From the above results, we see that min_impurity_decrease values from 0 to 0.06 give a test score of 0.50, while 0.07, 0.08 and 0.09 give only 0.25. On this test split, the smaller values are better.
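To see which splits a given threshold would remove, one option is to list the impurity decrease of every split in an unpruned tree. The sketch below reads the fitted tree's `tree_` arrays and applies the same weighted formula. It uses the iris dataset as a stand-in (an assumption, since purchase2.csv is not available here), and the helper `split_decreases` is a hypothetical name.

```python
from sklearn import tree
from sklearn.datasets import load_iris

# Stand-in data (assumption): iris replaces purchase2.csv in this sketch
X, y = load_iris(return_X_y=True)
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, y)


def split_decreases(fitted_clf):
    """Return (node_id, weighted impurity decrease) for every internal node."""
    t = fitted_clf.tree_
    n_total = t.weighted_n_node_samples[0]  # number of samples at the root
    results = []
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # -1 marks a leaf, which has no split
            continue
        n = t.weighted_n_node_samples[node]
        n_left = t.weighted_n_node_samples[left]
        n_right = t.weighted_n_node_samples[right]
        decrease = (n / n_total) * (
            t.impurity[node]
            - (n_left / n) * t.impurity[left]
            - (n_right / n) * t.impurity[right]
        )
        results.append((node, decrease))
    return results


decreases = split_decreases(clf)
for node, decrease in decreases:
    print("node %d: impurity decrease = %.4f" % (node, decrease))
```

Any split whose value falls below the chosen min_impurity_decrease would not be made. Because the threshold is applied while the tree grows, rejecting a split also removes everything that would have been built beneath it.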



