Before you run the Python files, open “Anaconda Prompt”, which is found in the same location as “Spyder”. “Anaconda Prompt” is a command-line window.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import tree
from sklearn.model_selection import train_test_split

# Load the data and pick the feature columns and the target column.
my_data1 = pd.read_csv('purchase2.csv')
X = my_data1[['Age', 'Income', 'Year-of-Education']]
y = my_data1['Favorite']

# Part 1: fit a tree with default settings and plot it.
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)
fig = plt.figure(figsize=(16, 14))
tree.plot_tree(clf, feature_names=list(X.columns), fontsize=12, filled=True)
plt.show()

# Part 2: fit a tree that only splits a node when the impurity decrease
# reaches the threshold (0.0674 is just above the 0.0673 discussed below).
clf = tree.DecisionTreeClassifier(min_impurity_decrease=0.0674)
clf.fit(X, y)
fig = plt.figure(figsize=(10, 10))
tree.plot_tree(clf, feature_names=list(X.columns), fontsize=12, filled=True)
plt.show()

# Part 3: hold out 20% of the rows and score the tree for several thresholds.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
for decrease in np.arange(0, 0.1, 0.01):
    clf = tree.DecisionTreeClassifier(min_impurity_decrease=decrease)
    clf.fit(X_train, y_train)
    print("min_impurity_decrease=%f score=%f" % (decrease, clf.score(X_test, y_test)))
The first part fits a decision tree with default settings on the Age, Income and Year-of-Education columns and plots it.
The second part fits another tree, which shows the effect of setting min_impurity_decrease=0.0674. A node is split only when the split reduces the weighted impurity by at least this amount. Since the impurity decrease at the right node in the third row is 0.0673, which is below the threshold, the tree stops splitting at that node.
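To see where a number like 0.0673 comes from, the sketch below recomputes the weighted impurity decrease of every split from a fitted tree, following the formula given in the scikit-learn documentation for min_impurity_decrease. The data is a synthetic stand-in from make_classification (an assumption, not purchase2.csv), and impurity_decreases is a hypothetical helper name introduced only for this illustration.

```python
from sklearn import tree
from sklearn.datasets import make_classification

# Synthetic stand-in data (an assumption; this is not purchase2.csv).
X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)


def impurity_decreases(fitted_clf):
    """Return {node_id: weighted impurity decrease} for every split node."""
    t = fitted_clf.tree_
    n_total = t.weighted_n_node_samples[0]  # samples at the root
    result = {}
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # -1 marks a leaf, which has no split
            continue
        n = t.weighted_n_node_samples[node]
        n_left = t.weighted_n_node_samples[left]
        n_right = t.weighted_n_node_samples[right]
        # N_t / N * (impurity - N_t_R / N_t * right - N_t_L / N_t * left)
        result[node] = (n / n_total) * (
            t.impurity[node]
            - (n_right / n) * t.impurity[right]
            - (n_left / n) * t.impurity[left]
        )
    return result


full = tree.DecisionTreeClassifier(random_state=0).fit(X, y)
full_decreases = impurity_decreases(full)

threshold = 0.02  # illustrative value, chosen arbitrarily
pruned = tree.DecisionTreeClassifier(min_impurity_decrease=threshold,
                                     random_state=0).fit(X, y)
pruned_decreases = impurity_decreases(pruned)

print("splits in full tree:  ", len(full_decreases))
print("splits in pruned tree:", len(pruned_decreases))
for node, value in sorted(pruned_decreases.items()):
    print("node %d: impurity decrease = %.4f" % (node, value))
```

Every split that survives in the pruned tree should show a decrease of at least the threshold, which is the same rule that stops the 0.0673 node from splitting when the threshold is 0.0674.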
The third part holds out 20% of the data as a test set and prints the test score (accuracy) of the tree for min_impurity_decrease values from 0 to 0.09 in steps of 0.01.
From the above results, we see that min_impurity_decrease = 0, 0.01, 0.02, 0.03, 0.04, 0.05 and 0.06 give better scores than min_impurity_decrease = 0.07, 0.08 and 0.09.
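One way to make this comparison easier to read is to store each result and print the number of leaves next to the score, so you can see how much smaller the tree gets as the threshold grows. This is only a sketch on synthetic data from make_classification (an assumption, not purchase2.csv), so the scores it prints will not match the ones discussed above.

```python
import numpy as np
from sklearn import tree
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; this is not purchase2.csv).
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Collect (threshold, number of leaves, test accuracy) for each setting.
results = []
for decrease in np.arange(0, 0.1, 0.01):
    clf = tree.DecisionTreeClassifier(min_impurity_decrease=decrease,
                                      random_state=0)
    clf.fit(X_train, y_train)
    results.append((float(decrease), clf.get_n_leaves(),
                    clf.score(X_test, y_test)))

for decrease, n_leaves, score in results:
    print("min_impurity_decrease=%.2f leaves=%d score=%.3f"
          % (decrease, n_leaves, score))

# Pick the setting with the highest test accuracy (first one wins on ties).
best = max(results, key=lambda r: r[2])
print("best threshold on this split: %.2f" % best[0])
```

Keep in mind that a single 80/20 split gives a rough estimate, so small differences between thresholds may change with a different random_state.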