Deep Neural Network for Human Breast Cancer Prognosis Prediction

Import Libraries


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from keras.models import Sequential
from keras.layers import Dense, Flatten
from sklearn.preprocessing import MinMaxScaler

Read Data

You can download the dataset from here

d = pd.read_csv('/brca_metabric_clinical_data.tsv',sep='\t')
d.dropna(inplace=True)
d.head()

Output


d['Patient\'s Vital Status'].unique()

Output

array(['Living', 'Died of Disease', 'Died of Other Causes'], dtype=object)


# Encode the target (vital status) as integer class labels
y = d['Patient\'s Vital Status']

le = preprocessing.LabelEncoder()
le.fit(y)
y = le.transform(y)

le.classes_


Output

array(['Died of Disease', 'Died of Other Causes', 'Living'], dtype=object)
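
Note that the order of le.classes_ differs from the order returned by unique(). LabelEncoder sorts the labels alphabetically, so 'Died of Disease' becomes 0, 'Died of Other Causes' becomes 1 and 'Living' becomes 2. The short sketch below reproduces this on a four-element toy array (the array is made up for illustration, not taken from the dataset) and shows how inverse_transform maps the integers back to strings.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy labels (made up); the three distinct values match the dataset's classes
status = np.array(['Living', 'Died of Disease', 'Died of Other Causes', 'Living'])

le_demo = LabelEncoder()
codes = le_demo.fit_transform(status)

# classes_ is sorted, and the position of a label in it is its integer code
mapping = {label: code for code, label in enumerate(le_demo.classes_)}
print(mapping)

# inverse_transform recovers the original strings from the codes
decoded = le_demo.inverse_transform(codes)
print(codes, decoded)
```

Keeping the fitted encoder around is useful later, because model predictions come back as integers and need the same mapping to be read as class names.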



d.info()

Output:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1092 entries, 1 to 1743
Data columns (total 38 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   Study ID                        1092 non-null   object 
 1   Patient ID                      1092 non-null   object 
 2   Sample ID                       1092 non-null   object 
 3   Age at Diagnosis                1092 non-null   float64
 4   Type of Breast Surgery          1092 non-null   object 
 5   Cancer Type                     1092 non-null   object 
 6   Cancer Type Detailed            1092 non-null   object 
 7   Cellularity                     1092 non-null   object 
 8   Chemotherapy                    1092 non-null   object 
 9   Pam50 + Claudin-low subtype     1092 non-null   object 
 10  Cohort                          1092 non-null   float64
 11  ER status measured by IHC       1092 non-null   object 
 12  ER Status                       1092 non-null   object 
 13  Neoplasm Histologic Grade       1092 non-null   float64
 14  HER2 status measured by SNP6    1092 non-null   object 
 15  HER2 Status                     1092 non-null   object 
 16  Tumor Other Histologic Subtype  1092 non-null   object 
 17  Hormone Therapy                 1092 non-null   object 
 18  Inferred Menopausal State       1092 non-null   object 
 19  Integrative Cluster             1092 non-null   object 
 20  Primary Tumor Laterality        1092 non-null   object 
 21  Lymph nodes examined positive   1092 non-null   float64
 22  Mutation Count                  1092 non-null   float64
 23  Nottingham prognostic index     1092 non-null   float64
 24  Oncotree Code                   1092 non-null   object 
 25  Overall Survival (Months)       1092 non-null   float64
 26  Overall Survival Status         1092 non-null   object 
 27  PR Status                       1092 non-null   object 
 28  Radio Therapy                   1092 non-null   object 
 29  Relapse Free Status (Months)    1092 non-null   float64
 30  Relapse Free Status             1092 non-null   object 
 31  Number of Samples Per Patient   1092 non-null   int64  
 32  Sample Type                     1092 non-null   object 
 33  Sex                             1092 non-null   object 
 34  3-Gene classifier subtype       1092 non-null   object 
 35  Tumor Size                      1092 non-null   float64
 36  Tumor Stage                     1092 non-null   float64
 37  Patient's Vital Status          1092 non-null   object 
dtypes: float64(10), int64(1), object(27)
memory usage: 332.7+ KB
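
The index above runs from 1 to 1743 while only 1092 entries remain, so dropna() discarded every row that had at least one missing value. Before dropping rows, it can help to see which columns are responsible. The sketch below does this on a small toy frame; only the column names come from the dataset, the values are made up.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values; only the column names come from the dataset
toy = pd.DataFrame({
    "Age at Diagnosis": [75.6, 43.2, np.nan, 48.9],
    "Tumor Size": [22.0, np.nan, 15.0, 10.0],
    "Tumor Stage": [2.0, 1.0, 2.0, np.nan],
    "Sex": ["Female", "Female", "Female", "Female"],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()
print(missing_per_column)

# dropna() removes a row as soon as any single column is missing
rows_before = len(toy)
rows_after = len(toy.dropna())
print(f"dropna keeps {rows_after} of {rows_before} rows")
```

If a few columns account for most of the missing values, dropping those columns or imputing them may keep more patients than dropping rows.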


# Split the columns by dtype; copy() so the string columns can be modified safely
prepDF = d.select_dtypes(exclude=[object])
objDF = d.select_dtypes(include=[object]).copy()

objDF.head()

Output:



Use a Label Encoder to Convert String Columns to Integers

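# Encode each string column as integer codes (the encoder is refit per column)
nle = preprocessing.LabelEncoder()
for i in objDF.columns:
  objDF[i] = nle.fit_transform(objDF[i])

# Recombine the encoded and numeric columns
features = pd.concat([objDF, prepDF], axis=1)
features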

Output:


Check Dataset Columns

features.columns

Output:

Index(['Study ID', 'Patient ID', 'Sample ID', 'Type of Breast Surgery',
       'Cancer Type', 'Cancer Type Detailed', 'Cellularity', 'Chemotherapy',
       'Pam50 + Claudin-low subtype', 'ER status measured by IHC', 'ER Status',
       'HER2 status measured by SNP6', 'HER2 Status',
       'Tumor Other Histologic Subtype', 'Hormone Therapy',
       'Inferred Menopausal State', 'Integrative Cluster',
       'Primary Tumor Laterality', 'Oncotree Code', 'Overall Survival Status',
       'PR Status', 'Radio Therapy', 'Relapse Free Status', 'Sample Type',
       'Sex', '3-Gene classifier subtype', 'Patient's Vital Status',
       'Age at Diagnosis', 'Cohort', 'Neoplasm Histologic Grade',
       'Lymph nodes examined positive', 'Mutation Count',
       'Nottingham prognostic index', 'Overall Survival (Months)',
       'Relapse Free Status (Months)', 'Number of Samples Per Patient',
       'Tumor Size', 'Tumor Stage'],
      dtype='object')
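
The column list shows that features still contains 'Patient's Vital Status', which is the target itself, together with the three ID columns and 'Overall Survival Status'. Feeding the target in as an input lets a model read the answer directly (target leakage), and ID columns carry no prognostic signal. Below is a minimal sketch of separating inputs from the target on a toy frame. The values are made up, and the drop_cols list is an assumption about what to exclude, not part of the original code; the survival and relapse columns may deserve the same treatment.

```python
import pandas as pd

# Toy frame with made-up values; column names are taken from the dataset
toy = pd.DataFrame({
    "Patient ID": ["P1", "P2", "P3"],
    "Age at Diagnosis": [75.6, 43.2, 48.9],
    "Tumor Size": [22.0, 10.0, 15.0],
    "Overall Survival Status": ["A", "B", "B"],
    "Patient's Vital Status": ["Living", "Died of Disease", "Died of Other Causes"],
})

target_col = "Patient's Vital Status"

# Assumed exclusion list: identifiers, the target, and a column that restates the outcome
drop_cols = ["Study ID", "Patient ID", "Sample ID",
             "Overall Survival Status", target_col]

# Drop only the columns that are actually present
X = toy.drop(columns=[c for c in drop_cols if c in toy.columns])
y_toy = toy[target_col]

print(list(X.columns))
```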

EDA: Distribution of Each Numeric Column

for i in prepDF.columns:
  plt.hist(prepDF[i])
  plt.title(i)
  plt.xlabel('value')  # the x-axis shows the column's values, not samples
  plt.ylabel('frequency')
  plt.show()


Output:

[Histograms, one per numeric column; images not reproduced here]
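
Histograms show how each column is distributed, but not how columns relate to one another. One simple way to look at pairwise relations is a correlation matrix. The sketch below uses synthetic data (the column names are from the dataset, the values and the dependence between them are invented), so the numbers only illustrate the call, not the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic values: the second column is built to depend on the first
size = rng.uniform(5, 60, 200)
toy_num = pd.DataFrame({
    "Tumor Size": size,
    "Nottingham prognostic index": 2 + 0.05 * size + rng.normal(0, 0.3, 200),
    "Mutation Count": rng.integers(1, 20, 200),
})

# Pearson correlation between every pair of numeric columns
corr = toy_num.corr()
print(corr.round(2))
```

On the real data the same call would be prepDF.corr(), and the result can be passed to plt.imshow for a quick heatmap.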



Building Model

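# Scale every column to the [0, 1] range
scaler = MinMaxScaler()
features = scaler.fit_transform(features)

# Build the model: three hidden ReLU layers and a single sigmoid output unit
deepModel = Sequential()
deepModel.add(Dense(110, activation='relu', input_dim=features.shape[1]))
deepModel.add(Dense(70, activation='relu'))
deepModel.add(Dense(30, activation='relu'))
deepModel.add(Flatten())  # no-op here: the input is already 2-D
deepModel.add(Dense(1, activation='sigmoid'))

# Note: y holds three classes (0, 1, 2), but a sigmoid unit only outputs values in (0, 1)
deepModel.compile(optimizer='sgd',
                  loss='mse',
                  metrics=['accuracy'])
# Train for 20 epochs, holding out the last 20% of rows for validation
his = deepModel.fit(features, y, validation_split=0.2, epochs=20, batch_size=10)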

Output:

Epoch 1/20
88/88 [==============================] - 0s 3ms/step - loss: 0.9908 - accuracy: 0.2257 - val_loss: 0.7325 - val_accuracy: 0.2557
Epoch 2/20
88/88 [==============================] - 0s 1ms/step - loss: 0.7648 - accuracy: 0.2085 - val_loss: 0.6631 - val_accuracy: 0.2557
Epoch 3/20
88/88 [==============================] - 0s 1ms/step - loss: 0.6853 - accuracy: 0.2085 - val_loss: 0.5789 - val_accuracy: 0.2557
Epoch 4/20
88/88 [==============================] - 0s 1ms/step - loss: 0.6054 - accuracy: 0.3517 - val_loss: 0.5048 - val_accuracy: 0.5205
Epoch 5/20
88/88 [==============================] - 0s 1ms/step - loss: 0.5458 - accuracy: 0.5120 - val_loss: 0.4595 - val_accuracy: 0.5708
Epoch 6/20
88/88 [==============================] - 0s 1ms/step - loss: 0.5123 - accuracy: 0.5212 - val_loss: 0.4355 - val_accuracy: 0.5799
Epoch 7/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4950 - accuracy: 0.5235 - val_loss: 0.4243 - val_accuracy: 0.5799
Epoch 8/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4859 - accuracy: 0.5258 - val_loss: 0.4180 - val_accuracy: 0.5845
Epoch 9/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4804 - accuracy: 0.5258 - val_loss: 0.4145 - val_accuracy: 0.5845
Epoch 10/20
88/88 [==============================] - 0s 2ms/step - loss: 0.4773 - accuracy: 0.5281 - val_loss: 0.4106 - val_accuracy: 0.5845
Epoch 11/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4750 - accuracy: 0.5269 - val_loss: 0.4083 - val_accuracy: 0.5845
Epoch 12/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4731 - accuracy: 0.5281 - val_loss: 0.4076 - val_accuracy: 0.5845
Epoch 13/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4717 - accuracy: 0.5304 - val_loss: 0.4053 - val_accuracy: 0.5845
Epoch 14/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4706 - accuracy: 0.5304 - val_loss: 0.4049 - val_accuracy: 0.5845
Epoch 15/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4697 - accuracy: 0.5304 - val_loss: 0.4031 - val_accuracy: 0.5845
Epoch 16/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4689 - accuracy: 0.5326 - val_loss: 0.4023 - val_accuracy: 0.5845
Epoch 17/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4683 - accuracy: 0.5338 - val_loss: 0.4010 - val_accuracy: 0.5890
Epoch 18/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4675 - accuracy: 0.5326 - val_loss: 0.3998 - val_accuracy: 0.5982
Epoch 19/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4671 - accuracy: 0.5349 - val_loss: 0.3992 - val_accuracy: 0.5982
Epoch 20/20
88/88 [==============================] - 0s 1ms/step - loss: 0.4665 - accuracy: 0.5349 - val_loss: 0.3994 - val_accuracy: 0.5890
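
The target y has three classes, while the model above ends in one sigmoid unit trained with MSE. With a single output unit, Keras (as far as I know) scores 'accuracy' as binary accuracy, so a label of 2 can never be counted as correct, which may be one reason the accuracy settles below 0.6. A common alternative for integer class labels is a softmax output with one unit per class and the sparse_categorical_crossentropy loss. The sketch below shows that variant on random synthetic data. X_demo and y_demo are made up, and the layer sizes and the adam optimizer are illustrative choices rather than the original settings, so its numbers say nothing about the real dataset.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Input

rng = np.random.default_rng(0)

# Synthetic stand-ins for the scaled features and the encoded target
X_demo = rng.random((60, 8)).astype("float32")
y_demo = rng.integers(0, 3, size=60)  # integer labels 0, 1, 2

n_classes = 3
model = Sequential([
    Input(shape=(X_demo.shape[1],)),
    Dense(16, activation="relu"),
    Dense(n_classes, activation="softmax"),  # one probability per class
])

# sparse_categorical_crossentropy accepts integer labels directly
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X_demo, y_demo, epochs=2, batch_size=10, verbose=0)

# Each row of probs sums to 1; argmax picks the predicted class
probs = model.predict(X_demo, verbose=0)
pred = probs.argmax(axis=1)
print(probs.shape, pred[:5])
```

The predicted integers can be turned back into class names with le.inverse_transform(pred), using the encoder fitted on the target earlier.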

Check the Model Summary

deepModel.summary()

Output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 110)               4290      
_________________________________________________________________
dense_1 (Dense)              (None, 70)                7770      
_________________________________________________________________
dense_2 (Dense)              (None, 30)                2130      
_________________________________________________________________
flatten (Flatten)            (None, 30)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 31        
=================================================================
Total params: 14,221
Trainable params: 14,221
Non-trainable params: 0
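
Each Dense layer has (inputs × units) weights plus one bias per unit, and Flatten has no parameters. The sketch below recomputes the counts in the summary from the layer widths. The first count, 4290 = 38 × 110 + 110, implies that the network receives 38 input columns, which is every column listed by d.info(), the target column included.

```python
# Layer widths: 38 input columns, then the four Dense layers
layer_sizes = [38, 110, 70, 30, 1]

# Dense parameters = weights (n_in * n_out) + biases (n_out)
params = [n_in * n_out + n_out
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(params)

print(params)  # [4290, 7770, 2130, 31]
print(total)   # 14221
```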

Accuracy and Loss in Visual Form

plt.plot(his.history['accuracy'])
plt.plot(his.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

Output:

[Line plot "model accuracy": train and val accuracy per epoch; image not reproduced here]

plt.plot(his.history['loss'])
plt.plot(his.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

Output:

[Line plot "model loss": train and val loss per epoch; image not reproduced here]
