
# Working With Prostate Cancer Dataset Using R | R Programming Assignment Help | Basic Practice Set

Use the prostate cancer dataset for the following exercises:

2. Examine the structure of the dataset.

3. Remove the first variable (id) from the data set.

4. Get the number of Benign (B) cases and Malignant (M) cases. Hint: ‘table’

5. Create a normalize function

6. Using the function created in Question 5, normalize the numeric features in the data set.

7. Confirm that the normalization worked.

8. Create the training (rows 1 through 65) and test (rows 66 through 100) datasets.

9. Use the knn() function to classify the test data.

10. Evaluate the model performance.

Code Implementation

##install packages

```
install.packages("psych")
install.packages("class")

library(class)
library(psych)
```

```
# Adjust the path to wherever Prostate_Cancer.csv is saved on your machine
data <- read.csv("Prostate_Cancer.csv", stringsAsFactors = TRUE)
```

#2. Examine the structure of the dataset.

`str(data)`

#3. Remove the first variable (id) from the data set

`data <- data[, -1]`

#4. Get the number of Benign (B) cases and Malignant (M) cases. Hint: 'table'

`table(data["diagnosis_result"])`

#5. Create a normalize function

```
# Min-max scaling: maps the smallest value of x to 0 and the largest to 1
normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
```
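As a quick sanity check, min-max scaling should send the smallest value to 0, the largest to 1, and everything else proportionally in between. A minimal sketch on a made-up vector (the values are illustrative and not taken from the dataset):

```r
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

x <- c(10, 20, 30, 40, 50)  # made-up values, not from the dataset
normalize(x)
# [1] 0.00 0.25 0.50 0.75 1.00
```

Note that a column whose values are all identical would produce `NaN` here, because the denominator `max(x) - min(x)` is zero.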

#6. Using the function created in Question 5, normalize the numeric features in the data set.

`data.n <- as.data.frame(lapply(data[,2:9], normalize))`

#7. Confirm that the normalization worked

`head(data.n)`
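`head()` only shows the first six rows, so it cannot prove that every column spans 0 to 1. A stricter check looks at each column's range. The sketch below uses a small toy data frame (its column names and values are made up for illustration); on the real data, `summary(data.n)` or `sapply(data.n, range)` would give the same information.

```r
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# Toy stand-in for the numeric features; names and values are made up
toy <- data.frame(radius = c(23, 9, 21, 14),
                  area   = c(954, 1326, 1203, 386))
toy.n <- as.data.frame(lapply(toy, normalize))

# Each column should run from exactly 0 to exactly 1 after scaling
sapply(toy.n, range)
stopifnot(all(sapply(toy.n, min) == 0),
          all(sapply(toy.n, max) == 1))
```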

#8. Create the training (1 through 65) and test (66 through 100) datasets

```
train.data <- data.n[1:65, ]
test.data <- data.n[66:100, ]

# After dropping id, column 1 of data is diagnosis_result
train.label <- data[1:65, 1]
test.label <- data[66:100, 1]
```

#9. Use the knn() function to classify test data

```
knn.res <- knn(train = train.data, test = test.data, cl = train.label, k = 8)
```
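The value k = 8 is close to the square root of the 65 training rows, which is a common rule of thumb, though the source does not say why it was picked. One way to probe the choice is to compare accuracy across several values of k. The sketch below does this on randomly generated two-class data (the features `f1`, `f2` and the labelling rule are assumptions made purely for illustration), using the same 65/35 split:

```r
library(class)

set.seed(1)  # knn() breaks ties at random, so fix the seed

# Made-up data: two features already on a [0, 1] scale, two classes
n <- 100
feat <- data.frame(f1 = runif(n), f2 = runif(n))
label <- factor(ifelse(feat$f1 + feat$f2 > 1, "M", "B"))

train.idx <- 1:65
test.idx <- 66:100

# Accuracy (in percent) on the held-out rows for several values of k
for (k in c(1, 3, 5, 7, 9)) {
  pred <- knn(train = feat[train.idx, ], test = feat[test.idx, ],
              cl = label[train.idx], k = k)
  acc <- 100 * mean(pred == label[test.idx])
  cat("k =", k, "accuracy =", round(acc, 1), "\n")
}
```

With only 35 test rows, differences of a few percentage points between values of k are within noise, so treat such a comparison as a rough guide.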

#10. Evaluate the model performance

```
# Accuracy: percentage of test cases whose predicted label matches the true label
ACC.res <- 100 * sum(test.label == knn.res) / NROW(test.label)
ACC.res
```
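Accuracy alone does not show which class is being misclassified. A confusion matrix built with `table()` breaks the result down by true and predicted label. The sketch below uses short made-up label vectors in place of `test.label` and `knn.res`:

```r
# Made-up labels standing in for test.label and knn.res
actual    <- factor(c("B", "M", "M", "B", "M", "M"), levels = c("B", "M"))
predicted <- factor(c("B", "M", "B", "B", "M", "M"), levels = c("B", "M"))

# Rows are the true classes, columns are the predicted classes
conf.mat <- table(actual = actual, predicted = predicted)
conf.mat

# Accuracy is the share of counts on the diagonal, in percent
100 * sum(diag(conf.mat)) / sum(conf.mat)
```

On the real results, the equivalent call would be `table(actual = test.label, predicted = knn.res)`.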