DecisionTree RandomForest_Medhavi.docx

School: University of Illinois, Urbana-Champaign
Course: 557
Subject: Industrial Engineering
Date: Apr 3, 2024
Type: docx
Uploaded by MajorDiscovery13494 on coursehero.com

Decision Tree Exercise

1. Use a graph (ggplot) to show the distribution of LIFETIME_GIFT_COUNT

Code:

    library(ggplot2)

    # Histogram of lifetime gift counts, one bar per count value
    ggplot(X15_donor_exercise, aes(x = LIFETIME_GIFT_COUNT)) +
      geom_histogram(binwidth = 1, fill = "blue", color = "black") +
      theme_minimal() +
      labs(title = "Distribution of Lifetime Gift Count",
           x = "Lifetime Gift Count",
           y = "Frequency")

Output: [histogram of LIFETIME_GIFT_COUNT; figure not included]

2. Use a graph (ggplot) to show the relationship between MEDIAN_HOME_VALUE and LIFETIME_GIFT_COUNT

Code:

    # Strip currency symbols and thousands separators, then convert to numeric.
    # The decimal point is kept so that values with cents are not inflated.
    X15_donor_exercise$MEDIAN_HOME_VALUE <- gsub("[^0-9.]", "", X15_donor_exercise$MEDIAN_HOME_VALUE)
    X15_donor_exercise$MEDIAN_HOME_VALUE <- as.numeric(X15_donor_exercise$MEDIAN_HOME_VALUE)

    # Scatter plot; alpha = 0.5 makes overlapping points easier to see
    ggplot(X15_donor_exercise, aes(x = MEDIAN_HOME_VALUE, y = LIFETIME_GIFT_COUNT)) +
      geom_point(alpha = 0.5) +
      theme_minimal() +
      labs(title = "Relationship between Median Home Value and Lifetime Gift Count",
           x = "Median Home Value",
           y = "Lifetime Gift Count")
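The cleaning step in exercise 2 uses gsub() to strip non-numeric characters before as.numeric() is called. The sketch below runs two candidate patterns on made-up strings to show how the choice of pattern can matter. The values in raw are hypothetical, since the actual format of MEDIAN_HOME_VALUE is not shown in this document.

```r
# Hypothetical raw values; the real column's format is an assumption.
raw <- c("$1,250", "$98,000", "1,075.50", NA)

# Digits only: this also drops the decimal point,
# so "1,075.50" is read as 107550.
digits_only <- as.numeric(gsub("[^0-9]", "", raw))

# Digits and decimal point: "1,075.50" is read as 1075.5.
with_decimal <- as.numeric(gsub("[^0-9.]", "", raw))

# Side-by-side comparison; NA inputs stay NA in both columns.
data.frame(raw, digits_only, with_decimal)
```

If the column holds whole-dollar amounts only, both patterns give the same result, so the difference matters only when decimals are present.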

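Output (exercise 2): [scatter plot of MEDIAN_HOME_VALUE against LIFETIME_GIFT_COUNT; figure not included]

3. Build a decision tree to predict if an individual should be targeted as a donor

Code:

    library(rpart)
    library(rpart.plot)
    library(caret)

    # TARGET_B must be a factor for a classification tree
    X15_donor_exercise$TARGET_B <- as.factor(X15_donor_exercise$TARGET_B)

    # 80/20 train/test split, stratified on TARGET_B
    set.seed(50)
    index <- createDataPartition(X15_donor_exercise$TARGET_B, p = 0.8, list = FALSE)
    trainData <- X15_donor_exercise[index,]
    testData <- X15_donor_exercise[-index,]

    # Fit the tree on all other columns and plot it
    # (extra = 102: correct/total counts plus the percentage of observations per node)
    model <- rpart(TARGET_B ~ ., data = trainData, method = "class")
    rpart.plot(model, type = 4, extra = 102)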
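The split in exercise 3 is stratified on TARGET_B, so both pieces keep roughly the same class mix. The base-R sketch below illustrates that idea. It is an approximation for illustration, not caret's actual implementation, and the toy data frame and all names in it (toy, rows_by_class, train_toy, test_toy) are hypothetical.

```r
set.seed(50)

# Toy data standing in for X15_donor_exercise (hypothetical: 12 NO, 8 YES)
toy <- data.frame(
  id = 1:20,
  TARGET_B = factor(rep(c("NO", "YES"), times = c(12, 8)))
)

# Row indices grouped by class
rows_by_class <- split(seq_len(nrow(toy)), toy$TARGET_B)

# Draw 80% of the rows within each class (rounded up),
# so each class is represented in proportion
train_idx <- sort(unlist(lapply(rows_by_class, function(rows) {
  rows[sample.int(length(rows), size = ceiling(0.8 * length(rows)))]
}), use.names = FALSE))

train_toy <- toy[train_idx, ]
test_toy  <- toy[-train_idx, ]

# Class shares in each piece
prop.table(table(train_toy$TARGET_B))
prop.table(table(test_toy$TARGET_B))
```

Sampling within each class, rather than over all rows at once, is what keeps a rare class from ending up almost entirely in one piece.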

Output (exercise 3): [decision tree plot; figure not included]

4. Predict variable: TARGET_B. Use set.seed(50). Use 80% in the training set.

Code:

    X15_donor_exercise$TARGET_B <- as.factor(X15_donor_exercise$TARGET_B)

    # 80/20 train/test split, stratified on TARGET_B
    set.seed(50)
    index <- createDataPartition(X15_donor_exercise$TARGET_B, p = 0.8, list = FALSE)
    trainData <- X15_donor_exercise[index,]
    testData <- X15_donor_exercise[-index,]

    model <- rpart(TARGET_B ~ ., data = trainData, method = "class")
    rpart.plot(model, type = 4, extra = 102)

    # Predict classes on the held-out 20% and compare with the true labels
    predictions <- predict(model, testData, type = "class")
    confusionMatrix(predictions, testData$TARGET_B)

Output:

    Confusion Matrix and Statistics

              Reference
    Prediction   NO  YES
           NO  1157  723
           YES  114  245

    Accuracy : 0.6262
    95% CI : (0.6058, 0.6463)
    No Information Rate : 0.5677
    P-Value [Acc > NIR] : 1.055e-08
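The accuracy and no-information rate reported above follow directly from the four cell counts. The base-R sketch below recomputes them. The object name cm and the two per-class rates are illustrative additions that do not appear in the output shown above.

```r
# Cell counts copied from the confusion matrix above
# (rows = predicted class, columns = reference class; filled column by column)
cm <- matrix(c(1157, 114, 723, 245), nrow = 2,
             dimnames = list(Prediction = c("NO", "YES"),
                             Reference  = c("NO", "YES")))

n <- sum(cm)                     # total number of test rows

# Accuracy: share of rows on the diagonal (correct predictions)
accuracy <- sum(diag(cm)) / n

# No-information rate: share of the largest reference class,
# i.e. the accuracy of always predicting that class
nir <- max(colSums(cm)) / n

# Per-class hit rates (illustrative additions)
rate_no  <- cm["NO", "NO"]   / sum(cm[, "NO"])   # true NO predicted as NO
rate_yes <- cm["YES", "YES"] / sum(cm[, "YES"])  # true YES predicted as YES

round(c(accuracy = accuracy, no_information_rate = nir), 4)
```

The rounded values match the reported 0.6262 and 0.5677. The per-class rates show where the errors sit: the tree labels most true NO cases correctly but recovers only 245 of the 968 true YES cases, about a quarter.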
