Date of Graduation

Spring 2019

Degree

Master of Science in Mathematics

Department

Mathematics

Committee Chair

George Mathew

Abstract

Tree-based methods are among the best and most commonly used methods in statistical learning, and they are widely used for both classification and regression modeling. This thesis introduces decision trees, focusing on Classification and Regression Trees (CART) for classification and regression predictive modeling problems. We also introduce ensemble methods such as bagging, random forests, and boosting, which were developed to improve the performance and accuracy of models built from single classification and regression trees. This work provides an in-depth account of how CART models are constructed and the algorithm behind their construction, and describes tree pruning using the cost-complexity approach for regression trees and the classification error rate approach for classification trees. We apply these methods to two real-life examples: a classification problem, classifying the type of cancer based on tumor type, size, and other parameters in the dataset, and a regression problem, predicting the first-year GPA of a college student based on high school GPA, SAT scores, and other parameters in the dataset.

Keywords

decision trees, classification trees, regression trees, bagging, random forest, boosting

Subject Categories

Statistical Models

Copyright

© Obinna Chilezie Njoku

Open Access
