Date of Graduation

Summer 2019


Master of Science in Mathematics



Committee Chair

Yingcai Su


pattern recognition, supervised learning, decision trees, classification trees, nodes, target variable, pruning, graduation rate, proficiency, logistic regression

Subject Categories

Statistical Theory


In my fourteen years as an educator, my colleagues and I have continually examined data to find ways to help our students. Graduation status is one area of interest. I wanted to apply statistical methods to find early indicators of which students may drop out, so that we can provide early intervention to those students. With early intervention, we may be able to lower our dropout rate. While studying different methods of pattern recognition, I found that the decision tree method in machine learning was the best suited to the data that I had collected. Decision trees can handle both numeric and categorical data. The method is a simple form of pattern recognition that is easy to interpret. A decision tree begins with a root node; attributes are tested at each node to determine the branches, which lead to leaf nodes, where decisions or classifications are made for the target variable. Students' state assessment scores and lunch status were used to look for a pattern distinguishing those who graduate from those who drop out. The data were then analyzed again using logistic regression, which produced some results similar to those of the decision tree. There was no clear pattern that separated the students who graduated from those who dropped out. However, a few areas of testing may provide a starting point for early intervention with our struggling students.
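The root-to-leaf process described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model fitted in the thesis: the attribute names (math_score, free_lunch), the thresholds, and the leaf labels are all hypothetical and were chosen only to show how a record is routed from the root node to a leaf.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """Terminal node: holds the classification for the target variable."""
    label: str


@dataclass
class Node:
    """Internal node: tests one attribute against a threshold."""
    attribute: str
    threshold: float
    below: Union["Node", Leaf]        # branch taken when value < threshold
    at_or_above: Union["Node", Leaf]  # branch taken when value >= threshold


def classify(tree, record):
    """Walk from the root to a leaf and return that leaf's label."""
    node = tree
    while isinstance(node, Node):
        if record[node.attribute] < node.threshold:
            node = node.below
        else:
            node = node.at_or_above
    return node.label


# Hypothetical toy tree. The attributes and thresholds are invented for
# illustration and are NOT results or data from the thesis.
toy_tree = Node(
    attribute="math_score",
    threshold=50.0,
    below=Node(
        attribute="free_lunch",  # categorical status coded as 0 or 1
        threshold=0.5,
        below=Leaf("graduate"),
        at_or_above=Leaf("at risk"),
    ),
    at_or_above=Leaf("graduate"),
)

print(classify(toy_tree, {"math_score": 40.0, "free_lunch": 1}))
```

Coding the categorical lunch status as 0 or 1 lets the same threshold test serve both numeric and categorical attributes, which is one simple way a tree can handle mixed data.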


© Andrea M. Lee

Open Access