Analyzing Arizona OSHA Injury Reports Using Unsupervised Machine Learning
As construction continues to be a leading industry in the number of injuries and fatalities each year, several organizations and agencies are working to minimize them. The Occupational Safety and Health Administration (OSHA) is one such agency: it works to assure safe and healthful working conditions for working men and women by setting and enforcing standards and by providing training, outreach, education, and assistance. Given the large databases of historical OSHA events and reports, manually analyzing the content of fatality and catastrophe investigations is a time-consuming and expensive process. This paper evaluates the strength of unsupervised machine learning and Natural Language Processing (NLP) in supporting safety inspections and reorganizing accident databases at the state level. Construction accident reports are collected from the OSHA Arizona office; the methodology then consists of preprocessing the reports and weighting their terms so that a data-driven, unsupervised K-Means-based clustering approach can be applied. The proposed method groups the collected reports into four clusters, each corresponding to a type of accident. The results show that construction accidents in the state of Arizona are caused by falls (42.9%), being struck by objects (34.3%), electrocutions (12.5%), and trench collapses (10.3%). These findings give state and local agencies a customized view of accidents that fits their own regulations and weather conditions. What is applicable to one climate might not be suitable for another; therefore, such a state-level rearrangement of the accident database is a necessary prerequisite for enhancing local safety applications and standards.
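The pipeline described above (preprocessing, term weighting, K-Means clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: TF-IDF is assumed as the term-weighting scheme because the abstract does not name one, the stop-word list and the eight sample reports are invented for the example, and a small plain-Python K-Means stands in for whatever tooling was actually used.

```python
import math
import random
import re
from collections import Counter

# Hypothetical stop-word list and toy reports, invented for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "from", "by", "in", "on", "and", "was", "while", "to"}
REPORTS = [
    "Worker fell from a roof while installing shingles",
    "Employee fell from a ladder and fractured a leg",
    "Worker was struck by a falling beam",
    "Laborer struck by a backing truck on site",
    "Electrician electrocuted by an energized power line",
    "Worker electrocuted while repairing a live panel",
    "Trench wall collapsed and buried a worker",
    "Employee caught in a trench collapse during excavation",
]

def tokenize(text):
    """Preprocess: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def tfidf_vectors(docs):
    """Weight terms: one L2-normalized TF-IDF vector (a dict) per token list."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b))

def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, k, seed=0, max_iter=100):
    """Cluster: Lloyd's algorithm with randomly sampled initial centroids."""
    rng = random.Random(seed)
    centroids = [dict(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean_vector(members)
    return labels

def cluster_shares(labels):
    """Percentage of reports assigned to each cluster label."""
    counts = Counter(labels)
    return {lab: 100.0 * c / len(labels) for lab, c in counts.items()}

vectors = tfidf_vectors([tokenize(r) for r in REPORTS])
labels = kmeans(vectors, k=4)  # k=4 mirrors the four clusters in the abstract
print(cluster_shares(labels))
```

The percentages printed by `cluster_shares` play the role of the per-cluster shares reported in the abstract. On a toy corpus this small, the grouping depends on the random seed, and clusters still have to be named by inspecting their highest-weighted terms.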
accident, construction, data analysis, injury, natural language processing, OSHA, safety
Chokor, Abbas, Hariharan Naganathan, Wai K. Chong, and Mounir El Asmar. "Analyzing Arizona OSHA injury reports using unsupervised machine learning." Procedia Engineering 145 (2016): 1588-1593.