Date of Graduation
Summer 2021
Degree
Master of Natural and Applied Science in Computer Science
Department
Computer Science
Committee Chair
Razib Iqbal
Abstract
Fourier-transform infrared (FTIR) spectra of organic compounds can be used to compare and identify compounds. A mid-infrared FTIR spectrum gives the absorbance values of a compound over the 400–4000 cm⁻¹ range. Spectral matching is the process of comparing the spectral signatures of two or more compounds and returning a value for the similarity of the compounds based on how closely their spectra match. This process is commonly used to identify an unknown compound by searching for its spectrum’s closest match in a database of known spectra. A major limitation of this process is that it can only identify substances already in the database. An unknown compound not found in the database will likely match a similar yet structurally different compound. Alternatively, FTIR has been used to identify characteristics, substructures, or functional groups of a compound based on the compound’s IR spectral features. However, most works have attempted to predict only a limited set of substructures, and there has been only limited success in predicting the full structure of an unknown compound based purely on its FTIR spectrum. For this thesis, I investigated the possibility of identifying compounds, and the substructures present in their structures, by analyzing their FTIR spectra. This depends on the property that the infrared (IR) absorbances of a compound result from the physical interactions between bonded sets of atoms in the compound’s structure. I hypothesized that different instances of the same substructure will give either similar spectral signatures or some pattern of spectral signatures that can be learned using machine learning. In this thesis I show that it is possible to use convolutional neural networks (CNNs) to predict the presence or absence of substructures within a compound.
Finally, I demonstrate a method of making predictions for the full structure of these compounds based on the substructure predictions and the compound’s FTIR spectrum.
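The spectral matching process described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here, and the function names, compound names, and absorbance values are all hypothetical toy data rather than anything taken from the thesis.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length absorbance vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled on the same wavenumber grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero spectrum carries no signature to compare
    return dot / (norm_a * norm_b)

def best_match(query, library):
    """Return (name, score) for the library spectrum most similar to query."""
    scored = ((name, cosine_similarity(query, spectrum))
              for name, spectrum in library.items())
    return max(scored, key=lambda pair: pair[1])

# Toy, made-up absorbance values on a shared 5-point grid (not real spectra).
library = {
    "compound_A": [0.1, 0.9, 0.2, 0.0, 0.3],
    "compound_B": [0.8, 0.1, 0.0, 0.7, 0.1],
}
query = [0.2, 0.8, 0.3, 0.1, 0.3]

name, score = best_match(query, library)
print(name, round(score, 3))
```

Note that `best_match` always returns the nearest library entry, even when the query compound is absent from the library, which is the limitation of database search that the abstract points out.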
Keywords
Fourier-transform infrared spectroscopy, chemistry, chemical structure, chemical substructures, deep learning, convolutional neural networks, evolutionary optimization, deep Q-learning, reinforcement learning
Subject Categories
Artificial Intelligence and Robotics | Organic Chemistry | Other Computer Sciences
Copyright
© Joshua D. Ellis
Recommended Citation
Ellis, Joshua D., "Identification of Chemical Structures and Substructures via Deep Q-Learning and Supervised Learning of FTIR Spectra" (2021). MSU Graduate Theses. 3662.
https://bearworks.missouristate.edu/theses/3662
Open Access
Included in
Artificial Intelligence and Robotics Commons, Organic Chemistry Commons, Other Computer Sciences Commons