High-Throughput Phenotyping of Clinical Text Using Large Language Models

Abstract

High-throughput phenotyping automates the mapping of patient signs to standardized concepts, such as those in the Human Phenotype Ontology (HPO), a process critical to precision medicine. We evaluated the automated phenotyping of clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database using large language models. APIs were used to automate text retrieval, sign identification, categorization, and normalization. GPT-4 outperformed GPT-3.5 Turbo in identifying, categorizing, and normalizing signs, achieving concordance with manual annotators comparable to the concordance between manual annotators. Although GPT-4 demonstrates high accuracy in sign identification and categorization, limitations remain in sign normalization, particularly in retrieving the correct HPO ID for a normalized term. Methods such as retrieval-augmented generation, changes in pre-training, and additional fine-tuning may help address these limitations. The combination of APIs with large language models presents a promising approach for high-throughput phenotyping of free text.
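
The normalization limitation noted in the abstract (retrieving the correct HPO ID for a normalized term) can be illustrated with a lookup-based sketch. This is not the authors' pipeline. It is a minimal example of one retrieval-style alternative, in which the ID is read from a table of ontology labels instead of being generated by the model. The function name `normalize_sign`, the `cutoff` parameter, and the three-entry table are assumptions made for illustration; a real run would load the full HPO release and verify every ID against it.

```python
import difflib

# Hand-typed illustrative subset of HPO labels (assumption: a real system
# would load the complete ontology rather than this three-entry table).
HPO_LABEL_TO_ID = {
    "seizure": "HP:0001250",
    "ataxia": "HP:0001251",
    "tremor": "HP:0001337",
}


def normalize_sign(term, cutoff=0.8):
    """Map a free-text sign to (label, hpo_id), or None if nothing is close.

    The ID always comes from the lookup table, never from generated text,
    so a term can be missed but an ID cannot be invented.
    """
    key = term.strip().lower()
    # Exact match on the lower-cased label.
    if key in HPO_LABEL_TO_ID:
        return key, HPO_LABEL_TO_ID[key]
    # Otherwise fall back to the closest label by string similarity.
    matches = difflib.get_close_matches(
        key, list(HPO_LABEL_TO_ID), n=1, cutoff=cutoff
    )
    if matches:
        return matches[0], HPO_LABEL_TO_ID[matches[0]]
    return None


if __name__ == "__main__":
    for sign in ["Seizures", "tremor", "hemiparesis"]:
        print(sign, "->", normalize_sign(sign))
```

The design choice here is to return `None` for unmatched terms so they can be routed to manual review, instead of accepting a plausible-looking but unverified identifier.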

Department(s)

Cooperative Engineering Program

Document Type

Conference Proceeding

DOI

10.1109/BHI62660.2024.10913712

Keywords

GPT-4, high-throughput, HPO, large language model, natural language processing, neurology, OMIM, phenotype

Publication Date

1-1-2024

Recommended Citation

Obafemi-Ajayi, Tayo; Hier, Daniel B.; Munzir, S. Ilyas; Stahlfeld, Anne; and Carrithers, Michael D., "High-Throughput Phenotyping of Clinical Text Using Large Language Models" (2024). Faculty Scholarship. 475.
https://bearworks.missouristate.edu/articles00/475

Journal Title

BHI 2024 IEEE EMBS International Conference on Biomedical and Health Informatics Proceedings
