Faculty Scholarship

Balanced Benchmarking of Zero-Shot and RAG Approaches for Biomedical Term Normalization

Abstract

Normalization of medical concepts to an ontology is a key aspect of the natural language processing of biomedical text. It enables the mapping of medical expressions to standardized ontology terms and their identifiers, thereby enhancing the interoperability and computability of medical concepts. Although large language models (LLMs) can identify and standardize medical terms, they may struggle to accurately map ontology terms to their corresponding ontology identifiers. These challenges arise from the stochastic nature of LLMs, their limited exposure to uncommon ontology identifiers during training, and their lack of an integrated lookup mechanism. We generated test sets of synthetic terms to assess normalization performance by both zero-shot prompted and retrieval-augmented generation (RAG) prompted methods across two ontologies (Human Phenotype Ontology and Gene Ontology) and three LLMs (GPT-4o, LLaMA 3.3 70B, and Phi-4). To ensure a calibrated and fair evaluation of normalization, the test set was balanced along two axes: (1) term prevalence in biomedical literature, as estimated by PubMed Central frequency counts, and (2) semantic proximity to ontology terms, as assessed by cosine similarity of BioBERT embeddings. Our results demonstrate that RAG consistently outperforms zero-shot prompting, particularly on low-prevalence terms that are infrequently encountered in the biomedical literature. This highlights the value of RAG in compensating for gaps in model exposure to uncommon medical concepts. We demonstrate that a synthetic test set can be a valuable tool for evaluating biomedical term normalization across LLMs.
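The two balancing axes described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the field names (`term`, `pmc_count`, `similarity`), the bin edges, and the per-cell sample size are all hypothetical. It assumes that each candidate term already has a PubMed Central frequency count and a cosine similarity to its nearest ontology term (computed from BioBERT embeddings in the study, from plain vectors here).

```python
import math
import random
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def bin_index(value, edges):
    """Index of the first bin whose upper edge is >= value.

    Values above the last edge fall into the final bin, len(edges).
    """
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


def balanced_sample(items, prevalence_edges, similarity_edges, per_cell, seed=0):
    """Draw up to `per_cell` items from every (prevalence, similarity) cell.

    `items` is a list of dicts with the hypothetical keys
    'term', 'pmc_count', and 'similarity'.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for item in items:
        key = (
            bin_index(item["pmc_count"], prevalence_edges),
            bin_index(item["similarity"], similarity_edges),
        )
        cells[key].append(item)

    sample = []
    for key in sorted(cells):
        members = cells[key]
        # Never request more items than the cell contains.
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample
```

Sampling the same number of terms from each cell keeps frequent, near-verbatim terms from dominating the test set, which is what would allow a comparison between zero-shot and RAG prompting on low-prevalence terms to be read fairly.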

Department(s)

Cooperative Engineering Program

Document Type

Conference Proceeding

DOI

10.1109/CIBCB66090.2025.11177118

Keywords

BioBERT, cosine similarity, Gene Ontology, Human Phenotype Ontology, large language models, normalization, ontology identifiers, ontology mapping

Publication Date

1-1-2025

Recommended Citation

Do, Thanh Son; Obafemi-Ajayi, Tayo; and Hier, Daniel B., "Balanced Benchmarking of Zero-Shot and RAG Approaches for Biomedical Term Normalization" (2025). Faculty Scholarship. 253.
https://bearworks.missouristate.edu/articles00/253

Journal Title

2025 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2025)
