A Retrieval-Augmented Generation Framework for Traditional Chinese Medicine Herb Recommendation Using Symptom-Focused and Ingredient-Based Embeddings
Keywords:
Traditional Chinese Medicine, Retrieval-Augmented Generation, Large Language Models, Semantic Embeddings, Clinical Decision Support
Abstract
Traditional Chinese Medicine (TCM) herb recommendation systems struggle to match complex patient symptoms with appropriate herbal treatments because of the intricate relationships among symptoms, herb properties, and traditional diagnostic principles. This research proposes a retrieval-augmented generation (RAG) framework that uses specialized symptom-focused and ingredient-based embeddings, and examines whether they improve recommendation accuracy and clinical relevance. We developed three RAG systems: the first focuses exclusively on symptom-herb relationships, using detailed indication and effect data from comprehensive TCM knowledge bases; the second incorporates herb properties and meridian information to capture the fundamental characteristics of each medicinal substance; and the third combines both knowledge perspectives through integrated retrieval mechanisms. Our experimental evaluation, conducted on a dataset of 15,000 historical TCM prescriptions with 100 test samples, yields nuanced findings on the effectiveness of domain-specific embeddings. Compared with the baseline LLM-only approach, the symptom-focused RAG approach achieved modest improvements, with 4.76% higher precision@5 (0.1320 vs. 0.1260) and 2.22% higher recall@5 (0.0967 vs. 0.0946), though these differences were not statistically significant (p > 0.05). Notably, the baseline LLM performed robustly across multiple metrics, including superior overall accuracy (0.1900), Mean Average Precision (0.0803), and NDCG@5 (0.1475), reflecting the substantial medical knowledge that modern large language models acquire during pre-training.
The ingredient-focused and combined RAG approaches performed worse than the baseline, with 7.94% lower precision@5 and 7.29% lower recall@5, suggesting that simple knowledge concatenation may introduce conflicting signals unless the integration is carefully designed. These findings offer insight into both the opportunities and the challenges of applying retrieval-augmented generation to Traditional Chinese Medicine. While specialized domain embeddings show potential for targeted improvements in specific retrieval scenarios, achieving consistent and statistically significant gains requires further optimization of embedding construction, retrieval mechanisms, and knowledge integration, and potentially larger-scale training datasets. The proposed framework establishes a methodological foundation for future research in TCM recommendation systems and highlights the importance of rigorous experimental validation, careful hyperparameter tuning, and sophisticated knowledge fusion techniques when bridging traditional medical knowledge with modern artificial intelligence. Overall, the results suggest that embedding design choices strongly influence RAG system performance in TCM applications, and that misaligned knowledge integration can degrade rather than improve recommendation quality.
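To make the reported metrics concrete, the following is a minimal Python sketch of how precision@5, recall@5, and the relative differences quoted in the abstract can be computed. It uses the common textbook definitions (hits divided by k for precision, hits divided by the number of reference herbs for recall); the paper's exact evaluation code is not shown here, so these definitions, the function names, and the placeholder herb names are assumptions for illustration only.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k recommended herbs found in the reference prescription."""
    hits = sum(1 for herb in ranked[:k] if herb in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the reference prescription's herbs recovered in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for herb in ranked[:k] if herb in relevant)
    return hits / len(relevant)


def relative_change_pct(system: float, baseline: float) -> float:
    """Relative difference of a system score against the baseline, in percent."""
    return (system - baseline) / baseline * 100.0


# Hypothetical recommendation list and reference prescription (placeholder names).
ranked = ["herb_a", "herb_b", "herb_c", "herb_d", "herb_e"]
relevant = {"herb_b", "herb_e", "herb_x"}

print(precision_at_k(ranked, relevant))  # 2 of the 5 recommendations are relevant
print(recall_at_k(ranked, relevant))     # 2 of the 3 reference herbs are recovered

# Relative differences recomputed from the scores reported in the abstract.
print(round(relative_change_pct(0.1320, 0.1260), 2))  # precision@5, symptom-focused vs. baseline
print(round(relative_change_pct(0.0967, 0.0946), 2))  # recall@5, symptom-focused vs. baseline
```

Under these definitions, the two reported score pairs reproduce the 4.76% and 2.22% relative improvements stated in the abstract. This also shows that the percentages are relative to the baseline score, so the absolute differences are small (0.0060 in precision@5 and 0.0021 in recall@5).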
License
Copyright (c) 2026 Journal of Computer and Creative Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.