A Retrieval-Augmented Generation Framework for Traditional Chinese Medicine Herb Recommendation Using Symptom-Focused and Ingredient-Based Embeddings
DOI:
https://doi.org/10.65205/jcct.2026.e3516
Keywords:
Traditional Chinese Medicine, Retrieval-Augmented Generation, Large Language Models, Semantic Embeddings, Clinical Decision Support
Abstract
This study proposes a Retrieval-Augmented Generation (RAG) framework for Traditional Chinese Medicine (TCM) herb recommendation and develops three systems: one focusing on symptom-herb relationships, one on herb properties and meridian information, and one combining both approaches. Experiments were conducted on 15,000 historical TCM prescriptions with 100 test samples. The symptom-focused RAG achieved modest improvements over the baseline, with 4.76% higher precision@5 and 2.22% higher recall@5, though neither improvement reached statistical significance (p > 0.05). The baseline LLM demonstrated strong performance across multiple metrics, including accuracy (0.1900) and NDCG@5 (0.1475), reflecting substantial pre-trained medical knowledge. The ingredient-focused and combined RAG approaches underperformed relative to the baseline, with precision@5 declining by 7.94%, suggesting that naive knowledge concatenation may introduce conflicting signals. These findings indicate that embedding design choices critically determine RAG system performance in TCM applications, and that misaligned knowledge integration can degrade recommendation quality. This work establishes a methodological foundation for future TCM recommendation research, highlighting the importance of embedding construction, hyperparameter tuning, and sophisticated knowledge fusion strategies.
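The ranking metrics named in the abstract (precision@5, recall@5, NDCG@5) can be illustrated with a minimal sketch. It is not the authors' evaluation code: it assumes binary relevance (a recommended herb either appears in the reference prescription or it does not), the function names are hypothetical, and the herb names are placeholder data. The `relative_change` helper further assumes that the percentage differences reported against the baseline are relative changes, which the abstract does not state explicitly.

```python
import math
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k recommended herbs that appear in the reference set."""
    hits = sum(1 for herb in ranked[:k] if herb in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the reference herbs that were recovered in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for herb in ranked[:k] if herb in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """NDCG@k with binary gains: each hit is discounted by log2(rank + 1)."""
    dcg = sum(
        1.0 / math.log2(i + 2)  # i is 0-based, so rank = i + 1
        for i, herb in enumerate(ranked[:k])
        if herb in relevant
    )
    # Ideal ranking places every relevant herb (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def relative_change(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base` (assumed reporting convention)."""
    return (new - base) / base * 100.0


# Placeholder example: one model output scored against one reference prescription.
ranked = ["gancao", "fuling", "baizhu", "danggui", "chenpi"]
relevant = {"gancao", "baizhu", "renshen"}

p5 = precision_at_k(ranked, relevant)
r5 = recall_at_k(ranked, relevant)
n5 = ndcg_at_k(ranked, relevant)
print(f"precision@5={p5:.4f} recall@5={r5:.4f} NDCG@5={n5:.4f}")
```

In this toy case two of the five recommended herbs are in the reference prescription, so precision@5 is 2/5 and recall@5 is 2/3. Averaging these per-sample scores over a test set would give system-level figures of the kind the abstract compares.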
License
Copyright (c) 2026 Journal of Computer and Creative Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.





















