Large language models (LLMs) show great promise in medical applications, but their reliability is hindered by limited high-quality data, the rigidity of closed-source models, and reasoning degradation during fine-tuning. To address these challenges, we present Med-Aligner, a plug-in module that learns correction residuals to improve accuracy without re-optimizing the full model. Trained on 267,524 anonymized medical records from 21 departments, Med-Aligner was integrated with eight LLMs (e.g., GPT-4, Med-Llama3-8B) and evaluated on helpfulness, harmlessness, and honesty (3H). It achieved average gains of 41.3±25.4%, 30.3±12.4%, and 27.3±14.8% in helpfulness; 10.9±8.6% and 16.6±11.3% in harmlessness; and a median 1.7% (range: 0.4%–3.4%) improvement in honesty (p < 0.05). Distribution-shift plots confirmed consistent gains, particularly in safety and utility. Its lightweight, model-agnostic design enables deployment on resource-limited devices such as smartphones. Top rankings on the AlpacaEval leaderboard further validate its effectiveness. By bridging open-source and proprietary LLMs, Med-Aligner offers a flexible, efficient solution for medical AI. Limitations include reliance on offline data and the need for clinical validation.
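The core idea of learning a correction residual on top of a frozen base model can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not Med-Aligner's actual architecture: the "base model" is a fixed linear map standing in for a frozen LLM, the systematic error matrix `D` is synthetic, and the residual corrector is fit by ordinary least squares rather than gradient training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen base model: a fixed linear map (illustrative stand-in for an LLM).
# It is never modified; only the plug-in residual below is learned.
def base_model(x):
    W = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, -0.5]])
    return W @ x

# Synthetic systematic error of the base model (hypothetical, for the demo):
# the "aligned" target output is base_model(x) + D @ x.
D = np.array([[0.3, 0.0],
              [0.0, -0.2],
              [0.1, 0.1]])

# Toy dataset of inputs and desired aligned outputs.
X = rng.normal(size=(64, 2))
Y = np.stack([base_model(x) + D @ x for x in X])

# Fit ONLY the residual corrector R, so the base model stays frozen:
# we want R @ x ≈ Y - base_model(x), solved in closed form via least squares.
base_outputs = np.stack([base_model(x) for x in X])
residual_targets = Y - base_outputs
R = np.linalg.lstsq(X, residual_targets, rcond=None)[0].T

# The aligned model adds the learned residual to the frozen base output.
def aligned_model(x):
    return base_model(x) + R @ x
```

Because the corrector is a separate, small module added to the base model's output, the same plug-in can in principle be attached to any base model with a compatible interface, which is the property the abstract describes as model-agnostic.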