Large language models (LLMs) show great promise in medical applications, but their reliability is hindered by limited high-quality data, the rigidity of closed-source models, and reasoning degradation during fine-tuning. To address these challenges, we present Med-Aligner, a plug-in module that learns correction residuals to improve accuracy without re-optimizing the full model. Trained on 267,524 anonymized medical records from 21 departments, Med-Aligner was integrated with eight LLMs (e.g., GPT-4, Med-Llama3-8B) and evaluated on helpfulness, harmlessness, and honesty (3H). It achieved average gains of 41.3±25.4%, 30.3±12.4%, and 27.3±14.8% in helpfulness; 10.9±8.6% and 16.6±11.3% in harmlessness; and a median 1.7% (range: 0.4%–3.4%) improvement in honesty (p < 0.05). Distribution-shift plots confirmed consistent gains, particularly in safety and utility. Its lightweight, model-agnostic design enables deployment on resource-limited devices such as smartphones. Top rankings on the AlpacaEval leaderboard further validate its effectiveness. By bridging open-source and proprietary LLMs, Med-Aligner offers a flexible, efficient solution for medical AI. Limitations include reliance on offline data and the need for clinical validation.
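The core idea of learning a correction residual on top of a frozen base model can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not Med-Aligner's actual architecture: the "base model" is a fixed linear map standing in for a frozen LLM, the systematic error matrix `D` is synthetic, and the residual corrector is fit by ordinary least squares rather than gradient training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen base model: a fixed linear map (illustrative stand-in for an LLM).
# It is never modified; only the plug-in residual below is learned.
def base_model(x):
    W = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.5, -0.5]])
    return W @ x

# Synthetic systematic error of the base model (hypothetical, for the demo):
# the "aligned" target output is base_model(x) + D @ x.
D = np.array([[0.3, 0.0],
              [0.0, -0.2],
              [0.1, 0.1]])

# Toy dataset of inputs and desired aligned outputs.
X = rng.normal(size=(64, 2))
Y = np.stack([base_model(x) + D @ x for x in X])

# Fit ONLY the residual corrector R, so the base model stays frozen:
# we want R @ x ≈ Y - base_model(x), solved in closed form via least squares.
base_outputs = np.stack([base_model(x) for x in X])
residual_targets = Y - base_outputs
R = np.linalg.lstsq(X, residual_targets, rcond=None)[0].T

# The aligned model adds the learned residual to the frozen base output.
def aligned_model(x):
    return base_model(x) + R @ x
```

Because the corrector is a separate, small module added to the base model's output, the same plug-in can in principle be attached to any base model with a compatible interface, which is the property the abstract describes as model-agnostic.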