Last week, Google dropped a pre-print describing Med-Gemini, their latest suite of medically focused Large Language Models (LLMs).
See pre-print at: https://arxiv.org/abs/2404.18416
🎯 If you are working in the arena of medical LLMs, this is a must read.
A quick summary.
First, Med-Gemini is not one model, but a family of models. For each model, the Google team:
👉 Started with Gemini, Google's most recent and advanced LLM. Notably, Gemini includes support for multi-modal reasoning and long-context windows.
👉 Fine-tuned it on medical data sets, including multimodal ones.
👉 Added a web search capability, so the model can search the web for trusted information and integrate newly found data into its responses.
👉 Added "uncertainty-guided search at inference", a process whereby the model can iteratively search the web and find data to support or refute specific medical conclusions.
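To make the "uncertainty-guided search" idea concrete, here is a minimal Python sketch of what such a loop could look like. It is an illustration under stated assumptions, not Google's implementation: it assumes uncertainty is measured as the entropy of several sampled answers, and the callables `sample_answers` and `search`, the `threshold`, and `max_rounds` are all hypothetical names introduced for this example.

```python
import math
from collections import Counter
from typing import Callable, List


def answer_entropy(answers: List[str]) -> float:
    """Shannon entropy (in bits) of the sampled answers; 0 means full agreement."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def uncertainty_guided_answer(
    question: str,
    sample_answers: Callable[[str, List[str]], List[str]],  # hypothetical: samples several model answers
    search: Callable[[str, List[str]], List[str]],          # hypothetical: returns web snippets
    threshold: float = 0.5,                                 # assumed entropy cut-off, in bits
    max_rounds: int = 3,                                    # assumed cap on search rounds
) -> str:
    """Search for more evidence only while the sampled answers still disagree."""
    context: List[str] = []
    answers = sample_answers(question, context)
    for _ in range(max_rounds):
        if answer_entropy(answers) <= threshold:
            break  # the model is confident enough; stop searching
        # Uncertain: retrieve evidence about the conflicting answers, then re-sample.
        context.extend(search(question, answers))
        answers = sample_answers(question, context)
    # Return the most frequent answer among the final samples.
    return Counter(answers).most_common(1)[0][0]
```

The key design choice in this sketch is that search is triggered by disagreement among sampled answers, so confident questions cost no extra web calls, while uncertain ones get up to `max_rounds` rounds of added evidence.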
Google then tested the Med-Gemini models against a range of benchmarks, and found that they outperformed previous state-of-the-art (SOTA) models on most of them. Notably, some of the benchmarks were text-based and some were multi-modal.
🤖 Google also included several example conversations with different Med-Gemini models, illustrating an impressive range of features. For example: diagnosing skin bumps, reading and interpreting a chest X-ray, summarizing EHR clinical notes, and extracting targeted segments from instructional medical videos.
Finally, Google is very cognizant that the bar is high for deploying such models for diagnostic purposes, and cautions that much more work is needed. For background on this, see my recent post on Food, Drugs and AI. However, the authors also note that Med-Gemini surpasses human experts in a few non-diagnostic areas, including medical text summarization and referral letter generation.
Of note, Google has decided not to open-source any of the Med-Gemini models, due to “safety implications of unmonitored use of such a system in medical settings”.
"I understand your concern" -- who is "I", I wonder...