Researchers present WALDO, a training-free framework for zero-shot anomaly localisation in medical imaging that uses vision-language models (VLMs) to improve the detection of rare pathologies. The framework recasts anomaly detection as a comparative inference problem built on three components: entropy-weighted Sliced Wasserstein distances for anatomically aware reference selection, Goldilocks zone sampling to choose references of moderate similarity, and self-consistency aggregation through weighted non-maximum suppression. The theoretical analysis indicates that references of moderate similarity give the most favourable bias-variance trade-off in visual reasoning. On the NOVA brain MRI benchmark, WALDO with Qwen2.5-VL-72B achieved 43.5% mAP@30, a 19% relative improvement over zero-shot baselines, with statistical significance confirmed by paired McNemar tests ($p<0.01$). Source code is available on GitHub.
WALDO Framework Enhances Zero-Shot Anomaly Localisation in Medical Imaging Using Vision-Language Models
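The reference-selection idea in the summary can be illustrated with a minimal sketch. It is not the WALDO implementation: it computes a plain (unweighted) sliced Wasserstein distance between small sets of feature vectors and then keeps the candidates whose distance falls in a middle band. The entropy weighting is left out because the summary does not specify it, and the function names, the quantile thresholds `low_q` and `high_q`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalised Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def sliced_wasserstein(a, b, n_projections=64, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-1 distance.

    a, b: equal-length lists of feature vectors (lists of floats).
    Assumption: uniform weights; no entropy weighting is applied here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("this sketch assumes two non-empty sets of equal size")
    rng = random.Random(seed)
    dim = len(a[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # In one dimension, optimal transport between equal-size samples
        # simply pairs the sorted projections.
        pa = sorted(sum(x * t for x, t in zip(p, theta)) for p in a)
        pb = sorted(sum(x * t for x, t in zip(p, theta)) for p in b)
        total += sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)
    return total / n_projections


def goldilocks_select(distances, low_q=0.25, high_q=0.75):
    """Return indices of candidates in the middle distance band.

    Hypothetical rule: drop the closest and the farthest candidates by rank.
    """
    ranked = sorted(range(len(distances)), key=lambda i: distances[i])
    lo = int(low_q * len(ranked))
    hi = int(high_q * len(ranked))
    return ranked[lo:hi]


# Toy data: a query set and four candidates shifted progressively further away.
query = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidates = [[[x + s, y + s] for x, y in query] for s in (0.0, 0.5, 1.0, 2.0)]
dists = [sliced_wasserstein(query, c) for c in candidates]
picked = goldilocks_select(dists)  # indices of moderately similar references
```

In this toy setup the identical candidate and the most distant one are discarded, leaving the two moderately shifted candidates as references. A real pipeline would operate on image-derived features and would need a rule for unequal set sizes, which this sketch deliberately avoids.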
