HarmoWAM Introduces Adaptive World Action Models for Enhanced Robotic Manipulation
HarmoWAM is an approach to robot control built on World Action Models (WAMs) that addresses the fundamental trade-off between two existing paradigms: 'Imagine-then-Execute' and 'Joint Modeling'. The research finds that the former generalizes well but lacks precision, while the latter produces fine-grained actions but is limited in exploring beyond its training distribution. HarmoWAM unifies the two by using a world model to enhance both predictive and reactive control, with a Process-Adaptive Gating Mechanism coordinating them. In evaluations across six real-world robotic tasks, HarmoWAM shows strong zero-shot generalization, outperforming previous models by 33% and 29%, respectively, in diverse testing environments.
