Researchers Ali Hatamizadeh, Yejin Choi, and Jan Kautz have introduced Gated DeltaNet-2, a linear attention model that decouples erasing from writing when it updates its memory. Linear attention compresses context into a fixed-size memory, and editing that memory without disturbing existing associations is difficult. Gated DeltaNet-2 uses separate channel-wise erase and write gates, building on predecessors such as Gated DeltaNet and Kimi Delta Attention. At 1.3 billion parameters and trained on a large dataset, the model shows stronger performance in language modeling and commonsense reasoning tasks and does particularly well on long-context retrieval benchmarks. The code is publicly available for further research and development.
Introducing Gated DeltaNet-2: A Breakthrough in Linear Attention Mechanisms
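To make the idea of decoupled erase and write gates concrete, here is a minimal Python sketch. It is not the authors' formulation: the update form, the function names (`delta_step`, `decoupled_step`), and the choice to apply the gates along the key dimension are all assumptions made for illustration. The sketch contrasts a classic delta-rule update, where one scalar controls both erasing and writing, with a hypothetical variant that takes two independent per-channel gate vectors.

```python
def matvec(S, k):
    """Read from memory: returns S @ k for a matrix stored as a list of rows."""
    return [sum(s * x for s, x in zip(row, k)) for row in S]


def delta_step(S, k, v, beta):
    """Classic delta rule: a single scalar beta ties erase and write together.

    S_new = S + beta * (v - S k) k^T
    """
    pred = matvec(S, k)
    return [
        [S[i][j] + beta * (v[i] - pred[i]) * k[j] for j in range(len(k))]
        for i in range(len(v))
    ]


def decoupled_step(S, k, v, erase, write):
    """Hypothetical decoupled update (an assumption, not the published equations).

    `erase` and `write` are per-channel gates in [0, 1] over the key dimension:
    S_new = S - (S k)(erase * k)^T + v (write * k)^T
    """
    pred = matvec(S, k)
    return [
        [
            S[i][j] - pred[i] * erase[j] * k[j] + v[i] * write[j] * k[j]
            for j in range(len(k))
        ]
        for i in range(len(v))
    ]


if __name__ == "__main__":
    S0 = [[0.0, 0.0], [0.0, 0.0]]
    k = [1.0, 0.0]
    v = [3.0, 4.0]

    # Store the association k -> v.
    S1 = delta_step(S0, k, v, beta=1.0)
    print(S1)

    # Erase that association without writing anything new.
    S2 = decoupled_step(S1, k, v, erase=[1.0, 1.0], write=[0.0, 0.0])
    print(S2)
```

When both gate vectors are set to the same constant, this sketch reduces to the classic delta rule. Setting them independently allows, for example, clearing an old association without storing a new value, which a single shared coefficient cannot express.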
