{"type":"link","version":"1.0","title":"A learned per-head gate lets a model decide how much to trust retrieved memory versus local attention, and most heads choose memory","author_name":"AI Archs","author_url":"https://ai-arch.pages.dev","provider_name":"AI Archs","provider_url":"https://ai-arch.pages.dev","url":"https://ai-arch.pages.dev/n/gated-merge-of-retrieval-and-local-attention","thumbnail_url":"https://ai-arch.pages.dev/og/gated-merge-of-retrieval-and-local-attention.png","thumbnail_width":1200,"thumbnail_height":630}