EMNLP 2025, Yu: Long-Context LM Fail in Basic Retrieval. A synthetic dataset shows that long-context LMs fail needle-in-a-haystack retrieval when finding the needle requires reasoning.
