EMNLP2025 Yu: Long-Context LM Fail in Basic Retrieval. A synthetic dataset shows that long-context LMs fail needle-in-a-haystack retrieval when the needle requires reasoning.