Blog: https://blog.reachsumit.com/
Newsletter: https://recsys.substack.com/
📝 recsys.substack.com/p/training-f...
Introduces a Python package that wraps retrieval models as HTTP APIs with automatic query batching and caching for dynamic RAG pipelines.
📝 arxiv.org/abs/2601.10644
👨🏽‍💻 github.com/hltcoe/routir
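The core trick such a wrapper relies on, caching repeated queries and batching cache misses into a single model call, can be sketched generically. This is an illustrative stand-in, not the routir API; the class and function names are hypothetical.

```python
# Generic sketch of query caching + batching in front of a retriever.
# `search_fn` stands in for any batch retrieval call: list[str] -> list[result].
import hashlib
from collections import OrderedDict

class CachingBatchRetriever:
    """Serves cached results when possible; batches all misses into one call."""

    def __init__(self, search_fn, cache_size=1024):
        self.search_fn = search_fn
        self.cache = OrderedDict()      # LRU cache keyed by query hash
        self.cache_size = cache_size

    def _key(self, query):
        return hashlib.sha256(query.encode()).hexdigest()

    def retrieve(self, queries):
        results = {}
        misses = []
        for q in queries:
            k = self._key(q)
            if k in self.cache:
                self.cache.move_to_end(k)   # refresh LRU order
                results[q] = self.cache[k]
            elif q not in misses:           # dedupe within the batch
                misses.append(q)
        if misses:
            # One batched call to the underlying model for all cache misses.
            for q, res in zip(misses, self.search_fn(misses)):
                self.cache[self._key(q)] = res
                if len(self.cache) > self.cache_size:
                    self.cache.popitem(last=False)  # evict least recently used
                results[q] = res
        return [results[q] for q in queries]
```

In an HTTP setting the same logic would sit behind the request handler, so concurrent requests share the cache and are coalesced into fewer model invocations.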
Huawei presents a modular multimodal information-seeking agent that decouples retrieval from answer generation, optimized with retrieval-oriented rewards.
📝 arxiv.org/abs/2601.09278
Introduces a structured self-evolving framework that models deep research as a Finite State Machine, enabling controllable agent adaptation.
📝 arxiv.org/abs/2601.09465
👨🏽‍💻 github.com/QuantaAlpha/...
Proposes modifying LLM attention mechanisms with explicit relevance signals from retrieved documents, making RAG systems more robust to noise.
📝 arxiv.org/abs/2601.09028
👨🏽‍💻 github.com/fengranMark/...
Introduces a learning-free method that transforms LLM embeddings into binary codes using Isolation Kernel, achieving up to 16.7x faster retrieval and 16x lower memory.
📝 arxiv.org/abs/2601.09159
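The speed and memory wins come from comparing compact binary codes with Hamming distance instead of float vectors with cosine similarity. A minimal sketch of that retrieval pattern, using a naive sign threshold in place of the paper's Isolation Kernel transform (which is not reproduced here):

```python
# Toy binary-code retrieval: binarize embeddings, rank by Hamming distance.
# Sign binarization is a placeholder for the Isolation Kernel step.

def to_code(vec):
    """Pack the sign pattern of a float vector into a single integer."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Hamming distance between two integer codes via XOR popcount."""
    return bin(a ^ b).count("1")

def search(query_vec, doc_vecs, k=2):
    """Indices of the k documents whose codes are nearest in Hamming space."""
    q = to_code(query_vec)
    codes = [to_code(d) for d in doc_vecs]
    return sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))[:k]
```

Because each code is one machine word per 64 dimensions and XOR/popcount are single instructions, this is where the claimed order-of-magnitude memory and latency reductions come from.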
Presents a plug-and-play framework that aligns sparse and dense collaborative filtering views to improve recommendation accuracy, especially for long-tail items.
📝 arxiv.org/abs/2601.09286
👨🏽‍💻 github.com/harris26-G/SaD
Constructs structured "pages" with cognitive outlines and iteratively fills knowledge slots via retrieval, improving RAG performance.
📝 arxiv.org/abs/2601.09402
👨🏽‍💻 github.com/OpenBMB/PAGER
Unifies search and recommendation in LLMs using multi-subspace decomposition to mitigate gradient conflicts and null-space projection to preserve general-domain knowledge.
📝 arxiv.org/abs/2601.09496
Presents a benchmark combining temporal reasoning with reasoning-intensive retrieval, featuring 1730 queries across 13 domains.
📝 arxiv.org/abs/2601.09523
👨🏽‍💻 tempo-bench.github.io
Presents a multimodal benchmark for reasoning-intensive retrieval with 2803 queries across 29 technical domains & 4 tasks of increasing complexity.
📝 arxiv.org/abs/2601.09562
👨🏽‍💻 mm-bright.github.io
Introduces a Transformer pre-trained entirely on synthetic Markov chains that achieves SOTA recommender performance by fine-tuning only a lightweight input adaptor.
📝 arxiv.org/abs/2601.08275
👨🏽‍💻 github.com/BDML-lab/MPT
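The striking part of this result is that the pre-training data is purely synthetic. Generating such data is simple: sample a random row-stochastic transition matrix, then roll out state sequences from it. The sketch below shows the idea; the state count, chain order, and sampling details are illustrative choices, not the paper's exact setup.

```python
# Generate synthetic first-order Markov-chain sequences of the kind a
# Transformer could be pre-trained on.
import random

def random_transition_matrix(n_states, rng):
    """Row-stochastic matrix: each row is a random next-state distribution."""
    mat = []
    for _ in range(n_states):
        weights = [rng.random() for _ in range(n_states)]
        total = sum(weights)
        mat.append([w / total for w in weights])
    return mat

def sample_sequence(mat, length, rng):
    """Roll out a state sequence from a uniform-random start state."""
    state = rng.randrange(len(mat))
    seq = [state]
    for _ in range(length - 1):
        state = rng.choices(range(len(mat)), weights=mat[state])[0]
        seq.append(state)
    return seq
```

A fresh matrix per sequence yields unlimited, license-free training data with next-token structure resembling item-to-item transitions in recommendation logs.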
Presents a GPU-CPU-disk collaborative framework for streaming vector search with hierarchical indexing and workload-aware caching, achieving 20.9× higher throughput.
📝 arxiv.org/abs/2601.08528
Introduces a retrieval-free framework that injects private domain knowledge into frozen LLMs via a single-token interface.
📝 arxiv.org/abs/2601.08209
@houhaowen et al. propose a unified retrieval paradigm using RWKV's matrix-valued states to bridge embedding and reranking stages, achieving 5.4×–44.8× speedup.
📝 arxiv.org/abs/2601.07861
👨🏽‍💻 github.com/howard-hou/E...
Presents a self-learning query suggestion method for agentic RAG that uses dynamic few-shot retrieval to suggest answerable alternatives when user queries fail.
📝 arxiv.org/abs/2601.08105
Introduces a dual-agent paradigm where a Reasoner constructs and adapts global plans while a Purifier filters retrieval noise, improving multi-hop QA performance.
📝 arxiv.org/abs/2601.08282
Introduces a benchmark with 310 datasets across 10 languages to diagnose position bias in retrieval models, revealing that most models exhibit primacy bias.
📝 arxiv.org/abs/2601.08363
👨🏽‍💻 github.com/Ziyang1060/P...
Presents a training-free RAG framework that treats retrieved documents as parallel experts, aggregating evidence at decode time via contrastive decoding rather than long-context attention.
📝 arxiv.org/abs/2601.08670
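The general shape of decode-time evidence aggregation can be shown in a toy form: each retrieved document conditions its own next-token distribution, and a contrastive score favors tokens the documents support more than the document-free prior does. This is a generic sketch of contrastive decoding over parallel document experts, not the paper's exact formulation; `alpha` and the log-prob inputs are illustrative.

```python
# Toy contrastive aggregation of per-document next-token log-probs.

def contrastive_pick(doc_logprobs, prior_logprobs, alpha=1.0):
    """Pick the next token by contrasting document-conditioned evidence
    against the context-free prior.

    doc_logprobs: list of {token: logprob}, one dict per retrieved document.
    prior_logprobs: {token: logprob} from the model with no documents."""
    scores = {}
    for token in prior_logprobs:
        # Average support across documents, penalized by the prior.
        avg_doc = sum(d[token] for d in doc_logprobs) / len(doc_logprobs)
        scores[token] = avg_doc - alpha * prior_logprobs[token]
    return max(scores, key=scores.get)
```

Because each document is scored independently, the per-document forward passes are short and parallelizable, which is the advantage over concatenating everything into one long context.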
Decouples reasoning from memory management in LLM-based recommender systems, enabling collaborative signals from user-item graphs rather than isolated memory.
📝 arxiv.org/abs/2601.08816
👨🏽‍💻 github.com/rutgerswisel...
Meta introduces a data-free self-evolution framework where a proposer generates diverse questions to train a solver, matching or surpassing supervised search agents on QA benchmarks.
📝 arxiv.org/abs/2601.07055
👨🏽‍💻 github.com/facebookrese...
Introduces a tree-structured RL framework for agentic RAG that enables step-wise credit assignment via Monte Carlo estimation over descendant outcomes.
📝 arxiv.org/abs/2601.06922
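The credit-assignment idea here is worth unpacking: in a tree of branching agent trajectories, an intermediate step's value can be estimated as the mean terminal reward of the leaves beneath it. A minimal sketch of that Monte Carlo estimate, with a hypothetical dict-based tree representation rather than the paper's actual data structures:

```python
# Step-wise credit assignment over a trajectory tree: each internal node's
# value is the mean reward of its descendant leaves.

def assign_values(node):
    """node: {"reward": float or None, "children": [node, ...]}.
    Leaves carry a terminal reward; internal nodes get the mean of
    descendant leaf rewards, stored under "value". Returns the list of
    leaf rewards under this node."""
    if not node["children"]:
        node["value"] = node["reward"]
        return [node["reward"]]
    leaf_rewards = []
    for child in node["children"]:
        leaf_rewards.extend(assign_values(child))
    node["value"] = sum(leaf_rewards) / len(leaf_rewards)
    return leaf_rewards
```

With per-node values in hand, each intermediate action gets its own advantage signal instead of every step in a trajectory sharing one terminal reward.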
Compares Enhanced RAG (fixed pipelines with modules like rerankers) vs. Agentic RAG (LLM-orchestrated) across multiple dimensions, finding neither universally superior but Agentic costs up to 3.6x more.
📝 arxiv.org/abs/2601.07711
Introduces a routing-based approach that dynamically selects the most informative query representation in late-interaction models, achieving up to 30x speedup while maintaining performance.
📝 arxiv.org/abs/2601.06389
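To see where the speedup comes from, contrast full late interaction, which scores every query vector against every document vector, with scoring via a single routed query vector. The sketch below uses a simple max-norm heuristic as the "most informative" choice purely for illustration; the paper's learned router is not reproduced here.

```python
# Toy late-interaction scoring: full MaxSim vs. one routed query vector.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim(query_vecs, doc_vecs):
    """Full late interaction: for each query vector, take its best document
    match, then sum over query vectors (ColBERT-style MaxSim)."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def routed_score(query_vecs, doc_vecs):
    """Score with a single selected query vector; here the largest-norm
    vector stands in for a learned routing decision."""
    q = max(query_vecs, key=lambda v: dot(v, v))
    return max(dot(q, d) for d in doc_vecs)
```

Dropping from |Q| query vectors to one cuts the per-document similarity work by a factor of |Q|, which is the source of the reported latency gains.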
Kuaishou uses structured textual keywords as item identifiers to enable generative recommendation.
📝 arxiv.org/abs/2601.06798
👨🏽‍💻 github.com/ZY0025/GRLM
Uses iterative construction-integration to retrieve core knowledge triples and adaptively expands context granularity for multi-hop reasoning.
📝 arxiv.org/abs/2601.06799