Sumit
@reachsumit.com
190 followers
36 following
1.8K posts
Senior MLE at Meta. Trying to keep up with the Information Retrieval domain!
Blog: https://blog.reachsumit.com/
Newsletter: https://recsys.substack.com/
Pinned
Sumit
@reachsumit.com
· 3d
Probing LLMs' Knowledge Boundary: Adaptive RAG, Part 3
This post introduces techniques that probe the LLM’s internal confidence and knowledge boundaries. We explore prompt-based confidence detection, consistency-based uncertainty estimation, and internal ...
blog.reachsumit.com
Sumit
@reachsumit.com
· 1d
A$^2$Search: Ambiguity-Aware Question Answering with Reinforcement Learning
Recent advances in Large Language Models (LLMs) and Reinforcement Learning (RL) have led to strong performance in open-domain question answering (QA). However, existing models still struggle with ques...
arxiv.org
Sumit
@reachsumit.com
· 1d
Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window
While recent advances in reasoning models have demonstrated cognitive behaviors through reinforcement learning, existing approaches struggle to invoke deep reasoning capabilities in multi-turn agents ...
arxiv.org
Sumit
@reachsumit.com
· 1d
Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation
Modern long-context large language models (LLMs) perform well on synthetic "needle-in-a-haystack" (NIAH) benchmarks, but such tests overlook how noisy contexts arise from biased retrieval and agentic ...
arxiv.org
Sumit
@reachsumit.com
· 1d
Retentive Relevance: Capturing Long-Term User Value in Recommendation Systems
Recommendation systems have traditionally relied on short-term engagement signals, such as clicks and likes, to personalize content. However, these signals are often noisy, sparse, and insufficient fo...
arxiv.org
Sumit
@reachsumit.com
· 1d
PLUM: Adapting Pre-trained Language Models for Industrial-scale Generative Recommendations
Large Language Models (LLMs) pose a new paradigm of modeling and computation for information tasks. Recommendation systems are a critical application domain poised to benefit significantly from the se...
arxiv.org
Sumit
@reachsumit.com
· 1d
HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
Agentic RAG is a powerful technique for incorporating external information that LLMs lack, enabling better problem solving and question answering. However, suboptimal search behaviors exist widely, su...
arxiv.org
Sumit
@reachsumit.com
· 1d
Multilingual Generative Retrieval via Cross-lingual Semantic Compression
Generative Information Retrieval is an emerging retrieval paradigm that exhibits remarkable performance in monolingual scenarios. However, applying these methods to multilingual retrieval still encount...
arxiv.org
Sumit
@reachsumit.com
· 1d
STEPER: Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models
Answering complex real-world questions requires step-by-step retrieval and integration of relevant information to generate well-grounded responses. However, existing knowledge distillation methods ove...
arxiv.org
Sumit
@reachsumit.com
· 1d
TaoSR-AGRL: Adaptive Guided Reinforcement Learning Framework for E-commerce Search Relevance
Query-product relevance prediction is fundamental to e-commerce search and has become even more critical in the era of AI-powered shopping, where semantic understanding and complex reasoning directly ...
arxiv.org
Sumit
@reachsumit.com
· 1d
VersionRAG: Version-Aware Retrieval-Augmented Generation for Evolving Documents
Retrieval-Augmented Generation (RAG) systems fail when documents evolve through versioning, a ubiquitous characteristic of technical documentation. Existing approaches achieve only 58-64% accuracy on v...
arxiv.org
Sumit
@reachsumit.com
· 1d
ReasonEmbed: Enhanced Text Embeddings for Reasoning-Intensive Document Retrieval
In this paper, we introduce ReasonEmbed, a novel text embedding model developed for reasoning-intensive document retrieval. Our work includes three key technical contributions. First, we propose ReMix...
arxiv.org
Sumit
@reachsumit.com
· 2d
The Upside of Bias: Personalizing Long-Tail Item Recommendations with Biased Sampling | ACM Transactions on Recommender Systems
Recommendation systems drive user engagement across social media, streaming platforms, and e-commerce by learning from past interactions. The relevance of a recommended item depends on the quality of ...
dl.acm.org
Sumit
@reachsumit.com
· 2d
Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
Agentic search leverages large language models (LLMs) to interpret complex user information needs and execute a multi-step process of planning, searching, and synthesizing information to provide answe...
arxiv.org
Sumit
@reachsumit.com
· 2d
Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models
Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses t...
arxiv.org
Sumit
@reachsumit.com
· 2d
LAD-RAG: Layout-aware Dynamic RAG for Visually-Rich Document Understanding
Question answering over visually rich documents (VRDs) requires reasoning not only over isolated content but also over documents' structural organization and cross-page dependencies. However, conventi...
arxiv.org
Sumit
@reachsumit.com
· 2d
PTEB: Towards Robust Text Embedding Evaluation via Stochastic Paraphrasing at Evaluation Time with LLMs
Current evaluations of sentence embedding models typically rely on static test beds such as the Massive Text Embedding Benchmark (MTEB). While invaluable, repeated tuning on a fixed suite can inflate ...
arxiv.org
Sumit
@reachsumit.com
· 2d
LLM-Powered Nuanced Video Attribute Annotation for Enhanced Recommendations
This paper presents a case study on deploying Large Language Models (LLMs) as an advanced "annotation" mechanism to achieve nuanced content understanding (e.g., discerning content "vibe") at scale wit...
arxiv.org
Sumit
@reachsumit.com
· 2d
Are LLMs Reliable Rankers? Rank Manipulation via Two-Stage Token Optimization
Large language models (LLMs) are increasingly used as rerankers in information retrieval, yet their ranking behavior can be steered by small, natural-sounding prompts. To expose this vulnerability, we...
arxiv.org
Sumit
@reachsumit.com
· 3d
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics
RAG (Retrieval-Augmented Generation) systems and web agents are increasingly evaluated on multi-hop deep search tasks, yet current practice suffers from two major limitations. First, most benchmarks l...
arxiv.org
Sumit
@reachsumit.com
· 3d
Stratified GRPO: Handling Structural Heterogeneity in Reinforcement Learning of LLM Search Agents
Large language model (LLM) agents increasingly rely on external tools such as search engines to solve complex, multi-step problems, and reinforcement learning (RL) has become a key paradigm for traini...
arxiv.org
Sumit
@reachsumit.com
· 3d
RAG Makes Guardrails Unsafe? Investigating Robustness of Guardrails under RAG-style Contexts
With the increasing adoption of large language models (LLMs), ensuring the safety of LLM systems has become a pressing concern. External LLM-based guardrail models have emerged as a popular solution t...
arxiv.org