Blog: https://blog.reachsumit.com/
Newsletter: https://recsys.substack.com/
🔗 recsys.substack.com/p/a-gpu-nati...
Meta presents an early-stage ranking model that uses Mixture of Attention modules to capture explicit cross-signals via Hard Matching Attention and implicit signals via target-aware self-attention and cross-attention.
📝 arxiv.org/abs/2511.21095
Adaptively allocates tokens between collaborative filtering and semantic codebooks using mixture-of-experts to balance memorization and generalization.
📝 arxiv.org/abs/2511.20673
Presents a benchmark for e-commerce generative engine optimization with 7000+ realistic product queries, showing that optimization-based rewriting strategies substantially outperform heuristic methods.
📝 arxiv.org/abs/2511.20867
Presents an OCR-free multimodal retrieval system using pyramid indexing that achieves strong performance with only 17-27 vectors per page compared to 1024 for patch-based approaches.
📝 arxiv.org/abs/2511.21121
Meituan introduces a framework that integrates pointwise and listwise evaluation for click-through rate prediction, combining fine-grained modeling with hierarchical item dependencies.
📝 arxiv.org/abs/2511.21394
Proposes a speculation-based framework that reduces latency in LLM search agents through adaptive two-phase speculation and two-level scheduling mechanisms.
📝 arxiv.org/abs/2511.20048
Introduces a training-free recommendation framework that combines LLM-based semantic embeddings with collaborative filtering in a two-stage process.
📝 arxiv.org/abs/2511.20564
NVIDIA introduces a lightweight document parsing and OCR model with improved capabilities across general OCR, markdown formatting, structured table parsing, and text extraction from pictures, charts and diagrams.
📝 arxiv.org/abs/2511.20478
👨🏽💻 huggingface.co/nvidia/NVIDI...
Tencent proposes a framework that transfers LLM reasoning capabilities to recommender systems through automated pattern discovery and structure-preserving integration.
📝 arxiv.org/abs/2511.19514
Introduces a domain-adaptive reranking framework combining dynamic expert routing with Entity Abstraction for Generalization to enhance decoder-only rerankers.
📝 arxiv.org/abs/2511.19987
Alibaba integrates LLM-derived world knowledge with sequential recommendation models through generation augmented retrieval.
📝 arxiv.org/abs/2511.20177
👨🏽💻 anonymous.4open.science/r/GRASP-SRS/
Meta introduces a self-supervised framework that dynamically segments users and selectively removes or boosts features, improving inference throughput by 4.2%.
📝 arxiv.org/abs/2511.18331
Alibaba introduces a speculative decoding architecture for generative recommendation that integrates self-drafting with model-free verification, achieving 2.6x speedup.
📝 arxiv.org/abs/2511.18793
Introduces conformal prediction for RAG systems to filter irrelevant context while preserving relevant evidence with statistical guarantees, reducing context size by 2-3x.
📝 arxiv.org/abs/2511.17908
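To make the idea of filtering with a statistical guarantee concrete, here is a minimal sketch of generic split conformal calibration applied to context filtering. It is not the paper's procedure: the function names, the use of a raw relevance score as the conformity measure, and the calibration set of known-relevant passages are all assumptions made for illustration.

```python
import math

def conformal_threshold(calib_scores, alpha):
    """Pick a score cutoff from relevance scores of known-relevant passages.

    Under exchangeability, a new relevant passage scores at or above the
    returned cutoff with probability at least 1 - alpha.
    (Hypothetical helper; generic split conformal prediction.)
    """
    n = len(calib_scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > n:
        # Too few calibration points for this alpha: keep everything.
        return float("-inf")
    # Nonconformity is the negated relevance score (low score = unusual).
    nonconformity = sorted(-s for s in calib_scores)
    return -nonconformity[k - 1]

def filter_context(passages, scores, threshold):
    """Drop retrieved passages whose relevance score falls below the cutoff."""
    return [p for p, s in zip(passages, scores) if s >= threshold]

calib = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
cutoff = conformal_threshold(calib, alpha=0.5)
kept = filter_context(["a", "b", "c"], [0.9, 0.5, 0.1], cutoff)
```

A smaller `alpha` keeps more relevant evidence at the cost of a looser filter, so the context reduction achieved in practice depends on how well the scorer separates relevant from irrelevant passages.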
Introduces a token-augmented re-ranking framework that empowers users to steer recommendations with precise, attribute-based control while maintaining competitive ranking performance.
📝 arxiv.org/abs/2511.17913
Pinterest introduces a lightweight framework for modeling user revisitation behavior that led to a 0.1% lift in active users.
📝 arxiv.org/abs/2511.18013
Netflix proposes reasoning strategies that leverage LLMs to address cold-start recommendation challenges, outperforming their production ranking model by up to 8% in certain cases.
📝 arxiv.org/abs/2511.18261
Introduces a framework that instantiates similar users and relevant items as LLM agents with unique profiles for collaborative filtering in agentic systems.
📝 arxiv.org/abs/2511.18413
Apple presents a unified framework that performs embedding-based compression and joint optimization in a shared continuous space for retrieval-augmented generation.
📝 arxiv.org/abs/2511.18659
Alibaba introduces a unified token-based ranking framework that tackles representation and computational bottlenecks in recommendation systems.
📝 arxiv.org/abs/2511.18805
Evaluates cross-lingual retrieval interventions, finding that multilingual dense retrieval models outperform lexical methods and contrastive learning improves encoders' alignment.
📝 arxiv.org/abs/2511.19324
Evaluates multilingual LLMs for cross-lingual query expansion, finding that query length determines effective prompting techniques and fine-tuning benefits depend on data similarity.
📝 arxiv.org/abs/2511.19325
Improves HyDE's effectiveness by applying traditional feedback models like Rocchio to weight expansion terms from LLM-generated hypothetical documents for BM25 retrieval.
📝 arxiv.org/abs/2511.19349
👨🏽💻 github.com/nourj98/hyde...
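As a rough illustration of Rocchio-style term weighting over LLM-generated hypothetical documents, the sketch below combines query term counts with averaged term frequencies from the generated texts. It is a simplified positive-feedback Rocchio, not the paper's implementation: the function name, whitespace tokenization, and the `alpha`, `beta`, and `top_k` defaults are assumptions, and the resulting weights would still need to be passed to a BM25 engine that supports weighted query terms.

```python
from collections import Counter

def rocchio_expand(query, hypothetical_docs, alpha=1.0, beta=0.75, top_k=10):
    """Return the top_k (term, weight) pairs for an expanded query.

    weight(t) = alpha * tf_query(t) + beta * mean over docs of tf_doc(t)
    (Hypothetical helper; whitespace tokenization for brevity.)
    """
    weights = {t: alpha * c for t, c in Counter(query.lower().split()).items()}
    n = len(hypothetical_docs)
    for doc in hypothetical_docs:
        for term, tf in Counter(doc.lower().split()).items():
            # Each generated document contributes 1/n of its term frequency.
            weights[term] = weights.get(term, 0.0) + beta * tf / n
    # Sort by descending weight, breaking ties alphabetically for stability.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

expanded = rocchio_expand(
    "solar power",
    ["solar panels convert sunlight", "solar energy power"],
    alpha=1.0,
    beta=0.5,
    top_k=3,
)
```

Terms that appear in both the query and the generated documents end up with the highest weights, while terms found only in a single hypothetical document contribute a small amount, which limits the effect of hallucinated vocabulary.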
Dynamically adapts retrieval resolution per query, reducing memory usage by 1.8x and accelerating search by 5.7x.
📝 arxiv.org/abs/2511.16681
👨🏽💻 github.com/FastLM/SPI_V...
Dynamically prunes less informative semantic tokens in generative recommender systems, reducing training time by 26.7%.
📝 arxiv.org/abs/2511.16943
👨🏽💻 github.com/Yuzt-zju/RASTP