Monday
9:00 – 10:00
Keynote (plenary)
10:30 – 12:30
Full Papers 1: Core Retrieval Models, Representations & Evaluation
- Sample-Free Almost-Exact Estimation of Plackett-Luce Propensities for Off-Policy Ranking
- Validating Search Query Simulations: A Taxonomy of Measures
- Reducing Human Effort to Validate LLM Relevance Judgements via Stratified Sampling
- Revealing MonoT5's Learning Mechanisms via Prompt-Token Adaptation
- When Reducing Representations Improves Performance
- An Empirical Study of Model Casing in Learned Sparse Retrieval
- Improving Instruction-Aware Retrieval with Query-Preserving Regularization
Full Papers 2: Applied Generation, Evaluation & Analysis with LLMs
- Contradictions in Context: Challenges for Retrieval-Augmented Generation in Healthcare
- Small Models, Big Picture! A Language Model Augmentation for Enhanced Reader-Aware Summarization
- From Comments to Conclusions: Adaptive Reader-Aware Summary Generation in Low-Resource Languages via Agent Debate
- Prompt Compression in the Wild: Measuring Latency, Rate Adherence, and Quality for Faster LLM Inference
- Towards Quantitative Summarization Evaluation: An Integrated Atomic-Based Evaluation Framework and Dataset for Text Summarization
- ExpertMix: Aspect and Severity Detection in Conversational Complaints
- MemTool: Optimizing Short-Term Memory Management for Dynamic Tool Retrieval and Invocation in LLM Agent Multi-Turn Conversations
IR4Good 1: IR-for-Good Paper Session I
- From Engagement to Empowerment: A Capability-Theoretic Rethinking of Recommender Systems
- Bias in Book Recommendation: A Case Study on the Danish Public Libraries
- How Do LLMs Cite? A Mechanistic Interpretation of Attribution in RAG
- All That Matters: Revisiting Children's Concept of Relevance in Primary School Context
- When Attention Becomes Exposure in Generative Search
- Counterfactual Understanding via Retrieval-Aware Multimodal Modeling for Time-to-Event Survival Prediction
- Joint Modeling of Candidate and Recruiter Preferences for Fair Two-Sided Job Matching
14:30 – 15:30
Full Papers 3: Specialized Retrieval Domains & Architectures
- Filtering Few-Level Segment Regions for Efficient Subsequence Search in 3D Human Motions
- Starbucks: Improved Training for 2D Matryoshka Embeddings
- Website Segmentation Beyond Structure: A Benchmark on Functional and Digital Maturity Classes
Reproducibility 1: Recommender Systems
- Are Multimodal Embeddings Truly Beneficial for Recommendation? A Deep Dive into Whole vs. Individual Modalities
- RecRankerEval: A Reproducible Framework for Deploying and Evaluating LLM-Based Top-k Recommenders
- Efficient Optimization of Hierarchical Identifiers for Generative Recommendation
- A Reproducible and Fair Evaluation of Partition-Aware Collaborative Filtering
- A Systematic Reproducibility Study of BSARec for Sequential Recommendation
IR4Good 2: IR-for-Good Paper Session II
- Measuring Political Stance and Consistency in Large Language Models
- Judiciously Reducing Sub-Group Comparisons for Learning Intersectional Fair Representations
- Modeling Behavioral Patterns in News Recommendations Using Fuzzy Neural Networks
- Does Reasoning Make Search More Fair? Comparing Fairness in Reasoning and Non-Reasoning Rerankers
16:00 – 17:00
Findings Lightning Talks
Reproducibility 2: Retrieval
- Fast, Compact, Dynamic Indexing for Learned Sparse Retrieval Systems
- Down with the Hierarchy: The 'H' in HNSW Stands for "Hubs"
- Multivector Reranking in the Era of Strong First-Stage Retrievers
- Temporal Fact Conflicts in LLMs: Reproducibility Insights from Unifying DYNAMICQA and MULAN
IR4Good Invited Talks and Panel
Tuesday
9:00 – 10:00
Keynote – IR4Good (plenary)
10:30 – 12:30
Full Papers 4: LLMs as Rankers, Rerankers & Judges
- Training-Induced Bias Toward LLM-Generated Content in Dense Retrieval
- OrLog: Resolving Complex Queries with LLMs and Probabilistic Reasoning
- LLM-based Listwise Reranking Under the Effect of Positional Bias
- RerAnchor: Anchoring Important Context in Multi-Modal Document Reranking
- How Role-Play Shapes Relevance Judgment in Zero-Shot LLM Rankers
- Influential Training Data Retrieval for Explaining Verbalized Confidence of LLMs
- LANCER: LLM Reranking for Nugget Coverage
Full Papers 5: RAG: Retrieval Utility, Scaling & Infrastructure
- Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias
- Utilizing Metadata for Better Retrieval-Augmented Generation
- Predicting Retrieval Utility and Answer Quality in Retrieval-Augmented Generation
- Open Web Indexes for Remote Querying
- LURE-RAG: Lightweight Utility-Driven Reranking for Efficient RAG
- Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets
- Less LLM, More Documents: Searching for Improved RAG
IR4Good 3: IR-for-Good Paper Session III
- AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval
- Extending Logic Tensor Networks to Implicit Feedback for Representation-Aware Music Recommendation
- Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR
- One LLM to Train Them All: A Multi-Task Learning Framework for Fact-Checking
- How Information Retrieval Systems Construct and Amplify Immigration Narratives
- Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety
- Integrating AI and IR Paradigms for Sustainable and Trustworthy Accurate Access to Large Scale Biomedical Information
- Debiasing CLIP with Neural Interventions
14:30 – 16:00
Full Papers 6: Multimodal Retrieval & Embeddings
- Event-Aware Video Corpus Moment Retrieval
- Scalable Music Cover Retrieval Using Lyrics-Aligned Audio Embeddings
- Image Complexity-Aware Adaptive Retrieval for Efficient Vision-Language Models
- Cross-Sensory Brain Passage Retrieval: Scaling Beyond Visual to Audio
- Learning Audio–Visual Embeddings with Inferred Latent Interaction Graphs
Full Papers 7: Trustworthy and Responsible Retrieval-Augmented Systems
- Learned Hallucination Detection in Black-Box LLMs Using Token-Level Entropy Production Rate
- FACTUM: Mechanistic Detection of Citation Hallucination in Long-Form RAG
- SUMMIR: A Hallucination-Aware Framework for Ranking Sports Insights from LLMs
- Bribery-Resistant Ranking Systems: A Multipartite User-Agnostic Framework for AI Act Compliance
- RAC: Retrieval-Augmented Clarification for Faithful Conversational Search
Resource 1: Interactive and Conversational Search
- WildClaims: Conversational Information Access in the Wild(Chat)
- LISP – A Rich Interaction Dataset and Loggable Interactive Search Platform
- UserSimCRS v2: Simulation-Based Evaluation for Conversational Recommender Systems
- Sim4IA-Bench: A User Simulation Benchmark Suite for Next Query and Utterance Prediction
- Beyond the Click: A Framework for Inferring Cognitive Traces in Search
Evening Banquet
Wednesday
9:00 – 10:00
Keynote – KvR (plenary)
10:30 – 12:30
IRRJ Papers
CLEF Tracks Presentations
Resource 2: Domain- and Language-Specific Datasets
- FaE: A Resource of Logs, Profiles, and Rankings for Academic Expert Finding
- SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation
- FoodNexus: Massive Food Knowledge for Recommender Systems
- pt-image-ir-dataset: An Image Retrieval Dataset in European Portuguese
- CitiLink-Minutes: A Multilayer Annotated Dataset of Municipal Meeting Minutes
- ClaimPT: A Portuguese Dataset of Annotated Claims in News Articles
- BioGraphletQA: Knowledge-Anchored Generation of Complex Question Answering Datasets
14:30 – 16:00
Full Papers 8: Recommendation Systems & LLMs
- From What to Why: Thought-Space Recommendation with Small Language Models
- Post-Training Denoising of User Profiles with LLMs in Collaborative Filtering Recommendation
- PromptHG: Prompt-Enhanced Heterogeneous Graph for Personalized News Recommendation
- Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation
- Improving Conversational Recommendation with Contextual Adaptation of External Recommenders and LLM-Based Reranking
No session
Resource 3: Evaluation Tooling for Retrieval and RecSys
- CoRECT: A Framework for Evaluating Embedding Compression Techniques at Scale
- GREAT: Group Recommender Evaluation and Analysis Tool
- Evaluating the Efficiency and Effectiveness of Learned Sparse Retrieval with the lsr_benchmark
- An Open SERP Mining Infrastructure for the Archive Query Log
- RoutIR: Fast Serving of Retrieval Pipelines for Retrieval-Augmented Generation
16:00 – 16:30
Closing Session (plenary)