IR-for-Good Paper Session III

Session Information

  • AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval
  • Extending Logic Tensor Networks to Implicit Feedback for Representation-Aware Music Recommendation
  • Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR
  • One LLM to Train Them All: A Multi-Task Learning Framework for Fact-Checking
  • How Information Retrieval Systems Construct and Amplify Immigration Narratives
  • Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety
  • Integrating AI and IR paradigms for sustainable and trustworthy accurate access to large scale Biomedical information
  • Debiasing CLIP with Neural Interventions
Mar 31, 2026 10:30 - 12:30 (Europe/Amsterdam)
Venue : Chemie
Conference : ECIR2026
Contact : conference-secretariat@blueboxevents.nl

Sub Sessions

AgriIR: A Scalable Framework for Domain-Specific Knowledge Retrieval

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
This paper introduces AgriIR, a configurable retrieval-augmented generation (RAG) framework designed to deliver grounded, domain-specific answers while maintaining flexibility and low computational cost. Instead of relying on large, monolithic models, AgriIR decomposes the information access process into declarative modular stages: query refinement, sub-query planning, retrieval, synthesis, and evaluation. This design allows practitioners to adapt the framework to new knowledge verticals without modifying the architecture. Our reference implementation targets Indian agricultural information access, integrating 1B-parameter language models with adaptive retrievers and domain-aware agent catalogues. The system enforces deterministic citation, integrates telemetry for transparency, and includes automated deployment assets to ensure auditable, reproducible operation. By emphasizing architectural design and modular control, AgriIR demonstrates that well-engineered pipelines can achieve domain-accurate, trustworthy retrieval even under constrained resources. We argue that this approach exemplifies "AI for Agriculture" by promoting accessibility, sustainability, and accountability in retrieval-augmented generation systems.
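The staged decomposition described in this abstract can be pictured with a minimal sketch. Everything below is hypothetical and is not taken from the AgriIR codebase: the `State` fields, the stage functions, the toy `CORPUS`, and the keyword-overlap retriever are assumptions chosen only to show how a declarative list of stages, bracketed citations, and a simple trace log might fit together.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical state object handed from stage to stage (illustrative names).
@dataclass
class State:
    query: str
    sub_queries: List[str] = field(default_factory=list)
    passages: List[Dict[str, str]] = field(default_factory=list)
    answer: str = ""
    trace: List[str] = field(default_factory=list)  # minimal telemetry log

Stage = Callable[[State], State]

# Toy corpus invented for this example only.
CORPUS = [
    {"id": "doc1", "text": "rice needs standing water during early growth"},
    {"id": "doc2", "text": "wheat is sown in winter"},
]

def refine(state: State) -> State:
    # Query refinement: normalise whitespace and case.
    state.query = " ".join(state.query.split()).lower()
    return state

def plan(state: State) -> State:
    # Sub-query planning: naive split on the word "and".
    state.sub_queries = [q.strip() for q in state.query.split(" and ") if q.strip()]
    return state

def retrieve(state: State) -> State:
    # Retrieval: keep documents sharing at least one term with a sub-query.
    seen = set()
    for sub_query in state.sub_queries:
        terms = set(sub_query.split())
        for doc in CORPUS:
            if doc["id"] not in seen and terms & set(doc["text"].split()):
                seen.add(doc["id"])
                state.passages.append(doc)
    return state

def synthesize(state: State) -> State:
    # Synthesis: every passage used is cited by id, in retrieval order.
    state.answer = " ".join(f'{p["text"]} [{p["id"]}]' for p in state.passages)
    return state

def evaluate(state: State) -> State:
    # Evaluation: record how many passages back the answer.
    state.trace.append(f"cited={len(state.passages)}")
    return state

# The pipeline is just an ordered list, so stages can be swapped per domain.
PIPELINE: List[Stage] = [refine, plan, retrieve, synthesize, evaluate]

def run(query: str, stages: List[Stage] = PIPELINE) -> State:
    state = State(query=query)
    for stage in stages:
        state = stage(state)
        state.trace.append(stage.__name__)
    return state

result = run("Rice water  and wheat sowing")
print(result.answer)
print(result.trace)
```

Because the stage order lives in a plain list, adapting the sketch to another domain would mean replacing individual stage functions rather than changing `run`.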
Presenters
Shuvam Banerji Seal, Indian Institute of Science Education and Research - Kolkata
Co-Authors
Aheli Poddar, Institute of Engineering & Management, Kolkata
Alok Mishra, Indian Institute of Science Education and Research - Kolkata
Dwaipayan Roy, Assistant Professor, Indian Institute of Science Education and Research Kolkata

Extending Logic Tensor Networks to Implicit Feedback for Representation-Aware Music Recommendation

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
Music recommender systems shape how people discover music, yet persistent concerns have been raised regarding fairness and representation. Achieving fairness in recommender systems is challenging because conventional methods rely on rigid quantitative criteria, making it difficult to express nuanced or socially informed fairness goals. We explore the use of Logic Tensor Networks (LTNs) to incorporate nuanced fairness constraints into music recommender systems. LTNs enable the formulation of soft, differentiable constraints in a specific first-order logic, allowing fairness to be expressed through expert knowledge or data-driven insights. We make two main contributions. First, we extend an existing LTN-based recommender framework to the implicit-feedback setting. Second, we propose a procedure leveraging the extended framework to integrate data-informed fairness regularization into matrix factorization (MF)-based music recommendation. We demonstrate the effectiveness of the proposed procedure with a case study on country-level representation bias in music recommendation, where content from hegemonic markets (e.g., the U.S.) is often overrepresented while local music is underexposed. Our analysis reveals that this imbalance disproportionately affects users with high local mainstreaminess (those who prefer music popular within their own country) and low global mainstreaminess (those who prefer less globally popular music). Using LTNs, we design targeted, data-informed fairness constraints and show that our approach mitigates these disparities while maintaining competitive recommendation quality.
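To make the idea of a soft logical constraint concrete, here is a minimal sketch of how a rule such as "users with high local mainstreaminess should receive local music" could be scored with fuzzy connectives. The abstract does not state the paper's actual constraints or operators; the Reichenbach implication, the generalized-mean quantifier, and the names `implies`, `forall`, and `fairness_penalty` are assumptions chosen for illustration. In a real system these would be computed on tensors in an autodiff framework so the penalty can be added to the MF training loss.

```python
from typing import Sequence

def implies(a: float, b: float) -> float:
    """Reichenbach fuzzy implication, 1 - a + a*b, for truth values in [0, 1]."""
    return 1.0 - a + a * b

def forall(truths: Sequence[float], p: float = 2.0) -> float:
    """Soft universal quantifier: 1 minus the generalized mean of the errors."""
    mean_err = sum((1.0 - t) ** p for t in truths) / len(truths)
    return 1.0 - mean_err ** (1.0 / p)

def fairness_penalty(local_mainstream: Sequence[float],
                     local_exposure: Sequence[float]) -> float:
    """Penalty in [0, 1] for violating 'LocalMainstream(u) -> ExposedToLocal(u)'.

    Both inputs hold one truth value in [0, 1] per user (hypothetical predicates).
    """
    satisfaction = forall(
        [implies(a, b) for a, b in zip(local_mainstream, local_exposure)]
    )
    return 1.0 - satisfaction

# Toy truth values invented for illustration only.
low_exposure = fairness_penalty([0.9, 0.8], [0.2, 0.1])
high_exposure = fairness_penalty([0.9, 0.8], [0.8, 0.9])
print(low_exposure > high_exposure)
```

The penalty shrinks as locally mainstream users receive more local content, which is the behaviour a fairness regularizer of this kind is meant to reward.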
Presenters
Hannah Eckert, PhD Student, Johannes Kepler University Linz

Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
This work bridges information retrieval and cultural analytics to support equitable access to historical knowledge. Using the British Library's BL19 digital collection (more than 35,000 works from 1700-1899), we construct a benchmark for studying language change and retrieval in 19th-century fiction and non-fiction. Our approach combines expert-driven query design, paragraph-level relevance annotation, and Large Language Model (LLM) assistance to create a scalable evaluation framework grounded in human expertise. Central to our investigation is knowledge transfer from fiction to non-fiction, examining how narrative understanding and semantic richness in fiction can enhance retrieval performance for scholarly and factual materials. This interdisciplinary framework not only improves retrieval accuracy but also fosters interpretability, transparency, and cultural inclusivity in digital archives. Our work provides both practical evaluation resources and a methodological paradigm for developing retrieval systems that support richer, historically aware engagement with digital archives, ultimately working towards more emancipatory knowledge infrastructures.
Presenters
Suchana Datta, Postdoctoral Research Fellow, University College Dublin
Co-Authors
Dwaipayan Roy, Assistant Professor, Indian Institute of Science Education and Research Kolkata
Derek Greene, University College Dublin
Gerardine Meaney, University College Dublin
Karen Wade, University College Dublin
Philipp Mayr, Team Leader, GESIS Leibniz Institute for the Social Sciences

One LLM to Train Them All: A Multi-Task Learning Framework for Fact-Checking

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
Large language models (LLMs) are reshaping automated fact-checking (AFC) by enabling unified, end-to-end verification pipelines rather than isolated components. While large proprietary models achieve strong performance, their closed weights, complexity, and high costs limit sustainability. Fine-tuning smaller open-weight models for individual AFC tasks can help but requires multiple specialized models, resulting in high costs. We propose multi-task learning (MTL) as a more efficient alternative that trains a single model to perform claim detection, evidence ranking, and stance detection jointly. Using small decoder-only LLMs (e.g., Qwen3-4b), we explore three MTL strategies: classification heads, causal language modeling heads, and instruction-tuning, and evaluate them across model sizes, task orders, and standard non-LLM baselines. While multitask models do not universally surpass single-task baselines, they yield substantial improvements, achieving up to 44%, 54%, and 31% relative gains for claim detection, evidence re-ranking, and stance detection, respectively, over zero-/few-shot settings. Finally, we also provide practical, empirically grounded guidelines to help practitioners apply MTL with LLMs for automated fact-checking.
Presenters
Malin Astrid Larsson, University of Stavanger
Co-Authors
Harald Fosen Grunnaleite, University of Stavanger
Vinay Setty, Associate Professor, University of Stavanger

How Information Retrieval Systems Construct and Amplify Immigration Narratives

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
Information retrieval systems play a central role in how people access and understand information about complex social issues, including immigration. Yet little is known about how the datasets that underpin these systems represent migrants or structure public narratives about migration. In this paper, we investigate how immigration is framed within a widely used IR benchmark and how ranking models shape the visibility of those frames. Using MS MARCO as our data source, we curate immigration-related queries and annotate retrieved passages using a migration-specific framing taxonomy grounded in social-science research. Our goal is to identify which narratives dominate and to measure how different retrieval models influence their exposure. We find that legality and security frames are far more common than humanitarian or inclusive ones, and that neural reranking amplifies exclusionary portrayals compared to sparse retrieval.
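One way to quantify how a ranking shapes the visibility of frames is a rank-discounted exposure share, sketched below. The abstract does not specify the exposure measure used in the paper; the logarithmic discount (borrowed from DCG), the function name `frame_exposure`, and the toy frame labels are all assumptions made for illustration, not data or results from the study.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence

def frame_exposure(ranked_frames: Sequence[str]) -> Dict[str, float]:
    """Share of rank-discounted exposure per frame label.

    ranked_frames[i] is the frame annotated on the passage at rank i + 1.
    Each rank contributes a weight of 1 / log2(rank + 1), as in DCG.
    """
    if not ranked_frames:
        return {}
    totals: Dict[str, float] = defaultdict(float)
    for rank, frame in enumerate(ranked_frames, start=1):
        totals[frame] += 1.0 / math.log2(rank + 1)
    norm = sum(totals.values())
    return {frame: weight / norm for frame, weight in totals.items()}

# Two orderings of the same four toy passages (labels invented for this example).
first_stage = ["humanitarian", "security", "legality", "humanitarian"]
reranked = ["security", "legality", "humanitarian", "humanitarian"]

before = frame_exposure(first_stage)
after = frame_exposure(reranked)
print(before["humanitarian"] > after["humanitarian"])
```

Comparing the shares across two rankers over the same candidate set isolates the effect of ordering, since the set of frames is held fixed and only their positions change.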
Presenters
Zarif Masud, PhD Student, Toronto Metropolitan University
Co-Authors
Abhijit Paul, Fresh Grad, University of Dhaka
Syed Ishtiaque Ahmed, University of Toronto
Ebrahim Bagheri, University of Toronto

Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
Machine Translation (MT) plays a pivotal role in cross-lingual information access, public policy communication, and equitable knowledge dissemination. However, critical meaning errors, such as factual distortions, intent reversals, or biased translations, can undermine the reliability, fairness, and safety of multilingual systems. In this work, we explore the capacity of instruction-tuned Large Language Models (LLMs) to detect such critical errors, evaluating models across a range of scales (e.g., GPT-4o-mini, LLaMA 3.1 8B, LLaMA 3.3 70B, and GPT-OSS 20B/120B) using WMT-21, WMT-22, and a curated SynCED benchmark. Our findings show that model scaling and adaptation strategies (zero-shot, few-shot, fine-tuning) yield consistent improvements, outperforming encoder-only baselines like XLM-R and ModernBERT. We argue that improving critical error detection in MT contributes to safer, more trustworthy, and socially accountable information systems by reducing the risk of disinformation, miscommunication, and linguistic harm, especially in high-stakes or underrepresented contexts. This work positions error detection not merely as a technical challenge, but as a necessary safeguard in the pursuit of just and responsible multilingual AI.
Presenters
Muskaan Chopra, Rheinische Friedrich-Wilhelms-Universität Bonn

Integrating AI and IR paradigms for sustainable and trustworthy accurate access to large scale Biomedical information

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
In high-stakes domains such as health and biology, information retrieval systems must ensure accuracy while also supporting equitable access and protecting sensitive data. However, many state-of-the-art biomedical IR solutions rely on proprietary cloud infrastructures, raising concerns over cost, reproducibility, and patient privacy. We present a fully open-source retrieval-augmented question answering framework that accurately manages QA against the entire PubMed collection (over 38M documents) using modest, local, consumer-grade hardware. Inspired by BioASQ, our system combines sparse and dense retrieval with a lightweight local LLM for evidence-grounded biomedical QA. Experiments show that strong retrieval quality and real-time performance are achievable without reliance on commercial APIs or large GPU clusters. By reducing infrastructure barriers around on-premises data, this work provides a concrete path toward democratizing trustworthy biomedical IR for hospitals, universities, and healthcare organizations worldwide.
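The abstract says sparse and dense retrieval are combined but does not say how. As one common possibility, the sketch below uses reciprocal rank fusion (RRF), which merges ranked lists using only rank positions. This is an assumed stand-in, not the authors' stated method; the function name, the constant `k = 60`, and the placeholder document ids are illustrative.

```python
from typing import Dict, List, Sequence

def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids; each hit adds 1 / (k + rank) to its score."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Placeholder ids standing in for the outputs of a sparse and a dense retriever.
sparse_hits = ["pmid_a", "pmid_b", "pmid_c"]
dense_hits = ["pmid_b", "pmid_d", "pmid_a"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Rank-based fusion avoids having to calibrate lexical scores against embedding similarities, which is one reason it is a frequent default for hybrid retrieval on modest hardware.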
Presenters
Federico Borazio, University of Rome, Tor Vergata
Co-Authors
Francesco Labbate
Danilo Croce, Associate Professor, University of Rome, Tor Vergata
Roberto Basili

Debiasing CLIP with Neural Interventions

IR for good 10:30 AM - 12:30 PM (Europe/Amsterdam) 2026/03/31 08:30:00 UTC - 2026/03/31 10:30:00 UTC
This paper presents an inference-time method to mitigate demographic bias in CLIP-like vision-language models through targeted neural interventions in their internal attention mechanisms. We first identify "expert" attention heads that encode demographic information by systematically analyzing CLIP's internal representations in response to labeled inputs. At inference, we intervene on these heads, either replacing their activations with demographic prototypes or neutralizing them (zero ablation). We chose to intervene specifically at the CLS token, as it aggregates information globally across image patches and is directly responsible for the final image embedding. Our results across multiple evaluation frameworks show that these targeted interventions can significantly reduce both gender and ethnicity biases in cross-modal retrieval and zero-shot classification, without compromising model performance.
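The two interventions named in this abstract, prototype replacement and zero ablation, can be sketched on a toy representation of per-head CLS activations. The code below is a simplified illustration and not the authors' implementation: it uses plain Python lists instead of model tensors, and the function name, head indices, and prototype values are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

def intervene_cls_heads(
    head_outputs: Sequence[Sequence[float]],
    expert_heads: Sequence[int],
    prototypes: Optional[Dict[int, Sequence[float]]] = None,
) -> List[List[float]]:
    """Return a patched copy of the per-head activations at the CLS token.

    head_outputs[h] is the output vector of attention head h at the CLS position.
    Heads listed in expert_heads are replaced by their prototype when one is
    given, otherwise set to zeros (zero ablation). All other heads are untouched.
    """
    patched = [list(vec) for vec in head_outputs]
    for h in expert_heads:
        if prototypes is not None and h in prototypes:
            patched[h] = list(prototypes[h])
        else:
            patched[h] = [0.0] * len(patched[h])
    return patched

# Toy activations for three heads with two dimensions each (invented values).
heads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

zeroed = intervene_cls_heads(heads, expert_heads=[1])
swapped = intervene_cls_heads(heads, expert_heads=[1], prototypes={1: [0.5, 0.5]})
print(zeroed)
print(swapped)
```

In an actual model the patched head outputs would be passed on to the rest of the attention block, so only the selected heads' contribution to the final image embedding changes.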
Presenters
Amelia Gomez, PhD Student, Computer Vision Center
Co-Authors
Jordi Gonzalez
Lluis Gomez, Researcher, Computer Vision Center