IR-for-Good Paper Session II
ECIR 2026 · 2026/03/30, 14:30 - 15:30 (Europe/Amsterdam) · Centrale (Plenary Room)
Contact: n.fontein@tudelft.nl
Measuring Political Stance and Consistency in Large
Language Models
IR for Good · 02:30 PM - 03:30 PM (Europe/Amsterdam) · 2026/03/30 12:30:00 UTC - 2026/03/30 13:30:00 UTC
With the rapid advancement of Large Language Models (LLMs), many people have started using them to satisfy their information needs. However, relying on LLMs can be problematic for political issues, where disagreement is common and model outputs may reflect training-data biases or deliberate alignment choices. To better characterize such behavior, we assess the stances of nine LLMs on 24 politically sensitive issues using five prompting techniques. We find that models often adopt opposing stances on several issues; some positions are malleable under prompting, while others remain stable. Among the models examined, Grok-3-mini is the most persistent, whereas Mistral-7B is the least. For issues involving countries with different languages, models tend to support the side whose language is used in the prompt. Notably, no prompting technique alters model stances on the Qatar blockade or the oppression of Palestinians. We hope these findings raise users' awareness when they seek political guidance from LLMs and encourage developers to address these concerns.
Judiciously Reducing Sub-group Comparisons for Learning
Intersectional Fair Representations
Ensuring fairness in ranking systems is critical to avoid discriminatory outcomes towards minority groups in high-stakes domains such as recruitment. Most fairness interventions address fairness only for one or more binary groups, without accounting for intersectional fairness. We study the problem of achieving intersectional fairness in ranking systems, where individuals may face compounded disadvantages. We adapt and extend existing pre-processing fairness intervention methods to optimize for intersectional group fairness. Importantly, as the number of intersectional sub-groups grows exponentially with the number of attributes, optimization becomes computationally expensive and possibly infeasible. To address this challenge, we propose to reduce the number of sub-group comparisons when optimizing for intersectional fairness, focusing on the highest disparities between sub-groups. Our results show that limiting sub-group comparisons achieves comparable or better intersectional fairness. We validate this on three real-world datasets and a simulated setup designed to test robustness to intersectional fairness challenges.
Maarten de Rijke, Distinguished University Professor, University of Amsterdam
Modeling Behavioral Patterns in News Recommendations Using
Fuzzy Neural Networks
News recommender systems are increasingly driven by black-box models, offering little transparency for editorial decision-making. In this work, we introduce a transparent recommender system that uses fuzzy neural networks to learn human-readable rules from behavioral data for predicting article clicks. By extracting the rules at configurable thresholds, we can control rule complexity and thus the level of interpretability. We evaluate our approach on two publicly available news datasets (MIND and EB-NeRD) and show that it predicts click behavior accurately compared to several established baselines, while learning human-readable rules. Furthermore, we show that the learned rules reveal news consumption patterns, enabling editors to align content curation goals with target audience behavior.
Presenters: Kevin Innerebner, PhD Student, Graz University of Technology
Does Reasoning Make Search More Fair? Comparing Fairness in
Reasoning and Non-Reasoning Rerankers
While reasoning rerankers, such as Rank1, have demonstrated strong abilities in improving ranking relevance, it is unclear how they perform on other retrieval qualities such as fairness. We conduct the first systematic comparison of fairness between reasoning and non-reasoning rerankers. Using the TREC 2022 Fair Ranking Track dataset, we evaluate six reranking models across multiple retrieval settings and demographic attributes. Our findings demonstrate that reasoning neither improves nor harms fairness compared to non-reasoning approaches. Our fairness metric, Attention-Weighted Rank Fairness (AWRF), remained stable (0.33-0.35) across all models, even as relevance varied substantially (nDCG 0.247-1.000). Demographic breakdown analysis revealed fairness gaps for geographic attributes regardless of model architecture. These results indicate that future work in specializing reasoning models to be aware of fairness attributes could lead to improvements, as current implementations preserve the fairness characteristics of their input ranking.
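For readers unfamiliar with the metric reported in this abstract, AWRF-style measures compare the attention-weighted exposure each demographic group receives in a ranking against a target distribution. The sketch below is an illustrative formulation only, not the authors' or TREC's exact implementation: the geometric attention model (`patience`) and the use of total-variation distance are assumptions.

```python
def awrf(group_labels, target_dist, patience=0.5):
    """Illustrative Attention-Weighted Rank Fairness sketch.

    group_labels: group of the item at each rank, top first.
    target_dist:  desired share of exposure per group (sums to 1).
    Returns 1 minus the total-variation distance between the
    attention-weighted group exposure and the target; higher is fairer.
    """
    exposure = {g: 0.0 for g in target_dist}
    attn = 1.0
    for g in group_labels:
        exposure[g] = exposure.get(g, 0.0) + attn
        attn *= patience  # geometric browsing model (assumption)
    total = sum(exposure.values()) or 1.0
    tv = 0.5 * sum(abs(exposure.get(g, 0.0) / total - p)
                   for g, p in target_dist.items())
    return 1.0 - tv
```

Under this sketch, a ranking showing only one group against a 50/50 target scores 0.5, while an alternating ranking scores higher, matching the intuition that AWRF rewards exposure proportional to the target distribution.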