2026-03-30, 14:30 - 15:30 (Europe/Amsterdam)
Reproducibility I: Recommender Systems
Papers in this session:
Are Multimodal Embeddings Truly Beneficial for Recommendation? A Deep Dive into Whole vs. Individual Modalities
RecRankerEval: A Reproducible Framework for Deploying and Evaluating LLM-based Top-$k$ Recommenders
Efficient Optimization of Hierarchical Identifiers for Generative Recommendation
A Reproducible and Fair Evaluation of Partition-aware Collaborative Filtering
A Systematic Reproducibility Study of BSARec for Sequential Recommendation
Chaos · ECIR2026 · n.fontein@tudelft.nl

Are Multimodal Embeddings Truly Beneficial for
Recommendation? A Deep Dive into Whole vs. Individual
Modalities
Reproducibility
02:30 PM - 03:30 PM (Europe/Amsterdam), 2026/03/30 12:30:00 UTC - 2026/03/30 13:30:00 UTC
Multimodal recommendation has emerged as a mainstream paradigm, typically leveraging text and visual embeddings extracted from pre-trained models such as Sentence-BERT, Vision Transformers, and ResNet. This approach is founded on the intuitive assumption that incorporating multimodal embeddings can enhance recommendation performance. However, despite its popularity, this assumption lacks comprehensive empirical verification. This presents a critical research gap. To address it, we pose the central research question of this paper: Are multimodal embeddings truly beneficial for recommendation?