Specialized Retrieval Domains & Architectures

Session Information

Filtering Few-Level Segment Regions for Efficient Subsequence Search in 3D Human Motions
Starbucks: Improved Training for 2D Matryoshka Embeddings
Website Segmentation Beyond Structure: A Benchmark on Functional and Digital Maturity Classes

Full Papers

Mar 30, 2026 14:30 - 15:30(Europe/Amsterdam)

Venue : Centrale (Plenary Room)

ECIR2026, contact: conference-secretariat@blueboxevents.nl

Sub Sessions

Filtering Few-Level Segment Regions for Efficient Subsequence Search in 3D Human Motions

Full papers, Applications, Search and ranking
02:30 PM - 03:30 PM (Europe/Amsterdam)
2026/03/30 12:30:00 UTC - 2026/03/30 13:30:00 UTC

Efficient localization of query-similar subsequences in a database of untrimmed 3D human motion data is crucial to applications in numerous domains. We propose a novel subsequence search approach that partitions untrimmed database motions into segments across a few levels to accommodate variably-sized queries, addressing the limitations of single- and many-level state-of-the-art methods. By determining a deep similarity between the query and database segments, we identify larger regions within the database motions that are likely to contain query-similar subsequences. These regions are then examined closely to determine the precise location of relevant subsequences, also accounting for variations in execution speed. While this approach contributes to high retrieval quality, it also incurs high search costs. Therefore, we propose two filtering techniques that further decrease the number of examined subsequences by more than an order of magnitude on a newly established benchmark across four challenging PKU-MMD sub-datasets.
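
The few-level partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the segment lengths, the 50% overlap, and the function names `segment_levels` and `pick_level` are assumptions chosen for illustration, and the deep similarity and filtering steps are omitted.

```python
def segment_levels(num_frames, level_lengths=(32, 64, 128), overlap=0.5):
    """Partition one untrimmed motion into overlapping segments per level.

    Returns {segment_length: [(start, end), ...]} with `end` exclusive.
    Lengths and overlap are illustrative assumptions, not values from the paper.
    """
    levels = {}
    for length in level_lengths:
        step = max(1, int(length * (1 - overlap)))
        last_start = max(0, num_frames - length)
        starts = list(range(0, last_start + 1, step))
        if starts[-1] != last_start:
            starts.append(last_start)  # make sure the tail of the motion is covered
        levels[length] = [(s, min(s + length, num_frames)) for s in starts]
    return levels


def pick_level(query_len, level_lengths=(32, 64, 128)):
    """Choose the level whose segment length is closest to the query length."""
    return min(level_lengths, key=lambda length: abs(length - query_len))


levels = segment_levels(100, level_lengths=(32, 64))
print(levels[64])      # segments of the coarser level
print(pick_level(50))  # level used for a 50-frame query
```

With only a few levels, a query of arbitrary length is compared against segments of the nearest size, and the matching segments mark the larger regions that a finer search would then examine.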

Presenter: Andrej Černek

Starbucks: Improved Training for 2D Matryoshka Embeddings

Full papers, Search and ranking
02:30 PM - 03:30 PM (Europe/Amsterdam)
2026/03/30 12:30:00 UTC - 2026/03/30 13:30:00 UTC

2D Matryoshka training enables a single embedding model to produce sub-network representations across varying layers and embedding dimensions, offering flexibility under different computational and task constraints. However, its performance remains below that of individually trained models of comparable sizes. To address this, we propose Starbucks, a new training strategy for Matryoshka-style embedding models that combines structured fine-tuning with masked autoencoder (MAE) pre-training. During fine-tuning, we compute the loss over a fixed set of layer-dimension pairs, ordered from small to large, which significantly improves over random sub-network sampling and matches the performance of separately trained models. Our MAE-based pre-training further strengthens sub-network representations, providing a more robust backbone for downstream tasks. Experiments on both in-domain (semantic similarity and passage retrieval) and out-of-domain (BEIR) benchmarks show that Starbucks consistently outperforms 2D Matryoshka models and matches or exceeds the performance of individually trained models, while maintaining high efficiency and flexibility. Ablation studies validate our loss design and the benefits of SMAE pre-training, and demonstrate Starbucks' applicability across backbones. We further show that depth- and width-wise Starbucks variants encode complementary information, and that combining them yields further gains with minimal latency overhead via parallelization. Code at https://anonymous.4open.science/r/Starbucks-Official-02E7.
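
The fine-tuning step described above, a loss computed over a fixed small-to-large list of layer-dimension pairs rather than randomly sampled sub-networks, can be sketched roughly as follows. This is an illustration under stated assumptions, not the released Starbucks code: the pair list `PAIRS`, the squared-error loss on cosine similarity, and all function names are hypothetical stand-ins.

```python
import math

# Hypothetical fixed (layer, dimension) pairs, ordered from small to large.
PAIRS = [(2, 32), (4, 64), (6, 128)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sub_embedding(layer_outputs, layer, dim):
    """Take the output of `layer` (1-indexed) and keep its first `dim` values."""
    return layer_outputs[layer - 1][:dim]


def fixed_pair_loss(layers_a, layers_b, target, pairs=PAIRS):
    """Average a stand-in similarity loss over every fixed (layer, dim) pair.

    `layers_a` and `layers_b` hold one embedding vector per layer for two texts;
    `target` is the gold similarity. The squared error is an assumed loss.
    """
    total = 0.0
    for layer, dim in pairs:
        sim = cosine(sub_embedding(layers_a, layer, dim),
                     sub_embedding(layers_b, layer, dim))
        total += (sim - target) ** 2
    return total / len(pairs)


# Toy usage: six layers of 128-dimensional outputs for two identical texts.
layers = [[1.0] * 128 for _ in range(6)]
print(fixed_pair_loss(layers, layers, target=1.0))
```

Because every listed pair contributes to each update, each sub-network is trained at every step, which is the contrast with random sub-network sampling that the abstract draws.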

Website Segmentation Beyond Structure: A Benchmark on Functional and Digital Maturity Classes

Full papers, Evaluation research, Machine Learning and Large Language Models
02:30 PM - 03:30 PM (Europe/Amsterdam)
2026/03/30 12:30:00 UTC - 2026/03/30 13:30:00 UTC

Segmentation is a crucial prerequisite for effective and efficient information retrieval on websites, as it enables the structured interpretation of heterogeneous content. Recently, a novel dataset has been released that provides two complementary segmentation schemes: a broad functional segmentation and a niche segmentation based on website digital maturity. While the former captures general structural elements, the latter targets a more specialized classification task, creating an interesting challenge for state-of-the-art segmentation approaches. In this paper, we present the first comprehensive evaluation of visual and textual models on this dataset, ranging from basic rule-based methods to large language models. We assess their performance across both segmentation frameworks using multiple evaluation scores. Our results show that visual approaches, despite limited training data, are generally more successful at generalizing across website structures and consistently outperform textual models. Notably, ResNet18 achieves the strongest performance in both functional and maturity-based segmentation, which we attribute to its ability to effectively capture and integrate both global and local context of a webpage. These findings establish important baselines for future research and underscore the importance of developing models that can perform robustly in niche settings and under data-scarce conditions.

Session Participants

Session speakers, moderators & attendees

Andrej Černek, PhD student, Masaryk University
Shengyao Zhuang, CSIRO
Jonathan Gerber, Institute of Computer Science, Zurich University of Applied Sciences (ZHAW)
Allan Hanbury, Professor, TU Wien
