Data Challenge Track
The International Conference on Performance Engineering (ICPE) is hosting its fifth edition of the Data Challenge track. We call upon everyone interested to apply approaches and analyses to a common selection of performance datasets. The challenge is open-ended: participants can choose the research questions they find most interesting. The proposed approaches/analyses and their findings are discussed in short papers and presented at the main conference.
In this track, we will provide four different performance datasets. Participants are invited to come up with new research questions and approaches for performance analysis. For their papers, participants must choose one or more datasets from a predefined list derived from prior academic/industry research. Participants are expected to use this year’s datasets to answer their research questions, and report their findings in a four-page challenge paper. If the paper is accepted, participants will be invited to present the results at ICPE 2026 in Italy. Details on the datasets are provided below.
Datasets
This year’s ICPE data challenge is based on four datasets from both academic and industry studies. Each dataset includes performance measurements gathered from either industrial or open-source systems, with each dataset having its own unique data format and content as described in their respective repositories.
- The first dataset is provided by the 2025 ICPE paper “A Dataset of Performance Measurements and Alerts from Mozilla”. It was collected from Mozilla Firefox’s performance testing infrastructure and comprises 5,655 performance time series, 17,989 performance alerts, and detailed annotations of the resulting bugs, collected from May 2023 to May 2024.
Paper: https://dl.acm.org/doi/10.1145/3680256.3721973
Repository: https://zenodo.org/records/15465568
- The second dataset is introduced in the 2025 NSDI paper “GPU-Disaggregated Serving for Deep Learning Recommendation Models at Scale”. It contains a comprehensive trace of GPU-disaggregated serving of Deep Learning Recommendation Models (DLRMs), capturing the operational characteristics of 156 inference services comprising 23,871 inference instances in total: 16,485 CN (CPU Node) inference instances and 7,386 HN (Heterogeneous GPU Node) inference instances.
Paper: https://www.usenix.org/conference/nsdi25/presentation/yang
Repository: https://github.com/alibaba/clusterdata/tree/master/cluster-trace-gpu-v2025
- The third dataset is introduced in the 2025 ICPE paper “Shaved Ice: Optimal Compute Resource Commitments for Dynamic Multi-Cloud Workloads”. It contains normalized and obfuscated hourly data about VM demand in four example Snowflake deployments over a period of three years (2/1/2021 to 1/30/2024). For each hour, the dataset records the VM type, the region, and the number of VMs of that type in use at that time.
Paper: https://dl.acm.org/doi/10.1145/3676151.3719353
Repository: https://github.com/Snowflake-Labs/shavedice-dataset
- The fourth dataset is provided by the EuroSys 2025 paper “TUNA: Tuning Unstable and Noisy Cloud Applications”. It is a collection of benchmarks run on Microsoft Azure Virtual Machine offerings over a period of around 483 days, covering the main components of the VM (except the network). Two end-to-end applications, PostgreSQL and Redis, were also benchmarked.
Paper: https://dl.acm.org/doi/10.1145/3689031.3717480
Repository: https://github.com/Azure/AzurePublicDataset/tree/master/vm-noise-data
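As a starting point, hourly demand data such as the Shaved Ice dataset can be explored with a few lines of pandas. The column names below (`hour`, `vm_type`, `region`, `num_vms`) are illustrative assumptions, not the dataset's actual schema; consult the repository's documentation for the real field names and formats.

```python
import pandas as pd

# Synthetic stand-in for hourly VM-demand records; the real dataset's
# schema may differ (see the Shaved Ice repository for details).
demand = pd.DataFrame({
    "hour": pd.date_range("2021-02-01", periods=6, freq="h").repeat(2),
    "vm_type": ["small", "large"] * 6,
    "region": ["us-east"] * 12,
    "num_vms": [4, 2, 5, 2, 6, 3, 5, 3, 4, 2, 3, 2],
})

# Peak vs. mean concurrent VMs per type: the gap between the two hints
# at how much capacity could be covered by long-term commitments
# versus on-demand provisioning.
summary = demand.groupby("vm_type")["num_vms"].agg(["mean", "max"])
print(summary)
```

The same groupby pattern scales to the full dataset, e.g., grouping by region and VM type over daily or weekly windows.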
Challenge
High-level possible ideas for participants include but are not limited to:
- Tailor visualization techniques to navigate the extensive data generated by systems.
- Beschastnikh et al., 2020: https://doi.org/10.1145/3375633
- Silva et al., 2021: https://doi.org/10.1109/IV53921.2021.00028
- Anand et al., 2020: https://doi.org/10.48550/arXiv.2010.13681
- Develop automated techniques to identify patterns associated with performance degradations.
- Wang et al., 2022: https://dl.acm.org/doi/10.1145/3545008.3545026
- Traini and Cortellessa, 2023: https://doi.org/10.1109/TSE.2023.3266041
- Bansal et al., 2020: https://doi.org/10.1145/3377813.3381353
- Evaluate existing or novel root cause analysis techniques.
- Noferesti et al., 2024: https://doi.org/10.1016/j.jss.2024.112117
- Mariani et al., 2018: https://doi.org/10.1109/ICST.2018.00034
- Ma et al., 2020: https://doi.org/10.1145/3366423.3380111
- Model system performance using machine learning algorithms.
- Xiong et al., 2013: https://doi.org/10.1145/2479871.2479909
- Liao et al., 2020: https://doi.org/10.1007/s10664-020-09866-z
- Replicate a prior study or approach on a selected dataset.
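To make the degradation-detection idea concrete, the sketch below flags step changes in a benchmark time series with a simple sliding-window mean-shift heuristic. This is a deliberate simplification for illustration only; it is not the alerting logic used by Mozilla's infrastructure or any of the cited approaches.

```python
import statistics

def detect_step_changes(series, window=5, threshold=2.0):
    """Flag indices where the mean of the next `window` points deviates
    from the mean of the previous `window` points by more than
    `threshold` times the trailing standard deviation.

    A deliberately simple mean-shift heuristic; real alerting systems
    use more robust change-point statistics.
    """
    alerts = []
    for i in range(window, len(series) - window):
        before = series[i - window:i]
        after = series[i:i + window]
        spread = statistics.stdev(before) or 1e-9  # avoid div-by-zero
        if abs(statistics.mean(after) - statistics.mean(before)) > threshold * spread:
            alerts.append(i)
    return alerts

# Synthetic benchmark timings with a regression injected at index 10.
timings = [100, 101, 99, 100, 102, 100, 99, 101, 100, 100,
           120, 121, 119, 120, 122, 120, 119, 121, 120, 120]
print(detect_step_changes(timings))
```

Because every window overlapping the shift is flagged, consecutive alerts around the true change point are expected; a real pipeline would cluster them into a single regression event.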
Submission
A challenge paper should outline the findings of your research: start with an introduction to the problem tackled and its relevance to the field; detail the datasets used, the methods and tools applied, and the results achieved; discuss the implications of your findings; and highlight the paper's contributions and their importance.
To maintain clarity and consistency across submissions, authors are required to specify which dataset (or portion thereof) they used when detailing methodologies or presenting findings, and to be precise in their references to the datasets.
We highly encourage including the solution’s source code with the submission (e.g., in a permanent repository such as Zenodo, potentially linked to a GitHub repository as described here), but this is not mandatory for acceptance of a data challenge paper.
The page limit for challenge papers is 4 pages (including all figures and tables) + 1 page for references. Challenge papers will be published in the companion to the ICPE 2026 proceedings. All challenge papers will be reviewed by the program committee members. Note that submissions to this track are double-blind: for details, see the Double Anonymized FAQ page. The best data challenge paper will receive an award selected by the track chairs and the program committee members.
Submissions must be made via HotCRP by selecting the respective track.
The submission deadline can be found here.