Tutorial Schedule and Location
The ECML PKDD 2025 conference will be held in Porto, Portugal, from 15 to 19 September 2025.
This tutorial will take place on 15 September at 14:00 and will last about 4 hours, including a 30-minute break. The first part presents the library and its metrics. The remaining time is dedicated to hands-on sessions, one for each data modality. A detailed schedule is provided below.
pyMDMA presentation (30 mins) – Luís Rosado
- Introduction
- Target modalities
- Metric taxonomy description
- Available metrics, installation, and contribution
Image Tutorial (60 mins) – Ivo Façoco
- Dataset Presentation
  - Public RGB dataset
- Input Validation
  - Extraction of image quality metrics
  - Distribution analysis of extracted metrics
- Synthetic Validation
  - Synthetic dataset explanation (model used, number of instances, and type of conditioning)
  - Feature extraction with pre-trained models
  - Evaluation of fidelity and diversity concepts
  - Sample selection through quality-based ranking; comparison of best/worst generated examples via metric outputs
Time-Series Tutorial (60 mins) – Maria Russo
- Dataset Presentation
  - Overview of the ECG dataset and its characteristics
- Input Validation
  - Extraction of signal quality metrics
  - Distribution analysis of extracted metrics
- Synthetic Validation
  - Explanation of the synthetic dataset
  - Feature extraction using the Time Series Feature Extraction Library (TSFEL)
  - Evaluation of fidelity and diversity concepts
  - Selection of synthetic samples using metric outputs
Coffee Break (30 mins)
Tabular Tutorial (60 mins) – Pedro Matias
- Dataset Presentation
  - Dataset Loading
    - High-quality public tabular datasets
    - Low-quality public tabular datasets
  - Data Preparation
    - Attribute type detection, encoding, and scaling
    - Visualization through 2D embeddings
- Input Validation
  - Extraction of tabular quality metrics
  - Dataset selection through quality-based global ranking
- Synthetic Validation
  - Synthetic Datasets
    - Description of generative models (traditional vs. deep learning)
    - Visualization of real vs. synthetic data using 2D embeddings
  - Evaluation of fidelity and diversity concepts