An Expected Goals Model for Analyzing a 5-a-Side Soccer for the Blind Using Ten Machine Learning Algorithms with SHAP Interpretability
Boryi A. Becerra-Patiño, Rodrigo Yáñez-Sepúlveda, José Pino-Ortega
July 3, 2026

Abstract

Background: Expected goals (xG) models enable quantitative analysis in conventional sports, but they have seen very little application in the Paralympic context.

Objective: To present a trained expected goals model for 5-a-side blind soccer based on an analysis of 164 offensive plays by the national team that won first place at the 2022 IBSA Copa América. To our knowledge, this is the first xG model developed for Paralympic blind football (B1). Conventional xG weights cannot be transferred directly because shooting in F5 is governed by auditory orientation, the absence of an offside rule, a smaller rebound-walled pitch, and fully blind shooters, so a sport-specific, reproducible, and SHAP-interpretable benchmark is required where none previously existed.

Materials and Methods: The SHapley Additive exPlanations (SHAP) library was used to analyze the data via partial dependence plots, dependence scatter plots, waterfall plots, decision plots, and SHAP heatmaps. Ten machine learning algorithms were compared (logistic regression, random forest, extra trees, gradient boosting, XGBoost, LightGBM, CatBoost, support vector machine, k-nearest neighbors, and multilayer perceptron) using a stratified 70/30 split, with fivefold stratified cross-validation to tune the main hyperparameters.

Results: The most consistent model was CatBoost (F1 = 0.778; AUC-ROC = 0.913; AUC-PR = 0.828; MCC = 0.729; Brier = 0.072), which allowed for independent analysis and evaluation of the dataset. The five main offensive variables were (i) distance to the goal before the shot; (ii) lateral coordinate; (iii) absolute magnitude of the shooting angle; (iv) magnitude of the progression vector; and (v) proximity to the side kickboard. However, none of these variables proved to be decisive in the tournament (n = 24), a characteristic that the model captured as a significant negative contribution from the opponent variable.

Conclusions: The expected goals model developed in this study serves as a starting point for further analysis of tactical variables in 5-a-side soccer for the blind. Because the model was trained on a single team in a single tournament with few positive cases, these results should be read as preliminary, hypothesis-generating tactical insights rather than validated performance estimates, and they require external validation before transfer to other teams or competitions.
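To make the reported evaluation metrics concrete, the following is a minimal, standard-library sketch of how a stratified 70/30 split and three of the threshold or probability metrics named above (F1, MCC, Brier score) can be computed. It is not the authors' pipeline: the function names, the 0.5 decision threshold, the random seed, and the toy labels and probabilities are all assumptions introduced for illustration, and no value below comes from the study's data.

```python
import math
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    # Toy data (invented): 1 = goal, 0 = no goal, with predicted xG values.
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    toy_xg = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1, 0.1, 0.3, 0.2, 0.05]

    train_idx, test_idx = stratified_split(toy_labels)
    print("train/test sizes:", len(train_idx), len(test_idx))

    tp, fp, tn, fn = confusion_counts(toy_labels, toy_xg)
    print("F1:", round(f1_score(tp, fp, fn), 3))
    print("MCC:", round(mcc(tp, fp, tn, fn), 3))
    print("Brier:", round(brier_score(toy_labels, toy_xg), 3))
```

The Brier score is computed directly on probabilities, so it reflects calibration, whereas F1 and MCC depend on the chosen threshold. With few positive cases, as the abstract notes, MCC and AUC-PR tend to be more informative than accuracy, which is one plausible reason for reporting them together.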

IPC Classification

G06C07

Keywords

expected, goals, model, analyzing, 5-a-side, soccer, blind, machine, learning, algorithms, SHAP, interpretability, data, background, currently, goal, models, tools, enable, quantitative, analysis, conventional, sports, although