How Learnable Is LP Truck Dispatch? A Multi-Model Behavioural Cloning Benchmark of 983,000 Industrial Dispatch Cycles
Muhammet Mustafa Kahraman
1 July 2026

Abstract

This paper presents a large-scale behavioural-cloning benchmark of linear-programming (LP) truck dispatch in commercial open-pit mining. It quantifies how much of the LP policy is recoverable from observable cycle records and which learning models recover it. Drawing on 983,025 LP dispatch decisions across four operational years at a large copper mine, the study compares four learned model families: a 72,681-parameter multilayer perceptron (MLP), Random Forest, LightGBM, and XGBoost. All four are evaluated against five non-parametric baselines under an identical, strictly causal feature set and a strict temporal hold-out (Year 5). Gradient-boosted trees recover substantially more of the LP policy than the MLP: XGBoost attains 41.04% top-one accuracy (95% CI [40.80, 41.30]) and 79.50% top-three accuracy, and LightGBM attains 39.70%/77.95%. Both significantly exceed the cycle-continuity heuristic (35.21%/67.32%) and the MLP (32.4%/68.8%) by McNemar tests (all p < 0.001). The dataset is highly imbalanced (normalized entropy 0.814; imbalance ratio 16,921:1), and front-end-loader classes with negligible support are not learnable. Permutation analysis shows that the truck’s previous shovel dominates the MLP policy (+7.46 pp), yet XGBoost exceeds the previous-shovel-only Bayes-optimal accuracy of 32.40%, demonstrating that observable features beyond the previous shovel carry exploitable signal that the MLP fails to capture. A learning-curve ablation shows that the gradient-boosting advantage is attributable to model architecture rather than training-data volume and is robust to hyperparameter choice, consistent with the established behaviour of tree ensembles on tabular data. The results indicate a learnability ceiling that sits well above the MLP and is partly model-limited rather than purely informational. They also show that imitation fidelity is distinct from dispatch quality, which is not assessed here.
The study reframes behavioural cloning of commercial FMS dispatch as a diagnostic and benchmarking tool and motivates model choice, imbalance-aware learning, and richer state recovery as the levers for data-driven dispatch analysis.
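
The summary statistics quoted in the abstract (top-k accuracy, normalized entropy, imbalance ratio, a previous-shovel-only accuracy, and McNemar's test) can be illustrated with a minimal Python sketch. It is not the paper's implementation. All function names and the toy inputs are hypothetical. Normalized entropy is assumed to be Shannon entropy divided by the log of the number of observed classes. The previous-shovel-only accuracy is approximated by the empirical majority choice per previous shovel. The exact binomial form of McNemar's test is an assumption, since the abstract does not state which variant was used.

```python
import math
from collections import Counter, defaultdict


def normalized_entropy(labels):
    """Shannon entropy of the label distribution divided by log(K),
    where K is the number of classes observed (assumed definition)."""
    counts = Counter(labels)
    n = len(labels)
    if len(counts) < 2:
        return 0.0
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(len(counts))


def imbalance_ratio(labels):
    """Support of the most frequent class over the least frequent one."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def top_k_accuracy(scores, labels, k):
    """Fraction of rows whose true class index is among the k highest scores."""
    hits = 0
    for row, y in zip(scores, labels):
        top = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += y in top
    return hits / len(labels)


def conditional_majority_accuracy(prev, target):
    """Accuracy of always predicting the most frequent target for each
    value of `prev` (empirical stand-in for a single-feature optimum)."""
    table = defaultdict(Counter)
    for p, t in zip(prev, target):
        table[p][t] += 1
    return sum(max(c.values()) for c in table.values()) / len(target)


def mcnemar_exact_p(correct_a, correct_b):
    """Two-sided exact McNemar p-value from per-sample correctness flags."""
    b = sum(a and not o for a, o in zip(correct_a, correct_b))  # A right, B wrong
    c = sum(o and not a for a, o in zip(correct_a, correct_b))  # B right, A wrong
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy inputs (hypothetical, not taken from the paper's dataset).
toy_scores = [[0.7, 0.2, 0.1], [0.1, 0.5, 0.4], [0.35, 0.4, 0.25]]
toy_labels = [0, 2, 0]
top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

McNemar's test uses only the discordant pairs, the samples on which exactly one of the two models is correct. That makes it suited to comparing two classifiers scored on the same hold-out set.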

IPC Classification

G06A61

Keywords

learnable, truck, dispatch, multi-model, behavioural, cloning, benchmark, industrial, cycles, mining, paper, presents, large-scale, behavioural-cloning, linear-programming, commercial, open-pit, quantifying, much, policy, recoverable, observable, cycle, records