The vulnerability of supply chains and their role in the propagation of shocks have been highlighted multiple times in recent years, including by the recent pandemic. However, while the importance of micro data is increasingly recognised, data at the firm-to-firm level remain scarce. In this study, we formulate the reconstruction of supply chain networks as a link prediction problem and tackle it using machine learning, specifically Gradient Boosting. We test our approach on three different supply chain datasets and show that it performs well and outperforms three benchmarks. An analysis of feature importance suggests that the key data underlying our predictions are firms’ industry, location, and size. To evaluate the feasibility of reconstructing a network when no production network data are available, we attempt to predict one dataset using a model trained on another, and show that the model’s performance, while still better than that of a random predictor, deteriorates substantially.
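To make the link prediction framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a made-up toy network in which each ordered (supplier, customer) pair becomes one labelled row, with features built from industry, location, and size. The classifier is a deliberately simplified gradient boosting on depth-1 stumps written from scratch. The firm data, the `is_link` rule, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy firms: industry 0 = upstream, 1 = downstream (made-up data).
firms = [
    {"industry": i % 2, "loc": ((i % 5) / 4, (i // 5) / 3), "size": 1.0 + i}
    for i in range(20)
]

def is_link(s, c):
    # Assumed ground-truth rule for the toy network only.
    return (s["industry"] == 0 and c["industry"] == 1
            and math.dist(s["loc"], c["loc"]) < 0.5)

def pair_features(s, c):
    # Features of a candidate link supplier s -> customer c.
    return [float(s["industry"]), float(c["industry"]),
            math.dist(s["loc"], c["loc"]),
            math.log(s["size"]), math.log(c["size"])]

# Link prediction framing: one row per ordered pair, label 1 if the link exists.
pairs = [(s, c) for s in firms for c in firms if s is not c]
X = [pair_features(s, c) for s, c in pairs]
y = [1.0 if is_link(s, c) else 0.0 for s, c in pairs]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y, F):
    # Mean logistic loss for raw scores F, in a numerically stable form.
    return sum(math.log1p(math.exp(-f if yi else f))
               for yi, f in zip(y, F)) / len(y)

def fit_stump(X, r):
    # Least-squares regression stump fitted to the residuals r.
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((ri - ml) ** 2 for ri in left)
                   + sum((ri - mr) ** 2 for ri in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:
        raise ValueError("no valid split: all features are constant")
    return best[1:]

def boost(X, y, rounds=30, lr=0.5):
    # Start from the log-odds of the base rate, then add stumps one by one.
    base = sum(y) / len(y)
    f0 = math.log(base / (1.0 - base))
    F = [f0] * len(y)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of the log-loss with respect to the raw score.
        r = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        j, t, ml, mr = fit_stump(X, r)
        stumps.append((j, t, lr * ml, lr * mr))
        F = [fi + (lr * ml if row[j] <= t else lr * mr)
             for row, fi in zip(X, F)]
    return f0, stumps, F

def predict_proba(model, row):
    # Predicted probability that the candidate link described by row exists.
    f0, stumps = model
    return sigmoid(f0 + sum(l if row[j] <= t else r for j, t, l, r in stumps))

f0, stumps, F = boost(X, y)
loss_before = log_loss(y, [f0] * len(y))
loss_after = log_loss(y, F)
probs = [sigmoid(fi) for fi in F]
print(f"training log-loss: {loss_before:.3f} -> {loss_after:.3f}")
```

In practice one would likely use an established gradient boosting library and evaluate on held-out links rather than on training pairs; the sketch only shows how a network reconstruction task can be cast as binary classification over candidate firm pairs.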


Mungo, L., Lafond, F., Astudillo-Estévez, P. & Farmer, J.D. (2022). 'Reconstructing production networks using machine learning'. INET Oxford Working Paper No. 2022-02.