Machine Learning Ensembles and Rail Defects Prediction: Multilayer Stacking Methodology
Publication: ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering
Volume 5, Issue 4
Abstract
Machine learning has taken a front seat in railway big data analysis, partly because of perpetual data collection and the need for automated systems to expedite maintenance decisions. This paper presents a case for track defect prognosis in rail track engineering. Fatigue defects are very common and strongly influence rail maintenance, so understanding them is essential for optimized maintenance scheduling. The literature is replete with machine learning models developed for defect prediction. However, each model has inherent deficiencies, and no single machine learning model is guaranteed to surpass the others on every kind of data. Classifier ensembles such as bagging or boosting aggregate strengths from different models to enhance prediction. The outcome is very effective, although the combined models tend to be highly correlated. This work proposes a stacking method that combines average learners into powerful learning machines while considering memory, time, computational, and structural complexities as well as bias-variance trade-offs. Drawing on the large scale of rail infrastructure considered in this work (35,406 km), this study shows that classical Weibull analysis underestimates annual fatigue defects by at least 25% throughout rail life. The proposed stacking ensemble compensates for this shortfall by aggregating the probability predictions of diverse learners. These predictions were combined in a binary classification ensemble with a receiver operating characteristic area under curve (ROC-AUC) score of 0.783, leaving significant room for improvement in computation time and curve fitting.
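The idea of stacking probability predictions from diverse learners and scoring the result with ROC-AUC can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic data from `make_classification` stands in for defect records, the three base learners and the logistic-regression meta-learner are arbitrary choices, and scikit-learn's `StackingClassifier` provides a single stacking layer rather than the multilayer methodology of the paper. It is not the authors' implementation and does not reproduce the reported 0.783 score.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a defect / no-defect data set (NOT the paper's data).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Diverse base learners (hypothetical choices): a tree ensemble,
# a distance-based model, and a probabilistic model.
base_learners = [
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=15)),
    ("nb", GaussianNB()),
]

# The meta-learner is trained on out-of-fold class probabilities
# produced by the base learners (5-fold cross-validation).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

# Probability of the positive ("defect") class on held-out data.
defect_probability = stack.predict_proba(X_test)[:, 1]
stacked_auc = roc_auc_score(y_test, defect_probability)
print(f"stacked ROC-AUC: {stacked_auc:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps the second stage from simply rewarding the base learner that overfits most.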
Copyright
©2019 American Society of Civil Engineers.
History
Received: Aug 6, 2018
Accepted: Apr 10, 2019
Published online: Oct 15, 2019
Published in print: Dec 1, 2019
Discussion open until: Mar 15, 2020