Comparison of Data-Driven Site Characterization Methods through Benchmarking: Methodological and Application Aspects
Publication: ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering
Volume 9, Issue 2
Abstract
Site characterization is one of the most crucial steps for decision making in geotechnical engineering and to the fullest extent possible should be conducted based on objective data. The current reliance on engineering judgment to interpret data directly cannot exploit the rapid growth of data, machine learning, and other digital technologies. Data-driven site characterization (DDSC) has received much attention in an emerging field called data-centric geotechnics, because a knowledge of the ground is fundamental to geotechnical engineering. As a result, many DDSC methods have been developed recently. Differences and similarities between DDSC methods, however, have not been well studied in terms of methodological and application aspects. This paper proposes a comparison between three emerging DDSC methods from these methodological and application perspectives: (1) geotechnical lasso (Glasso), (2) geotechnical lasso with basis-functions (Glasso-BFs), and (3) Gaussian process regression (GPR). From a methodological perspective, this paper presents a unified Bayesian framework to derive these DDSC methods, in order to shed light on the methodological similarities and differences. From the application perspective, the prediction accuracy for the validation dataset and runtime cost of these three DDSC methods were compared through benchmarking. The differences in performance can be better understood within the unified framework. This paper further proposes a new benchmark involving complex intermixing of soil types, to test the three methods under more realistic and challenging field conditions, although the training and validation datasets remain synthetic.
Data Availability Statement
All data, models, and code that support the findings of this study are available from the corresponding author upon reasonable request.
Acknowledgments
This work was supported by JSPS KAKENHI Grant No. JP18K05880.
Copyright
© 2023 American Society of Civil Engineers.
History
Received: Aug 6, 2022
Accepted: Dec 4, 2022
Published online: Jan 30, 2023
Published in print: Jun 1, 2023
Discussion open until: Jun 30, 2023
Cited by
- Kok-Kwang Phoon, Takayuki Shuku, Future of Machine Learning in Geotechnics (FOMLIG), 5–6 Dec 2023, Okayama, Japan, Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 10.1080/17499518.2024.2316882, 18, 1, (288-303), (2024).
- Zia ur Rehman, Trends and Challenges of Technology-Enhanced Learning in Geotechnical Engineering Education, Sustainability, 10.3390/su15107972, 15, 10, (7972), (2023).
- K. K. Phoon, L. M. Zhang, Z. J. Cao, Special issue on “Machine learning and AI in geotechnics”, Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 10.1080/17499518.2023.2185938, 17, 1, (1-6), (2023).
- Takayuki Shuku, Kok-Kwang Phoon, Data-driven subsurface modelling using a Markov random field model, Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 10.1080/17499518.2023.2181973, 17, 1, (41-63), (2023).
- Kok-Kwang Phoon, Takayuki Shuku, Jianye Ching, Ikumasa Yoshida, Benchmarking Data-Driven Site Characterization, ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, 10.1061/AJRUA6.RUENG-1058, 9, 2, (2023).
- Kok-Kwang Phoon, What Geotechnical Engineers Want to Know about Reliability, ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, 10.1061/AJRUA6.RUENG-1002, 9, 2, (2023).