Technical Papers
Mar 18, 2020

Machine Learning Approach for Sequence Clustering with Applications to Ground-Motion Selection

Publication: Journal of Engineering Mechanics
Volume 146, Issue 6

Abstract

Clustering of sequential data is of growing interest in many areas of science and engineering, driven by the rapid growth of time-series data. Effective sequence-clustering methods are needed to extract features from such data and to learn better representations. This paper presents an unsupervised machine learning algorithm for sequence clustering based on dynamic k-means. The clustering problem is first formulated rigorously as an optimization problem, which is then solved with a proposed three-step alternating-direction optimization approach. The performance of the approach is illustrated through three examples that use both synthetic data sets and field ground-motion measurements. In particular, the approach is applied to ground-motion clustering and selection, with satisfactory results. Overall, the results demonstrate that the proposed algorithm can effectively cluster sequential data by mining their latent characteristics.
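
The abstract does not state the optimization problem or its three steps, so they are not reproduced here. As a point of reference only, the sketch below shows standard k-means (Lloyd's algorithm) on equal-length sequences. It alternates between two sub-steps rather than three: assigning each sequence to its nearest centroid, then recomputing each centroid as the pointwise mean of its members. The function names, the first-k initialization, and the toy data are assumptions made for illustration and are not taken from the paper or its repository.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_sequence(sequences):
    """Pointwise mean of a non-empty list of equal-length sequences."""
    n = len(sequences)
    return [sum(values) / n for values in zip(*sequences)]


def kmeans_sequences(sequences, k, n_iter=50):
    """Standard k-means (Lloyd's algorithm) on equal-length sequences.

    Illustration only: this is NOT the three-step algorithm of the paper.
    """
    # Assumption: naive deterministic initialization from the first k
    # sequences, chosen only to keep the sketch reproducible.
    centroids = [list(s) for s in sequences[:k]]
    labels = None
    for _ in range(n_iter):
        # Sub-step 1: fix centroids, assign each sequence to the nearest one.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(s, centroids[j]))
            for s in sequences
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Sub-step 2: fix assignments, recompute each centroid as a mean.
        for j in range(k):
            members = [s for s, lab in zip(sequences, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_sequence(members)
    return labels, centroids


# Hypothetical toy data: three sequences near 0 and three near 5.
data = [
    [0.0, 0.1, 0.0, -0.1],
    [0.1, 0.0, 0.1, 0.0],
    [-0.1, 0.0, 0.0, 0.1],
    [5.0, 5.1, 4.9, 5.0],
    [5.1, 5.0, 5.0, 4.9],
    [4.9, 5.0, 5.1, 5.0],
]
labels, centroids = kmeans_sequences(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Plain k-means of this kind treats every sequence as a fixed-length vector and ignores temporal structure such as shifts or varying length. How the paper's dynamic k-means formulation handles these aspects is described in the full article, not in the abstract.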

Data Availability Statement

Some or all data, models, or code generated or used during the study are available in a repository online in accordance with funder data retention policies (https://github.com/zhry10/Clustering).

Information & Authors

Information

Published In

Journal of Engineering Mechanics
Volume 146, Issue 6, June 2020

History

Received: Jun 12, 2019
Accepted: Dec 3, 2019
Published online: Mar 18, 2020
Published in print: Jun 1, 2020
Discussion open until: Aug 18, 2020

Authors

Affiliations

Postdoctoral Associate, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA 02115. ORCID: https://orcid.org/0000-0001-8676-5271.
Jerome Hajjar, M.ASCE
Camp Dresser & McKee Smith Professor and Chair, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA 02115.
Assistant Professor, Dept. of Civil and Environmental Engineering, Northeastern Univ., Boston, MA 02115; Research Affiliate, Dept. of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139 (corresponding author). ORCID: https://orcid.org/0000-0002-5145-3259.
