Identifying Tourists and Locals by K-Means Clustering Method from Mobile Phone Signaling Data
Publication: Journal of Transportation Engineering, Part A: Systems
Volume 147, Issue 10
Abstract
Nowadays, a large percentage of people use smartphones frequently. Mobile phone signaling data contain various attributes that can be used to infer when and where a user is. Compared with other big data sources on human movement (e.g., social media and GPS data), mobile phone signaling data offer high population coverage, strong temporal continuity, and low collection cost. Taking advantage of these properties, this work aims to identify tourists and locals from a large volume of mobile phone signaling data in a tourism city and to analyze their spatiotemporal patterns, in order to better promote tourism services and alleviate possible disturbance to local residents. We present a framework that differentiates these two types of people in two steps: first, the hidden behavior characteristics of users are extracted from mobile phone signaling data; then, the K-means clustering method is adopted to identify tourists and locals. With both groups identified, we analyze the distribution and interaction characteristics of tourists and locals in an urban area. An experimental study is conducted in a famous tourism city, Xiamen, China. The results indicate that the proposed method can successfully identify the most popular scenic spots and major transportation corridors for tourists. The feature extraction, identification, and spatiotemporal analysis presented in this paper are of great significance for analyzing urban tourism demand, managing urban space, and mining tourist behavior.
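The two-step procedure described in the abstract (per-user feature extraction followed by K-means clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the three features, the toy values, the min-max scaling step, and the rule used to name the tourist cluster are all assumptions made for illustration, since the abstract does not list the actual features or settings.

```python
import random


def min_max_normalize(rows):
    """Scale each feature column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k=2, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct, randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical per-user features (assumed, not taken from the paper):
# (days observed in the city, distinct cells visited per day,
#  share of night-time records at the most frequent cell)
users = [
    (28, 3.1, 0.92),
    (30, 2.4, 0.95),
    (27, 4.0, 0.88),
    (3, 9.5, 0.40),
    (2, 11.2, 0.35),
    (4, 8.7, 0.50),
]

normalized = min_max_normalize(users)
labels, centroids = k_means(normalized, k=2)

# Assumed labeling rule: the cluster with fewer observed days is "tourist".
tourist_cluster = min(range(2), key=lambda j: centroids[j][0])
for user, label in zip(users, labels):
    print(user, "tourist" if label == tourist_cluster else "local")
```

K-means only returns anonymous cluster indices, so some rule such as the one above is needed to decide which cluster corresponds to tourists. Scaling the features first keeps a wide-ranging feature (here, days observed) from dominating the distance computation.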
Data Availability Statement
Some or all data, models, or code generated or used during the study are proprietary or confidential in nature and may only be provided with restrictions.
1. Mobile phone signaling data of Xiamen: These data were obtained in cooperation with a Xiamen communication operator. We are permitted only to deploy the algorithm on the operator's data platform and compute the results; the data cannot be taken off the platform. Therefore, these data are provided with restrictions.
2. The feature extraction algorithm: The related code is available from the corresponding author upon request.
3. The K-means clustering method: This code is also available from the corresponding author upon request.
Copyright
© 2021 American Society of Civil Engineers.
History
Received: Oct 28, 2020
Accepted: May 17, 2021
Published online: Aug 8, 2021
Published in print: Oct 1, 2021
Discussion open until: Jan 8, 2022
Cited by
- Hao Zhu, Research on Energy Consumption Data Monitoring of Smart Parks Based on IoT Technology, Advanced Hybrid Information Processing, 10.1007/978-3-031-50546-1_2, (20-37), (2024).
- Zhihuan Jiang, Ailing Huang, Geqi Qi, Wei Guan, A Framework of Travel Mode Identification Fusing Deep Learning and Map-Matching Algorithm, IEEE Transactions on Intelligent Transportation Systems, 10.1109/TITS.2023.3250660, 24, 6, (6401-6415), (2023).
- Yuan Li, Jing Guo, Long Zhao, Han Shen, Research on the spatial-temporal pattern and spatial spillover effect of tourism based on mobile signaling and POIs data: a case study of Xiamen city, southeast China, Asia Pacific Journal of Tourism Research, 10.1080/10941665.2022.2152357, 27, 10, (1052-1070), (2023).
- Cheng Shi, Yujia Zhai, Dongying Li, Urban tourists’ spatial distribution and subgroup identification in a metropolis --the examination applying mobile signaling data and latent profile analysis, Information Technology & Tourism, 10.1007/s40558-023-00255-y, 25, 3, (453-476), (2023).
- Haodong Sun, Yang Yang, Yanyan Chen, Xiaoming Liu, Jiachen Wang, Tourism demand forecasting of multi-attractions with spatiotemporal grid: a convolutional block attention module model, Information Technology & Tourism, 10.1007/s40558-023-00247-y, 25, 2, (205-233), (2023).
- Dandan Ke, Jingyi Dai, Multidimensional analysis of engineering cost database based on descriptive data mining, Soft Computing, 10.1007/s00500-023-07992-6, (2023).
- Md. Hamidur Rahman, Momotaz Begum, Partitional Technique for Searching Initial Cluster Centers in K-means Algorithm, Proceedings of the Fourth International Conference on Trends in Computational and Cognitive Engineering, 10.1007/978-981-19-9483-8_22, (255-266), (2023).
- Haodong Sun, Yanyan Chen, Jianming Ma, Yang Wang, Xiaoming Liu, Jiachen Wang, Multi-Objective Optimal Travel Route Recommendation for Tourists by Improved Ant Colony Optimization Algorithm, Journal of Advanced Transportation, 10.1155/2022/6386119, 2022, (1-14), (2022).
- Alekh Gour, Shikha Aggarwal, Subodha Kumar, Lending ears to unheard voices: An empirical analysis of user‐generated content on social media, Production and Operations Management, 10.1111/poms.13732, 31, 6, (2457-2476), (2022).
- Jing Wang, Filip Biljecki, Unsupervised machine learning in urban studies: A systematic review of applications, Cities, 10.1016/j.cities.2022.103925, 129, (103925), (2022).