Hydrating Large-Scale Coronavirus Pandemic Tweets: A Review of Software for Transportation Research
Publication: International Conference on Transportation and Development 2021
ABSTRACT
The coronavirus (COVID-19) pandemic has challenged established societal structures, and the transportation sector is no exception. The primary objective of this research is to review and analyze the performance of software models used for extracting and processing large-scale data from Twitter streams related to COVID-19. The study extends previous research on machine learning applications for social media by reviewing contemporary tools, including their computing maturity and potential usefulness. The paper also provides an open data repository of the processed data frames to facilitate the swift development of new transportation research. Transportation researchers and the American Society of Civil Engineers (ASCE) community are expected to benefit from this study.
Copyright
© 2021 American Society of Civil Engineers.
History
Published online: Jun 4, 2021