Open access
Technical Papers
Oct 23, 2020

Origin-Destination Trips Generated from Operational Data of a Mobile Network for Urban Transportation Planning

Publication: Journal of Urban Planning and Development
Volume 147, Issue 1

Abstract

Owing to recent advances in information communication technologies, trail data, i.e., movement logs of people on the road or on trains, can be acquired on a 24-h, 365-day basis, some of which have yielded real-world applications. Generating highly granular origin-destination trip data from the operational data of mobile networks would significantly advance the methodology for actualizing and broadening the field of applications in urban transportation planning. This paper proposes a method that estimates origin-destination trips on the basis of mobile network operational data, referred to as mobile spatial dynamics. Statistics are generated using a three-step process comprising anonymization, estimation, and disclosure limitation. Unlike other studies conducted thus far, this study uses large-scale person trip survey data to validate the reliability of estimated origin-destination trips between regions. Analysis of the use cases shows that the proposed method has the potential for mobile spatial dynamics to complement person trip surveys and road traffic censuses.

Introduction

In urban transportation planning, understanding population flow is of enormous importance. The Japanese government and Japanese organizations currently utilize national population censuses, person trip (PT) survey data, and road traffic censuses. The surveys are conducted every 5 or 10 years, typically on one day in the fall. These surveys provide only a snapshot of the travel demand and contain no information on yearly or seasonal changes in movements.
In recent years, it has become possible to acquire trail data from mobile phone networks and car navigation systems to analyze and forecast travel demands in a continuous manner. In the urban transportation field, trail data are investigated widely, and put to practical use (Imai et al. 2012c; Momma et al. 2011; Sengoku et al. 2011). Recent studies revealed the effectiveness of combining qualitative questionnaire data with quantitative trail data (Sekimoto et al. 2012, 2013; Imai et al. 2012a, 2012b, 2013).
Okajima et al. (2013) developed a methodology for estimating the number of people in the coverage area of a base station using mobile network operational data from NTT DOCOMO, a mobile telecommunications company in Japan. The operational data comprise location updates and a customer database of approximately 76 million subscribers (as of June 2018). All mobile phones in the network perform a location update procedure when crossing location areas or on a regular basis, i.e., almost once an hour, in the mobile network. The statistics can be used to estimate the number of people in administrative districts or grids with attributes such as gender or age group (Odawara and Nagata 2014). Prior studies examined the reliability of these statistics (Oyabu et al. 2013), and applied the research to the fields of urban planning (Imai et al. 2013; Seike et al. 2011; Odawara and Kawakami 2013; Seike et al. 2013; Nagata et al. 2013; Seike et al. 2015) and disaster prevention planning (Suzuki et al. 2013; Murakami et al. 2011). These prior studies imply a high degree of applicability of the statistics to such fields.
However, the statistics provide only snapshots of the number of people in administrative districts or grids and thus contain no information on the movement of people. Generating data on origin-destination (OD) trips would significantly advance the methodology for actualizing and broadening the range of applications in urban transportation planning. There has been difficulty in generating details of OD trips from mobile operational data because location update procedures are performed regardless of the movement of people.
The purpose of this study is to propose a method for estimating data on OD trips based on mobile network operational data and to evaluate its usefulness. The next section introduces the requirements for analyzing OD trips in the urban transportation field. The section on “Prior Work” gives an overview of previous investigations. Estimation processes for OD trips and data specifications of the statistics, referred to as mobile spatial dynamics (MSD) herein, are defined in the section entitled “Method for Estimating MSD”. The section entitled “Evaluation of MSD” evaluates the reliability of the statistics through a comparison with PT survey data. The subsequent section presents an analysis of use cases and an investigation of the application fields in urban transportation planning. The “Summary and Conclusions” section presents analysis results and describes future challenges.

Defining the Requirements for MSD

This study starts with a definition of the requirements for statistics in the urban transportation field. PT surveys are designed to capture the movement of people on a weekday and to clarify their travel attributes, the origin and destination points, the purpose of the trip, and the means of transportation. The road traffic census is designed to capture data of OD trips and road traffic conditions. PT surveys are conducted every 10 years, while the road traffic census is conducted every 5 years. A questionnaire approach is used to obtain details of OD trips, while monitoring methods utilizing traffic counter cameras are used to capture road traffic. The data on interregion OD trips estimated using the PT survey and road traffic censuses are foundations for urban transportation planning.
Table 1 gives the requirements for OD trip analysis. The estimated time, spatial resolution of the OD areas, and estimated values should be consistent with existing PT surveys and road traffic surveys. Zoning is classified into large, medium, basic planning, and small zones, which are defined in the Tokyo metropolitan area PT survey. Although PT surveys obtain details of trips in the survey areas, the operational data of mobile networks allow capturing of the movement of people nationwide. For this reason, the authors define the origin and destination areas nationwide as one of the distinguishing characteristics. Another feature is to capture the movement of extraregional residents.
Table 1. OD trip requirements
| Item | Element | Requirement |
| Date and time | Range | 3 a.m. to 3 a.m. the next day |
|  | Unit | 24 h, 365 days (each trip includes departure and arrival times) |
| Departure area | Range | Nationwide |
|  | Unit | Large zone, medium zone, municipality zone, basic planning zone, small zone, town and street zone, B zone |
| Destination area | Range | Nationwide |
|  | Unit | Large zone, medium zone, municipality zone, basic planning zone, small zone, town and street zone, B zone |
| Intraregional resident identifier | Range | Intra- or extraregional residents |
|  | Unit |  |
| Estimated value | Range |  |
|  | Unit | Trip |

Prior Work

The location data of mobile phones are fundamentally categorized into three types: Global Positioning System (GPS) on mobile phones; communication logs, which are often referred to as call detail records (CDRs); and operational data of the mobile network, including location information on signals between mobile phones and the mobile network. A wide variety of mobile applications use GPS data with a high spatial resolution. Obtaining GPS data requires configuration of the mobile phone GPS settings, which tends to restrict the number of times that location data are acquired. Recording a large number of CDRs requires large-scale development of the mobile network. Operational data are obtained from signals initiated by all the mobile phones in the network. A mobile phone initiates location registration when it crosses a predetermined location area or at regular intervals defined by the mobile network. The spatial resolution is several hundred meters on average in urban areas (Jiang et al. 2013) and several kilometers in rural areas (Chen et al. 2014; Calabrese et al. 2013).
Significant research efforts have been put forth to devise mobility patterns using CDRs (Wang and Chen 2018; Tongsinoot and Muangsin 2017; Yin et al. 2018). Iqbal et al. (2014) used CDRs over a 1-month period to infer OD matrices. Gonzalez et al. (2008) revealed the regularity of behavior patterns of people based on CDRs. Phithakkitnukoon and Ratti (2011) found asymmetry in the forward and return routes of urban area residents and showed how this asymmetry impacted urban transportation planning. Calabrese et al. (2011) used CDRs and revealed new trip parameters, such as time variation and day-of-the-week variation in the population. Many other studies concerned applied research on CDRs, for example, in the fields of road traffic analysis (Song et al. 2010) and travel demand estimation (Jiang et al. 2013).
In a previous study on the operational data of a mobile network, Caceres et al. (2007) analyzed location information obtained from location registration procedures (hereinafter referred to as location registration data) and estimated trips between location registration areas to capture road traffic. Okajima et al. (2013) generated a nationwide geographical distribution of the population based on location registration data and applied the statistics to land-use planning. Sohn and Kim (2008) and Becker et al. (2011b) used signals of handover procedures to estimate OD trips. Ming-Heng et al. (2013) used signal information to estimate OD trips; this information is generated only when mobile phones initiate voice communications or are connected to the internet. Many prior studies proposed a method for estimating OD trips using operational data (Ming-Heng et al. 2013; Caceres et al. 2013; Calabrese et al. 2011; Zhang et al. 2010; White and Wells 2002).
Previous investigations of large-scale observed data include those of Becker et al. (2011a, 2013), who analyzed the location data of 300,000 mobile phones. They examined the reliability of estimated trips in major metropolitan areas in the US based on comparisons with census data and identified important places including likely home and work locations. Alexander et al. (2015) developed a method to estimate OD trips using the CDR for 2 million mobile phones in the Boston metropolitan area. Jiang et al. (2017) used the CDR for 3 million mobile phones in Singapore to analyze human mobility patterns. Berlingerio et al. (2013) identified congested areas in urban areas based on the CDRs of 500,000 mobile phones and then improved bus route planning. Calabrese et al. (2011) estimated the OD trips of 8 million mobile phones and compared the results with census data to find a linear relationship between the two datasets. Ishizuka et al. (2015) evaluated the transportation metrics extracted from CDRs by referring to GPS data. Many prior studies used location data from a mobile network to estimate OD trips and validated the methodology, mainly through comparisons with statistics of population density, most of which were census data, or with a small number of correct values. However, to our knowledge, no prior work has used large-scale survey data to validate the reliability of estimated OD trips.
This study uses a large amount of location registration data to estimate OD trips between regions. Location registration data are obtained from all mobile phones on a regular basis, and thus could be a foundation of reliable statistics in the urban transportation field. However, location registration signals are initiated regardless of the movement of people in the real world. This makes it difficult to identify origins and destinations. To address this problem, in this study, a movement judgment method is proposed to extract OD trips from the operational data on a 24-h, 365-day basis. To validate the proposed method and reliability of the statistics, the authors use PT survey data and large-scale OD trip survey data in Japan, unlike prior work. PT surveys assume a questionnaire approach, with a sampling rate determined by statistical analysis. For instance, the Tokyo metropolitan PT survey used sampling rates of 2% to 5%. In 2008, 1.4 million copies of the questionnaire were delivered, and 340,000 valid forms were returned and processed to generate statistics.
The purpose of this study is to establish specifications and a method to estimate details of OD trips, based on operational data of a mobile network, that will lead to support for a wide variety of applications in the urban transportation field. Unlike prior studies (Ming-Heng et al. 2013; Zhang et al. 2010), the proposed method introduces a multiplying factor characterized by age, gender, and residential areas of mobile phones in estimating OD trips to increase reliability. If the estimated OD trip details are consistent with the PT survey data, the data could be used as a complement to the PT survey. This approach could contribute to conducting a survey in regions where PT surveys have not been conducted owing to budget constraints. It would also reveal weekly, monthly, and seasonal changes in OD trips.

Method for Estimating MSD

Overview of MSD

To allow mobile phones to receive incoming voice calls and data on a 24-h, 365-day basis over broad regions, a base station in a mobile network detects all mobile phones within its coverage area on a regular basis. Statistical data generated from the operational data of the mobile network, which indicate the geographical distribution of the population all over Japan, can be used for research in the public and private sectors. The goal of this study is to generate statistics of OD trips between regional zones for the entire nation on a 24-h, 365-day basis, which are referred to as MSD. The authors use operational data of a mobile network operated by the Japanese mobile operator NTT DOCOMO. Since mobile services are available in all municipal areas of Japan, OD trips can be generated between regional zones nationwide. The characteristics of MSD depend on the architecture and operation of the mobile network.
Table 2 gives the MSD data specifications. It is possible to select OD areas from all over Japan. The time resolution of MSD is 1 h because the base stations detect all mobile phones in their respective coverage areas approximately once every hour; thus, statistically reliable data can be generated on a 24-h, 365-day basis. The spatial resolution depends on the installation density of the base stations in the mobile network. The average installation interval is approximately 500 m in metropolitan areas and several kilometers in the suburbs. In metropolitan areas with a high installation density of base stations, OD trips can be estimated with the spatial resolution of medium zones or small zones as defined by the Tokyo metropolitan PT survey. Conversely, in the suburbs where the installation density is low, the spatial resolution should be adjusted to the municipal level.
Table 2. Data specifications for MSD
| Item | MSD | Data example |
| Estimation date and time | 24 h, 365 days | 10 a.m. on Nov 13, 2014 |
| Departure area | Nationwide | 22,101 (Aoi-ku, Shizuoka) |
| Destination area | Nationwide | 22,102 (Suruga-ku, Shizuoka) |
| Age group | 15 to 79 years old | 25 (25 to 29 years old) |
| Gender | Male, female | 1 (male), 2 (female) |
| Intraregional resident identifier | Intra-, extraregional | 0 (extraregional), 1 (intraregional) |
| Estimated value | Trip | 460 (trips) |
Another characteristic is that OD trips can be estimated based on gender, age group, and residential area using attribute data of the mobile phone that initiated the location registration signal. Table 3 compares MSD to PT surveys. NTT DOCOMO provides its mobile phone services to municipalities throughout Japan with a high penetration rate. This means that MSD capture nationwide OD trips by the hour and with attributes such as gender and age group. MSD can represent OD trips in 5-year age increments for mobile phone users aged 15 to 74. The authors used a data sample of approximately 76 million mobile phones and excluded business mobile phones. The sampling rate is approximately 59% of the Japanese population (127 million). The statistics are generated by scaling the observed counts with the penetration rate for each age group, gender, and residential area. Although MSD give no information on the means of transportation or the purpose of travel, they have the potential to complement the PT surveys.
Table 3. Comparison of MSD and PT survey
| Item | PT survey | MSD |
| Survey sample | Sample inquiry (2% of metropolitan area residents) | NTT DOCOMO mobile subscribers (approximately 76 million^a) |
| Survey date | Specific day | 365 days |
| Survey frequency | 10 years | 1 h |
| Research field | Metropolitan areas | Nationwide |
| Attributes | Gender, age group, residential area | Gender, age group, residential area |
| Time resolution | 1 h | 1 h |
| Spatial resolution | Zone (small zones defined by 15,000 persons during night) | Depends on the installation density of base stations (from medium to small zones in urban areas) |
| Purpose of travel | Main purpose of travel | N/A |
| Transportation means | Means of transportation, routes | N/A |
^a Corporate users are eliminated in the estimation process.

Method for Estimating MSD

MSD are generated from location registration signals of the mobile network using a three-step process to protect the personal information and privacy of mobile phone users. The process comprises: anonymization to eliminate the personal information from operational data; estimation of OD trips between regional zones on a specific date and time; and disclosure limitation to prevent the release of information regarding small populations from the estimated statistics, as indicated in Fig. 1. In the estimation process, the following judgment process is conducted. Base stations in the mobile network detect the movement of mobile phones in their coverage areas (hereafter referred to as cells) when they move from one base station to another. Given that consecutive location registration signals $s_0, s_1, s_2, \ldots, s_k$ are received from a mobile phone, the centroid, $p_{s_i}$ $(0 \le i \le k)$, of the cell covered by the base station that receives the signal from the mobile phone in temporal window $T_w$ is set as the base position. Then, when another base station receives a location registration signal, $s_j$ $(0 \le j \le k)$, from the mobile phone, it calculates the Euclidean distance, $d(p_{s_i}, p_{s_j})$, between the two centroids. If the distance exceeds the trip criterion, $L_c$ (in this case, 1 km), the base station recognizes that the mobile phone is “moving” from the base position to its cell, as shown in Fig. 2. Then the cell centroid, $p_{s_j}$, is set as the next base position. This judgment process continues, and when the distance does not exceed $L_c$ and the time duration $t_{s_j} - t_{s_i}$ exceeds time criterion $T_c$ (in this case, 1 h), the condition of the mobile phone is set as stationary. This process makes it possible to determine whether the mobile phone is moving or stationary.
Fig. 1. Estimation processes of MSD.
Fig. 2. Mechanism for judging movement.
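The movement judgment described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Signal` and `judge_movement`, the use of planar coordinates in kilometers, times in hours, and the extra label "undetermined" (for signals that satisfy neither criterion) are assumptions made for illustration, and the temporal window used to choose the first base position is not modeled.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Signal:
    """One location registration signal (hypothetical structure)."""
    t: float  # reception time, in hours
    x: float  # cell centroid x coordinate, in km (planar projection assumed)
    y: float  # cell centroid y coordinate, in km


def judge_movement(signals, lc_km=1.0, tc_h=1.0):
    """Label every signal after the first one.

    'moving'       : centroid distance from the base position exceeds lc_km
    'stationary'   : distance within lc_km and elapsed time exceeds tc_h
    'undetermined' : neither criterion is met yet (label is an assumption)
    """
    if not signals:
        return []
    base = signals[0]  # first centroid is taken as the base position
    labels = []
    for s in signals[1:]:
        distance = hypot(s.x - base.x, s.y - base.y)
        if distance > lc_km:
            labels.append("moving")
            base = s  # the new cell centroid becomes the next base position
        elif s.t - base.t > tc_h:
            labels.append("stationary")
        else:
            labels.append("undetermined")
    return labels


# Illustrative signals only; these are not observed data.
example = [
    Signal(0.0, 0.0, 0.0), Signal(0.5, 0.2, 0.0), Signal(1.5, 0.3, 0.0),
    Signal(2.0, 2.0, 0.0), Signal(2.5, 4.0, 0.0), Signal(3.0, 4.2, 0.0),
    Signal(4.0, 4.1, 0.0),
]
print(judge_movement(example))
```

Because the base position is updated only when the distance criterion is exceeded, the elapsed time in the stationary test is measured from the arrival at the current base position.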
A trip is extracted by setting the position at which the condition is switched from stationary to moving, as the origin point, and the position at which the condition is switched from moving to stationary, as the destination point, as shown in Fig. 3. Then the extracted trips are multiplied by the penetration rate of the population and are converted from cell coverage to a regional zone by areal weighting interpolation, as indicated in Fig. 4. Next, trips are multiplied by the multiplying factor and summed using the same origin and destination zones. The multiplying factor, K(a, g, r, t), is characterized by age a, gender g, residential area of the mobile phone user r, and time t:
$$K(a,g,r,t) = \frac{\sum_{z} N_{\mathrm{stat}}(z,a,g,r,t)}{N_{\mathrm{res}}(a,g,r,t)} \tag{1}$$
where $N_{\mathrm{stat}}(z,a,g,r,t)$ = number of mobile phones in zone $z$; and $N_{\mathrm{res}}(a,g,r,t)$ = number of residents living in the residential areas that can be extracted from the basic resident register data. Finally, $N_{\mathrm{trip}}(o,d,t)$, the number of trips from the origin zone to the destination zone, can be calculated as in Eq. (2):
$$N_{\mathrm{trip}}(o,d,t) = \sum_{r}\sum_{g}\sum_{a} N_{\mathrm{trip}}(o,d,a,g,r,t)\,K(a,g,r,t) \tag{2}$$
where $o$ = origin zone; $d$ = destination zone; and $N_{\mathrm{trip}}(o,d,a,g,r,t)$ is characterized by age $a$, gender $g$, residential area of the mobile phone user $r$, and time $t$.
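The summation in Eq. (2) can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function name `aggregate_trips` and the dictionary layouts are hypothetical, the multiplying factor is taken as a given input rather than computed, and the numbers are arbitrary placeholders chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def aggregate_trips(trip_counts, factor):
    """Sum attribute-level trips into OD totals, weighting each by K.

    trip_counts : {(o, d, a, g, r, t): number of extracted trips}
    factor      : {(a, g, r, t): multiplying factor K(a, g, r, t)}
    returns     : {(o, d, t): weighted number of trips}
    """
    totals = defaultdict(float)
    for (o, d, a, g, r, t), n in trip_counts.items():
        totals[(o, d, t)] += n * factor[(a, g, r, t)]
    return dict(totals)


# Placeholder inputs (hypothetical, not taken from the study).
trip_counts = {
    ("A", "B", 25, 1, "A", 10): 100,
    ("A", "B", 30, 2, "A", 10): 50,
    ("B", "A", 25, 1, "A", 10): 20,
}
factor = {(25, 1, "A", 10): 2.0, (30, 2, "A", 10): 1.5}
print(aggregate_trips(trip_counts, factor))
```

Keeping the factor keyed by attribute class rather than by OD pair mirrors the structure of Eq. (2), in which K depends on age, gender, residential area, and time but not on the zones.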
Fig. 3. Trip extraction method.
Fig. 4. Conversion of estimated values between base stations to OD trips between regions.
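The conversion from cells to regional zones by areal weighting can be sketched as follows. The paper states only that areal weighting interpolation is used; the function name `cells_to_zones`, the data layout, and the choice to weight the origin and destination cells independently by their area shares are assumptions made for this illustration.

```python
from collections import defaultdict


def cells_to_zones(cell_trips, share):
    """Reallocate cell-to-cell trips to zone-to-zone trips.

    cell_trips : {(origin_cell, destination_cell): trips}
    share      : {cell: {zone: fraction of the cell's area inside that zone}}
                 (fractions for one cell are assumed to sum to 1)
    """
    zone_trips = defaultdict(float)
    for (origin_cell, dest_cell), n in cell_trips.items():
        for origin_zone, w_o in share[origin_cell].items():
            for dest_zone, w_d in share[dest_cell].items():
                # Each trip is split in proportion to the overlapping areas.
                zone_trips[(origin_zone, dest_zone)] += n * w_o * w_d
    return dict(zone_trips)


# Hypothetical geometry: cell c1 lies entirely in zone Z1,
# cell c2 straddles Z1 (25% of its area) and Z2 (75%).
share = {"c1": {"Z1": 1.0}, "c2": {"Z1": 0.25, "Z2": 0.75}}
print(cells_to_zones({("c1", "c2"): 100.0}, share))
```

When the area fractions of each cell sum to 1, the total number of trips is preserved by the reallocation.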

Evaluation of MSD

In this section, the usefulness of MSD is evaluated based on correlation analysis with metropolitan PT survey data as a benchmark.

Evaluation Areas

The authors selected areas in the Shizuoka Chubu metropolitan area for the evaluation targets during October 2012. These are the areas in which the most recent PT survey had been conducted at the time of this study. The Shizuoka Chubu metropolitan area, located in the center of Shizuoka Prefecture, consists of six cities and districts with 1.1 million people: Aoi-ku in Shizuoka-shi; Suruga-ku in Shizuoka-shi; Shimizu-ku in Shizuoka-shi; Fujieda-shi; Yaizu-shi; and Shimada-shi. The Shizuoka Chubu metropolitan PT survey used 64 medium zones, as indicated in Fig. 5: 16 medium zones, 25% of the whole, are smaller than a circular area with a 1-km diameter; 50 medium zones, 78% of the whole, are smaller than a circular area with a 3-km diameter.
Fig. 5. Medium zones of Shizuoka Chubu metropolitan area.

Evaluation Items

To validate the proposed method, the authors estimated intercity trips and inter-medium-zone trips and compared them with the PT survey data, considering the focus points given in Table 4. Basic statistics used included trip length distributions and the excursion rate, which represents the ratio of residents living in an area who took a trip that day. Regarding the OD trips, the authors estimated OD trips as of November 2014 based on the specifications given in the section on “Method for Estimating MSD” and then compared them with the PT survey data. Although the PT survey was conducted in 2012, data availability is limited: MSD can be generated only for November 2014 and later. To account for the difference in survey year, the authors note that the Shizuoka Chubu metropolitan area has the following characteristics. According to basic resident register data, changes in the age groupings and population inflow and outflow for the 2-year period were not significant (Shizuoka Prefecture). In addition, there was no large-scale commercial or residential development in the areas. For these reasons, the authors consider the comparison adequate for examining the validity of the proposed estimation method.
Table 4. Evaluation items and focus points
| Evaluation item | Focus points |
| Basic statistics (trip length distribution and excursion rate) | Trip criterion (1 km) |
| OD trips | Trip criterion (1 km/3 km), time resolution (1 h), spatial resolution (municipality zone, medium zone) |

Evaluation of Basic Statistics

PT Survey Data Estimation as Comparison Data

To compare OD trips for the two different sets of statistics, the authors adjusted the estimation conditions of the PT survey data as given in Table 5. MSD are generated from operational data of mobile phones, including attribute data of users aged 15 to 74 years old. Thus, the authors excluded trips made by individuals under 15 or over 74 years old from the PT survey data. Also, MSD with a 1 km trip criterion record trips longer than 1 km; thus, trips shorter than 1 km are excluded from the PT survey data.
Table 5. Estimation conditions for PT survey data
| Item | Estimation conditions |
| Age group | 15 to 74 years old |
| Trip distance | Trips shorter than 1 km (or shorter than 3 km) excluded, following data specifications of MSD; trips without information on trip distance included |
| Trip pattern | Trips outside Shizuoka Chubu excluded |

Comparison of Trip Length Distributions

In the initial analysis, the authors examined trip length distributions with a range from 1 to 10 km or more. The PT survey data provide information on the trip length for each trip. Regarding the MSD, the authors determined the trip length of mobile phones by calculating the Euclidean distances between the centroids of the origin and destination zones for each OD pair. A comparison of the trip length distributions of MSD and the PT survey is shown in Fig. 6. The comparison results show that the OD trips generated from operational data have characteristics consistent with those of the PT survey.
Fig. 6. Comparison of trip length distributions.

Comparison of Excursion Rate

Table 6 shows a comparison of the excursion rates of MSD and the PT survey. Regarding MSD, the excursion rate was calculated by dividing the number of persons who take trips of 1 km or longer by the number of residents. The authors obtained an average excursion rate of 82.5% in 1 day in the Shizuoka Chubu metropolitan area. This value is reasonable in comparison with the PT survey data, which indicate 82.8%. MSD also exhibit similar excursion rates in each municipal district. Although there is a difference between the definitions of the excursion rate, the results imply that MSD capture the stopping points.
Table 6. Excursion rate comparison
| City or district | PT survey excursion rate^a (%) | MSD excursion rate^a (%) |
| Aoi-ku, Shizuoka-shi | 81.8 | 79.3 |
| Suruga-ku, Shizuoka-shi | 81.8 | 81.7 |
| Shimizu-ku, Shizuoka-shi | 81.2 | 81.4 |
| Shimada-shi | 84.1 | 85.9 |
| Yaizu-shi | 85.2 | 86.3 |
| Fujieda-shi | 85.3 | 84.9 |
| Total | 82.8 | 82.5 |
^a Trips between outer cities or shorter than 1 km are excluded.

Evaluation of OD Trips

Focus Points

The next analysis compares estimated OD trips between regional zones with the PT survey data. Here, two trip criteria, 1 and 3 km, were used to identify the movement of mobile phones. Focus points include the trip criterion, time resolution, and spatial resolution given in Table 4.

Comparison Focusing on Trip Criterion

Table 7 shows a comparative analysis of the OD trips yielded by MSD and the PT survey data. The PT survey data were aggregated for two trip distance groups, trips longer than 1 km and trips longer than 3 km, and compared with MSD for the same distance groups. The PT survey for the trips exceeding 1 km resulted in 708,000 trips. MSD for this trip distance group resulted in 675,000 trips, approximately 95.3% of the PT survey result. The PT survey for the trips exceeding 3 km resulted in 660,000 trips. MSD for the same trip distance group resulted in 628,000 trips, approximately 95.2% of the PT survey result. The difference between MSD and the PT survey data can be explained by the difference in definitions of the origin and destination. MSD recognize consecutive stopping points at which a phone stays for longer than 1 h as the origin and destination areas. Consider the case in which a mobile phone user travels to a destination area, stays there for less than 1 h, and returns to the departure area. MSD consider this case as one trip, while the PT survey considers it to be two trips. The results reflect this difference.
Table 7. Intercity OD trip comparison
| Trip distance | PT survey | MSD |
| Longer than 1 km | 708,000 trips | 675,000 trips (95.3% of PT survey results) |
| Longer than 3 km | 660,000 trips | 628,000 trips (95.2% of PT survey results) |

Comparison Focusing on Time Resolution

The authors compared MSD and the PT survey data focusing on time resolution for the 1 km trip criterion. One of the objectives for conducting PT surveys is to capture the actual movement of people during commuting hours. In general, the PT survey defines 6 a.m. to 9 a.m. as commuting hours. Given that there is almost the same number of trips from 9 a.m. to 10 a.m., the authors defined the 4 h from 6 a.m. to 10 a.m. as the morning peak hours. Furthermore, PT surveys consider 5 p.m. to 7 p.m. as the peak return hours, and a similar trend continues from 1 h before 5 p.m. to 1 h after 7 p.m. Thus, the authors defined the 4 h from 4 p.m. to 8 p.m. as the evening peak hours.
Intercity OD trip comparisons of MSD and PT survey data during the morning peak hours (6 a.m. to 10 a.m.) and evening peak hours (4 p.m. to 8 p.m.) are shown in Figs. 7 and 8, respectively. The correlation between the two statistics is 0.989 during the morning peak hours and 0.990 during the evening peak hours. This high correlation suggests that the estimated OD matrix resembles that of the questionnaire-based PT survey.
Fig. 7. Comparison of OD trips (6 a.m. to 10 a.m.).
Fig. 8. Comparison of OD trips (4 p.m. to 8 p.m.).
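A correlation of this kind can be computed by pairing the MSD and PT survey values for each OD pair. The sketch below assumes the Pearson correlation coefficient, which the text does not state explicitly; the function name `pearson` and the sample values are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical trip counts for four OD pairs (not data from the study).
pt_trips = [1000, 2000, 3000, 4000]
msd_trips = [900, 2100, 2900, 4100]
print(round(pearson(pt_trips, msd_trips), 3))
```

A high coefficient indicates that the two OD matrices are close to linearly related; it does not by itself show that the trip totals agree, which is why the totals are compared separately.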
Table 8 shows a comparative analysis of the OD trips during the morning and evening peak hours. The OD trips estimated by MSD during the morning peak hours totaled 232,000, 93.2% of the PT survey value. The degree to which the two statistics differ is almost the same as that for the total number of trips (95.3%). The OD trips estimated by MSD during the evening peak hours totaled 230,000, which accounted for 119.8% of the PT survey value. This can be explained by focusing on the purpose of the trips. From 4 p.m. to 8 p.m., the purpose of the trips includes returning home, business activities, and private activities, such as shopping or having a meal. The results imply that MSD have the potential to capture trips made for business activities and private activities, which most PT surveys have difficulty in capturing.
Table 8. Intercity OD trip comparison: morning and evening peak hours
| Time period | PT survey | MSD |
| Total | 708,000 trips | 675,000 trips (95.3% of PT survey results) |
| Morning peak hours | 249,000 trips | 232,000 trips (93.2% of PT survey results) |
| Evening peak hours | 192,000 trips | 230,000 trips (119.8% of PT survey results) |

Comparison Focusing on Spatial Resolution

To examine the spatial resolution of MSD, the authors used municipal areas and medium zones defined by the PT survey data. The resulting correlation coefficients of the intercity OD trips are 0.997 for the 1 km trip criterion and 0.994 for the 3 km trip criterion, as indicated in Figs. 9 and 10, respectively. The correlation coefficients of the OD trips between medium zones are 0.959 for the 1 km trip criterion and 0.927 for the 3 km trip criterion, as indicated in Figs. 11 and 12, respectively, which are slightly lower than the intercity values. Also, there is some degree of variation in the difference between MSD and the PT survey data. To examine the degree to which the two statistics differ, the authors used a deviation rate according to previous work (Makita et al. 2013; Oyabu et al. 2013): $\mu_{ij}$ is the average of the two statistics, calculated as
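$$\mu_{ij} = \frac{\mathrm{MSD}_{ij} + \mathrm{PT}_{ij}}{2} \tag{3}$$
where $\mathrm{MSD}_{ij}$ = number of estimated OD trips from zone $z_i$ to zone $z_j$; and $\mathrm{PT}_{ij}$ = number of OD trips from zone $z_i$ to zone $z_j$. The deviation rate $\delta_{ij}$ is defined as the normalized difference between the number of estimated trips and the average of the two statistics:
$$\delta_{ij} = \frac{\mathrm{MSD}_{ij} - \mu_{ij}}{\mu_{ij}} \tag{4}$$
According to the definition, the following equation can be derived by substituting Eq. (3) into Eq. (4):
$$\delta_{ij} = \frac{\mathrm{MSD}_{ij} - \mathrm{PT}_{ij}}{\mathrm{MSD}_{ij} + \mathrm{PT}_{ij}} \tag{5}$$
When $\delta_{ij}$ is zero, the value of $\mathrm{MSD}_{ij}$ is equal to that of $\mathrm{PT}_{ij}$. For $\delta_{ij}$ to be $-1$, $\mathrm{MSD}_{ij} = 0$ and $\mathrm{PT}_{ij} > 0$, while for $\delta_{ij}$ to be $1$, $\mathrm{PT}_{ij} = 0$ and $\mathrm{MSD}_{ij} > 0$.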
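The deviation rate of Eq. (5), and the share of OD pairs that fall within a given deviation rate, can be computed as in the following sketch. The function names `deviation_rate` and `share_within`, the handling of OD pairs with no trips in either dataset (skipped here), and the sample values are assumptions made for illustration.

```python
def deviation_rate(msd, pt):
    """Deviation rate (MSD - PT) / (MSD + PT) for one OD pair.

    Returns None when both counts are zero, where the rate is undefined
    (skipping such pairs is an assumption of this sketch).
    """
    total = msd + pt
    if total == 0:
        return None
    return (msd - pt) / total


def share_within(pairs, threshold):
    """Fraction of OD pairs whose absolute deviation rate is <= threshold."""
    rates = [deviation_rate(m, p) for m, p in pairs]
    rates = [r for r in rates if r is not None]
    return sum(abs(r) <= threshold for r in rates) / len(rates)


# Hypothetical (MSD, PT) trip counts for four OD pairs.
pairs = [(110, 100), (50, 100), (0, 10), (100, 100)]
print([deviation_rate(m, p) for m, p in pairs])
print(share_within(pairs, 0.1))
```

Because the denominator is the sum of the two counts, the rate is bounded between −1 and 1 and treats overestimation and underestimation symmetrically, which makes thresholds such as ±0.1 directly comparable across OD pairs of different sizes.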
Fig. 9. Comparison of OD trips (intercity with 1 km trip criterion).
Fig. 10. Comparison of OD trips (intercity with 3 km trip criterion).
Fig. 11. Comparison of OD trips (inter-medium-zone with 1 km trip criterion).
Fig. 12. Comparison of OD trips (inter-medium-zone with 3 km trip criterion).
Regarding the 1 km trip criterion, 83.3% of intercity OD pairs lie within a deviation rate of ±0.1, as indicated in Table 9. The proportion of OD pairs reaches almost 100% (97.6%) at a deviation rate of ±0.2. Conversely, only 69.8% of inter-medium-zone OD pairs lie within a deviation rate of ±0.3. The authors note that the estimated number of OD trips tends to be lower than that of the PT survey data when the trip is long. This is mainly due to the sampling error. Even though the PT survey is designed so that the relative standard deviation is less than 20%, OD pairs with a small number of trips, such as 30 or 40, tend to have a relatively large error. To address this, the authors investigated the OD pairs with higher reliability by excluding those with fewer than 100 trips. With this restriction, 92.6% of inter-medium-zone OD pairs lie within a deviation rate of ±0.3, as given in Table 10. Another reason why the two statistics differ lies in the existence of trips during the daytime. MSD estimate a larger number of trips to an adjacent zone than the PT survey does during the daytime. The results imply that MSD record trips made for shopping or having a meal, whereas PT surveys have difficulty in capturing such trips.
The difference between the 1 and 3 km trip criteria can be explained by the installation density of base stations. The average installation interval of base stations is approximately one to a few kilometers in the suburbs. MSD extract trips only when base stations detect mobile phones as they move from one base station to another. Thus, in the suburbs, trips extracted with the 3 km trip criterion may be considerably longer than the threshold, more so than with the 1 km trip criterion. Based on these results, it is important to determine the trip criterion in MSD based on the installation density of the base stations.
Table 9. Proportion of OD pairs within deviation rate
| Spatial resolution with trip criterion | Within ±0.3 (%) | Within ±0.2 (%) | Within ±0.1 (%) | Number of OD pairs |
| --- | --- | --- | --- | --- |
| Municipal zone with 1 km trip criterion | 100 | 97.6 | 83.3 | 42 |
| Municipal zone with 3 km trip criterion | 100 | 100 | 81.0 | 42 |
| Medium zone with 1 km trip criterion | 69.8 | 56.3 | 31.8 | 3,482 |
| Medium zone with 3 km trip criterion | 60.6 | 45.8 | 24.9 | 3,482 |
Table 10. Proportion of OD pairs within deviation rate (fewer than 100 trips are excluded)
| Spatial resolution with trip criterion | Within ±0.3 (%) | Within ±0.2 (%) | Within ±0.1 (%) | Number of OD pairs |
| --- | --- | --- | --- | --- |
| Medium zone with 1 km trip criterion | 92.6 | 79.1 | 46.2 | 1,759 |
| Medium zone with 3 km trip criterion | 86.7 | 69.5 | 38.6 | 1,607 |
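The proportions reported in Tables 9 and 10 can be illustrated with a minimal sketch. It assumes that the deviation rate of an OD pair is the signed relative difference (MSD − PT)/PT and that the 100-trip exclusion is applied to the PT survey count; both are assumptions made here for illustration, and the function names and trip counts are hypothetical rather than taken from the study.

```python
from typing import Dict, Tuple

ODPair = Tuple[str, str]  # (origin zone, destination zone)


def deviation_rate(msd_trips: float, pt_trips: float) -> float:
    # Assumed definition: signed relative difference of MSD from the PT survey.
    return (msd_trips - pt_trips) / pt_trips


def share_within(msd: Dict[ODPair, float], pt: Dict[ODPair, float],
                 threshold: float, min_trips: float = 0.0) -> float:
    """Percentage of OD pairs whose absolute deviation rate is <= threshold.

    OD pairs with fewer than `min_trips` PT survey trips are excluded
    (assumption: the exclusion is applied to the PT survey count).
    """
    kept = [od for od, n in pt.items() if od in msd and n > 0 and n >= min_trips]
    if not kept:
        raise ValueError("no OD pairs left after filtering")
    hits = sum(abs(deviation_rate(msd[od], pt[od])) <= threshold for od in kept)
    return 100.0 * hits / len(kept)


# Made-up trip counts for four OD pairs (not taken from the study).
pt_trips = {("A", "B"): 1000, ("A", "C"): 400, ("B", "C"): 40, ("C", "A"): 30}
msd_trips = {("A", "B"): 1100, ("A", "C"): 300, ("B", "C"): 70, ("C", "A"): 12}

print(share_within(msd_trips, pt_trips, threshold=0.3))
print(share_within(msd_trips, pt_trips, threshold=0.3, min_trips=100))
```

In this toy example the two OD pairs with only 30 and 40 PT trips carry the largest deviations, so excluding them raises the share within ±0.3, which mirrors the direction of the change between Tables 9 and 10.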

Discussion

As shown by the comparison of OD trips between municipal zones and between medium zones, MSD have a high correlation with the Shizuoka Chubu metropolitan PT survey data. This suggests that MSD have great potential to capture the actual movement of people between areas. The evaluation also clarifies the differences in characteristics that result from the estimation method. The proposed method captures trips made for business and private activities during the daytime, which not all individuals report in the PT survey. Future studies will include an examination of the reliability of estimated OD trips between small zones. The small number of OD trips makes it difficult to ensure statistical reliability. However, statistics collected from a large amount of operational data would allow even a small number of OD trips to be captured. Another challenge is determining the trip criterion based on the density of base stations. In metropolitan areas, the average installation interval is approximately 500 m to 1 km, and thus it would be possible to record trips shorter than 1 km. PT surveys are conducted in some regions and reveal the characteristics of their residents. Comparative analysis using the PT surveys in different regions would help in assessing the usefulness of MSD. Moreover, comparison under more consistent conditions should be addressed.

Use Case Analysis for the Application of MSD in the Urban Transportation Field

Imai et al. (2012a) showed the basic characteristics and use cases of a range of trail data, including location data acquired from mobile phone networks and car navigation systems, and censuses. The authors analyzed use cases of trail data by following the prior study to clarify the usefulness and application fields of MSD, as shown in Table 11. Here, MSD were included in mobile data.
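Table 11. Use cases of traffic-related data
Column groups: official statistics (PT survey, road traffic census, national census); trail data to be used (mobile data, transportation smart card, wireless LAN, car probe, bus location data); foundation data (electronic map, network data).
| No. | Use cases of trail data | PT survey | Road traffic census | National census | Mobile data | Transportation smart card | Wireless LAN | Car probe | Bus location data | Electronic map | Network data |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Visualization of urban area activities | b | b | b | b | b | a | b | b | b | b |
| 2 | Disaster prevention planning and evacuation | b | N/A | b | b | b | a | N/A | N/A | b | b |
| 3 | Public transportation services for the elderly | b | N/A | b | b | b | N/A | N/A | N/A | b | b |
| 4 | Exploring potential demand for public transportation | b | b | b | b | b | N/A | b | b | b | b |
| 5 | Complement, raise efficiency and sophistication of PT survey | b | N/A | b | b | b | a | a | N/A | b | b |
| 6 | Complement, raise efficiency and sophistication of road traffic census | N/A | b | N/A | b | N/A | N/A | b | b | b | b |
| 7 | Diversification and sophistication of monitoring of the effect of road works | b | b | b | b | b | a | b | b | b | b |
| 8 | Travel assistance for travelers from abroad, identifying spatial immobility | N/A | N/A | N/A | b | a | a | N/A | N/A | b | N/A |
Note: N/A = not applicable; a = data desirable for evaluation; b = data required for evaluation.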
Visualization of urban area activities is possible by combining and analyzing MSD with electronic maps and network data. MSD allow the capturing of the movement of people nationwide on a 24-h, 365-day basis, and thus would become the foundation for urban planning and development. One of the use cases is to support disaster prevention planning by estimating the number of people who would seek to return home in the case of a disaster. Another application is to make public transportation services better organized to assist the elderly, and to explore the potential demand for public transportation.
PT surveys and road traffic censuses are conducted only on a specific day and once every 5 or 10 years. This makes it difficult to monitor the actual conditions of traffic behavior immediately after a disaster or during an event. MSD could capture the movement of people in a continuous manner and thus reveal weekly, monthly, or yearly changes, complementing PT surveys and road traffic censuses. For example, MSD could be used for temporal data correction of PT surveys after the survey date. Furthermore, MSD could be used to generate wider area population flow statistics to complement PT surveys, conducted only in metropolitan areas.
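The paper does not specify how such a temporal correction would be carried out. As a minimal sketch of one possible approach, the PT survey count of each OD pair could be scaled by the ratio of the MSD estimate on a target date to the MSD estimate on the survey date. The ratio scaling, the function name, and the numbers below are assumptions for illustration only.

```python
from typing import Dict, Tuple

ODPair = Tuple[str, str]  # (origin zone, destination zone)


def temporally_correct(pt_od: Dict[ODPair, float],
                       msd_survey_day: Dict[ODPair, float],
                       msd_target_day: Dict[ODPair, float]) -> Dict[ODPair, float]:
    """Scale PT survey OD trips by the MSD change between two dates.

    Hypothetical ratio correction: corrected = PT * MSD(target) / MSD(survey).
    OD pairs without an MSD estimate on the survey date are left unchanged.
    """
    corrected = {}
    for od, pt_trips in pt_od.items():
        base = msd_survey_day.get(od, 0.0)
        if base > 0:
            corrected[od] = pt_trips * msd_target_day.get(od, 0.0) / base
        else:
            corrected[od] = float(pt_trips)
    return corrected


# Made-up values: MSD trips on A->B grew from 800 to 1,000 after the survey date.
pt_od = {("A", "B"): 1000, ("B", "C"): 50}
msd_then = {("A", "B"): 800}
msd_now = {("A", "B"): 1000}
print(temporally_correct(pt_od, msd_then, msd_now))
```

A ratio of this kind preserves the PT survey level while borrowing only the relative change from MSD, which is one reason it may be preferable to replacing the survey values outright.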
MSD are highly applicable to each use case mainly because they cover all transportation modes. Mobile phone users include office workers and students, who move from the home to the office or school by walking, cycling, bus, automobile, or train. Mobile phone users also include tourists and business travelers, who travel by bus, train, or airplane. MSD include no information on the means of transportation. However, combining MSD data with another data source has the potential for traffic flow analysis on a large scale. A road traffic census gives only information on the movement of automobiles on the highway; thus, MSD could reveal the movement before automobiles ingress or egress the highway. Another use case is to combine MSD with GPS data or automobile probe data, giving the potential for traffic flow analysis at road sections and intersections. This use case can be practically applied to monitor the effects of road construction. As discussed, statistics generated from operation data of the mobile network would become a key data resource in urban transportation planning.

Summary and Conclusions

This study introduced requirements for statistics generated from the operational data of a mobile network, which can be applied to the urban transportation field. Based on the requirements, the authors proposed a method for estimating MSD that comprises anonymization, estimation, and disclosure limitation processes. To increase reliability, the proposed method introduces a multiplying factor characterized by the age, gender, and residential areas of mobile phone users in estimating OD trips. Comparison with the PT survey revealed consistent characteristics and great potential for monitoring the movement of people in metropolitan areas. MSD can estimate OD trips over the entire nation on a 24-h, 365-day basis and thus can be used as a complement to the PT survey, which is conducted at intervals of 10 years in metropolitan areas. The data allow surveys to be conducted in regions where PT surveys have not previously been conducted, owing to budget constraints. MSD could also be used in temporal data correction after the PT survey date and population flow estimation of extraregional residents.
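The role of the multiplying factor can be sketched as follows. The sketch assumes that the factor for each stratum (age group, gender, residential area) is the ratio of the resident population to the number of mobile phone users in that stratum, and that observed trips are already aggregated by stratum. This ratio form, the names, and the numbers are illustrative assumptions and not the authors' exact formulation.

```python
from collections import defaultdict
from typing import Dict, Tuple

Stratum = Tuple[str, str, str]      # (age group, gender, residential area)
ODStratum = Tuple[str, str, Stratum]  # (origin, destination, stratum)


def multiplying_factors(population: Dict[Stratum, float],
                        users: Dict[Stratum, float]) -> Dict[Stratum, float]:
    # Assumed form: resident population divided by mobile phone users per stratum.
    return {s: population[s] / n for s, n in users.items()
            if n > 0 and s in population}


def estimate_od_trips(observed: Dict[ODStratum, float],
                      factors: Dict[Stratum, float]) -> Dict[Tuple[str, str], float]:
    """Expand observed trip counts to the whole population and sum per OD pair."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for (origin, dest, stratum), count in observed.items():
        totals[(origin, dest)] += count * factors[stratum]
    return dict(totals)


# Made-up strata and counts (not taken from the study).
s1 = ("20s", "F", "Zone A")
s2 = ("30s", "M", "Zone A")
factors = multiplying_factors({s1: 1000, s2: 900}, {s1: 400, s2: 300})
observed = {("A", "B", s1): 10, ("A", "B", s2): 5}
print(estimate_od_trips(observed, factors))
```

Weighting by stratum rather than by a single overall ratio compensates for differences in mobile phone uptake between, for example, age groups, which is the motivation the text gives for characterizing the factor by age, gender, and residential area.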
Understanding the limitations of MSD is important when using the data in urban planning. The time resolution is 1 h, determined by the specifications of the base stations used in the NTT DOCOMO mobile network. The spatial resolution is expressed as an approximate 500 m grid in metropolitan areas and at the municipal level in the suburbs, which depends on the base station installation density. The authors introduced a time criterion in the proposed estimation method, defining the stationary time of mobile phones as 1 h in this study, because base stations detect all mobile phones approximately once every hour. It would be possible to refine the criterion; however, that might cause undercounting of OD trips. Another limitation is that MSD provide no information on the means of transportation and the purpose of travel. This means that the statistics could not cover all the survey items of the PT survey and road traffic census. However, combining MSD with another data source has the potential for obtaining the actual movement of people or automobiles, which is difficult to derive from a single survey. Combining MSD with PT surveys could allow a survey of population flow on a broader scale. The former gives the nationwide movement of people with high reliability, while the latter gives information on the means of transportation and the purpose of travel. Combining MSD with GPS data or automobile probe data has the potential for traffic flow analysis at road sections and intersections. MSD will become a key data resource in urban transportation planning.
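The interplay between the time criterion and the trip criterion can be illustrated with a simplified sketch. It assumes hourly sightings of one anonymized phone, each given as an hour and the planar coordinates (in kilometers) of the detecting base station area. A stay is taken to be a location observed continuously for at least 1 h, and a trip is a movement between successive stays whose straight-line distance meets the trip criterion. The data layout and the function are hypothetical and much simpler than the estimation process proposed in this paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]     # (x_km, y_km) of the detecting base station area
Sighting = Tuple[float, Point]  # (hour of detection, location)


def extract_trips(sightings: List[Sighting], min_stay_h: float = 1.0,
                  min_trip_km: float = 1.0) -> List[Tuple[Point, Point, float, float]]:
    """Return (origin, destination, departure hour, arrival hour) per trip."""
    # Step 1: group consecutive sightings at the same location into stays.
    stays = []  # (location, first hour seen, last hour seen)
    i = 0
    while i < len(sightings):
        start, loc = sightings[i]
        j = i
        while j + 1 < len(sightings) and sightings[j + 1][1] == loc:
            j += 1
        end = sightings[j][0]
        if end - start >= min_stay_h:  # time criterion
            stays.append((loc, start, end))
        i = j + 1
    # Step 2: keep movements between successive stays that are long enough.
    trips = []
    for (a, _, departed), (b, arrived, _) in zip(stays, stays[1:]):
        if math.dist(a, b) >= min_trip_km:  # trip (distance) criterion
            trips.append((a, b, departed, arrived))
    return trips


# Made-up day: home at (0, 0), pass-through at (0.5, 0), work at (2, 0),
# then a short move to (2.4, 0).
day = [(7, (0, 0)), (8, (0, 0)), (9, (0.5, 0)), (10, (2, 0)),
       (11, (2, 0)), (12, (2, 0)), (13, (2.4, 0)), (14, (2.4, 0))]
print(extract_trips(day, min_trip_km=1.0))
print(extract_trips(day, min_trip_km=3.0))
```

With the 1 km criterion only the 2 km movement is counted, and with the 3 km criterion no trip is counted at all. This illustrates why the choice of trip criterion changes the number of estimated OD trips.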
Future work will include an examination of the reliability of estimated OD trips between small zones. Another challenge is to determine the trip criterion based on the density of base stations. These investigations would allow the application range of MSD to be extended. Comparative analysis with the PT surveys conducted in different regions will be useful in validating the proposed method. Comparison with PT survey data from the same year should also be addressed. The progress in this study and future work will contribute to exploring the potential of the operational data of mobile networks and to establishing a monitoring and analysis method in the urban transportation field.

Data Availability Statement

The MSD and PT survey data used during the study are proprietary and may only be provided with restrictions. Short OD trips may not be provided to protect the privacy of mobile phone users and residents in the research fields.

Acknowledgments

The authors acknowledge the general assistance of Dr. Ryosuke Shibasaki and Dr. Yoshihide Sekimoto of the University of Tokyo. The authors thank Dr. Tsutomu Yabe, Mr. Kazuki Hirokawa, and Mr. Ryoji Ishii of the Institute of Behavioral Sciences, Mr. Toshikazu Matsushima of Chuo Fukken Consultants, and Ms. Aya Fukute of NTT DOCOMO for their support in evaluating the reliability of MSD. The authors greatly appreciate the support of Dr. Hiroyoshi Hashimoto, Mr. Jundo Yoshida, Mr. Daisuke Toriumi, and Mr. Makoto Takimoto, from the National Institute for Land and Infrastructure Management, and Mr. Satoshi Hayasaki of NTT DOCOMO.

References

Alexander, L., S. F. Jiang, M. Murga, and M. C. Gonzalez. 2015. “Origin-destination trips by purpose and time of day inferred from mobile phone data.” Transp. Res. Part C: Emerging Technol. 58: 240–250. https://doi.org/10.1016/j.trc.2015.02.018.
Becker, R., R. Caceres, K. Hanson, J. Loh, and S. Urbanek. 2011a. “A tale of one city: Using cellular network data for urban planning.” IEEE Pervasive Comput. 10 (4): 18–26. https://doi.org/10.1109/MPRV.2011.44.
Becker, R., R. Caceres, K. J. Hanson, S. Isaacman, J. Loh, M. Martonosi, J. Rowland, S. Urbanek, A. Varshavsky, and C. Volinsky. 2013. “Human mobility characterization from cellular network data.” Commun. ACM 56 (1): 74–82. https://doi.org/10.1145/2398356.2398375.
Becker, R., R. Caceres, K. Hanson, J. Loh, S. Urbanek, A. Varshavsky, and C. Volinsky. 2011b. “Route classification using cellular handoff patterns.” In Proc., 13th Int. Conf. on Ubiquitous Computing, 123–132. New York: ACM.
Berlingerio, M., F. Calabrese, G. Lorenzo, R. Nair, F. Pinelli, and M. Sbodio. 2013. “All aboard: A system for exploring urban mobility and optimizing public transport using cellphone data.” In Proc., Joint European Conf. on Machine Learning and Knowledge Discovery in Databases, 663–666. Berlin: Springer.
Caceres, N., L. Romero, and F. Benitez. 2013. “Inferring origin–destination trip matrices from aggregate volumes on groups of links: A case study using volumes inferred from mobile phone data.” J. Adv. Transp. 47: 650–666. https://doi.org/10.1002/atr.v47.7.
Caceres, N., J. Wideberg, and F. Benitez. 2007. “Deriving origin-destination data from a mobile phone network.” IET Intel. Transport Syst. 1 (1): 15–26. https://doi.org/10.1049/iet-its:20060020.
Calabrese, F., M. Diao, G. D. Lorenzo, J. Ferreira, and C. Ratti. 2013. “Understanding individual mobility patterns from urban sensing data: A mobile phone trace example.” Transp. Res. Part C: Emerging Technol. 26: 301–313. https://doi.org/10.1016/j.trc.2012.09.009.
Calabrese, F., G. D. Lorenzo, L. Liu, and C. Ratti. 2011. “Estimating origin-destination flows using mobile phone location data.” IEEE Pervasive Comput. 10 (4): 36–44. https://doi.org/10.1109/MPRV.2011.41.
Chen, C., L. Bian, and J. Ma. 2014. “From sightings to activity locations: How well can we guess the locations visited from mobile phone sightings.” Transp. Res. Part C: Emerging Technol. 46 (10): 326–337. https://doi.org/10.1016/j.trc.2014.07.001.
Gonzalez, M., C. Hidalgo, and A.-L. Barabási. 2008. “Understanding individual human mobility patterns.” Nature 453: 779–782. https://doi.org/10.1038/nature06958.
Imai, R., M. Fukada, K. Shigetaka, T. Yabe, K. Makimura, and R. Adachi. 2013. “Feasibility study on applicability of multi-trail data using combinational analysis in urban transportation planning.” [In Japanese.] In Vol. 48 of Proc., Infrastructure Planning. Tokyo: JSCE.
Imai, R., Y. Iboshi, T. Chiba, K. Makimura, and S. Hamada. 2012a. “Verification of the effects on road maintenance and improvement using trail data of bus smart card.” [In Japanese.] Infrastruct. Plann. Rev. 68 (5): 1271–1278.
Imai, R., Y. Iboshi, T. Nakamura, K. Makimura, and S. Hamada. 2012b. “The supporting method of bus transportation planning using multi-trail data.” [In Japanese.] Infrastruct. Plann. Rev. 68 (5): 1287–1296.
Imai, R., Y. Iboshi, T. Nakamura, J. Morio, K. Makimura, and S. Hamada. 2012c. “Consideration on practical use of trail data acquired by smart card of transportation.” [In Japanese.] In Vol. 45 of Proc., Infrastructure Planning. Tokyo: JSCE.
Iqbal, M. S., C. F. Choudhury, P. Wang, and M. C. Gonzalez. 2014. “Development of origin-destination matrices using mobile phone call data.” Transp. Res. Part C: Emerging Technol. 40: 63–74. https://doi.org/10.1016/j.trc.2014.01.002.
Ishizuka, H., N. Kobayashi, S. Muramatsu, and C. Ono. 2015. Classifying the mode of transportation using cell tower alignments. IPSJ SIG Technical Rep. 2015-MBL-74/2015-UBI-45(57), 1–7. Tokyo: IPSJ.
Jiang, S. F., J. Ferreira, and M. C. Gonzalez. 2017. “Activity-based human mobility patterns inferred from mobile phone data: A case study of Singapore.” IEEE Trans. Big Data 3 (2): 208–219. https://doi.org/10.1109/TBDATA.2016.2631141.
Jiang, S., G. Fiore, Y. Yang, J. Ferreira, E. Frazzoli, and M. Gonzalez. 2013. “A review of urban computing for mobile phone traces: Current methods, challenges and opportunities.” In Proc., 2nd ACM SIGKDD Int. Workshop on Urban Computing, 2:1–2:9. New York: ACM.
Makita, N., M. Kimura, M. Terada, M. Kobayashi, and Y. Oyabu. 2013. “Can mobile phone network data be used to estimate small area population? A comparison from Japan.” Stat. J. IAOS 29 (3): 223–232.
Ming-Heng, W., S. Schrock, N. VanderBroek, and T. Mulinazzi. 2013. “Estimating dynamic origin-destination data and travel demand using cell phone network data.” Int. J. Intell. Transp. Syst. Res. 11 (2): 76–86.
Momma, T., H. Hashimoto, and S. Matsumoto. 2011. “Road traffic analysis using probe data.” [In Japanese.] Civ. Eng. J. 53 (10): 14–17.
Murakami, M., I. Okajima, T. Suzuki, and M. Yamashita. 2011. “Applicability of mobile spatial statistics for estimating victims unable to return home and case study.” [In Japanese.] In Proc., Architectural Institute of Japan. F-1, 893–894. Tokyo: AIJ.
Nagata, T., S. Aoyagi, and H. Kawakami. 2013. “Using mobile spatial statistics for regional revitalization.” NTT DOCOMO Tech. J. 14 (3): 46–50.
Odawara, T., and H. Kawakami. 2013. “Using mobile spatial statistics in field of urban planning.” NTT DOCOMO Tech. J. 14 (3): 31–36.
Odawara, T., and T. Nagata. 2014. “An estimation technique of social dynamics: The technology and applications of mobile spatial statistics.” [In Japanese.] J. Inst. Electron. Inf. Commun. Eng. 97 (9): 806–811.
Okajima, I., S. Tanaka, M. Terada, D. Ikeda, and T. Nagata. 2013. “Supporting growth in society and industry using statistical data from mobile terminal networks: Overview of mobile spatial statistics.” NTT DOCOMO Tech. J. 14 (3): 4–9.
Oyabu, Y., M. Terada, T. Yamaguchi, S. Iwasaki, J. Hagiwara, and D. Koizumi. 2013. “Evaluating reliability for mobile spatial statistics.” NTT DOCOMO Tech. J. 14 (3): 16–23.
Phithakkitnukoon, S., and C. Ratti. 2011. “Inferring asymmetry of inhabitant flow using call detail records.” J. Adv. Inf. Technol. 2 (4): 239–249.
Seike, T., H. Mimaki, Y. Hara, and S. Morita. 2013. “Research on the practical use fields of mobile spatial statistics in a municipality: Case studies at Kashiwa City.” [In Japanese.] AIJ J. Technol. Des. 19 (42): 737–742. https://doi.org/10.3130/aijt.19.737.
Seike, T., H. Mimaki, Y. Hara, T. Odawara, T. Nagata, and M. Terada. 2011. “Research on the applicability of mobile spatial statistics for enhanced urban planning.” [In Japanese.] J. City Plann. Inst. Jpn. 46 (3): 451–456.
Seike, T., H. Mimaki, and S. Morita. 2015. “Research on the evaluation of regional peculiarities in Kashiwa and Yokohama by mobile spatial statistics.” [In Japanese.] AIJ J. Technol. Des. 21 (48): 821–826. https://doi.org/10.3130/aijt.21.821.
Sekimoto, Y., Y. Matsubayashi, H. Yamada, R. Imai, T. Usui, and H. Kanasugi. 2012. “Lightweight lane positioning of vehicles using a smartphone GPS by monitoring the distance from the center line.” In Proc., IEEE 15th Int. Conf. on Intelligent Transportation Systems, 1561–1565. New Jersey: IEEE.
Sekimoto, Y., A. Watanabe, T. Nakamura, H. Kanasugi, and T. Usui. 2013. “Combination of spatio-temporal correction methods using traffic survey data for reconstruction of people flow.” Pervasive Mob. Comput. J. 9 (5): 629–642. https://doi.org/10.1016/j.pmcj.2012.10.005.
Sengoku, H., Y. Akiyama, and R. Shibasaki. 2011. “Analysis of consumer’s behavior in commercial accumulation using auto log data of handy GPS.” [In Japanese.] In Vol. 20 of Proc., Geographic Information Systems Association, D-4-1. Tokyo: GISA.
Shizuoka Prefecture. n.d. “Statistics of Shizuoka prefecture.” Accessed May 8, 2020. http://www.pref.shizuoka.jp.
Sohn, K., and D. Kim. 2008. “Dynamic origin-destination flow estimation using cellular communication system.” IEEE Trans. Veh. Technol. 57 (5): 2703–2713. https://doi.org/10.1109/TVT.2007.912336.
Song, C., Z. Qu, N. Blumm, and A.-L. Barabási. 2010. “Limits of predictability in human mobility.” Science 327 (5968): 1018–1021. https://doi.org/10.1126/science.1177170.
Suzuki, T., M. Yamashita, and M. Terada. 2013. “Using mobile spatial statistics in field of disaster prevention planning.” NTT DOCOMO Tech. J. 14 (3): 37–45.
Tongsinoot, L., and V. Muangsin. 2017. “Exploring home and work locations in a city from mobile phone data.” In Proc., 19th Int. IEEE Conf. on High Performance Computing and Communications, 123–129. New Jersey: IEEE.
Wang, F., and C. Chen. 2018. “On data processing required to derive mobility patterns from passively-generated mobile phone data.” Transp. Res. Part C: Emerging Technol. 87: 58–74. https://doi.org/10.1016/j.trc.2017.12.003.
White, J., and I. Wells. 2002. “Extracting origin destination information from mobile phone data.” In Proc., IEEE 11th Int. Conf. on Road Transport Information and Control, 30–34. New Jersey: IEEE.
Yin, M., M. Sheehan, S. Feygin, J. Paiement, and A. Pozdnoukhov. 2018. “A generative model of urban activities from cellular data.” IEEE Trans. Intell. Transp. Syst. 19 (6): 1682–1696. https://doi.org/10.1109/TITS.6979.
Zhang, Y., X. Qin, S. Dong, and B. Ran. 2010. “Daily O-D matrix estimation using cellular probe data.” In Proc., Transportation Research Board 89th Annual Meeting. 9. Washington, D.C.: TRB.

Published In

Journal of Urban Planning and Development
Volume 147, Issue 1, March 2021

History

Received: Feb 25, 2019
Accepted: Jul 27, 2020
Published online: Oct 23, 2020
Published in print: Mar 1, 2021
Discussion open until: Mar 23, 2021

Authors

Affiliations

Ryuichi Imai, Ph.D.
Professor, Faculty of Engineering and Design, Hosei Univ., 2-33 Ichigayatamachi, Shinjuku-ku, Tokyo, Japan.
Executive Research Engineer, Research Laboratories, NTT DOCOMO, Inc., 3-6 Hikarino-oka, Yokosuka-shi, Kanagawa, Japan (corresponding author). ORCID: https://orcid.org/0000-0002-4022-0505.
Hiroyasu Shingai
Head, Urban Facilities Division, National Institute for Land and Infrastructure Management, 1, Tachihara, Tsukuba-shi, Ibaraki, Japan.
Tomohiro Nagata, Ph.D.
Assistant Manager, Platform Business Dept., NTT DOCOMO, Inc., 2-11-1 Nagata-cho, Chiyoda-ku, Tokyo, Japan.
Koichi Shigetaka
Deputy Director General, Iwate Regional Bureau of Reconstruction, 1-7-25 Chuo-dori, Morioka City, Iwate, Japan.
