Assessing the Quality of Home Detection from Mobile Phone Data for Official Statistics

Open access

Abstract

Mobile phone data are an interesting new data source for official statistics. However, multiple problems and uncertainties need to be solved before these data can inform, support or even become an integral part of statistical production processes. In this article, we focus on arguably the most important problem hindering the application of mobile phone data in official statistics: detecting home locations. We argue that current efforts to detect home locations suffer from a blind deployment of criteria to define a place of residence and from limited validation possibilities. We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detail Record (CDR) data set (~18 million users, five months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation data set (resulting in a 35°-gap), and that their performance is sensitive to the time period and the duration of observation. Based on our findings and experiences, we offer several recommendations for official statistics. If adopted, our recommendations would help ensure more reliable use of mobile phone data vis-à-vis official statistics.
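To illustrate why the choice of criterion matters, the following is a minimal sketch of two simple home detection rules applied to the same toy records. It is not one of the five HDAs evaluated in the article: the record layout, the function names, the tower identifiers, and the night-time window (19:00 to 07:00) are all assumptions made for illustration only.

```python
from collections import Counter
from datetime import datetime

# Hypothetical CDR-like records: (user_id, cell_tower_id, timestamp).
# Real CDR schemas differ by operator; this layout is an assumption.
records = [
    ("u1", "tower_A", datetime(2007, 9, 3, 10, 15)),
    ("u1", "tower_A", datetime(2007, 9, 3, 14, 40)),
    ("u1", "tower_A", datetime(2007, 9, 4, 11, 5)),
    ("u1", "tower_B", datetime(2007, 9, 3, 22, 30)),
    ("u1", "tower_B", datetime(2007, 9, 4, 23, 10)),
]


def home_by_max_activity(records, user):
    """Criterion 1: the tower with the most records overall."""
    counts = Counter(tower for uid, tower, _ in records if uid == user)
    return counts.most_common(1)[0][0] if counts else None


def home_by_night_activity(records, user, start_hour=19, end_hour=7):
    """Criterion 2: the tower with the most records during an assumed
    night-time window (start_hour to end_hour, wrapping past midnight)."""
    counts = Counter(
        tower
        for uid, tower, ts in records
        if uid == user and (ts.hour >= start_hour or ts.hour < end_hour)
    )
    return counts.most_common(1)[0][0] if counts else None


home_all = home_by_max_activity(records, "u1")
home_night = home_by_night_activity(records, "u1")
print(home_all, home_night)
```

In this toy data the two criteria disagree for the same user: the overall-activity rule picks the daytime tower, while the night-time rule picks the evening tower. Disagreement of this kind, aggregated over many users, is what a comparison between HDAs would measure.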

Journal of Official Statistics

The Journal of Statistics Sweden
