Organizational Design of Big Data and Analytics Teams

Open access

Abstract

Although many would argue that the most important factor for the success of a big data project is the process of analyzing the data, it is more important to staff, structure and organize the participants so that the team collaborates efficiently and makes effective use of its tool sets, the relevant applications and a customized flow of information. The main challenges of big data projects stem from the number of people who are involved and need to collaborate, the need for advanced and specialized education, the approach to solving the analytical problem, which in many cases is undefined, the data set itself (structured or unstructured) and the required hardware and software (such as analysis software or self-learning algorithms). Today, neither an organizational framework nor overarching guidelines exist for creating a high-performance analytics team and integrating it into the organization. This paper addresses (a) the organizational design of a team for a big data project, (b) the relevant roles and competencies (such as programming or communication skills) of the team members and (c) the form in which they are connected and managed.
