Analysis of data from the survey with developers on Stack Overflow: A Case Study

Yogesh Beeharry, Manish Ganoo


Many businesses around the world are recognising the ongoing evolution of Big Data Analytics and are investing heavily in it so as not to lose competitive advantage. This paper analyses data from the survey conducted with developers on Stack Overflow to gain insights into trends in programming languages and databases, and into the job-seeking status of developers. The results show that developers increasingly want to use the programming languages and databases that are used on cloud platforms for Big Data Analytics. Additionally, a Distributed Random Forest model that predicts the job-seeking status of developers with 87.64% accuracy indicates that developers may not be looking to move to new job environments and would prefer to stay in their current company or organisation. This suggests that developers are most probably looking to bring added value to their current companies or organisations as Big Data Analytics starts to be adopted.
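
To make the accuracy figure concrete, the following is a minimal sketch of how a forest's per-tree votes can be combined into one prediction per respondent and how accuracy is then computed as the fraction of correct predictions. It is an illustration only and not the authors' pipeline: the function names `majority_vote` and `accuracy`, the labels "looking" and "not looking", and the toy votes are all hypothetical, and real random forest implementations may combine trees differently (for example, by averaging class probabilities).

```python
from collections import Counter


def majority_vote(tree_votes):
    """Combine the class votes of several trees for one respondent."""
    # most_common(1) returns the label with the highest vote count.
    return Counter(tree_votes).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of respondents whose predicted label equals the true label."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy data: votes of three trees for four respondents.
# These values are invented for illustration and are not survey data.
votes_per_respondent = [
    ["not looking", "not looking", "looking"],
    ["looking", "looking", "not looking"],
    ["not looking", "not looking", "not looking"],
    ["looking", "not looking", "not looking"],
]
true_status = ["not looking", "looking", "not looking", "looking"]

predicted_status = [majority_vote(votes) for votes in votes_per_respondent]
forest_accuracy = accuracy(true_status, predicted_status)
```

In this toy case the last respondent is misclassified, so three of four predictions are correct. Accuracy alone does not show how the errors are spread across classes, so a reported figure is usually easier to interpret alongside the class balance of the labels.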




The “ADBU Journal of Engineering Technology (AJET)”, ISSN: 2348-7305

This journal is published under the terms of the Creative Commons Attribution (CC-BY) license.
