Software Development During COVID-19 Pandemic

An Analysis of Stack Overflow and GitHub

Pedro Almir Oliveira, Pedro Santos Neto, Gleison Silva, Irvayne Ibiapina, Werney Luz, Rossana M. C. Andrade

The new coronavirus has become a severe health issue for the world. This situation has motivated studies in many areas to combat the pandemic. In computer science, examples include data visualization projects that follow the evolution of the disease, machine learning models that estimate the behavior of the pandemic, and computer vision systems that process radiologic images for early detection of the disease, among others. Most of these projects are stored in version control systems, and they are discussed on Question & Answer websites. In this work, we conducted a Mining Software Repositories study to analyze a large number of questions and projects, aiming to find trends that could help software development researchers and practitioners fight the coronavirus. We analyzed 1,190 questions from Stack Overflow and Data Science Q&A and 60,352 GitHub projects. We identified a correlation between the questions and the projects throughout the pandemic. Most questions about the coronavirus are how-to questions related to web scraping and data visualization, using Python, JavaScript, and R. The most recurrent GitHub projects are machine learning projects, using JavaScript, Python, and Java. We found that many people, including a large number of beginners, are trying to contribute in some way to tackle the problem. In recent weeks, the number of new projects and questions has been decreasing, which suggests that this activity is stabilizing. Finally, we also present a website with our findings, which facilitates the analysis of what has been done so far and supports new solutions that will help in the fight against the coronavirus.


Datavis

Distribution by Status

Distribution of Questions and Projects Related to COVID-19

Number of Questions

Bubble chart

Most recurrent programming languages

Most discussed topics

Scatter plots

Correlating A) number of forks and pull requests, and B) number of forks and disk usage

Correlation between collaborators and commits
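The scatter plots above relate pairs of repository metrics, such as forks and pull requests or collaborators and commits. As a minimal sketch of how such a relationship could be quantified, the snippet below computes the Pearson and Spearman correlation coefficients in plain Python. This page does not state which coefficient was used, so the choice of coefficients, the function names, and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical per-repository metrics, not taken from the study's dataset.
forks = [0, 1, 3, 10, 25]
pull_requests = [0, 0, 2, 4, 30]

print("Pearson:", pearson(forks, pull_requests))
print("Spearman:", spearman(forks, pull_requests))
```

Repository metrics such as forks and commits tend to be heavily skewed, with a few very popular projects and many small ones, so a rank-based coefficient like Spearman's is often a more robust choice than Pearson's for this kind of data.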


Topic Modeling

Topics considering Stack Overflow data

LDA model plotted using the LDAvis method.

Topics considering GitHub data

LDA model plotted using the LDAvis method.


URLs Related to COVID-19

The URLs were classified as Application, Data repository, Code repository, or Other.

Contact

You can contact us using one of the following emails.