Featuring insights from more than 4,000 respondents, the report illustrates how the past year impacted the data science industry, along with areas of success and places for continued growth for the field.
AUSTIN, Texas, July 22, 2021 (GLOBE NEWSWIRE) — The past year has impacted all areas of life and work in myriad ways, and the data science field is no exception. Today, Anaconda, provider of the world’s most popular data science platform, released the 2021 State of Data Science report, sharing valuable insight into some of the changes, trends, and areas for growth in the industry.
Titled “2021 State of Data Science: On the Path to Impact,” the report looks at data science industry trends such as COVID-19’s impact, popular programming languages, data literacy, and bias and explainability in machine learning (ML) models. A total of 4,299 individuals from more than 140 countries took part in the online survey, which was conducted from April 14 to May 5, 2021.
Among the trends highlighted by the report, it’s clear that data science has become an essential part of many organizations’ workflows. However, the results also indicate there is still room to grow. For example, 50% of respondents said that their organization’s investment in data science either stayed the same or increased due to the pandemic, with only 37% saying it led to a decrease. The 2021 State of Data Science report offers insight into the practices and concerns shaping this vital industry, from education to enterprise model deployment. Respondents spanned various age groups, industries, and job functions, allowing for granular analysis of responses across multiple variables.
Some highlights from the report include:
- Many business leaders think quantity should be at the forefront when collecting data for analysis, and this idea makes sense on a certain level. However, survey results suggest that a faulty assumption may lie at the heart of this approach. When asked what the biggest myth in data science is, 31% of respondents said it’s the idea that having access to more data translates to greater accuracy.
- This year’s survey results show that, in the data science sector, automation is welcomed and isn’t seen as a competitor but rather a complementary tool to workers. 55% of respondents hope to see more automation and AutoML in data science, while only 4% are concerned with how automation will impact data scientists’ careers. In fact, when asked what the biggest myth is in data science, 33% of respondents replied it’s that “data scientists will be replaced by AI soon.”
- When asked, 87% responded that their organization allows the use of open-source software.
- Respondents identified the social impacts of bias in data and models as the biggest problem to tackle in AI/ML today; this was the most popular answer, at 31% of responses. However, the survey also revealed that there is still progress to be made on this issue. When asked if their data science teams have any plans to ensure fairness and mitigate bias in data sets and models, 30% of respondents stated that their organizations didn’t have plans, and 30% responded, “I don’t know.” Only 10% of respondents indicated their organization has already implemented a solution to ensure fairness and mitigate bias. Still, it’s encouraging that 30% said they are planning to implement a step in the next year, up from the 23% of respondents in the 2020 State of Data Science survey who said they planned to implement one step in the next 12 months.
“Over the past year, we’ve seen the power of data science for good and the innovation that can happen when the open-source data science community is supported,” said Peter Wang, CEO and co-founder of Anaconda. “This year’s State of Data Science report indicates the field is continuing to show impact, as some organizations increase their investments, many leaders display baseline data literacy, and about half of the respondents see their work involved in many business decisions. There are still clear areas for growth, especially in implementing ethical frameworks and education for data science and machine learning work. I’m excited by the progress the industry continues to make, especially as new generations enter the field, and look forward to its ongoing transformation.”
With more than 25 million users, Anaconda is the world’s most popular data science platform and the foundation of modern machine learning. We pioneered the use of Python for data science, champion its vibrant community, and continue to steward open-source projects that make tomorrow’s innovations possible. Our enterprise-grade solutions enable corporate, research, and academic institutions around the world to harness the power of open-source for competitive advantage, groundbreaking research, and a better world.