Widely-used code responsible for most open source vulnerabilities

A new report examining the security of programming languages has found that almost 50% of all the vulnerabilities discovered in open source projects since 2009 were coded in C.

The study by WhiteSource revealed that 46.9% of all open source vulnerabilities reported in the past 10 years were found in code written in C.

PHP was the second most insecure language, making up 16.7% of the vulnerabilities, and Java came in third place with 11.4%.

The list was completed by JavaScript, which accounted for 10.2% of open source project vulnerabilities; Python (5.45%); C++ (5.23%); and Ruby (4.25%).

The number of reported vulnerabilities in open source code has dramatically risen in the past 10 years, according to the report – by almost 2,000 a year.

Reasons for this include heightened security awareness in open source components, along with a more general increase in the popularity of open source projects.

The top vulnerabilities found in C were buffer errors and input validation, the report reads, and although the numbers have both risen and fallen since 2009, C remains the most insecure language.

In C’s defense, it should be noted that this is the oldest (and most widely used) programming language in the list.

While JavaScript was fourth on the list, it was the only language to see a consistent rise in reported vulnerabilities over the past 10 years. In 2018 alone, reported instances increased by more than 50%.

The report attributes this to JavaScript’s rising popularity, and the fact that it has been widely used in backend development in recent years.

Following the release of the report, The Daily Swig chatted to David Habusha, vice president of WhiteSource, for a deep dive into what the numbers mean.

What do you think causes the difference in vulnerabilities between languages? Is it due to the language design and features? Do some languages attract more or less experienced developers? Or are there other reasons?

David Habusha: There are several reasons for the difference in vulnerabilities between languages.

First, we need to consider that some, like C, have been around for longer than others, and huge volumes of code have been written in that time. More lines of code means more vulnerabilities.

Some languages have more active communities behind them that invest a lot of time in finding and fixing security issues. Some languages are easier to scan automatically for issues, while others require more manual work.

Some languages are a go-to for younger, less experienced developers, while some require more seasoned software engineers.

Many programmers use frameworks like Rails or Django to assist their coding. Do you think there would be a difference in security between frameworks?

DH: Popular frameworks such as Rails, Django, Spring, Struts, and others are a favorite target for hackers, due to their large userbase and their active community.

Many of these frameworks consist of dependent open source packages that are also targets for hackers.

We have analyzed the distribution of vulnerabilities per framework (for example Struts vs. Spring). The majority of these frameworks include internet-facing front end components with large attack surfaces that leave them very exposed, making them relatively easy to exploit and therefore attracting a lot of focus.

The frameworks differ from each other in the languages they support, the communities that maintain them, the automated tools, and their security vulnerabilities.

One thing that the high-profile projects have in common is that, since they are well maintained, more vulnerabilities are reported in them and more fixes are released.

C is quite different to the other languages in your study, providing programmers with both more power and an increased ability to go wrong. Why did you choose to include it in the comparison?

DH: C is behind so much of the software infrastructure that we use, powering projects like the Linux kernel, OpenSSL, and PHP, to name just a few. Added to that is how long it has been around and the considerably large open source community that actively maintains it.

Considering these factors, as well as the number of vulnerabilities it has racked up over the years, we couldn’t imagine creating a list of popular programming languages without including C.

Why do you think JavaScript saw the biggest increase in vulnerabilities?

DH: JavaScript’s popularity has been on a continuous climb for the past few years, and the language has recently become more popular among back end developers.

The most common CWEs that we found for JS, Cryptographic Issues (CWE-310) and Path Traversal (CWE-22), were anomalies and are unique to JS. We found that many of these vulnerabilities come from barely used, maintained, or downloaded projects.

It’s very likely that these types of CVEs had a big part in increasing the number of JS vulnerabilities. These types of vulnerabilities are relatively easy to find and they increase the total number of disclosed vulnerabilities.

Another factor contributing to the rise in the number of disclosed security vulnerabilities is the use of automated tools to discover vulnerabilities.

This doesn’t mean that JS has become less safe, rather that it has more eyeballs studying it.
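For readers unfamiliar with the Path Traversal class (CWE-22) mentioned above, the following is a minimal sketch of the usual defense, written in Python for brevity rather than JavaScript. It is not taken from the WhiteSource report; the function name `resolve_inside` and the example paths are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path against base_dir and reject anything that escapes it.

    Hypothetical helper for illustration; not from the report.
    """
    base = Path(base_dir).resolve()
    # Joining and resolving collapses "..", symlinks, and absolute inputs.
    candidate = (base / user_path).resolve()
    # relative_to raises ValueError when candidate is not under base.
    candidate.relative_to(base)
    return candidate
```

A request for `docs/readme.txt` resolves normally, while an input such as `../../etc/passwd` raises `ValueError` instead of reaching a file outside the base directory.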

Finally, the number of vulnerabilities per year, across all languages, seemed to spike in 2017 and went down last year. Is there an explanation for this?

DH: The spike in 2017 can be explained by the growing awareness of open source security vulnerabilities, leading to more resources invested in security research.

In addition, the increased funding in bug bounty programs, as well as the use of automated tools, has contributed to the rise in the number of disclosed open source vulnerabilities.