Contributors to the Scorecards project, an automated security tool that produces a “risk score” for open source projects, have accomplished a lot since our launch last fall. Today, in collaboration with the Open Source Security Foundation community, we are announcing Scorecards v2. We have added new security checks, scaled up the number of projects being scored, and made this data easily accessible for analysis.
With so much software today relying on open-source projects, consumers need an easy way to judge whether their dependencies are safe. Scorecards helps reduce the toil and manual effort required to continually evaluate changing packages when maintaining a project’s supply chain. Consumers can automatically assess the risks that dependencies introduce and use this data to make informed decisions about accepting these risks, evaluating alternative solutions, or working with the maintainers to make improvements.
Since last fall, Scorecards’ coverage has grown. We’ve added several new checks, using the Know, Prevent, Fix framework proposed by Google earlier this year to prioritize our additions:
Contributors with malicious intent or compromised accounts can introduce potential backdoors into code. Code reviews help mitigate such attacks. With the new Branch-Protection check, developers can verify that the project enforces mandatory code review from another developer before code is committed. Currently, this check can only be run by a repository admin due to GitHub API limitations. For a third-party repository, use the less informative Code-Review check instead.
Despite best efforts by developers and peer reviews, vulnerable code can enter source control and remain undetected. That’s why it’s important to enable continuous fuzzing and static code analysis to catch bugs early in the development lifecycle. We have added checks to detect if a project uses Fuzzing and SAST tools as part of its CI/CD system.
A common CI/CD solution used by GitHub projects is GitHub Actions. A danger with these action workflows is that they may handle untrusted user input. This means an attacker can craft a malicious pull request to gain access to the privileged GitHub token, and with it the ability to push malicious code to the repo without review. To mitigate this risk, Scorecards’ Token-Permissions prevention check now verifies that the GitHub workflows follow the principle of least privilege by making GitHub tokens read-only by default.
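To make the read-only-by-default idea concrete, here is a minimal sketch in Python of what such a test could look like. It is a simplified text scan and not the Scorecards implementation: the function names are hypothetical, it only inspects the top-level `permissions` key of a workflow file, and a real check would use a proper YAML parser and also look at job-level permissions.

```python
import re


def top_level_permissions(workflow_text: str):
    """Return the top-level `permissions` setting of a workflow, or None if absent.

    Handles the scalar form (`permissions: read-all`) and the simple block
    form (`permissions:` followed by indented `scope: level` lines).
    """
    lines = workflow_text.splitlines()
    for i, line in enumerate(lines):
        match = re.match(r"^permissions:\s*(.*)$", line)
        if not match:
            continue
        value = match.group(1).split("#")[0].strip()
        if value:
            return value  # scalar form, e.g. "read-all"
        scopes = {}
        for nested in lines[i + 1:]:
            if nested.strip() and not nested.startswith((" ", "\t")):
                break  # the next top-level key ends the block
            entry = re.match(r"^\s+([\w-]+):\s*([\w-]+)", nested)
            if entry:
                scopes[entry.group(1)] = entry.group(2)
        return scopes
    return None


def is_read_only_by_default(workflow_text: str) -> bool:
    """True if the workflow declares no write access at the top level."""
    perms = top_level_permissions(workflow_text)
    if perms is None:
        return False  # nothing declared, so the token may default to write
    if isinstance(perms, str):
        return perms in ("read-all", "{}")
    # Block form: require at least one scope, all of them read or none.
    return bool(perms) and all(level in ("read", "none") for level in perms.values())
```

A workflow containing `permissions: read-all` at the top level passes this sketch, while one that grants `pull-requests: write` or declares nothing at all does not.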
Any software is only as secure as its weakest dependency. This may sound obvious, but the first step to knowing our dependencies is simply to declare them… and have our dependencies declare them too. Once we have this provenance information, we can assess the risks of our software and mitigate those risks. Unfortunately, there are several widely-used anti-patterns that break this provenance principle. The first of these anti-patterns is checked-in binaries, because there’s no easy way to verify or check the contents of a binary in the project. Scorecards provides the Binary-Artifacts check for testing this.
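One way to picture a checked-in-binary detector is a scan for well-known file signatures. The sketch below is an assumption about how such a heuristic might work, not a description of how Binary-Artifacts is implemented; the signature table is deliberately short and the function names are hypothetical.

```python
from pathlib import Path

# A few well-known magic numbers; illustrative, not exhaustive.
MAGIC_PREFIXES = {
    b"\x7fELF": "ELF executable",
    b"MZ": "Windows PE executable",
    b"\xca\xfe\xba\xbe": "Java class or Mach-O fat binary",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit binary",
}


def classify_binary(header: bytes):
    """Return a label if the leading bytes match a known signature, else None."""
    for prefix, label in MAGIC_PREFIXES.items():
        if header.startswith(prefix):
            return label
    return None


def find_checked_in_binaries(repo_root: str):
    """List (relative path, label) for files that look like compiled binaries."""
    findings = []
    for path in sorted(Path(repo_root).rglob("*")):
        if not path.is_file() or ".git" in path.parts:
            continue  # skip directories and Git's own object store
        with path.open("rb") as handle:
            label = classify_binary(handle.read(8))
        if label:
            findings.append((str(path.relative_to(repo_root)), label))
    return findings
```

Calling `find_checked_in_binaries(".")` from the root of a checkout would return the suspicious files, which a reviewer could then replace with a declared, verifiable dependency.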
Another anti-pattern is the use of curl | bash in scripts, which dynamically pulls dependencies. Cryptographic hashes let us pin our dependencies to a known value: if this value ever changes, the build system will detect it and refuse to build. Pinning dependencies is useful everywhere we have dependencies: not just during compilation, but also in Dockerfiles, CI/CD workflows, etc. Scorecards checks for these anti-patterns with the Frozen-Deps check. This check is helpful for mitigating malicious dependency attacks such as the recent CodeCov attack.
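The pinning behavior described above can be sketched in a few lines of Python. This is only an illustration of the principle, assuming SHA-256 as the hash and a hypothetical `pinned` mapping from artifact name to expected digest; real build systems and package managers have their own pin formats.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(name: str, data: bytes, pinned: dict) -> None:
    """Raise if `data` does not match the hash pinned for `name`."""
    expected = pinned.get(name)
    if expected is None:
        raise LookupError(f"{name} has no pinned hash; refusing to use it")
    actual = sha256_hex(data)
    if actual != expected:
        # The content changed since it was pinned: stop the build.
        raise ValueError(f"{name}: expected sha256 {expected}, got {actual}")


# Usage: pin once, then verify on every fetch.
artifact = b"example dependency contents"
pins = {"example-dep.tar.gz": sha256_hex(artifact)}
verify_pinned("example-dep.tar.gz", artifact, pins)  # matches, so no exception
```

If the fetched bytes differ from what was pinned, even by one byte, `verify_pinned` raises and the build stops, which is the property that defeats a silently replaced download.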
Even with hash-pinning, hashes need to be updated once in a while when dependencies patch vulnerabilities. Tools like dependabot or renovatebot give us the opportunity to review and update the hashes. The Scorecards Automated-Dependency-Update check verifies that developers rely on such tools to update their dependencies.
It is important to know about vulnerabilities in a project before adopting it as a dependency. Scorecards can provide this information via the new Vulnerabilities check, without the need to subscribe to a vulnerability alert system.
Scaling the impact
To date, the Scorecards project has scaled up to evaluate security criteria for over 50,000 open source projects. In order to scale this project, we undertook a massive redesign of our architecture and used a PubSub model which achieved horizontal scalability and higher throughput. This fully automated tool periodically evaluates critical open source projects and exposes the Scorecards check information through a public BigQuery dataset which is refreshed weekly.
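The reason a publish/subscribe design scales horizontally is that producers only enqueue work and any number of identical workers can consume it. The sketch below is an in-process analogy using Python's standard library, not the actual Scorecards architecture: `score_repo` is a placeholder, and the queue stands in for a real messaging service.

```python
import queue
import threading


def score_repo(repo: str) -> int:
    """Placeholder for running the checks on one repository."""
    return len(repo) % 10  # stand-in value, not a real score


def run_workers(repos, num_workers: int = 4) -> dict:
    """Publish repositories to a queue and let workers score them in parallel."""
    tasks: queue.Queue = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            repo = tasks.get()
            if repo is None:  # sentinel: no more work for this worker
                tasks.task_done()
                return
            score = score_repo(repo)
            with lock:
                results[repo] = score
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for repo in repos:  # the "publisher" side
        tasks.put(repo)
    for _ in threads:  # one sentinel per worker
        tasks.put(None)
    tasks.join()
    for thread in threads:
        thread.join()
    return results
```

Raising `num_workers` increases throughput without changing the publisher, which is the same property that lets a distributed deployment add subscriber machines as the project list grows.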
To export the latest data on all analyzed projects, see instructions here.
How does the internet measure up?
Scorecards data for available projects is now included in the recently announced Google Open Source Insights project and also showcased in the OpenSSF Security Metrics project. The data on these sites shows that there are still important security gaps to fill, even in widely used packages like Kubernetes.
We also analyzed Scorecards data through Google Data Studio, one of our data analysis and visualization tools. The diagram below shows a breakdown of the checks that were run and the pass/fail outcome for the 50,000 repositories:
As we can see, a lot needs to be done to improve the security of these critical projects. A large number of these projects are not continuously fuzzed, do not define a security policy for reporting vulnerabilities, and do not pin dependencies, to name just a few common problems. We all need to come together as an industry to drive awareness of these widespread security risks, and to make improvements that will benefit everyone.
Scorecards in Action
Several large projects have adopted Scorecards and are keeping us updated on their experiences with it. Below are some examples of Scorecards in action:
Early on we talked about how the Envoy maintainers adopted Scorecards for their project and integrated it within their policy on introducing new dependencies. Since then, pull requests introducing new dependencies to Envoy must get approval from a dependency maintainer who uses Scorecards to evaluate the dependency against a set of criteria.
In addition, Envoy got right to work on improving its own security health metrics according to its own Scorecards evaluation, and is now pinning C++ dependencies and requiring pip hashes for Python dependencies. GitHub Actions are also pinned in the continuous integration flow.
Previously, Envoy had created a tool that outputs Scorecards data on its dependencies as a CSV that can be used to generate a table of results:
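A minimal sketch of that kind of CSV-to-table step is shown here in Python. It is not Envoy's tool: the column names and the sample row in the usage lines are made up for illustration, and the function assumes every row has the same number of cells.

```python
import csv
import io


def csv_to_table(csv_text: str) -> str:
    """Render CSV text as an aligned plain-text table with a header rule."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    # Column width is the longest cell in that column.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

    def fmt(row):
        return " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()

    lines = [fmt(rows[0]), "-+-".join("-" * w for w in widths)]
    lines += [fmt(row) for row in rows[1:]]
    return "\n".join(lines)


# Hypothetical input: one dependency per row, one check per column.
sample = "dependency,Code-Review,Fuzzing\nexample/libfoo,Pass,Fail\n"
print(csv_to_table(sample))
```

Keeping the data in CSV and rendering it separately means the same export can feed a review document, a dashboard, or a policy script.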
We improved the Scorecards project’s own score! For example, we are now pinning our own dependencies by hash (e.g. Docker dependencies, workflow dependencies) to prevent CodeCov-style attacks. We’ve also included a Security Policy based on this recommended template.
We look forward to continuing to grow the Scorecards community. The project now has contributions from 23 developers. Thank you to Azeem, Naveen, Laurent, Asra and Chris for their work building these new features and scaling Scorecards.
If you would like to join the fun, check out these good first timer issues.
If you would like us to help you run Scorecards on specific projects, please submit a GitHub pull request to add those projects here.
Last but not least, we have a lot of ideas and many more checks we’d like to add, but we want to hear from you. Tell us which checks you would like to see in the next version of Scorecards.
There are a couple of big enhancements we’re especially excited about:
Thanks again to the entire Scorecards community and the OpenSSF for making this project successful. If you’re adopting and improving the score of the projects you maintain, tell us about it. Until next time, keep on improving those scores!