Software Security at Rocketship Pace
Overview
Helping our engineers build amazing products that are worthy of our customers’ trust is the job of our product security team. One of the tools in the product security team’s arsenal is our code scanning platform, which we lovingly call “Intersect”.
Effective code scanning is a core building block of a modern security program. Traditionally, these platforms have been confusing and slow to use, which has meant that many organisations have struggled to integrate them into a modern software development environment. In this post, we outline the approach we took when designing Intersect and the lessons we’ve learned on the journey.
We hope that by sharing our approach we can help other teams decide how best to apply code scanning in their program.
Philosophy
We started by outlining what we wanted to achieve with our code scanning program:
- Findings should lead to vulnerabilities actually getting fixed.
- Focus on new code first.
- Make it fast to create new signatures for security flaws and to eliminate those flaws from our entire codebase.
- Start small and iterate.
We also decided on some design goals:
- (No) False Positives - It’s better to miss some findings than to destroy trust in our tooling by flooding our engineers with useless noise.
- Actionable - If an engineer can’t quickly determine what is required, they will treat findings as noise.
- Native - We need to fit into the engineer’s existing workflow so we don’t burden them with additional work.
- Fast - If an engineer needs to merge a PR in 10 minutes, they should be able to do so with confidence.
- Flexible - We need to be able to adapt to the needs of our engineers quickly, for example, by adding support for new frameworks and languages.
Approach
Orchestration
One of the key things we identified was that no single tool on the market addressed all our needs in the SAST (Static Application Security Testing) & SCA (Software Composition Analysis) spaces. To achieve the coverage we were looking for, we decided to use multiple tools and build an orchestration layer to make them all work together.
This approach allows us to continue to improve the platform, for example by adding new tools, in the process adding value to all existing applications and repositories without any effort on their part. The abstraction layer also allows us to keep up with our users’ changing needs (e.g. language/framework support) in a way that is easy to scale and simple to maintain.
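To make the idea of an orchestration layer concrete, here is a minimal sketch in Python. It is not Intersect's actual implementation: the `Scanner` type, `make_run`, `merge_sarif`, and `orchestrate` are hypothetical names, and each scanner is assumed to be a function that takes a repository path and returns a SARIF log as a dictionary.

```python
from typing import Callable, Dict, List, Tuple

SARIF_VERSION = "2.1.0"

# Hypothetical scanner interface: repository path in, SARIF log (dict) out.
Scanner = Callable[[str], dict]


def make_run(tool_name: str, results: List[dict]) -> dict:
    """Build a single SARIF run for one tool."""
    return {"tool": {"driver": {"name": tool_name}}, "results": results}


def merge_sarif(logs: List[dict]) -> dict:
    """Combine the runs of several SARIF logs into one log."""
    merged_runs: List[dict] = []
    for log in logs:
        merged_runs.extend(log.get("runs", []))
    return {"version": SARIF_VERSION, "runs": merged_runs}


def orchestrate(scanners: Dict[str, Scanner], repo_path: str) -> Tuple[dict, List[str]]:
    """Run every scanner and return (merged SARIF log, names of failed scanners).

    A failing scanner is recorded rather than aborting the whole scan,
    so one broken tool does not hide the findings of the others.
    """
    logs: List[dict] = []
    failed: List[str] = []
    for name, scan in scanners.items():
        try:
            logs.append(scan(repo_path))
        except Exception:
            failed.append(name)
    return merge_sarif(logs), failed
```

With this shape, adding a new tool means registering one more entry in the `scanners` dictionary; the repositories being scanned do not need to change. Whether to upload one merged log or one log per tool is a deployment choice this sketch does not settle.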
A high-level diagram of our architecture can be found below:
GitHub Advanced Security
GitHub Advanced Security is the foundation and secret sauce behind our code scanning platform. This suite packages a number of tools including:
- CodeQL (SAST)
- Dependabot (SCA)
- Secrets Detection
- SARIF Based Code Scanning Results Management
- Code Scanning Result Management UI
The key feature for us was the platform’s ability to ingest SARIF output from other tools. SARIF (Static Analysis Results Interchange Format) is an open standard for static analysis results and is increasingly popular amongst open source and enterprise tooling. This feature gives us a single aggregation point for the scanning results from all our tools.
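As an illustration of what a tool has to produce, the sketch below builds a minimal SARIF 2.1.0 log with a single finding and prepares it for upload. GitHub's REST endpoint for code scanning uploads expects the SARIF payload to be gzip-compressed and then Base64-encoded, which is what `encode_for_upload` does. The function names, tool name, rule ID, and file path are all placeholders chosen for this example, and the upload request itself is left out.

```python
import base64
import gzip
import json


def minimal_sarif(tool_name: str, rule_id: str, message: str, path: str, line: int) -> dict:
    """Build a SARIF 2.1.0 log containing one result at path:line."""
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name, "rules": [{"id": rule_id}]}},
            "results": [{
                "ruleId": rule_id,
                "level": "error",
                "message": {"text": message},
                "locations": [{
                    "physicalLocation": {
                        "artifactLocation": {"uri": path},
                        "region": {"startLine": line},
                    }
                }],
            }],
        }],
    }


def encode_for_upload(sarif_log: dict) -> str:
    """Serialise, gzip, and Base64-encode a SARIF log for an upload request body."""
    raw = json.dumps(sarif_log).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    log = minimal_sarif("example-tool", "EX001", "Example finding", "src/app.py", 12)
    print(len(encode_for_upload(log)), "characters ready to upload")
```

Because every tool's findings end up in this one shape, the results UI does not need to know which scanner produced them.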
The platform also presents scanning results inline with the code in the GitHub pull request review UI. This feature was an absolute game changer, as findings can be presented with all the surrounding context an engineer would need to resolve the issue.
Take the example of a user trying to commit a vulnerable code block. Within the PR review UI they would likely be presented with something like this:
Presenting engineers with contextualised scan results within the GitHub UI, in near real time, has made it easier for them to triage and address security issues before those issues are merged to the default branch. Our experience has shown that an improved user experience leads to a higher rate of issues being resolved, as engineers know exactly what the problem is and are given actionable advice for remediation on the spot. Furthermore, these issues can come from any tool, not just CodeQL!
This feature provides an ideal aggregation layer that our orchestration solution can utilise to push all the results from different scans into a centralised, developer-facing front end.
CodeQL
CodeQL is the SAST tool included with GitHub Advanced Security. The CodeQL engine provides a powerful framework for querying the data flow in an application, allowing customised source-to-sink analysis rules to be created and applied across the organisation in a relatively straightforward manner.
CodeQL provides full path information as part of its SARIF output. This allows engineers to view the full source-to-sink flow that has been flagged by the scanner.
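In SARIF, path information of this kind is carried in a result's `codeFlows` property, where each code flow contains `threadFlows` and each thread flow contains an ordered list of `locations`. The sketch below walks that structure and returns each path as a list of (file, line) steps. The function name and the sample data are made up for illustration; this is not output captured from CodeQL.

```python
from typing import List, Tuple

Step = Tuple[str, int]  # (file URI, start line)


def source_to_sink_paths(result: dict) -> List[List[Step]]:
    """Return every code flow in a SARIF result as an ordered list of steps."""
    paths: List[List[Step]] = []
    for code_flow in result.get("codeFlows", []):
        for thread_flow in code_flow.get("threadFlows", []):
            steps: List[Step] = []
            for step in thread_flow.get("locations", []):
                physical = step.get("location", {}).get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
                line = physical.get("region", {}).get("startLine", 0)
                steps.append((uri, line))
            paths.append(steps)
    return paths


def _step(uri: str, line: int) -> dict:
    """Helper that builds one threadFlowLocation for the sample below."""
    return {"location": {"physicalLocation": {
        "artifactLocation": {"uri": uri}, "region": {"startLine": line}}}}


# Made-up result: user input read in handler.py flows into a query in db.py.
SAMPLE_RESULT = {
    "ruleId": "example/sql-injection",
    "codeFlows": [{"threadFlows": [{"locations": [
        _step("src/handler.py", 10),
        _step("src/service.py", 42),
        _step("src/db.py", 7),
    ]}]}],
}

if __name__ == "__main__":
    for path in source_to_sink_paths(SAMPLE_RESULT):
        print(" -> ".join(f"{uri}:{line}" for uri, line in path))
```

The first step of each path is the source and the last is the sink, which is the same ordering an engineer sees when stepping through a flagged flow.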
CodeQL strongly met our first three criteria for assessing tooling, and its scanning speed was strong compared to other tools in the space. (It should be noted that as you add additional rules, the scanner’s speed does decrease slightly.)
Snyk
Snyk is one of the leaders in the open source security space. Their tool provides a method to identify issues in software dependency trees. Additionally, the tooling provides a means of scanning Infrastructure as Code (IaC) and Kubernetes (K8S) configurations.
Like many technology companies, we support a diverse range of languages and build frameworks. All of these technologies have their own supply chain and associated risks that need consideration. When looking to address our issues in this space, Snyk stood out due to their support for analysis of more complex builds (Kotlin Gradle) and the rate at which they were expanding their tooling (our tooling changes fast, so it’s important that our tools grow with us).
Furthermore, Snyk natively supports SARIF as an output format for its scans. This means that issues about dependencies or even infrastructure misconfigurations can be integrated into the GitHub Advanced Security platform.
Snyk also has the ability to scan PRs, both via the GitHub scan management platform and with their own custom integration. In practice we found the Snyk default check worked great on simple builds (Yarn, Groovy Gradle, etc.), but for more complex builds (Kotlin Gradle, etc.) it was better to integrate the checks with the GitHub Advanced Security platform.
Snyk’s scanner can be configured to check for reachability (i.e. do I actually call the vulnerable function in the library?). This presents a powerful method of presenting only the issues that immediately need fixing. Note: In testing we observed that this feature significantly slowed down scans. It worked great on smaller codebases, but was very slow for large codebases.
Finally, Snyk supports automated PRs, which helps with keeping up with the never-ending flow of CVEs in mature libraries and frameworks.
Semgrep
Semgrep is a powerful open source SAST tool that can be used to quickly and easily identify patterns in your codebase that may lead to security issues or that don’t align with established coding standards.
The best part is that you’re in full control of defining these standards and can easily define a baseline that meets your requirements. Semgrep does not analyse data flow; rather, it acts as a “Semantic Aware Grep”. This tradeoff provides a very performant tool that can check a large number of patterns very quickly.
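To show what “semantic aware” buys you over plain text search, here is a small Python sketch that uses the standard library `ast` module to find calls that pass `shell=True`. It is only an analogy for the idea and not how Semgrep itself is implemented; Semgrep expresses patterns like this in its own rule syntax. The function name and sample code are invented for the example.

```python
import ast
from typing import List


def find_shell_true_calls(source: str) -> List[int]:
    """Return the line numbers of calls that pass the keyword argument shell=True."""
    hits: List[int] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# Sample code to scan. It is only parsed, never executed.
SAMPLE = '''import subprocess
subprocess.run(cmd,
               shell = True)
note = "shell=True is risky"
subprocess.run(["ls"], shell=False)
'''

if __name__ == "__main__":
    print(find_shell_true_calls(SAMPLE))  # prints [2]
```

A text search for `shell=True` would flag the harmless string on line 4 and miss the real call on lines 2 and 3, which is split across lines and has spaces around the equals sign. Matching on the syntax tree reports only the call, and because no data flow is tracked the check stays cheap.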
Furthermore, constructing queries for the tool is exceptionally easy due to the web UI available on their website for building and testing queries. These queries can then be added to your scan. This appealed to us as it made it quick and easy to write rules that push our engineers towards our internal best practices as well as identify bad patterns in our codebase.
Finally, Semgrep also provides native support for the SARIF output format, which makes it a breeze to integrate with GitHub.
Conclusion
Because of the design choices we made, we were able to get to over 50% coverage of all code commits at Afterpay in the first four months of Intersect’s life. In that time we’ve been able to prevent countless bugs, from Injection to Improper Logging, from ever reaching production. We have also been able to identify and patch hundreds of vulnerabilities in libraries that we depend on, often within days.
We believe the key behind our success was a clear understanding of our end users’ needs. By living our team’s core value of “empathy” we were able to make a solution that balanced managing security risk with the impact it had on the engineering experience.
By investing internal engineering resources into creating consistent abstractions, we expect to be able to maintain the same core platform, while continually adding more tooling and detections, for a much longer time than traditional integration methods would allow.
We hope that our experience building our code security tooling will help you on your security journey!
We should also mention that we are growing! If you are the type of person that likes to solve tricky security problems at scale, check out our open roles or let us know what you’re looking for out of your next security role.