I'm proud to announce new performance goals for Sonar analysis.
Historically, when users talked about Sonar analysis performance, we could easily classify them into one of two groups:
- Challengers pushing our limits, reporting cases where they thought we should improve.
- Satisfied users, happy because they were used to SAST tools that ran for hours only to produce a lot of false positives.
But in neither case did we know how to respond, because when we started building our own analysis engine, we did so without clear performance goals in mind. And without knowing where we were headed, it was impossible to know if we'd gotten there yet. So if you told us the performance wasn't good enough, we didn't know whether you were right or wrong.
That's why we've finally defined our own performance goals for analysis - so that we're no longer subjecting ourselves to apples-to-oranges comparisons with tools that may not have the same goals or outcomes, or to overly subjective, personal assessments of how analysis "seems".
Now, we can clearly state what you can expect from analysis, and how long analysis of a project should take under standardized conditions.
So let's get into what the goals are, and where we stand today.
How long for a first analysis?
A first analysis should be understood as the analysis of all the files of a branch. This happens when you onboard a new project into SonarQube or SonarCloud, and again when you create a new branch. In this context, you should expect to see the overall status of your project in less than x minutes, where x depends on the size of your project:
Project Size | Expected Duration |
--- | --- |
≤ 1k LOC (XS) | ≤ 30s |
10k LOC (S) | ≤ 1 min |
100k LOC (M) | ≤ 5 min |
500k LOC (L) | ≤ 20 min |
1M LOC (XL) | ≤ 40 min |
From what we have measured on SonarCloud, we are on track for the M, L, and XL project sizes - 95% of these projects are analyzed within the targets. For XS and S, we are not on track yet, mainly because of the time it takes to start the analysis.
How long for a code change analysis?
A code change analysis happens:
- when you create a pull request and you want to validate the quality of the PR before merging it
- when you directly commit files to a branch (main or otherwise) without using a pull/merge request mechanism
In such a context, it’s natural to expect the analysis time to be proportional to the size of the changeset (the amount of added or updated code), rather than having to wait as long as for a first analysis.
Here, you should expect to see the updated Quality Gate of your project, branch, or PR in less than x minutes, where x depends on the size of the code change:
Code Change Size | Expected Duration |
--- | --- |
≤ 1k LOC (XS) | ≤ 30s |
10k LOC (S) | ≤ 1 min |
100k LOC (M) | ≤ 5 min |
What has been done so far toward these goals?
Definition: a project can contain multiple programming languages, but it’s convenient to speak about a given project as a Java, TypeScript, PHP, … project. We do this by naming the project after the language that accounts for the most lines of code (LOC) in the project.
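To illustrate that naming convention, here is a minimal sketch (not Sonar's actual implementation) that picks a project's dominant language from hypothetical per-language LOC counts:

```java
import java.util.Map;

public class ProjectLanguage {

    // Given per-language LOC counts, label the project after the language
    // with the most lines of code.
    static String dominantLanguage(Map<String, Integer> locByLanguage) {
        return locByLanguage.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse("unknown");
    }

    public static void main(String[] args) {
        // Hypothetical LOC counts; Sonar computes the real ones during analysis.
        Map<String, Integer> loc = Map.of("java", 120_000, "js", 15_000, "xml", 8_000);
        // Prints "java": it has the largest share of LOC, so we'd call this a Java project.
        System.out.println(dominantLanguage(loc));
    }
}
```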
For the first analysis durations
For Java projects, general performance has been improved, making Java analysis 30% faster on average in SonarQube 9.4 compared to SonarQube 9.3. A customer who tested this version said they were able to analyze a 1M LOC project in less than 18 minutes, putting us in a good position relative to our target (< 40 min).
For Kotlin projects, we improved performance by a factor of 10, which puts us within our performance targets.
For C/C++ projects, analyses are multithreaded by default starting from SonarQube 9.5. Previously this was opt-in; we no longer think that makes sense, so we turned it on by default. With this change, it’s easy to reach our targets by allocating more CPUs to your analyses.
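If you want to control that parallelism explicitly, the thread count can be passed as an analysis property. The snippet below is a minimal sketch of a sonar-project.properties file; the project key is made up, and you should check the C/C++ analysis documentation for your version to confirm the property name and its default:

```properties
# Illustrative sonar-project.properties (verify against the documentation for your version)
sonar.projectKey=my-cpp-project

# From SonarQube 9.5 on, C/C++ analysis is multithreaded by default;
# on earlier versions, or to pin a specific value, set the thread count explicitly:
sonar.cfamily.threads=8
```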
For code change analysis durations
For a lot of the languages covered by Sonar, we don’t need to gather knowledge from all files to produce accurate results. In such cases, only the changed files are analyzed in a pull request context. This is available starting from SonarQube 9.3, and on SonarCloud since the 3rd of May. Pull request analysis time is generally improved if the PR contains CSS, HTML, XML, Ruby, Scala, Go, Apex, CloudFormation, Terraform, Swift, PL/SQL, T-SQL, ABAP, VB6, Flex, or RPG code changes.
For pull requests containing a majority of Java code, there is an additional 8-25% gain compared to before, because, for the rules that don’t require project-level data, we now analyze only the changed files.
Overall it’s better, but we are not yet reaching our code change analysis duration targets.
What are the next steps?
As a first priority, we want to optimize the pull request analysis time of Java projects. We will do that by relying on a new cache mechanism that stores project-level data, which will let us keep a high level of accuracy in our results. Why Java first? Java was the first language supported by Sonar, and it has one of our biggest user communities. Additionally, Sonar’s developers themselves write a lot of Java, so we will be able to find problems easily before release.
Next, we will rely on the same cache system to optimize the code change analysis of branches.
When that is stable, we will extend it to languages such as JS/TS, PHP, Python, and COBOL.
How can you contribute?
If you are on SonarCloud or the latest version of SonarQube, we would love to get your feedback as soon as we announce improvements, to confirm our internal measurements showing that overall analysis duration has improved.