Using QL snapshots for analysis of large open source projects

September 12, 2019

Semmle QL is a powerful technology that helps security researchers automate variant analysis. Thanks to QL, the Semmle Security research team has found many CVEs. Semmle QL goes beyond the capabilities of a traditional static analysis tool. To perform deep analysis with complex control flow and taint tracking, Semmle generates a detailed snapshot database that represents the hierarchical structure of the codebase.

For compiled languages, much of the information you care about is only available at compile time. Getting all of this information into the QL database therefore requires a successful build.

Since LGTM.com is free for the open source community, we are obliged to set a reasonable limit on build time to ensure that our resources are shared fairly across all projects. However, some particularly large open source projects have a build time that exceeds this limit. These projects are therefore not properly available on LGTM.com: they are either only partially analyzed or not analyzed at all.

As a result, the security research community is unable to perform variant analysis with QL on these projects.

These resource limits do not apply to LGTM Enterprise, our licensed code analysis platform. Typical customer deployments build much faster on infrastructure tailored and scaled according to their needs.

While we work to make these big projects available on LGTM.com, we have decided to regularly provide snapshots of them to the community. On this download page, you will find a list of large open source projects that have been built and packaged into snapshots. We will make an effort to update this page regularly with new revisions and new interesting projects.

If you want to explore one of these projects using QL, you just need to download QL for Eclipse and import your chosen snapshot. Instructions on how to use QL for Eclipse with these large snapshots are also available on the download page.

Once you have set up the environment for writing QL queries, you can run an existing QL query or write a new one. Try it now! Import the Apple XNU snapshot and run this query, which finds compound assignments that could cause non-termination:

/**
 * @name Infinite loop
 * @description Updating a loop index with a compound assignment
 *              could cause non-termination.
 * @kind problem
 * @problem.severity warning
 * @id apple-xnu/cpp/infinite-loop
 */

import cpp
import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

// Find loops like this:
// while (x) { ...; x -= n; }
from Loop loop, Variable v, AssignArithmeticOperation assign
where
  (
    loop.getCondition() = v.getAnAccess() or
    loop.getCondition().(ComparisonOperation).getAnOperand() = v.getAnAccess()
  ) and
  assign.getLValue() = v.getAnAccess() and
  // Compound assignment is in the body of the loop:
  assign = loop.getStmt().getAChild*() and
  lowerBound(assign.getRValue()) <= 0 and
  upperBound(assign.getRValue()) >= 0
select loop, "Loop might not terminate due to this $@.", assign, "assignment"
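
To make the target pattern concrete, here is a minimal C sketch of the kind of loop this query is looking for. It is an illustration only and is not taken from XNU: the function `drain` and its `max_iterations` safety cap are hypothetical, added so that the example can be run without hanging.

```c
#include <stdbool.h>

/*
 * Hypothetical example, not XNU code.
 * Mirrors the pattern `while (x) { ...; x -= n; }` from the query comment.
 * Returns true if the loop finished on its own, and false if it hit the
 * iteration cap (which stands in for "would never terminate").
 */
bool drain(int remaining, int step, int max_iterations)
{
    int iterations = 0;

    while (remaining > 0) {
        if (iterations++ >= max_iterations)
            return false;      /* without the cap, the loop would keep spinning */

        /* Compound assignment on the loop variable. If `step` can be 0,
         * `remaining` never changes and the condition stays true. */
        remaining -= step;
    }
    return true;
}
```

Calling `drain(10, 3, 100)` finishes after a few iterations, while `drain(10, 0, 100)` only stops because of the cap. The query reports loops where range analysis cannot rule out a zero right-hand side for the compound assignment.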

Or run this one on the Linux Kernel snapshot to find off-by-one errors:

/**
 * @name Off-by-one error
 * @description An off-by-one error could happen.
 * @kind problem
 * @problem.severity warning
 * @id linux/cpp/off-by-one-error
 */
 
import cpp
import semmle.code.cpp.controlflow.Guards
import semmle.code.cpp.dataflow.TaintTracking

from
  DataFlow::Node source, DataFlow::Node check, GTExpr guard, DataFlow::Node index, ArrayExpr array
where
  // The values coming from `source` are checked at `check`.
  DataFlow::localFlow(source, check) and
  // `check` is the "greater" operand of the `>` comparison `guard`.
  check.asExpr() = guard.getGreaterOperand() and
  // A value derived from `source` is used at `index`.
  TaintTracking::localTaint(source, index) and
  // `index` is the index in an array expression.
  index.asExpr() = array.getArrayOffset() and
  // `index` only executes if `guard` is false.
  guard.(GuardCondition).controls(index.asExpr().getBasicBlock(), false) and
  // Focus on vulnerable results: Only report if the `guard` comparison
  // establishes a lower bound which is too large for the size of the array.
  guard.getLesserOperand().getValue().toInt() >= array
        .getArrayBase()
        .getType()
        .getUnspecifiedType()
        .(ArrayType)
        .getArraySize()
select source, check, index, array.getArrayBase().getType().getUnspecifiedType()
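
As a rough illustration of what this query targets, the C sketch below shows a bounds check that uses `>` where `>=` is needed. It is not taken from the Linux kernel: `TABLE_SIZE`, `table`, and all function names are hypothetical, and whether the query flags this exact code depends on the snapshot and library version.

```c
#include <stdbool.h>

#define TABLE_SIZE 8

static int table[TABLE_SIZE];

/* Hypothetical buggy lookup: the guard `idx > TABLE_SIZE` is false for
 * idx == TABLE_SIZE, so the array access can read one element past the end. */
int lookup_buggy(unsigned idx)
{
    if (idx > TABLE_SIZE)      /* off by one: should be `>=` */
        return -1;
    return table[idx];
}

/* Corrected lookup: rejects every index that is not a valid element. */
int lookup_fixed(unsigned idx)
{
    if (idx >= TABLE_SIZE)
        return -1;
    return table[idx];
}

/* True if the buggy guard lets `idx` through to the array access. */
bool buggy_guard_admits(unsigned idx)
{
    return !(idx > TABLE_SIZE);
}

/* True if `idx` is a valid index into `table`. */
bool in_bounds(unsigned idx)
{
    return idx < TABLE_SIZE;
}
```

For `idx == TABLE_SIZE`, `buggy_guard_admits` returns true while `in_bounds` returns false, which is exactly the gap the query's final condition checks for: the constant in the `>` comparison is at least as large as the array size.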

If you're new to QL and would like to get started, consider trying our CTF Challenges.

We hope you will all join us in making open source software more secure!
