Semgrep scanner reference for STO

You can ingest scan results from Semgrep, an open-source static analysis engine for detecting dependency vulnerabilities and other issues in your code repositories.

The following tutorials include detailed examples of how to run a Semgrep scan in a Run step and ingest the results:

Important notes for running Semgrep scans in STO

Root access requirements

If you want to add trusted certificates to your scan images at runtime, you need to run the scan step with root access.

You can set up your STO scan images and pipelines to run scans as non-root and establish trust for your own proxies using custom certificates. For more information, go to Configure STO to Download Images from a Private Registry.

For more information

The following topics contain useful information for setting up scanner integrations in STO:

Semgrep step configuration

The recommended workflow is to add a Semgrep step to a Security Tests or CI Build stage and then configure it as described below.

Scan

Scan Mode

Ingestion: Configure the step to read scan results from a data file and then ingest, normalize, and deduplicate the data.

Scan Configuration

The predefined configuration to use for the scan. All scan steps have at least one configuration.

Target

Type

Repository: Scan a codebase repo.

In most cases, you specify the codebase using a code repo connector that connects to the Git account or repository where your code is stored. For information, go to Configure codebase.

Target and variant detection

When auto-detect is enabled for code repositories, the step detects these values using git:

To detect the target, the step runs git config --get remote.origin.url.
To detect the variant, the step runs git rev-parse --abbrev-ref HEAD. The default assumption is that the HEAD branch is the one you want to scan.

Note the following:

Auto-detection is not available when the Scan Mode is Ingestion.
Auto-detect is the default selection for new pipelines. Manual is the default for old pipelines, but you might find that neither radio button is selected in the UI.
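
To preview the values that auto-detect would derive, you can run the same two Git commands yourself. The following is a minimal sketch of a Run step, modeled on the Run step in the pipeline example in this topic. The step name, identifier, and image placeholder are assumptions for illustration, and the image you choose must include git.

```yaml
- step:
    type: Run
    name: Preview_target_and_variant        # hypothetical step name
    identifier: Preview_target_and_variant  # hypothetical identifier
    spec:
      shell: Sh
      connectorRef: YOUR_CONTAINER_IMAGE_REGISTRY_CONNECTOR_ID
      image: YOUR_IMAGE_WITH_GIT            # placeholder: any image that includes git
      command: |
        # Value that auto-detect would use for the target
        git config --get remote.origin.url
        # Value that auto-detect would use for the variant
        git rev-parse --abbrev-ref HEAD
```

Because auto-detection is not available in Ingestion mode, output like this can help you choose the Name and Variant values to enter manually.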

Name

The identifier for the target, such as codebaseAlpha or jsmith/myalphaservice. Descriptive target names make it much easier to navigate your scan data in the STO UI.

It is good practice to specify a baseline for every target.

Variant

The identifier for the specific variant to scan. This is usually the branch name, image tag, or product version. Harness maintains a historical trend for each variant.
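
As an illustration, the target settings might look like this in the step YAML. The key names follow the pipeline example in this topic; the values are assumptions chosen for illustration.

```yaml
target:
  type: repository
  name: jsmith/myalphaservice   # descriptive target name (sample name from this topic)
  variant: main                 # assumption: the branch being scanned
```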

Ingestion File

The path to your scan results when running an Ingestion scan, for example /shared/scan_results/myscan.latest.sarif.

The data file must be in a supported format for the scanner.
The data file must be accessible to the scan step. It's good practice to save your results files to a shared path in your stage. In the visual editor, go to the stage where you're running the scan. Then go to Overview > Shared Paths. You can also add the path to the YAML stage definition like this:
```
    - stage:
        spec:
          sharedPaths:
            - /shared/scan_results
```
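
With the shared path in place, a Semgrep step in Ingestion mode could point at a results file under that path. This is a minimal sketch based on the step shown in the pipeline example in this topic; the file path is the sample path used above, and the target settings are omitted for brevity.

```yaml
- step:
    type: Semgrep
    name: Semgrep_1
    identifier: Semgrep_1
    spec:
      mode: ingestion          # read results from a data file
      config: default
      # target settings omitted for brevity
      ingestion:
        file: /shared/scan_results/myscan.latest.sarif
```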

Log Level

The minimum severity of the messages you want to include in your scan logs. You can specify one of the following:

DEBUG
INFO
WARNING
ERROR

Fail on Severity

Every Security step has a Fail on Severity setting. If the scan finds any vulnerability with the specified severity level or higher, the pipeline fails automatically. You can specify one of the following:

CRITICAL
HIGH
MEDIUM
LOW
INFO
NONE — Do not fail on severity
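
In the step YAML, this setting would typically sit alongside the log level. The sketch below assumes the key is named `fail_on_severity` and takes a lowercase value, by analogy with the `log.level` setting in the pipeline example in this topic; the key name does not appear in that example, so verify it in the YAML editor before relying on it.

```yaml
advanced:
  log:
    level: info
  fail_on_severity: high   # assumption: key name and value casing are not confirmed by this topic
```

With a threshold of HIGH, any HIGH or CRITICAL finding would fail the pipeline, while MEDIUM, LOW, and INFO findings would not.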

Additional Configuration

In the Additional Configuration settings, you can use the following options:

Advanced settings

In the Advanced settings, you can use the following options:

YAML pipeline example

The following pipeline example illustrates an ingestion workflow. It consists of two steps:

A Run step that uses a Semgrep container to scan the codebase defined for the pipeline and then publish the results to a SARIF data file.
A Semgrep step that ingests the SARIF data.

```
pipeline:
  projectIdentifier: STO
  orgIdentifier: default
  tags: {}
  stages:
    - stage:
        name: semgrep-ingest
        identifier: semgrepingest
        type: CI
        spec:
          cloneCodebase: true
          execution:
            steps:
              - step:
                  type: Run
                  name: Run_1
                  identifier: Run_1
                  spec:
                    shell: Sh
                    command: semgrep --sarif --config auto -o /harness/results.sarif /harness
                    envVariables:
                      SEMGREP_APP_TOKEN: <+secrets.getValue("semgrepkey")>
                    connectorRef: YOUR_CONTAINER_IMAGE_REGISTRY_CONNECTOR_ID
                    image: returntocorp/semgrep
                    resources:
                      limits:
                        memory: 4096M
              - step:
                  type: Semgrep
                  name: Semgrep_1
                  identifier: Semgrep_1
                  spec:
                    mode: ingestion
                    config: default
                    target:
                      name: test
                      type: repository
                      variant: test
                    advanced:
                      log:
                        level: info
                    ingestion:
                      file: /harness/results.sarif
          infrastructure:
            type: KubernetesDirect
            spec:
              connectorRef: YOUR_KUBERNETES_CLUSTER_CONNECTOR_ID
              namespace: YOUR_NAMESPACE
              automountServiceAccountToken: true
              nodeSelector: {}
              os: Linux
  identifier: smpsemgrep
  name: smp-semgrep
  properties:
    ci:
      codebase:
        connectorRef: YOUR_CODE_REPO_CONNECTOR_ID
        build: <+input>
```
