Commit 2579e708 authored by Pierre Smeyers

feat: support SBOM metadata expressions in project path

parent dfe6aea3
+1 −0
@@ -204,3 +204,4 @@ pip-selfcheck.json
.work/
reports/
.DS_Store
.bin/
+49 −26
@@ -16,13 +16,12 @@ sbom-scanner --help
## Usage

```bash
usage: sbom-scanner [-h] [-u BASE_API_URL] [-k API_KEY] [-p PROJECT_PATH] [-i] [sbom_patterns ...]
usage: sbom-scanner [-h] [-u BASE_API_URL] [-k API_KEY] [-p PROJECT_PATH] [-s PATH_SEPARATOR] [-i] [sbom_patterns ...]

This tool scans for SBOM files and publishes them to a Dependency Track server.

positional arguments:
  sbom_patterns         SBOM file patterns to publish (supports glob patterns)
                        Default: '**/*.cyclonedx.json **/*.cyclonedx.xml'
  sbom_patterns         SBOM file patterns to publish (supports glob patterns). Default: '**/*.cyclonedx.json **/*.cyclonedx.xml'

options:
  -h, --help            show this help message and exit
@@ -32,6 +31,8 @@ options:
                        Dependency Track API key
  -p PROJECT_PATH, --project-path PROJECT_PATH
                        Dependency Track target project path to publish SBOM files to (see doc)
  -s PATH_SEPARATOR, --path-separator PATH_SEPARATOR
                        Separator to use in project path (default: '/')
  -i, --insecure        Skip SSL verification
```

@@ -44,44 +45,66 @@ If none is specified, the program will look for SBOM files matching `**/*.cyclon
### Options

| CLI option                | Env. Variable            | Description                                                                   |
| ----------------------- | ------------------------ | ----------------------------------------------------------------------------- |
| ------------------------- | ------------------------ | ----------------------------------------------------------------------------- |
| `-u` / `--base-api-url`   | `$DEPTRACK_BASE_API_URL` | Dependency Track server base API url (includes `/api`) (**mandatory**)        |
| `-k` / `--api-key`        | `$DEPTRACK_API_KEY`      | Dependency Track API key (**mandatory**)                                      |
| `-p` / `--project-path`   | `$DEPTRACK_PROJECT_PATH` | Dependency Track target project path to publish SBOM files to (**mandatory**) |
| `-s` / `--path-separator` | `$DEPTRACK_PATH_SEPARATOR` | Separator to use in project path (default `/`)                              |
| `-i` / `--insecure`       | `$DEPTRACK_INSECURE`     | Skip SSL verification                                                         |

## API Key permissions

* In order to be able to publish SBOM files to the Dependency Track server, the `BOM_UPLOAD` permission is mandatory.
* The extra `PROJECT_CREATION_UPLOAD` permission is required if you want to automatically create the project while uploading the SBOM files if the project does not exist (but the parent project must exist).
* The extra `VIEW_PORTFOLIO` and `PORTFOLIO_MANAGEMENT` permission are required if you want to automatically create one or several project ancestors prior to uploading the SBOM files.
- In order to be able to publish SBOM files to the Dependency Track server, the `BOM_UPLOAD` permission is **mandatory**.
- The extra `PROJECT_CREATION_UPLOAD` permission is required if you want to automatically create the project while uploading the SBOM files if the project does not exist (but the parent project must exist).
- The extra `VIEW_PORTFOLIO` and `PORTFOLIO_MANAGEMENT` permissions are required if you want to automatically create one or several project ancestors prior to uploading the SBOM files.<br/>
  Granting those permissions is not recommended in the general case as they virtually give administration rights to the API Key owner.
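
As a rough illustration of the permission rules above, the sketch below maps a set of permission names to what the tool would be able to do. The function name `describe_capabilities` and the returned keys are hypothetical and not part of `sbom-scanner`; the mapping is only one reading of the bullets above.

```python
def describe_capabilities(permissions: set[str]) -> dict[str, bool]:
    """Summarize what an API key can do, based on the documented rules.

    Hypothetical helper: it only restates the README bullets as code.
    """
    # BOM_UPLOAD is mandatory for anything else to be useful
    can_upload = "BOM_UPLOAD" in permissions
    return {
        "upload": can_upload,
        # creating the target project itself (its parent must already exist)
        "create_project": can_upload and "PROJECT_CREATION_UPLOAD" in permissions,
        # creating one or several missing ancestors before uploading
        "create_ancestors": can_upload
        and {"VIEW_PORTFOLIO", "PORTFOLIO_MANAGEMENT"} <= permissions,
    }


print(describe_capabilities({"BOM_UPLOAD", "PROJECT_CREATION_UPLOAD"}))
```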

## Project Path explained
## Project Path

Whenever an SBOM file is found, `sbom-scanner` uploads it to the Dependency Track server under a certain project.
The target project is determined by evaluating the _project path_ CLI option.
The target project is determined by evaluating the `--project-path` CLI option (or `$DEPTRACK_PROJECT_PATH` variable).

It is a slash (`/`) separated string, each part being one of the following forms:
The project path is a sequence of elements separated by forward slashes `/` (although the separator is also configurable with the `--path-separator` CLI option).
Each element is expected to be one of the following:

* `#11111111-2222-3333-4444-5555555555` (starts with a `#`): the part is a project UUID
* `project-name@version`: the part designates a project name and version
* `project-name`: the part designates a project name only (empty version)
1. `#11111111-1111-1111-1111-111111111111`: a project [Universally Unique Identifier (UUID)](https://en.wikipedia.org/wiki/Universally_unique_identifier) (starting with a hash `#`)
2. `project-name@version`: a **project name** and a **version** (separated with a `@`)
3. `project-name`: a **project name** only (empty version)

Lastly, the project path supports the `%{file_prefix}` pattern, that will be dynamically replaced with the SBOM filename prefix (before the first dot).
Ex: when processing the SBOM file `reports/docker-sbom.cyclonedx.json`, the `%{file_prefix}` will be equal to `docker-sbom`.
Here is the project path regular grammar:

Project paths examples:
```
<path> -> <element> '/' <path> | <element>
<element> -> <name> '@' <version> | <name> | '#' <UUID>
<name> -> [a-zA-Z0-9_.-]+
<version> -> [a-zA-Z0-9_.-]*
<UUID> -> [a-fA-F0-9]{8} '-' [a-fA-F0-9]{4} '-' [a-fA-F0-9]{4} '-' [a-fA-F0-9]{4} '-' [a-fA-F0-9]{12}
```
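
The grammar above can be exercised with a small parser. This is a minimal sketch, assuming the default `/` separator and a strict reading of the grammar; `PathElement`, `parse_element` and `parse_path` are hypothetical names and not the tool's own classes.

```python
import re
from typing import NamedTuple, Optional

# Character classes taken from the grammar (hyphen placed last in the class)
UUID_RE = re.compile(
    r"[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}"
)
NAME_RE = re.compile(r"[a-zA-Z0-9_.-]+")
VERSION_RE = re.compile(r"[a-zA-Z0-9_.-]*")


class PathElement(NamedTuple):
    """One element of a project path: either a UUID or a name/version pair."""

    uuid: Optional[str] = None
    name: Optional[str] = None
    version: str = ""


def parse_element(element: str) -> PathElement:
    if element.startswith("#"):
        # '#' <UUID>
        if not UUID_RE.fullmatch(element[1:]):
            raise ValueError(f"invalid UUID element: {element!r}")
        return PathElement(uuid=element[1:])
    # <name> '@' <version> | <name> (version defaults to the empty string)
    name, _, version = element.partition("@")
    if not NAME_RE.fullmatch(name) or not VERSION_RE.fullmatch(version):
        raise ValueError(f"invalid name/version element: {element!r}")
    return PathElement(name=name, version=version)


def parse_path(path: str, separator: str = "/") -> list[PathElement]:
    return [parse_element(part) for part in path.split(separator)]


for item in parse_path("#550e8400-e29b-41d4-a716-446655440000/my-project@v1.1.0"):
    print(item)
```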

Lastly, the project path supports some **expressions**, which are replaced dynamically when the path is evaluated:

* `#11111111-2222-3333-4444-5555555555`: every SBOM found will be published to the project with UUID `11111111-2222-3333-4444-5555555555`<br/>
| Expression       | Description                                                                                                                                                |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `{file_prefix}`  | SBOM filename prefix (before the first dot).<br/>Ex: when processing the file `reports/docker-sbom.cyclonedx.json`, `{file_prefix}` will be `docker-sbom`. |
| `{sbom_name}`    | `Metadata > Component > Name` info extracted from the SBOM file (json or xml)                                                                              |
| `{sbom_version}` | `Metadata > Component > Version` info extracted from the SBOM file (json or xml)                                                                           |
| `{sbom_type}`    | `Metadata > Component > Type` info extracted from the SBOM file (json or xml)                                                                              |
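
To make the substitution concrete, here is a minimal sketch of how these expressions could be expanded for a CycloneDX JSON file with `str.format`. The function name `expand_project_path` is hypothetical, only the JSON case is handled, and the fallback values (`unk` for a missing name or type, an empty string for a missing version) mirror the ones used in this commit's `publish` method.

```python
import json
from pathlib import Path


def expand_project_path(template: str, sbom_file: Path, sbom_content: str) -> str:
    """Expand {file_prefix}, {sbom_name}, {sbom_version} and {sbom_type}."""
    # Metadata > Component section of the SBOM (may be absent)
    component = json.loads(sbom_content).get("metadata", {}).get("component", {})
    return template.format(
        # filename part before the first dot
        file_prefix=sbom_file.name.split(".")[0],
        sbom_name=component.get("name") or "unk",
        sbom_version=component.get("version") or "",
        sbom_type=component.get("type") or "unk",
    )


sbom = json.dumps(
    {"metadata": {"component": {"type": "container", "name": "acme-api", "version": "1.3.0"}}}
)
print(
    expand_project_path(
        "acme/{sbom_name}-{file_prefix}@{sbom_version}",
        Path("reports/docker-sbom.cyclonedx.json"),
        sbom,
    )
)
# acme/acme-api-docker-sbom@1.3.0
```

Because the template goes through `str.format`, a literal brace in a project name would have to be doubled (`{{` or `}}`).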

Project path examples:

- `#550e8400-e29b-41d4-a716-446655440000`: every SBOM found will be published to the project with UUID `550e8400-e29b-41d4-a716-446655440000`<br/>
  :information_source: as Dependency Track is only able to store one SBOM per project, this configuration is suitable only if exactly one SBOM file is found (otherwise each one will overwrite the previous one)
* `#my-project@v1.1.0`: every SBOM found will be published to the project with name `my-project` and version `v1.1.0`<br/>
- `my-project@v1.1.0`: every SBOM found will be published to the project with name `my-project` and version `v1.1.0`<br/>
  :information_source: depending on your API key permissions, `sbom-scanner` might try to automatically create the project if it doesn't exist<br/>
  :information_source: as in the previous example, this configuration is suitable only if exactly one SBOM file is found
* `#11111111-2222-3333-4444-5555555555/my-project-%{file_prefix}`: every SBOM found will be published to a project named `my-project-%{file_prefix}`, direct child of project with UUID `11111111-2222-3333-4444-5555555555`<br/>
- `#550e8400-e29b-41d4-a716-446655440000/my-project-{file_prefix}@{sbom_version}`: every SBOM found will be published to a project named `my-project-{file_prefix}` and version `{sbom_version}` (extracted from the SBOM file), 
   direct child of project with UUID `550e8400-e29b-41d4-a716-446655440000`<br/>
  :information_source: depending on your API key permissions, `sbom-scanner` might try to automatically create the project if it doesn't exist
* `acme-program@v2/acme-services@v1.3/acme-user-api@v1.3/acme-user-api-%{file_prefix}`: complete project path only defined by project names and versions<br/>
- `acme-program@v2/acme-services@v1.3/acme-user-api@v1.3/acme-user-api-{file_prefix}`: complete project path only defined by project names and versions<br/>
  :information_source: depending on your API key permissions, `sbom-scanner` might try to automatically create the project and its ancestors if they don't exist

> :bulb: you may decide to override the path separator (e.g. with a double slash `//`) if you want your project names to contain slashes.
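
The effect of a custom separator can be seen with plain string splitting. This is a small sketch, assuming the path is split on the separator as done in this commit; `split_project_path` and the example path are hypothetical.

```python
def split_project_path(project_path: str, separator: str = "/") -> list[str]:
    """Split a project path into its elements using the given separator."""
    return project_path.split(separator)


# With '//' as separator, a single slash stays part of the project name
parts = split_project_path("acme//team/a@v1//api-{file_prefix}", "//")
print(parts)
# ['acme', 'team/a@v1', 'api-{file_prefix}']

# The parent path is rebuilt with the same separator
print("//".join(parts[:-1]))
# acme//team/a@v1
```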

## Developers

`sbom-scanner` is implemented in Python and relies on [Poetry](https://python-poetry.org/) for its packaging and dependency management.
@@ -94,6 +117,6 @@ poetry install
poetry run sbom-scanner \
  --base-api-url http://localhost:8080/api \
  --api-key "$DT_API_KEY" \
  --project-path "my-group/my-project/sub-%{file_prefix}" \
  --project-path "my-group/my-project/sub-{file_prefix}" \
  **/*.cyclonedx.json
```
+153 −41
@@ -9,6 +9,7 @@ from functools import cache
from logging import Logger
from pathlib import Path
from typing import Optional
from xml.etree import ElementTree as ET

import requests

@@ -43,6 +44,8 @@ INSECURE_SSL_CTX = ssl.create_default_context()
INSECURE_SSL_CTX.check_hostname = False
INSECURE_SSL_CTX.verify_mode = ssl.CERT_NONE

MIME_APPLICATION_JSON = "application/json"


class DtPermission(str, Enum):
    """Dependency Track permissions.
@@ -112,12 +115,14 @@ class Scanner:
        base_api_url: str,
        api_key: str,
        project_path: str,
        insecure: bool,
        path_separator: str = "/",
        verify_ssl: bool = True,
    ):
        self.base_api_url = base_api_url
        self.api_key = api_key
        self.project_path = project_path
        self.insecure = insecure
        self.path_separator = path_separator
        self.verify_ssl = verify_ssl

        self.sbom_count = 0

@@ -127,7 +132,8 @@ class Scanner:
            permission["name"]
            for permission in requests.get(
                f"{self.base_api_url}/v1/team/self",
                headers={"X-API-Key": self.api_key, "accept": "application/json"},
                headers={"X-API-Key": self.api_key, "accept": MIME_APPLICATION_JSON},
                verify=self.verify_ssl,
            ).json()["permissions"]
        ]

@@ -138,7 +144,7 @@ class Scanner:
    # returns the tail project UUID
    @cache
    def get_or_create_project(self, project_path: str, classifier="application") -> str:
        project_path_parts = project_path.split("/")
        project_path_parts = project_path.split(self.path_separator)
        project_def = DtProjectDef(project_path_parts[-1])
        if project_def.is_uuid:
            print(
@@ -149,25 +155,83 @@ class Scanner:
        # project is defined by name/version...
        resp = requests.get(
            f"{self.base_api_url}/v1/project",
            headers={"X-API-Key": self.api_key, "accept": "application/json"},
            headers={"X-API-Key": self.api_key, "accept": MIME_APPLICATION_JSON},
            params={"name": project_def.name},
            verify=self.verify_ssl,
        )
        resp.raise_for_status()
        matching_prj = next(
        # find project with matching name/version
        project_versions: list[dict] = resp.json()
        exact_match = next(
            filter(
                lambda prj: prj["name"] == project_def.name
                and prj.get("version") == project_def.version,
                resp.json(),
                project_versions,
            ),
            None,
        )
        if matching_prj:
        if exact_match:
            # project already exists: replace name with found UUID
            print(
                f"- {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} found (by name/version): {matching_prj['uuid']}..."
                f"- {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} found (by name/version): {exact_match['uuid']}..."
            )
            return exact_match["uuid"]
        # if project exists but not the version, we have to CLONE it
        name_match = next(
            filter(
                lambda prj: prj["name"] == project_def.name,
                project_versions,
            ),
            None,
        )
            return matching_prj["uuid"]
        # TODO: if project exists but not the version, we have to CLONE it
        if name_match:
            print(
                f"- {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} found sibling (version: {name_match.get('version')}): {name_match['uuid']}..."
            )
            # now create a clone of the project
            resp = requests.put(
                f"{self.base_api_url}/v1/project/clone",
                headers={
                    "X-API-Key": self.api_key,
                    "accept": MIME_APPLICATION_JSON,
                    "content-type": MIME_APPLICATION_JSON,
                },
                json={
                    "project": name_match["uuid"],
                    "version": project_def.version,
                    "includeTags": True,
                    "includeProperties": True,
                    "includeComponents": True,
                    "includeServices": True,
                    "includeAuditHistory": True,
                    "includeACL": True,
                },
                verify=self.verify_ssl,
            )
            try:
                resp.raise_for_status()
                # TODO: clone doesn't return UUID :(
                resp = requests.get(
                    f"{self.base_api_url}/v1/project/lookup",
                    headers={
                        "X-API-Key": self.api_key,
                        "accept": MIME_APPLICATION_JSON,
                    },
                    params={"name": project_def.name, "version": project_def.version},
                    verify=self.verify_ssl,
                )
                resp.raise_for_status()
                # retrieve UUID from response and return
                created_uuid = resp.json()["uuid"]
                print(
                    f"- {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} {AnsiColors.HGREEN}successfully{AnsiColors.RESET} cloned (from sibling): {created_uuid}"
                )
                return created_uuid
            except requests.exceptions.HTTPError as he:
                print(
                    f"- create {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} {AnsiColors.HRED}failed{AnsiColors.RESET} (err {he.response.status_code}): {AnsiColors.HGRAY}{he.response.text}{AnsiColors.RESET}",
                )
                raise

        # project does not exist: create it
        data = {
@@ -183,7 +247,9 @@ class Scanner:
            parent_def = DtProjectDef(project_path_parts[-2])
            if not parent_def.is_uuid:
                # create parent project
                parent_uuid = self.get_or_create_project("/".join(project_path_parts[:-1]))
                parent_uuid = self.get_or_create_project(
                    self.path_separator.join(project_path_parts[:-1])
                )
                # now parent def must be a UUID
                parent_def = DtProjectDef("#" + parent_uuid)
            # add parent UUID to params
@@ -196,47 +262,77 @@ class Scanner:
            f"{self.base_api_url}/v1/project",
            headers={
                "X-API-Key": self.api_key,
                "accept": "application/json",
                "content-type": "application/json",
                "accept": MIME_APPLICATION_JSON,
                "content-type": MIME_APPLICATION_JSON,
            },
            json=data,
            verify=self.verify_ssl,
        )
        try:
            resp.raise_for_status()
        except requests.exceptions.HTTPError as he:
            print(
                f"- create {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} {AnsiColors.HRED}failed{AnsiColors.RESET} (err {he.response.status_code}): {AnsiColors.HGRAY}{he.response.text}{AnsiColors.RESET}",
            )
            raise
            # retrieve UUID from response and return
            created_uuid = resp.json()["uuid"]
            print(
                f"- {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} {AnsiColors.HGREEN}successfully{AnsiColors.RESET} created: {created_uuid}"
            )
            return created_uuid
        except requests.exceptions.HTTPError as he:
            print(
                f"- create {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET} {AnsiColors.HRED}failed{AnsiColors.RESET} (err {he.response.status_code}): {AnsiColors.HGRAY}{he.response.text}{AnsiColors.RESET}",
            )
            raise

    def publish(self, sbom_file: Path):
        print(
            f"{AnsiColors.BOLD}📄 SBOM: {AnsiColors.BLUE}{sbom_file}{AnsiColors.RESET}"
        )
        # compute the target project path
        sbom_prefix = sbom_file.name.split(".")[0]
        # sbom_extension = sbom_file.name.split(".")[-1]
        project_path = self.project_path.replace("%{file_prefix}", sbom_prefix)
        print(
            f"- target project: {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET}..."
        )
        # load the SBOM content
        with open(sbom_file, "r") as reader:
            sbom_content = reader.read()

        sbom_extension = sbom_file.name.split(".")[-1]
        if sbom_extension == "json":
            sbom_json = json.loads(sbom_content)
            # normalize SBOM (shorten IDs, ...)
            # TODO
            sbom_md_cmp = sbom_json.get("metadata", {}).get("component", {})
            sbom_type = sbom_md_cmp.get("type")
            sbom_name = sbom_md_cmp.get("name")
            sbom_version = sbom_md_cmp.get("version")
        elif sbom_extension == "xml":
            sbom_xml = ET.fromstring(sbom_content)
            # normalize SBOM (shorten IDs, ...)
            # TODO
        self.do_publish(sbom_content, project_path)
            sbom_md_cmp = sbom_xml.find("{*}metadata/{*}component")
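            # compare with None explicitly: an Element without children is falsy;
            # findtext() returns None instead of raising when the child is missing
            sbom_type = sbom_md_cmp.get("type") if sbom_md_cmp is not None else None
            sbom_name = sbom_md_cmp.findtext("{*}name") if sbom_md_cmp is not None else None
            sbom_version = sbom_md_cmp.findtext("{*}version") if sbom_md_cmp is not None else None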
        else:
            raise ValueError(f"unsupported SBOM extension: {sbom_extension}")

        file_prefix = sbom_file.name.split(".")[0]
        print(
            f"- file_prefix: {AnsiColors.HGRAY}{file_prefix}{AnsiColors.RESET}; sbom_type: {AnsiColors.HGRAY}{sbom_type}{AnsiColors.RESET}; sbom_name: {AnsiColors.HGRAY}{sbom_name}{AnsiColors.RESET}; sbom_version: {AnsiColors.HGRAY}{sbom_version}{AnsiColors.RESET}"
        )

        # compute the target project path
        project_path = str.format(
            self.project_path,
            file_prefix=file_prefix,
            sbom_type=sbom_type or "unk",
            sbom_name=sbom_name or "unk",
            sbom_version=sbom_version or "",
        )
        print(
            f"- target project: {AnsiColors.YELLOW}{project_path}{AnsiColors.RESET}"
        )

        self.do_publish(sbom_content, project_path, sbom_type)
        self.sbom_count += 1

    def do_publish(self, sbom_content: str, project_path: str, allow_retry=True):
        project_path_parts = project_path.split("/")
    def do_publish(
        self, sbom_content: str, project_path: str, sbom_type: str, allow_retry=True
    ):
        project_path_parts = project_path.split(self.path_separator)
        # determine publish params
        params = {}
        project_def = DtProjectDef(project_path_parts[-1])
@@ -264,8 +360,9 @@ class Scanner:
        )
        resp = requests.post(
            f"{self.base_api_url}/v1/bom",
            headers={"X-API-Key": self.api_key, "accept": "application/json"},
            headers={"X-API-Key": self.api_key, "accept": MIME_APPLICATION_JSON},
            files={"bom": sbom_content, **params},
            verify=self.verify_ssl,
        )
        try:
            resp.raise_for_status()
@@ -286,11 +383,16 @@ class Scanner:
                print("- create projects...")
                # replace last path part with project UUID
                # TODO: retrieve classifier from SBOM
                project_path_parts[-1] = "#" + self.get_or_create_project(project_path)
                project_path_parts[-1] = "#" + self.get_or_create_project(
                    project_path, sbom_type
                )
                # then retry
                print("- retry publish...")
                self.do_publish(
                    sbom_content, "/".join(project_path_parts), allow_retry=False
                    sbom_content,
                    self.path_separator.join(project_path_parts),
                    sbom_type,
                    allow_retry=False,
                )
            else:
                raise
@@ -340,6 +442,12 @@ def run() -> None:
        default=os.getenv("DEPTRACK_PROJECT_PATH"),
        help="Dependency Track target project path to publish SBOM files to (see doc)",
    )
    parser.add_argument(
        "-s",
        "--path-separator",
        default=os.getenv("DEPTRACK_PATH_SEPARATOR", "/"),
        help="Separator to use in project path (default: '/')",
    )
    parser.add_argument(
        "-i",
        "--insecure",
@@ -353,7 +461,7 @@ def run() -> None:
        default=os.getenv(
            "DEPTRACK_SBOM_PATTERNS", "**/*.cyclonedx.json **/*.cyclonedx.xml"
        ).split(" "),
        help="SBOM file patterns to publish (supports glob patterns)",
        help="SBOM file patterns to publish (supports glob patterns). Default: '**/*.cyclonedx.json **/*.cyclonedx.xml'",
    )

    # parse command and args
@@ -381,6 +489,9 @@ def run() -> None:
    print(
        f"- project path   (--project-path)  : {AnsiColors.CYAN}{args.project_path}{AnsiColors.RESET}"
    )
    print(
        f"- path separator (--path-separator): {AnsiColors.CYAN}{args.path_separator}{AnsiColors.RESET}"
    )
    print(
        f"- insecure       (--insecure)      : {AnsiColors.CYAN}{args.insecure}{AnsiColors.RESET}"
    )
@@ -394,7 +505,8 @@ def run() -> None:
        base_api_url=args.base_api_url,
        api_key=args.api_key,
        project_path=args.project_path,
        insecure=args.insecure,
        path_separator=args.path_separator,
        verify_ssl=not args.insecure,
    )
    scanner.scan(args.sbom_patterns)