qtbase/cmake/QtSbomHelpers.cmake
Alexandru Croitor 65bf57ce7c CMake: Generate an SPDX v2.3 SBOM file for each built repository
This change adds a new -sbom configure option to allow generating and
installing an SPDX v2.3 SBOM file when building a qt repo.

The -sbom-dir option can be used to configure the location where
each repo sbom file will be installed.

By default it is installed into

 $prefix/$archdatadir/sbom/$sbom_lower_project_name.sdpx

which is basically ~/Qt/sbom/qtbase-6.8.0.spdx

The file is installed as part of the default installation rules, but
it can also be installed manually using the "sbom" installation
component, or "sbom_$lower_project_name" in a top-level build. For
example: cmake install . --component sbom_qtbase

CMake 3.19+ is needed to read the qt_attribution.json files for
copyrights, license info, etc. When using an older cmake version,
configuration will error out. It is possible to opt into using an
older cmake version, but the generated sbom will lack all the
attribution file information.
Using an older cmake version is untested and not officially supported.

Implementation notes.

The bulk of the implementation is split into 4 new files:

- QtPublicSbomHelpers.cmake - for Qt-specific collecting, processing
  and dispatching the generation of various pieces of the SBOM document
  e.g. a SDPX package associated with a target like Core, a SDPX
  file entry for each target binary file (per-config shared library,
  archive, executable, etc)

- QtPublicSbomGenerationHelpers.cmake - for non-Qt specific
  implementation of SPDX generation. This also has some code that was
  taken from the cmake-sbom 3rd party project, so it is dual licensed
  under the usual Qt build system BSD license, as well as the MIT
  license of the 3rd party project

- QtPublicGitHelpers.cmake - for git related features, mainly to embed
  queried hashes or tags into version strings, is dual-licensed for
  the same reasons as QtPublicSbomGenerationHelpers.cmake

- QtSbomHelpers.cmake - Qt-specific functions that just forward
  arguments to the public functions. These are meant to be used in our
  Qt CMakeLists.txt instead of the public _qt_internal_add_sbom ones
  for naming consistency. These function would mostly be used to
  annotate 3rd party libraries with sbom info and to add sbom info
  for unusual target setups (like the Bootstrap library), because most
  of the handling is already done automatically via
  qt_internal_add_module/plugin/etc.

The files are put into Public cmake files, with the future hope of
making this available to user projects in some capacity.

The distinction of Qt-specific and non-Qt specific code might blur a
bit, and thus the separation across files might not always be
consistent, but it was best effort.

The main purpose of the code is to collect various information about
targets and their relationships and generate equivalent SPDX info.

Collection is currently done for the following targets: Qt modules,
plugins, apps, tools, system libraries, bundled 3rd party libraries
and partial 3rd party sources compiled directly as part of Qt targets.

Each target has an equivalent SPDX package generated with information
like version, license, copyright, CPE (common vulnerability
identifier), files that belong to the package, and relationships on
other SPDX packages (associated cmake targets), mostly gathered from
direct linking dependencies.

Each package might also contain files, e.g. libQt6Core.so for the Core
target. Each file also has info like license id, copyrights, but also
the list of source files that were used to generate the file and a
sha1 checksum.

SPDX documents can also refer to packages in other SPDX documents, and
those are referred to via external document references. This is the
case when building qtdeclarative and we refer to Core.

For qt provided targets, we have complete information regarding
licenses, and copyrights.

For bundled 3rd party libraries, we should also have most information,
which is usually parsed from the
src/3rdparty/libfoo/qt_attribution.json files.
If there are multiple attribution files, or if the files have multiple
entries, we create a separate SBOM package for each of those entries,
because each might have a separate copyright or version, and an sbom
package can have only one version (although many copyrights).

For system libraries we usually lack the information because we don't
have attribution files for Find scripts. So the info needs to be
manually annotated via arguments to the sbom function calls, or the
FindFoo.cmake scripts expose that information in some form and we
can query it.

There are also corner cases like 3rdparty sources being directly
included in a Qt library, like the m4dc files for Gui, or PCRE2 for
Bootstrap.
Or QtWebEngine libraries (either Qt bundled or Chromium bundled or
system libraries) which get linked in by GN instead of CMake, so there
are no direct targets for them.
The information for these need to be annotated manually as well.

There is also a distinction to be made for static Qt builds (or any
static Qt library in a shared build), where the system libraries found
during the Qt build might not be the same that are linked into the
final user application or library.

The actual generation of the SBOM is done by file(GENERATE)-ing one
.cmake file for each target, file, external ref, etc, which will be
included in a top-level cmake script.

The top-level cmake script will run through each included file, to
append to a "staging" spdx file, which will then be used in a
configure_file() call to replace some final
variables, like embedding a file checksum.

There are install rules to generate a complete SBOM during
installation, and an optional 'sbom' custom target that allows
building an incomplete SBOM during the build step.

The build target is just for convenience and faster development
iteration time. It is incomplete because it is missing the installed
file SHA1 checksums and the document verification code (the sha1 of
all sha1s). We can't compute those during the build before the files
are actually installed.

A complete SBOM can only be achieved at installation time. The install
script will include all the generated helper files, but also set some
additional variables to ensure checksumming happens, and also handle
multi-config installation, among other small things.

For multi-config builds, CMake doesn't offer a way to run code after
all configs are installed, because they might not always be installed,
someone might choose to install just Release.
To handle that, we rely on ninja installing each config sequentially
(because ninja places the install rules into the 'console' pool which
runs one task at a time).
For each installed config we create a config-specific marker file.
Once all marker files are present, whichever config ends up being
installed as the last one, we run the sbom generation once, and then
delete all marker files.

There are a few internal variables that can be set during
configuration to enable various checks (and other features) on the
generated spdx files:

- QT_INTERNAL_SBOM_VERIFY
- QT_INTERNAL_SBOM_AUDIT
- QT_INTERNAL_SBOM_AUDIT_NO_ERROR
- QT_INTERNAL_SBOM_GENERATE_JSON
- QT_INTERNAL_SBOM_SHOW_TABLE
- QT_INTERNAL_SBOM_DEFAULT_CHECKS

These use 3rd party python tools, so they are not enabled by default.
If enabled, they run at installation time after the sbom is installed.
We will hopefully enable them in CI.

Overall, the code is still a bit messy in a few places, due to time
constraints, but can be improved later.

Some possible TODOs for the future:
- Do we need to handle 3rd party libs linked into a Qt static library
  in a Qt shared build, where the Qt static lib is not installed, but
  linked into a Qt shared library, somehow specially?
  We can record a package for it, but we can't
  create a spdx file record for it (and associated source
  relationships) because we don't install the file, and spdx requires
  the file to be installed and checksummed. Perhaps we can consider
  adding some free-form text snippet to the package itself?

- Do we want to add parsing of .cpp source files for Copyrights, to
  embed them into the packages? This will likely slow down
  configuration quite a bit.

- Currently sbom info attached to WrapFoo packages in one repo is
  not exported / available in other repos. E.g. If we annotate
  WrapZLIB in qtbase with CPE_VENDOR zlib, this info will not be
  available when looking up WrapZLIB in qtimageformats.
  This is because they are IMPORTED libraries, and are not
  exported. We might want to record this info in the future.

[ChangeLog][Build System] A new -sbom configure option can be used
to generate and install a SPDX SBOM (Software Bill of Materials) file
for each built Qt repository.

Task-number: QTBUG-122899
Change-Id: I9c730a6bbc47e02ce1836fccf00a14ec8eb1a5f4
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
Reviewed-by:  Alexey Edelev <alexey.edelev@qt.io>
(cherry picked from commit 37a5e001277db9e1392a242171ab2b88cb6c3049)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
2024-06-13 14:55:07 +00:00

25 lines
701 B
CMake

# Copyright (C) 2024 The Qt Company Ltd.
# SPDX-License-Identifier: BSD-3-Clause
# For now these are simple internal forwarding wrappers for the public counterparts, which are
# meant to be used in qt repo CMakeLists.txt files.
function(qt_internal_add_sbom)
_qt_internal_add_sbom(${ARGN})
endfunction()
function(qt_internal_extend_sbom)
_qt_internal_extend_sbom(${ARGN})
endfunction()
function(qt_internal_sbom_add_license)
_qt_internal_sbom_add_license(${ARGN})
endfunction()
function(qt_internal_extend_sbom_dependencies)
_qt_internal_extend_sbom_dependencies(${ARGN})
endfunction()
function(qt_find_package_extend_sbom)
_qt_find_package_extend_sbom(${ARGN})
endfunction()