From: Michael Olbrich <m.olbrich@pengutronix.de>
To: ptxdist@pengutronix.de
Cc: Roland Hieber <rhi@pengutronix.de>,
Felicitas Jung <f.jung@pengutronix.de>
Subject: Re: [ptxdist] [PATCH] doc: working with licensing information in packages
Date: Wed, 16 Jun 2021 16:19:43 +0200
Message-ID: <20210616141943.GO839947@pengutronix.de>
In-Reply-To: <20210608103639.24336-1-rhi@pengutronix.de>
On Tue, Jun 08, 2021 at 12:36:40PM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
> v1 -> v2:
> - rebase to current master
> - squash PATCH 1/2 ("link to the SPDX license list")
> - move from daily use into dev manual chapter
> - expand and rewrite some parts completely
> - absorb old content in doc/dev_add_new_pkgs.rst
> - address feedback from Michael Olbrich:
> - check all source files instead of "some relevant-sounding files"
> - introduce "custom" and "custom-exception" identifiers instead of
> "unknown"
> - be restrictive and err on the side of caution when interpreting
> ambiguities
> - briefly mention the AND, OR and bracket syntaxes
>
> PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
> ---
> doc/contributing.rst | 4 +
> doc/daily_work.inc | 2 +
> doc/dev_add_new_pkgs.rst | 46 +------
> doc/dev_licenses.rst | 243 +++++++++++++++++++++++++++++++++++++
> doc/dev_manual.rst | 1 +
> doc/ref_make_variables.rst | 20 ++-
> 6 files changed, 267 insertions(+), 49 deletions(-)
> create mode 100644 doc/dev_licenses.rst
>
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index bdaddee245a9..496998c913f7 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -103,6 +103,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
> macros used in menu files. There are often typos or the variables was just
> removed.
>
> +New packages must also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
> Helper Scripts
> --------------
>
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index 8fe7739aa0c8..ab901a54ee60 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1480,3 +1480,5 @@ be enabled. A used mount option of the overlayfs in the default
> newer.
> If your kernel does not meet this requirement you can provide your own local
> and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
> index 4ae2765c2ce9..a9e8fcf236c4 100644
> --- a/doc/dev_add_new_pkgs.rst
> +++ b/doc/dev_add_new_pkgs.rst
> @@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
>
> - ``*_LICENSE`` enables the user to get a list of licenses she/he is
> using in her/his project (licenses of the enabled packages).
> + See :ref:`licensing_in_packages` below for detailed information.
>
> After enabling the menu entry, we can start to check the *get* and
> *extract* stages, calling them manually one after another.
> @@ -604,51 +605,6 @@ This will re-start with a **clean** BSP and builds exactly the new package and
> its (known) dependencies. If this builds successfully as well we are really done
> with the new package.
>
> -Some Notes about Licenses
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
> -example) is very important and must be filled by the developer of the package.
> -Many licenses bring in obligations using the corresponding package (*attribution*
> -for example). To make life easier for everybody the license for a package must
> -be provided. *SPDX* license identifiers unify the license names and are used
> -in PTXdist to identify license types and obligations.
> -
> -If a package comes with more than one license, all of their SPDX identifiers
> -must be listed and connected with the keyword ``AND``. If your package comes
> -with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
> -
> -.. code-block:: make
> -
> - FOO_LICENSE := GPL-2.0 AND LGPL-2.1
> -
> -One specific obligation cannot be detected examining the SPDX license identifiers
> -by PTXdist: *the license choice*. In this case all licenses of choice must be
> -listed and connected by the keyword ``OR``.
> -
> -If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
> -*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
> -
> -.. code-block:: make
> -
> - FOO_LICENSE := GPL-2.0 OR GPL-3.0
> -
> -SPDX License Identifiers
> -^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -A list of SPDX license identifiers can be found here:
> -
> - https://spdx.org/licenses/
> -
> -Help to Detect the Correct License
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -License identification isn't trivial. A help in doing so can be the following
> -repository and its content. It contains a list of known licenses based on their
> -SPDX identifier. The content is without formatting to simplify text search.
> -
> - https://github.com/spdx/license-list-data/tree/master/text
> -
> Advanced Rule Files
> ~~~~~~~~~~~~~~~~~~~
>
> diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
> new file mode 100644
> index 000000000000..06b4decd7728
> --- /dev/null
> +++ b/doc/dev_licenses.rst
> @@ -0,0 +1,243 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +In that process, we aim to collect the baseline set of licenses
> +which at least apply to a package.
> +There may be other licenses which apply too, but the complete set often cannot
> +be found without a time-consuming review.
> +Still, the extracted license information in PTXdist can serve as a hint for
> +the full license compliance process,
> +and can help to exclude certain software under certain licenses from the build.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``, or ``LICENSE``.
> + These often only contain the license text and, in case of GPL, no information
> + if the code is available under the *-only* or *-or-later* variant.
> + Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> + Often there is important information there, e.g. in case of GPL if the
> + software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check source files, like ``*.c`` for license headers.
> + Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> + `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +Ideally you'll find two pieces of information:
> +
> +* A *license text* (e.g. a GNU General Public License v2.0 text)
> +* A *license statement* that states that a certain license applies to (parts of) the project
> + (often also including copyright statements and a warranty disclaimer)
> +
> +Some licenses (e.g. BSD-style licenses) are also short enough so that both
> +pieces are combined in a short comment header in a source file or a README.
> +Strictly speaking, both the license text and the license statement must be
> +present for a complete, unambiguous license, but see the next section about
> +edge cases.
> +
> +On the other hand, there are some parts that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> + or by the build system previous to packaging.
> + The generator itself cannot hold copyright, although the authors of the
> + templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> + and can therefore be ignored (e.g. configure scripts, Makefiles).
> + These cases sometimes can be hard to detect – if unsure, include the file in
> + your research.
> +
> +Some projects also include a COPYING.LIB containing an LGPL text, which is
> +referenced nowhere in the project.
> +In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
> +project skeleton and the maintainer forgot to delete it.
> +
> +Distillation into license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
> +
> +Either the license identifier is clear, e.g. because the README says "GPL 2.0
> +or later" (check the license text to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to match texts to SPDX license identifiers.
> +
> +License texts don't have to match exactly; you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail addresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +.. important::
> +
> + If no license identifier matches, or if anything is unclear about the
> + licensing situation, use the identifier ``custom`` (for licenses)
> + or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
> + custom-exception``).
> +
> +If SPDX doesn't know about a license yet, and the project is considered open
> +source or free software, you can `report its license to be added to the SPDX
> +license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> +
> +Multiple licenses
> +^^^^^^^^^^^^^^^^^
> +
> +Open-source software is re-used all the time, so it can happen that some files
> +make their way into a different project.
> +This is usually no problem.
> +If you encounter multiple parts of the project under different licenses, combine
> +their license expressions with ``AND``.
> +For example, in a project that contains both a library and command line tools,
> +the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
> +
> +Sometimes files are licensed under multiple licenses, and only one license is to
> +be selected.
> +In that case, combine the license expressions with ``OR``.
> +This is often the case with Device Trees in the Linux kernel, e.g.:
> +``GPL-2.0-only OR BSD-2-Clause``.
> +
> +No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
> +
> +Conflicting and ambiguous statements
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> + If the header in the COPYING file says *"GNU General Public License"*, but
> + the license text below that is in fact a BSD license, the correct license for
> + the license identifier is the BSD license.
> +
> +Author Intent:
> + If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
> + boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
> + – the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> + If README and COPYING are both clearly written by the author themselves, and
> + the README says *"don't do $thing"* and COPYING says *"do $thing"*, the more
> + recent file prevails.
> +
> +Scope:
> + If no license statement can be found, but there is a COPYING file containing
> + a license text, infer that the whole project is licensed under that license.
> +
> +Err on the side of caution:
> + If all you can find is a GPL license text, this doesn't yet tell you whether
> + the project is licensed under the *-only* or the *-or-later* variant.
> + In that case, interpret the license restrictively and choose the *-only*
> + variant for the license identifier.
> +
> +Don't assume:
> + If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
> +
> +.. note::
> +
> + Any of these cases is considered a bug and should be reported to the upstream maintainers!
> +
> +"Public Domain" software
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +This is purely for historical reasons, and ``public_domain`` should normally
> +*not* be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readability <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE_FILES``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> + :caption: ddrescue.make
> +
> + DDRESCUE_LICENSE := GPL-2.0-or-later AND BSD-2-Clause
> + DDRESCUE_LICENSE_FILES := \
> + file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> + file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> + file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be::
> +
> + $ sed -n 1,16p main.cc | md5sum -
> + a01d61d3293ce28b883d8ba0c497e968
> +
> +If the copyright statement contains a string of years, leave those lines out for
> +the calculation of the checksum, as an added year does not change the license
> +(in fact, not even a single year is needed for the license to be valid),
> +but only makes package version updates more cumbersome.

I think this is either unclear or incorrect. To me, a 'copyright
statement' is something like this:

  Copyright (C) 2013 by Michael Olbrich <m.olbrich@pengutronix.de>

For many licenses, such a line must not be removed, so omitting those lines
is wrong.

In some cases the copyright header of a file contains lines with nothing
but the year. Maybe those can be skipped, but they are pretty rare.
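
To illustrate the trade-off, here is a rough sketch. The header text, author
name and temporary file are made up for this example and are not taken from any
real package. If the full copyright line stays inside the checksummed range, a
bumped year changes the MD5 sum and the rule needs a new checksum, but the
notice itself still ends up in the license report:

```shell
#!/bin/sh
# Hypothetical source header, invented for illustration only.
header=$(mktemp)
cat > "$header" <<'EOF'
/*
 * Copyright (C) 2013 by Jane Doe <jane@example.org>
 *
 * Redistribution and use in source and binary forms are permitted
 * provided that the above copyright notice is retained.
 */
EOF

# Checksum over the whole header, copyright statement included
# (same sed/md5sum approach as in the patch).
sum_old=$(sed -n 1,6p "$header" | md5sum | cut -d' ' -f1)

# Simulate an upstream release that only extends the year range.
sum_new=$(sed 's/2013/2013-2021/' "$header" | sed -n 1,6p | md5sum | cut -d' ' -f1)

# The two sums differ, so the md5= value in the rule file would have to be
# updated on that version bump.
echo "before: $sum_old"
echo "after:  $sum_new"
rm -f "$header"
```

That extra update on a version bump seems like a small price compared to
dropping the notice.
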
The rest looks good to me.
Michael
> +
> +If additional information is in the README or license headers in source files
> +are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file
> +do not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definitely include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting them to license
> +changes in their source.
> +
> +.. note::
> +
> + For each single license identifier in the license expression, include at
> + least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> +
> +PTXdist will include all files (or their respective lines) that were referenced
> +in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
> diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
> index c232cc91428a..0a1eaf8a1413 100644
> --- a/doc/dev_manual.rst
> +++ b/doc/dev_manual.rst
> @@ -13,6 +13,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
> dev_add_new_pkgs
> dev_add_bin_only_files
> dev_create_new_pkg_templates
> + dev_licenses
> dev_layers_in_ptxdist
> dev_kconfig_diffs
> dev_code_signing
> diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
> index 674acdcea982..2ee34856dd02 100644
> --- a/doc/ref_make_variables.rst
> +++ b/doc/ref_make_variables.rst
> @@ -127,6 +127,8 @@ Other useful variables:
> that are built and installed during the PTXdist build run.
> There are analogous ``-y`` and ``-m`` variants of those variables too.
>
> +.. _package_specific_variables:
> +
> Package Specific Variables
> ~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> @@ -223,10 +225,19 @@ Package Definition
> 'gdbserver' for an example.
>
> ``<PKG>_LICENSE``
> - The license of the package. The SPDX license identifiers should be used
> - here. Use ``proprietary`` for proprietary packages and ``ignore`` for
> - packages without their own license, e.g. meta packages or packages that
> - only install files from ``projectroot/``.
> + The license of the package in the form of an `SPDX license expression
> + <https://spdx.org/licenses/>`_.
> + The following values have special meaning for PTXdist:
> +
> + - ``custom`` and ``custom-exception``: for licenses or license exceptions
> + that are considered free software, but do not match any license or license
> + exception known to SPDX.
> + - ``proprietary``: for proprietary (non-free) packages
> + - ``ignore`` for packages without their own license, e.g. meta packages or
> + packages that only install files from ``projectroot/``
> + - ``unknown``: no licensing information was extracted yet
> +
> + See the section :ref:`licensing_in_packages` for more information.
>
> ``<PKG>_LICENSE_FILES``
> A space separated list of URLs of license text files. The URLs must be
> @@ -238,6 +249,7 @@ Package Definition
> used in case the specified file contains more than just the license text,
> e.g. if the license is in the header of a source file. For non ASCII or
> UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> + See the section :ref:`licensing_in_packages` for more information.
>
> For most packages the variables described above are undefined by default.
> However, for cross and host packages these variables default to the value
> --
> 2.29.2
>
>
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de
> To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |