mailarchive of the ptxdist mailing list
From: Michael Olbrich <m.olbrich@pengutronix.de>
To: ptxdist@pengutronix.de
Subject: Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
Date: Fri, 29 May 2020 08:23:46 +0200
Message-ID: <20200529062346.GB31789@pengutronix.de>
In-Reply-To: <20200511100306.7948-2-rhi@pengutronix.de>

On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
>  doc/contributing.rst        |   5 +
>  doc/daily_work.inc          |   2 +
>  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
>  doc/ref_make_variables.inc  |   4 +
>  4 files changed, 219 insertions(+)
>  create mode 100644 doc/daily_work_licenses.inc
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index 705f01377d32..7352b46dfcf0 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
>  sections. These are meant as a helper to simplify creating custom stages.
>  Any remaining default stages must be removed.
>  
> +New packages should also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index a37aac4c3339..f68d25bf7cb5 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
>  newer.
>  If your kernel does not meet this requirement you can provide your own local
>  and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> new file mode 100644
> index 000000000000..7e90b7ba541d
> --- /dev/null
> +++ b/doc/daily_work_licenses.inc
> @@ -0,0 +1,208 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP; if in doubt, see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``, or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  on whether the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check some relevant-sounding files, like ``main.c`` for license headers.
> +  Often additional information can be found here.

This is too lax for me. Unless there is an explicit statement that all code
has the same license, all source files must be checked. Older projects in
particular often have a few files that were copied from somewhere else and
have a different license.
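
A rough sketch of what such a check over all source files could look like,
not something PTXdist provides: the Python script below walks an extracted
source tree and groups files by the SPDX-License-Identifier tag found near the
top of each file. The function name licenses_by_file, the suffix list and the
30-line window are assumptions made up for this illustration. Projects without
SPDX tags simply end up in the untagged group, which still needs a manual look.

```python
import itertools
import os
import re
import tempfile
from collections import defaultdict

# Assumption: files are marked with "SPDX-License-Identifier:" tags.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:[ \t]*([^\n*]+)")
# Illustrative suffix list, extend as needed for the package at hand.
SOURCE_SUFFIXES = (".c", ".h", ".cc", ".cpp", ".py", ".sh")


def licenses_by_file(source_dir, head_lines=30):
    """Group source files by the SPDX tag in their first `head_lines` lines.

    Files without a tag are collected under the key None.
    """
    found = defaultdict(list)
    for root, _dirs, files in os.walk(source_dir):
        for name in sorted(files):
            if not name.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(root, name)
            with open(path, errors="replace") as fh:
                head = "".join(itertools.islice(fh, head_lines))
            match = SPDX_TAG.search(head)
            tag = match.group(1).strip() if match else None
            found[tag].append(os.path.relpath(path, source_dir))
    return dict(found)


# Small self-contained demo on a throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "main.c": "/* SPDX-License-Identifier: GPL-2.0-or-later */\n",
        "getopt.c": "// SPDX-License-Identifier: BSD-2-Clause\n",
        "old.c": "/* copied from somewhere else, no tag */\n",
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(text)
    report = licenses_by_file(tmp)
    for tag, paths in report.items():
        print(tag or "NO TAG, check manually", paths)
```

Anything listed without a tag, or with a tag that differs from the rest of the
project, is a candidate for an extra entry in the license variables.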

> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +On the other hand, there are some things that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system prior to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Distillation down to license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> +
> +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
> +license content to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to detect license material in the project.
> +
> +License texts don't have to match exactly; you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail addresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +If no license identifier matches, use ``unknown``.
> +If the project is considered open source or free software, you can
> +`report its license to be added to the SPDX license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.

I think I'd like to use something else here. Right now 'unknown' mostly
means "nobody looked at this yet". I want something else to say "I looked
at this and here is the license text but there is no SPDX identifier".
I'm considering the package name to make it unique. But that has the
downside that it's not easily found.
Suggestions?

> +Conflicting statements
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in COPYING or the README says *"GNU General Public License"*,
> +  but the license text is in fact a BSD license, the correct license is the BSD
> +  license.
> +
> +Author Intent:
> +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
> +  license text, the correct licensing information is probably *"LGPL 2.1"* –
> +  the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing"* and COPYING says *"do $thing"*, the more
> +  recent file prevails.
> +
> +  .. note::
> +
> +   Any such case is considered a bug and should be reported to the upstream maintainer!
> +
> +License versions, and GPL-vv-only or GPL-vv-or-later?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> +The GPL text itself does not give information on that in its terms and
> +conditions.
> +Sometimes there is a notice at the top of the COPYING or the README file stating
> +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> +Otherwise: check headers in relevant files.
> +
> +If no other license information can be found, but one file mentions e.g.
> +*"GPL-vv or later"*, use that information for the whole project.
> +Conversely, if no license information can be found except a ``COPYING`` which
> +contains a GPL-2.0 text, the license is GPL-2.0-only.
> +
> +Sometimes the best information available is statements like
> +*"this code is under GPL"* without any version information.
> +Such cases should be interpreted as the most liberal reading,
> +i.e. *GPL-1.0-or-later* (any possible GPL version).

I'm not sure this is good. I would say: when in doubt, be restrictive.
After all, this is about compliance. If we comply with the more restrictive
interpretation, then we also comply with the more liberal interpretations.

> +If multiple versions and variants can be found in the project, combine them with
> +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.

This should also mention `OR` for a license choice.
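
To illustrate the difference between the two operators, here is a small Python
sketch, under the assumption that only identifiers, AND, OR and parentheses
occur (no WITH exceptions). The function name satisfiable and the idea of
passing in a set of licenses one is able to comply with are made up for this
example. It relies on AND binding tighter than OR, as in SPDX license
expressions.

```python
import re

# Tokens: parentheses, or a run of identifier characters (also AND / OR).
TOKEN = re.compile(r"\(|\)|[A-Za-z0-9.+-]+")


def satisfiable(expression, acceptable):
    """Return True if complying with the licenses in `acceptable` is enough.

    AND means every operand must be complied with, OR leaves a choice.
    """
    tokens = TOKEN.findall(expression)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()  # always parse, then combine
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = parse_atom()
            result = result and rhs
        return result

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            pos += 1  # skip the closing parenthesis
            return result
        return token in acceptable

    return parse_or()


print(satisfiable("GPL-2.0-only AND BSD-2-Clause", {"GPL-2.0-only"}))
print(satisfiable("GPL-2.0-only OR MIT", {"MIT"}))
```

With AND, complying with only one of the licenses is not enough, while with OR
it is.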

Michael

> +Public domain software
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +When this is done, it is purely for historical reasons, and ``public_domain``
> +should normally not be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readability <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE_FILES``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be:
> +
> +.. code-block:: terminal
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
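
For those who prefer a script over the sed pipeline, a rough Python equivalent
could look like the sketch below. The helper names md5_of_lines and
license_file_entry are hypothetical and not part of PTXdist. The output format
simply mirrors the file://...;startline=...;endline=...;md5=... entries from
the ddrescue example above.

```python
import hashlib
import itertools
import os
import tempfile


def md5_of_lines(path, startline, endline):
    """MD5 over lines startline..endline (1-based, inclusive) of a file."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # islice uses 0-based, end-exclusive indices.
        for line in itertools.islice(fh, startline - 1, endline):
            digest.update(line)
    return digest.hexdigest()


def license_file_entry(path, startline, endline, name=None):
    """Format one entry in the style used in <PKG>_LICENSE_FILES."""
    checksum = md5_of_lines(path, startline, endline)
    shown = name if name is not None else path
    return f"file://{shown};startline={startline};endline={endline};md5={checksum}"


# Demo on a throwaway file, so the sketch runs without a real source tree.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "main.cc")
    with open(sample, "wb") as fh:
        fh.write(b"/* line 1 */\n/* line 2 */\n/* line 3 */\n")
    demo_entry = license_file_entry(sample, 1, 2, name="main.cc")
    print(demo_entry)
```

The file is read in binary mode so that the checksum covers exactly the bytes
of the selected lines.
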
> +
> +If the copyright statement contains a string of years, leave those lines out for
> +the calculation of the checksum, as an added year does not change the license
> +(in fact, not even a single year is needed for the license to be valid),
> +but only makes package version updates more cumbersome.
> +
> +If additional information is found in the ``README``, or if license headers in
> +source files are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file do
> +not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definitely include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting them to license
> +changes in their source.
> +
> +As in the example above, sometimes more than one license applies.
> +If different files in the package are under different licenses, use ``AND`` (e.g.
> +``GPL-2.0-only AND LGPL-2.1``).
> +If the package leaves the choice to modify/redistribute under one or the other
> +license, use ``OR``.
> +
> +.. note::
> +
> +   For each single license in the compound statement, include at least one file
> +   with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
> index 56912bb2e364..701c029591d8 100644
> --- a/doc/ref_make_variables.inc
> +++ b/doc/ref_make_variables.inc
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -228,6 +230,7 @@ Package Definition
>    here. Use ``proprietary`` for proprietary packages and ``ignore`` for
>    packages without their own license, e.g. meta packages or packages that
>    only install files from ``projectroot/``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -239,6 +242,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de
