mailarchive of the ptxdist mailing list
 help / color / mirror / Atom feed
* [ptxdist] [PATCH 1/2] doc: ref_make_variables: link to the SPDX license list
@ 2020-05-11 10:03 Roland Hieber
  2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
  2021-06-08 10:36 ` [ptxdist] [PATCH] " Roland Hieber
  0 siblings, 2 replies; 19+ messages in thread
From: Roland Hieber @ 2020-05-11 10:03 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber

Signed-off-by: Roland Hieber <rhi@pengutronix.de>
---
 doc/ref_make_variables.inc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
index b770b1b49f18..56912bb2e364 100644
--- a/doc/ref_make_variables.inc
+++ b/doc/ref_make_variables.inc
@@ -223,7 +223,8 @@ Package Definition
   'gdbserver' for an example.
 
 ``<PKG>_LICENSE``
-  The license of the package. The SPDX license identifiers should be used
+  The license of the package.
+  An `SPDX license identifier <https://spdx.org/licenses/>`_ should be used
   here. Use ``proprietary`` for proprietary packages and ``ignore`` for
   packages without their own license, e.g. meta packages or packages that
   only install files from ``projectroot/``.
-- 
2.20.1


_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-11 10:03 [ptxdist] [PATCH 1/2] doc: ref_make_variables: link to the SPDX license list Roland Hieber
@ 2020-05-11 10:03 ` Roland Hieber
  2020-05-26 10:29   ` Roland Hieber
                     ` (2 more replies)
  2021-06-08 10:36 ` [ptxdist] [PATCH] " Roland Hieber
  1 sibling, 3 replies; 19+ messages in thread
From: Roland Hieber @ 2020-05-11 10:03 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber, Felicitas Jung

Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
---
 doc/contributing.rst        |   5 +
 doc/daily_work.inc          |   2 +
 doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
 doc/ref_make_variables.inc  |   4 +
 4 files changed, 219 insertions(+)
 create mode 100644 doc/daily_work_licenses.inc

diff --git a/doc/contributing.rst b/doc/contributing.rst
index 705f01377d32..7352b46dfcf0 100644
--- a/doc/contributing.rst
+++ b/doc/contributing.rst
@@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
 sections. These are meant as a helper to simplify creating custom stages.
 Any remaining default stages must be removed.
 
+New packages should also have licensing information in the ``<PKG>_LICENSE``
+and ``<PKG>_LICENSE_FILES`` variables.
+Refer to the section :ref:`licensing_in_packages` for more information.
+
+
 Helper Scripts
 --------------
 
diff --git a/doc/daily_work.inc b/doc/daily_work.inc
index a37aac4c3339..f68d25bf7cb5 100644
--- a/doc/daily_work.inc
+++ b/doc/daily_work.inc
@@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
 newer.
 If your kernel does not meet this requirement you can provide your own local
 and adapted variant of the mentioned mount unit.
+
+.. include:: daily_work_licenses.inc
diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
new file mode 100644
index 000000000000..7e90b7ba541d
--- /dev/null
+++ b/doc/daily_work_licenses.inc
@@ -0,0 +1,208 @@
+.. _licensing_in_packages:
+
+Tracking licensing information in packages
+------------------------------------------
+
+PTXdist aims to track licensing information for every package.
+This includes the license(s) under which a package can be distributed,
+as well as the respective files in the package's source tree that state those terms.
+Sadly there is no widely adopted standard for machine-readable licensing
+information in source code (`yet <https://reuse.software>`_),
+so here are a few hints where to look.
+
+There are many older package rules in PTXdist which don't specify licensing information.
+If you want to help complete the database,
+you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
+Note however that this cannot find wrong or incomplete licensing information.
+
+Finding licensing information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You should first select and extract the package in question, and then have a
+look at in the extracted package sources (usually something like
+``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
+``ptxdist package-info mypackage``).
+
+* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
+  These often only contain the license text and, in case of GPL, no information
+  if the code is available under the *-only* or *-or-later* variant.
+  Sometimes these files are in a folder ``/doc`` or ``/legal``.
+
+* Check the ``README``, if there is any.
+  Often there is important information there, e.g. in case of GPL if the
+  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
+
+* Check some relevant-sounding files, like ``main.c`` for license headers.
+  Often additional information can be found here.
+
+* If you want to be extra sure, use a license compliance toolchain (e.g.
+  `FOSSology <https://www.fossology.org/>`__) on the project.
+
+On the other hand, there are some things that can be ignored for our purposes:
+
+* Everything that is auto-generated, either by a script in the project source,
+  or by the build system previous to packaging.
+  The generator itself cannot hold copyright, although the authors of the
+  templates used for the generation or the authors of the generator can.
+
+* Most files belonging to the build system don't make it into the compiled code
+  and can therefore be ignored (e.g. configure scripts, Makefiles).
+  These cases sometimes can be hard to detect – if unsure, include the file in
+  your research.
+
+Distillation down to license identifiers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
+
+Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
+license content to be sure), or you can use tools like
+`FOSSology <https://www.fossology.org>`__,
+`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
+or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
+to detect license material in the project.
+
+License texts don't have to match exactly, you should apply the
+`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
+accordingly.
+The important part here is that the project's license and the SPDX identifier
+describe the same licensing terms.
+"Rather close" or "mostly similar" statements are not enough for a match,
+but simple unimportant changes like replacing *"The Author"* with the project's
+maintainer's name, or a change in e-mail adresses, are usually okay.
+
+For software that is not open-source according to the `OSI definition
+<https://opensource.org/osd>`_, use the identifier ``proprietary``.
+
+If no license identifier matches, use ``unknown``.
+If the project is considered open source or free software, you can
+`report its license to be added to the SPDX license list
+<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
+
+Conflicting statements
+^^^^^^^^^^^^^^^^^^^^^^
+
+Human interpretation is needed when statements inside the project conflict with
+each other.
+Some clues that can help you decide:
+
+Detailedness:
+  If the header in COPYING or the README says *"GNU General Public License"*,
+  but the license text is in fact a BSD license, the correct license is the BSD
+  license.
+
+Author Intent:
+  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
+  license text, the correct licensing information is probably *"LGPL 2.1"* –
+  the README written by the author prevails over the boilerplate text.
+
+Recency:
+  If README and COPYING are both clearly written by the author themselves, and
+  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
+  recent file prevails.
+
+  .. note::
+
+   Any of such cases is considered a bug and should be reported to the upstream maintainer!
+
+License versions, and GPL-vv-only or GPL-vv-or-later?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
+license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
+The GPL text itself does not give information on that in its terms and
+conditions.
+Sometimes there is a notice at the top of the COPYING or the README file stating
+whether *"-only"* or *"-or-later"* applies – this is the easy case.
+Otherwise: check headers in relevant files.
+
+If no license information can be found, but one file mentions e.g. *"GPL-vv or
+later"*, use that information for the whole project.
+E.g.: no license information can be found except a ``COPYING`` which contains
+a GPL-2.0 text → the license is GPL-2.0-only.
+
+Sometimes the best information available is statements like
+*"this code is under GPL"* without any version information.
+Such cases should be interpreted as the most liberal reading,
+i.e. *GPL-1.0-or-later* (any possible GPL version).
+
+If multiple versions and variants can be found in the project, combine them with
+``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.
+
+Public domain software
+^^^^^^^^^^^^^^^^^^^^^^
+
+For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
+SPDX doesn't supply a license identifier for "Public Domain".
+Nevertheless, some PTXdist package rules specify ``public_domain`` as their
+respective license identifier.
+When this is done, it is purely for historical reasons, and ``public_domain``
+should normally not be used for new packages.
+Some of those "Public Domain" dedications in packages have since been accepted
+in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
+`SQLite <https://spdx.org/licenses/blessing.html>`_.
+
+No license information at all
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+No license - no usage rights!
+
+Definitely report this bug to the upstream maintainer.
+Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
+
+Adding license files to PTXdist package rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
+variable in the respective package rule file.
+All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
+including a checksum so that PTXdist complains when they change.
+
+Example:
+
+.. code-block:: make
+   :caption: ddrescue.make
+
+   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
+   DDRESCUE_LICENSE_FILES	:= \
+           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
+           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
+           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
+
+See the section :ref:`package_specific_variables` for more information about
+the syntax of those two variables.
+
+The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
+command applied to a range of lines.
+For the example above, lines 1 to 16 of main.cc would be:
+
+.. code-block:: terminal
+
+   $ sed -n 1,16p main.cc | md5sum -
+   a01d61d3293ce28b883d8ba0c497e968
+
+If the copyright statement contains a string of years, leave those lines out for
+the calculation of the checksum, as an added year does not change the license
+(in fact, not even a single year is needed for the license to be valid),
+but only makes package version updates more cumbersome.
+
+If additional information is in the ``README`` or license headers in source
+files are used, also include these files (for source code: one of each is enough),
+but use md5sum only on the relevant lines, so changes in the rest of the file do
+not appear as license changes.
+
+For rather chaotic directories with lots of license files, definetly include at
+least one relevant source file with license headers (if there are any), as some
+developers tend to accumulate license files without adjusting it to license
+changes in their source.
+
+As in the example above, sometimes more than one license applies.
+If different files in the package are under different licenses, use ``AND`` (e.g.
+``GPL-2.0-only AND LGPL-2.1``).
+If it leaves the choice to modify/redistribute under one or the other
+license, use ``OR``.
+
+.. note::
+
+   For each single license in the compound statement, include at least one file
+   with checksum in the ``<PKG>_LICENSE_FILES`` variable.
diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
index 56912bb2e364..701c029591d8 100644
--- a/doc/ref_make_variables.inc
+++ b/doc/ref_make_variables.inc
@@ -127,6 +127,8 @@ Other useful variables:
   that are built and installed during the PTXdist build run.
   There are analogous ``-y`` and ``-m`` variants of those variables too.
 
+.. _package_specific_variables:
+
 Package Specific Variables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -228,6 +230,7 @@ Package Definition
   here. Use ``proprietary`` for proprietary packages and ``ignore`` for
   packages without their own license, e.g. meta packages or packages that
   only install files from ``projectroot/``.
+  See the section :ref:`licensing_in_packages` for more information.
 
 ``<PKG>_LICENSE_FILES``
   A space separated list of URLs of license text files. The URLs must be
@@ -239,6 +242,7 @@ Package Definition
   used in case the specified file contains more than just the license text,
   e.g. if the license is in the header of a source file. For non ASCII or
   UTF-8 files the encoding can be specified with ``encoding=<enc>``.
+  See the section :ref:`licensing_in_packages` for more information.
 
 For most packages the variables described above are undefined by default.
 However, for cross and host packages these variables default to the value
-- 
2.20.1


_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
@ 2020-05-26 10:29   ` Roland Hieber
  2020-05-26 11:12   ` Alexander Dahl
  2020-05-29  6:23   ` Michael Olbrich
  2 siblings, 0 replies; 19+ messages in thread
From: Roland Hieber @ 2020-05-26 10:29 UTC (permalink / raw)
  To: ptxdist; +Cc: Felicitas Jung

Any comments here?

 - Roland

On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
>  doc/contributing.rst        |   5 +
>  doc/daily_work.inc          |   2 +
>  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
>  doc/ref_make_variables.inc  |   4 +
>  4 files changed, 219 insertions(+)
>  create mode 100644 doc/daily_work_licenses.inc
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index 705f01377d32..7352b46dfcf0 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
>  sections. These are meant as a helper to simplify creating custom stages.
>  Any remaining default stages must be removed.
>  
> +New packages should also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index a37aac4c3339..f68d25bf7cb5 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
>  newer.
>  If your kernel does not meet this requirement you can provide your own local
>  and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> new file mode 100644
> index 000000000000..7e90b7ba541d
> --- /dev/null
> +++ b/doc/daily_work_licenses.inc
> @@ -0,0 +1,208 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  if the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check some relevant-sounding files, like ``main.c`` for license headers.
> +  Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +On the other hand, there are some things that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Distillation down to license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> +
> +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
> +license content to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to detect license material in the project.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail adresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +If no license identifier matches, use ``unknown``.
> +If the project is considered open source or free software, you can
> +`report its license to be added to the SPDX license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> +
> +Conflicting statements
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in COPYING or the README says *"GNU General Public License"*,
> +  but the license text is in fact a BSD license, the correct license is the BSD
> +  license.
> +
> +Author Intent:
> +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
> +  license text, the correct licensing information is probably *"LGPL 2.1"* –
> +  the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> +  recent file prevails.
> +
> +  .. note::
> +
> +   Any of such cases is considered a bug and should be reported to the upstream maintainer!
> +
> +License versions, and GPL-vv-only or GPL-vv-or-later?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> +The GPL text itself does not give information on that in its terms and
> +conditions.
> +Sometimes there is a notice at the top of the COPYING or the README file stating
> +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> +Otherwise: check headers in relevant files.
> +
> +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> +later"*, use that information for the whole project.
> +E.g.: no license information can be found except a ``COPYING`` which contains
> +a GPL-2.0 text → the license is GPL-2.0-only.
> +
> +Sometimes the best information available is statements like
> +*"this code is under GPL"* without any version information.
> +Such cases should be interpreted as the most liberal reading,
> +i.e. *GPL-1.0-or-later* (any possible GPL version).
> +
> +If multiple versions and variants can be found in the project, combine them with
> +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.
> +
> +Public domain software
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +When this is done, it is purely for historical reasons, and ``public_domain``
> +should normally not be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be:
> +
> +.. code-block:: terminal
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +If the copyright statement contains a string of years, leave those lines out for
> +the calculation of the checksum, as an added year does not change the license
> +(in fact, not even a single year is needed for the license to be valid),
> +but only makes package version updates more cumbersome.
> +
> +If additional information is in the ``README`` or license headers in source
> +files are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file do
> +not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definetly include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting it to license
> +changes in their source.
> +
> +As in the example above, sometimes more than one license applies.
> +If different files in the package are under different licenses, use ``AND`` (e.g.
> +``GPL-2.0-only AND LGPL-2.1``).
> +If it leaves the choice to modify/redistribute under one or the other
> +license, use ``OR``.
> +
> +.. note::
> +
> +   For each single license in the compound statement, include at least one file
> +   with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
> index 56912bb2e364..701c029591d8 100644
> --- a/doc/ref_make_variables.inc
> +++ b/doc/ref_make_variables.inc
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -228,6 +230,7 @@ Package Definition
>    here. Use ``proprietary`` for proprietary packages and ``ignore`` for
>    packages without their own license, e.g. meta packages or packages that
>    only install files from ``projectroot/``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -239,6 +242,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de

-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
  2020-05-26 10:29   ` Roland Hieber
@ 2020-05-26 11:12   ` Alexander Dahl
  2020-05-29  6:23   ` Michael Olbrich
  2 siblings, 0 replies; 19+ messages in thread
From: Alexander Dahl @ 2020-05-26 11:12 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber, Felicitas Jung

Hei hei,

see below …

Am Montag, 11. Mai 2020, 12:03:06 CEST schrieb Roland Hieber:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
>  doc/contributing.rst        |   5 +
>  doc/daily_work.inc          |   2 +
>  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
>  doc/ref_make_variables.inc  |   4 +
>  4 files changed, 219 insertions(+)
>  create mode 100644 doc/daily_work_licenses.inc
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index 705f01377d32..7352b46dfcf0 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -90,6 +90,11 @@ For new packages, the generated templates contain
> commented-out default sections. These are meant as a helper to simplify
> creating custom stages. Any remaining default stages must be removed.
> 
> +New packages should also have licensing information in the
> ``<PKG>_LICENSE`` +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
> +
>  Helper Scripts
>  --------------
> 
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index a37aac4c3339..f68d25bf7cb5 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in
> the default newer.
>  If your kernel does not meet this requirement you can provide your own
> local and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> new file mode 100644
> index 000000000000..7e90b7ba541d
> --- /dev/null
> +++ b/doc/daily_work_licenses.inc
> @@ -0,0 +1,208 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state
> those terms. +Sadly there is no widely adopted standard for
> machine-readable licensing +information in source code (`yet
> <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +There are many older package rules in PTXdist which don't specify licensing
> information. +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree)
> to find those rules. +Note however that this cannot find wrong or
> incomplete licensing information. +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have
> a +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no
> information +  if the code is available under the *-only* or *-or-later*
> variant. +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check some relevant-sounding files, like ``main.c`` for license headers.
> +  Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +On the other hand, there are some things that can be ignored for our
> purposes: +
> +* Everything that is auto-generated, either by a script in the project
> source, +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled
> code +  and can therefore be ignored (e.g. configure scripts, Makefiles). +
>  These cases sometimes can be hard to detect – if unsure, include the file
> in +  your research.
> +
> +Distillation down to license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> +
> +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check
> the +license content to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck
> <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>
> `_, +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to detect license material in the project.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines
> <https://spdx.org/spdx-license-list/matching-guidelines>`_ +accordingly.
> +The important part here is that the project's license and the SPDX
> identifier +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the
> project's +maintainer's name, or a change in e-mail adresses, are usually
> okay. +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +If no license identifier matches, use ``unknown``.
> +If the project is considered open source or free software, you can
> +`report its license to be added to the SPDX license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#reque
> st-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_. +
> +Conflicting statements
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict
> with +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in COPYING or the README says *"GNU General Public
> License"*, +  but the license text is in fact a BSD license, the correct
> license is the BSD +  license.
> +
> +Author Intent:
> +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL
> boilerplate +  license text, the correct licensing information is probably
> *"LGPL 2.1"* – +  the README written by the author prevails over the
> boilerplate text. +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves,
> and +  the README says *"don't do $thing*" and COPYING says *"do $thing*",
> the more +  recent file prevails.
> +
> +  .. note::
> +
> +   Any of such cases is considered a bug and should be reported to the
> upstream maintainer! +
> +License versions, and GPL-vv-only or GPL-vv-or-later?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> +The GPL text itself does not give information on that in its terms and
> +conditions.
> +Sometimes there is a notice at the top of the COPYING or the README file
> stating +whether *"-only"* or *"-or-later"* applies – this is the easy
> case. +Otherwise: check headers in relevant files.
> +
> +If no license information can be found, but one file mentions e.g. *"GPL-vv
> or +later"*, use that information for the whole project.
> +E.g.: no license information can be found except a ``COPYING`` which
> contains +a GPL-2.0 text → the license is GPL-2.0-only.
> +
> +Sometimes the best information available is statements like
> +*"this code is under GPL"* without any version information.
> +Such cases should be interpreted as the most liberal reading,
> +i.e. *GPL-1.0-or-later* (any possible GPL version).
> +
> +If multiple versions and variants can be found in the project, combine them
> with +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license
> identifier. +
> +Public domain software
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons
> <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain
> _within_SPDX_Files>`_, +SPDX doesn't supply a license identifier for "Public
> Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +When this is done, it is purely for historical reasons, and
> ``public_domain`` +should normally not be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been
> accepted +in SPDX, e.g. `libselinux
> <https://spdx.org/licenses/libselinux-1.0.html>`_ or +`SQLite
> <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity
> <https://reuse.software/>`_ :) +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the
> variable ``<PKG>_LICENSE``, +including a checksum so that PTXdist complains
> when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +          
> file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968
> \ +          
> file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc
> 1ac42 +
> +See the section :ref:`package_specific_variables` for more information
> about +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be:
> +
> +.. code-block:: terminal
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +If the copyright statement contains a string of years, leave those lines
> out for +the calculation of the checksum, as an added year does not change
> the license +(in fact, not even a single year is needed for the license to
> be valid), +but only makes package version updates more cumbersome.
> +
> +If additional information is in the ``README`` or license headers in source
> +files are used, also include these files (for source code: one of each is
> enough), +but use md5sum only on the relevant lines, so changes in the rest
> of the file do +not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definetly
> include at +least one relevant source file with license headers (if there
> are any), as some +developers tend to accumulate license files without
> adjusting it to license +changes in their source.
> +
> +As in the example above, sometimes more than one license applies.
> +If different files in the package are under different licenses, use ``AND``
> (e.g. +``GPL-2.0-only AND LGPL-2.1``).
> +If it leaves the choice to modify/redistribute under one or the other
> +license, use ``OR``.
> +
> +.. note::
> +
> +   For each single license in the compound statement, include at least one
> file +   with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
> index 56912bb2e364..701c029591d8 100644
> --- a/doc/ref_make_variables.inc
> +++ b/doc/ref_make_variables.inc
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
> 
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> @@ -228,6 +230,7 @@ Package Definition
>    here. Use ``proprietary`` for proprietary packages and ``ignore`` for
>    packages without their own license, e.g. meta packages or packages that
>    only install files from ``projectroot/``.
> +  See the section :ref:`licensing_in_packages` for more information.
> 
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -239,6 +242,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
> e.g. if the license is in the header of a source file. For non ASCII or
> UTF-8 files the encoding can be specified with ``encoding=<enc>``. +  See
> the section :ref:`licensing_in_packages` for more information.
> 
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value

I read that whole patch text, and nothing really catched my eye. Sounds quite 
complete and is a better explanation on licensing issues than most bits and 
pieces on the web. I did not check thoroughly for spelling mistakes or even 
doc build errors, so:

Acked-by: Alexander Dahl <ada@thorsis.com>

Greets
Alex




_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
  2020-05-26 10:29   ` Roland Hieber
  2020-05-26 11:12   ` Alexander Dahl
@ 2020-05-29  6:23   ` Michael Olbrich
  2020-05-29  8:27     ` Roland Hieber
  2 siblings, 1 reply; 19+ messages in thread
From: Michael Olbrich @ 2020-05-29  6:23 UTC (permalink / raw)
  To: ptxdist

On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
>  doc/contributing.rst        |   5 +
>  doc/daily_work.inc          |   2 +
>  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
>  doc/ref_make_variables.inc  |   4 +
>  4 files changed, 219 insertions(+)
>  create mode 100644 doc/daily_work_licenses.inc
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index 705f01377d32..7352b46dfcf0 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
>  sections. These are meant as a helper to simplify creating custom stages.
>  Any remaining default stages must be removed.
>  
> +New packages should also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index a37aac4c3339..f68d25bf7cb5 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
>  newer.
>  If your kernel does not meet this requirement you can provide your own local
>  and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> new file mode 100644
> index 000000000000..7e90b7ba541d
> --- /dev/null
> +++ b/doc/daily_work_licenses.inc
> @@ -0,0 +1,208 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  if the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check some relevant-sounding files, like ``main.c`` for license headers.
> +  Often additional information can be found here.

This is too lax for me. Unless there is an explicit statement that all code
has the same license, all source files must be checked. Especially older
projects have a few files that where copied from somewhere else and have a
different license.

> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +On the other hand, there are some things that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Distillation down to license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> +
> +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
> +license content to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to detect license material in the project.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail adresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +If no license identifier matches, use ``unknown``.
> +If the project is considered open source or free software, you can
> +`report its license to be added to the SPDX license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.

I think I'd like to use something else here. Right now 'unknown' mostly
means "nobody looked at this yet". I want something else to say "I looked
at this and here it the license text but there is no SPDX identifier".
I'm considering the package name to make it unique. But that has the
downside, that it's not easily found.
Suggestions?

> +Conflicting statements
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in COPYING or the README says *"GNU General Public License"*,
> +  but the license text is in fact a BSD license, the correct license is the BSD
> +  license.
> +
> +Author Intent:
> +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
> +  license text, the correct licensing information is probably *"LGPL 2.1"* –
> +  the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> +  recent file prevails.
> +
> +  .. note::
> +
> +   Any of such cases is considered a bug and should be reported to the upstream maintainer!
> +
> +License versions, and GPL-vv-only or GPL-vv-or-later?
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> +The GPL text itself does not give information on that in its terms and
> +conditions.
> +Sometimes there is a notice at the top of the COPYING or the README file stating
> +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> +Otherwise: check headers in relevant files.
> +
> +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> +later"*, use that information for the whole project.
> +E.g.: no license information can be found except a ``COPYING`` which contains
> +a GPL-2.0 text → the license is GPL-2.0-only.
> +
> +Sometimes the best information available is statements like
> +*"this code is under GPL"* without any version information.
> +Such cases should be interpreted as the most liberal reading,
> +i.e. *GPL-1.0-or-later* (any possible GPL version).

I'm not sure this is good. I would say, when in doubt then be restrictive.
After all, this is about compliance. If we comply with the more restrictive
interpretation then we also comply with more liberal interpretations.

> +If multiple versions and variants can be found in the project, combine them with
> +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.

This should also mention `OR` for a license choice.

Michael

> +Public domain software
> +^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +When this is done, it is purely for historical reasons, and ``public_domain``
> +should normally not be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be:
> +
> +.. code-block:: terminal
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +If the copyright statement contains a string of years, leave those lines out for
> +the calculation of the checksum, as an added year does not change the license
> +(in fact, not even a single year is needed for the license to be valid),
> +but only makes package version updates more cumbersome.
> +
> +If additional information is in the ``README`` or license headers in source
> +files are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file do
> +not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definetly include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting it to license
> +changes in their source.
> +
> +As in the example above, sometimes more than one license applies.
> +If different files in the package are under different licenses, use ``AND`` (e.g.
> +``GPL-2.0-only AND LGPL-2.1``).
> +If it leaves the choice to modify/redistribute under one or the other
> +license, use ``OR``.
> +
> +.. note::
> +
> +   For each single license in the compound statement, include at least one file
> +   with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> diff --git a/doc/ref_make_variables.inc b/doc/ref_make_variables.inc
> index 56912bb2e364..701c029591d8 100644
> --- a/doc/ref_make_variables.inc
> +++ b/doc/ref_make_variables.inc
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -228,6 +230,7 @@ Package Definition
>    here. Use ``proprietary`` for proprietary packages and ``ignore`` for
>    packages without their own license, e.g. meta packages or packages that
>    only install files from ``projectroot/``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -239,6 +242,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-29  6:23   ` Michael Olbrich
@ 2020-05-29  8:27     ` Roland Hieber
  2020-05-29  8:55       ` Michael Olbrich
  0 siblings, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2020-05-29  8:27 UTC (permalink / raw)
  To: ptxdist

On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> > Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> > Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> > ---
> >  doc/contributing.rst        |   5 +
> >  doc/daily_work.inc          |   2 +
> >  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
> >  doc/ref_make_variables.inc  |   4 +
> >  4 files changed, 219 insertions(+)
> >  create mode 100644 doc/daily_work_licenses.inc
> > 
> > diff --git a/doc/contributing.rst b/doc/contributing.rst
> > index 705f01377d32..7352b46dfcf0 100644
> > --- a/doc/contributing.rst
> > +++ b/doc/contributing.rst
> > @@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
> >  sections. These are meant as a helper to simplify creating custom stages.
> >  Any remaining default stages must be removed.
> >  
> > +New packages should also have licensing information in the ``<PKG>_LICENSE``
> > +and ``<PKG>_LICENSE_FILES`` variables.
> > +Refer to the section :ref:`licensing_in_packages` for more information.
> > +
> > +
> >  Helper Scripts
> >  --------------
> >  
> > diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> > index a37aac4c3339..f68d25bf7cb5 100644
> > --- a/doc/daily_work.inc
> > +++ b/doc/daily_work.inc
> > @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
> >  newer.
> >  If your kernel does not meet this requirement you can provide your own local
> >  and adapted variant of the mentioned mount unit.
> > +
> > +.. include:: daily_work_licenses.inc
> > diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> > new file mode 100644
> > index 000000000000..7e90b7ba541d
> > --- /dev/null
> > +++ b/doc/daily_work_licenses.inc
> > @@ -0,0 +1,208 @@
> > +.. _licensing_in_packages:
> > +
> > +Tracking licensing information in packages
> > +------------------------------------------
> > +
> > +PTXdist aims to track licensing information for every package.
> > +This includes the license(s) under which a package can be distributed,
> > +as well as the respective files in the package's source tree that state those terms.
> > +Sadly there is no widely adopted standard for machine-readable licensing
> > +information in source code (`yet <https://reuse.software>`_),
> > +so here are a few hints where to look.
> > +
> > +There are many older package rules in PTXdist which don't specify licensing information.
> > +If you want to help complete the database,
> > +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> > +Note however that this cannot find wrong or incomplete licensing information.
> > +
> > +Finding licensing information
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +You should first select and extract the package in question, and then have a
> > +look at in the extracted package sources (usually something like
> > +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> > +``ptxdist package-info mypackage``).
> > +
> > +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> > +  These often only contain the license text and, in case of GPL, no information
> > +  if the code is available under the *-only* or *-or-later* variant.
> > +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> > +
> > +* Check the ``README``, if there is any.
> > +  Often there is important information there, e.g. in case of GPL if the
> > +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> > +
> > +* Check some relevant-sounding files, like ``main.c`` for license headers.
> > +  Often additional information can be found here.
> 
> This is too lax for me. Unless there is an explicit statement that all code
> has the same license, all source files must be checked. Especially older
> projects have a few files that where copied from somewhere else and have a
> different license.

Yes, right.

> > +
> > +* If you want to be extra sure, use a license compliance toolchain (e.g.
> > +  `FOSSology <https://www.fossology.org/>`__) on the project.
> > +
> > +On the other hand, there are some things that can be ignored for our purposes:
> > +
> > +* Everything that is auto-generated, either by a script in the project source,
> > +  or by the build system previous to packaging.
> > +  The generator itself cannot hold copyright, although the authors of the
> > +  templates used for the generation or the authors of the generator can.
> > +
> > +* Most files belonging to the build system don't make it into the compiled code
> > +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> > +  These cases sometimes can be hard to detect – if unsure, include the file in
> > +  your research.
> > +
> > +Distillation down to license identifiers
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> > +
> > +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
> > +license content to be sure), or you can use tools like
> > +`FOSSology <https://www.fossology.org>`__,
> > +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> > +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> > +to detect license material in the project.
> > +
> > +License texts don't have to match exactly, you should apply the
> > +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> > +accordingly.
> > +The important part here is that the project's license and the SPDX identifier
> > +describe the same licensing terms.
> > +"Rather close" or "mostly similar" statements are not enough for a match,
> > +but simple unimportant changes like replacing *"The Author"* with the project's
> > +maintainer's name, or a change in e-mail adresses, are usually okay.
> > +
> > +For software that is not open-source according to the `OSI definition
> > +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> > +
> > +If no license identifier matches, use ``unknown``.
> > +If the project is considered open source or free software, you can
> > +`report its license to be added to the SPDX license list
> > +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> 
> I think I'd like to use something else here. Right now 'unknown' mostly
> means "nobody looked at this yet". I want something else to say "I looked
> at this and here it the license text but there is no SPDX identifier".
> I'm considering the package name to make it unique. But that has the
> downside, that it's not easily found.
> Suggestions?

In the IRC channel, someone suggested using "custom" instead, since it is
apparently used by other distros. We could also use it as a prefix and
combine it with the package name, but I think this becomes a problem
when multiple "unknown" cases happen in one package.

> > +Conflicting statements
> > +^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +Human interpretation is needed when statements inside the project conflict with
> > +each other.
> > +Some clues that can help you decide:
> > +
> > +Detailedness:
> > +  If the header in COPYING or the README says *"GNU General Public License"*,
> > +  but the license text is in fact a BSD license, the correct license is the BSD
> > +  license.
> > +
> > +Author Intent:
> > +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
> > +  license text, the correct licensing information is probably *"LGPL 2.1"* –
> > +  the README written by the author prevails over the boilerplate text.
> > +
> > +Recency:
> > +  If README and COPYING are both clearly written by the author themselves, and
> > +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> > +  recent file prevails.
> > +
> > +  .. note::
> > +
> > +   Any of such cases is considered a bug and should be reported to the upstream maintainer!
> > +
> > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > +
> > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > +The GPL text itself does not give information on that in its terms and
> > +conditions.
> > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > +Otherwise: check headers in relevant files.
> > +
> > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > +later"*, use that information for the whole project.
> > +E.g.: no license information can be found except a ``COPYING`` which contains
> > +a GPL-2.0 text → the license is GPL-2.0-only.
> > +
> > +Sometimes the best information available is statements like
> > +*"this code is under GPL"* without any version information.
> > +Such cases should be interpreted as the most liberal reading,
> > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> 
> I'm not sure this is good. I would say, when in doubt then be restrictive.
> After all, this is about compliance. If we comply with the more restrictive
> interpretation then we also comply with more liberal interpretations.

What would being restrictive look like? We don't have any good pointers
as to what license to use here.

 - Roland

> > +If multiple versions and variants can be found in the project, combine them with
> > +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.
> 
> This should also mention `OR` for a license choice.
> 
> Michael

-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-29  8:27     ` Roland Hieber
@ 2020-05-29  8:55       ` Michael Olbrich
  2020-05-29  9:40         ` Roland Hieber
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Olbrich @ 2020-05-29  8:55 UTC (permalink / raw)
  To: ptxdist

On Fri, May 29, 2020 at 10:27:04AM +0200, Roland Hieber wrote:
> On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> > On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > > Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> > > Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> > > Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> > > ---
> > >  doc/contributing.rst        |   5 +
> > >  doc/daily_work.inc          |   2 +
> > >  doc/daily_work_licenses.inc | 208 ++++++++++++++++++++++++++++++++++++
> > >  doc/ref_make_variables.inc  |   4 +
> > >  4 files changed, 219 insertions(+)
> > >  create mode 100644 doc/daily_work_licenses.inc
> > > 
> > > diff --git a/doc/contributing.rst b/doc/contributing.rst
> > > index 705f01377d32..7352b46dfcf0 100644
> > > --- a/doc/contributing.rst
> > > +++ b/doc/contributing.rst
> > > @@ -90,6 +90,11 @@ For new packages, the generated templates contain commented-out default
> > >  sections. These are meant as a helper to simplify creating custom stages.
> > >  Any remaining default stages must be removed.
> > >  
> > > +New packages should also have licensing information in the ``<PKG>_LICENSE``
> > > +and ``<PKG>_LICENSE_FILES`` variables.
> > > +Refer to the section :ref:`licensing_in_packages` for more information.
> > > +
> > > +
> > >  Helper Scripts
> > >  --------------
> > >  
> > > diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> > > index a37aac4c3339..f68d25bf7cb5 100644
> > > --- a/doc/daily_work.inc
> > > +++ b/doc/daily_work.inc
> > > @@ -1472,3 +1472,5 @@ be enabled. A used mount option of the overlayfs in the default
> > >  newer.
> > >  If your kernel does not meet this requirement you can provide your own local
> > >  and adapted variant of the mentioned mount unit.
> > > +
> > > +.. include:: daily_work_licenses.inc
> > > diff --git a/doc/daily_work_licenses.inc b/doc/daily_work_licenses.inc
> > > new file mode 100644
> > > index 000000000000..7e90b7ba541d
> > > --- /dev/null
> > > +++ b/doc/daily_work_licenses.inc
> > > @@ -0,0 +1,208 @@
> > > +.. _licensing_in_packages:
> > > +
> > > +Tracking licensing information in packages
> > > +------------------------------------------
> > > +
> > > +PTXdist aims to track licensing information for every package.
> > > +This includes the license(s) under which a package can be distributed,
> > > +as well as the respective files in the package's source tree that state those terms.
> > > +Sadly there is no widely adopted standard for machine-readable licensing
> > > +information in source code (`yet <https://reuse.software>`_),
> > > +so here are a few hints where to look.
> > > +
> > > +There are many older package rules in PTXdist which don't specify licensing information.
> > > +If you want to help complete the database,
> > > +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> > > +Note however that this cannot find wrong or incomplete licensing information.
> > > +
> > > +Finding licensing information
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +You should first select and extract the package in question, and then have a
> > > +look at in the extracted package sources (usually something like
> > > +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> > > +``ptxdist package-info mypackage``).
> > > +
> > > +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> > > +  These often only contain the license text and, in case of GPL, no information
> > > +  if the code is available under the *-only* or *-or-later* variant.
> > > +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> > > +
> > > +* Check the ``README``, if there is any.
> > > +  Often there is important information there, e.g. in case of GPL if the
> > > +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> > > +
> > > +* Check some relevant-sounding files, like ``main.c`` for license headers.
> > > +  Often additional information can be found here.
> > 
> > This is too lax for me. Unless there is an explicit statement that all code
> > has the same license, all source files must be checked. Especially older
> > projects have a few files that where copied from somewhere else and have a
> > different license.
> 
> Yes, right.
> 
> > > +
> > > +* If you want to be extra sure, use a license compliance toolchain (e.g.
> > > +  `FOSSology <https://www.fossology.org/>`__) on the project.
> > > +
> > > +On the other hand, there are some things that can be ignored for our purposes:
> > > +
> > > +* Everything that is auto-generated, either by a script in the project source,
> > > +  or by the build system previous to packaging.
> > > +  The generator itself cannot hold copyright, although the authors of the
> > > +  templates used for the generation or the authors of the generator can.
> > > +
> > > +* Most files belonging to the build system don't make it into the compiled code
> > > +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> > > +  These cases sometimes can be hard to detect – if unsure, include the file in
> > > +  your research.
> > > +
> > > +Distillation down to license identifiers
> > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > +
> > > +We use the `SPDX license identifiers <https://spdx.org/licenses/>`_.
> > > +
> > > +Either the license is clear, e.g. because it says "GPL 2.0" (roughly check the
> > > +license content to be sure), or you can use tools like
> > > +`FOSSology <https://www.fossology.org>`__,
> > > +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> > > +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> > > +to detect license material in the project.
> > > +
> > > +License texts don't have to match exactly, you should apply the
> > > +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> > > +accordingly.
> > > +The important part here is that the project's license and the SPDX identifier
> > > +describe the same licensing terms.
> > > +"Rather close" or "mostly similar" statements are not enough for a match,
> > > +but simple unimportant changes like replacing *"The Author"* with the project's
> > > +maintainer's name, or a change in e-mail adresses, are usually okay.
> > > +
> > > +For software that is not open-source according to the `OSI definition
> > > +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> > > +
> > > +If no license identifier matches, use ``unknown``.
> > > +If the project is considered open source or free software, you can
> > > +`report its license to be added to the SPDX license list
> > > +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> > 
> > I think I'd like to use something else here. Right now 'unknown' mostly
> > means "nobody looked at this yet". I want something else to say "I looked
> > at this and here it the license text but there is no SPDX identifier".
> > I'm considering the package name to make it unique. But that has the
> > downside, that it's not easily found.
> > Suggestions?
> 
> In the IRC channel, someone suggested using "custom" instead, since it is
> apparently used by other distros. We could also use it as a prefix and
> combine it with the package name, but I think this becomes a problem
> when multiple "unknown" cases happen in one package.

custom is a good idea.

> > > +Conflicting statements
> > > +^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +Human interpretation is needed when statements inside the project conflict with
> > > +each other.
> > > +Some clues that can help you decide:
> > > +
> > > +Detailedness:
> > > +  If the header in COPYING or the README says *"GNU General Public License"*,
> > > +  but the license text is in fact a BSD license, the correct license is the BSD
> > > +  license.
> > > +
> > > +Author Intent:
> > > +  If the README says *"this is LGPL 2.1"*, but COPYING contains a GPL boilerplate
> > > +  license text, the correct licensing information is probably *"LGPL 2.1"* –
> > > +  the README written by the author prevails over the boilerplate text.
> > > +
> > > +Recency:
> > > +  If README and COPYING are both clearly written by the author themselves, and
> > > +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> > > +  recent file prevails.
> > > +
> > > +  .. note::
> > > +
> > > +   Any of such cases is considered a bug and should be reported to the upstream maintainer!
> > > +
> > > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > +
> > > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > > +The GPL text itself does not give information on that in its terms and
> > > +conditions.
> > > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > > +Otherwise: check headers in relevant files.
> > > +
> > > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > > +later"*, use that information for the whole project.
> > > +E.g.: no license information can be found except a ``COPYING`` which contains
> > > +a GPL-2.0 text → the license is GPL-2.0-only.
> > > +
> > > +Sometimes the best information available is statements like
> > > +*"this code is under GPL"* without any version information.
> > > +Such cases should be interpreted as the most liberal reading,
> > > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> > 
> > I'm not sure this is good. I would say, when in doubt then be restrictive.
> > After all, this is about compliance. If we comply with the more restrictive
> > interpretation then we also comply with more liberal interpretations.
> 
> What would being restrictive look like? We don't have any good pointers
> as to what license to use here.

Use the version from the license text. Or are you saying there is no
license text as well? I'm not sure if that's even distributable. Do you
have an example for this?

Michael

> > > +If multiple versions and variants can be found in the project, combine them with
> > > +``AND``, e.g.: ``GPL-2.0-only AND GPL-2.0-or-later`` in the license identifier.
> > 
> > This should also mention `OR` for a license choice.
> > 
> > Michael
> 
> -- 
> Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
> Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
> 31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de
> To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-29  8:55       ` Michael Olbrich
@ 2020-05-29  9:40         ` Roland Hieber
  2020-05-29 12:03           ` Michael Olbrich
  0 siblings, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2020-05-29  9:40 UTC (permalink / raw)
  To: ptxdist

On Fri, May 29, 2020 at 10:55:57AM +0200, Michael Olbrich wrote:
> On Fri, May 29, 2020 at 10:27:04AM +0200, Roland Hieber wrote:
> > On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> > > On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > > > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > +
> > > > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > > > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > > > +The GPL text itself does not give information on that in its terms and
> > > > +conditions.
> > > > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > > > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > > > +Otherwise: check headers in relevant files.
> > > > +
> > > > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > > > +later"*, use that information for the whole project.
> > > > +E.g.: no license information can be found except a ``COPYING`` which contains
> > > > +a GPL-2.0 text → the license is GPL-2.0-only.
> > > > +
> > > > +Sometimes the best information available is statements like
> > > > +*"this code is under GPL"* without any version information.
> > > > +Such cases should be interpreted as the most liberal reading,
> > > > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> > > 
> > > I'm not sure this is good. I would say, when in doubt then be restrictive.
> > > After all, this is about compliance. If we comply with the more restrictive
> > > interpretation then we also comply with more liberal interpretations.
> > 
> > What would being restrictive look like? We don't have any good pointers
> > as to what license to use here.
> 
> Use the version from the license text. Or are you saying there is no
> license text as well? I'm not sure if that's even distributable. Do you
> have an example for this?

https://git.pengutronix.de/cgit/rhi/ptxdist/commit/?h=6dc705e869353f24d3cd1be7698afcd119e8da95

 - Roland


-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-29  9:40         ` Roland Hieber
@ 2020-05-29 12:03           ` Michael Olbrich
  2020-05-31 19:56             ` Roland Hieber
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Olbrich @ 2020-05-29 12:03 UTC (permalink / raw)
  To: ptxdist

On Fri, May 29, 2020 at 11:40:49AM +0200, Roland Hieber wrote:
> On Fri, May 29, 2020 at 10:55:57AM +0200, Michael Olbrich wrote:
> > On Fri, May 29, 2020 at 10:27:04AM +0200, Roland Hieber wrote:
> > > On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> > > > On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > > > > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > +
> > > > > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > > > > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > > > > +The GPL text itself does not give information on that in its terms and
> > > > > +conditions.
> > > > > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > > > > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > > > > +Otherwise: check headers in relevant files.
> > > > > +
> > > > > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > > > > +later"*, use that information for the whole project.
> > > > > +E.g.: no license information can be found except a ``COPYING`` which contains
> > > > > +a GPL-2.0 text → the license is GPL-2.0-only.
> > > > > +
> > > > > +Sometimes the best information available is statements like
> > > > > +*"this code is under GPL"* without any version information.
> > > > > +Such cases should be interpreted as the most liberal reading,
> > > > > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> > > > 
> > > > I'm not sure this is good. I would say, when in doubt then be restrictive.
> > > > After all, this is about compliance. If we comply with the more restrictive
> > > > interpretation then we also comply with more liberal interpretations.
> > > 
> > > What would being restrictive look like? We don't have any good pointers
> > > as to what license to use here.
> > 
> > Use the version from the license text. Or are you saying there is no
> > license text as well? I'm not sure if that's even distributable. Do you
> > have an example for this?
> 
> https://git.pengutronix.de/cgit/rhi/ptxdist/commit/?h=6dc705e869353f24d3cd1be7698afcd119e8da95

"The whole package, starting with version 1.1.22, is distributed under·
the GNU GPL license, found in the accompanying file 'COPYING'."

And the COPYING file does not exist... But check the latest version: The
COPYING exist there and it's GPL-2.0 and with the wording above I'd say
GPL-2.0-only.

Michael

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-29 12:03           ` Michael Olbrich
@ 2020-05-31 19:56             ` Roland Hieber
  2020-06-02 13:16               ` Michael Olbrich
  0 siblings, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2020-05-31 19:56 UTC (permalink / raw)
  To: ptxdist

On Fri, May 29, 2020 at 02:03:00PM +0200, Michael Olbrich wrote:
> On Fri, May 29, 2020 at 11:40:49AM +0200, Roland Hieber wrote:
> > On Fri, May 29, 2020 at 10:55:57AM +0200, Michael Olbrich wrote:
> > > On Fri, May 29, 2020 at 10:27:04AM +0200, Roland Hieber wrote:
> > > > On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> > > > > On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > > > > > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > > > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > +
> > > > > > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > > > > > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > > > > > +The GPL text itself does not give information on that in its terms and
> > > > > > +conditions.
> > > > > > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > > > > > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > > > > > +Otherwise: check headers in relevant files.
> > > > > > +
> > > > > > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > > > > > +later"*, use that information for the whole project.
> > > > > > +E.g.: no license information can be found except a ``COPYING`` which contains
> > > > > > +a GPL-2.0 text → the license is GPL-2.0-only.
> > > > > > +
> > > > > > +Sometimes the best information available is statements like
> > > > > > +*"this code is under GPL"* without any version information.
> > > > > > +Such cases should be interpreted as the most liberal reading,
> > > > > > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> > > > > 
> > > > > I'm not sure this is good. I would say, when in doubt then be restrictive.
> > > > > After all, this is about compliance. If we comply with the more restrictive
> > > > > interpretation then we also comply with more liberal interpretations.
> > > > 
> > > > What would being restrictive look like? We don't have any good pointers
> > > > as to what license to use here.
> > > 
> > > Use the version from the license text. Or are you saying there is no
> > > license text as well? I'm not sure if that's even distributable. Do you
> > > have an example for this?
> > 
> > https://git.pengutronix.de/cgit/rhi/ptxdist/commit/?h=6dc705e869353f24d3cd1be7698afcd119e8da95
> 
> "The whole package, starting with version 1.1.22, is distributed under·
> the GNU GPL license, found in the accompanying file 'COPYING'."
> 
> And the COPYING file does not exist... But check the latest version: The
> COPYING exist there and it's GPL-2.0 and with the wording above I'd say
> GPL-2.0-only.

Given that this release is from 2007, I'd rather file another staging
patch than do a version bump...

But what does that mean for the documentation patch above? Of course it
very much concerns only edge cases, but then, edge cases is what that
section is really about ^^ So I think only the wording needs to be
adapted to clarify that it applies only when not even a GPL text is
available. Or what was the reason in your view that the package was not
even distributable?

 - Roland

-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-05-31 19:56             ` Roland Hieber
@ 2020-06-02 13:16               ` Michael Olbrich
  2020-06-02 15:14                 ` Roland Hieber
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Olbrich @ 2020-06-02 13:16 UTC (permalink / raw)
  To: ptxdist

On Sun, May 31, 2020 at 09:56:15PM +0200, Roland Hieber wrote:
> On Fri, May 29, 2020 at 02:03:00PM +0200, Michael Olbrich wrote:
> > On Fri, May 29, 2020 at 11:40:49AM +0200, Roland Hieber wrote:
> > > On Fri, May 29, 2020 at 10:55:57AM +0200, Michael Olbrich wrote:
> > > > On Fri, May 29, 2020 at 10:27:04AM +0200, Roland Hieber wrote:
> > > > > On Fri, May 29, 2020 at 08:23:46AM +0200, Michael Olbrich wrote:
> > > > > > On Mon, May 11, 2020 at 12:03:06PM +0200, Roland Hieber wrote:
> > > > > > > +License versions, and GPL-vv-only or GPL-vv-or-later?
> > > > > > > +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > > > > +
> > > > > > > +If the ``COPYING`` file is a GPL text, it is still uncertain if the correct
> > > > > > > +license identifier is *GPL-vv-only* or *GPL-vv-or-later*.
> > > > > > > +The GPL text itself does not give information on that in its terms and
> > > > > > > +conditions.
> > > > > > > +Sometimes there is a notice at the top of the COPYING or the README file stating
> > > > > > > +whether *"-only"* or *"-or-later"* applies – this is the easy case.
> > > > > > > +Otherwise: check headers in relevant files.
> > > > > > > +
> > > > > > > +If no license information can be found, but one file mentions e.g. *"GPL-vv or
> > > > > > > +later"*, use that information for the whole project.
> > > > > > > +E.g.: no license information can be found except a ``COPYING`` which contains
> > > > > > > +a GPL-2.0 text → the license is GPL-2.0-only.
> > > > > > > +
> > > > > > > +Sometimes the best information available is statements like
> > > > > > > +*"this code is under GPL"* without any version information.
> > > > > > > +Such cases should be interpreted as the most liberal reading,
> > > > > > > +i.e. *GPL-1.0-or-later* (any possible GPL version).
> > > > > > 
> > > > > > I'm not sure this is good. I would say, when in doubt then be restrictive.
> > > > > > After all, this is about compliance. If we comply with the more restrictive
> > > > > > interpretation then we also comply with more liberal interpretations.
> > > > > 
> > > > > What would being restrictive look like? We don't have any good pointers
> > > > > as to what license to use here.
> > > > 
> > > > Use the version from the license text. Or are you saying there is no
> > > > license text as well? I'm not sure if that's even distributable. Do you
> > > > have an example for this?
> > > 
> > > https://git.pengutronix.de/cgit/rhi/ptxdist/commit/?h=6dc705e869353f24d3cd1be7698afcd119e8da95
> > 
> > "The whole package, starting with version 1.1.22, is distributed under·
> > the GNU GPL license, found in the accompanying file 'COPYING'."
> > 
> > And the COPYING file does not exist... But check the latest version: The
> > COPYING exist there and it's GPL-2.0 and with the wording above I'd say
> > GPL-2.0-only.
> 
> Given that this release is from 2007, I'd rather file another staging
> patch than do a version bump...

ok, can you rebase your branch to remove corresponding the license patch?

> But what does that mean for the documentation patch above? Of course it
> very much concerns only edge cases, but then, edge cases is what that
> section is really about ^^ So I think only the wording needs to be
> adapted to clarify that it applies only when not even a GPL text is
> available. Or what was the reason in your view that the package was not
> even distributable?

For the documentation, the main thing is "don't assume". If it's really
unclear, then just use 'custom' and add all relevant texts to the license
files. Then whoever creates a BSP must decide for themself if the license
is acceptable.
The license identifier is just a hint. And it must be the correct one not
just whatever is closest. Otherwise it's useless.

Michael

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH 2/2] doc: working with licensing information in packages
  2020-06-02 13:16               ` Michael Olbrich
@ 2020-06-02 15:14                 ` Roland Hieber
  0 siblings, 0 replies; 19+ messages in thread
From: Roland Hieber @ 2020-06-02 15:14 UTC (permalink / raw)
  To: ptxdist

On Tue, Jun 02, 2020 at 03:16:34PM +0200, Michael Olbrich wrote:
> On Sun, May 31, 2020 at 09:56:15PM +0200, Roland Hieber wrote:
> > On Fri, May 29, 2020 at 02:03:00PM +0200, Michael Olbrich wrote:
> > > On Fri, May 29, 2020 at 11:40:49AM +0200, Roland Hieber wrote:
> > > > https://git.pengutronix.de/cgit/rhi/ptxdist/commit/?h=6dc705e869353f24d3cd1be7698afcd119e8da95
> > > 
> > > "The whole package, starting with version 1.1.22, is distributed under·
> > > the GNU GPL license, found in the accompanying file 'COPYING'."
> > > 
> > > And the COPYING file does not exist... But check the latest version: The
> > > COPYING exist there and it's GPL-2.0 and with the wording above I'd say
> > > GPL-2.0-only.
> > 
> > Given that this release is from 2007, I'd rather file another staging
> > patch than do a version bump...
> 
> ok, can you rebase your branch to remove corresponding the license patch?

Yes, I'll send a PULL v2.

> > But what does that mean for the documentation patch above? Of course it
> > very much concerns only edge cases, but then, edge cases is what that
> > section is really about ^^ So I think only the wording needs to be
> > adapted to clarify that it applies only when not even a GPL text is
> > available. Or what was the reason in your view that the package was not
> > even distributable?
> 
> For the documentation, the main thing is "don't assume". If it's really
> unclear, then just use 'custom' and add all relevant texts to the license
> files. Then whoever creates a BSP must decide for themself if the license
> is acceptable.
> The license identifier is just a hint. And it must be the correct one not
> just whatever is closest. Otherwise it's useless.

Right, that sounds reasonable.

Also I found old documentation in the dev_manual.rst, which I'll
integrate into this patch. It also fits better in the developer manual
instead of the daily use chapter.

 - Roland

-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ptxdist] [PATCH] doc: working with licensing information in packages
  2020-05-11 10:03 [ptxdist] [PATCH 1/2] doc: ref_make_variables: link to the SPDX license list Roland Hieber
  2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
@ 2021-06-08 10:36 ` Roland Hieber
  2021-06-16 14:19   ` Michael Olbrich
  1 sibling, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2021-06-08 10:36 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber, Felicitas Jung

Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
---
v1 -> v2:
 - rebase to current master
 - squash PATCH 1/2 ("link to the SPDX license list")
 - move from daily use into dev manual chapter
 - expand and rewrite some parts completely
 - absorb old content in doc/dev_add_new_pkgs.rst
 - address feedback from Michael Olbrich:
   - check all source files instead of "some relevant-sounding files"
   - introduce "custom" and "custom-exception" identifiers instead of
     "unknown"
   - be restrictive and err on the side of caution when interpreting
     ambiguities
   - shortly mention the AND, OR and bracket syntaxes

PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
---
 doc/contributing.rst       |   4 +
 doc/daily_work.inc         |   2 +
 doc/dev_add_new_pkgs.rst   |  46 +------
 doc/dev_licenses.rst       | 243 +++++++++++++++++++++++++++++++++++++
 doc/dev_manual.rst         |   1 +
 doc/ref_make_variables.rst |  20 ++-
 6 files changed, 267 insertions(+), 49 deletions(-)
 create mode 100644 doc/dev_licenses.rst

diff --git a/doc/contributing.rst b/doc/contributing.rst
index bdaddee245a9..496998c913f7 100644
--- a/doc/contributing.rst
+++ b/doc/contributing.rst
@@ -103,6 +103,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
 macros used in menu files. There are often typos or the variables was just
 removed.
 
+New packages must also have licensing information in the ``<PKG>_LICENSE``
+and ``<PKG>_LICENSE_FILES`` variables.
+Refer to the section :ref:`licensing_in_packages` for more information.
+
 Helper Scripts
 --------------
 
diff --git a/doc/daily_work.inc b/doc/daily_work.inc
index 8fe7739aa0c8..ab901a54ee60 100644
--- a/doc/daily_work.inc
+++ b/doc/daily_work.inc
@@ -1480,3 +1480,5 @@ be enabled. A used mount option of the overlayfs in the default
 newer.
 If your kernel does not meet this requirement you can provide your own local
 and adapted variant of the mentioned mount unit.
+
+.. include:: daily_work_licenses.inc
diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
index 4ae2765c2ce9..a9e8fcf236c4 100644
--- a/doc/dev_add_new_pkgs.rst
+++ b/doc/dev_add_new_pkgs.rst
@@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
 
 -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
    using in her/his project (licenses of the enabled packages).
+   See :ref:`licensing_in_packages` below for detailed information.
 
 After enabling the menu entry, we can start to check the *get* and
 *extract* stages, calling them manually one after another.
@@ -604,51 +605,6 @@ This will re-start with a **clean** BSP and builds exactly the new package and
 its (known) dependencies. If this builds successfully as well we are really done
 with the new package.
 
-Some Notes about Licenses
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
-example) is very important and must be filled by the developer of the package.
-Many licenses bring in obligations using the corresponding package (*attribution*
-for example). To make life easier for everybody the license for a package must
-be provided. *SPDX* license identifiers unify the license names and are used
-in PTXdist to identify license types and obligations.
-
-If a package comes with more than one license, all of their SPDX identifiers
-must be listed and connected with the keyword ``AND``. If your package comes
-with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
-
-One specific obligation cannot be detected examining the SPDX license identifiers
-by PTXdist: *the license choice*. In this case all licenses of choice must be
-listed and connected by the keyword ``OR``.
-
-If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
-*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 OR GPL-3.0
-
-SPDX License Identifiers
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-A list of SPDX license identifiers can be found here:
-
-   https://spdx.org/licenses/
-
-Help to Detect the Correct License
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-License identification isn't trivial. A help in doing so can be the following
-repository and its content. It contains a list of known licenses based on their
-SPDX identifier. The content is without formatting to simplify text search.
-
-   https://github.com/spdx/license-list-data/tree/master/text
-
 Advanced Rule Files
 ~~~~~~~~~~~~~~~~~~~
 
diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
new file mode 100644
index 000000000000..06b4decd7728
--- /dev/null
+++ b/doc/dev_licenses.rst
@@ -0,0 +1,243 @@
+.. _licensing_in_packages:
+
+Tracking licensing information in packages
+------------------------------------------
+
+PTXdist aims to track licensing information for every package.
+This includes the license(s) under which a package can be distributed,
+as well as the respective files in the package's source tree that state those terms.
+Sadly there is no widely adopted standard for machine-readable licensing
+information in source code (`yet <https://reuse.software>`_),
+so here are a few hints where to look.
+
+In that process, we aim to collect the baseline set of licenses
+which at least apply to a package.
+There may be other licenses which apply too, but the complete set often cannot
+be found without a time-consuming review.
+Still, the extracted license information in PTXdist can serve as a hint for
+the full license compliance process,
+and can help to exclude certain software under certain licenses from the build.
+
+There are many older package rules in PTXdist which don't specify licensing information.
+If you want to help complete the database,
+you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
+Note however that this cannot find wrong or incomplete licensing information.
+
+Finding licensing information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You should first select and extract the package in question, and then have a
+look at in the extracted package sources (usually something like
+``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
+``ptxdist package-info mypackage``).
+
+* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
+  These often only contain the license text and, in case of GPL, no information
+  if the code is available under the *-only* or *-or-later* variant.
+  Sometimes these files are in a folder ``/doc`` or ``/legal``.
+
+* Check the ``README``, if there is any.
+  Often there is important information there, e.g. in case of GPL if the
+  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
+
+* Check source files, like ``*.c`` for license headers.
+  Often additional information can be found here.
+
+* If you want to be extra sure, use a license compliance toolchain (e.g.
+  `FOSSology <https://www.fossology.org/>`__) on the project.
+
+Ideally you'll find two pieces of information:
+
+* A *license text* (e.g. a GNU General Public License v2.0 text)
+* A *license statement* that states that a certain license applies to (parts of) the project
+  (often also including copyright statements and a warranty disclaimer)
+
+Some licenses (e.g. BSD-style licenses) are also short enough so that both
+pieces are combined in a short comment header in a source file or a README.
+Strictly speaking, both the license text and the license statement must be
+present for a complete, unambiguous license, but see the next section about
+edge cases.
+
+On the other hand, there are some parts that can be ignored for our purposes:
+
+* Everything that is auto-generated, either by a script in the project source,
+  or by the build system previous to packaging.
+  The generator itself cannot hold copyright, although the authors of the
+  templates used for the generation or the authors of the generator can.
+
+* Most files belonging to the build system don't make it into the compiled code
+  and can therefore be ignored (e.g. configure scripts, Makefiles).
+  These cases sometimes can be hard to detect – if unsure, include the file in
+  your research.
+
+Some projects also include a COPYING.LIB containing an LGPL text, which is
+referenced nowhere in the project.
+In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
+project skeleton and the maintainer forgot to delete it.
+
+Distillation into license identifiers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
+
+Either the license identifier is clear, e.g. because the README says "GPL 2.0
+or later" (check the license text to be sure), or you can use tools like
+`FOSSology <https://www.fossology.org>`__,
+`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
+or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
+to match texts to SPDX license identifiers.
+
+License texts don't have to match exactly, you should apply the
+`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
+accordingly.
+The important part here is that the project's license and the SPDX identifier
+describe the same licensing terms.
+"Rather close" or "mostly similar" statements are not enough for a match,
+but simple unimportant changes like replacing *"The Author"* with the project's
+maintainer's name, or a change in e-mail adresses, are usually okay.
+
+For software that is not open-source according to the `OSI definition
+<https://opensource.org/osd>`_, use the identifier ``proprietary``.
+
+.. important::
+
+   If no license identifier matches, or if anything is unclear about the
+   licensing situation, use the identifier ``custom`` (for licenses)
+   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
+   custom-exception``).
+
+If SPDX doesn't know about a license yet, and the project is considered open
+source or free software, you can `report its license to be added to the SPDX
+license list
+<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
+
+Multiple licenses
+^^^^^^^^^^^^^^^^^
+
+Open-source software is re-used all the time, so it can happen that some files
+make their way into a different project.
+This is usually no problem.
+If you encounter multiple parts of the project under different licenses, combine
+their license expressions with ``AND``.
+For example, in a project that contains both a library and command line tools,
+the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
+
+Sometimes files are licensed under multiple licenses, and only one license is to
+be selected.
+In that case, combine the license expressions with ``OR``.
+This is often the case with Device Trees in the Linux kernel, e.g.:
+``GPL-2.0-only OR BSD-2-Clause``.
+
+No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
+
+Conflicting and ambiguous statements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Human interpretation is needed when statements inside the project conflict with
+each other.
+Some clues that can help you decide:
+
+Detailedness:
+  If the header in the COPYING file says *"GNU General Public License"*, but
+  the license text below that is in fact a BSD license, the correct license for
+  the license identifier is the BSD license.
+
+Author Intent:
+  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
+  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
+  – the README written by the author prevails over the boilerplate text.
+
+Recency:
+  If README and COPYING are both clearly written by the author themselves, and
+  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
+  recent file prevails.
+
+Scope:
+  If no license statement can be found, but there is a COPYING file containing
+  a license text, infer that the whole project is licensed under that license.
+
+Err on the side of caution:
+  If all you can find is a GPL license text, this doesn't yet tell you whether
+  the project is licensed under the *-only* or the *-or-later* variant.
+  In that case, interpret the license restrictively and choose the *-only*
+  variant for the license identifier.
+
+Don't assume:
+  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
+
+.. note::
+
+   Any of these cases is considered a bug and should be reported to the upstream maintainers!
+
+"Public Domain" software
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
+SPDX doesn't supply a license identifier for "Public Domain".
+Nevertheless, some PTXdist package rules specify ``public_domain`` as their
+respective license identifier.
+This is purely for historical reasons, and ``public_domain`` should normally
+*not* be used for new packages.
+Some of those "Public Domain" dedications in packages have since been accepted
+in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
+`SQLite <https://spdx.org/licenses/blessing.html>`_.
+
+No license information at all
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+No license - no usage rights!
+
+Definitely report this bug to the upstream maintainer.
+Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
+
+Adding license files to PTXdist package rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
+variable in the respective package rule file.
+All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
+including a checksum so that PTXdist complains when they change.
+
+Example:
+
+.. code-block:: make
+   :caption: ddrescue.make
+
+   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
+   DDRESCUE_LICENSE_FILES	:= \
+           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
+           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
+           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
+
+See the section :ref:`package_specific_variables` for more information about
+the syntax of those two variables.
+
+The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
+command applied to a range of lines.
+For the example above, lines 1 to 16 of main.cc would be::
+
+   $ sed -n 1,16p main.cc | md5sum -
+   a01d61d3293ce28b883d8ba0c497e968
+
+If the copyright statement contains a string of years, leave those lines out for
+the calculation of the checksum, as an added year does not change the license
+(in fact, not even a single year is needed for the license to be valid),
+but only makes package version updates more cumbersome.
+
+If additional information is in the README or license headers in source files
+are used, also include these files (for source code: one of each is enough),
+but use md5sum only on the relevant lines, so changes in the rest of the file
+do not appear as license changes.
+
+For rather chaotic directories with lots of license files, definitely include at
+least one relevant source file with license headers (if there are any), as some
+developers tend to accumulate license files without adjusting it to license
+changes in their source.
+
+.. note::
+
+   For each single license identifier in the license expression, include at
+   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
+
+PTXdist will include all files (or their respective lines) that were referenced
+in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
index c232cc91428a..0a1eaf8a1413 100644
--- a/doc/dev_manual.rst
+++ b/doc/dev_manual.rst
@@ -13,6 +13,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
    dev_add_new_pkgs
    dev_add_bin_only_files
    dev_create_new_pkg_templates
+   dev_licenses
    dev_layers_in_ptxdist
    dev_kconfig_diffs
    dev_code_signing
diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
index 674acdcea982..2ee34856dd02 100644
--- a/doc/ref_make_variables.rst
+++ b/doc/ref_make_variables.rst
@@ -127,6 +127,8 @@ Other useful variables:
   that are built and installed during the PTXdist build run.
   There are analogous ``-y`` and ``-m`` variants of those variables too.
 
+.. _package_specific_variables:
+
 Package Specific Variables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -223,10 +225,19 @@ Package Definition
   'gdbserver' for an example.
 
 ``<PKG>_LICENSE``
-  The license of the package. The SPDX license identifiers should be used
-  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
-  packages without their own license, e.g. meta packages or packages that
-  only install files from ``projectroot/``.
+  The license of the package in the form of an `SPDX license expression
+  <https://spdx.org/licenses/>`_.
+  The following values have special meaning for PTXdist:
+
+  - ``custom`` and ``custom-exception``: for licenses or license exceptions
+    that are considered free software, but do not match any license or license
+    exception known to SPDX.
+  - ``proprietary``: for proprietary (non-free) packages
+  - ``ignore`` for packages without their own license, e.g. meta packages or
+    packages that only install files from ``projectroot/``
+  - ``unknown``: no licensing information was extracted yet
+
+  See the section :ref:`licensing_in_packages` for more information.
 
 ``<PKG>_LICENSE_FILES``
   A space separated list of URLs of license text files. The URLs must be
@@ -238,6 +249,7 @@ Package Definition
   used in case the specified file contains more than just the license text,
   e.g. if the license is in the header of a source file. For non ASCII or
   UTF-8 files the encoding can be specified with ``encoding=<enc>``.
+  See the section :ref:`licensing_in_packages` for more information.
 
 For most packages the variables described above are undefined by default.
 However, for cross and host packages these variables default to the value
-- 
2.29.2


_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH] doc: working with licensing information in packages
  2021-06-08 10:36 ` [ptxdist] [PATCH] " Roland Hieber
@ 2021-06-16 14:19   ` Michael Olbrich
  2021-06-16 14:40     ` Roland Hieber
  2021-08-05  9:18     ` [ptxdist] [PATCH v3] " Roland Hieber
  0 siblings, 2 replies; 19+ messages in thread
From: Michael Olbrich @ 2021-06-16 14:19 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber, Felicitas Jung

On Tue, Jun 08, 2021 at 12:36:40PM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
> v1 -> v2:
>  - rebase to current master
>  - squash PATCH 1/2 ("link to the SPDX license list")
>  - move from daily use into dev manual chapter
>  - expand and rewrite some parts completely
>  - absorb old content in doc/dev_add_new_pkgs.rst
>  - address feedback from Michael Olbrich:
>    - check all source files instead of "some relevant-sounding files"
>    - introduce "custom" and "custom-exception" identifiers instead of
>      "unknown"
>    - be restrictive and err on the side of caution when interpreting
>      ambiguities
>    - shortly mention the AND, OR and bracket syntaxes
> 
> PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
> ---
>  doc/contributing.rst       |   4 +
>  doc/daily_work.inc         |   2 +
>  doc/dev_add_new_pkgs.rst   |  46 +------
>  doc/dev_licenses.rst       | 243 +++++++++++++++++++++++++++++++++++++
>  doc/dev_manual.rst         |   1 +
>  doc/ref_make_variables.rst |  20 ++-
>  6 files changed, 267 insertions(+), 49 deletions(-)
>  create mode 100644 doc/dev_licenses.rst
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index bdaddee245a9..496998c913f7 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -103,6 +103,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
>  macros used in menu files. There are often typos or the variables was just
>  removed.
>  
> +New packages must also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index 8fe7739aa0c8..ab901a54ee60 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1480,3 +1480,5 @@ be enabled. A used mount option of the overlayfs in the default
>  newer.
>  If your kernel does not meet this requirement you can provide your own local
>  and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc
> diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
> index 4ae2765c2ce9..a9e8fcf236c4 100644
> --- a/doc/dev_add_new_pkgs.rst
> +++ b/doc/dev_add_new_pkgs.rst
> @@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
>  
>  -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
>     using in her/his project (licenses of the enabled packages).
> +   See :ref:`licensing_in_packages` below for detailed information.
>  
>  After enabling the menu entry, we can start to check the *get* and
>  *extract* stages, calling them manually one after another.
> @@ -604,51 +605,6 @@ This will re-start with a **clean** BSP and builds exactly the new package and
>  its (known) dependencies. If this builds successfully as well we are really done
>  with the new package.
>  
> -Some Notes about Licenses
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
> -example) is very important and must be filled by the developer of the package.
> -Many licenses bring in obligations using the corresponding package (*attribution*
> -for example). To make life easier for everybody the license for a package must
> -be provided. *SPDX* license identifiers unify the license names and are used
> -in PTXdist to identify license types and obligations.
> -
> -If a package comes with more than one license, all of their SPDX identifiers
> -must be listed and connected with the keyword ``AND``. If your package comes
> -with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
> -
> -One specific obligation cannot be detected examining the SPDX license identifiers
> -by PTXdist: *the license choice*. In this case all licenses of choice must be
> -listed and connected by the keyword ``OR``.
> -
> -If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
> -*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 OR GPL-3.0
> -
> -SPDX License Identifiers
> -^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -A list of SPDX license identifiers can be found here:
> -
> -   https://spdx.org/licenses/
> -
> -Help to Detect the Correct License
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -License identification isn't trivial. A help in doing so can be the following
> -repository and its content. It contains a list of known licenses based on their
> -SPDX identifier. The content is without formatting to simplify text search.
> -
> -   https://github.com/spdx/license-list-data/tree/master/text
> -
>  Advanced Rule Files
>  ~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
> new file mode 100644
> index 000000000000..06b4decd7728
> --- /dev/null
> +++ b/doc/dev_licenses.rst
> @@ -0,0 +1,243 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +In that process, we aim to collect the baseline set of licenses
> +which at least apply to a package.
> +There may be other licenses which apply too, but the complete set often cannot
> +be found without a time-consuming review.
> +Still, the extracted license information in PTXdist can serve as a hint for
> +the full license compliance process,
> +and can help to exclude certain software under certain licenses from the build.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  if the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check source files, like ``*.c`` for license headers.
> +  Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +Ideally you'll find two pieces of information:
> +
> +* A *license text* (e.g. a GNU General Public License v2.0 text)
> +* A *license statement* that states that a certain license applies to (parts of) the project
> +  (often also including copyright statements and a warranty disclaimer)
> +
> +Some licenses (e.g. BSD-style licenses) are also short enough so that both
> +pieces are combined in a short comment header in a source file or a README.
> +Strictly speaking, both the license text and the license statement must be
> +present for a complete, unambiguous license, but see the next section about
> +edge cases.
> +
> +On the other hand, there are some parts that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Some projects also include a COPYING.LIB containing an LGPL text, which is
> +referenced nowhere in the project.
> +In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
> +project skeleton and the maintainer forgot to delete it.
> +
> +Distillation into license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
> +
> +Either the license identifier is clear, e.g. because the README says "GPL 2.0
> +or later" (check the license text to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to match texts to SPDX license identifiers.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail adresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +.. important::
> +
> +   If no license identifier matches, or if anything is unclear about the
> +   licensing situation, use the identifier ``custom`` (for licenses)
> +   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
> +   custom-exception``).
> +
> +If SPDX doesn't know about a license yet, and the project is considered open
> +source or free software, you can `report its license to be added to the SPDX
> +license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> +
> +Multiple licenses
> +^^^^^^^^^^^^^^^^^
> +
> +Open-source software is re-used all the time, so it can happen that some files
> +make their way into a different project.
> +This is usually no problem.
> +If you encounter multiple parts of the project under different licenses, combine
> +their license expressions with ``AND``.
> +For example, in a project that contains both a library and command line tools,
> +the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
> +
> +Sometimes files are licensed under multiple licenses, and only one license is to
> +be selected.
> +In that case, combine the license expressions with ``OR``.
> +This is often the case with Device Trees in the Linux kernel, e.g.:
> +``GPL-2.0-only OR BSD-2-Clause``.
> +
> +No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
> +
> +Conflicting and ambiguous statements
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in the COPYING file says *"GNU General Public License"*, but
> +  the license text below that is in fact a BSD license, the correct license for
> +  the license identifier is the BSD license.
> +
> +Author Intent:
> +  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
> +  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
> +  – the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> +  recent file prevails.
> +
> +Scope:
> +  If no license statement can be found, but there is a COPYING file containing
> +  a license text, infer that the whole project is licensed under that license.
> +
> +Err on the side of caution:
> +  If all you can find is a GPL license text, this doesn't yet tell you whether
> +  the project is licensed under the *-only* or the *-or-later* variant.
> +  In that case, interpret the license restrictively and choose the *-only*
> +  variant for the license identifier.
> +
> +Don't assume:
> +  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
> +
> +.. note::
> +
> +   Any of these cases is considered a bug and should be reported to the upstream maintainers!
> +
> +"Public Domain" software
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +This is purely for historical reasons, and ``public_domain`` should normally
> +*not* be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist package rules
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be::
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +If the copyright statement contains a string of years, leave those lines out for
> +the calculation of the checksum, as an added year does not change the license
> +(in fact, not even a single year is needed for the license to be valid),
> +but only makes package version updates more cumbersome.

I think, this is not quite clear or incorrect. For me, a 'copyright
statement' is something like this:

  Copyright (C) 2013 by Michael Olbrich <m.olbrich@pengutronix.de>

And for many licenses, this must not be removed. So omitting those lines is
wrong.
In some cases the copyright header in a file contains lines with only the
year. Maybe those can be skipped. But they are pretty rare.

The rest looks good to me.

Michael

> +
> +If additional information is in the README or license headers in source files
> +are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file
> +do not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definitely include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting it to license
> +changes in their source.
> +
> +.. note::
> +
> +   For each single license identifier in the license expression, include at
> +   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> +
> +PTXdist will include all files (or their respective lines) that were referenced
> +in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
> diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
> index c232cc91428a..0a1eaf8a1413 100644
> --- a/doc/dev_manual.rst
> +++ b/doc/dev_manual.rst
> @@ -13,6 +13,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
>     dev_add_new_pkgs
>     dev_add_bin_only_files
>     dev_create_new_pkg_templates
> +   dev_licenses
>     dev_layers_in_ptxdist
>     dev_kconfig_diffs
>     dev_code_signing
> diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
> index 674acdcea982..2ee34856dd02 100644
> --- a/doc/ref_make_variables.rst
> +++ b/doc/ref_make_variables.rst
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -223,10 +225,19 @@ Package Definition
>    'gdbserver' for an example.
>  
>  ``<PKG>_LICENSE``
> -  The license of the package. The SPDX license identifiers should be used
> -  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
> -  packages without their own license, e.g. meta packages or packages that
> -  only install files from ``projectroot/``.
> +  The license of the package in the form of an `SPDX license expression
> +  <https://spdx.org/licenses/>`_.
> +  The following values have special meaning for PTXdist:
> +
> +  - ``custom`` and ``custom-exception``: for licenses or license exceptions
> +    that are considered free software, but do not match any license or license
> +    exception known to SPDX.
> +  - ``proprietary``: for proprietary (non-free) packages
> +  - ``ignore`` for packages without their own license, e.g. meta packages or
> +    packages that only install files from ``projectroot/``
> +  - ``unknown``: no licensing information was extracted yet
> +
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -238,6 +249,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value
> -- 
> 2.29.2
> 
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de
> To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH] doc: working with licensing information in packages
  2021-06-16 14:19   ` Michael Olbrich
@ 2021-06-16 14:40     ` Roland Hieber
  2021-08-05  9:18     ` [ptxdist] [PATCH v3] " Roland Hieber
  1 sibling, 0 replies; 19+ messages in thread
From: Roland Hieber @ 2021-06-16 14:40 UTC (permalink / raw)
  To: ptxdist

On Wed, Jun 16, 2021 at 04:19:43PM +0200, Michael Olbrich wrote:
> > +Adding license files to PTXdist package rules
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> > +variable in the respective package rule file.
> > +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> > +including a checksum so that PTXdist complains when they change.
> > +
> > +Example:
> > +
> > +.. code-block:: make
> > +   :caption: ddrescue.make
> > +
> > +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> > +   DDRESCUE_LICENSE_FILES	:= \
> > +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> > +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> > +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> > +
> > +See the section :ref:`package_specific_variables` for more information about
> > +the syntax of those two variables.
> > +
> > +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> > +command applied to a range of lines.
> > +For the example above, lines 1 to 16 of main.cc would be::
> > +
> > +   $ sed -n 1,16p main.cc | md5sum -
> > +   a01d61d3293ce28b883d8ba0c497e968
> > +
> > +If the copyright statement contains a string of years, leave those lines out for
> > +the calculation of the checksum, as an added year does not change the license
> > +(in fact, not even a single year is needed for the license to be valid),
> > +but only makes package version updates more cumbersome.
> 
> I think, this is not quite clear or incorrect. For me, a 'copyright
> statement' is something like this:
> 
>   Copyright (C) 2013 by Michael Olbrich <m.olbrich@pengutronix.de>
> 
> And for many licenses, this must not be removed. So omitting those lines is
> wrong.

Hmmmm, yes, that makes sense to me too. I guess the reasoning was that
the year could be updated every year, so the license MD5 will change too
and needs updating. But yes, GPL actually demands that the copyright
lines stay in place.

I'll rephrase that paragraph to include the copyright statement, the
license statement and (if present) the license text in the
_LICENSE_FILES variable. Having to bump license MD5s is a smaller evil
compared to delivering incomplete data.

 - Roland

> In some cases the copyright header in a file contains lines with only the
> year. Maybe those can be skipped. But they are pretty rare.
> 
> The rest looks good to me.
> 
> Michael

-- 
Roland Hieber, Pengutronix e.K.          | r.hieber@pengutronix.de     |
Steuerwalder Str. 21                     | https://www.pengutronix.de/ |
31137 Hildesheim, Germany                | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686         | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ptxdist] [PATCH v3] doc: working with licensing information in packages
  2021-06-16 14:19   ` Michael Olbrich
  2021-06-16 14:40     ` Roland Hieber
@ 2021-08-05  9:18     ` Roland Hieber
  2021-08-06  6:29       ` Michael Olbrich
  1 sibling, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2021-08-05  9:18 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber

Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
---
PATCH v3:
 - rebase to current master
 - rewrite paragraph about always including the copyright statement
   lines in the checksum (feedback from Michael Olbrich)

PATCH v2: https://lore.ptxdist.org/ptxdist/20210608103639.24336-1-rhi@pengutronix.de
 - rebase to current master
 - squash PATCH 1/2 ("link to the SPDX license list")
 - move from daily use into dev manual chapter
 - expand and rewrite some parts completely
 - absorb old content in doc/dev_add_new_pkgs.rst
 - address feedback from Michael Olbrich:
   - check all source files instead of "some relevant-sounding files"
   - introduce "custom" and "custom-exception" identifiers instead of
     "unknown"
   - be restrictive and err on the side of caution when interpreting
     ambiguities
   - shortly mention the AND, OR and bracket syntaxes

PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
---
 doc/contributing.rst       |   4 +
 doc/daily_work.inc         |   2 +
 doc/dev_add_new_pkgs.rst   |  46 +------
 doc/dev_licenses.rst       | 246 +++++++++++++++++++++++++++++++++++++
 doc/dev_manual.rst         |   1 +
 doc/ref_make_variables.rst |  20 ++-
 6 files changed, 270 insertions(+), 49 deletions(-)
 create mode 100644 doc/dev_licenses.rst

diff --git a/doc/contributing.rst b/doc/contributing.rst
index e7cbd90e6cc3..e4209480893d 100644
--- a/doc/contributing.rst
+++ b/doc/contributing.rst
@@ -156,6 +156,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
 macros used in menu files. There are often typos or the variables was just
 removed.
 
+New packages must also have licensing information in the ``<PKG>_LICENSE``
+and ``<PKG>_LICENSE_FILES`` variables.
+Refer to the section :ref:`licensing_in_packages` for more information.
+
 Helper Scripts
 --------------
 
diff --git a/doc/daily_work.inc b/doc/daily_work.inc
index 37bb9bc48180..b887ed8cd29d 100644
--- a/doc/daily_work.inc
+++ b/doc/daily_work.inc
@@ -1552,3 +1552,5 @@ be enabled. A used mount option of the overlayfs in the default
 newer.
 If your kernel does not meet this requirement you can provide your own local
 and adapted variant of the mentioned mount unit.
+
+.. include:: daily_work_licenses.inc
diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
index 3506436d78ec..6b1248563e6f 100644
--- a/doc/dev_add_new_pkgs.rst
+++ b/doc/dev_add_new_pkgs.rst
@@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
 
 -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
    using in her/his project (licenses of the enabled packages).
+   See :ref:`licensing_in_packages` below for detailed information.
 
 After enabling the menu entry, we can start to check the *get* and
 *extract* stages, calling them manually one after another.
@@ -603,48 +604,3 @@ to do (even if its boring and takes time):
 This will re-start with a **clean** BSP and builds exactly the new package and
 its (known) dependencies. If this builds successfully as well we are really done
 with the new package.
-
-Some Notes about Licenses
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
-example) is very important and must be filled by the developer of the package.
-Many licenses bring in obligations using the corresponding package (*attribution*
-for example). To make life easier for everybody the license for a package must
-be provided. *SPDX* license identifiers unify the license names and are used
-in PTXdist to identify license types and obligations.
-
-If a package comes with more than one license, all of their SPDX identifiers
-must be listed and connected with the keyword ``AND``. If your package comes
-with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
-
-One specific obligation cannot be detected examining the SPDX license identifiers
-by PTXdist: *the license choice*. In this case all licenses of choice must be
-listed and connected by the keyword ``OR``.
-
-If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
-*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 OR GPL-3.0
-
-SPDX License Identifiers
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-A list of SPDX license identifiers can be found here:
-
-   https://spdx.org/licenses/
-
-Help to Detect the Correct License
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-License identification isn't trivial. A help in doing so can be the following
-repository and its content. It contains a list of known licenses based on their
-SPDX identifier. The content is without formatting to simplify text search.
-
-   https://github.com/spdx/license-list-data/tree/master/text
diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
new file mode 100644
index 000000000000..3b7d0bb5f644
--- /dev/null
+++ b/doc/dev_licenses.rst
@@ -0,0 +1,246 @@
+.. _licensing_in_packages:
+
+Tracking licensing information in packages
+------------------------------------------
+
+PTXdist aims to track licensing information for every package.
+This includes the license(s) under which a package can be distributed,
+as well as the respective files in the package's source tree that state those terms.
+Sadly there is no widely adopted standard for machine-readable licensing
+information in source code (`yet <https://reuse.software>`_),
+so here are a few hints where to look.
+
+In that process, we aim to collect the baseline set of licenses
+which at least apply to a package.
+There may be other licenses which apply too, but the complete set often cannot
+be found without a time-consuming review.
+Still, the extracted license information in PTXdist can serve as a hint for
+the full license compliance process,
+and can help to exclude certain software under certain licenses from the build.
+
+There are many older package rules in PTXdist which don't specify licensing information.
+If you want to help complete the database,
+you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
+Note however that this cannot find wrong or incomplete licensing information.
+
+Finding licensing information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You should first select and extract the package in question, and then have a
+look at in the extracted package sources (usually something like
+``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
+``ptxdist package-info mypackage``).
+
+* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
+  These often only contain the license text and, in case of GPL, no information
+  if the code is available under the *-only* or *-or-later* variant.
+  Sometimes these files are in a folder ``/doc`` or ``/legal``.
+
+* Check the ``README``, if there is any.
+  Often there is important information there, e.g. in case of GPL if the
+  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
+
+* Check source files, like ``*.c`` for license headers.
+  Often additional information can be found here.
+
+* If you want to be extra sure, use a license compliance toolchain (e.g.
+  `FOSSology <https://www.fossology.org/>`__) on the project.
+
+Ideally you'll find two pieces of information:
+
+* A *license text* (e.g. a GNU General Public License v2.0 text)
+* A *license statement* that states that a certain license applies to (parts of) the project
+  (often also including copyright statements and a warranty disclaimer)
+
+Some licenses (e.g. BSD-style licenses) are also short enough so that both
+pieces are combined in a short comment header in a source file or a README.
+Strictly speaking, both the license text and the license statement must be
+present for a complete, unambiguous license, but see the next section about
+edge cases.
+
+On the other hand, there are some parts that can be ignored for our purposes:
+
+* Everything that is auto-generated, either by a script in the project source,
+  or by the build system previous to packaging.
+  The generator itself cannot hold copyright, although the authors of the
+  templates used for the generation or the authors of the generator can.
+
+* Most files belonging to the build system don't make it into the compiled code
+  and can therefore be ignored (e.g. configure scripts, Makefiles).
+  These cases sometimes can be hard to detect – if unsure, include the file in
+  your research.
+
+Some projects also include a COPYING.LIB containing an LGPL text, which is
+referenced nowhere in the project.
+In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
+project skeleton and the maintainer forgot to delete it.
+
+Distillation into license identifiers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
+
+Either the license identifier is clear, e.g. because the README says "GPL 2.0
+or later" (check the license text to be sure), or you can use tools like
+`FOSSology <https://www.fossology.org>`__,
+`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
+or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
+to match texts to SPDX license identifiers.
+
+License texts don't have to match exactly, you should apply the
+`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
+accordingly.
+The important part here is that the project's license and the SPDX identifier
+describe the same licensing terms.
+"Rather close" or "mostly similar" statements are not enough for a match,
+but simple unimportant changes like replacing *"The Author"* with the project's
+maintainer's name, or a change in e-mail adresses, are usually okay.
+
+For software that is not open-source according to the `OSI definition
+<https://opensource.org/osd>`_, use the identifier ``proprietary``.
+
+.. important::
+
+   If no license identifier matches, or if anything is unclear about the
+   licensing situation, use the identifier ``custom`` (for licenses)
+   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
+   custom-exception``).
+
+If SPDX doesn't know about a license yet, and the project is considered open
+source or free software, you can `report its license to be added to the SPDX
+license list
+<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
+
+Multiple licenses
+^^^^^^^^^^^^^^^^^
+
+Open-source software is re-used all the time, so it can happen that some files
+make their way into a different project.
+This is usually no problem.
+If you encounter multiple parts of the project under different licenses, combine
+their license expressions with ``AND``.
+For example, in a project that contains both a library and command line tools,
+the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
+
+Sometimes files are licensed under multiple licenses, and only one license is to
+be selected.
+In that case, combine the license expressions with ``OR``.
+This is often the case with Device Trees in the Linux kernel, e.g.:
+``GPL-2.0-only OR BSD-2-Clause``.
+
+No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
+
+Conflicting and ambiguous statements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Human interpretation is needed when statements inside the project conflict with
+each other.
+Some clues that can help you decide:
+
+Detailedness:
+  If the header in the COPYING file says *"GNU General Public License"*, but
+  the license text below that is in fact a BSD license, the correct license for
+  the license identifier is the BSD license.
+
+Author Intent:
+  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
+  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
+  – the README written by the author prevails over the boilerplate text.
+
+Recency:
+  If README and COPYING are both clearly written by the author themselves, and
+  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
+  recent file prevails.
+
+Scope:
+  If no license statement can be found, but there is a COPYING file containing
+  a license text, infer that the whole project is licensed under that license.
+
+Err on the side of caution:
+  If all you can find is a GPL license text, this doesn't yet tell you whether
+  the project is licensed under the *-only* or the *-or-later* variant.
+  In that case, interpret the license restrictively and choose the *-only*
+  variant for the license identifier.
+
+Don't assume:
+  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
+
+.. note::
+
+   Any of these cases is considered a bug and should be reported to the upstream maintainers!
+
+"Public Domain" software
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
+SPDX doesn't supply a license identifier for "Public Domain".
+Nevertheless, some PTXdist package rules specify ``public_domain`` as their
+respective license identifier.
+This is purely for historical reasons, and ``public_domain`` should normally
+*not* be used for new packages.
+Some of those "Public Domain" dedications in packages have since been accepted
+in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
+`SQLite <https://spdx.org/licenses/blessing.html>`_.
+
+No license information at all
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+No license - no usage rights!
+
+Definitely report this bug to the upstream maintainer.
+Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
+
+Adding license files to PTXdist packages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
+variable in the respective package rule file.
+All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
+including a checksum so that PTXdist complains when they change.
+
+Example:
+
+.. code-block:: make
+   :caption: ddrescue.make
+
+   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
+   DDRESCUE_LICENSE_FILES	:= \
+           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
+           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
+           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
+
+See the section :ref:`package_specific_variables` for more information about
+the syntax of those two variables.
+
+The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
+command applied to a range of lines.
+For the example above, lines 1 to 16 of main.cc would be::
+
+   $ sed -n 1,16p main.cc | md5sum -
+   a01d61d3293ce28b883d8ba0c497e968
+
+Always include the copyright statement ("Copyright YYYY (C) Some Person")
+for the calculation of the checksum, even if it means that the checksum changes
+on package updates when new years are added to the string.
+While it is not is needed for most licenses to be valid, some licenses require
+that it must not be removed (e.g. see GPLv2, section 1),
+and it is proper etiquette to give attribution to the maintainers in the
+license report document.
+
+If additional information is in the README or license headers in source files
+are used, also include these files (for source code: one of each is enough),
+but use md5sum only on the relevant lines, so changes in the rest of the file
+do not appear as license changes.
+
+For rather chaotic directories with lots of license files, definitely include at
+least one relevant source file with license headers (if there are any), as some
+developers tend to accumulate license files without adjusting it to license
+changes in their source.
+
+.. note::
+
+   For each single license identifier in the license expression, include at
+   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
+
+PTXdist will include all files (or their respective lines) that were referenced
+in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
index e9a88c1a97f5..fe4307a86b80 100644
--- a/doc/dev_manual.rst
+++ b/doc/dev_manual.rst
@@ -15,6 +15,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
    dev_patching
    dev_add_bin_only_files
    dev_create_new_pkg_templates
+   dev_licenses
    dev_layers_in_ptxdist
    dev_kconfig_diffs
    dev_code_signing
diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
index 674acdcea982..2ee34856dd02 100644
--- a/doc/ref_make_variables.rst
+++ b/doc/ref_make_variables.rst
@@ -127,6 +127,8 @@ Other useful variables:
   that are built and installed during the PTXdist build run.
   There are analogous ``-y`` and ``-m`` variants of those variables too.
 
+.. _package_specific_variables:
+
 Package Specific Variables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -223,10 +225,19 @@ Package Definition
   'gdbserver' for an example.
 
 ``<PKG>_LICENSE``
-  The license of the package. The SPDX license identifiers should be used
-  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
-  packages without their own license, e.g. meta packages or packages that
-  only install files from ``projectroot/``.
+  The license of the package in the form of an `SPDX license expression
+  <https://spdx.org/licenses/>`_.
+  The following values have special meaning for PTXdist:
+
+  - ``custom`` and ``custom-exception``: for licenses or license exceptions
+    that are considered free software, but do not match any license or license
+    exception known to SPDX.
+  - ``proprietary``: for proprietary (non-free) packages
+  - ``ignore`` for packages without their own license, e.g. meta packages or
+    packages that only install files from ``projectroot/``
+  - ``unknown``: no licensing information was extracted yet
+
+  See the section :ref:`licensing_in_packages` for more information.
 
 ``<PKG>_LICENSE_FILES``
   A space separated list of URLs of license text files. The URLs must be
@@ -238,6 +249,7 @@ Package Definition
   used in case the specified file contains more than just the license text,
   e.g. if the license is in the header of a source file. For non ASCII or
   UTF-8 files the encoding can be specified with ``encoding=<enc>``.
+  See the section :ref:`licensing_in_packages` for more information.
 
 For most packages the variables described above are undefined by default.
 However, for cross and host packages these variables default to the value
-- 
2.30.2


_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [PATCH v3] doc: working with licensing information in packages
  2021-08-05  9:18     ` [ptxdist] [PATCH v3] " Roland Hieber
@ 2021-08-06  6:29       ` Michael Olbrich
  2021-08-06 10:44         ` [ptxdist] [PATCH] " Roland Hieber
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Olbrich @ 2021-08-06  6:29 UTC (permalink / raw)
  To: Roland Hieber, ptxdist

On Thu, Aug 05, 2021 at 11:18:48AM +0200, Roland Hieber wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> ---
> PATCH v3:
>  - rebase to current master
>  - rewrite paragraph about always including the copyright statement
>    lines in the checksum (feedback from Michael Olbrich)
> 
> PATCH v2: https://lore.ptxdist.org/ptxdist/20210608103639.24336-1-rhi@pengutronix.de
>  - rebase to current master
>  - squash PATCH 1/2 ("link to the SPDX license list")
>  - move from daily use into dev manual chapter
>  - expand and rewrite some parts completely
>  - absorb old content in doc/dev_add_new_pkgs.rst
>  - address feedback from Michael Olbrich:
>    - check all source files instead of "some relevant-sounding files"
>    - introduce "custom" and "custom-exception" identifiers instead of
>      "unknown"
>    - be restrictive and err on the side of caution when interpreting
>      ambiguities
>    - shortly mention the AND, OR and bracket syntaxes
> 
> PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
> ---
>  doc/contributing.rst       |   4 +
>  doc/daily_work.inc         |   2 +
>  doc/dev_add_new_pkgs.rst   |  46 +------
>  doc/dev_licenses.rst       | 246 +++++++++++++++++++++++++++++++++++++
>  doc/dev_manual.rst         |   1 +
>  doc/ref_make_variables.rst |  20 ++-
>  6 files changed, 270 insertions(+), 49 deletions(-)
>  create mode 100644 doc/dev_licenses.rst
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index e7cbd90e6cc3..e4209480893d 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -156,6 +156,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
>  macros used in menu files. There are often typos or the variables was just
>  removed.
>  
> +New packages must also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/daily_work.inc b/doc/daily_work.inc
> index 37bb9bc48180..b887ed8cd29d 100644
> --- a/doc/daily_work.inc
> +++ b/doc/daily_work.inc
> @@ -1552,3 +1552,5 @@ be enabled. A used mount option of the overlayfs in the default
>  newer.
>  If your kernel does not meet this requirement you can provide your own local
>  and adapted variant of the mentioned mount unit.
> +
> +.. include:: daily_work_licenses.inc

/tmp/ptxdist.9cBZgq/docs/daily_work.inc:1556: WARNING: Problems with "include" directive path:
InputError: [Errno 2] No such file or directory: '/tmp/ptxdist.9cBZgq/docs/daily_work_licenses.inc'.

Michael


> diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
> index 3506436d78ec..6b1248563e6f 100644
> --- a/doc/dev_add_new_pkgs.rst
> +++ b/doc/dev_add_new_pkgs.rst
> @@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
>  
>  -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
>     using in her/his project (licenses of the enabled packages).
> +   See :ref:`licensing_in_packages` below for detailed information.
>  
>  After enabling the menu entry, we can start to check the *get* and
>  *extract* stages, calling them manually one after another.
> @@ -603,48 +604,3 @@ to do (even if its boring and takes time):
>  This will re-start with a **clean** BSP and builds exactly the new package and
>  its (known) dependencies. If this builds successfully as well we are really done
>  with the new package.
> -
> -Some Notes about Licenses
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
> -example) is very important and must be filled by the developer of the package.
> -Many licenses bring in obligations using the corresponding package (*attribution*
> -for example). To make life easier for everybody the license for a package must
> -be provided. *SPDX* license identifiers unify the license names and are used
> -in PTXdist to identify license types and obligations.
> -
> -If a package comes with more than one license, all of their SPDX identifiers
> -must be listed and connected with the keyword ``AND``. If your package comes
> -with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
> -
> -One specific obligation cannot be detected examining the SPDX license identifiers
> -by PTXdist: *the license choice*. In this case all licenses of choice must be
> -listed and connected by the keyword ``OR``.
> -
> -If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
> -*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 OR GPL-3.0
> -
> -SPDX License Identifiers
> -^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -A list of SPDX license identifiers can be found here:
> -
> -   https://spdx.org/licenses/
> -
> -Help to Detect the Correct License
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -License identification isn't trivial. A help in doing so can be the following
> -repository and its content. It contains a list of known licenses based on their
> -SPDX identifier. The content is without formatting to simplify text search.
> -
> -   https://github.com/spdx/license-list-data/tree/master/text
> diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
> new file mode 100644
> index 000000000000..3b7d0bb5f644
> --- /dev/null
> +++ b/doc/dev_licenses.rst
> @@ -0,0 +1,246 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +In that process, we aim to collect the baseline set of licenses
> +which at least apply to a package.
> +There may be other licenses which apply too, but the complete set often cannot
> +be found without a time-consuming review.
> +Still, the extracted license information in PTXdist can serve as a hint for
> +the full license compliance process,
> +and can help to exclude certain software under certain licenses from the build.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  if the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check source files, like ``*.c`` for license headers.
> +  Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +Ideally you'll find two pieces of information:
> +
> +* A *license text* (e.g. a GNU General Public License v2.0 text)
> +* A *license statement* that states that a certain license applies to (parts of) the project
> +  (often also including copyright statements and a warranty disclaimer)
> +
> +Some licenses (e.g. BSD-style licenses) are also short enough so that both
> +pieces are combined in a short comment header in a source file or a README.
> +Strictly speaking, both the license text and the license statement must be
> +present for a complete, unambiguous license, but see the next section about
> +edge cases.
> +
> +On the other hand, there are some parts that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Some projects also include a COPYING.LIB containing an LGPL text, which is
> +referenced nowhere in the project.
> +In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
> +project skeleton and the maintainer forgot to delete it.
> +
> +Distillation into license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
> +
> +Either the license identifier is clear, e.g. because the README says "GPL 2.0
> +or later" (check the license text to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to match texts to SPDX license identifiers.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail adresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +.. important::
> +
> +   If no license identifier matches, or if anything is unclear about the
> +   licensing situation, use the identifier ``custom`` (for licenses)
> +   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
> +   custom-exception``).
> +
> +If SPDX doesn't know about a license yet, and the project is considered open
> +source or free software, you can `report its license to be added to the SPDX
> +license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> +
> +Multiple licenses
> +^^^^^^^^^^^^^^^^^
> +
> +Open-source software is re-used all the time, so it can happen that some files
> +make their way into a different project.
> +This is usually no problem.
> +If you encounter multiple parts of the project under different licenses, combine
> +their license expressions with ``AND``.
> +For example, in a project that contains both a library and command line tools,
> +the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
> +
> +Sometimes files are licensed under multiple licenses, and only one license is to
> +be selected.
> +In that case, combine the license expressions with ``OR``.
> +This is often the case with Device Trees in the Linux kernel, e.g.:
> +``GPL-2.0-only OR BSD-2-Clause``.
> +
> +No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
> +
> +Conflicting and ambiguous statements
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in the COPYING file says *"GNU General Public License"*, but
> +  the license text below that is in fact a BSD license, the correct license for
> +  the license identifier is the BSD license.
> +
> +Author Intent:
> +  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
> +  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
> +  – the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> +  recent file prevails.
> +
> +Scope:
> +  If no license statement can be found, but there is a COPYING file containing
> +  a license text, infer that the whole project is licensed under that license.
> +
> +Err on the side of caution:
> +  If all you can find is a GPL license text, this doesn't yet tell you whether
> +  the project is licensed under the *-only* or the *-or-later* variant.
> +  In that case, interpret the license restrictively and choose the *-only*
> +  variant for the license identifier.
> +
> +Don't assume:
> +  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
> +
> +.. note::
> +
> +   Any of these cases is considered a bug and should be reported to the upstream maintainers!
> +
> +"Public Domain" software
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +This is purely for historical reasons, and ``public_domain`` should normally
> +*not* be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist packages
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +   :caption: ddrescue.make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be::
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +Always include the copyright statement ("Copyright YYYY (C) Some Person")
> +for the calculation of the checksum, even if it means that the checksum changes
> +on package updates when new years are added to the string.
> +While it is not is needed for most licenses to be valid, some licenses require
> +that it must not be removed (e.g. see GPLv2, section 1),
> +and it is proper etiquette to give attribution to the maintainers in the
> +license report document.
> +
> +If additional information is in the README or license headers in source files
> +are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file
> +do not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definitely include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting it to license
> +changes in their source.
> +
> +.. note::
> +
> +   For each single license identifier in the license expression, include at
> +   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> +
> +PTXdist will include all files (or their respective lines) that were referenced
> +in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
> diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
> index e9a88c1a97f5..fe4307a86b80 100644
> --- a/doc/dev_manual.rst
> +++ b/doc/dev_manual.rst
> @@ -15,6 +15,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
>     dev_patching
>     dev_add_bin_only_files
>     dev_create_new_pkg_templates
> +   dev_licenses
>     dev_layers_in_ptxdist
>     dev_kconfig_diffs
>     dev_code_signing
> diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
> index 674acdcea982..2ee34856dd02 100644
> --- a/doc/ref_make_variables.rst
> +++ b/doc/ref_make_variables.rst
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -223,10 +225,19 @@ Package Definition
>    'gdbserver' for an example.
>  
>  ``<PKG>_LICENSE``
> -  The license of the package. The SPDX license identifiers should be used
> -  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
> -  packages without their own license, e.g. meta packages or packages that
> -  only install files from ``projectroot/``.
> +  The license of the package in the form of an `SPDX license expression
> +  <https://spdx.org/licenses/>`_.
> +  The following values have special meaning for PTXdist:
> +
> +  - ``custom`` and ``custom-exception``: for licenses or license exceptions
> +    that are considered free software, but do not match any license or license
> +    exception known to SPDX.
> +  - ``proprietary``: for proprietary (non-free) packages
> +  - ``ignore`` for packages without their own license, e.g. meta packages or
> +    packages that only install files from ``projectroot/``
> +  - ``unknown``: no licensing information was extracted yet
> +
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -238,6 +249,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value
> -- 
> 2.30.2
> 
> 
> _______________________________________________
> ptxdist mailing list
> ptxdist@pengutronix.de
> To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [ptxdist] [PATCH] doc: working with licensing information in packages
  2021-08-06  6:29       ` Michael Olbrich
@ 2021-08-06 10:44         ` Roland Hieber
  2021-10-07 10:18           ` [ptxdist] [APPLIED] " Michael Olbrich
  0 siblings, 1 reply; 19+ messages in thread
From: Roland Hieber @ 2021-08-06 10:44 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber

Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
---
PATCH v4:
 - remove dangling include to daily_work_licenses.inc (how did that ever
   work…?)

PATCH v3: https://lore.ptxdist.org/ptxdist/20210805091848.2855-1-rhi@pengutronix.de
 - rebase to current master
 - rewrite paragraph about always including the copyright statement
   lines in the checksum (feedback from Michael Olbrich)

PATCH v2: https://lore.ptxdist.org/ptxdist/20210608103639.24336-1-rhi@pengutronix.de
 - rebase to current master
 - squash PATCH 1/2 ("link to the SPDX license list")
 - move from daily use into dev manual chapter
 - expand and rewrite some parts completely
 - absorb old content in doc/dev_add_new_pkgs.rst
 - address feedback from Michael Olbrich:
   - check all source files instead of "some relevant-sounding files"
   - introduce "custom" and "custom-exception" identifiers instead of
     "unknown"
   - be restrictive and err on the side of caution when interpreting
     ambiguities
   - shortly mention the AND, OR and bracket syntaxes

PATCH v1: https://lore.ptxdist.org/ptxdist/20200511100306.7948-2-rhi@pengutronix.de
---
 doc/contributing.rst       |   4 +
 doc/dev_add_new_pkgs.rst   |  46 +------
 doc/dev_licenses.rst       | 245 +++++++++++++++++++++++++++++++++++++
 doc/dev_manual.rst         |   1 +
 doc/ref_make_variables.rst |  20 ++-
 5 files changed, 267 insertions(+), 49 deletions(-)
 create mode 100644 doc/dev_licenses.rst

diff --git a/doc/contributing.rst b/doc/contributing.rst
index e7cbd90e6cc3..e4209480893d 100644
--- a/doc/contributing.rst
+++ b/doc/contributing.rst
@@ -156,6 +156,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
 macros used in menu files. There are often typos or the variables was just
 removed.
 
+New packages must also have licensing information in the ``<PKG>_LICENSE``
+and ``<PKG>_LICENSE_FILES`` variables.
+Refer to the section :ref:`licensing_in_packages` for more information.
+
 Helper Scripts
 --------------
 
diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
index 3506436d78ec..6b1248563e6f 100644
--- a/doc/dev_add_new_pkgs.rst
+++ b/doc/dev_add_new_pkgs.rst
@@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
 
 -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
    using in her/his project (licenses of the enabled packages).
+   See :ref:`licensing_in_packages` below for detailed information.
 
 After enabling the menu entry, we can start to check the *get* and
 *extract* stages, calling them manually one after another.
@@ -603,48 +604,3 @@ to do (even if its boring and takes time):
 This will re-start with a **clean** BSP and builds exactly the new package and
 its (known) dependencies. If this builds successfully as well we are really done
 with the new package.
-
-Some Notes about Licenses
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
-example) is very important and must be filled by the developer of the package.
-Many licenses bring in obligations using the corresponding package (*attribution*
-for example). To make life easier for everybody the license for a package must
-be provided. *SPDX* license identifiers unify the license names and are used
-in PTXdist to identify license types and obligations.
-
-If a package comes with more than one license, all of their SPDX identifiers
-must be listed and connected with the keyword ``AND``. If your package comes
-with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
-
-One specific obligation cannot be detected examining the SPDX license identifiers
-by PTXdist: *the license choice*. In this case all licenses of choice must be
-listed and connected by the keyword ``OR``.
-
-If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
-*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
-
-.. code-block:: make
-
-   FOO_LICENSE := GPL-2.0 OR GPL-3.0
-
-SPDX License Identifiers
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-A list of SPDX license identifiers can be found here:
-
-   https://spdx.org/licenses/
-
-Help to Detect the Correct License
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-License identification isn't trivial. A help in doing so can be the following
-repository and its content. It contains a list of known licenses based on their
-SPDX identifier. The content is without formatting to simplify text search.
-
-   https://github.com/spdx/license-list-data/tree/master/text
diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
new file mode 100644
index 000000000000..0bb1c8d77c5e
--- /dev/null
+++ b/doc/dev_licenses.rst
@@ -0,0 +1,245 @@
+.. _licensing_in_packages:
+
+Tracking licensing information in packages
+------------------------------------------
+
+PTXdist aims to track licensing information for every package.
+This includes the license(s) under which a package can be distributed,
+as well as the respective files in the package's source tree that state those terms.
+Sadly there is no widely adopted standard for machine-readable licensing
+information in source code (`yet <https://reuse.software>`_),
+so here are a few hints where to look.
+
+In that process, we aim to collect the baseline set of licenses
+which at least apply to a package.
+There may be other licenses which apply too, but the complete set often cannot
+be found without a time-consuming review.
+Still, the extracted license information in PTXdist can serve as a hint for
+the full license compliance process,
+and can help to exclude certain software under certain licenses from the build.
+
+There are many older package rules in PTXdist which don't specify licensing information.
+If you want to help complete the database,
+you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
+Note however that this cannot find wrong or incomplete licensing information.
+
+Finding licensing information
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You should first select and extract the package in question, and then have a
+look at in the extracted package sources (usually something like
+``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
+``ptxdist package-info mypackage``).
+
+* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
+  These often only contain the license text and, in case of GPL, no information
+  if the code is available under the *-only* or *-or-later* variant.
+  Sometimes these files are in a folder ``/doc`` or ``/legal``.
+
+* Check the ``README``, if there is any.
+  Often there is important information there, e.g. in case of GPL if the
+  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
+
+* Check source files, like ``*.c`` for license headers.
+  Often additional information can be found here.
+
+* If you want to be extra sure, use a license compliance toolchain (e.g.
+  `FOSSology <https://www.fossology.org/>`__) on the project.
+
+Ideally you'll find two pieces of information:
+
+* A *license text* (e.g. a GNU General Public License v2.0 text)
+* A *license statement* that states that a certain license applies to (parts of) the project
+  (often also including copyright statements and a warranty disclaimer)
+
+Some licenses (e.g. BSD-style licenses) are also short enough so that both
+pieces are combined in a short comment header in a source file or a README.
+Strictly speaking, both the license text and the license statement must be
+present for a complete, unambiguous license, but see the next section about
+edge cases.
+
+On the other hand, there are some parts that can be ignored for our purposes:
+
+* Everything that is auto-generated, either by a script in the project source,
+  or by the build system previous to packaging.
+  The generator itself cannot hold copyright, although the authors of the
+  templates used for the generation or the authors of the generator can.
+
+* Most files belonging to the build system don't make it into the compiled code
+  and can therefore be ignored (e.g. configure scripts, Makefiles).
+  These cases sometimes can be hard to detect – if unsure, include the file in
+  your research.
+
+Some projects also include a COPYING.LIB containing an LGPL text, which is
+referenced nowhere in the project.
+In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
+project skeleton and the maintainer forgot to delete it.
+
+Distillation into license identifiers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
+
+Either the license identifier is clear, e.g. because the README says "GPL 2.0
+or later" (check the license text to be sure), or you can use tools like
+`FOSSology <https://www.fossology.org>`__,
+`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
+or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
+to match texts to SPDX license identifiers.
+
+License texts don't have to match exactly, you should apply the
+`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
+accordingly.
+The important part here is that the project's license and the SPDX identifier
+describe the same licensing terms.
+"Rather close" or "mostly similar" statements are not enough for a match,
+but simple unimportant changes like replacing *"The Author"* with the project's
+maintainer's name, or a change in e-mail adresses, are usually okay.
+
+For software that is not open-source according to the `OSI definition
+<https://opensource.org/osd>`_, use the identifier ``proprietary``.
+
+.. important::
+
+   If no license identifier matches, or if anything is unclear about the
+   licensing situation, use the identifier ``custom`` (for licenses)
+   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
+   custom-exception``).
+
+If SPDX doesn't know about a license yet, and the project is considered open
+source or free software, you can `report its license to be added to the SPDX
+license list
+<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
+
+Multiple licenses
+^^^^^^^^^^^^^^^^^
+
+Open-source software is re-used all the time, so it can happen that some files
+make their way into a different project.
+This is usually no problem.
+If you encounter multiple parts of the project under different licenses, combine
+their license expressions with ``AND``.
+For example, in a project that contains both a library and command line tools,
+the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
+
+Sometimes files are licensed under multiple licenses, and only one license is to
+be selected.
+In that case, combine the license expressions with ``OR``.
+This is often the case with Device Trees in the Linux kernel, e.g.:
+``GPL-2.0-only OR BSD-2-Clause``.
+
+No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
+
+Conflicting and ambiguous statements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Human interpretation is needed when statements inside the project conflict with
+each other.
+Some clues that can help you decide:
+
+Detailedness:
+  If the header in the COPYING file says *"GNU General Public License"*, but
+  the license text below that is in fact a BSD license, the correct license for
+  the license identifier is the BSD license.
+
+Author Intent:
+  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
+  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
+  – the README written by the author prevails over the boilerplate text.
+
+Recency:
+  If README and COPYING are both clearly written by the author themselves, and
+  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
+  recent file prevails.
+
+Scope:
+  If no license statement can be found, but there is a COPYING file containing
+  a license text, infer that the whole project is licensed under that license.
+
+Err on the side of caution:
+  If all you can find is a GPL license text, this doesn't yet tell you whether
+  the project is licensed under the *-only* or the *-or-later* variant.
+  In that case, interpret the license restrictively and choose the *-only*
+  variant for the license identifier.
+
+Don't assume:
+  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
+
+.. note::
+
+   Any of these cases is considered a bug and should be reported to the upstream maintainers!
+
+"Public Domain" software
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
+SPDX doesn't supply a license identifier for "Public Domain".
+Nevertheless, some PTXdist package rules specify ``public_domain`` as their
+respective license identifier.
+This is purely for historical reasons, and ``public_domain`` should normally
+*not* be used for new packages.
+Some of those "Public Domain" dedications in packages have since been accepted
+in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
+`SQLite <https://spdx.org/licenses/blessing.html>`_.
+
+No license information at all
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+No license - no usage rights!
+
+Definitely report this bug to the upstream maintainer.
+Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
+
+Adding license files to PTXdist packages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
+variable in the respective package rule file.
+All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
+including a checksum so that PTXdist complains when they change.
+
+Example:
+
+.. code-block:: make
+
+   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
+   DDRESCUE_LICENSE_FILES	:= \
+           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
+           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
+           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
+
+See the section :ref:`package_specific_variables` for more information about
+the syntax of those two variables.
+
+The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
+command applied to a range of lines.
+For the example above, lines 1 to 16 of main.cc would be::
+
+   $ sed -n 1,16p main.cc | md5sum -
+   a01d61d3293ce28b883d8ba0c497e968
+
+Always include the copyright statement ("Copyright YYYY (C) Some Person")
+for the calculation of the checksum, even if it means that the checksum changes
+on package updates when new years are added to the string.
+While it is not is needed for most licenses to be valid, some licenses require
+that it must not be removed (e.g. see GPLv2, section 1),
+and it is proper etiquette to give attribution to the maintainers in the
+license report document.
+
+If additional information is in the README or license headers in source files
+are used, also include these files (for source code: one of each is enough),
+but use md5sum only on the relevant lines, so changes in the rest of the file
+do not appear as license changes.
+
+For rather chaotic directories with lots of license files, definitely include at
+least one relevant source file with license headers (if there are any), as some
+developers tend to accumulate license files without adjusting it to license
+changes in their source.
+
+.. note::
+
+   For each single license identifier in the license expression, include at
+   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
+
+PTXdist will include all files (or their respective lines) that were referenced
+in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
index e9a88c1a97f5..fe4307a86b80 100644
--- a/doc/dev_manual.rst
+++ b/doc/dev_manual.rst
@@ -15,6 +15,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
    dev_patching
    dev_add_bin_only_files
    dev_create_new_pkg_templates
+   dev_licenses
    dev_layers_in_ptxdist
    dev_kconfig_diffs
    dev_code_signing
diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
index 674acdcea982..2ee34856dd02 100644
--- a/doc/ref_make_variables.rst
+++ b/doc/ref_make_variables.rst
@@ -127,6 +127,8 @@ Other useful variables:
   that are built and installed during the PTXdist build run.
   There are analogous ``-y`` and ``-m`` variants of those variables too.
 
+.. _package_specific_variables:
+
 Package Specific Variables
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -223,10 +225,19 @@ Package Definition
   'gdbserver' for an example.
 
 ``<PKG>_LICENSE``
-  The license of the package. The SPDX license identifiers should be used
-  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
-  packages without their own license, e.g. meta packages or packages that
-  only install files from ``projectroot/``.
+  The license of the package in the form of an `SPDX license expression
+  <https://spdx.org/licenses/>`_.
+  The following values have special meaning for PTXdist:
+
+  - ``custom`` and ``custom-exception``: for licenses or license exceptions
+    that are considered free software, but do not match any license or license
+    exception known to SPDX.
+  - ``proprietary``: for proprietary (non-free) packages
+  - ``ignore`` for packages without their own license, e.g. meta packages or
+    packages that only install files from ``projectroot/``
+  - ``unknown``: no licensing information was extracted yet
+
+  See the section :ref:`licensing_in_packages` for more information.
 
 ``<PKG>_LICENSE_FILES``
   A space separated list of URLs of license text files. The URLs must be
@@ -238,6 +249,7 @@ Package Definition
   used in case the specified file contains more than just the license text,
   e.g. if the license is in the header of a source file. For non ASCII or
   UTF-8 files the encoding can be specified with ``encoding=<enc>``.
+  See the section :ref:`licensing_in_packages` for more information.
 
 For most packages the variables described above are undefined by default.
 However, for cross and host packages these variables default to the value
-- 
2.30.2


_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [ptxdist] [APPLIED] doc: working with licensing information in packages
  2021-08-06 10:44         ` [ptxdist] [PATCH] " Roland Hieber
@ 2021-10-07 10:18           ` Michael Olbrich
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Olbrich @ 2021-10-07 10:18 UTC (permalink / raw)
  To: ptxdist; +Cc: Roland Hieber

Thanks, applied as 76d1f680233955839298435e9faf11f15434b4a4.

Michael

[sent from post-receive hook]

On Thu, 07 Oct 2021 12:18:15 +0200, Roland Hieber <rhi@pengutronix.de> wrote:
> Co-authored-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Felicitas Jung <f.jung@pengutronix.de>
> Signed-off-by: Roland Hieber <rhi@pengutronix.de>
> Message-Id: <20210806104401.12401-1-rhi@pengutronix.de>
> Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de>
> 
> diff --git a/doc/contributing.rst b/doc/contributing.rst
> index e7cbd90e6cc3..e4209480893d 100644
> --- a/doc/contributing.rst
> +++ b/doc/contributing.rst
> @@ -156,6 +156,10 @@ updated of removed after a version bump. Unknown PTXCONF_* variables or
>  macros used in menu files. There are often typos or the variables was just
>  removed.
>  
> +New packages must also have licensing information in the ``<PKG>_LICENSE``
> +and ``<PKG>_LICENSE_FILES`` variables.
> +Refer to the section :ref:`licensing_in_packages` for more information.
> +
>  Helper Scripts
>  --------------
>  
> diff --git a/doc/dev_add_new_pkgs.rst b/doc/dev_add_new_pkgs.rst
> index 3506436d78ec..6b1248563e6f 100644
> --- a/doc/dev_add_new_pkgs.rst
> +++ b/doc/dev_add_new_pkgs.rst
> @@ -248,6 +248,7 @@ PTXdist specific. What does it mean:
>  
>  -  ``*_LICENSE`` enables the user to get a list of licenses she/he is
>     using in her/his project (licenses of the enabled packages).
> +   See :ref:`licensing_in_packages` below for detailed information.
>  
>  After enabling the menu entry, we can start to check the *get* and
>  *extract* stages, calling them manually one after another.
> @@ -603,48 +604,3 @@ to do (even if its boring and takes time):
>  This will re-start with a **clean** BSP and builds exactly the new package and
>  its (known) dependencies. If this builds successfully as well we are really done
>  with the new package.
> -
> -Some Notes about Licenses
> -~~~~~~~~~~~~~~~~~~~~~~~~~
> -
> -The already mentioned rule variable ``*_LICENSE`` (e.g. ``FOO_LICENSE`` in our
> -example) is very important and must be filled by the developer of the package.
> -Many licenses bring in obligations using the corresponding package (*attribution*
> -for example). To make life easier for everybody the license for a package must
> -be provided. *SPDX* license identifiers unify the license names and are used
> -in PTXdist to identify license types and obligations.
> -
> -If a package comes with more than one license, all of their SPDX identifiers
> -must be listed and connected with the keyword ``AND``. If your package comes
> -with GPL-2.0 and LGPL-2.1 licenses, the definition should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 AND LGPL-2.1
> -
> -One specific obligation cannot be detected examining the SPDX license identifiers
> -by PTXdist: *the license choice*. In this case all licenses of choice must be
> -listed and connected by the keyword ``OR``.
> -
> -If, for example, your obligation is to select one of the licenses *GPL-2.0* **or**
> -*GPL-3.0*, the ``*_LICENSE`` variable should look like this:
> -
> -.. code-block:: make
> -
> -   FOO_LICENSE := GPL-2.0 OR GPL-3.0
> -
> -SPDX License Identifiers
> -^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -A list of SPDX license identifiers can be found here:
> -
> -   https://spdx.org/licenses/
> -
> -Help to Detect the Correct License
> -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> -
> -License identification isn't trivial. A help in doing so can be the following
> -repository and its content. It contains a list of known licenses based on their
> -SPDX identifier. The content is without formatting to simplify text search.
> -
> -   https://github.com/spdx/license-list-data/tree/master/text
> diff --git a/doc/dev_licenses.rst b/doc/dev_licenses.rst
> new file mode 100644
> index 000000000000..0bb1c8d77c5e
> --- /dev/null
> +++ b/doc/dev_licenses.rst
> @@ -0,0 +1,245 @@
> +.. _licensing_in_packages:
> +
> +Tracking licensing information in packages
> +------------------------------------------
> +
> +PTXdist aims to track licensing information for every package.
> +This includes the license(s) under which a package can be distributed,
> +as well as the respective files in the package's source tree that state those terms.
> +Sadly there is no widely adopted standard for machine-readable licensing
> +information in source code (`yet <https://reuse.software>`_),
> +so here are a few hints where to look.
> +
> +In that process, we aim to collect the baseline set of licenses
> +which at least apply to a package.
> +There may be other licenses which apply too, but the complete set often cannot
> +be found without a time-consuming review.
> +Still, the extracted license information in PTXdist can serve as a hint for
> +the full license compliance process,
> +and can help to exclude certain software under certain licenses from the build.
> +
> +There are many older package rules in PTXdist which don't specify licensing information.
> +If you want to help complete the database,
> +you can use ``grep -L _LICENSE_FILES rules/*.make`` (in the PTXdist tree) to find those rules.
> +Note however that this cannot find wrong or incomplete licensing information.
> +
> +Finding licensing information
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You should first select and extract the package in question, and then have a
> +look at in the extracted package sources (usually something like
> +``platform-nnn/build-target/mypackage-1.0`` in your BSP, if in doubt see
> +``ptxdist package-info mypackage``).
> +
> +* Check for files named ``COPYING``, ``COPYRIGHT``,  or ``LICENSE``.
> +  These often only contain the license text and, in case of GPL, no information
> +  if the code is available under the *-only* or *-or-later* variant.
> +  Sometimes these files are in a folder ``/doc`` or ``/legal``.
> +
> +* Check the ``README``, if there is any.
> +  Often there is important information there, e.g. in case of GPL if the
> +  software is *GPL-x.x-or-later* or *GPL-x.x-only*.
> +
> +* Check source files, like ``*.c`` for license headers.
> +  Often additional information can be found here.
> +
> +* If you want to be extra sure, use a license compliance toolchain (e.g.
> +  `FOSSology <https://www.fossology.org/>`__) on the project.
> +
> +Ideally you'll find two pieces of information:
> +
> +* A *license text* (e.g. a GNU General Public License v2.0 text)
> +* A *license statement* that states that a certain license applies to (parts of) the project
> +  (often also including copyright statements and a warranty disclaimer)
> +
> +Some licenses (e.g. BSD-style licenses) are also short enough so that both
> +pieces are combined in a short comment header in a source file or a README.
> +Strictly speaking, both the license text and the license statement must be
> +present for a complete, unambiguous license, but see the next section about
> +edge cases.
> +
> +On the other hand, there are some parts that can be ignored for our purposes:
> +
> +* Everything that is auto-generated, either by a script in the project source,
> +  or by the build system previous to packaging.
> +  The generator itself cannot hold copyright, although the authors of the
> +  templates used for the generation or the authors of the generator can.
> +
> +* Most files belonging to the build system don't make it into the compiled code
> +  and can therefore be ignored (e.g. configure scripts, Makefiles).
> +  These cases sometimes can be hard to detect – if unsure, include the file in
> +  your research.
> +
> +Some projects also include a COPYING.LIB containing an LGPL text, which is
> +referenced nowhere in the project.
> +In that case, ignore the COPYING.LIB – it probably comes from a boilerplate
> +project skeleton and the maintainer forgot to delete it.
> +
> +Distillation into license identifiers
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +In PTXdist, we use `SPDX license expressions <https://spdx.org/licenses/>`_.
> +
> +Either the license identifier is clear, e.g. because the README says "GPL 2.0
> +or later" (check the license text to be sure), or you can use tools like
> +`FOSSology <https://www.fossology.org>`__,
> +`licensecheck <https://wiki.debian.org/CopyrightReviewTools#Command-line_tools_in_Debian>`_,
> +or `spdx-license-match <https://github.com/rohieb/spdx-license-match>`_
> +to match texts to SPDX license identifiers.
> +
> +License texts don't have to match exactly, you should apply the
> +`SPDX Matching Guidelines <https://spdx.org/spdx-license-list/matching-guidelines>`_
> +accordingly.
> +The important part here is that the project's license and the SPDX identifier
> +describe the same licensing terms.
> +"Rather close" or "mostly similar" statements are not enough for a match,
> +but simple unimportant changes like replacing *"The Author"* with the project's
> +maintainer's name, or a change in e-mail adresses, are usually okay.
> +
> +For software that is not open-source according to the `OSI definition
> +<https://opensource.org/osd>`_, use the identifier ``proprietary``.
> +
> +.. important::
> +
> +   If no license identifier matches, or if anything is unclear about the
> +   licensing situation, use the identifier ``custom`` (for licenses)
> +   or ``custom-exception`` (for license exceptions, e.g.: ``GPL-2.0-only WITH
> +   custom-exception``).
> +
> +If SPDX doesn't know about a license yet, and the project is considered open
> +source or free software, you can `report its license to be added to the SPDX
> +license list
> +<https://github.com/spdx/license-list-XML/blob/master/CONTRIBUTING.md#request-a-new-license-or-exception-be-added-to-the-spdx-license-list>`_.
> +
> +Multiple licenses
> +^^^^^^^^^^^^^^^^^
> +
> +Open-source software is re-used all the time, so it can happen that some files
> +make their way into a different project.
> +This is usually no problem.
> +If you encounter multiple parts of the project under different licenses, combine
> +their license expressions with ``AND``.
> +For example, in a project that contains both a library and command line tools,
> +the license expression could be ``GPL-2.0-or-later AND LGPL-2.1-or-later``.
> +
> +Sometimes files are licensed under multiple licenses, and only one license is to
> +be selected.
> +In that case, combine the license expressions with ``OR``.
> +This is often the case with Device Trees in the Linux kernel, e.g.:
> +``GPL-2.0-only OR BSD-2-Clause``.
> +
> +No operator precedence is defined, use brackets ``(…)`` to group sub-statements.
> +
> +Conflicting and ambiguous statements
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +Human interpretation is needed when statements inside the project conflict with
> +each other.
> +Some clues that can help you decide:
> +
> +Detailedness:
> +  If the header in the COPYING file says *"GNU General Public License"*, but
> +  the license text below that is in fact a BSD license, the correct license for
> +  the license identifier is the BSD license.
> +
> +Author Intent:
> +  If the README says *"this code is LGPL 2.1"*, but COPYING contains a GPL
> +  boilerplate license text, the correct license identifier is probably *"LGPL 2.1"*
> +  – the README written by the author prevails over the boilerplate text.
> +
> +Recency:
> +  If README and COPYING are both clearly written by the author themselves, and
> +  the README says *"don't do $thing*" and COPYING says *"do $thing*", the more
> +  recent file prevails.
> +
> +Scope:
> +  If no license statement can be found, but there is a COPYING file containing
> +  a license text, infer that the whole project is licensed under that license.
> +
> +Err on the side of caution:
> +  If all you can find is a GPL license text, this doesn't yet tell you whether
> +  the project is licensed under the *-only* or the *-or-later* variant.
> +  In that case, interpret the license restrictively and choose the *-only*
> +  variant for the license identifier.
> +
> +Don't assume:
> +  If anything is ambiguous or unclear, choose ``custom`` as a license identifier.
> +
> +.. note::
> +
> +   Any of these cases is considered a bug and should be reported to the upstream maintainers!
> +
> +"Public Domain" software
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +For `good reasons <https://wiki.spdx.org/view/Legal_Team/Decisions/Dealing_with_Public_Domain_within_SPDX_Files>`_,
> +SPDX doesn't supply a license identifier for "Public Domain".
> +Nevertheless, some PTXdist package rules specify ``public_domain`` as their
> +respective license identifier.
> +This is purely for historical reasons, and ``public_domain`` should normally
> +*not* be used for new packages.
> +Some of those "Public Domain" dedications in packages have since been accepted
> +in SPDX, e.g. `libselinux <https://spdx.org/licenses/libselinux-1.0.html>`_ or
> +`SQLite <https://spdx.org/licenses/blessing.html>`_.
> +
> +No license information at all
> +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> +
> +No license - no usage rights!
> +
> +Definitely report this bug to the upstream maintainer.
> +Maybe even point them in the direction of `machine-readablity <https://reuse.software/>`_ :)
> +
> +Adding license files to PTXdist packages
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The SPDX license identifier of the package goes into the ``<PKG>_LICENSE``
> +variable in the respective package rule file.
> +All relevant files identified in the steps above are then added to the variable ``<PKG>_LICENSE``,
> +including a checksum so that PTXdist complains when they change.
> +
> +Example:
> +
> +.. code-block:: make
> +
> +   DDRESCUE_LICENSE	:= GPL-2.0-or-later AND BSD-2-Clause
> +   DDRESCUE_LICENSE_FILES	:= \
> +           file://COPYING;md5=76d6e300ffd8fb9d18bd9b136a9bba13 \
> +           file://main.cc;startline=1;endline=16;md5=a01d61d3293ce28b883d8ba0c497e968 \
> +           file://arg_parser.cc;startline=1;endline=18;md5=41d1341d0d733a5d24b26dc3cbc1ac42
> +
> +See the section :ref:`package_specific_variables` for more information about
> +the syntax of those two variables.
> +
> +The MD5 sum for a block of lines can be generated with sed's ``p`` (print)
> +command applied to a range of lines.
> +For the example above, lines 1 to 16 of main.cc would be::
> +
> +   $ sed -n 1,16p main.cc | md5sum -
> +   a01d61d3293ce28b883d8ba0c497e968
> +
> +Always include the copyright statement ("Copyright YYYY (C) Some Person")
> +for the calculation of the checksum, even if it means that the checksum changes
> +on package updates when new years are added to the string.
> +While it is not is needed for most licenses to be valid, some licenses require
> +that it must not be removed (e.g. see GPLv2, section 1),
> +and it is proper etiquette to give attribution to the maintainers in the
> +license report document.
> +
> +If additional information is in the README or license headers in source files
> +are used, also include these files (for source code: one of each is enough),
> +but use md5sum only on the relevant lines, so changes in the rest of the file
> +do not appear as license changes.
> +
> +For rather chaotic directories with lots of license files, definitely include at
> +least one relevant source file with license headers (if there are any), as some
> +developers tend to accumulate license files without adjusting it to license
> +changes in their source.
> +
> +.. note::
> +
> +   For each single license identifier in the license expression, include at
> +   least one file with checksum in the ``<PKG>_LICENSE_FILES`` variable.
> +
> +PTXdist will include all files (or their respective lines) that were referenced
> +in ``<PKG>_LICENSE_FILES`` as verbatim sources in the license report.
> diff --git a/doc/dev_manual.rst b/doc/dev_manual.rst
> index e9a88c1a97f5..fe4307a86b80 100644
> --- a/doc/dev_manual.rst
> +++ b/doc/dev_manual.rst
> @@ -15,6 +15,7 @@ This chapter shows all (or most) of the details of how PTXdist works.
>     dev_patching
>     dev_add_bin_only_files
>     dev_create_new_pkg_templates
> +   dev_licenses
>     dev_layers_in_ptxdist
>     dev_kconfig_diffs
>     dev_code_signing
> diff --git a/doc/ref_make_variables.rst b/doc/ref_make_variables.rst
> index 674acdcea982..2ee34856dd02 100644
> --- a/doc/ref_make_variables.rst
> +++ b/doc/ref_make_variables.rst
> @@ -127,6 +127,8 @@ Other useful variables:
>    that are built and installed during the PTXdist build run.
>    There are analogous ``-y`` and ``-m`` variants of those variables too.
>  
> +.. _package_specific_variables:
> +
>  Package Specific Variables
>  ~~~~~~~~~~~~~~~~~~~~~~~~~~
>  
> @@ -223,10 +225,19 @@ Package Definition
>    'gdbserver' for an example.
>  
>  ``<PKG>_LICENSE``
> -  The license of the package. The SPDX license identifiers should be used
> -  here. Use ``proprietary`` for proprietary packages and ``ignore`` for
> -  packages without their own license, e.g. meta packages or packages that
> -  only install files from ``projectroot/``.
> +  The license of the package in the form of an `SPDX license expression
> +  <https://spdx.org/licenses/>`_.
> +  The following values have special meaning for PTXdist:
> +
> +  - ``custom`` and ``custom-exception``: for licenses or license exceptions
> +    that are considered free software, but do not match any license or license
> +    exception known to SPDX.
> +  - ``proprietary``: for proprietary (non-free) packages
> +  - ``ignore`` for packages without their own license, e.g. meta packages or
> +    packages that only install files from ``projectroot/``
> +  - ``unknown``: no licensing information was extracted yet
> +
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  ``<PKG>_LICENSE_FILES``
>    A space separated list of URLs of license text files. The URLs must be
> @@ -238,6 +249,7 @@ Package Definition
>    used in case the specified file contains more than just the license text,
>    e.g. if the license is in the header of a source file. For non ASCII or
>    UTF-8 files the encoding can be specified with ``encoding=<enc>``.
> +  See the section :ref:`licensing_in_packages` for more information.
>  
>  For most packages the variables described above are undefined by default.
>  However, for cross and host packages these variables default to the value

_______________________________________________
ptxdist mailing list
ptxdist@pengutronix.de
To unsubscribe, send a mail with subject "unsubscribe" to ptxdist-request@pengutronix.de

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-10-07 10:18 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-11 10:03 [ptxdist] [PATCH 1/2] doc: ref_make_variables: link to the SPDX license list Roland Hieber
2020-05-11 10:03 ` [ptxdist] [PATCH 2/2] doc: working with licensing information in packages Roland Hieber
2020-05-26 10:29   ` Roland Hieber
2020-05-26 11:12   ` Alexander Dahl
2020-05-29  6:23   ` Michael Olbrich
2020-05-29  8:27     ` Roland Hieber
2020-05-29  8:55       ` Michael Olbrich
2020-05-29  9:40         ` Roland Hieber
2020-05-29 12:03           ` Michael Olbrich
2020-05-31 19:56             ` Roland Hieber
2020-06-02 13:16               ` Michael Olbrich
2020-06-02 15:14                 ` Roland Hieber
2021-06-08 10:36 ` [ptxdist] [PATCH] " Roland Hieber
2021-06-16 14:19   ` Michael Olbrich
2021-06-16 14:40     ` Roland Hieber
2021-08-05  9:18     ` [ptxdist] [PATCH v3] " Roland Hieber
2021-08-06  6:29       ` Michael Olbrich
2021-08-06 10:44         ` [ptxdist] [PATCH] " Roland Hieber
2021-10-07 10:18           ` [ptxdist] [APPLIED] " Michael Olbrich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox