From: Juergen Borleis
Date: Fri, 28 Jun 2019 09:43:34 +0200
Message-Id: <20190628074343.6700-2-jbe@pengutronix.de>
Subject: [ptxdist] [PATCH v2 01/10] rootfs: keep /var writable, even if the rootfs is read-only
List-Id: PTXdist Development Mailing List
Reply-To: ptxdist@pengutronix.de
To: ptxdist@pengutronix.de

Having a read-only root filesystem is always a source of pain and trouble.
Many applications and tools expect to be able to store their state or
caching data or at least their logs somewhere in the filesystem.

The '/var' directory tree has a well-known structure according to the
"Filesystem Hierarchy Standard" and is used by all carefully designed
programs. Thus, this change provides a way to have this '/var' directory
tree writable, even if the main root filesystem is mounted read-only.

It uses an overlay filesystem and by default a RAM disk to store changed
and added data to this directory tree in a non-persistent manner. Due to
the nature of the overlay filesystem the underlying files from the main
root filesystem can still be accessed.

This approach requires the overlay filesystem support from the Linux
kernel. In order to use it, the feature CONFIG_OVERLAY_FS must be enabled.
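
To check these prerequisites on a running target, a minimal sketch could look
like the following. The helper names 'has_overlayfs' and 'kernel_at_least' are
made up for illustration and are not part of this patch. The 4.18 minimum is
the one the documentation added by this patch names for the default
'var.mount' options.

```shell
#!/bin/sh
# Sketch only: helper names and the default path are assumptions, not part
# of the patch.

# Succeeds if the 'overlay' filesystem type is listed in the given file
# (default: /proc/filesystems). A modular overlayfs that has not been
# loaded yet is not listed there.
has_overlayfs() {
	grep -qE '(^|[[:space:]])overlay$' "${1:-/proc/filesystems}" 2>/dev/null
}

# Succeeds if kernel release $3 (default: 'uname -r') is at least $1.$2.
kernel_at_least() {
	release="${3:-$(uname -r)}"
	major="${release%%.*}"
	rest="${release#*.}"
	minor="${rest%%[!0-9]*}"
	[ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}

if has_overlayfs; then
	echo "overlayfs: available"
else
	echo "overlayfs: not listed (CONFIG_OVERLAY_FS disabled or module not loaded)"
fi
```

A BSP start-up check could call 'kernel_at_least 4 18' in the same way and
fall back to a locally adapted 'var.mount' if it fails.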

The ugly details to establish the required overlaying filesystem are hidden
behind a "mount helper" for a dummy filesystem (here called 'varoverlayfs').
Thus, a BSP can change the overlaying filesystem by providing its own
'run-varoverlayfs.mount' in order to restrict the default RAM disk
differently or to switch to different local storage.

This change also touches '/etc/fstab': the RAM disks already defined there
are now enabled only if the overlay approach is not used, which keeps
backward compatibility for that case.

Signed-off-by: Juergen Borleis
---
 doc/daily_work.inc                            | 101 ++++++++++++++++++
 projectroot/etc/fstab                         |   6 +-
 .../lib/systemd/system/run-varoverlayfs.mount |   9 ++
 projectroot/usr/lib/systemd/system/var.mount  |  11 ++
 projectroot/usr/sbin/mount.varoverlayfs       |  11 ++
 rules/rootfs.in                               |  66 +++++++-----
 rules/rootfs.make                             |  19 +++-
 7 files changed, 191 insertions(+), 32 deletions(-)
 create mode 100644 projectroot/usr/lib/systemd/system/run-varoverlayfs.mount
 create mode 100644 projectroot/usr/lib/systemd/system/var.mount
 create mode 100644 projectroot/usr/sbin/mount.varoverlayfs

diff --git a/doc/daily_work.inc b/doc/daily_work.inc
index 74da11953..6f1525aec 100644
--- a/doc/daily_work.inc
+++ b/doc/daily_work.inc
@@ -1371,3 +1371,104 @@ in the build machine's filesystem also for the target filesystem image. With a
 different ``umask`` than ``0022`` at build-time this may fail badly at run-time
 with strange erroneous behaviour (for example some daemons with regular user
 permissions cannot acces their own configuration files).
+
+Read Only Filesystem
+--------------------
+
+A system can run a read-only root filesystem in order to have a unit which
+can be powered off at any time, without any previous shutdown sequence.
+
+But many applications and tools are still expecting a writable filesystem to
+temporarily store some kind of data or logging information for example. All
+these write attempts will fail and thus, the applications and tools will fail,
+too.
+
+According to the *Filesystem Hierarchy Standard 2.3* the directory tree in
+``/var/`` is traditionally writable and its content is persistent across system
+restarts. Thus, this directory tree is used by most applications and tools to
+store their data.
+
+The *Filesystem Hierarchy Standard 2.3* defines the following directories
+below ``/var/``:
+
+- ``cache/``: Application specific cache data
+- ``crash/``: System crash dumps
+- ``lib/``: Application specific variable state information
+- ``lock/``: Lock files
+- ``log/``: Log files and directories
+- ``run/``: Data relevant to running processes
+- ``spool/``: Application spool data
+- ``tmp/``: Temporary files preserved between system reboots
+
+Although this writable directory tree is useful and valid for full-blown host
+machines, an embedded system can behave differently here: for example, its
+requirements may drop the persistence of changed data across reboots, so that
+the system always starts with empty directories.
+
+Partial RAM Disks
+~~~~~~~~~~~~~~~~~
+
+This is the default behaviour of PTXdist: it mounts a couple of RAM disks over
+directories in ``/var`` expected to be writable by various applications and
+tools. These RAM disks always start in an empty state and are defined as follows:
+
++-------------+---------------------------------------------------------------+
+| mount point | mount options                                                 |
++=============+===============================================================+
+| /var/log    | nosuid,nodev,noexec,mode=0755,size=10%                        |
++-------------+---------------------------------------------------------------+
+| /var/lock   | nosuid,nodev,noexec,mode=0755,size=1M                         |
++-------------+---------------------------------------------------------------+
+| /var/tmp    | nosuid,nodev,mode=1777,size=20%                               |
++-------------+---------------------------------------------------------------+
+
+This is a very simple and optimistic approach and works for surprisingly many use
+cases. But some applications expect a writable ``/var/lib`` and will fail with
+this setup. Using an additional RAM disk for ``/var/lib`` might not help in
+this use case, because it will hide all build-time generated data already present
+in this directory tree (package pre-defined configuration files for example).
+
+Overlay RAM Disk
+~~~~~~~~~~~~~~~~
+
+A different approach to have a writable ``/var`` without persistence is to use
+a so-called *overlay filesystem*. This *overlay filesystem* is a transparent
+writable layer on top of a read-only filesystem. After the system's start the
+*overlay filesystem layer* is empty and all reads will be satisfied by the
+underlying read-only filesystem. Writes (new files, directories, changes of
+existing files) are stored in the *overlay filesystem layer*, and subsequent
+reads of that data are satisfied by this layer instead of the underlying
+read-only filesystem.
+
+PTXdist supports this use case: enable the *overlay* feature for the
+``/var`` directory in its configuration menu:
+
+.. code-block:: text
+
+    Root Filesystem --->
+        directories in rootfs --->
+            /var --->
+                [*] overlay '/var' with RAM disk
+
+Keep in mind: this approach just enables write support for the ``/var`` directory
+tree. Nothing stored or changed in there at run-time is persistent; it is
+always lost when the system restarts. Each additional RAM disk also consumes
+additional main memory, and if applications and tools fill up the directory
+tree in ``/var`` the machine might run short on memory and slow down
+dramatically.
+
+Thus, it is a good idea to check the amount of data written by applications and
+tools to the ``/var`` directory tree and limit it by default.
+You can limit the size of the *overlay filesystem* RAM disk as well. For this
+you can provide your own
+``projectroot/usr/lib/systemd/system/run-varoverlayfs.mount`` with restrictive
+settings. The applications and tools in use must then handle the
+"no space left on device" error correctly.
+
+This *overlay filesystem* approach requires the *overlay filesystem feature*
+from the Linux kernel. In order to use it, the feature CONFIG_OVERLAY_FS must
+be enabled. A mount option used by the default
+``projectroot/usr/lib/systemd/system/var.mount`` unit requires Linux 4.18 or
+newer.
+If your kernel does not meet this requirement you can provide your own local
+and adapted variant of the mentioned mount unit.
diff --git a/projectroot/etc/fstab b/projectroot/etc/fstab
index 0121c3076..364b495a9 100644
--- a/projectroot/etc/fstab
+++ b/projectroot/etc/fstab
@@ -11,6 +11,6 @@ debugfs /sys/kernel/debug debugfs noauto 0 0
 # ramdisks
 tmpfs /tmp tmpfs nosuid,nodev,mode=1777,size=20% 0 0
 tmpfs /run tmpfs nosuid,nodev,strictatime,mode=0755 0 0
-tmpfs /var/log tmpfs nosuid,nodev,noexec,mode=0755,size=10% 0 0
-tmpfs /var/lock tmpfs nosuid,nodev,noexec,mode=0755,size=1M 0 0
-tmpfs /var/tmp tmpfs nosuid,nodev,mode=1777,size=20% 0 0
+@VAR_OVERLAYFS@tmpfs /var/log tmpfs nosuid,nodev,noexec,mode=0755,size=10% 0 0
+@VAR_OVERLAYFS@tmpfs /var/lock tmpfs nosuid,nodev,noexec,mode=0755,size=1M 0 0
+@VAR_OVERLAYFS@tmpfs /var/tmp tmpfs nosuid,nodev,mode=1777,size=20% 0 0
diff --git a/projectroot/usr/lib/systemd/system/run-varoverlayfs.mount b/projectroot/usr/lib/systemd/system/run-varoverlayfs.mount
new file mode 100644
index 000000000..c067b9b96
--- /dev/null
+++ b/projectroot/usr/lib/systemd/system/run-varoverlayfs.mount
@@ -0,0 +1,9 @@
+[Unit]
+Description=Overlay for '/var'
+Before=local-fs.target
+
+[Mount]
+Where=/run/varoverlayfs
+What=tmpfs
+Type=tmpfs
+Options=size=20%
diff --git a/projectroot/usr/lib/systemd/system/var.mount b/projectroot/usr/lib/systemd/system/var.mount
new file mode 100644
index 000000000..bd6350237
--- /dev/null
+++ b/projectroot/usr/lib/systemd/system/var.mount
@@ -0,0 +1,11 @@
+[Unit]
+Description=Writable support for '/var'
+After=run-varoverlayfs.mount
+Before=local-fs.target
+
+[Mount]
+Where=/var
+# note: this is a dummy filesystem only to trigger the corresponding mount helper
+What=varoverlayfs
+Type=varoverlayfs
+Options=metacopy=on
diff --git a/projectroot/usr/sbin/mount.varoverlayfs b/projectroot/usr/sbin/mount.varoverlayfs
new file mode 100644
index 000000000..f8fc8c88f
--- /dev/null
+++ b/projectroot/usr/sbin/mount.varoverlayfs
@@ -0,0 +1,11 @@
+#!/bin/sh -e
+# Mount helper tool to mount some kind of writable filesystem over '/var'
+# (which might be read-only).
+# What kind of filesystem is used to mount over '/var' can be controlled via
+# the 'run-varoverlayfs.mount' mount unit and is usually a RAM disk.
+
+mkdir -p /run/varoverlayfs/upper
+mkdir -p /run/varoverlayfs/work
+mount -t overlay -olowerdir=/var,upperdir=/run/varoverlayfs/upper,workdir=/run/varoverlayfs/work "${@}"
+systemctl stop run-varoverlayfs.mount
+rmdir /run/varoverlayfs
diff --git a/rules/rootfs.in b/rules/rootfs.in
index f105dc477..f9951ffec 100644
--- a/rules/rootfs.in
+++ b/rules/rootfs.in
@@ -171,76 +171,90 @@ config ROOTFS_TMP

 menu "/var "

+config ROOTFS_VAR_OVERLAYFS
+	bool
+	prompt "overlay '/var' with RAM disk"
+	depends on INITMETHOD_SYSTEMD
+	help
+	  This lets the whole '/var' content be writable transparently via an
+	  'overlayfs'.
+	  Reading content happens from the underlying root filesystem, while
+	  changed content gets stored into a RAM disk instead. This enables all
+	  applications to read initial data (configuration files for example)
+	  and lets them change this data even if the root filesystem is read-only.
+	  Due to this behavior all changes made at run-time aren't persistent
+	  by default.
+	  Read documentation chapter 'Read Only Filesystem' for further details.
+	  In order to use the default mount units and mount options, you need
+	  to enable the 'mkdir' and 'rmdir' commands (from 'coreutils' or
+	  'busybox') and use a Linux kernel 4.18 or newer. By replacing the
+	  default files in
+	  'projectroot/usr/lib/systemd/system/run-varoverlayfs.mount',
+	  'projectroot/usr/lib/systemd/system/var.mount' and
+	  'projectroot/usr/sbin/mount.varoverlayfs' by your own variants,
+	  you can adapt these requirements.
+
 config ROOTFS_VAR_RUN
 	bool
 	select ROOTFS_RUN
 	prompt "/var/run"
 	default y
 	help
-	  This will not create a directory but a symlink to /run.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  Ensure a '/var/run' directory is available at run-time. This will
+	  always be a symlink to '/run'.

 config ROOTFS_VAR_LOG
 	bool
 	prompt "/var/log"
 	default y
 	help
-	  Create a /var/log directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  This directory is intended for log files and directories. Say 'y' here
+	  to ensure a '/var/log' directory is available at run-time.

 config ROOTFS_VAR_LOCK
 	bool
 	prompt "/var/lock"
 	default y
 	help
-	  Create a /var/lock directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  This directory is intended for application lock files. Say 'y' here
+	  to ensure a '/var/lock' directory is available at run-time.

 config ROOTFS_VAR_LIB
 	bool
 	prompt "/var/lib"
 	help
-	  Create a /var/lib directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
-	  If you are going to run an NFS server with file locking
-	  support this folder must be persistent!
+	  This directory is intended for application variable state information.
+	  Say 'y' here to ensure a '/var/lib' directory is available at
+	  run-time.

 config ROOTFS_VAR_CACHE
 	bool
 	prompt "/var/cache"
 	help
-	  Create a /var/cache directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  This directory is intended for application cache data. Say 'y' here
+	  to ensure a '/var/cache' directory is available at run-time.

 config ROOTFS_VAR_SPOOL
 	bool
 	prompt "/var/spool"
 	help
-	  Create a /var/spool directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  This directory is intended for application spool data. Say 'y' here to
+	  ensure a '/var/spool' directory is available at run-time.

 config ROOTFS_VAR_SPOOL_CRON
 	bool
 	prompt "/var/spool/cron"
 	help
-	  Create a /var/spool/cron directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  Create a '/var/spool/cron' directory in the root filesystem.

 config ROOTFS_VAR_TMP
 	bool
 	prompt "/var/tmp"
 	default y
 	help
-	  Create a /var/tmp directory in the root filesystem.
-	  Unless you want to mount a tmpfs on /var you should
-	  say yes here.
+	  This directory is intended for temporary files preserved between
+	  system reboots. Say 'y' here to ensure a '/var/tmp' directory is
+	  available at run-time.

 endmenu
 endif # ROOTFS
diff --git a/rules/rootfs.make b/rules/rootfs.make
index 7164521a8..d7b7eccdc 100644
--- a/rules/rootfs.make
+++ b/rules/rootfs.make
@@ -30,7 +30,7 @@ $(STATEDIR)/rootfs.targetinstall:
 	@$(call install_fixup, rootfs,PRIORITY,optional)
 	@$(call install_fixup, rootfs,SECTION,base)
 	@$(call install_fixup, rootfs,AUTHOR,"Robert Schwebel ")
-	@$(call install_fixup, rootfs,DESCRIPTION,missing)
+	@$(call install_fixup, rootfs,DESCRIPTION, "Filesystem Hierarchy Standard")

 #	#
 #	# install directories in rootfs
@@ -121,7 +121,18 @@ endif
 ifdef PTXCONF_ROOTFS_VAR_TMP
 	@$(call install_copy, rootfs, 0, 0, 01777, /var/tmp)
 endif
-
+ifdef PTXCONF_ROOTFS_VAR_OVERLAYFS
+	@$(call install_alternative, rootfs, 0, 0, 0644, \
+		/usr/lib/systemd/system/run-varoverlayfs.mount)
+	@$(call install_link, rootfs, ../run-varoverlayfs.mount, \
+		/usr/lib/systemd/system/local-fs.target.requires/run-varoverlayfs.mount)
+	@$(call install_alternative, rootfs, 0, 0, 0755, \
+		/usr/sbin/mount.varoverlayfs)
+	@$(call install_alternative, rootfs, 0, 0, 0644, \
+		/usr/lib/systemd/system/var.mount)
+	@$(call install_link, rootfs, ../var.mount, \
+		/usr/lib/systemd/system/local-fs.target.requires/var.mount)
+endif

 #	#
 #	# install files in rootfs
@@ -140,7 +151,9 @@ ifdef PTXCONF_ROOTFS_GSHADOW
 endif
 ifdef PTXCONF_ROOTFS_FSTAB
 	@$(call install_alternative, rootfs, 0, 0, 0644, /etc/fstab)
-endif
+	@$(call install_replace, rootfs, /etc/fstab, @VAR_OVERLAYFS@, \
+		$(call ptx/ifdef,PTXCONF_ROOTFS_VAR_OVERLAYFS,#))
+endif # PTXCONF_ROOTFS_FSTAB
 ifdef PTXCONF_ROOTFS_MTAB_FILE
 	@$(call install_alternative, rootfs, 0, 0, 0644, /etc/mtab)
 endif
-- 
2.20.1
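
The patch's documentation says a BSP can limit the overlay RAM disk by
providing its own 'projectroot/usr/lib/systemd/system/run-varoverlayfs.mount'.
A minimal sketch of creating such a BSP-local variant follows. The 'BSP_DIR'
variable and the 16M limit are assumptions chosen for illustration; the unit
content otherwise mirrors the default unit added by this patch.

```shell
#!/bin/sh -e
# Sketch only: BSP_DIR and the 16M size limit are illustrative assumptions.
BSP_DIR="${BSP_DIR:-.}"
unit_dir="${BSP_DIR}/projectroot/usr/lib/systemd/system"

mkdir -p "${unit_dir}"

# Same unit as the default one from this patch, but with a fixed size limit
# instead of 'size=20%'.
cat > "${unit_dir}/run-varoverlayfs.mount" <<'EOF'
[Unit]
Description=Overlay for '/var'
Before=local-fs.target

[Mount]
Where=/run/varoverlayfs
What=tmpfs
Type=tmpfs
Options=size=16M
EOF
```

With a fixed limit like this, applications writing below '/var' have to cope
with "no space left on device" once the limit is reached, as the
documentation in the patch points out.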