<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:633369105;
mso-list-type:hybrid;
mso-list-template-ids:-1281319240 -2083731880 134807555 134807557 134807553 134807555 134807557 134807553 134807555 134807557;}
@list l0:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:Calibri;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l1
{mso-list-id:826438811;
mso-list-template-ids:-2074417694;}
@list l1:level1
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level2
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:72.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:108.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:144.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level5
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:180.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:216.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:252.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level8
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:288.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l1:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:324.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style>
</head>
<body lang="en-RO" link="#0563C1" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US">Hi all,</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">This RFC is a continuation of a longer kernel patch thread</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"><a href="https://lkml.org/lkml/2021/3/8/677">https://lkml.org/lkml/2021/3/8/677</a> where we originally thought such</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">a mechanism belongs. Ultimately, consensus there was that this mechanism</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">would be better suited in userspace, so systemd was an obvious first choice.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Current proposal:</span><o:p></o:p></p>
<ul style="margin-top:0cm" type="disc">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo3"><span lang="EN-US">As GitHub Issue here:
<a href="https://github.com/systemd/systemd/issues/19269">https://github.com/systemd/systemd/issues/19269</a></span><o:p></o:p></li><li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo3"><span lang="EN-US">An example PoC here:
<a href="https://github.com/acatangiu/sysgenid-dbus">https://github.com/acatangiu/sysgenid-dbus</a></span><o:p></o:p></li><li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo3"><span lang="EN-US">Described in this email as follows:</span><o:p></o:p></li></ul>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"># SysGenID: a system generation id provider</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">## Background and problem</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The System Generation ID feature is required in virtualized or</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">containerized environments by applications that work with local copies</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">or caches of world-unique data such as random values, uuids,</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">monotonically increasing counters, cryptographic nonces, etc.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Such applications can be negatively affected by VM or container</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">snapshotting when the VM or container is either cloned or returned to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">an earlier point in time.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Solving the uniqueness problem strongly enough for cryptographic</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">purposes requires a mechanism which can deterministically reseed</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">userspace PRNGs with new entropy at restore time. This mechanism must</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">also support the high-throughput and low-latency use-cases that led</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">programmers to pick a userspace PRNG in the first place; be usable by</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">both application code and libraries; allow transparent retrofitting</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">behind existing popular PRNG interfaces without changing application</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">code; be efficient, especially on snapshot restore; and be</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">simple enough for wide adoption.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">## Solution</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Introduce a mechanism that standardizes an API for</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">applications and libraries to be made aware of uniqueness-breaking</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">events such as VM or container snapshotting, and allow them to react</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">and adapt to such events.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The System Generation ID is meant to help in these scenarios by</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">providing a monotonically increasing u32 counter that changes each time</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">the VM or container is restored from a snapshot.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The `sysgenid` service exposes a monotonically increasing System Generation</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">u32 counter via the DBus interface `com.RFC.sysgenid`, accessible at the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">object path `/com/RFC/sysgenid`. It provides asynchronous SysGen</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">counter update notifications, as well as counter retrieval and</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">confirmation mechanisms.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The counter starts from zero when the service is started and</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">monotonically increments every time the system generation changes.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Userspace applications or libraries can (a)synchronously consume the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">system generation counter through the provided DBus interface, to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">make any necessary internal adjustments following a system generation</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">update.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The provided DBus interface operations can be used to build a</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">system level safe workflow that guest software can follow to protect</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">itself from negative system snapshot effects.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">System generation changes are driven by userspace software through a</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">dedicated DBus method.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">### Warning</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">SysGenID alone does not guarantee complete snapshot</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">safety to applications using it. A certain workflow needs to be</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">followed at the system level, in order to make the system</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">snapshot-resilient. Please see the "Snapshot Safety Prerequisites"</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">section below.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">## SysGenID DBus interface</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">#### Terminology</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `watcher` - a client using the SysGenID service _watching_ for system generation changes.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `untracked watcher` - default state for all clients. For a client to be tracked it has</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> to explicitly opt-in by confirming back to the service the correct _system generation</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> counter_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `tracked watcher` - a client that is tracked by the service. Such a watcher is considered</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> `up-to-date` only after confirming back to the service the correct</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> _system generation counter_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> Once tracked, a client is only _untracked_ when closing its connection to the DBus bus.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `outdated watcher` - a _tracked_ client whose tracking has lived through a system</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> generation change, but has not (yet) confirmed back to the service the correct _system</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> generation counter_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">**Methods:**</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `GetSysGenCounter` - returns latest system generation counter.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `AckWatcherCounter` - marks the calling client/watcher as tracked. It is also</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> used by the watcher to confirm/ack the correct _sys gen counter_ to the service after</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> every generation change, so the service can correctly track it as `outdated` or</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> `up-to-date`.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> Returns an error if the client/watcher confirms/acks the wrong _sys gen counter_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `CountOutdatedWatchers` - returns the current number of</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> _outdated tracked watchers_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> A value of `zero` can be interpreted as the system being fully re-adjusted after a</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> generation change.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `TriggerSysGenUpdate` - triggers a generation update (should be a privileged operation).</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">**Signals:**</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `NewSystemGeneration` - system generation change notification, also carries new</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> _sys gen counter_.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">- `SystemReady` - notification sent out when all tracked watchers have _acked_ the new</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> _sys gen counter_. In other words, when all tracked software has adjusted to the new</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> environment.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The service can keep track of watchers by DBus connections</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">(`org.freedesktop.DBus.NameOwnerChanged`).</span><o:p></o:p></p>
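<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">To make the tracking rules above concrete, here is a minimal in-memory Python</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">sketch of the bookkeeping. It is a model for discussion, not the PoC's</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">implementation: the class and method names are hypothetical, the `min_gen`</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">argument of `TriggerSysGenUpdate` is not modeled, and emitting `SystemReady`</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">right away when no tracked watcher is outdated is an assumption.</span><o:p></o:p></p>
<pre>
```python
class WrongCounterError(Exception):
    """Raised when a watcher acks a counter that is not the current one."""


class SysGenModel:
    """In-memory model of the watcher bookkeeping (not the real service)."""

    def __init__(self):
        self.counter = 0   # starts from zero when the service starts
        self.acked = {}    # tracked watcher id -> last counter it acked
        self.signals = []  # emitted signals, recorded for inspection

    def get_sysgen_counter(self):
        return self.counter

    def count_outdated_watchers(self):
        return sum(1 for c in self.acked.values() if c != self.counter)

    def _maybe_ready(self, was_outdated):
        # SystemReady fires when the last outdated tracked watcher is gone.
        if was_outdated and self.count_outdated_watchers() == 0:
            self.signals.append(("SystemReady",))

    def ack_watcher_counter(self, watcher, watcher_counter):
        # Acking the wrong counter is an error, as the text specifies.
        if watcher_counter != self.counter:
            raise WrongCounterError(watcher_counter)
        was_outdated = self.count_outdated_watchers() > 0
        self.acked[watcher] = watcher_counter  # first ack opts in to tracking
        self._maybe_ready(was_outdated)
        return self.counter

    def trigger_sysgen_update(self):
        # Assumption: a plain increment; the min_gen argument is not modeled.
        self.counter += 1
        self.signals.append(("NewSystemGeneration", self.counter))
        # Assumption: with nothing to wait for, SystemReady fires at once.
        if self.count_outdated_watchers() == 0:
            self.signals.append(("SystemReady",))

    def disconnect(self, watcher):
        # A tracked client becomes untracked only by leaving the bus.
        was_outdated = self.count_outdated_watchers() > 0
        self.acked.pop(watcher, None)
        self._maybe_ready(was_outdated)


svc = SysGenModel()
svc.ack_watcher_counter("app-a", svc.get_sysgen_counter())  # opt in
svc.ack_watcher_counter("app-b", 0)                         # opt in
svc.trigger_sysgen_update()
print(svc.count_outdated_watchers())  # 2: both tracked watchers are outdated
svc.ack_watcher_counter("app-a", 1)
svc.ack_watcher_counter("app-b", 1)
print(svc.signals)  # NewSystemGeneration(1), then SystemReady
```
</pre>
<p class="MsoNormal"><span lang="EN-US">In this model an untracked client is simply one that never appears in the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">`acked` table, so it can never hold back `SystemReady`.</span><o:p></o:p></p>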
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">**Exported read-only file used for memory mappings:**</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The service also exports the current _sys gen counter_ through a simple file.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The file contains only 4 bytes of data at offset 0, representing the u32 value</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">of the system generation counter.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">This file is meant to be mapped by other software in the system and be used as</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">a low-latency generation counter probe mechanism in critical sections.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">This mmap() interface is targeted at libraries or code that needs to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">check for generation changes in-line, where an event loop is not</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">available or in cases where DBus calls are too expensive.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">In such cases, logic can be added in-line with the sensitive code to check the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">counter and trigger on-demand/just-in-time readjustments when changes are</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">detected on the memory mapped file.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Users of this interface that plan to lazily adjust most likely don't need to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">also use the DBus interface, since tracking or waiting on them doesn't make sense.</span><o:p></o:p></p>
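<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">A rough Python sketch of this lazy, mmap-based pattern follows. The file</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">path, the native byte order of the u32 and all helper names are assumptions</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">made for illustration; a temporary file stands in for the file exported by</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">the service.</span><o:p></o:p></p>
<pre>
```python
import mmap
import os
import struct
import tempfile
import uuid


class GenerationProbe:
    """Read-only mapping of a 4-byte generation counter file."""

    def __init__(self, path):
        fd = os.open(path, os.O_RDONLY)
        try:
            self._map = mmap.mmap(fd, 4, access=mmap.ACCESS_READ)
        finally:
            os.close(fd)  # the mapping stays valid after the fd is closed

    def read(self):
        # Assumption: the u32 at offset 0 is stored in native byte order.
        return struct.unpack_from("=I", self._map, 0)[0]


class CachedUniqueId:
    """Caches a UUID and regenerates it just-in-time on a generation change."""

    def __init__(self, probe):
        self._probe = probe
        self._seen = probe.read()
        self._value = uuid.uuid4()

    def get(self):
        current = self._probe.read()
        if current != self._seen:       # generation changed since last use
            self._value = uuid.uuid4()  # drop the stale copy
            self._seen = current
        return self._value


def write_counter(path, value):
    # Stand-in for the service updating its exported file.
    with open(path, "r+b") as f:
        f.write(struct.pack("=I", value))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sysgenid")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(struct.pack("=I", 0))
    ident = CachedUniqueId(GenerationProbe(path))
    before = ident.get()
    same = ident.get()       # no generation change: cached value is reused
    write_counter(path, 1)   # simulate a generation bump
    after = ident.get()      # change detected: value is regenerated
```
</pre>
<p class="MsoNormal"><span lang="EN-US">The check costs one memory read per call, which is what makes it usable</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">in-line where a DBus round trip would be too expensive.</span><o:p></o:p></p>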
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">### Service interface DBus XML specification</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">```xml</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"><node name="/com/RFC/sysgenid"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <interface name="com.RFC.sysgenid"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <method name="AckWatcherCounter"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="watcher_counter" type="u" direction="in"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="sysgen_counter" type="u" direction="out"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </method></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <method name="CountOutdatedWatchers"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="outdated_watchers" type="u" direction="out"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </method></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <method name="GetSysGenCounter"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="sysgen_counter" type="u" direction="out"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </method></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <method name="TriggerSysGenUpdate"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="min_gen" type="u" direction="in"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </method></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <signal name="NewSystemGeneration"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="sysgen_counter" type="u"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </signal></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <signal name="SystemReady"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </signal></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </interface></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <interface name="org.freedesktop.DBus.Introspectable"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <method name="Introspect"></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> <arg name="xml_data" type="s" direction="out"/></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </method></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </interface></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"></node></span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">```</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">## Snapshot Safety Prerequisites and Example</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">If VM, container or other system-level snapshots happen asynchronously,</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">at arbitrary times during an active workload, there is no practical way</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">to ensure that in-flight local copies or caches of world-unique data</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">such as random values, secrets, UUIDs, etc. are properly scrubbed and</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">regenerated.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The challenge stems from the fact that the categorization of data as</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">snapshot-sensitive is only known to the software working with it, and</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">this software has no logical control over the moment in time when an</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">external system snapshot occurs.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Let's take an OpenSSL session token for example. Even if the library</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">code is made 100% snapshot-safe, meaning the library guarantees that</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">the session token is unique (any snapshot that happened during the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">library call did not duplicate or leak the token), the token is still</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">vulnerable to snapshot events while it transits the various layers of</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">the library caller, then the various layers of the OS before leaving</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">the system.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">To catch a secret while it's in-flight, we'd have to validate system</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">generation at every layer, every step of the way. Even if that would</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">be deemed the right solution, it would be a long road and a whole</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">universe to patch before we get there.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Bottom line is we don't have a way to track all of these in-flight</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">secrets and dynamically scrub them from existence with snapshot</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">events happening arbitrarily.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">### Simplifying assumption - safety prerequisite</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">**Control the snapshot flow**, disallow snapshots coming at arbitrary</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">moments in the workload lifetime.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Use a system-level overseer entity that quiesces the system before</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">snapshot, and post-snapshot-resume oversees that software components</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">have readjusted to the new environment and the new generation. Only then</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">will the overseer un-quiesce the system and allow active workloads.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">Software components can choose whether they want to be tracked and</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">waited on by the overseer by marking themselves as tracked</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">watchers.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The sysgenid service standardizes the API for system software to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">find out about needing to readjust and at the same time provides a</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">mechanism for the overseer entity to wait for everyone to be done and the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">system to have readjusted, so it can un-quiesce.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">### Example snapshot-safe workflow</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">1) Before taking a snapshot, quiesce the VM/container/system. Exactly</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> how this is achieved is very workload-specific, but the general</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> description is to get all software to an expected state where their</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> event loops dry up and they are effectively quiesced.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">2) Take snapshot.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">3) Resume the VM/container/system from said snapshot.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">4) The overseer triggers a generation bump using the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> `TriggerSysGenUpdate` method.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">5) Software components which have the DBus `NewSystemGeneration` signal in</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> their event loops are notified of the generation change.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> They do their specific internal adjustments. Some may have chosen to</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> be tracked and waited on by the overseer, others might choose to do</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> their adjustments out of band and not block the overseer.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> Tracked ones *must* signal when they are done/ready by confirming the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> new sys gen counter using the `AckWatcherCounter` DBus method.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">6) Overseer will block and wait for all tracked watchers by waiting on</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> the `SystemReady` DBus signal. Once all tracked watchers are done</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> in step 5, the signal is sent by `sysgenid` service and overseer will</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> know that the system has readjusted and is ready for active workload.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">7) Overseer un-quiesces system.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">8) There is a class of software, usually libraries, most notably PRNGs</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> or SSLs, that don't fit the event-loop model and also have strict</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> latency requirements. These can take advantage of the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> _exported read-only file used for memory mappings_. They can map the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> file and check sys gen counter value in-line with the critical section</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> and can do so with low latency. When they are called after un-quiesce,</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> they can just-in-time adjust based on the updated mapped value.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> For a well-designed service stack, these libraries should not be</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> called while system is quiesced. When workload is resumed by the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> overseer, on the first call into these libs, they will safely JIT</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> readjust.</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> Users of this lazy on-demand readjustment model should not use the</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> DBus interface or at least not enable watcher tracking since doing so</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> would introduce a logical deadlock:</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> lazy adjustments happen only after un-quiesce, but un-quiesce is</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US"> blocked until all tracked watchers are up-to-date.</span><o:p></o:p></p>
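<p class="MsoNormal"><span lang="EN-US"> </span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">The overseer side of steps 4-7, and the deadlock described in step 8, can be</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">sketched in Python as follows. `ServiceStub` and `overseer_resume` are</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">hypothetical stand-ins, not part of the PoC; watchers are acked sequentially</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">here, while real watchers would react to the `NewSystemGeneration` signal in</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">their own event loops.</span><o:p></o:p></p>
<pre>
```python
import threading


class ServiceStub:
    """Tiny stand-in for the service: counter, tracked watchers, SystemReady."""

    def __init__(self, tracked):
        self.counter = 0
        self.acked = {name: 0 for name in tracked}  # all start up-to-date
        self.system_ready = threading.Event()

    def trigger_sysgen_update(self):
        self.system_ready.clear()
        self.counter += 1  # every tracked watcher is now outdated

    def ack_watcher_counter(self, name, counter):
        if counter != self.counter:
            raise ValueError("wrong sys gen counter")
        self.acked[name] = counter
        if all(c == self.counter for c in self.acked.values()):
            self.system_ready.set()  # models the SystemReady signal


def overseer_resume(service, eager_watchers, timeout=0.05):
    """Steps 4-7: bump the generation, wait for SystemReady, un-quiesce."""
    service.trigger_sysgen_update()              # step 4
    for name in eager_watchers:                  # step 5: readjust, then ack
        service.ack_watcher_counter(name, service.counter)
    ready = service.system_ready.wait(timeout)   # step 6
    return "un-quiesced" if ready else "stuck: still quiesced"  # step 7


# Every tracked watcher readjusts eagerly: the overseer can un-quiesce.
ok = overseer_resume(ServiceStub(["db", "cache"]), ["db", "cache"])

# A lazy library wrongly registered as tracked only acks on its first call
# after un-quiesce, which never comes: the overseer times out.
stuck = overseer_resume(ServiceStub(["db", "lazy-prng"]), ["db"])

print(ok)
print(stuck)
```
</pre>
<p class="MsoNormal"><span lang="EN-US">The timeout exists only so the sketch terminates; it stands in for an</span><o:p></o:p></p>
<p class="MsoNormal"><span lang="EN-US">overseer that would otherwise wait forever on `SystemReady`.</span><o:p></o:p></p>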
</div>
</body>
</html>