[RFC] Wayland Security Modules
Sloane, Brandon
bsloane at owlcyberdefense.com
Thu May 22 21:02:48 UTC 2025
> What's worse is that in the client side, there is no way to tell what failed. System calls can be traced to see their return values, and X11 error events exist, but Wayland has neither unless you disconnect the client.
For debug purposes, there are the compositor logs (and any other logging the WSM chooses to implement). A new global object to provide denial notifications should at least be good enough for debug purposes, although I do see the difficulties in making that mechanism useful for programmatic recovery.
> Could you explain more of your actual use cases? Maybe people would have better ideas how to solve them.
We are focused on cross domain and multi-level systems built using SELinux as a core part of their security architecture. For instance, we might have a workstation that is connected to sensitive networks A and B as well as the open internet. The security policy might dictate that there can be no data transfer between the internet and the other networks; and that data transfer between A and B is allowed only through trusted applications. The details of this policy are specified using SELinux.
For a bit of context on how SELinux works, every process, file, and other resource type is given a context consisting of (among other information): a domain (firefox_t, apache_t, etc.), a sensitivity level (s0, s1, ... sn), and 0 or more categories (c0 ... c1024). Multi-level processes may be given a low and high context to represent their access to everything between the two. In the above example, we might say that processes that talk to the internet run with a level of "s0", those with access to network A run with "s1:c0", those with access to network B run with "s1:c1", and a trusted multilevel process might run with "s0 - s1:c0.c1". If a file comes from network A, it would have a context containing "s1:c0", and so would only be accessible to processes containing category c0 and a sensitivity of s1 or greater.
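To make the level comparison above concrete, here is a minimal C sketch of the "contains the categories and has at least the sensitivity" rule. It is only an illustration under simplifying assumptions: the names mls_level, cat, and mls_dominates are hypothetical, categories are limited to c0..c63 so they fit in one bitmask, and real decisions would come from the SELinux policy rather than from code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified MLS level: one sensitivity plus a category bitmask.
 * Assumption: only c0..c63 are modeled; real policies allow more. */
struct mls_level {
    unsigned sensitivity;   /* s0 -> 0, s1 -> 1, ... */
    uint64_t categories;    /* bit n set means category cn is present */
};

/* Hypothetical helper: the bit for category cn. */
static uint64_t cat(unsigned n)
{
    assert(n < 64);
    return UINT64_C(1) << n;
}

/* a dominates b when a's sensitivity is at least b's and
 * a holds every category that b holds. */
static bool mls_dominates(struct mls_level a, struct mls_level b)
{
    return a.sensitivity >= b.sensitivity &&
           (a.categories & b.categories) == b.categories;
}
```

With this model, a process at "s1:c0" dominates a file at "s1:c0" but not one at "s1:c1", and the high end of the trusted "s0 - s1:c0.c1" range dominates both.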
Thus far, there have been two potential dataflows within Wayland we have looked into. The first is related to clipboard access and drag-and-drop. If a copy is performed in an application connected to network A, we want the compositor to block any paste action for an application connected to network B. From the SELinux perspective, this denial would be based on the fact that the copy has the c0 label, which is not present in the receiving client.
This gets more complicated with the introduction of a trusted application with access to both networks A and B. Such an application would run with level "s1:c0.c1". Naively, this would allow it to perform a paste operation from both c0 and c1 clients, but perform a copy operation to neither. We would like for this application to be able to copy data to either c0 or c1 applications while still restricting c0 data to c0 applications and c1 data to c1 applications. Since this is a trusted application, we trust it to tell us which of the data it provides belongs to c0 and which belongs to c1.
The way I anticipate this working is that the trusted client constructs a wl_data_source, then labels that wl_data_source object with either the c0 or c1 category as appropriate. This can be accomplished by the SELinux WSM exposing its own global object, which provides requests like "wsm_selinux::setcon(res: object, ctx: string)" and "wsm_selinux::setcreatecon(interface: string, ctx: string)"; these will either update the context of an existing object, or set the context for any subsequent object of the given interface. Obviously, the ability of a client to do this would itself be subject to an access control check.
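The intended semantics of the two requests can be modeled with a small C sketch. This is not the WSM implementation: the types and the model_* functions are hypothetical stand-ins, the fallback to the creating client's own context is an assumption, and the access control check on relabeling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CTX_MAX 64

/* Hypothetical stand-in for a protocol object whose label the WSM tracks. */
struct labeled_object {
    const char *interface;   /* e.g. "wl_data_source" */
    char ctx[CTX_MAX];       /* e.g. "s1:c0" */
};

/* Hypothetical per-client state: label for new objects of one interface. */
struct create_default {
    const char *interface;   /* NULL means no default has been set */
    char ctx[CTX_MAX];
};

/* Copy a context string, rejecting anything that does not fit. */
static bool store_ctx(char dst[CTX_MAX], const char *ctx)
{
    assert(dst != NULL && ctx != NULL);
    if (strlen(ctx) >= CTX_MAX)
        return false;
    strcpy(dst, ctx);
    return true;
}

/* Model of setcon: relabel an object that already exists. */
static bool model_setcon(struct labeled_object *obj, const char *ctx)
{
    assert(obj != NULL);
    return store_ctx(obj->ctx, ctx);
}

/* Model of setcreatecon: remember a label for later objects of `interface`. */
static bool model_setcreatecon(struct create_default *def,
                               const char *interface, const char *ctx)
{
    assert(def != NULL && interface != NULL);
    if (!store_ctx(def->ctx, ctx))
        return false;
    def->interface = interface;
    return true;
}

/* Model of object creation: use the default when one matches, otherwise
 * fall back to the creating client's own context (an assumption). */
static bool model_create(struct labeled_object *obj, const char *interface,
                         const char *client_ctx,
                         const struct create_default *def)
{
    assert(obj != NULL && interface != NULL && def != NULL);
    obj->interface = interface;
    if (def->interface != NULL && strcmp(def->interface, interface) == 0)
        return store_ctx(obj->ctx, def->ctx);
    return store_ctx(obj->ctx, client_ctx);
}
```

In this model, a trusted "s1:c0.c1" client that first sets a create default of "s1:c0" for wl_data_source gets a data source labeled "s1:c0", while its other objects keep the client's full context.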
A similar issue arises with screen captures. A c0 application should not be able to see the pixels of a c1 application and vice-versa. However, a trusted c0.c1 application should be able to identify specific regions as belonging to c0 and others as belonging to c1. The complication with screen captures is that a single capture may implicate multiple surfaces. We anticipate this causing a separate access check for every surface. The behaviour for cases where these access checks return different results is up to the compositor. Ideally, the compositor would redact only the necessary surfaces, but that may prove difficult depending on how compositors implement screen capturing.
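The "one access check per surface, redact only what is denied" behaviour could look roughly like the following C sketch. It is a simplified model, not Weston code: capture_surface, may_read, and plan_capture are hypothetical names, and labels are reduced to a category bitmask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-surface record used while planning one capture. */
struct capture_surface {
    uint64_t categories;  /* bit n set: surface is labeled with category cn */
    bool redact;          /* set by plan_capture() */
};

/* The capturing client may read a surface only if it holds every
 * category on that surface's label. */
static bool may_read(uint64_t viewer, uint64_t surface)
{
    return (viewer & surface) == surface;
}

/* Run one access check per surface and mark the denied ones for
 * redaction. Returns how many surfaces must be redacted. */
static size_t plan_capture(uint64_t viewer, struct capture_surface *surfaces,
                           size_t count)
{
    size_t redacted = 0;

    assert(surfaces != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        surfaces[i].redact = !may_read(viewer, surfaces[i].categories);
        if (surfaces[i].redact)
            redacted++;
    }
    return redacted;
}
```

A compositor that cannot redact individual surfaces could instead refuse the whole capture whenever the returned count is non-zero.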
Those are the only cases we have looked at closely, but we anticipate this type of concern in any protocol extension that facilitates cross-client communication. For instance an ext_foreign_toplevel_handle_v1 should share a context with the corresponding xdg_toplevel in the foreign client, where a single client might have multiple top level surfaces with different contexts.
The other issue is that we cannot fully anticipate the security needs of every future project. By creating an access check for everything (and writing the SELinux WSM such that all libwayland access checks are answered based on the system policy), we allow system integrators to create security policies to meet their specific needs.
A related issue is that it is simply not feasible to expect compositor authors to consider every potential security implication for every system of every protocol they implement. Nor is it feasible to expect system integrators to audit every compositor release for new features and create new access checks for every potential concern.
> I'm curious about when does your WSM design actually work.
For copy-and-paste, drag-and-drop, and screenshots it pretty much just works. We do require the compositor to advise us on the relationship between wl_data_offer and wl_data_source objects, and to be able to handle a NULL coming out of wl_resource_create. However, from the protocol perspective, just not creating the wl_data_offer and not sending the event works fine.
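The compositor-side pattern described above (treat a NULL resource as a denial and skip the event) can be sketched in C as follows. To stay self-contained this uses hypothetical stand-ins (struct offer, create_offer, send_data_offer_event) instead of the real libwayland-server calls, and the wsm_allows flag stands in for whatever decision the module makes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a wl_data_offer resource. */
struct offer {
    int id;
};

/* Counts events "sent" so the behaviour can be observed. */
static int events_sent;

/* Stand-in for resource creation that a security module may veto;
 * a veto is reported as NULL, as described in the text. */
static struct offer *create_offer(struct offer *storage, bool wsm_allows)
{
    assert(storage != NULL);
    return wsm_allows ? storage : NULL;
}

/* Stand-in for sending the data_offer event to the client. */
static void send_data_offer_event(const struct offer *o)
{
    assert(o != NULL);
    events_sent++;
}

/* Offer the current selection to a client. On denial nothing is
 * created and nothing is sent, so the client simply sees no offer. */
static bool offer_selection(struct offer *storage, bool wsm_allows)
{
    struct offer *o = create_offer(storage, wsm_allows);

    if (o == NULL)
        return false;
    send_data_offer_event(o);
    return true;
}
```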
Screenshots run into some compositing issues (since the notion of a partial screenshot does not seem to be well supported), but nothing that is protocol related.
Those are the only two situations we have gotten to testing thus far. However, looking through the protocols, there are a lot of cases where blocking a request/event is safe. It is up to WSM developers (and, where relevant, system policy developers) to make sensible decisions on what to block. If they do not do so, the resulting breakage is on them.
As far as I can tell, the only Wayland protocol level breakage that occurs with our design is blocking new_id. From what I can tell, there are not many circumstances where doing so is actually important, as most goals could be achieved by allowing the creation and blocking the accesses the new object would give. Again, the specifics here are policy decisions that can be deferred to the system integrator. Logical breakage for specific protocols with some poorly placed denials remains possible, and can cause varying levels of problems. But that would again be the responsibility of WSM developers and policy developers to resolve.
> You later wrote that e.g. with clipboard access, the security policy would need to know not just the source and destination clients, but also the destination wl_surface. Maybe you want to have even more context. But can you really get this context reliably from libwayland-server hooks?
I might not have been fully clear here. For clipboard access, the important thing is the source wl_data_source; this does require explicit compositor support to provide. The wl_surface was relevant for screenshots.
-- Brandon
> -----Original Message-----
> From: Pekka Paalanen <pekka.paalanen at haloniitty.fi>
> Sent: Thursday, May 22, 2025 4:58 AM
> To: Sloane, Brandon <bsloane at owlcyberdefense.com>
> Cc: wayland-devel at lists.freedesktop.org
> Subject: Re: [RFC] Wayland Security Modules
>
> On Tue, 20 May 2025 19:11:06 +0000
> "Sloane, Brandon" <bsloane at owlcyberdefense.com> wrote:
>
> > > -----Original Message-----
> > > From: Pekka Paalanen <pekka.paalanen at haloniitty.fi>
> > > Sent: Tuesday, May 20, 2025 4:58 AM
> > > To: Sloane, Brandon <bsloane at owlcyberdefense.com>
> > > Cc: wayland-devel at lists.freedesktop.org
> > > Subject: Re: [RFC] Wayland Security Modules
> > >
> > > On Mon, 19 May 2025 15:48:04 +0000
> > > "Sloane, Brandon" <bsloane at owlcyberdefense.com> wrote:
> > >
> > > > Hello,
> > > >
> > > > I've spent the past few months prototyping a security modules
> > > > system for Wayland. Our specific motivation for this is to support
> > > > SELinux integration to meet some rather unique security requirements.
> > > > However, what we are proposing here is a rather general purpose
> > > > security module system that provides high level hooks modules can
> > > > then implement. Potential usecases for this system are:
> > > >
> > > > * Creating SELinux permissions for Wayland actions.
> > > > * Integrating with non-SELinux Linux Security Modules
> > > > (AppArmor/SMACK/etc).
> > > > * Integrating with PolicyKit.
> > > > * Disabling privileged protocols that a specific compositor
> > > > implements.
> > > > * Restricting privileged protocols to trusted clients.
> > > > * Creating backends for wp_security_context_manager.
> > >
> > > Hi,
> > >
> > > from this and the readme I understand that the goal is to remove
> > > security policies from compositors and place them outside of
> > > compositors and DE projects, where they can be shared by many desktop
> > > and other environments.
> > > Is that right?
> > >
> > > What is the reason for this goal?
> > >
> > > To unify policy configuration over all environments?
> > >
> > > To enforce policy where the compositor does not do so itself?
> >
> > I would say the goal is to move security policy out of the compositors
> > to the system integrators. I tend to consider DE projects to be a form
> > of system integration, so them using this to implement their own
> > security policies would be well within scope (although I would hope
> > they do so in a way that allows downstream integrators to replace the
> > security policy if needed). Our specific motivation is that we are
> > building systems with some rather niche security requirements that
> > would not be suitable to implement in a general purpose DE. We also
> > want to unify our policy configuration with a single environment, so
> > our network policy, file access policy, device access policy, and GUI
> > policy can all exist as a single unified security policy.
> >
> > The ability to have a common policy configuration work across
> > different environments/compositors is nice, but was not a primary
> > design goal. As we went to design this, we quickly found that, even if
> > we were only concerned with a single compositor, the low-level IPC
> > layer is by far the most natural and simplest place to implement this.
> >
> > >
> > > > Our current proof of concept is here:
> > > > https://gitlab.freedesktop.org/bsloane1650/wayland. Some more
> > > > in-depth technical discussion is available in the doc/WSM.md file
> > > > in that repository.
> > > >
> > > > We also have some modules in development here:
> > > > https://gitlab.freedesktop.org/bsloane1650/wayland-security-modules
> > > > * Logger - a basic proof of concept that demonstrates
> > > > instantiating a module and logging every access.
> > > > * Allow-list - A basic proof of concept that demonstrates globally
> > > > restricting what interfaces can be used.
> > > > * SELinux - A more complex module that defers all access decisions
> > > > to the system's SELinux policy (under active
> > > > development)
> > > >
> > > > The overall design is to add hooks at key points in libwayland:
> > > > * Creation and destruction of core libwayland objects: wl_client,
> > > > wl_display, wl_global, wl_resource
> > > > * Prior to sending an event to the client
> > > > * Prior to invoking the request handler after receiving a request
> > > > from a client.
> > > > * Prior to publishing a global object
> > > > * Prior to binding a global object.
> > >
> > > What about all the other Wayland protocol implementations that do
> > > not use libwayland? Or those that bundle a libwayland without the
> > > hooks?
> >
> > Our changes should only affect libwayland-server, and do not change
> > the wire format at all. As such, non-libwayland clients should work
> > just as well as libwayland clients.
> >
> > For compositors, a non-libwayland compositor would not see any change
> > or benefit from this effort. If they want to be able to use a WSM,
> > they would need to implement enough of the libwayland-server ABI to be
> > compatible with whatever module(s) they want to use (or just the
> > libwayland-server API if recompiling the module is acceptable).
> > Originally, we had intended to have a WSM API that modules would be
> > written against, which would better enable sharing with non-libwayland
> > compositors. However, we ultimately determined that such an API was
> > not worth the effort.
> >
> > >
> > >
> > > > A security module is represented by a "struct wsm" object, which
> > > > has void* for modules to use, and a function pointer for each
> > > > hook. Compositors can instantiate these structs however they like
> > > > and pass them into a new wl_display_create_with_wsms method. The
> > > > existing wl_display_create method is modified to dynamically load
> > > > shared object files based on the new WAYLAND_SECURITY_MODULES
> > > > environmental variable. These shared objects are expected to
> > > > export a wl_wsm_init symbol that instantiates a wsm structure.
> > >
> > > What's the failure mode of losing the environment variable by
> > > accident, e.g. by a software update? Any security limitations just
> > > won't be there and without notice? Isn't that too fragile and
> > > invisible?
> >
> > Yes. Losing the environmental variable would silently disable the
> > security protections, which is certainly not ideal. How much of a risk
> > this is would depend on how the DE developers or system integrators
> > choose to set the environmental variable. I don't think there is a
> > better option available from within libwayland. If a compositor wants
> > to implement a more robust mechanism, it can do so by calling the new
> > wl_display_create_with_wsms() method explicitly.
> >
> > >
> > > > We have had success running this by linking unmodified compositors
> > > > (mostly Weston) against an updated libwayland. Depending on what
> > > > accesses the module blocks, existing compositors work without even
> > > > needing a recompile. However, to be useful, we have found a couple
> > > > of areas that additional compositor integration is needed.
> > > > Mostly this has been shifting from wl_resource_create to
> > > > wl_resource_create_with_related in a few key places (such as
> > > > creating a wl_data_offer) to allow the security modules to
> > > > associate resources that are shared between clients.
> > >
> > > To me this sounds like aiming for unmodified compositors just won't
> > > work. Would it not be better to aim for explicit integration?
> >
> > The actual design goal was maintaining backward compatibility.
> > Updating to a WSM aware version of libwayland shouldn't break any
> > existing system (unless they happened to have an environmental
> > variable called WAYLAND_SECURITY_MODULES set).
>
> Hi,
>
> by explicit compositor integration I meant by not adding such hooks into
> libwayland-server but have the compositor call into the WSM framework.
>
> Libwayland-server can certainly be enhanced with supporting functionality.
> The globals filtering is one that already exists. You can get the client socket fd to
> ask the kernel about a security context. What might be missing is asking which
> listening socket the client connected to. These kind of things are fine.
>
> > > Or, if unmodified compositors is an explicit design goal, make the
> > > security layer a Wayland proxy? An independent man-in-the-middle
> > > process.
> > > > We have also found a need to modify compositors to deal with
> > > > denials associated with new_id type requests. We think we have a
> > > > workable solution implemented in libwsm_compositor that
> > > > compositors can incorporate with a few library calls; but I still
> > > > consider this the most questionable part of the project. This
> > > > issue is discussed in much more detail in doc/WSM.md. Any
> > > > input on this would be greatly appreciated.
> > >
> > > You cannot "deny" any request at will, not even those without new_id
> > > arguments. The protocol specification defines the behaviour of each
> > > request, and a security module cannot decide against the spec. It
> > > would break the protocol, cause mismatching state between the
> > > compositor and the client, likely lead to hard-to-debug failure
> > > modes if it does not outright cause a protocol error soon after,
> > > disconnecting the client. Some requests are specified so that the
> > > compositor can decide to refuse, but those are very rare. In
> > > general, it makes no sense to be able to gate each and every
> > > request. No message can ever be ignored at will and expect the
> > > application to continue working fine.
> >
> > I'm probably showing my roots as an SELinux developer here, but overly
> > strict security modules causing applications to crash or misbehave in
> > weird ways is familiar territory. My expectation is that general
> > purpose systems would write their security policy to allow most accesses.
> > Given how Wayland protocols are generally written, restricting access
> > to global objects should be suitable for most purposes, and that is already
> > a well-supported concept. We are following the Linux Security Module
> > model of 'let the security module decide what is or isn't a good
> > idea'. If a module does this in a way that breaks things, that is up
> > to the module. The goal here is simply to mitigate this issue as much
> > as possible.
>
> Linux system calls can return a failure. Programs are expected to check for
> them.
>
> X11 at least theoretically allows delivering recoverable errors to clients.
>
> Wayland requests allow neither. The only way to fail is to disconnect.
>
> This makes me think that a security policy with this kind of an implementation
> design will be much more prone to accidentally break applications than LSM or
> XAce, up to a level where I wonder if it is practical. What's worse is that in the
> client side, there is no way to tell what failed. System calls can be traced to see
> their return values, and X11 error events exist, but Wayland has neither unless
> you disconnect the client.
>
> > >
> > > You also cannot really retrofit asynchronous access checks (that may
> > > take more than, say, 100 ms) to interfaces or requests that were not
> > > designed with that in mind. Wayland protocol stream follows a strict
> > > execute-in-order model, so you would have to freeze the whole client
> > > connection until an access decision is available. This would make
> > > the application appear frozen, and the application might get
> > > disconnected due to being unresponsive (even though it would be the
> > > compositor's fault) or due to overflowing socket buffers (less of a
> > > problem nowadays I guess).
> >
> > We are not particularly attached to asynchronous access checks. None
> > of our users actually require it. I bring it up because most general
> > purpose user systems have it nowadays. Our thinking for it was the
> > initial behavior would be the same as a permission denial. However, we
> > would introduce a wsm_manager global object that clients could bind
> > to. This object would then send an event when a denial occurred
> > because of an asynchronous check, and would send another event once
> > the check is complete, allowing the client to retry the original
> > action. Obviously, this does require the client to be aware of the new
> > system, and assumes that the original "deny the request" behavior can
> > be done in such a way that nothing outright breaks.
>
> The first problem is that Wayland has no generic way to indicate a gracefully
> failed request. This means that no client side code is architected in a way that a
> failure might be possible to handle.
>
> The problem with wsm_manager delivering failure events is that Wayland is
> asynchronous. Since no request can gracefully fail, the protocol interfaces are
> designed to take advantage of that, and allow clients to send lots of requests
> before needing to wait for replies. Every request carries a built-in assumption
> that the previous request succeeded. This would bring a cascade failure.
>
> There are a couple of problems with retry. The first is how to define what
> actually should be retried. Wayland messages do not have any message serial
> numbers to indicate which one we are talking about. The second problem is
> the burden of implementation in a client to actually prepare for failure and
> implement a retry. I cannot imagine that being feasible, if it wasn't part of the
> original protocol interface specification.
>
> > This scheme should be implementable without any explicit support from
> > libwayland. We have not done so because our usecases do not need it.
>
> Could you explain more of your actual use cases? Maybe people would have
> better ideas how to solve them. I'm curious about when does your WSM
> design actually work.
>
> You later wrote that e.g. with clipboard access, the security policy would need
> to know not just the source and destination clients, but also the destination
> wl_surface. Maybe you want to have even more context. But can you really
> get this context reliably from libwayland-server hooks?
>
> I guess you would need to do a lot of protocol state tracking to have an idea of
> the context. The compositor already does this state tracking, so it could just
> tell the WSM if there was an API for it.
>
>
> Thanks,
> pq
>
> > > Instead, developers try to account for the security requirements at
> > > the protocol (extension) design. In the simplest form, the interface
> > > offered through wl_registry either exists if granted, or not.
> >
> > I think I've alluded to this before, but one of the hooks is in
> > wl_global_is_visible, that would enable modules to implement exactly
> > this form of access control.
> >
> >
> > > When finer grained control
> > > is necessary, the possibility to refuse and revoke is built into the
> > > interface specification. Also the need for asynchronous access
> > > checks is considered, allowing the application and the compositor to
> > > continue their otherwise normal operations until the access decision
> > > arrives.
> > >
> > > If there are security design issues with Wayland protocol
> > > interfaces, I would hope they get fixed by revising the protocol.
> > >
> > > I am most sceptical about the hooking libwayland part of the
> > > proposal. OTOH, offering compositors an API they can query in order
> > > to determine access might be a good idea. It might need to be
> > > decoupled from literal protocol interfaces, because there can be
> > > many different interfaces, not all upstream wayland-protocols or
> > > even Wayland at all, for applications to do things.
> > >
> > >
> > > Thanks,
> > > pq
> > >
> > >
> > > > We have also experimented with per-surface screenshot restrictions
> > > > in Weston, which needs to be implemented almost entirely in the
> > > > compositor itself.
> > > >
> > > > We probably should have gone public with this far earlier in the
> > > > design process. However, despite the late stage we find ourselves
> > > > in, we are open to significant revisions based on community
> > > > feedback.
> > > >
> > > > Thanks,
> > > > Brandon
> > > >
> >