GPU-side memory protection landscape

Alexander Monakov amonakov at ispras.ru
Mon Nov 30 14:07:00 UTC 2020


Hi,

On #dri-devel Daniel invited me to chime in on the topic of clearing GPU
memory handed to userspace, so here I go.

I was asking why handing userspace dirty memory previously used by another
process is not treated as an information-leak security issue.
I was pointed to a recent thread, which offers a little perspective:
https://lists.freedesktop.org/archives/dri-devel/2020-November/287144.html
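
To make the concern concrete, here is a minimal sketch of the kind of
probe I mean, written against the CUDA runtime purely for illustration
(the allocation size and API are my own choices, not anything from the
thread): allocate device memory, never write it, and read it back. If the
driver does not clear allocations before reuse, the bytes that come back
belong to whoever used that memory before.

  /* Minimal probe: allocate device memory, never write it, read it back.
   * If freed allocations are not cleared by the driver, the bytes printed
   * here are whatever the previous user of that memory left behind. */
  #include <cstdio>
  #include <cuda_runtime.h>

  int main() {
      unsigned char *dev = nullptr;
      unsigned char host[16] = {0};
      cudaMalloc((void **)&dev, 1 << 20);     /* deliberately no cudaMemset */
      cudaMemcpy(host, dev, sizeof host, cudaMemcpyDeviceToHost);
      for (int i = 0; i < 16; i++)
          printf("%02x ", host[i]);
      printf("\n");
      cudaFree(dev);
      return 0;
  }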

I think the main argument shown there is weak:

> And for the legacy node model with authentication of clients against
> the X server, leaking that all around was ok.

given that the XCSECURITY extension is supposed to limit what clients can
retrieve, and that two X servers could be running for different users.


My other concern is how easy it is to cause system instability or hangs
with out-of-bounds writes from the GPU (via compute shaders or copy
commands). In several years of GPU computing on NVIDIA hardware, I do not
recall ever losing time to rebooting my PC after running a buggy CUDA
"kernel". Heck, I could run the GCC C testsuite on the GPU without
worrying about locking myself and others out of the server. But now that I
develop on a laptop with AMD's latest mobile SoC, a mistake in my GLSL
code more often than not forces a reboot. I hope you understand what a
huge pain that is.
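
For concreteness, this is roughly the kind of mistake I mean, sketched as
a hypothetical CUDA kernel (the kernel language and buffer sizes are just
illustrative, not my actual GLSL): a stray index that writes past the end
of its buffer. Whether that ends in a clean per-context fault, silent
corruption of unrelated memory, or a GPU hang that needs a reboot is
exactly the difference I am asking about.

  #include <cuda_runtime.h>

  /* A stray index that writes one full buffer length past the end. How the
   * system reacts depends on the hardware's VM/page-fault support and on
   * the driver. */
  __global__ void oob_write(int *buf, size_t n) {
      size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
      buf[i + n] = 42;                  /* out of bounds by n elements */
  }

  int main() {
      int *buf = nullptr;
      cudaMalloc((void **)&buf, 1024 * sizeof(int));
      oob_write<<<4, 256>>>(buf, 1024);
      cudaDeviceSynchronize();          /* see how (and whether) the fault is reported */
      cudaFree(buf);
      return 0;
  }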

What memory-protection capabilities does existing GPU hardware provide
(both in terms of preventing stray accesses to system memory, as with an
IOMMU, and in terms of isolating different process contexts from each
other), and to what extent do Linux DRM drivers take advantage of them?

Would you consider producing a document with answers to the above so
users know what to expect?

Thank you.
Alexander

