[Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering

Eero Tamminen eero.t.tamminen at intel.com
Thu Feb 28 12:50:37 UTC 2019


Hi,

On 28.2.2019 11.57, Marc-André Lureau wrote:
> On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák <maraeo at gmail.com> wrote:
>> I'd rather have something more robust than an env var, like catching SIGSYS.

SIGSYS is info for the invoking parent, not the (Mesa) process doing the 
syscall.

From "man 2 seccomp":

The process terminates as though killed by a SIGSYS signal.  Even if a 
signal handler has been registered for SIGSYS, the handler will be 
ignored in this case and the process always terminates.  To a parent 
process that is waiting on this process (using waitpid(2) or similar), 
the returned wstatus will indicate that its child was terminated as 
though by a SIGSYS signal.


> With current qemu in most distros, it defaults to SIGSYS (we switched
> away from SCMP_ACT_KILL, which had other problems). With more recent
> qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those
> KILL action cases, mesa will not be able to catch the failing
> syscalls.

Qemu / libvirt isn't the only thing using seccomp.

For example, Docker enables seccomp filters (along with capability
restrictions) for the containers it starts, unless that is explicitly
disabled:
	https://docs.docker.com/engine/security/seccomp/

What actually gets filtered is trivially changeable on the Docker command 
line by passing a JSON file that specifies the syscall filtering.

The default policy seems to whitelist the affinity syscalls:
	https://github.com/moby/moby/blob/master/profiles/seccomp/default.json
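
Since filters can come from several launchers, a process can at least
check whether any seccomp mode is active for it, although that says
nothing about which syscalls are filtered. A rough sketch, assuming a
Linux kernel that exposes the "Seccomp:" field in /proc/self/status
(0 = disabled, 1 = strict, 2 = filter); the function names are
hypothetical, and reading the file can itself be blocked by a filter:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the value of the "Seccomp:" line from the
 * text of /proc/<pid>/status. Returns the mode, or -1 if it is absent. */
static int parse_seccomp_mode(const char *status_text)
{
   static const char key[] = "Seccomp:";
   const char *p = status_text;

   assert(status_text != NULL);

   /* Walk the text line by line so that only a line starting with the
    * key matches (and not, for example, "Seccomp_filters:"). */
   while (p && *p) {
      if (strncmp(p, key, sizeof(key) - 1) == 0) {
         int mode;
         if (sscanf(p + sizeof(key) - 1, "%d", &mode) == 1)
            return mode;
         return -1;
      }
      p = strchr(p, '\n');
      if (p)
         p++;
   }
   return -1;
}

/* Hypothetical helper: report the seccomp mode of the calling process,
 * or -1 if it cannot be determined. */
static int read_seccomp_mode(void)
{
   char buf[8192];
   FILE *f = fopen("/proc/self/status", "r");
   if (!f)
      return -1;

   size_t n = fread(buf, 1, sizeof(buf) - 1, f);
   fclose(f);
   buf[n] = '\0';
   return parse_seccomp_mode(buf);
}
```

A result of 2 only says that some filter is installed; whether
sched_setaffinity() is among the filtered calls still cannot be read
from here.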


Why do distro versions of QEMU filter the sched_setaffinity() syscall?


	- Eero
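
For reference, the wrapper in the quoted patch sets errno and returns -1,
whereas pthread_setaffinity_np() itself reports failure by returning the
error number and does not set errno. One possible variant that keeps the
pthread convention is sketched here. This is not the code from the
patch: the "_sketch" suffix marks the name as hypothetical, and it
assumes glibc with _GNU_SOURCE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Hypothetical variant of the wrapper: when MESA_NO_THREAD_AFFINITY is
 * set, skip the syscall and return EACCES directly, the same way
 * pthread_setaffinity_np() returns its own error numbers. */
static int u_pthread_setaffinity_np_sketch(pthread_t thread,
                                           size_t cpusetsize,
                                           const cpu_set_t *cpuset)
{
   assert(cpuset != NULL);

   if (getenv("MESA_NO_THREAD_AFFINITY"))
      return EACCES;

   return pthread_setaffinity_np(thread, cpusetsize, cpuset);
}
```

With this shape, a caller that checks "err != 0" (as the swr code in the
patch does) sees the same kind of value whether the call was skipped or
actually failed.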

>> Marek
>>
>> On Wed, Feb 27, 2019 at 6:13 PM <marcandre.lureau at redhat.com> wrote:
>>>
>>> From: Marc-André Lureau <marcandre.lureau at redhat.com>
>>>
>>> Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue:
>>> add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls
>>> sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS
>>> when sandboxing is enabled (by default with libvirt), as this syscall
>>> is filtered.
>>>
>>> There doesn't seem to be a way to check for the seccomp rule other
>>> than doing a call, which may result in various behaviour depending on
>>> seccomp actions. There is a PTRACE_SECCOMP_GET_FILTER, but it is
>>> low-level and a privileged operation (but there might be a way to use
>>> it?). A safe way would be to try the call in a subprocess,
>>> unfortunately, qemu also prohibits fork(). Also this could be subject
>>> to TOCTOU.
>>>
>>> There seem to be few solutions, but the issue can be considered a
>>> regression for various libvirt/Boxes users.
>>>
>>> Introduce MESA_NO_THREAD_AFFINITY environment variable to prevent the
>>> offending call. Wrap pthread_setaffinity_np() in a utility function
>>> u_pthread_setaffinity_np(), returning an EACCES error if the variable
>>> is set.
>>>
>>> Note: one call is left with a FIXME, as I didn't investigate how to
>>> build and test it, help welcome!
>>>
>>> See also:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=109695
>>>
>>> Signed-off-by: Marc-André Lureau <marcandre.lureau at redhat.com>
>>> ---
>>>   .../drivers/swr/rasterizer/core/threads.cpp       |  1 +
>>>   src/util/u_queue.c                                |  2 +-
>>>   src/util/u_thread.h                               | 15 ++++++++++++++-
>>>   3 files changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
>>> index e30c1170568..d10c79512a1 100644
>>> --- a/src/gallium/drivers/swr/rasterizer/core/threads.cpp
>>> +++ b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
>>> @@ -364,6 +364,7 @@ void bindThread(SWR_CONTEXT* pContext,
>>>       CPU_ZERO(&cpuset);
>>>       CPU_SET(threadId, &cpuset);
>>>
>>> +    /* FIXME: use u_pthread_setaffinity_np() if possible */
>>>       int err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
>>>       if (err != 0)
>>>       {
>>> diff --git a/src/util/u_queue.c b/src/util/u_queue.c
>>> index 3812c824b6d..dea8d2bb4ae 100644
>>> --- a/src/util/u_queue.c
>>> +++ b/src/util/u_queue.c
>>> @@ -249,7 +249,7 @@ util_queue_thread_func(void *input)
>>>         for (unsigned i = 0; i < CPU_SETSIZE; i++)
>>>            CPU_SET(i, &cpuset);
>>>
>>> -      pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
>>> +      u_pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
>>>      }
>>>   #endif
>>>
>>> diff --git a/src/util/u_thread.h b/src/util/u_thread.h
>>> index a46c18d3db2..a4e6dbae5d7 100644
>>> --- a/src/util/u_thread.h
>>> +++ b/src/util/u_thread.h
>>> @@ -70,6 +70,19 @@ static inline void u_thread_setname( const char *name )
>>>      (void)name;
>>>   }
>>>
>>> +#if defined(HAVE_PTHREAD_SETAFFINITY)
>>> +static inline int u_pthread_setaffinity_np(pthread_t thread, size_t cpusetsize,
>>> +                                           const cpu_set_t *cpuset)
>>> +{
>>> +   if (getenv("MESA_NO_THREAD_AFFINITY")) {
>>> +      errno = EACCES;
>>> +      return -1;
>>> +   }
>>> +
>>> +   return pthread_setaffinity_np(thread, cpusetsize, cpuset);
>>> +}
>>> +#endif
>>> +
>>>   /**
>>>    * An AMD Zen CPU consists of multiple modules where each module has its own L3
>>>    * cache. Inter-thread communication such as locks and atomics between modules
>>> @@ -89,7 +102,7 @@ util_pin_thread_to_L3(thrd_t thread, unsigned L3_index, unsigned cores_per_L3)
>>>      CPU_ZERO(&cpuset);
>>>      for (unsigned i = 0; i < cores_per_L3; i++)
>>>         CPU_SET(L3_index * cores_per_L3 + i, &cpuset);
>>> -   pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset);
>>> +   u_pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset);
>>>   #endif
>>>   }
>>>
>>> --
>>> 2.21.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


