[Mesa-dev] virgl and vc4 problem on Android

Thu Jun 16 18:57:01 UTC 2016

On Thu, Jun 16, 2016 at 12:09 PM, Rob Clark <robdclark at gmail.com> wrote:
> On Thu, Jun 16, 2016 at 12:56 PM, Rob Herring <robh at kernel.org> wrote:
>> On Thu, Jun 16, 2016 at 11:44 AM, Rob Clark <robdclark at gmail.com> wrote:
>>> On Wed, Jun 15, 2016 at 8:34 PM, Rob Herring <robh at kernel.org> wrote:
>>>> In the process of adding RGBX (XB24) format to mesa for Android, I
>>>> started seeing a new problem that makes the UI stop updating. It
>>>> happens about when the splash screen is stopped and the lock screen is
>>>> displayed. The display flickers on mouse movement, and it looks like
>>>> the screen is flipping to old buffers (like the splash screen after
>>>> its process exited). It is working fine for freedreno AFAICT, but I am
>>>> running into a problem with virgl. With virgl, I get the following
>>>> error:
>>>>
>>>> vrend_create_surface: context error reported 1 "surfaceflinger"
>>>> Illegal resource 1435
>>>> vrend_report_buffer_error: context error reported 1 "surfaceflinger"
>>>> Illegal command buffer 329729
>>>>
>>>> The addition of the pixel format changes the eglconfig used for the
>>>> splash screen. If I force the splash screen eglconfig to have an alpha
>>>> or draw one frame of the splash screen and exit early or disable the
>>>> splash screen, everything seems fine, though on rare occasions I have
>>>> hit the problem while navigating around. I suspect this has nothing to
>>>> do with the pixel format itself, other than that different buffer sizes
>>>> cause buffers to get reused differently.
>>>>
>>>> Now I've started working on getting RPi3 and vc4 working, and it
>>>> appears to have a similar problem. I'm getting these errors, though
>>>> things go haywire before any error message shows up:
>>>>
>>>> [   43.846569] [drm:vc4_submit_cl_ioctl] *ERROR* Failed to look up GEM BO 0: 4
>>>
>>> at least in the vc4 case, I suspect you need a similar bit of winsys
>>> magic to ensure the same pipe_screen is returned for any given drm
>>> device fd.  (Or did someone already add that?)
>>
>> That problem should be gone with GBM gralloc, right?
>
> *maaaybe*..
>
> It, like the gralloc-drm-pipe approach, means we have a pipe_screen
> (vs. the other drm-gralloc backends which were using libdrm_xyz
> directly), so it was going through the logic to avoid duplicate
> pipe_screen's (for the drivers which had that).
>
> Maybe w/ gbm, everything ends up sharing the same pipe_screen?  I'm
> not really sure, since I guess both GL and gralloc are creating a gbm
> device?
>
> I guess easy enough to put some debug print in vc4_screen_create() to
> confirm.  But the sort of errors you are seeing make me suspicious.

Uhh, well, it looks like that is a problem for vc4. vc4_screen_create()
gets called twice in every process:

01-01 00:00:07.295   127   127 W VC4     : vc4_screen_create
01-01 00:00:07.334   127   127 W VC4     : vc4_screen_create
01-01 00:00:08.349   205   223 W VC4     : vc4_screen_create
01-01 00:00:08.352   205   223 W VC4     : vc4_screen_create
01-01 00:00:35.467   437   488 W VC4     : vc4_screen_create
01-01 00:00:35.477   437   488 W VC4     : vc4_screen_create
01-01 00:00:39.041   511   511 W VC4     : vc4_screen_create
01-01 00:00:43.385   511   798 W VC4     : vc4_screen_create
01-01 00:00:44.135   718   718 W VC4     : vc4_screen_create
01-01 00:00:44.202   718   923 W VC4     : vc4_screen_create

> Possibly the "libdrm equivalent" part of vc4 needs to do more to avoid
> re-importing the same handle multiple times?

Maybe time for the common implementation.
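
As a rough idea of what the handle-tracking part of a common implementation
might do: PRIME_FD_TO_HANDLE hands back the same GEM handle when a buffer is
imported again on the same drm fd, and the kernel does not count those
imports, so a single GEM_CLOSE kills the handle for every user. Below is a
minimal sketch of refcounting handles in userspace. The names
(handle_table, handle_table_ref(), handle_table_unref()) are made up for
this example, there are no real ioctl calls in it, and locking is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* One entry per GEM handle currently known on a given drm fd. */
struct handle_ref {
    uint32_t handle;
    unsigned refcnt;
    struct handle_ref *next;
};

/* Hypothetical per-fd table; a real one would be a hash table with a lock. */
struct handle_table {
    struct handle_ref *head;
};

/* Call after every successful import. Returns the new count, 0 on OOM. */
unsigned handle_table_ref(struct handle_table *t, uint32_t handle)
{
    struct handle_ref *r;

    for (r = t->head; r; r = r->next)
        if (r->handle == handle)
            return ++r->refcnt;

    r = calloc(1, sizeof(*r));
    if (!r)
        return 0;
    r->handle = handle;
    r->refcnt = 1;
    r->next = t->head;
    t->head = r;
    return 1;
}

/*
 * Call instead of closing the handle directly.
 * Returns 1 if the caller should now issue GEM_CLOSE (last user gone),
 * 0 if the handle is still in use, -1 if the handle is not in the table.
 */
int handle_table_unref(struct handle_table *t, uint32_t handle)
{
    struct handle_ref **p, *r;

    for (p = &t->head; (r = *p) != NULL; p = &r->next) {
        if (r->handle != handle)
            continue;
        if (--r->refcnt > 0)
            return 0;
        *p = r->next;
        free(r);
        return 1;
    }
    return -1;
}
```

If the two closes of handle 17 in the log quoted below do come from two
imports of the same buffer (which I have not confirmed), a table like this
would let only the last one reach the kernel.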

This doesn't explain the virgl case though, as I already fixed this
problem there. The log below is from virgl.

>>> In both virgl and vc4 case, you need to make sure that shared
>>> (exported/imported) buffers don't end up in the bo cache.
>>
>> I've disabled the cache (in the gallium drv, right?) and still see problems.
>>
>> I am seeing a double GEM_CLOSE. I'm not sure how that is happening.
>> One of them must be hwc releasing an imported buffer, but it's all in
>> the same thread.
>>
>> [    7.024495] [drm] pid=1310, dev=0xe280, auth=0, handle=17, ret = 0, DRM_IOCTL_GEM_CLOSE
>> [    7.025379] [drm] pid=1310, dev=0xe280, auth=0, handle=23, ret = 0, DRM_IOCTL_PRIME_FD_TO_HANDLE
>> [    7.026663] [drm] pid=1310, dev=0xe280, auth=0, handle=10, ret = 0, DRM_IOCTL_GEM_CLOSE
>> [    7.027343] [drm] pid=1310, dev=0xe200, auth=1, handle=23, ret = 0, DRM_IOCTL_PRIME_FD_TO_HANDLE
>> [    7.035098] [drm] pid=1333, dev=0xe200, auth=1, handle=1, ret = 0, DRM_IOCTL_GEM_CLOSE
>> [    7.036093] [drm] pid=1310, dev=0xe280, auth=0, handle=17, ret = -22, DRM_IOCTL_GEM_CLOSE
>
> sure would be nice if there was a dump_stack() that showed you the
> userspace stack too ;-)
>
> (but maybe dumb question, is pid unique per process or thread?)

Ignoring namespaces, pids are globally unique.

Rob