[Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread
Michel Dänzer
michel at daenzer.net
Tue Apr 18 08:26:07 UTC 2017
On 18/04/17 05:04 PM, gregory hainaut wrote:
> On Tue, 18 Apr 2017 08:51:24 +0200
> gregory hainaut <gregory.hainaut at gmail.com> wrote:
>
>> On Mon, 17 Apr 2017 11:17:42 +0900
>> Michel Dänzer <michel at daenzer.net> wrote:
>>
>>> On 15/04/17 05:08 PM, gregory hainaut wrote:
>>>> On Sat, 15 Apr 2017 00:50:15 +0200
>>>> Dieter Nützel <Dieter at nuetzel-hh.de> wrote:
>>>>
>>>>> On 14.04.2017 07:53, gregory hainaut wrote:
>>>>>> On Fri, 14 Apr 2017 05:20:38 +0200
>>>>>> Dieter Nützel <Dieter at nuetzel-hh.de> wrote:
>>>>>>
>>>>>>> On 14.04.2017 02:06, Dieter Nützel wrote:
>>>>>>>> Hello Gregory,
>>>>>>>>
>>>>>>>> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
>>>>>>>> It results in crazy numbers and does not 'return' (one core stays @ 100%).
>>>>>>>
>>>>>>> This is related to 'mesa_glthread=true'.
>>>>>>> If I disable (unset) it, all is fine after the 'b' benchmark and 'pbo'
>>>>>>> exits with ESC as expected.
>>>>>>> The crazy numbers stay, maybe a counter overrun due to BIG numbers? ;-)
>>>>>>>
>>>>>>> Hope that helps.
>>>>>>>
>>>>>>> Dieter
>>>>>>
>>>>>> Hello Dieter,
>>>>>>
>>>>>> I tested the demo. There is a mostly unrelated bug when the
>>>>>> application exits.
>>>>>>
>>>>>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
>>>>>> found non-freed data
>>>>>>
>>>>>> I will add a call to a _mesa_HashDeleteAll to fix it.
>>>>>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
>>>>>>
>>>>>> Now let's go back to the test behavior. The benchmark sends 4 s of
>>>>>> asynchronous PBO transfer commands, then syncs with glthread, which
>>>>>> means the application thread is blocked until all PBO transfers are
>>>>>> done. Dispatching commands is faster with glthread, so more transfers
>>>>>> are queued and you need to wait longer before the thread resumes.
>>>>>>
>>>>>> On my side, I need to wait around 45 seconds for 6 million
>>>>>> commands.
>>>>>> Result: 6,440,627 reads (gl thread on + PBO patches)
>>>>>> Result: 274,960 reads (gl thread off)
>>>>>>
>>>>>> In your case, "Result: 77,444,412 reads", I hope you're patient.
>>>>>> I think you must wait at least 10 minutes.
>>>>>
>>>>> Now, I was patient...
>>>>> I tried twice: I killed it after ~20 minutes the first time and
>>>>> attached gdb to it during the second run.
>>>>>
>>>>> 0x00007fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from
>>>>> /lib64/libpthread.so.0
>>>>> (gdb) bt
>>>>> #0 0x00007fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from
>>>>> /lib64/libpthread.so.0
>>>>> #1 0x00007fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
>>>>> #2 0x00007fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
>>>>> #3 0x0000000000401e18 in ?? ()
>>>>> #4 0x00000000004028c7 in ?? ()
>>>>> #5 0x00007fbda9925781 in fghRedrawWindow () from
>>>>> /usr/lib64/libglut.so.3
>>>>> #6 0x00007fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
>>>>> #7 0x00007fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
>>>>> #8 0x00007fbda9925ce4 in glutMainLoopEvent () from
>>>>> /usr/lib64/libglut.so.3
>>>>> #9 0x00007fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
>>>>> #10 0x00000000004019fc in ?? ()
>>>>> #11 0x00007fbda957e541 in __libc_start_main () from /lib64/libc.so.6
>>>>> #12 0x0000000000401afa in ?? ()
>>>>>
>>>>> Should I do more, or is it not worth it?
>>>>>
>>>>> Dieter
>>>>
>>>> Hello Dieter,
>>>>
>>>> To be honest, I don't know how long you need to wait. 77 million PBO
>>>> transfers is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.
>>>>
>>>> Hmm, based on the image size (194*188*4 bytes), you need to transfer
>>>> approximately 10522 GB of data from your GPU, which likely takes around
>>>> 20 minutes if PCIe runs at full speed. Honestly, I would leave the
>>>> application running in the background for a couple of hours.
>>>
>>> Basically, the application needs to be fixed not to emit an unlimited
>>> number of PBO transfers without doing anything which requires
>>> synchronizing to the transfers.
>>>
>>>
>>
>> Hello Michel, Timothy, Marek
>>
>> Yes, I think it should limit the number of transfers to a million, and
>> also use a fence to measure the PBO transfer.
>>
>>
>> However, I have found other crashes on PCSX2 with those patches. They
>> seem related to a synchronization issue with GLX/DRI/X11. This series
>> removes most of the gl syncs for PCSX2, so any missing sync will trigger
>> a crash. Or I have a non-obvious bug in my patches.
>>
>>
>> Please find below a backtrace of a crash during a draw. I managed to get a similar backtrace (i.e.
>> same exception in _XReply/dequeue_pending_request) when I call XGetGeometry.
>>
>>
>> #4 0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> "dequeue_pending_request")
>> at assert.c:101
>> #5 0xf60abbcd in dequeue_pending_request (dpy=<optimized out>, req=<optimized out>) at ../../src/xcb_io.c:185
>> #6 0xf60aca17 in _XReply (dpy=0xe8fdde80, rep=0xcd46b910, extra=6, discard=0) at ../../src/xcb_io.c:639
>> #7 0xf3bba8df in DRI2GetBuffersWithFormat (dpy=0xe8fdde80, drawable=83886261, width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, outCount=0xcd46ba24) at dri2.c:485
>> #8 0xf3bbac45 in dri2GetBuffersWithFormat (driDrawable=0xd8ba11d0, width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, out_count=0xcd46ba24, loaderPrivate=0xf225df10) at dri2_glx.c:894
>> #9 0xd555e121 in dri2_drawable_get_buffers (count=<synthetic pointer>, atts=0xa15f8b20, drawable=0xa2e50a00) at dri2.c:285
>> #10 dri2_allocate_textures (ctx=0xd8b98810, drawable=0xa2e50a00, statts=0xa15f8b20, statts_count=2) at dri2.c:480
>> #11 0xd5557bc0 in dri_st_framebuffer_validate (stctx=0x9df20900, stfbi=0xa2e50a00, statts=0xa15f8b20, count=2, out=0xcd46bb80) at dri_drawable.c:83
>> #12 0xd533ae8a in st_framebuffer_validate (stfb=stfb@entry=0xa15f8780, st=st@entry=0x9df20900) at state_tracker/st_manager.c:189
>>
>>
>> I don't have any clue about the GLX/DRI/X11 interaction with OpenGL. If
>> someone has any idea, feel free to share :)
>
> If it helps, here is the backtrace from XGetGeometry, which I can "easily" trigger. I
> hit the above trace only once. Note that the above trace was inside glthread, whereas
> XGetGeometry is called from the application thread.
>
> #4 0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> "dequeue_pending_request")
> at assert.c:101
> #5 0xf60abbcd in dequeue_pending_request (dpy=<optimized out>, req=<optimized out>) at ../../src/xcb_io.c:185
> #6 0xf60aca17 in _XReply (dpy=0x8ad3ca80, rep=0xd637b89c, extra=6, discard=1) at ../../src/xcb_io.c:639
> #7 0xf6090a9e in XGetGeometry (dpy=0x8ad3ca80, d=83886309, root=0xd637ba40, x=0xd637ba80, y=0xd637bac0, width=0xd637b980, height=0xd637b940, borderWidth=0xd637b9c0, depth=0xd637ba00) at ../../src/GetGeom.c:47
> #8 0xe5d868b8 in GSWndOGL::GetClientRect (this=0xd8a6509c) at ../plugins/GSdx/GSWndOGL.cpp:219
So, unless the application made sure that XInitThreads was called before
any other libX11 APIs, and all libX11 API calls in the application and
in Mesa are guarded by XLockDisplay/XUnlockDisplay, this is invalid
libX11 API usage, and a crash is expected.

BTW, in addition to what I wrote in my other post, I think this boils
down to: Mesa can only call any libX11 APIs from the main thread, not
from the glthread.

In some cases, an alternative might be using XCB APIs directly instead
of libX11 APIs.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer