[Mesa-dev] [PATCH v4 0/3] asynchronous pbo transfer with glthread

Tue Apr 18 06:51:24 UTC 2017

On Mon, 17 Apr 2017 11:17:42 +0900
Michel Dänzer <michel at daenzer.net> wrote:

> On 15/04/17 05:08 PM, gregory hainaut wrote:
> > On Sat, 15 Apr 2017 00:50:15 +0200
> > Dieter Nützel <Dieter at nuetzel-hh.de> wrote:
> > 
> >> Am 14.04.2017 07:53, schrieb gregory hainaut:
> >>> On Fri, 14 Apr 2017 05:20:38 +0200
> >>> Dieter Nützel <Dieter at nuetzel-hh.de> wrote:
> >>>
> >>>> Am 14.04.2017 02:06, schrieb Dieter Nützel:
> >>>>> Hello Gregory,
> >>>>>
> >>>>> have you tested this with Mesa-demos/tests/pbo 'b' (benchmark)?
> >>>>> It results in crazy numbers and does not 'return' (one core stays @ 100%).
> >>>>
> >>>> This is related to 'mesa_glthread=true'.
> >>>> If I disable (unset) it, all is fine after the 'b' benchmark and 'pbo'
> >>>> exits with ESC as expected.
> >>>> The crazy numbers stay; maybe a counter overrun due to BIG numbers? ;-)
> >>>>
> >>>> Hope that helps.
> >>>>
> >>>> Dieter
> >>>
> >>> Hello Dieter,
> >>>
> >>> I tested the demo. There is a mostly unrelated bug on exit of the
> >>> application.
> >>>
> >>> Mesa 17.1.0-devel implementation error: In _mesa_DeleteHashTable,
> >>> found non-freed data
> >>>
> >>> I will add a call to _mesa_HashDeleteAll to fix it,
> >>> i.e. _mesa_HashDeleteAll(shared->ShadowBufferObjects, dummy_cb, ctx);
> >>>
> >>> Now let's go back to the test behavior. The benchmark sends 4 seconds of
> >>> asynchronous PBO transfer commands and then syncs gl_thread, which
> >>> means the application thread is blocked until all PBO transfers are
> >>> done. With gl_thread, commands are dispatched faster, so more of them
> >>> are queued and you need to wait longer before the application thread
> >>> returns.
> >>>
> >>> On my side, I need to wait around 45 seconds for 6 million commands.
> >>> Result:  6,440,627 reads (gl thread on + PBO patches)
> >>> Result:    274,960 reads (gl thread off)
> >>>
> >>> In your case, "Result:  77,444,412 reads", I hope you're patient.
> >>> I think you must wait at least 10 minutes.
> >>
> >> Now, I was patient...
> >> I tried twice: after ~20 minutes I killed it the first time, and I
> >> attached gdb to it during the second run.
> >>
> >> 0x00007fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> >> (gdb) bt
> >> #0  0x00007fbda686e9a6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> >> #1  0x00007fbda5359453 in ?? () from /usr/local/lib/dri/r600_dri.so
> >> #2  0x00007fbda53661f4 in ?? () from /usr/local/lib/dri/r600_dri.so
> >> #3  0x0000000000401e18 in ?? ()
> >> #4  0x00000000004028c7 in ?? ()
> >> #5  0x00007fbda9925781 in fghRedrawWindow () from /usr/lib64/libglut.so.3
> >> #6  0x00007fbda9925c08 in ?? () from /usr/lib64/libglut.so.3
> >> #7  0x00007fbda9926cf9 in fgEnumWindows () from /usr/lib64/libglut.so.3
> >> #8  0x00007fbda9925ce4 in glutMainLoopEvent () from /usr/lib64/libglut.so.3
> >> #9  0x00007fbda9925d85 in glutMainLoop () from /usr/lib64/libglut.so.3
> >> #10 0x00000000004019fc in ?? ()
> >> #11 0x00007fbda957e541 in __libc_start_main () from /lib64/libc.so.6
> >> #12 0x0000000000401afa in ?? ()
> >>
> >> Should I do more or not worth it?
> >>
> >> Dieter
> > 
> > Hello Dieter,
> > 
> > To be honest, I don't know how long you need to wait. 77 million
> > PBO transfers is quite huge. It depends on CPU/Memory/PCIe/VRAM/GPU speed.
> > 
> > Hmm, based on the image size (194*188*4 bytes), you need to transfer
> > approximately 10522 GiB of data from your GPU... which likely takes around
> > 20 minutes if PCIe runs at full speed. Honestly, I would leave the
> > application running in the background for a couple of hours.
> 
> Basically, the application needs to be fixed not to emit an unlimited
> number of PBO transfers without doing anything which requires
> synchronizing to the transfers.
> 
> 

Hello Michel, Timothy, Marek

Yes, I think it should limit the number of transfers to a million, and
also use a fence to measure the PBO transfers.
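
For reference, the volume estimate quoted above can be re-derived with a few
lines of Python. This is only a rough sketch: it assumes every read transfers
one full 194x188 image at 4 bytes per pixel, and the function and constant
names are mine, not something from the demo or from Mesa.

```python
# Back-of-the-envelope estimate of the readback volume in the pbo benchmark.
# Assumption: each read transfers one full 194x188 image at 4 bytes per pixel.

WIDTH, HEIGHT, BYTES_PER_PIXEL = 194, 188, 4
GIB = 1024 ** 3


def bytes_per_read(width=WIDTH, height=HEIGHT, bpp=BYTES_PER_PIXEL):
    """Size in bytes of a single PBO readback."""
    return width * height * bpp


def total_gib(reads):
    """Total data read back from the GPU for `reads` transfers, in GiB."""
    return reads * bytes_per_read() / GIB


def implied_rate_gib_per_s(reads, seconds):
    """Sustained transfer rate needed to finish `reads` in `seconds`."""
    return total_gib(reads) / seconds


if __name__ == "__main__":
    for label, reads in (("reported run", 77_444_412),
                         ("proposed cap", 1_000_000)):
        print(f"{label}: {reads:,} reads -> {total_gib(reads):,.0f} GiB")
```

With the reported 77,444,412 reads this reproduces the ~10522 GiB figure; a cap
of one million reads would bring it down to about 136 GiB.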

However, I have found other crashes on PCSX2 with those patches. They
seem related to a synchronization issue with GLX/DRI/X11. This series
removes most of the GL syncs for PCSX2, so any missing sync will trigger
a crash. Or I have a non-obvious bug in my patches.

Please find below a backtrace of a crash during a draw. I managed to get a
similar backtrace (i.e. the same exception in _XReply/dequeue_pending_request)
when I call XGetGeometry.

#4  0xf61ec777 in __GI___assert_fail (assertion=0xf6122099 "!xcb_xlib_unknown_req_in_deq", file=0xf6122067 "../../src/xcb_io.c", line=179, function=0xf612248d <__PRETTY_FUNCTION__.14063> "dequeue_pending_request")
    at assert.c:101
#5  0xf60abbcd in dequeue_pending_request (dpy=<optimized out>, req=<optimized out>) at ../../src/xcb_io.c:185
#6  0xf60aca17 in _XReply (dpy=0xe8fdde80, rep=0xcd46b910, extra=6, discard=0) at ../../src/xcb_io.c:639
#7  0xf3bba8df in DRI2GetBuffersWithFormat (dpy=0xe8fdde80, drawable=83886261, width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, outCount=0xcd46ba24) at dri2.c:485
#8  0xf3bbac45 in dri2GetBuffersWithFormat (driDrawable=0xd8ba11d0, width=0xd8ba11e8, height=0xd8ba11ec, attachments=0xcd46ba38, count=1, out_count=0xcd46ba24, loaderPrivate=0xf225df10) at dri2_glx.c:894
#9  0xd555e121 in dri2_drawable_get_buffers (count=<synthetic pointer>, atts=0xa15f8b20, drawable=0xa2e50a00) at dri2.c:285
#10 dri2_allocate_textures (ctx=0xd8b98810, drawable=0xa2e50a00, statts=0xa15f8b20, statts_count=2) at dri2.c:480
#11 0xd5557bc0 in dri_st_framebuffer_validate (stctx=0x9df20900, stfbi=0xa2e50a00, statts=0xa15f8b20, count=2, out=0xcd46bb80) at dri_drawable.c:83
#12 0xd533ae8a in st_framebuffer_validate (stfb=stfb@entry=0xa15f8780, st=st@entry=0x9df20900) at state_tracker/st_manager.c:189

I don't have any clue about the GLX/DRI/X11 interaction with OpenGL. If
someone has any idea, feel free to share :)

Best regards,
Gregory