<html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body text="#000000" bgcolor="#FFFFFF"> <div class="moz-cite-prefix">Hi Nicolai, If we don't already have an option for this, try doubling the size of the VM area allocated for each BO in userspace. That should give you a nice hole between each BO and so should help to catch cases where somebody writes past the end of a BO. Regards, Christian. On 22.06.2016 at 09:50, Nicolai Hähnle wrote: </div> <blockquote cite="mid:576A4332.1090409@gmail.com" type="cite">Hi Mads, setting R600_DEBUG=nodma in the X server should work around your problem for now. Marek, perhaps an out-of-bounds check for tiled texture memory access similar to the linear access check is necessary? I wonder if you've seen something about that in the docs. I've annotated the sDMA IB dump. It's a linear-to-display-tiled copy on Carrizo. I tried to reproduce with the attached patch, but failed to do so even with amdgpu.vm_debug=1. With the patch, I get DMA copies that are identical to the one that causes the VM fault except for a different bank_height and macro_tile_aspect, so the issue is likely related to those. Nicolai On 21.06.2016 19:32, Nicolai Hähnle wrote: <blockquote type="cite">On 21.06.2016 19:16, Mads wrote: <blockquote type="cite">I sent this 1.5 hours ago, but since it hasn't arrived on the mailing list yet, I'll try again... </blockquote> It arrived, no worries :) I'll take a look later. Nicolai <blockquote type="cite"> On 2016-06-21 17:48, Mads wrote: <blockquote type="cite">On 2016-06-21 10:12, Mads wrote: On 2016-06-21 09:39, Nicolai Hähnle wrote: Thanks. However, I still don't think this is going to help. Your earlier trace experiments showed that the problematic SDMA commands came from the X server, _not_ from plasmashell. So what we see here is likely just the first set of GPU commands sent by plasmashell after the VM fault occurred.
Since the plasmashell process is unable to tell who caused the VM fault, it takes the blame incorrectly. Are you sure the X server is using your self-compiled radeonsi_dri.so and has the environment variable set? If it creates a ddebug_dump, it might be somewhere else (it's based off the HOME environment variable, which may be different). I'll take a second look to see if there's an X dump there too, but unfortunately it'll be about 8 hours before I have the machine at hand again. And yes, I'm sure; everything is built through portage, so there is no "self-compiled" on the system per se. There's always just one lib available at any time :) </blockquote> You were right! X didn't have R600_DEBUG=check_vm in its environment (no login shell/sourcing of /etc/profile). Here's what I ran: <blockquote type="cite">$ XAUTHORITY=.Xauthority DISPLAY=:0 LIBGL_DEBUG=verbose dolphin libGL: pci id for fd 9: 1002:9874, driver radeonsi libGL: OpenDriver: trying /usr/lib64/dri/tls/radeonsi_dri.so libGL: OpenDriver: trying /usr/lib64/dri/radeonsi_dri.so si_vm_fault_occured: failed to parse line ' Either enable ECC checking or force module loading by setting 'ecc_enable_override'. ' libGL: Using DRI3 for screen 0 Trying to convert empty KLocalizedString to QString. Cannot creat accessible child interface for object: PlacesView(0x118d670) index: 5 QPixmap::scaled: Pixmap is a null pixmap QPixmap::scaled: Pixmap is a null pixmap (... etc ...) The X11 connection broke (error 1). Did the X11 server die? </blockquote> Attaching dmesg and ddebug_dump.
- Mads </blockquote> </blockquote> </blockquote> </body> </html>