[mipsel+rs780e]Occasionally "GPU lockup" after resuming from suspend.

Chen Jie chenj at lemote.com
Tue Feb 21 02:37:06 PST 2012


On February 17, 2012 at 5:27 PM, Chen Jie <chenj at lemote.com> wrote:
>> One good way to test gart is to go over GPU gart table and write a
>> dword using the GPU at end of each page something like 0xCAFEDEAD
>> or somevalue that is unlikely to be already set. And then go over
>> all the page and check that GPU write succeed. Abusing the scratch
>> register write back feature is the easiest way to try that.
> I'm planning to add a GART table check procedure when resume, which
> will go over GPU gart table:
> 1. read(backup) a dword at end of each GPU page
> 2. write a mark by GPU and check it
> 3. restore the original dword
The attached validateGART.patch does the job:
* It currently works only on mips64 platforms.
* To use it, apply all_in_vram.patch first, which allocates the CP
ring, IH, and IB in VRAM and hard-codes no_wb=1.
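
For readers without the attachment, here is a minimal userspace sketch of
the backup / mark / check / restore sweep described above. It is not the
code in validateGART.patch: gart_validate_sim(), the gpu_write_fn hook and
the two sim_write_* stand-ins are hypothetical names, and a plain array
stands in for the pages mapped through the GART. The marker value is the
one that appears in the dmesg output.

```c
#include <stddef.h>
#include <stdint.h>

#define GPU_PAGE_DWORDS 1024u        /* 4096-byte GPU page / 4-byte dword */
#define GART_MARK       0xDEADBEEFu  /* marker value seen in the dmesg output */

/* Hypothetical hook standing in for "let the GPU write one dword". */
typedef void (*gpu_write_fn)(uint32_t *dst, uint32_t value);

/* Stand-in for a working mapping: the write reaches memory. */
void sim_write_ok(uint32_t *dst, uint32_t value) { *dst = value; }

/* Stand-in for a broken mapping: the write never lands. */
void sim_write_lost(uint32_t *dst, uint32_t value) { (void)dst; (void)value; }

/*
 * Sweep npages GPU pages stored back to back in mem. For each page:
 *   1. back up the last dword,
 *   2. write the marker through gpu_write and read it back,
 *   3. restore the original dword.
 * Returns the number of pages whose marker did not read back.
 */
size_t gart_validate_sim(uint32_t *mem, size_t npages, gpu_write_fn gpu_write)
{
    size_t bad = 0;

    for (size_t i = 0; i < npages; i++) {
        uint32_t *slot = &mem[(i + 1) * GPU_PAGE_DWORDS - 1];
        uint32_t saved = *slot;          /* step 1: backup */

        gpu_write(slot, GART_MARK);      /* step 2: mark via the "GPU" */
        if (*slot != GART_MARK)
            bad++;                       /* the write did not arrive */

        *slot = saved;                   /* step 3: restore */
    }
    return bad;
}
```

Calling gart_validate_sim(mem, n, sim_write_ok) on an n-page buffer should
report no bad pages, while sim_write_lost makes every page fail; in both
cases the original dwords are put back.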

The GART test routine is invoked in r600_resume. We tried it and
found that, when the lockup happened, the GART table was still good
before userspace tasks were restarted. The related dmesg output follows:
[ 1521.820312] [drm] r600_gart_table_validate(): Validate GART Table
at 9000000040040000, 32768 entries, Dummy
Page[0x000000000e004000-0x000000000e007fff]
[ 1522.019531] [drm] r600_gart_table_validate(): Sweep 32768
entries(valid=8544, invalid=24224, total=32768).
...
[ 1531.156250] PM: resume of devices complete after 9396.588 msecs
[ 1532.152343] Restarting tasks ... done.
[ 1544.468750] radeon 0000:01:05.0: GPU lockup CP stall for more than 10003msec
[ 1544.472656] ------------[ cut here ]------------
[ 1544.480468] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:243
radeon_fence_wait+0x25c/0x314()
[ 1544.488281] GPU lockup (waiting for 0x0002136B last fence id 0x0002136A)
...
[ 1544.886718] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 1545.046875] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 1545.062500] radeon 0000:01:05.0: WB disabled
[ 1545.097656] [drm] ring test succeeded in 0 usecs
[ 1545.105468] [drm] ib test succeeded in 0 usecs
[ 1545.109375] [drm] Enabling audio support
[ 1545.113281] [drm] r600_gart_table_validate(): Validate GART Table
at 9000000040040000, 32768 entries, Dummy
Page[0x000000000e004000-0x000000000e007fff]
[ 1545.125000] [drm:r600_gart_table_validate] *ERROR* Iter=0:
unexpected value 0x745aaad1(expect 0xDEADBEEF)
entry=0x000000000e008067, orignal=0x745aaad1
...
/* System blocked here. */

Any ideas?

BTW, we found the following in r600_pcie_gart_enable()
(drivers/gpu/drm/radeon/r600.c):
WREG32(VM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDR,
(u32)(rdev->dummy_page.addr >> 12));

On our platform PAGE_SIZE is 16K. Could this cause a problem?
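
To make the arithmetic behind that question concrete, here is a small
sketch of what the fixed 12-bit shift produces. fault_default_frame() and
frame_round_trips() are hypothetical helpers written for this mail, not
driver functions; the example address is the start of the dummy page from
the dmesg output. It only shows that the shift itself loses no bits for a
16K-aligned address. It says nothing about how the hardware treats the
remaining 12K of the dummy page.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the (u32)(addr >> 12) expression. */
uint32_t fault_default_frame(uint64_t dummy_addr)
{
    return (uint32_t)(dummy_addr >> 12);
}

/* Nonzero when shifting back by 12 recovers the original address. */
int frame_round_trips(uint64_t addr)
{
    return ((uint64_t)fault_default_frame(addr) << 12) == addr;
}
```

For the dummy page at 0x0e004000 the value written would be 0xe004, and
the round trip holds because a 16K-aligned address is also 4K-aligned.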

Also, in radeon_gart_unbind() and radeon_gart_restore(), should the
logic be changed to:
  for (j = 0; j < (PAGE_SIZE / RADEON_GPU_PAGE_SIZE); j++, t++) {
          radeon_gart_set_page(rdev, t, page_base);
-         page_base += RADEON_GPU_PAGE_SIZE;
+         if (page_base != rdev->dummy_page.addr)
+                 page_base += RADEON_GPU_PAGE_SIZE;
  }
???
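
To show what the two variants of the loop put into the table, here is a
standalone sketch. fill_entries() and its pin_dummy flag are hypothetical
and only model the loop body; radeon_gart_set_page() is replaced by a
store into an array, and CPU_PAGE_SIZE is hard-coded to the 16K of our
platform.

```c
#include <stddef.h>
#include <stdint.h>

#define RADEON_GPU_PAGE_SIZE   4096u
#define CPU_PAGE_SIZE          16384u  /* PAGE_SIZE on our platform */
#define GPU_PAGES_PER_CPU_PAGE (CPU_PAGE_SIZE / RADEON_GPU_PAGE_SIZE)

/*
 * Compute the GPU page addresses that one CPU page expands to.
 * out must hold GPU_PAGES_PER_CPU_PAGE entries.
 * pin_dummy == 0: current logic, always advance by one GPU page.
 * pin_dummy != 0: proposed logic, do not advance while page_base
 *                 is the dummy page address.
 */
void fill_entries(uint64_t *out, uint64_t page_base,
                  uint64_t dummy_addr, int pin_dummy)
{
    for (size_t j = 0; j < GPU_PAGES_PER_CPU_PAGE; j++) {
        out[j] = page_base;  /* stands in for radeon_gart_set_page() */
        if (!pin_dummy || page_base != dummy_addr)
            page_base += RADEON_GPU_PAGE_SIZE;
    }
}
```

With the dummy page at 0x0e004000, the current logic yields 0x0e004000,
0x0e005000, 0x0e006000 and 0x0e007000, which matches the 16K dummy page
range printed by the validation routine. The proposed logic points all
four entries at 0x0e004000. Pages other than the dummy page expand the
same way under both variants.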



Regards,
-- Chen Jie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: all_in_vram.patch
Type: text/x-patch
Size: 3971 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120221/cac7c118/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: validateGART.patch
Type: text/x-patch
Size: 3947 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20120221/cac7c118/attachment-0003.bin>

