[Bug 102322] System crashes after "[drm] IP block:gmc_v8_0 is hung!" / [drm] IP block:sdma_v3_0 is hung!
bugzilla-daemon at freedesktop.org
Thu Jun 28 04:17:19 UTC 2018
https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #15 from Andrey Grodzovsky <andrey.grodzovsky at amd.com> ---
(In reply to dwagner from comment #13)
> (In reply to Andrey Grodzovsky from comment #12)
> > Can you load the kernel with grub command line amdgpu.vm_update_mode=3 to
> > force CPU VM update mode and see if this helps ?
>
> Sure. Too early yet to say "hurray", but at an uptime of one hour,
> currently, 4.17.2 survived with amdgpu.vm_update_mode=3 already about 20
> times longer than without that option before the first crash.
>
> One (probably just informational) message is emitted by the kernel:
> [ 19.319565] CPU update of VM recommended only for large BAR system
>
> Can you explain a little: What is a "large BAR system", and what does the
> vm_update_mode=3 option actually cause? Should I expect any weird side
> effects to look for?
I think it just means a system with a large amount of VRAM, which therefore
needs a large BAR to map it. But I am not sure on that point.
vm_update_mode=3 means GPUVM page table updates are done by the CPU. By default
we do them using the DMA engine (SDMA) on the ASIC. The log showed a hang in
this engine, so I assumed there is something wrong with the SDMA commands we
submit.
As side effects I would expect higher CPU utilization and maybe slower
rendering.
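
To confirm which mode the running kernel actually picked up, the value can be
read back from sysfs and decoded. The following is a minimal sketch, not part
of the driver: the bit meanings (bit 0 = graphics, bit 1 = compute, -1 =
driver default) follow the amdgpu module parameter description as I understand
it, the sysfs path assumes the parameter is exported read-only by the loaded
module, and the function names are made up for illustration.

```python
from pathlib import Path

# Bit meanings taken from the amdgpu module parameter help text
# (assumption: unchanged on the kernel in use).
USE_CPU_FOR_GFX = 1 << 0      # 1 = graphics VM updates done by the CPU
USE_CPU_FOR_COMPUTE = 1 << 1  # 2 = compute VM updates done by the CPU

# Usual sysfs location for module parameters; only present when amdgpu is
# loaded and the parameter is exported (assumption).
PARAM_PATH = Path("/sys/module/amdgpu/parameters/vm_update_mode")


def describe_vm_update_mode(value: int) -> str:
    """Turn a vm_update_mode value into a human-readable description."""
    if value == -1:
        return "driver default"
    gfx = bool(value & USE_CPU_FOR_GFX)
    compute = bool(value & USE_CPU_FOR_COMPUTE)
    if gfx and compute:
        return "CPU updates for graphics and compute"
    if gfx:
        return "CPU updates for graphics only"
    if compute:
        return "CPU updates for compute only"
    return "CPU updates never used"


def read_vm_update_mode(path: Path = PARAM_PATH):
    """Return the current parameter value, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = read_vm_update_mode()
    if mode is None:
        print("vm_update_mode not readable (is amdgpu loaded?)")
    else:
        print(f"vm_update_mode={mode}: {describe_vm_update_mode(mode)}")
```

With amdgpu.vm_update_mode=3 on the kernel command line, the script should
report CPU updates for both graphics and compute if the option took effect.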
>
>
> BTW: Not a result of that option, but of the kernel version, seems to be the
> fact that the shader clock keeps at a pretty high frequency all the time -
> even without any 3d or compute load, just displaying a quiet 4k/60Hz desktop
> image:
>
> cat pp_dpm_sclk
> 0: 214Mhz
> 1: 481Mhz
> 2: 760Mhz
> 3: 1020Mhz
> 4: 1102Mhz
> 5: 1138Mhz
> 6: 1180Mhz *
> 7: 1220Mhz
>
> Much lower shader clocks are used only if I lower the refresh rate of the
> screen. Is there a reason why the shader clocks should stay high even in the
> absence of 3d/compute load?
>
> (I would have better understood if the minimum memory clock was depending on
> the refresh rate, but memory clock stays as low as with the older kernels.)
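
To track which sclk level is active while changing the refresh rate, the
pp_dpm_sclk listing quoted above can be parsed mechanically. This is only a
sketch under assumptions: the sample text is copied from the listing above,
the "index: frequency, active level marked with *" format is inferred from
that listing alone, and the function names are hypothetical.

```python
import re

# Sample copied from the pp_dpm_sclk listing quoted above.
SAMPLE = """\
0: 214Mhz
1: 481Mhz
2: 760Mhz
3: 1020Mhz
4: 1102Mhz
5: 1138Mhz
6: 1180Mhz *
7: 1220Mhz
"""

# One level per line: "<index>: <freq>Mhz", with " *" on the active level.
LEVEL_RE = re.compile(r"\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (index, mhz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line)
        if match:
            index, mhz, star = match.groups()
            levels.append((int(index), int(mhz), star is not None))
    return levels


def active_level(levels):
    """Return (index, mhz) of the level marked active, or None."""
    for index, mhz, is_active in levels:
        if is_active:
            return index, mhz
    return None


if __name__ == "__main__":
    print(active_level(parse_dpm_levels(SAMPLE)))
```

Feeding it the contents of the real pp_dpm_sclk file instead of SAMPLE, once
per refresh-rate setting, would give a compact record of how the active level
changes.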
--
You are receiving this mail because:
You are the assignee for the bug.