[PATCH] amdgpu: disable GPU reset if amdgpu.lockup_timeout=0
Marek Olšák
maraeo at gmail.com
Tue Dec 12 15:02:17 UTC 2017
On Tue, Dec 12, 2017 at 4:18 AM, Liu, Monk <Monk.Liu at amd.com> wrote:
> NAK, your change breaks the SRIOV logic:
>
> Without lockup_timeout set, gpu_recover() won't get called at all, unless your IB triggers an invalid instruction and that IRQ invokes
> amdgpu_gpu_recover(). In that case you should disable the logic in that IRQ handler instead of changing gpu_recover() itself, because
> for SRIOV we need gpu_recover() even when lockup_timeout is zero.
The default value of 0 indicates that GPU reset isn't ready to be
enabled by default. That's what it means. Once the GPU reset works,
the default should be non-zero (e.g. 10000) and
amdgpu.lockup_timeout=0 should be used to disable all GPU resets in
order to be able to do scandumps and debug GPU hangs.
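
To make the two positions in this thread concrete, here is a rough sketch
of the policy being discussed. It is not the driver code: the struct, its
fields, and both function names are hypothetical and only model the
decision "may a GPU reset run for this timeout setting?". The first
function follows the meaning proposed above (0 disables every reset); the
second models the quoted objection, where SR-IOV still recovers at 0.

```c
#include <stdbool.h>

/* Hypothetical model of the settings involved; not the amdgpu structs. */
struct reset_policy {
    int  lockup_timeout_ms; /* stands in for amdgpu.lockup_timeout */
    bool is_sriov_vf;       /* assumed flag for the SR-IOV case */
};

/* Proposed meaning: lockup_timeout == 0 disables all GPU resets. */
static bool reset_allowed_proposed(const struct reset_policy *p)
{
    return p->lockup_timeout_ms != 0;
}

/* Quoted objection: SR-IOV needs recovery even when the timeout is 0. */
static bool reset_allowed_sriov_exempt(const struct reset_policy *p)
{
    return p->is_sriov_vf || p->lockup_timeout_ms != 0;
}
```

The two functions differ only for an SR-IOV device with a zero timeout,
which is exactly the case under dispute.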
Marek