[PATCH] drm/nouveau: Fix bug in buffer relocs for Nouveau
John Ogness
john.ogness at linutronix.de
Mon Mar 27 08:42:48 UTC 2023
On 2023-01-19, Tanmay Bhushan <007047221b at gmail.com> wrote:
> dma_resv_wait_timeout returns greater than zero on success
> as opposed to ttm_bo_wait_ctx. As a result of that relocs
> will fail and give failure even when it was a success.
Today I switched my workstation from 6.2 to 6.3-rc3 and started seeing
lots of new kernel messages:
[ 642.138313][ T1751] nouveau 0000:f0:10.0: X[1751]: reloc wait_idle failed: 1500
[ 642.138389][ T1751] nouveau 0000:f0:10.0: X[1751]: reloc apply: 1500
[ 646.123490][ T1751] nouveau 0000:f0:10.0: X[1751]: reloc wait_idle failed: 1500
[ 646.123573][ T1751] nouveau 0000:f0:10.0: X[1751]: reloc apply: 1500
The graphics seemed to go slower or hang a bit when these messages would
appear. I then found your patch! However, I have some comments about it.
First, it should include a fixes tag:
Fixes: 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait")
> Signed-off-by: Tanmay Bhushan <007047221b at gmail.com>
> ---
> drivers/gpu/drm/nouveau/nouveau_gem.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
> index f77e44958037..0e3690459144 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_gem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
> @@ -706,9 +706,8 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli,
> ret = dma_resv_wait_timeout(nvbo->bo.base.resv,
> DMA_RESV_USAGE_BOOKKEEP,
> false, 15 * HZ);
> - if (ret == 0)
> + if (ret <= 0) {
> ret = -EBUSY;
This is incorrect for 2 reasons:
* it treats restarts as timeouts
* this function now returns >0 on success
> - if (ret) {
> NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n",
> ret);
> break;
I rearranged things to basically correctly translate the return code of
dma_resv_wait_timeout() to match the previous ttm_bo_wait():
ret = dma_resv_wait_timeout(nvbo->bo.base.resv,
DMA_RESV_USAGE_BOOKKEEP,
false, 15 * HZ);
if (ret == 0)
ret = -EBUSY;
if (ret > 0)
ret = 0;
if (ret) {
NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n",
ret);
break;
}
So the patch just becomes:
@@ -708,6 +708,8 @@ nouveau_gem_pushbuf_reloc_apply(struct n
false, 15 * HZ);
if (ret == 0)
ret = -EBUSY;
+ if (ret > 0)
+ ret = 0;
if (ret) {
NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n",
ret);
With this variant, everything runs correctly on my workstation again.
It probably deserves a comment about why @ret is being translated. Or
perhaps a new variable should be introduced to separate the return value
of dma_resv_wait_timeout() from the return value of this function.
Either way, this is an important fix for 6.3-rc!
John Ogness
More information about the dri-devel
mailing list