[PATCH] drm/amdgpu: Return -EINVAL when whole gpu reset happened
Liu ChengZhe
ChengZhe.Liu at amd.com
Wed Dec 9 09:46:52 UTC 2020
If CS init returns -ECANCELED, the UMD frees the context and creates a new one.
A job in the new context can then continue executing. In the case of
BACO or mode 1 reset, we can't allow this to happen: VRAM has been lost
after a whole GPU reset, so the job can't be guaranteed to succeed.
Signed-off-by: Liu ChengZhe <ChengZhe.Liu at amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
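
Note for reviewers: the error-code selection this patch aims for can be modelled
outside the driver roughly as below. This is only a sketch, not driver code. The
names sketch_dev, sketch_ctx and sketch_cs_check are hypothetical, and it assumes
the job's vram_lost_counter simply mirrors the device counter at allocation time,
so both checks in the diff reduce to comparing the context's snapshot against the
device counter.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver state read by the checks. */
struct sketch_dev {
	uint32_t vram_lost_counter;	/* assumed to change whenever a reset loses VRAM */
};

struct sketch_ctx {
	bool guilty;			/* context was marked guilty */
	uint32_t vram_lost_counter;	/* device counter as seen by this context */
};

/* Return 0 if the submission may proceed, otherwise a negative errno. */
static int sketch_cs_check(const struct sketch_dev *dev,
			   const struct sketch_ctx *ctx)
{
	bool vram_lost = ctx->vram_lost_counter != dev->vram_lost_counter;

	if (ctx->guilty)
		/* Guilty context: -EINVAL if VRAM was also lost, else -ECANCELED. */
		return vram_lost ? -EINVAL : -ECANCELED;

	if (vram_lost)
		/* Not guilty, but VRAM contents are gone since the snapshot. */
		return -EINVAL;

	return 0;
}
```

With this model, a guilty context whose counter still matches the device gets
-ECANCELED as before, while any context whose counter is stale gets -EINVAL.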
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 85e48c29a57c..2a98f58134ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -120,6 +120,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
 	uint64_t *chunk_array;
 	unsigned size, num_ibs = 0;
 	uint32_t uf_offset = 0;
+	uint32_t vramlost_count = 0;
 	int i;
 	int ret;
@@ -140,7 +141,11 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
 	/* skip guilty context job */
 	if (atomic_read(&p->ctx->guilty) == 1) {
-		ret = -ECANCELED;
+		vramlost_count = atomic_read(&p->adev->vram_lost_counter);
+		if (p->ctx->vram_lost_counter != vramlost_count)
+			ret = -EINVAL;
+		else
+			ret = -ECANCELED;
 		goto free_chunk;
 	}
@@ -246,7 +251,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs
 		goto free_all_kdata;
 	if (p->ctx->vram_lost_counter != p->job->vram_lost_counter) {
-		ret = -ECANCELED;
+		ret = -EINVAL;
 		goto free_all_kdata;
 	}
--
2.25.1