<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">On 2025-01-10 11:23, Chen, Xiaogang
wrote:<br>
</div>
<blockquote type="cite" cite="mid:7fba9b16-4bb7-44c2-bc7e-d455024ce2b7@amd.com">
<br>
On 1/10/2025 8:37 AM, Philip Yang wrote:
<br>
<blockquote type="cite">
<br>
<br>
On 2025-01-10 02:49, Emily Deng wrote:
<br>
<blockquote type="cite">For a partial migration from RAM to VRAM,
migrate->cpages is not
<br>
equal to migrate->npages, so the loop should use migrate->npages and
check every page
<br>
in the range to see whether it can be copied.
<br>
<br>
Also, migrate->dst[i] should be set only for pages that can be
migrated;
<br>
otherwise migrate_vma_pages will migrate the wrong pages based on
migrate->dst[i].
<br>
<br>
v2:
<br>
Add mpages to break the loop earlier.
<br>
<br>
v3:
<br>
Use MIGRATE_PFN_MIGRATE to identify whether a page can be
migrated.
<br>
</blockquote>
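<br>
The selection logic described in the commit message can be sketched in user-space C. This is only an illustration under stated assumptions: FAKE_PFN_MIGRATE and fill_dst are hypothetical stand-ins, not the kernel's MIGRATE_PFN_MIGRATE or the driver function, and the value written to dst[i] is a placeholder rather than a real VRAM pfn.
<br>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag standing in for the kernel's MIGRATE_PFN_MIGRATE. */
#define FAKE_PFN_MIGRATE (1ULL << 1)

/*
 * Walk all npages source entries, fill dst[] only for entries marked
 * migratable, and stop early once cpages entries have been filled.
 * Returns the number of dst entries populated.
 */
static uint64_t fill_dst(const uint64_t *src, uint64_t *dst,
                         uint64_t npages, uint64_t cpages)
{
    uint64_t mpages = 0;

    assert(src && dst);
    for (uint64_t i = 0; i < npages && mpages < cpages; i++) {
        if (!(src[i] & FAKE_PFN_MIGRATE))
            continue;       /* skipped page: dst[i] is left untouched */
        dst[i] = i + 1;     /* placeholder for a real destination pfn */
        mpages++;
    }
    return mpages;
}
```

<br>
Entries that are not marked migratable keep dst[i] == 0, which is what lets a later cleanup pass tell populated entries from skipped ones.
<br>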
<br>
The error handling needs the change below. With that fixed, this
patch is
<br>
<br>
Reviewed-by: Philip Yang <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com">&lt;Philip.Yang@amd.com&gt;</a>
<br>
<br>
<blockquote type="cite">Signed-off-by: Emily
Deng <a class="moz-txt-link-rfc2396E" href="mailto:Emily.Deng@amd.com">&lt;Emily.Deng@amd.com&gt;</a>
<br>
---
<br>
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17
++++++++++-------
<br>
1 file changed, 10 insertions(+), 7 deletions(-)
<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
<br>
index 4b275937d05e..bfaccabeb3a0 100644
<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
<br>
@@ -278,10 +278,11 @@ svm_migrate_copy_to_vram(struct kfd_node
*node, struct svm_range *prange,
<br>
struct migrate_vma *migrate, struct dma_fence
**mfence,
<br>
dma_addr_t *scratch, uint64_t ttm_res_offset)
<br>
{
<br>
- uint64_t npages = migrate->cpages;
<br>
+ uint64_t npages = migrate->npages;
<br>
struct amdgpu_device *adev = node->adev;
<br>
struct device *dev = adev->dev;
<br>
struct amdgpu_res_cursor cursor;
<br>
+ uint64_t mpages = 0;
<br>
dma_addr_t *src;
<br>
uint64_t *dst;
<br>
uint64_t i, j;
<br>
@@ -295,14 +296,16 @@ svm_migrate_copy_to_vram(struct kfd_node
*node, struct svm_range *prange,
<br>
amdgpu_res_first(prange->ttm_res, ttm_res_offset,
<br>
npages << PAGE_SHIFT, &cursor);
<br>
- for (i = j = 0; i < npages; i++) {
<br>
+ for (i = j = 0; (i < npages) && (mpages <
migrate->cpages); i++) {
<br>
struct page *spage;
<br>
- dst[i] = cursor.start + (j << PAGE_SHIFT);
<br>
- migrate->dst[i] = svm_migrate_addr_to_pfn(adev,
dst[i]);
<br>
- svm_migrate_get_vram_page(prange,
migrate->dst[i]);
<br>
- migrate->dst[i] = migrate_pfn(migrate->dst[i]);
<br>
-
<br>
+ if (migrate->src[i] & MIGRATE_PFN_MIGRATE) {
<br>
+ dst[i] = cursor.start + (j << PAGE_SHIFT);
<br>
+ migrate->dst[i] =
svm_migrate_addr_to_pfn(adev, dst[i]);
<br>
+ svm_migrate_get_vram_page(prange,
migrate->dst[i]);
<br>
+ migrate->dst[i] =
migrate_pfn(migrate->dst[i]);
<br>
+ mpages++;
<br>
+ }
<br>
spage = migrate_pfn_to_page(migrate->src[i]);
<br>
if (spage && !is_zone_device_page(spage)) {
<br>
src[i] = dma_map_page(dev, spage, 0, PAGE_SIZE,
<br>
</blockquote>
out_free_vram_pages:
<br>
if (r) {
<br>
pr_debug("failed %d to copy memory to vram\n",
r);
<br>
- while (i--) {
<br>
+
<br>
+ for (i = 0; i < npages && mpages;
i++) {
<br>
+ if (!dst[i])
<br>
+ continue;
<br>
svm_migrate_put_vram_page(adev, dst[i]);
<br>
migrate->dst[i] = 0;
<br>
+ mpages--;
<br>
}
<br>
}
<br>
</blockquote>
<br>
This error handling does not need to recover all vram pages, since the error
happened in the middle. Can use do {....} while(i--);
<br>
</blockquote>
No. For example, if migrate npages = cpages = 4 and svm_migrate_copy_memory_gart
fails outside the for loop, then i is 4 and dst[4] is an out-of-range access.<br>
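<br>
A minimal user-space sketch of this case, not the driver code: cleanup_do_while and cleanup_forward are hypothetical helpers for illustration only. The first reports the highest index a do {...} while(i--) cleanup would touch when it starts from i == npages; the second mirrors the forward loop proposed above.
<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a do {...} while (i--) cleanup.
 * Returns the highest index the loop body visits.
 */
static uint64_t cleanup_do_while(uint64_t i)
{
    uint64_t max_idx = 0;

    do {
        if (i > max_idx)
            max_idx = i;    /* a real cleanup would read dst[i] here */
    } while (i--);

    return max_idx;
}

/*
 * Forward cleanup: visit only [0, npages), skip unpopulated entries,
 * and stop once mpages populated entries have been released.
 * Returns the number of entries released.
 */
static uint64_t cleanup_forward(uint64_t *dst, uint64_t npages,
                                uint64_t mpages)
{
    uint64_t released = 0;

    assert(dst != NULL);
    for (uint64_t i = 0; i < npages && mpages; i++) {
        if (!dst[i])
            continue;
        dst[i] = 0;         /* stands in for putting the vram page */
        mpages--;
        released++;
    }
    return released;
}
```

<br>
With npages = cpages = 4 and the failure after the loop, cleanup_do_while(4) reports index 4, one past the last valid entry, while cleanup_forward releases exactly the four populated entries.
<br>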
<blockquote type="cite" cite="mid:7fba9b16-4bb7-44c2-bc7e-d455024ce2b7@amd.com">
<br>
Regards
<br>
<br>
Xiaogang
<br>
<br>
</blockquote>
</body>
</html>