<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 2024-04-22 10:40, Christian König
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:60f0ec7b-52d4-4e8c-a58a-91824af9db5c@amd.com">Am
      22.04.24 um 15:57 schrieb Philip Yang:
      <br>
      <blockquote type="cite">Define macro MAX_SG_SEGMENT_SIZE 2GB,
        because struct scatterlist length
        <br>
        is unsigned int, and some users of it cast to a signed int, so
        every
        <br>
        segment of sg table is limited to size 2GB maximum.
        <br>
        <br>
        For contiguous VRAM allocation, don't limit the max buddy block
        size in
        <br>
        order to get contiguous VRAM memory. To workaround the sg table
        segment
        <br>
        size limit, allocate multiple segments if contiguous size is
        bigger than
        <br>
        MAX_SG_SEGMENT_SIZE.
        <br>
        <br>
        Signed-off-by: Philip Yang <a class="moz-txt-link-rfc2396E" href="mailto:Philip.Yang@amd.com"><Philip.Yang@amd.com></a>
        <br>
        ---
        <br>
          drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 17
        ++++++++++++-----
        <br>
          1 file changed, 12 insertions(+), 5 deletions(-)
        <br>
        <br>
        diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
        b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
        <br>
        index 4be8b091099a..9fe56a21ef88 100644
        <br>
        --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
        <br>
        +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
        <br>
        @@ -31,6 +31,8 @@
        <br>
          #include "amdgpu_atomfirmware.h"
        <br>
          #include "atom.h"
        <br>
          +#define MAX_SG_SEGMENT_SIZE    (2UL << 30)
        <br>
        +
        <br>
          struct amdgpu_vram_reservation {
        <br>
              u64 start;
        <br>
              u64 size;
        <br>
        @@ -532,8 +534,13 @@ static int amdgpu_vram_mgr_new(struct
        ttm_resource_manager *man,
        <br>
                    BUG_ON(min_block_size < mm->chunk_size);
        <br>
          -        /* Limit maximum size to 2GiB due to SG table
        limitations */
        <br>
        -        size = min(remaining_size, 2ULL << 30);
        <br>
        +        if (place->flags & TTM_PL_FLAG_CONTIGUOUS)
        <br>
        +            size = remaining_size;
        <br>
        +        else
        <br>
        +            /* Limit maximum size to 2GiB due to SG table
        limitations
        <br>
        +             * for no contiguous allocation.
        <br>
        +             */
        <br>
        +            size = min(remaining_size, MAX_SG_SEGMENT_SIZE);
        <br>
      </blockquote>
      <br>
      Well that doesn't make sense, either fix the creation of the sg
      tables or limit the segment size. Not both.
      <br>
    </blockquote>
    <p>yes, right. we don't need limit the segment size for
      non-contiguous allocation either as this is handled by
      min_block_size. I will send v4 patch to fix this. Then we could
      have another patch to remove the while loop, size and remaining
      size to simply the code in future.</p>
    <p>Regards,</p>
    <p>Philip<br>
    </p>
    <blockquote type="cite" cite="mid:60f0ec7b-52d4-4e8c-a58a-91824af9db5c@amd.com">
      <br>
      <blockquote type="cite">            if ((size >=
        (u64)pages_per_block << PAGE_SHIFT) &&
        <br>
                          !(size & (((u64)pages_per_block <<
        PAGE_SHIFT) - 1)))
        <br>
        @@ -675,7 +682,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct
        amdgpu_device *adev,
        <br>
              amdgpu_res_first(res, offset, length, &cursor);
        <br>
              while (cursor.remaining) {
        <br>
                  num_entries++;
        <br>
        -        amdgpu_res_next(&cursor, cursor.size);
        <br>
        +        amdgpu_res_next(&cursor, min(cursor.size,
        MAX_SG_SEGMENT_SIZE));
        <br>
              }
        <br>
                r = sg_alloc_table(*sgt, num_entries, GFP_KERNEL);
        <br>
        @@ -695,7 +702,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct
        amdgpu_device *adev,
        <br>
              amdgpu_res_first(res, offset, length, &cursor);
        <br>
              for_each_sgtable_sg((*sgt), sg, i) {
        <br>
                  phys_addr_t phys = cursor.start +
        adev->gmc.aper_base;
        <br>
        -        size_t size = cursor.size;
        <br>
        +        unsigned long size = min(cursor.size,
        MAX_SG_SEGMENT_SIZE);
        <br>
      </blockquote>
      <br>
      Please keep size_t here or use unsigned int, using unsigned long
      just looks like trying to hide the problem.
      <br>
      <br>
      And I wouldn't use a separate define but rather just INT_MAX
      instead.
      <br>
      <br>
      Regards,
      <br>
      Christian.
      <br>
      <br>
      <blockquote type="cite">          dma_addr_t addr;
        <br>
                    addr = dma_map_resource(dev, phys, size, dir,
        <br>
        @@ -708,7 +715,7 @@ int amdgpu_vram_mgr_alloc_sgt(struct
        amdgpu_device *adev,
        <br>
                  sg_dma_address(sg) = addr;
        <br>
                  sg_dma_len(sg) = size;
        <br>
          -        amdgpu_res_next(&cursor, cursor.size);
        <br>
        +        amdgpu_res_next(&cursor, size);
        <br>
              }
        <br>
                return 0;
        <br>
      </blockquote>
      <br>
    </blockquote>
  </body>
</html>