<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 17.05.2017 at 10:53, zhoucm1 wrote:<br>
    </div>
    <blockquote cite="mid:591C0F9E.2030800@amd.com" type="cite">
      <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
      <br>
      <br>
      <div class="moz-cite-prefix">On 2017-05-17 16:48, Christian König
        wrote:<br>
      </div>
      <blockquote
        cite="mid:bec8a5a3-7d62-2ff5-96b9-2c03afec1483@vodafone.de"
        type="cite">
        <meta http-equiv="Content-Type" content="text/html;
          charset=utf-8">
        <div class="moz-cite-prefix">On 17.05.2017 at 03:54, zhoucm1
          wrote:<br>
        </div>
        <blockquote cite="mid:591BAD6C.2070605@amd.com" type="cite"> <br>
          <br>
          <div class="moz-cite-prefix">On 2017-05-17 05:02,
            Kasiviswanathan, Harish wrote:<br>
          </div>
          <blockquote
cite="mid:CY1PR1201MB1034A467A20010323B44EAEC8CE60@CY1PR1201MB1034.namprd12.prod.outlook.com"
            type="cite">
            <meta name="Generator" content="Microsoft Exchange Server">
            <!-- converted from text -->
            <style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
            <font size="2"><span style="font-size:10pt;">
                <div class="PlainText"><br>
                  <br>
                  -----Original Message-----<br>
                  From: Zhou, David(ChunMing) <br>
                  Sent: Monday, May 15, 2017 10:50 PM<br>
                  To: Kasiviswanathan, Harish <a moz-do-not-send="true"
                    class="moz-txt-link-rfc2396E"
                    href="mailto:Harish.Kasiviswanathan@amd.com"><Harish.Kasiviswanathan@amd.com></a>;
                  <a moz-do-not-send="true"
                    class="moz-txt-link-abbreviated"
                    href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a><br>
                  Subject: Re: [PATCH 4/5] drm/amdgpu: Support page
                  directory update via CPU<br>
                  <br>
                  <br>
                  <br>
                  On 2017-05-16 05:32, Harish Kasiviswanathan wrote:<br>
                  > If amdgpu.vm_update_context param is set to use
                  CPU, then Page<br>
                  > Directories will be updated by CPU instead of
                  SDMA<br>
                  ><br>
                  > Signed-off-by: Harish Kasiviswanathan <a
                    moz-do-not-send="true" class="moz-txt-link-rfc2396E"
                    href="mailto:Harish.Kasiviswanathan@amd.com"><Harish.Kasiviswanathan@amd.com></a><br>
                  > ---<br>
                  >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 151
                  ++++++++++++++++++++++++---------<br>
                  >   1 file changed, 109 insertions(+), 42
                  deletions(-)<br>
                  ><br>
                  > diff --git
                  a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
                  b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                  > index 9c89cb2..d72a624 100644<br>
                  > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                  > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c<br>
                  > @@ -271,6 +271,7 @@ static int
                  amdgpu_vm_alloc_levels(struct amdgpu_device *adev,<br>
                  >                                  uint64_t saddr,
                  uint64_t eaddr,<br>
                  >                                  unsigned level)<br>
                  >   {<br>
                  > +     u64 flags;<br>
                  >        unsigned shift =
                  (adev->vm_manager.num_level - level) *<br>
                  >                adev->vm_manager.block_size;<br>
                  >        unsigned pt_idx, from, to;<br>
                  > @@ -299,6 +300,14 @@ static int
                  amdgpu_vm_alloc_levels(struct amdgpu_device *adev,<br>
                  >        saddr = saddr & ((1 << shift) -
                  1);<br>
                  >        eaddr = eaddr & ((1 << shift) -
                  1);<br>
                  >   <br>
                  > +     flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |<br>
                  > +                    
                  AMDGPU_GEM_CREATE_VRAM_CLEARED;<br>
                  > +     if (vm->use_cpu_for_update)<br>
                  > +             flags |=
                  AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;<br>
                  I think the shadow flag is needed for the CPU case as
                  well; it is used to <br>
                  back up the VM BO and is meaningful on GPU reset.<br>
                  Same comment for the PD BO.<br>
                  <br>
                  [HK]: Yes, support for shadow BOs is desirable, and it
                  could be implemented as a separate commit. To support
                  shadow BOs, the caller would have to explicitly add
                  them to ttm_eu_reserve_buffer(..) to remove the BO from
                  the TTM swap list, or ttm_bo_kmap would have to be
                  modified. This implementation of CPU updates of VM page
                  tables is mainly for KFD usage; graphics will use it
                  only for experimental and testing purposes. From KFD's
                  point of view, shadow BOs are not useful because if the
                  GPU is reset, all queue information is lost (since
                  submissions are done by user space) and recovery is not
                  possible.<br>
                </div>
              </span></font></blockquote>
          <font size="2">Either way is fine with me.<br>
          </font></blockquote>
        <br>
        Actually, I'm wondering whether we shouldn't drop the shadow
        handling completely.<br>
        <br>
        When VRAM is lost, we now drop all jobs completely, so for new
        jobs we can recreate the page table contents from the VM
        structures as well.<br>
      </blockquote>
      For KGD, I agree. But if a process is using both KGD and KFD, I
      still think a shadow BO is needed.<br>
      <br>
      <blockquote
        cite="mid:bec8a5a3-7d62-2ff5-96b9-2c03afec1483@vodafone.de"
        type="cite"> <br>
        When VRAM is not lost, we don't need to restore the page tables.<br>
      </blockquote>
      In fact, our 'vram lost' detection isn't precise. I was told by
      another team that they encountered a case where only part of VRAM
      was lost, so restoring the page tables still seems necessary even
      when VRAM isn't reported as lost.<br>
    </blockquote>
    <br>
    Ok, random VRAM corruption caused by a GPU reset is a good argument.
    So we should keep this feature.<br>
    <br>
    Regards,<br>
    Christian.<br>
    <br>
    <blockquote cite="mid:591C0F9E.2030800@amd.com" type="cite"> <br>
      Regards,<br>
      David Zhou<br>
      <blockquote
        cite="mid:bec8a5a3-7d62-2ff5-96b9-2c03afec1483@vodafone.de"
        type="cite"> <br>
        What do you think?<br>
      </blockquote>
      <br>
      <blockquote
        cite="mid:bec8a5a3-7d62-2ff5-96b9-2c03afec1483@vodafone.de"
        type="cite"> Regards,<br>
        Christian.<br>
        <br>
        <blockquote cite="mid:591BAD6C.2070605@amd.com" type="cite"><font
            size="2"> <br>
            David Zhou<br>
          </font>
          <blockquote
cite="mid:CY1PR1201MB1034A467A20010323B44EAEC8CE60@CY1PR1201MB1034.namprd12.prod.outlook.com"
            type="cite"><font size="2"><span style="font-size:10pt;">
                <div class="PlainText"> <br>
                  Regards,<br>
                  David Zhou<br>
                  > +     else<br>
                  > +             flags |=
                  (AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
                  > +                            
                  AMDGPU_GEM_CREATE_SHADOW);<br>
                  > +<br>
                  >        /* walk over the address space and
                  allocate the page tables */<br>
                  >        for (pt_idx = from; pt_idx <= to;
                  ++pt_idx) {<br>
                  >                struct reservation_object *resv =
                  vm->root.bo->tbo.resv;<br>
                  > @@ -310,10 +319,7 @@ static int
                  amdgpu_vm_alloc_levels(struct amdgpu_device *adev,<br>
                  >                                            
                  amdgpu_vm_bo_size(adev, level),<br>
                  >                                            
                  AMDGPU_GPU_PAGE_SIZE, true,<br>
                  >                                            
                  AMDGPU_GEM_DOMAIN_VRAM,<br>
                  > -                                         
                  AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
                  > -                                         
                  AMDGPU_GEM_CREATE_SHADOW |<br>
                  > -                                         
                  AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |<br>
                  > -                                         
                  AMDGPU_GEM_CREATE_VRAM_CLEARED,<br>
                  > +                                          flags,<br>
                  >                                             NULL,
                  resv, &pt);<br>
                  >                        if (r)<br>
                  >                                return r;<br>
                  > @@ -952,6 +958,43 @@ static uint64_t
                  amdgpu_vm_map_gart(const dma_addr_t *pages_addr,
                  uint64_t addr)<br>
                  >        return result;<br>
                  >   }<br>
                  >   <br>
                  > +/**<br>
                  > + * amdgpu_vm_cpu_set_ptes - helper to update
                  page tables via CPU<br>
                  > + *<br>
                  > + * @params: see amdgpu_pte_update_params
                  definition<br>
                  > + * @pe: kmap addr of the page entry<br>
                  > + * @addr: dst addr to write into pe<br>
                  > + * @count: number of page entries to update<br>
                  > + * @incr: increase next addr by incr bytes<br>
                  > + * @flags: hw access flags<br>
                  > + */<br>
                  > +static void amdgpu_vm_cpu_set_ptes(struct
                  amdgpu_pte_update_params *params,<br>
                  > +                                uint64_t pe,
                  uint64_t addr,<br>
                  > +                                unsigned count,
                  uint32_t incr,<br>
                  > +                                uint64_t flags)<br>
                  > +{<br>
                  > +     unsigned int i;<br>
                  > +<br>
                  > +     for (i = 0; i < count; i++) {<br>
                  > +            
                  amdgpu_gart_set_pte_pde(params->adev, (void *)pe,<br>
                  > +                                     i, addr,
                  flags);<br>
                  > +             addr += incr;<br>
                  > +     }<br>
                  > +<br>
                  > +     mb();<br>
                  > +     amdgpu_gart_flush_gpu_tlb(params->adev,
                  0);<br>
                  > +}<br>
                  > +<br>
                  > +static void amdgpu_vm_bo_wait(struct
                  amdgpu_device *adev, struct amdgpu_bo *bo)<br>
                  > +{<br>
                  > +     struct amdgpu_sync sync;<br>
                  > +<br>
                  > +     amdgpu_sync_create(&sync);<br>
                  > +     amdgpu_sync_resv(adev, &sync,
                  bo->tbo.resv, AMDGPU_FENCE_OWNER_VM);<br>
                  > +     amdgpu_sync_wait(&sync);<br>
                  > +     amdgpu_sync_free(&sync);<br>
                  > +}<br>
                  > +<br>
                  >   /*<br>
                  >    * amdgpu_vm_update_level - update a single
                  level in the hierarchy<br>
                  >    *<br>
                  > @@ -981,34 +1024,50 @@ static int
                  amdgpu_vm_update_level(struct amdgpu_device *adev,<br>
                  >   <br>
                  >        if (!parent->entries)<br>
                  >                return 0;<br>
                  > -     ring = container_of(vm->entity.sched,
                  struct amdgpu_ring, sched);<br>
                  >   <br>
                  > -     /* padding, etc. */<br>
                  > -     ndw = 64;<br>
                  > +     memset(&params, 0, sizeof(params));<br>
                  > +     params.adev = adev;<br>
                  > +     shadow = parent->bo->shadow;<br>
                  >   <br>
                  > -     /* assume the worst case */<br>
                  > -     ndw += parent->last_entry_used * 6;<br>
                  > +     WARN_ON(vm->use_cpu_for_update
                  && shadow);<br>
                  > +     if (vm->use_cpu_for_update &&
                  !shadow) {<br>
                  > +             r = amdgpu_bo_kmap(parent->bo,
                  (void **)&pd_addr);<br>
                  > +             if (r)<br>
                  > +                     return r;<br>
                  > +             amdgpu_vm_bo_wait(adev,
                  parent->bo);<br>
                  > +             params.func =
                  amdgpu_vm_cpu_set_ptes;<br>
                  > +     } else {<br>
                  > +             if (shadow) {<br>
                  > +                     r =
                  amdgpu_ttm_bind(&shadow->tbo,
                  &shadow->tbo.mem);<br>
                  > +                     if (r)<br>
                  > +                             return r;<br>
                  > +             }<br>
                  > +             ring =
                  container_of(vm->entity.sched, struct amdgpu_ring,<br>
                  > +                                 sched);<br>
                  >   <br>
                  > -     pd_addr =
                  amdgpu_bo_gpu_offset(parent->bo);<br>
                  > +             /* padding, etc. */<br>
                  > +             ndw = 64;<br>
                  >   <br>
                  > -     shadow = parent->bo->shadow;<br>
                  > -     if (shadow) {<br>
                  > -             r =
                  amdgpu_ttm_bind(&shadow->tbo,
                  &shadow->tbo.mem);<br>
                  > +             /* assume the worst case */<br>
                  > +             ndw += parent->last_entry_used *
                  6;<br>
                  > +<br>
                  > +             pd_addr =
                  amdgpu_bo_gpu_offset(parent->bo);<br>
                  > +<br>
                  > +             if (shadow) {<br>
                  > +                     shadow_addr =
                  amdgpu_bo_gpu_offset(shadow);<br>
                  > +                     ndw *= 2;<br>
                  > +             } else {<br>
                  > +                     shadow_addr = 0;<br>
                  > +             }<br>
                  > +<br>
                  > +             r = amdgpu_job_alloc_with_ib(adev,
                  ndw * 4, &job);<br>
                  >                if (r)<br>
                  >                        return r;<br>
                  > -             shadow_addr =
                  amdgpu_bo_gpu_offset(shadow);<br>
                  > -             ndw *= 2;<br>
                  > -     } else {<br>
                  > -             shadow_addr = 0;<br>
                  > -     }<br>
                  >   <br>
                  > -     r = amdgpu_job_alloc_with_ib(adev, ndw * 4,
                  &job);<br>
                  > -     if (r)<br>
                  > -             return r;<br>
                  > +             params.ib = &job->ibs[0];<br>
                  > +             params.func =
                  amdgpu_vm_do_set_ptes;<br>
                  > +     }<br>
                  >   <br>
                  > -     memset(&params, 0, sizeof(params));<br>
                  > -     params.adev = adev;<br>
                  > -     params.ib = &job->ibs[0];<br>
                  >   <br>
                  >        /* walk over the address space and update
                  the directory */<br>
                  >        for (pt_idx = 0; pt_idx <=
                  parent->last_entry_used; ++pt_idx) {<br>
                  > @@ -1043,15 +1102,15 @@ static int
                  amdgpu_vm_update_level(struct amdgpu_device *adev,<br>
                  >                                       
                  amdgpu_vm_adjust_mc_addr(adev, last_pt);<br>
                  >   <br>
                  >                                if (shadow)<br>
                  > -                                    
                  amdgpu_vm_do_set_ptes(&params,<br>
                  >
                  -                                                          
                  last_shadow,<br>
                  >
                  -                                                          
                  pt_addr, count,<br>
                  >
                  -                                                          
                  incr,<br>
                  >
                  -                                                          
                  AMDGPU_PTE_VALID);<br>
                  > -<br>
                  > -                            
                  amdgpu_vm_do_set_ptes(&params, last_pde,<br>
                  >
                  -                                                  
                  pt_addr, count, incr,<br>
                  >
                  -                                                  
                  AMDGPU_PTE_VALID);<br>
                  > +                                    
                  params.func(&params,<br>
                  > +                                                
                  last_shadow,<br>
                  > +                                                
                  pt_addr, count,<br>
                  > +                                                
                  incr,<br>
                  > +                                                
                  AMDGPU_PTE_VALID);<br>
                  > +<br>
                  > +                            
                  params.func(&params, last_pde,<br>
                  > +                                        
                  pt_addr, count, incr,<br>
                  > +                                        
                  AMDGPU_PTE_VALID);<br>
                  >                        }<br>
                  >   <br>
                  >                        count = 1;<br>
                  > @@ -1067,14 +1126,16 @@ static int
                  amdgpu_vm_update_level(struct amdgpu_device *adev,<br>
                  >                uint64_t pt_addr =
                  amdgpu_vm_adjust_mc_addr(adev, last_pt);<br>
                  >   <br>
                  >                if (vm->root.bo->shadow)<br>
                  > -                    
                  amdgpu_vm_do_set_ptes(&params, last_shadow,
                  pt_addr,<br>
                  > -                                          
                  count, incr, AMDGPU_PTE_VALID);<br>
                  > +                     params.func(&params,
                  last_shadow, pt_addr,<br>
                  > +                                 count, incr,
                  AMDGPU_PTE_VALID);<br>
                  >   <br>
                  > -             amdgpu_vm_do_set_ptes(&params,
                  last_pde, pt_addr,<br>
                  > -                                   count, incr,
                  AMDGPU_PTE_VALID);<br>
                  > +             params.func(&params, last_pde,
                  pt_addr,<br>
                  > +                         count, incr,
                  AMDGPU_PTE_VALID);<br>
                  >        }<br>
                  >   <br>
                  > -     if (params.ib->length_dw == 0) {<br>
                  > +     if (params.func == amdgpu_vm_cpu_set_ptes)<br>
                  > +             amdgpu_bo_kunmap(parent->bo);<br>
                  > +     else if (params.ib->length_dw == 0) {<br>
                  >                amdgpu_job_free(job);<br>
                  >        } else {<br>
                  >                amdgpu_ring_pad_ib(ring,
                  params.ib);<br>
                  > @@ -2309,6 +2370,7 @@ int amdgpu_vm_init(struct
                  amdgpu_device *adev, struct amdgpu_vm *vm,<br>
                  >        struct amdgpu_ring *ring;<br>
                  >        struct amd_sched_rq *rq;<br>
                  >        int r, i;<br>
                  > +     u64 flags;<br>
                  >   <br>
                  >        vm->va = RB_ROOT;<br>
                  >        vm->client_id =
                  atomic64_inc_return(&adev->vm_manager.client_counter);<br>
                  > @@ -2342,12 +2404,17 @@ int amdgpu_vm_init(struct
                  amdgpu_device *adev, struct amdgpu_vm *vm,<br>
                  >                  "CPU update of VM recommended
                  only for large BAR system\n");<br>
                  >        vm->last_dir_update = NULL;<br>
                  >   <br>
                  > +     flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |<br>
                  > +                    
                  AMDGPU_GEM_CREATE_VRAM_CLEARED;<br>
                  > +     if (vm->use_cpu_for_update)<br>
                  > +             flags |=
                  AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;<br>
                  > +     else<br>
                  > +             flags |=
                  (AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
                  > +                            
                  AMDGPU_GEM_CREATE_SHADOW);<br>
                  > +<br>
                  >        r = amdgpu_bo_create(adev,
                  amdgpu_vm_bo_size(adev, 0), align, true,<br>
                  >                            
                  AMDGPU_GEM_DOMAIN_VRAM,<br>
                  > -                         
                  AMDGPU_GEM_CREATE_NO_CPU_ACCESS |<br>
                  > -                         
                  AMDGPU_GEM_CREATE_SHADOW |<br>
                  > -                         
                  AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |<br>
                  > -                         
                  AMDGPU_GEM_CREATE_VRAM_CLEARED,<br>
                  > +                          flags,<br>
                  >                             NULL, NULL,
                  &vm->root.bo);<br>
                  >        if (r)<br>
                  >                goto error_free_sched_entity;<br>
                  <br>
                </div>
              </span></font> </blockquote>
          <br>
          <br>
          <fieldset class="mimeAttachmentHeader"></fieldset>
          <br>
          <pre wrap="">_______________________________________________
amd-gfx mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:amd-gfx@lists.freedesktop.org">amd-gfx@lists.freedesktop.org</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="https://lists.freedesktop.org/mailman/listinfo/amd-gfx">https://lists.freedesktop.org/mailman/listinfo/amd-gfx</a>
</pre>
        </blockquote>
        <p><br>
        </p>
        <br>
      </blockquote>
      <br>
      <br>
    </blockquote>
    <p><br>
    </p>
  </body>
</html>