<div name="smartisanmessageid" id="smartisan1516972869817"><font color="#333333">You patch will prevent concurrent allocation, and will result in allocation performance drop much.</font><br>
<div class="quote">
<div style="margin:0 0px; font-size:105%"><font color="#629140" style="line-height:1.4"><span>Christian K鰊ig <ckoenig.leichtzumerken@gmail.com> 于 2018年1月26日 下午9:04写道:</span></font></div>
<br type="attribution">
</div>
</div>
Attached is a clean implementation of what you actually want to do. But as I said, this is a NO-GO.

Regards,
Christian.

On 26.01.2018 at 13:43, Christian König wrote:

<blockquote type="cite">
<div class="moz-cite-prefix">
<blockquote type="cite">After my investigation, this issue should be detect of TTM design self, which breaks scheduling balance.<br>
</blockquote>
Yeah, but again. This is indented design we can't change easily.<br>
<br>
Regards,<br>
Christian.<br>
<br>
Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing):<br>
</div>
<blockquote type="cite">
<meta name="Generator" content="Microsoft Exchange Server">
<style>
<!--
.EmailQuote
{margin-left:1pt;
padding-left:4pt;
border-left:#800000 2px solid}
-->
</style>
<div>
<div name="x_smartisanmessageid" id="x_smartisan1516970199102"><font color="#333333">I am off work, so reply mail by phone, the format could not be text.<br>
</font><br>
Back to the topic itself:
The problem does indeed happen with the amdgpu driver: someone reported to me that when an application runs as two instances, their performance differs.
I also reproduced the issue with a unit test (bo_eviction_test). People always think our scheduler isn't working as expected.

After my investigation, this issue appears to be a defect in the TTM design itself, which breaks scheduling balance.

Further, if we run containers on our GPU, container A can get a high score while container B gets a low score on the same benchmark.

So this is a bug that we need to fix.

Regards,
David Zhou

<span id="x_smartisan_signature" style="font-size:0.8em; display:inline; color:#888888">
<p dir="ltr">发自坚果 Pro</p>
</span><style type="text/css">
<!--
* body
{padding:0 16px 30px!important;
margin:0!important;
background-color:#ffffff;
line-height:1.4;
word-wrap:break-word;
word-break:normal}
div
{word-wrap:break-word;
word-break:normal}
p
{word-wrap:break-word;
word-break:normal;
text-indent:0pt!important}
span
{word-wrap:break-word;
word-break:normal}
a
{word-wrap:break-word;
word-break:normal}
td
{word-wrap:break-word;
word-break:break-all}
-->
</style>
<div class="x_quote">
<div style="margin:0 0px; font-size:105%"><font color="#629140" style="line-height:1.4"><span>Christian K鰊ig
<a class="moz-txt-link-rfc2396E" href="mailto:ckoenig.leichtzumerken@gmail.com"><ckoenig.leichtzumerken@gmail.com></a> 于 2018年1月26日 下午6:31写道:</span></font></div>
<br type="attribution">
</div>
</div>
</div>
<font size="2"><span style="font-size:11pt">
<div class="PlainText">Am 26.01.2018 um 11:22 schrieb Chunming Zhou:<br>
> There is a scheduling balance issue around get_node, like:
> a. Process A allocates all of memory and uses it for a submission.
> b. Process B tries to allocate memory and, in eviction, waits for process A's BO to go idle.
> c. Process A completes its job and process B's eviction puts process A's BO node;
> but in the meantime process C comes along to allocate a BO, gets a node directly, and makes its submission;
> process B then again waits for process C's BO to go idle.
> d. Repeating the above steps, process B can be delayed much longer.
>
> A later allocation must not get ahead of an earlier one for the same place.
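
[For reference: the window described in steps b and c is in the allocation retry loop that the diff below modifies. The following is a paraphrase of those context lines with the race annotated; it is not verbatim kernel code.]

    /* Paraphrase of the pre-patch retry loop in ttm_bo_mem_force_space();
     * see the context lines in the diff below. */
    do {
        ret = (*man->func->get_node)(man, bo, place, mem);
        if (unlikely(ret != 0))
            return ret;
        if (mem->mm_node)
            break;      /* a later process C can reach this point first
                         * and take the space process B just evicted for */
        /* process B sleeps here until process A's BO is idle */
        ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
        if (unlikely(ret != 0))
            return ret;
    } while (1);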

Again NAK to the whole approach.

At least with amdgpu the problem you describe above never occurs, because evictions are pipelined operations. We would only ever block waiting for deleted regions to become free.

But independent of that, incoming memory requests that arrive while we make room through eviction are intended to be served first.

Changing that is certainly a no-go, because it would favor memory-hungry applications over small clients.
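
[To illustrate the pipelining: the allocation path does not block the CPU until the evicted BO is idle; the new user only inherits the memory manager's move fence, and the GPU resolves the ordering. Below is a hedged sketch loosely modeled on ttm_bo_add_move_fence() from the diff context; the real function differs in detail, and example_add_move_fence is an illustrative name.]

    /* Hedged sketch, simplified from ttm_bo_add_move_fence();
     * not the exact kernel code. */
    static int example_add_move_fence(struct ttm_buffer_object *bo,
                                      struct ttm_mem_type_manager *man)
    {
            struct dma_fence *fence;

            /* man->move is the fence of the last queued move/eviction */
            spin_lock(&man->move_lock);
            fence = dma_fence_get(man->move);
            spin_unlock(&man->move_lock);

            if (fence) {
                    /* The new user of the range does not wait on the
                     * CPU; it carries the fence and the GPU orders the
                     * work behind the eviction copy. */
                    dma_fence_put(bo->moving);
                    bo->moving = fence;
            }
            return 0;
    }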

Regards,
Christian.

>
> Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18
> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c    | 69 +++++++++++++++++++++++++++++++++++++++--
>  include/drm/ttm/ttm_bo_api.h    |  7 +++++
>  include/drm/ttm/ttm_bo_driver.h |  7 +++++
>  3 files changed, 80 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index d33a6bb742a1..558ec2cf465d 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct ttm_buffer_object *bo,
>  	return 0;
>  }
>
> +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter,
> +				struct ttm_buffer_object *bo,
> +				const struct ttm_place *place)
> +{
> +	waiter->tbo = bo;
> +	memcpy((void *)&waiter->place, (void *)place, sizeof(*place));
> +	INIT_LIST_HEAD(&waiter->list);
> +}
> +
> +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	list_add_tail(&waiter->list, &man->waiter_list);
> +	spin_unlock(&man->wait_lock);
> +}
> +
> +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man,
> +			       struct ttm_bo_waiter *waiter)
> +{
> +	if (!waiter)
> +		return;
> +	spin_lock(&man->wait_lock);
> +	if (!list_empty(&waiter->list))
> +		list_del(&waiter->list);
> +	spin_unlock(&man->wait_lock);
> +	kfree(waiter);
> +}
> +
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +		     struct ttm_buffer_object *bo,
> +		     const struct ttm_place *place)
> +{
> +	struct ttm_bo_waiter *waiter, *tmp;
> +
> +	spin_lock(&man->wait_lock);
> +	list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) {
> +		if ((bo != waiter->tbo) &&
> +		    ((place->fpfn >= waiter->place.fpfn &&
> +		      place->fpfn <= waiter->place.lpfn) ||
> +		     (place->lpfn <= waiter->place.lpfn && place->lpfn >=
> +		      waiter->place.fpfn)))
> +			goto later_bo;
> +	}
> +	spin_unlock(&man->wait_lock);
> +	return true;
> +later_bo:
> +	spin_unlock(&man->wait_lock);
> +	return false;
> +}
>  /**
>   * Repeatedly evict memory from the LRU for @mem_type until we create enough
>   * space, or we've evicted everything and there isn't enough space.
> @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
>  {
>  	struct ttm_bo_device *bdev = bo->bdev;
>  	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
> +	struct ttm_bo_waiter waiter;
>  	int ret;
>
> +	ttm_man_init_waiter(&waiter, bo, place);
> +	ttm_man_add_waiter(man, &waiter);
>  	do {
>  		ret = (*man->func->get_node)(man, bo, place, mem);
> -		if (unlikely(ret != 0))
> +		if (unlikely(ret != 0)) {
> +			ttm_man_del_waiter(man, &waiter);
>  			return ret;
> -		if (mem->mm_node)
> +		}
> +		if (mem->mm_node) {
> +			ttm_man_del_waiter(man, &waiter);
>  			break;
> +		}
>  		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
> -		if (unlikely(ret != 0))
> +		if (unlikely(ret != 0)) {
> +			ttm_man_del_waiter(man, &waiter);
>  			return ret;
> +		}
>  	} while (1);
>  	mem->mem_type = mem_type;
>  	return ttm_bo_add_move_fence(bo, man, mem);
> @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
>  	man->use_io_reserve_lru = false;
>  	mutex_init(&man->io_reserve_mutex);
>  	spin_lock_init(&man->move_lock);
> +	spin_lock_init(&man->wait_lock);
> +	INIT_LIST_HEAD(&man->waiter_list);
>  	INIT_LIST_HEAD(&man->io_reserve_lru);
>
>  	ret = bdev->driver->init_mem_type(bdev, type, man);
> diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
> index 2cd025c2abe7..0fce4dbd02e7 100644
> --- a/include/drm/ttm/ttm_bo_api.h
> +++ b/include/drm/ttm/ttm_bo_api.h
> @@ -40,6 +40,7 @@
>  #include <linux/mm.h>
>  #include <linux/bitmap.h>
>  #include <linux/reservation.h>
> +#include <drm/ttm/ttm_placement.h>
>
>  struct ttm_bo_device;
>
> @@ -232,6 +233,12 @@ struct ttm_buffer_object {
>  	struct mutex wu_mutex;
>  };
>
> +struct ttm_bo_waiter {
> +	struct ttm_buffer_object *tbo;
> +	struct ttm_place place;
> +	struct list_head list;
> +};
> +
>  /**
>   * struct ttm_bo_kmap_obj
>   *
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index 9b417eb2df20..dc6b8b4c9e06 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -293,6 +293,10 @@ struct ttm_mem_type_manager {
>  	bool io_reserve_fastpath;
>  	spinlock_t move_lock;
>
> +	/* waiters in list */
> +	spinlock_t wait_lock;
> +	struct list_head waiter_list;
> +
>  	/*
>  	 * Protected by @io_reserve_mutex:
>  	 */
> @@ -748,6 +752,9 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
>  		     struct ttm_mem_reg *mem,
>  		     struct ttm_operation_ctx *ctx);
>
> +int ttm_man_check_bo(struct ttm_mem_type_manager *man,
> +		     struct ttm_buffer_object *bo,
> +		     const struct ttm_place *place);
>  void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct ttm_mem_reg *mem);
>  void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo,
>  			   struct ttm_mem_reg *mem);
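
[Note: the patch declares ttm_man_check_bo() in ttm_bo_driver.h, but the diff contains no caller. A hypothetical driver-side call site, shown only to illustrate the intended use and not part of the patch, might look like the sketch below; example_get_node is an illustrative name.]

    /* Hypothetical caller of ttm_man_check_bo(); not in the patch. */
    static int example_get_node(struct ttm_mem_type_manager *man,
                                struct ttm_buffer_object *bo,
                                const struct ttm_place *place,
                                struct ttm_mem_reg *mem)
    {
            /* An earlier waiter's range overlaps ours: leave
             * mem->mm_node NULL so this request falls into the eviction
             * path and queues up behind the waiter instead of jumping
             * ahead of it. This is what serializes allocation. */
            if (!ttm_man_check_bo(man, bo, place))
                    return 0;

            /* ...otherwise perform the real allocation here... */
            return 0;
    }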

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx