[PATCH 07/10] drm/amdgpu: add concurrent baco reset support for XGMI

Andrey Grodzovsky Andrey.Grodzovsky at amd.com
Fri Dec 6 21:50:46 UTC 2019


Hey Ma, attached is a solution. It is only compile-tested, as I still 
can't get my XGMI setup to work (with the bridge connected, only one 
device is visible to the system while the other is not). Please try it 
on your system if you have a chance.

Andrey
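
For readers without the attachments, the reusable-barrier idea can be
sketched roughly as below. This is a userspace pthreads illustration
under stated assumptions, not the attached kernel patches: every name
here (struct task_barrier, task_barrier_wait, node_reset,
run_hive_reset) is hypothetical, and the BACO enter/exit steps are
stand-ins that only bump a counter. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reusable barrier; names are illustrative only. */
struct task_barrier {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int n;          /* number of participating tasks */
	unsigned int waiting;    /* tasks that arrived in the current round */
	unsigned int generation; /* bumped each time the barrier opens */
};

static void task_barrier_init(struct task_barrier *tb, unsigned int n)
{
	pthread_mutex_init(&tb->lock, NULL);
	pthread_cond_init(&tb->cond, NULL);
	tb->n = n;
	tb->waiting = 0;
	tb->generation = 0;
}

/* Block until all n tasks have called this; usable for repeated rounds. */
static void task_barrier_wait(struct task_barrier *tb)
{
	unsigned int gen;

	pthread_mutex_lock(&tb->lock);
	gen = tb->generation;
	if (++tb->waiting == tb->n) {
		/* Last arrival: reset the count and release everyone. */
		tb->waiting = 0;
		tb->generation++;
		pthread_cond_broadcast(&tb->cond);
	} else {
		/* The generation check guards against spurious wakeups. */
		while (gen == tb->generation)
			pthread_cond_wait(&tb->cond, &tb->lock);
	}
	pthread_mutex_unlock(&tb->lock);
}

struct hive {
	struct task_barrier tb;
	atomic_int entered;    /* nodes that finished the "enter" stand-in */
	atomic_int violations; /* nodes that saw an incomplete enter phase */
	int nodes;
};

static void *node_reset(void *arg)
{
	struct hive *h = arg;

	atomic_fetch_add(&h->entered, 1);  /* stand-in for BACO enter */
	task_barrier_wait(&h->tb);         /* wait for every node to enter */

	if (atomic_load(&h->entered) != h->nodes)
		atomic_fetch_add(&h->violations, 1);

	/* stand-in for BACO exit would go here */
	task_barrier_wait(&h->tb);         /* second round reuses the barrier */
	return NULL;
}

/* Runs one thread per node; returns the number of ordering violations. */
static int run_hive_reset(int nodes)
{
	struct hive h;
	pthread_t *threads;
	int i;

	assert(nodes > 0);
	threads = malloc(sizeof(*threads) * nodes);
	if (!threads)
		return -1;

	task_barrier_init(&h.tb, nodes);
	atomic_init(&h.entered, 0);
	atomic_init(&h.violations, 0);
	h.nodes = nodes;

	for (i = 0; i < nodes; i++) {
		/* Sketch only: a failed create would leave the others waiting. */
		if (pthread_create(&threads[i], NULL, node_reset, &h) != 0)
			abort();
	}
	for (i = 0; i < nodes; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&h.tb.lock);
	pthread_cond_destroy(&h.tb.cond);
	free(threads);
	return atomic_load(&h.violations);
}
```

Calling run_hive_reset(8) from a main() models an 8-node hive. The
generation counter is what makes the barrier safe to reuse: a fast
thread arriving for the second round cannot be confused with a slow
thread still leaving the first.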

On 12/4/19 10:14 PM, Ma, Le wrote:
>
> AFAIK it's enough for even a single node in the hive to fail to 
> enter the BACO state on time to fail the entire hive reset procedure, no?
>
> [Le]: Yeah, agreed. I've been thinking that making all nodes enter 
> BACO simultaneously can reduce the risk of a node failing to 
> enter/exit BACO. For example, in an XGMI hive with 8 nodes, the total 
> time for 8 nodes to enter/exit BACO on 8 CPUs is less than the time 
> for 8 nodes to enter BACO serially and exit BACO serially on one CPU 
> with yield capability. This interval is usually strict for the BACO 
> feature itself. Anyway, we need more loop testing later on whichever 
> method we choose.
>
> Anyway - I see our discussion blocks your entire patch set - I think 
> you can go ahead and commit it your way (I think you got an RB from 
> Hawking), and I will then look and see if I can implement my method; 
> if it works, I will just revert your patch.
>
> [Le]: OK, fine.
>
> Andrey
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-drm-Add-Reusable-task-barrier.patch
Type: text/x-patch
Size: 3757 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20191206/96c24a60/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-drm-amdgpu-Add-task-barrier-to-XGMI-hive.patch
Type: text/x-patch
Size: 2242 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20191206/96c24a60/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-drm-amdgpu-Redo-concurrent-support-of-BACO-reset-for.patch
Type: text/x-patch
Size: 6997 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20191206/96c24a60/attachment-0002.bin>

