iommu/amd: flush IOTLB for specific domains only (v2)

Tom Lendacky thomas.lendacky at amd.com
Tue May 15 15:18:44 UTC 2018


On 5/15/2018 9:47 AM, Joseph Salisbury wrote:
> On 05/15/2018 09:08 AM, Tom Lendacky wrote:
>> On 5/15/2018 7:34 AM, Nath, Arindam wrote:
>>>
>>>> -----Original Message-----
>>>> From: Joseph Salisbury [mailto:joseph.salisbury at canonical.com]
>>>> Sent: Tuesday, May 15, 2018 5:40 PM
>>>> To: Nath, Arindam <Arindam.Nath at amd.com>
>>>> Cc: iommu at lists.linux-foundation.org; Bridgman, John
>>>> <John.Bridgman at amd.com>; joro at 8bytes.org; amd-
>>>> gfx at lists.freedesktop.org; drake at endlessm.com; stein12c at gmail.com;
>>>> Suthikulpanit, Suravee <Suravee.Suthikulpanit at amd.com>; Deucher,
>>>> Alexander <Alexander.Deucher at amd.com>; Kuehling, Felix
>>>> <Felix.Kuehling at amd.com>; linux at endlessm.com; michel at daenzer.net;
>>>> 1747463 at bugs.launchpad.net; Lendacky, Thomas
>>>> <Thomas.Lendacky at amd.com>
>>>> Subject: Re: iommu/amd: flush IOTLB for specific domains only (v2)
>>>>
>>>> On 05/15/2018 04:03 AM, Nath, Arindam wrote:
>>>>> Adding Tom.
>>>>>
>>>>> Hi Joe,
>>>>>
>>>>> My original patch was never accepted. Tom and Joerg worked on another
>>>> patch series which was supposed to fix the issue in question in addition to doing
>>>> some code cleanups. I believe their patches are already in the mainline. If I
>>>> remember correctly, one of the patches disabled PCI ATS for the graphics
>>>> card which was causing the issue.
>>>>> Do you still see the issue with latest mainline kernel?
>>>>>
>>>>> BR,
>>>>> Arindam
>>>>>
>>>>> -----Original Message-----
>>>>> From: Joseph Salisbury [mailto:joseph.salisbury at canonical.com]
>>>>> Sent: Tuesday, May 15, 2018 1:17 AM
>>>>> To: Nath, Arindam <Arindam.Nath at amd.com>
>>>>> Cc: iommu at lists.linux-foundation.org; Bridgman, John
>>>>> <John.Bridgman at amd.com>; joro at 8bytes.org;
>>>>> amd-gfx at lists.freedesktop.org; drake at endlessm.com;
>>>> stein12c at gmail.com;
>>>>> Suthikulpanit, Suravee <Suravee.Suthikulpanit at amd.com>; Deucher,
>>>>> Alexander <Alexander.Deucher at amd.com>; Kuehling, Felix
>>>>> <Felix.Kuehling at amd.com>; linux at endlessm.com; michel at daenzer.net;
>>>>> 1747463 at bugs.launchpad.net
>>>>> Subject: iommu/amd: flush IOTLB for specific domains only (v2)
>>>>>
>>>>> Hello Arindam,
>>>>>
>>>>> There is a bug report[0] that you created a patch[1] for a while back.
>>>> However, the patch never landed in mainline.  There is a bug reporter in
>>>> Ubuntu[2] that is affected by this bug and is willing to test the patch.  I
>>>> attempted to build a test kernel with the patch, but it does not apply to
>>>> current mainline cleanly.  Do you still think this patch may resolve this
>>>> bug?  If so, is there a version of your patch available that will apply to current
>>>> mainline?
>>>>> Thanks,
>>>>>
>>>>> Joe
>>>>>
>>>>> [0] https://bugs.freedesktop.org/show_bug.cgi?id=101029
>>>>> [1] https://patchwork.freedesktop.org/patch/157327/
>>>>> [2] http://pad.lv/1747463
>>>>>
>>>> Hi Arindam,
>>>>
>>>> Thanks for the feedback.  Yes, the latest mainline kernel was tested, and it is
>>>> reported the bug still happens in the Ubuntu kernel bug[0]. Is there any
>>>> specific diagnostic info we can collect that might help?
>>> Joe, I believe all the information needed is already provided in [2]. Let us wait for inputs from Tom and Joerg.
>>>
>>> I could take a look at the issue locally, but it will take me quite a long time since I am occupied with other assignments right now.
>> I don't see anything in the bug that indicates the latest mainline kernel
>> was tested.  The patches/fixes in question are part of the 4.13 kernel, I
>> only see references to 4.10 kernels so I wouldn't expect the issue to be
>> resolved unless the patches from 4.13 were backported to the Ubuntu 4.10
>> kernel.
>>
>> Thanks,
>> Tom
>>
>>> BR,
>>> Arindam
>>>
>>>> Thanks,
>>>>
>>>> Joe
>>>>
>>>> [0] http://pad.lv/1747463
> Hi Tom,
> 
> The request to test mainline was in comment #30[0].  However, the bug
> reporter stated on IRC, not in the bug report, that the bug still
> exists.  I'll ask him to add the test results to the bug.
> 

Ok, I was looking at the wrong bug.  For the original 4.13 kernel, I don't
see any attachments that have the AMD-Vi messages in question.  Were they
completion timeouts (like in the later mainline kernel test, which I'll
get to in a bit) or I/O page fault messages?  Without that information it
is hard to determine what the issue really is.
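
As a rough illustration of the distinction asked about above, the two kinds
of AMD-Vi messages could be counted in a saved kernel log along these lines.
This is only a sketch: the function name, the sample log lines, and the
message fragments being matched ("Completion-Wait loop timed out" and
"IO_PAGE_FAULT") are assumptions and should be checked against the actual
dmesg output from the affected system.

```python
import re

# Assumed message fragments; verify against the real kernel log.
PATTERNS = {
    "completion_timeout": re.compile(r"AMD-Vi.*Completion-Wait loop timed out", re.I),
    "io_page_fault": re.compile(r"AMD-Vi.*IO_PAGE_FAULT", re.I),
}


def classify_amd_vi(log_text):
    """Count AMD-Vi log lines by (assumed) message type."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Made-up sample lines, for illustration only.
sample = """\
[    5.1] AMD-Vi: Completion-Wait loop timed out
[    5.2] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0000 address=0x0 flags=0x0000]
[    5.3] usb 1-1: new high-speed USB device
"""
print(classify_amd_vi(sample))
```

Feeding it the text of a dmesg capture from the failing boot would show
which of the two cases applies.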

(Just as an FYI, if the IOMMU is disabled in the BIOS, then iommu=soft is
 not necessary on the kernel command line.)

For the upstream kernel test, since this is a Ryzen system, it's possible
that the BIOS does not have a requisite fix for SME and IOMMU (see [1]).
On the upstream kernel, if memory encryption is active by default without
this BIOS fix, then the result is AMD-Vi completion-wait timeout messages.
Try booting with mem_encrypt=off on the kernel command line or build a
kernel with CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n and see if that
allows the kernel to boot.
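
To confirm that the mem_encrypt=off option actually reached the kernel on a
given boot, the command line can be inspected after startup. A minimal
sketch, assuming a Linux system where /proc/cmdline is readable; the helper
names are hypothetical, and treating the last occurrence of the option as
the effective one is also an assumption:

```python
def mem_encrypt_setting(cmdline):
    """Return the last mem_encrypt= value on the command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("mem_encrypt="):
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()


# Example with a made-up command line; on a real system, pass read_cmdline().
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet mem_encrypt=off"
print(mem_encrypt_setting(example))
```

If the function returns None, the option was not passed and the kernel falls
back to its build-time default.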

Thanks,
Tom

[1] https://bugzilla.kernel.org/show_bug.cgi?id=199513


> Thanks,
> 
> Joe
> 
> 
> 
> 
> [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747463/comments/30
> 

