Help: Samsung Exynos 7870 DECON SYSMMU panic

Robin Murphy robin.murphy at arm.com
Mon Jun 30 12:09:49 UTC 2025


On 28/06/2025 4:39 pm, Kaustabh Chakraborty wrote:
> On 2025-06-25 11:34, Robin Murphy wrote:
>> On 2025-06-25 9:42 am, Marek Szyprowski wrote:
>>> On 25.06.2025 09:39, Kaustabh Chakraborty wrote:
>>>> On 2025-06-24 17:12, Robin Murphy wrote:
>>>>> On 2025-06-18 3:02 pm, Kaustabh Chakraborty wrote:
>>>>>> Since bcb81ac6ae3c (iommu: Get DT/ACPI parsing into the proper probe
>>>>>> path), the Samsung Exynos 7870 DECON device (with patches [1], [2],
>>>>>> and [3]) seems to not work anymore. Upon closer inspection, I observe
>>>>>> that there is an IOMMU crash.
>>>>>>
>>>>>> [    2.918189] exynos-sysmmu 14860000.sysmmu: 14830000.decon: [READ] PAGE FAULT occurred at 0x6715b3e0
>>>>>> [    2.918199] exynos-sysmmu 14860000.sysmmu: Page table base: 0x0000000044a14000
>>>>>> [    2.918243] exynos-drm exynos-drm: bound 14830000.decon (ops decon_component_ops)
>>>>>> [    2.922868] exynos-sysmmu 14860000.sysmmu:   Lv1 entry: 0x4205001
>>>>>> [    2.922877] Kernel panic - not syncing: Unrecoverable System MMU Fault!
>>>>>> [    2.922885] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.16.0-rc2-exynos7870 #722 PREEMPT
>>>>>> [    2.995312] Hardware name: Samsung Galaxy J7 Prime (DT)
>>>>>> [    3.000509] Call trace:
>>>>>> [    3.002938]  show_stack+0x18/0x24 (C)
>>>>>> [    3.006582]  dump_stack_lvl+0x60/0x80
>>>>>> [    3.010224]  dump_stack+0x18/0x24
>>>>>> [    3.013521]  panic+0x168/0x360
>>>>>> [    3.016558]  exynos_sysmmu_irq+0x224/0x2ac
>>>>>> [         ...]
>>>>>> [    3.108786] ---[ end Kernel panic - not syncing: Unrecoverable System MMU Fault! ]---
>>>>>
>>>>> For starters, what if you just remove this panic() from the IOMMU
>>>>> driver? Frankly it seems a bit excessive anyway...
>>>>
>>>> I've tried that; the sysmmu keeps issuing interrupts indefinitely
>>>> (yes, even after clearing the interrupt bit).
>>>>
>>> Right, this is because the decon device is still accessing system
>>> memory in a loop, trying to display the splash screen. That panic is
>>> indeed a bit excessive, but what else can the IOMMU driver do if no
>>> page fault handler is registered?
>>
>> Report the unhandled fault and continue, like most drivers already do. 
>> If there's another fault, then that can get reported as well. It's 
>> kind of the point that if a misbehaving device has been prevented from 
>> accessing memory then it has *not* adversely affected the rest of the 
>> system.
>>
>> I suppose if one wanted to be really clever then a driver could 
>> implement its own backoff mechanism where if it detects a sustained 
>> high rate of unhandled faults then it disables its interrupt for a 
>> bit, to mitigate the physical interrupt storm as well as avoid 
>> flooding the kernel log more than is useful.
>>
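
To illustrate the backoff idea above, here is a minimal userspace
sketch of just the bookkeeping, under stated assumptions. The struct,
function names, and thresholds are all hypothetical and are not taken
from exynos-iommu; timestamps are passed in as plain milliseconds so the
logic can be exercised without any kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tuning values, chosen only for illustration. */
#define FAULT_WINDOW_MS  100u   /* length of one counting window */
#define FAULT_BURST_MAX  8u     /* faults tolerated per window */
#define FAULT_MASK_MS    1000u  /* how long to keep the IRQ masked */

/* Hypothetical per-IOMMU state; zero-initialise before first use. */
struct fault_backoff {
	uint64_t window_start_ms;  /* start of the current counting window */
	unsigned int count;        /* unhandled faults seen in this window */
	uint64_t masked_until_ms;  /* IRQ should stay masked until this time */
};

/*
 * Record one unhandled fault observed at now_ms.
 * Returns true when the sustained rate is too high and the caller
 * should mask the fault interrupt until b->masked_until_ms.
 */
static bool fault_backoff_record(struct fault_backoff *b, uint64_t now_ms)
{
	/* Start a fresh window once the previous one has expired. */
	if (now_ms - b->window_start_ms >= FAULT_WINDOW_MS) {
		b->window_start_ms = now_ms;
		b->count = 0;
	}

	/* Below the burst limit: just report the fault and carry on. */
	if (++b->count <= FAULT_BURST_MAX)
		return false;

	/* Too many faults in one window: back off for a while. */
	b->masked_until_ms = now_ms + FAULT_MASK_MS;
	b->window_start_ms = b->masked_until_ms;
	b->count = 0;
	return true;
}

/* True while the interrupt should still be kept masked. */
static bool fault_backoff_is_masked(const struct fault_backoff *b,
				    uint64_t now_ms)
{
	return now_ms < b->masked_until_ms;
}
```

In a real interrupt handler, a true return would presumably translate
into something like disable_irq_nosync() plus delayed work that calls
enable_irq() later; that wiring is deliberately left out here.
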
>>>>> From the logs below, there is apparently unexpected traffic
>>>>> already going through the IOMMU when it wakes up. Is this the DRM
>>>>> drivers doing something sketchy, or has the bootloader left the
>>>>> display running for a splash screen? However in the latter case I
>>>>> don't obviously see why delaying the IOMMU probe should make much
>>>>> difference, given that the decon driver should still be waiting for
>>>>> it either way.
>>>>
>>>> The display is initialized by the bootloader for splash yes, but I
>>>> reckon it doesn't use the IOMMU as it's accessible from a framebuffer
>>>> region.
>>>
>>> Right, the bootloader configured the decon device to display the splash
>>> screen, which means that the decon device is constantly reading splash
>>> screen pixel data from system memory. There is no such thing as a
>>> 'framebuffer region'; it is just system memory, which the exynos sysmmu
>>> protects when enabled. So far this issue of the splash screen from the
>>> bootloader has not been solved in mainline. On other supported
>>> Exynos-based boards this works only because there are also power domain
>>> drivers enabled, which are instantiated before the display-related
>>> device and its respective sysmmu device. That power domain driver shuts
>>> down power, effectively disabling the display before the sysmmu gets
>>> probed.
>>
>> And presumably the sysmmu device itself doesn't need to depend on that 
>> power domain? OK, that at least makes sense.
>>
>>> A long time ago I pointed out this issue and proposed a simple solution,
>>> like a special initial identity mapping for the memory range used for
>>> the splash screen, but that proposal is no longer applicable to the
>>> current code.
>>>
>>> As a workaround I would suggest shutting down the display in the decon
>>> device before starting the kernel (i.e. from the 'kernel loading
>>> mid-stage bootloader', if you have one).
>>
>> We do now have the tools to handle this properly - if the bootloader 
>> can be updated to add the appropriate "iommu-addresses" property[1] to 
>> the framebuffer reservation, then it's a case of hooking up support in 
>> exynos-iommu via of_iommu_get_resv_regions().
> 
> Hey, thanks a lot! I got it to work. [1] [2]

Hurrah!

> The Exynos IOMMU driver doesn't have support for get_resv_regions in
> iommu_ops, so I looked for existing drivers which implement it, to use
> as examples. [3] [4] [5]
> 
> All of them have some reserved region allocation before calling
> iommu_dma_get_resv_regions(). I don't know what they are, and I don't know
> what base and length should be used for sysmmu. Either way I gave it a shot
> with [6]:
> 
> #define EXYNOS_DEV_ADDR_START    0x20000000
> #define EXYNOS_DEV_ADDR_SIZE    0x40000000
> 
> static void exynos_iommu_get_resv_regions(struct device *dev,
>                        struct list_head *head)
> {
>      struct iommu_resv_region *region;
>      int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> 
>      region = iommu_alloc_resv_region(EXYNOS_DEV_ADDR_START,
>                       EXYNOS_DEV_ADDR_SIZE, prot,
>                       IOMMU_RESV_SW_MSI, GFP_KERNEL);
>      if (!region)
>          return;
> 
>      list_add_tail(&region->list, head);
> 
>      iommu_dma_get_resv_regions(dev, head);
> }
> 
> ...and it worked. It works even without that first allocation, so I'm
> not sure why, or whether, it's needed.

No, unless you have client devices that use MSIs *and* you want to 
support VFIO for them, you don't need to make up a SW_MSI region - feel 
free to hook up ".get_resv_regions = iommu_dma_get_resv_regions" directly.
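
As a rough sketch of that direct hookup: the stub types and the counting
stand-in for iommu_dma_get_resv_regions() below are placeholders so that
the snippet compiles outside a kernel tree. In the driver, only the
initialiser line inside exynos_iommu_ops would be added, alongside the
callbacks that are already there.

```c
#include <stddef.h>

/*
 * Userspace stand-ins (assumptions, for illustration only). In the
 * kernel these types and the helper come from the IOMMU core, and
 * struct iommu_ops has many more members than shown here.
 */
struct device { const char *name; };
struct list_head { struct list_head *next, *prev; };

struct iommu_ops {
	void (*get_resv_regions)(struct device *dev, struct list_head *head);
};

/*
 * Stand-in for the generic helper: it only counts calls. The real
 * helper is what picks up firmware-described reservations on the
 * driver's behalf.
 */
static unsigned int generic_helper_calls;

static void iommu_dma_get_resv_regions(struct device *dev,
				       struct list_head *head)
{
	(void)dev;
	(void)head;
	generic_helper_calls++;
}

/*
 * The suggested change: point the callback straight at the generic
 * helper, with no driver-specific wrapper and no made-up SW_MSI region.
 */
static const struct iommu_ops exynos_iommu_ops = {
	.get_resv_regions = iommu_dma_get_resv_regions,
};
```

With this shape the EXYNOS_DEV_ADDR_START/EXYNOS_DEV_ADDR_SIZE
definitions and the wrapper function quoted above go away entirely.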

The only other thing to be wary of in general here is the window between 
initialising the IOMMU device itself and attaching the client(s) (e.g. 
SMMUv3 has to take special care there). However in this particular case 
I guess you're OK, since exynos_sysmmu_probe() doesn't really touch the 
hardware anyway, and you don't have that intermediate "globally enabled 
without client-specific config" state which can disrupt traffic on the 
bigger, more complicated IOMMUs.

Thanks,
Robin.

> I need some input on making it upstreamable (mainly whether the base and
> length are correct or not). I'll send a patch.
> 
> [1] https://elixir.bootlin.com/linux/v6.15.3/source/drivers/iommu/of_iommu.c#L206
> [2] https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/reserved-memory/reserved-memory.yaml
> [3] https://elixir.bootlin.com/linux/v6.15.3/source/drivers/iommu/apple-dart.c#L966
> [4] https://elixir.bootlin.com/linux/v6.15.3/source/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c#L3573
> [5] https://elixir.bootlin.com/linux/v6.15.3/source/drivers/iommu/arm/arm-smmu/arm-smmu.c#L1594
> [6] https://elixir.bootlin.com/linux/v6.15.3/source/drivers/gpu/drm/exynos/exynos_drm_dma.c#L30
> 
>>
>> For a short-term kernel-side hack you could probably implement 
>> .def_domain_type to force IDENTITY for decon devices, as long as you 
>> can then convince the DRM driver to pick another device to grab a DMA 
>> ops domain from.
>>
>> Thanks,
>> Robin.

