etnaviv-gpu 134000.gpu: MMU fault status 0x00000002 on i.MX6 Quad Plus

Luís Mendes luis.p.mendes at gmail.com
Tue Oct 31 09:11:37 UTC 2017


Hi,

Russell: I believe that the video corruption I saw with the copyNToN(...)
function of xf86-video-armada is not the root cause of the problem, but
rather a side effect of the MMU faults.
Can you help me set up a debug environment where I can dump the client
process stack from the etnaviv kernel module upon an MMU fault? I have
enough space on my SSD to dump the memory images.
Otherwise it will be difficult for me to find the cause of the MMU faults.
I have already disabled as much hardware acceleration as I could in
xf86-video-armada, but that is not enough to avoid the MMU faults.
I could try to run xorg-server-core under GDB, but it may not be the direct
cause of the MMU faults and I don't know where to look... stepping through
Xorg until the MMU faults occur is also infeasible, I think.
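
Short of a full debug environment, a first step might be to locate each
fault in a saved kernel log so it can be lined up with what the X server
was doing at that moment. The following is only a minimal sketch: the
pattern is modelled on the fault line quoted in the subject of this
thread, the function name find_mmu_faults is made up for illustration,
and nothing is assumed about any other lines the driver prints.

```python
import re

# Pattern modelled on the fault line in the subject of this thread
# ("etnaviv-gpu 134000.gpu: MMU fault status 0x00000002").
# Assumption: other kernels may format this message differently.
FAULT_RE = re.compile(
    r"etnaviv-gpu (?P<dev>\S+): MMU fault status (?P<status>0x[0-9a-fA-F]+)"
)


def find_mmu_faults(lines):
    """Return (line_number, device, status) for each MMU fault line found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = FAULT_RE.search(line)
        if match:
            status = int(match.group("status"), 16)
            hits.append((number, match.group("dev"), status))
    return hits
```

It can be fed any iterable of log lines, for example an open file holding
saved dmesg output.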

I've just installed Ubuntu MATE 17.10 and Ubuntu Desktop 17.10 on my
Wandboard Quad Plus using Linux kernel 4.14-rc5.

The MMU faults still happen with Ubuntu MATE 17.10 upon login, but not with
Ubuntu Desktop 17.10.
Furthermore, when using lightdm as the display manager instead of gdm,
Ubuntu MATE shows even worse screen corruption: menus and dialogs are just
white rectangles with no text or buttons rendered.

Regards,
Luís

On Mon, Oct 23, 2017 at 9:45 PM, Luís Mendes <luis.p.mendes at gmail.com>
wrote:

> Hi Russell, Lucas,
>
> I have some news regarding the above topic...
> I was supposed to check the X server API, but unfortunately I still
> haven't found the time to do it. However, I've noticed that these commits
> in linux-4.14-rc5 improve the situation regarding the corrupted screen and
> screen blanking in the Ubuntu MATE 17.04 login screen with i.MX6 Quad Plus:
>
> *gpu: ipu-v3: pre: implement workaround for ERR009624*
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.14-rc6&id=11aff4b4c7c4b7257660ef890920f2ac72911ed0
>
> *gpu: ipu-v3: prg: wait for double buffers to be filled on channel startup*
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.14-rc6&id=263c3b8044f9c9356a34fdb2640b52d27e378f9c
>
> *gpu: ipu-v3: Allow channel burst locking on i.MX6 only*
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.14-rc6&id=cda77556447c782b3c9c068f81ef58136cb487c3
>
> I've just tested kernel 4.14-rc5 on my Wandboard Quad Plus and it is able
> to display the login screen without blanking or corrupting the image,
> whereas with linux-4.13 it wasn't working. I used Fabio's new DTBs for the
> Wandboard Quad Plus rev. D1 with both kernel versions. These DTBs fix the
> problem I was having with a corrupted screen during early boot, but not
> the login screen problem.
>
> Unfortunately the MMU faults remain along with the missing text in menus
> and icons as already reported.
>
> I would like to contribute more to this topic, but unfortunately I have
> not been able to find the time to do it.
>
> Regards,
> Luís Mendes
>
> On Thu, Aug 31, 2017 at 1:44 PM, Lucas Stach <l.stach at pengutronix.de>
> wrote:
>
>> Am Donnerstag, den 31.08.2017, 13:36 +0100 schrieb Luís Mendes:
>>
>>
>>
>> > The imx6q is not showing any issues on this Ubuntu Mate 17.04, even
>> > for the conditions:
>> > Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=24),
>> > dst(y=0,h=25), dy=0
>> > Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=24),
>> > dst(y=-25,h=25), dy=25
>> > Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=1),
>> > dst(y=-25,h=25), dy=1
>> > Aug 31 11:58:12 picolo etnaviv[4694]: Mismatch src(y=0,h=1),
>> > dst(y=0,h=25), dy=0
>> >
>> >
>> > No MMU faults, no screen corruptions.
>> >
>>
>> The GPUs on MX6Q are unable to produce MMU faults, so with incorrect
>> programming they are reading/writing unrelated data, or the MMU scratch
>> page.
>>
>>
>> > The imx6qp always has MMU faults and shows the issues when
>> > removing the "goto callback" instruction.
>> > Aug 30 18:41:32 picolo etnaviv[1722]: Mismatch src(y=0,h=24),
>> > dst(y=0,h=25), dy=0
>> > Aug 30 18:41:34 picolo etnaviv[1722]: Mismatch src(y=0,h=24),
>> > dst(y=-25,h=25), dy=25
>> > Aug 30 18:41:34 picolo etnaviv[1722]: Mismatch src(y=0,h=1),
>> > dst(y=-25,h=25), dy=1
>> > Aug 30 18:41:41 picolo etnaviv[1722]: Mismatch src(y=0,h=1),
>> > dst(y=0,h=25), dy=0
>>
>> The MX6QP is much stricter on the MMU side and will stop the GPU from
>> processing any further commands if it is trying to read/write unmapped
>> data.
>>
>> >
>> > This probably indicates that there is no implementation issue with
>> > CopyNtoN and this is rather a side effect of the MMU faults.
>> >
>>
>> This indicates there is in fact an implementation error in CopyNtoN, but
>> as Russell pointed out it seems to be caused by core X server functions.
>> The question is if we can reasonably work around the issue.
>>
>> Regards,
>> Lucas
>>
>>
> Thanks for the clarification.
> I will have a look at the X server APIs that Russell is referring to...
> and check the CopyNToN implementation.
> Maybe I can find something, although I am not familiar with these APIs.
>
> Regards,
> Luís
>
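
The two "Mismatch" excerpts quoted above (imx6q and imx6qp) can be compared
mechanically. The sketch below is only an illustration: the helper name
mismatches is made up, the log lines are copied from this thread and
unwrapped to one message per line, and it says nothing about logs beyond
these excerpts.

```python
import re

# Matches the "Mismatch src(...), dst(...), dy=..." messages quoted in this
# thread. Assumption: one message per log line, as syslog would record it.
MISMATCH_RE = re.compile(
    r"Mismatch src\(y=(-?\d+),h=(-?\d+)\),\s*"
    r"dst\(y=(-?\d+),h=(-?\d+)\),\s*dy=(-?\d+)"
)


def mismatches(log_text):
    """Return (src_y, src_h, dst_y, dst_h, dy) tuples in log order."""
    return [
        tuple(int(value) for value in match.groups())
        for match in MISMATCH_RE.finditer(log_text)
    ]


# Excerpts copied from the quoted messages, unwrapped.
IMX6Q_LOG = """\
Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=24), dst(y=0,h=25), dy=0
Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=24), dst(y=-25,h=25), dy=25
Aug 31 11:58:08 picolo etnaviv[4694]: Mismatch src(y=0,h=1), dst(y=-25,h=25), dy=1
Aug 31 11:58:12 picolo etnaviv[4694]: Mismatch src(y=0,h=1), dst(y=0,h=25), dy=0
"""

IMX6QP_LOG = """\
Aug 30 18:41:32 picolo etnaviv[1722]: Mismatch src(y=0,h=24), dst(y=0,h=25), dy=0
Aug 30 18:41:34 picolo etnaviv[1722]: Mismatch src(y=0,h=24), dst(y=-25,h=25), dy=25
Aug 30 18:41:34 picolo etnaviv[1722]: Mismatch src(y=0,h=1), dst(y=-25,h=25), dy=1
Aug 30 18:41:41 picolo etnaviv[1722]: Mismatch src(y=0,h=1), dst(y=0,h=25), dy=0
"""

# True when both boards logged the same sequence of mismatched copies.
same_sequence = mismatches(IMX6Q_LOG) == mismatches(IMX6QP_LOG)
```

For these excerpts the two sequences are identical, which fits Lucas's
explanation that both boards receive the same requests and only the
i.MX6QP reports them as MMU faults.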