[REGRESSION] QXL display malfunction
Thomas Zimmermann
tzimmermann at suse.de
Tue Jul 2 08:05:08 UTC 2024
On 01.07.24 at 12:02, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Thomas, was there some progress wrt fixing the regression below? I might
> have missed something, but from here it looks like this fell through the
> cracks.
Thanks for reminding.
>
> Makes me wonder if we should temporarily revert this for now to fix this
> for rc7 and ensure things get at least one week of testing before the final.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
> On 14.06.24 15:45, Kaplan, David wrote:
>> [AMD Official Use Only - AMD Internal Distribution Only]
>>
>>> -----Original Message-----
>>> From: Thomas Zimmermann <tzimmermann at suse.de>
>>> Sent: Wednesday, June 12, 2024 9:26 AM
>>> To: Linux regressions mailing list <regressions at lists.linux.dev>
>>> Cc: Petkov, Borislav <Borislav.Petkov at amd.com>;
>>> zack.rusin at broadcom.com; dmitry.osipenko at collabora.com; Kaplan, David
>>> <David.Kaplan at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>;
>>> Dave Airlie <airlied at redhat.com>; Maarten Lankhorst
>>> <maarten.lankhorst at linux.intel.com>; Maxime Ripard
>>> <mripard at kernel.org>; LKML <linux-kernel at vger.kernel.org>; ML dri-devel
>>> <dri-devel at lists.freedesktop.org>; spice-devel at lists.freedesktop.org;
>>> virtualization at lists.linux.dev
>>> Subject: Re: [REGRESSION] QXL display malfunction
>>>
>>> Caution: This message originated from an External Source. Use proper
>>> caution when opening attachments, clicking links, or responding.
>>>
>>>
>>> Hi
>>>
>>> On 12.06.24 at 14:41, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> [CCing a few more people and lists that get_maintainers pointed out
>>>> for qxl]
>>>>
>>>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>>>> for once, to make this easily accessible to everyone.
>>>>
>>>> Thomas, from here it looks like this report that apparently is caused
>>>> by a change of yours that went into 6.10-rc1 (b33651a5c98dbd
>>>> ("drm/qxl: Do not pin buffer objects for vmap")) fell through the
>>>> cracks. Or was progress made to resolve this and I just missed this?
>>>>
>>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker'
>>>> hat)
>>>> --
>>>> Everything you wanna know about Linux kernel regression tracking:
>>>> https://linux-regtracking.leemhuis.info/about/#tldr
>>>> If I did something stupid, please tell me, as explained on that page.
>>>>
>>>> #regzbot poke
>>>>
>>>>
>>>> On 03.06.24 04:29, Kaplan, David wrote:
>>>>>> -----Original Message-----
>>>>>> From: Kaplan, David
>>>>>> Sent: Sunday, June 2, 2024 9:25 PM
>>>>>> To: tzimmermann at suse.de; dmitry.osipenko at collabora.com; Koenig,
>>>>>> Christian <Christian.Koenig at amd.com>; zach.rusin at broadcom.com
>>>>>> Cc: Petkov, Borislav <Borislav.Petkov at amd.com>;
>>>>>> regressions at list.linux.dev
>>>>>> Subject: [REGRESSION] QXL display malfunction
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am running an Ubuntu 19.10 VM with a tip kernel using QXL video
>>>>>> and I've observed the VM graphics often malfunction after boot,
>>>>>> sometimes failing to load the Ubuntu desktop or even immediately
>>>>>> shutting the guest down.
>>>>>> When it does load, the guest dmesg log often contains errors like
>>>>>>
>>>>>> [ 4.303586] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
>>>>>> [ 4.586883] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65376256x16777216+0+0
>>>>>> [ 4.904036] [drm:drm_atomic_helper_commit_planes] *ERROR* head 1 wrong: 65335296x16777216+0+0
>>> I don't see how these messages are related. Did they already appear before
>>> the broken commit was there?
>> No, I did not observe them prior to the broken commit.
>>
>>>>>> [ 5.374347] [drm:qxl_release_from_id_locked] *ERROR* failed to find id in release_idr
>>> Is there only one such message in the log, or multiple/frequent ones?
>> I would usually only see one.
>>
>>> Could you provide a stack trace of what happens before?
>> Here's the top of a backtrace when the error occurs:
>> #0 qxl_release_from_id_locked (qdev=qdev@entry=0xffff88810126e000, id=id@entry=262151)
>>     at drivers/gpu/drm/qxl/qxl_release.c:373
>> #1 0xffffffff819f5b6a in qxl_garbage_collect (qdev=0xffff88810126e000)
>>     at drivers/gpu/drm/qxl/qxl_cmd.c:222
>> #2 0xffffffff810e3aa8 in process_one_work (worker=worker@entry=0xffff888101680300,
>>     work=0xffff88810126f340) at kernel/workqueue.c:3231
>> #3 0xffffffff810e6281 in process_scheduled_works (worker=<optimized out>)
>>     at kernel/workqueue.c:3312
>> #4 worker_thread (__worker=0xffff888101680300) at kernel/workqueue.c:3393
>>
>>> We sometimes draw into the buffer object from the CPU. For accessing the
>>> buffer object's pages from the CPU, only a vmap operation should be
>>> necessary. It appears as if qxl also requires a pin. My guess is that the pin
>>> inserts the buffer-object's host-side pages and the code around
>>> qxl_release_from_id_locked() appears to be garbage-collecting them.
>>> Hence without the pin, the GC complains about inconsistent state.
>>>>>> I bisected the issue down to "drm/qxl: Do not pin buffer objects for vmap"
>>>>>> (b33651a5c98dbd5a919219d8c129d0674ef74299).
>>> Thanks for bisecting. Does it work if you revert that commit?
>> Yes
>>
>> Thanks --David Kaplan
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)