[PATCH v2] fbdev: Fix invalid page access after closing deferred I/O devices

Mon Jan 30 09:28:16 UTC 2023

Hi

Am 30.01.23 um 09:52 schrieb Takashi Iwai:
> On Mon, 30 Jan 2023 09:28:36 +0100,
> Thomas Zimmermann wrote:
>>
>> Hi
>>
>> Am 29.01.23 um 09:28 schrieb Takashi Iwai:
>>> When a fbdev with deferred I/O is once opened and closed, the dirty
>>> pages still remain queued in the pageref list, and eventually later
>>> those may be processed in the delayed work.  This may lead to a
>>> corruption of pages, hitting an Oops.
>>
>> Do you have more information on this problem?
> 
> The details are in SUSE bugzilla, but that's an internal bug entry
> (and you know the number :)  It happens at the following at least:
> 
> - A VM is started with VGA console, no fb, on the installer
> - VM is switched to bochs drm
> - Start fbiterm on VT1, switching to the graphics mode on VT
> - Exit fbiterm, going back to the text mode on VT;
>    at this moment, it gets Oops like:
> 
> [   42.338319][  T122] BUG: unable to handle page fault for address:
> ffffe570c1000030
> [   42.340063][  T122] #PF: supervisor read access in kernel mode
> [   42.340519][  T122] #PF: error_code(0x0000) - not-present page
> [   42.340979][  T122] PGD 34c38067 P4D 34c38067 PUD 34c37067 PMD 0
> [   42.341456][  T122] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [   42.341853][  T122] CPU: 1 PID: 122 Comm: kworker/1:2 Not tainted
> 5.14.21-150500.5.g2ad24ee-default #1 SLE15-SP5 (unreleased)
> b7a28d028376a517e888a7ff28c5e5dede93267c
> [   42.343000][  T122] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
> [   42.343929][  T122] Workqueue: events fb_deferred_io_work
> [   42.344355][  T122] RIP: 0010:page_mapped+0x5e/0x90
> [   42.344743][  T122] Code: a8 01 75 d7 8b 47 30 f7 d0 c1 e8 1f c3 cc cc cc cc
> 48 89 df e8 33 9c 05 00 89 c1 31 c0 85 c9 74 13 eb d3 48 c1 e2 06 48 01 da <8b>
> 42 30 85 c0 79 c0 83 c1 01 48 8b 33 48 63 d1 b8 01 00 00 00 f7
> [   42.346285][  T122] RSP: 0018:ffffb68640207e08 EFLAGS: 00010286
> [   42.346749][  T122] RAX: 00000000b3aea8f0 RBX: ffffe570c0f00000 RCX:
> 0000000000004000
> [   42.347355][  T122] RDX: ffffe570c1000000 RSI: 000fffffc0010009 RDI:
> ffffe570c0f00000
> [   42.347960][  T122] RBP: ffffffffc0503050 R08: 0000000000000000 R09:
> 0000000000000001
> [   42.348568][  T122] R10: 0000000000000000 R11: ffffb68640207c88 R12:
> ffffffffc0503020
> [   42.349180][  T122] R13: ffff921281dcdc00 R14: ffff9212bcf08000 R15:
> ffffe570c0f00000
> [   42.349789][  T122] FS:  0000000000000000(0000) GS:ffff9212b3b00000(0000)
> knlGS:0000000000000000
> [   42.350471][  T122] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   42.350975][  T122] CR2: ffffe570c1000030 CR3: 000000001b810000 CR4:
> 00000000000006e0
> [   42.351588][  T122] Call Trace:
> [   42.351845][  T122]  <TASK>
> [   42.352069][  T122]  page_mkclean+0x6e/0xc0
> [   42.352400][  T122]  ? page_referenced_one+0x190/0x190
> [   42.353714][  T122]  ? pmdp_collapse_flush+0x60/0x60
> [   42.354106][  T122]  fb_deferred_io_work+0x13d/0x190
> [   42.354496][  T122]  process_one_work+0x267/0x440
> [   42.354866][  T122]  ? process_one_work+0x440/0x440
> [   42.355247][  T122]  worker_thread+0x2d/0x3d0
> [   42.355590][  T122]  ? process_one_work+0x440/0x440
> [   42.355972][  T122]  kthread+0x156/0x180
> [   42.356281][  T122]  ? set_kthread_struct+0x50/0x50
> [   42.356662][  T122]  ret_from_fork+0x22/0x30
> [   42.357006][  T122]  </TASK>
> 
> The page info shows that it's a compound page but it's somehow
> broken.  On VM, it's triggered reliably with the scenario above,
> always at the same position.
> 
> FWIW, the Oops is hit even if there is no rewrite on the screen.
> That is, another procedure is:
> - Start VM, run fbiterm on VT1
> - Switch to VT2, text mode
> - On VT2, kill fbiterm; the crash still happens even if no screen
>    change is performed
> 
>> The mmap'ed buffer of the fbdev device comes from a vmalloc call. That
>> memory's location never changes; even across pairs of open/close on
>> the device file. I'm surprised that a page entry becomes invalid.
>>
>> In drm_fbdev_cleanup(), we first remove the fbdefio at [1] and then
>> vfree() the shadow buffer. So the memory should still be around until
>> fbdevio is gone.
> 
> Yes, that's the puzzling part, too.  Also, another thing is that the
> bug couldn't be triggered easily when the fb is started in a different
> way.  e.g. when you run fbiterm & exit on the VM that had efifb, it
> didn't hit.
> 
> So, overall, it might be that I'm scratching a wrong surface.  But at
> least it "fixes" the problem above apparently, and the deferred io
> base code itself has certainly the potential problem in general as my
> patch suggests.

Are there multiple graphics devices? There's just recently been a bugfix 
where graphics devices accidentally shared the same list of deferred 
pages. See

https://lore.kernel.org/dri-devel/20230121192418.2814955-4-javierm@redhat.com/

> 
>> [1]
>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L2146
>>
>>>
>>> This patch makes sure to cancel the delayed work and clean up the
>>> pageref list at closing the device for addressing the bug.  A part of
>>> the cleanup code is factored out as a new helper function that is
>>> called from the common fb_release().
>>
>> The delayed work is required to copy the framebuffer to the device
>> output. So if it's just canceled, could this result in missing
>> updates?
>>
>> There's a call to cancel_delayed_work_sync() in the new helper
>> fb_deferred_io_release(). Is this the right function? Maybe
>> flush_delayed_work() is a better choice.
> 
> I thought of that, but took a shorter path.
> OK, let's check whether this keeps working with that change.

I read that cancel_() is not enough and needs to be followed by a 
flush_() to ensure quiescence. So maybe we should call that flush_ 
unconditionally.

> 
>>> Reviewed-by: Patrik Jakobsson <patrik.r.jakobsson at gmail.com>
>>> Cc: <stable at vger.kernel.org>
>>> Signed-off-by: Takashi Iwai <tiwai at suse.de>
>>
>> This could use a Fixes tag. It's not exactly clear to me when this
>> problem got originally introduced, but the recent refactoring seems a
>> candidate.
>>
>> Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
> 
> Hrm, this might be.  Maybe Patrik can test with the revert of this?

That's not easily revertable.

> 
>> Cc: Thomas Zimmermann <tzimmermann at suse.de>
>> Cc: Javier Martinez Canillas <javierm at redhat.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>> Cc: Maxime Ripard <mripard at kernel.org>
>> Cc: Zack Rusin <zackr at vmware.com>
>> Cc: VMware Graphics Reviewers <linux-graphics-maintainer at vmware.com>
>> Cc: Jaya Kumar <jayalk at intworks.biz>
>> Cc: Daniel Vetter <daniel at ffwll.ch>
>> Cc: "K. Y. Srinivasan" <kys at microsoft.com>
>> Cc: Haiyang Zhang <haiyangz at microsoft.com>
>> Cc: Wei Liu <wei.liu at kernel.org>
>> Cc: Dexuan Cui <decui at microsoft.com>
>> Cc: Steve Glendinning <steve.glendinning at shawell.net>
>> Cc: Bernie Thompson <bernie at plugable.com>
>> Cc: Helge Deller <deller at gmx.de>
>> Cc: Andy Shevchenko <andriy.shevchenko at linux.intel.com>
>> Cc: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>> Cc: Stephen Kitt <steve at sk2.org>
>> Cc: Peter Suti <peter.suti at streamunlimited.com>
>> Cc: Sam Ravnborg <sam at ravnborg.org>
>> Cc: Geert Uytterhoeven <geert+renesas at glider.be>
>> Cc: ye xingchen <ye.xingchen at zte.com.cn>
>> Cc: Petr Mladek <pmladek at suse.com>
>> Cc: John Ogness <john.ogness at linutronix.de>
>> Cc: Tom Rix <trix at redhat.com>
>> Cc: dri-devel at lists.freedesktop.org
>> Cc: linux-fbdev at vger.kernel.org
>> Cc: linux-hyperv at vger.kernel.org
>> Cc: <stable at vger.kernel.org> # v5.19+
> 
> Nah, please don't.  Too many Cc's, literally a spam.

Ok.

> 
>>> ---
>>> v1->v2: Fix build error without CONFIG_FB_DEFERRED_IO
>>>
>>>    drivers/video/fbdev/core/fb_defio.c | 10 +++++++++-
>>>    drivers/video/fbdev/core/fbmem.c    |  4 ++++
>>>    include/linux/fb.h                  |  1 +
>>>    3 files changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
>>> index c730253ab85c..583cbcf09446 100644
>>> --- a/drivers/video/fbdev/core/fb_defio.c
>>> +++ b/drivers/video/fbdev/core/fb_defio.c
>>> @@ -313,7 +313,7 @@ void fb_deferred_io_open(struct fb_info *info,
>>>    }
>>>    EXPORT_SYMBOL_GPL(fb_deferred_io_open);
>>>    -void fb_deferred_io_cleanup(struct fb_info *info)
>>> +void fb_deferred_io_release(struct fb_info *info)
>>>    {
>>>    	struct fb_deferred_io *fbdefio = info->fbdefio;
>>>    	struct page *page;
>>> @@ -327,6 +327,14 @@ void fb_deferred_io_cleanup(struct fb_info *info)
>>>    		page = fb_deferred_io_page(info, i);
>>>    		page->mapping = NULL;
>>>    	}
>>> +}
>>> +EXPORT_SYMBOL_GPL(fb_deferred_io_release);
>>
>> It's all in the same module. No need to export this symbol.
> 
> I noticed it, too, but just keep the same style as other functions :)
> That said, the other exported symbols are also useless.  I can prepare
> another patch to clean it up.

Your choice, but appreciated.

Best regards
Thomas

> 
> 
> thanks,
> 
> Takashi

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 840 bytes
Desc: OpenPGP digital signature
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20230130/e32b364d/attachment-0001.sig>