[Nouveau] [PATCH] drm/nouveau: Fix pre-nv50 pageflip events (v3) -> v4

Thierry Reding thierry.reding at gmail.com
Thu Nov 12 05:48:18 PST 2015


On Wed, Nov 11, 2015 at 09:12:33PM +0100, poma wrote:
> On 10.11.2015 17:25, Mario Kleiner wrote:
> > On 11/10/2015 05:00 PM, Thierry Reding wrote:
> >> On Tue, Nov 10, 2015 at 03:54:52PM +0100, Mario Kleiner wrote:
> >>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
> >>>
> >>> Apparently pre-nv50 pageflip events happen before the actual vblank
> >>> period. Therefore that functionality got semi-disabled in
> >>>
> >>> commit af4870e406126b7ac0ae7c7ce5751f25ebe60f28
> >>> Author: Mario Kleiner <mario.kleiner.de at gmail.com>
> >>> Date:   Tue May 13 00:42:08 2014 +0200
> >>>
> >>>      drm/nouveau/kms/nv04-nv40: fix pageflip events via special case.
> >>>
> >>> Unfortunately that hack got uprooted in
> >>>
> >>> commit cc1ef118fc099295ae6aabbacc8af94d8d8885eb
> >>> Author: Thierry Reding <treding at nvidia.com>
> >>> Date:   Wed Aug 12 17:00:31 2015 +0200
> >>>
> >>>      drm/irq: Make pipe unsigned and name consistent
> >>>
> >>> Trigering a warning when trying to sample the vblank timestamp for a
> >>> non-existing pipe. There's a few ways to fix this:
> >>>
> >>> - Open-code the old behaviour, which just enshrines this slight
> >>>    breakage of the userspace ABI.
> >>>
> >>> - Revert Mario's commit and again inflict broken timestamps, again not
> >>>    pretty.
> >>>
> >>> - Fix this for real by delaying the pageflip TS until the next vblank
> >>>    interrupt, thereby making it accurate.
> >>>
> >>> This patch implements the third option. Since having a page flip
> >>> interrupt that happens when the pageflip gets armed and not when it
> >>> completes in the next vblank seems to be fairly common (older i915 hw
> >>> works very similarly) create a new helper to arm vblank events for
> >>> such drivers.
> >>>
> >>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=106431
> >>> Cc: Thierry Reding <treding at nvidia.com>
> >>> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
> >>> Cc: Ben Skeggs <bskeggs at redhat.com>
> >>> Cc: Ilia Mirkin <imirkin at alum.mit.edu>
> >>>
> >>> v2 (mario): Integrate my own review comments into Daniels patch.
> >>>     - Fix function prototypes in drmP.h
> >>>     - Add missing vblank_put() for pageflip completion without
> >>>       pageflip event.
> >>>     - Initialize sequence number for queued pageflip event to avoidng
> >>>       trouble in drm_handle_vblank_events().
> >>>     - Remove dead code and spelling fix.
> >>>
> >>> v3 (mario): Add a signed-off-by and cc stable tag per Ilja's advice.
> >>>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> >>> (v1) Reviewed-by: Mario Kleiner <mario.kleiner.de at gmail.com>
> >>> (v2/v3) Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>
> >>>
> >>> Cc: stable at vger.kernel.org # v4.3
> >>> ---
> >>>   drivers/gpu/drm/drm_irq.c                 | 54 ++++++++++++++++++++++++++++++-
> >>>   drivers/gpu/drm/nouveau/nouveau_display.c | 19 ++++++-----
> >>>   include/drm/drmP.h                        |  4 +++
> >>>   3 files changed, 68 insertions(+), 9 deletions(-)
> >>
> >> This looks good to me. Let me clean this up a little and submit it to
> >> Dave.
> >>
> >> Thierry
> >>
> > 
> > Btw., if somebody has a functional old card for testing this, it should 
> > be easy to verify if it works on pre-nv50. If it would not work it would 
> > deliver the pageflip event 1 frame delayed, so at least on standard 
> > nouveau + default DRI2 + default double-buffering the rate for a tight 
> > loop of page-flipped swaps should go down to 30 fps on a 60 Hz display, 
> > quite noticeable. Afaik we also have Piglit tests for OML_sync_control 
> > which would likely fail if this would be broken.
> > 
> > Oh and if someone has tips on how to resurrect an old nv-40 PC (booted 
> > with BIOS only) graphics card in a MacPro (EFI boot), i wouldn't mind 
> > hearing them. It would be nice to still be able to use that card for 
> > testing.
> > 
> > thanks,
> > -mario
> 
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 313 at lib/dma-debug.c:1205 check_sync+0x169/0x6e0()
> nouveau 0000:01:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000000c0bf6468] [size=4096 bytes]
> Modules linked in: nouveau(+) ...
> CPU: 0 PID: 313 Comm: systemd-udevd Not tainted 4.3.0-3.fc22.i686+debug #1
> ...
> Call Trace:
>  [<c078a28f>] dump_stack+0x48/0x69
>  [<c0461fc7>] warn_slowpath_common+0x87/0xc0
>  [<c07b9029>] ? check_sync+0x169/0x6e0
>  [<c07b9029>] ? check_sync+0x169/0x6e0
>  [<c046203e>] warn_slowpath_fmt+0x3e/0x60
>  [<c07b9029>] check_sync+0x169/0x6e0
>  [<c07b96ad>] debug_dma_sync_single_for_device+0x7d/0x90
>  [<f7ece298>] ? ttm_bo_del_sub_from_lru+0x18/0x50 [ttm]
>  [<c040bef0>] ? text_poke_bp+0xd0/0xd0
>  [<f85d96db>] nouveau_bo_sync_for_device+0x8b/0xa0 [nouveau]
>  [<f85d97c4>] nouveau_bo_validate+0x34/0x40 [nouveau]
>  [<f85d9958>] nouveau_bo_pin+0x188/0x290 [nouveau]
>  [<f85d7fe0>] ? nv10_bo_put_tile_region+0x80/0x80 [nouveau]
>  [<f85ec7fd>] nouveau_channel_prep+0xfd/0x2c0 [nouveau]
>  [<f85eca17>] nouveau_channel_new+0x57/0x7a0 [nouveau]
>  [<c05da57c>] ? kfree+0xdc/0x280
>  [<f8540082>] ? nvif_object_sclass_put+0x12/0x20 [nouveau]
>  [<f85d5666>] nouveau_drm_load+0x596/0x8d0 [nouveau]
>  [<c04bc14c>] ? trace_hardirqs_on_caller+0x12c/0x1d0
>  [<f7e8bfe9>] ? drm_minor_register+0x89/0x120 [drm]
>  [<f7e8c116>] drm_dev_register+0x96/0xa0 [drm]
>  [<f7e8ece9>] drm_get_pci_dev+0x79/0x1c0 [drm]
>  [<c07d973e>] ? pcibios_set_master+0x4e/0xa0
>  [<f85d5b8e>] nouveau_drm_probe+0x1ee/0x220 [nouveau]
>  [<c07dbf3b>] pci_device_probe+0x7b/0xf0
>  [<c08bf366>] ? devices_kset_move_last+0x56/0xa0
>  [<c08c2de4>] driver_probe_device+0x204/0x490
>  [<c08c30bc>] ? __driver_attach+0x4c/0x90
>  [<c07dbc02>] ? pci_match_device+0xd2/0x100
>  [<c08c30f1>] __driver_attach+0x81/0x90
>  [<c08c3070>] ? driver_probe_device+0x490/0x490
>  [<c08c0b97>] bus_for_each_dev+0x57/0xa0
>  [<c08c25ce>] driver_attach+0x1e/0x20
>  [<c08c3070>] ? driver_probe_device+0x490/0x490
>  [<c08c214f>] bus_add_driver+0x1ef/0x290
>  [<c08c3bdd>] driver_register+0x5d/0xf0
>  [<c07da71a>] __pci_register_driver+0x4a/0x50
>  [<f7e8ef0d>] drm_pci_init+0xdd/0x100 [drm]
>  [<f7f211f9>] nouveau_drm_init+0x1f9/0x1000 [nouveau]
>  [<f7f21000>] ? 0xf7f21000
>  [<c040047a>] do_one_initcall+0xaa/0x200
>  [<f7f21000>] ? 0xf7f21000
>  [<c04dae42>] ? rcu_read_lock_sched_held+0x42/0x80
>  [<c05daf0d>] ? kmem_cache_alloc_trace+0x23d/0x310
>  [<c0582dd1>] ? do_init_module+0x21/0x1b7
>  [<c0582dd1>] ? do_init_module+0x21/0x1b7
>  [<c0582e00>] do_init_module+0x50/0x1b7
>  [<c05036ec>] load_module+0x1ebc/0x2550
>  [<c0b993d7>] ? _raw_spin_unlock_irq+0x27/0x40
>  [<c048e6aa>] ? finish_task_switch+0x8a/0x1d0
>  [<c0503ec7>] SyS_init_module+0x147/0x1a0
>  [<c04012c4>] ? do_audit_syscall_entry.isra.9+0x44/0x50
>  [<c0401627>] ? syscall_trace_enter_phase1+0x107/0x130
>  [<c0b99f05>] syscall_call+0x7/0x7
> ---[ end trace d3c14159641a1388 ]---
> 
> 
> NV34 tested with 4.3.0-3.fc22.i686
> i.e. 4.3.0-1.fc24.i686 & drm-nouveau-Fix-pre-nv50-pageflip-events-v4.patch
> 
> http://koji.fedoraproject.org/koji/buildinfo?buildID=695636
> https://patchwork.kernel.org/patch/7591531/mbox

This doesn't look at all related and has probably been an issue for
quite some time. I /think/ this happens because memory is allocated from
the non-DMA pool (i.e. using alloc_page()) and then ends up getting run
through the dma_sync_*() API for cache maintenance. But the assumption
is that you can only do cache maintenance by the dma_sync_*() API on
memory allocated by dma_alloc_*(), hence the warning.

There was some discussion about this a while ago, and there was some
conclusion that an API was needed to do cache maintenance on non-DMA-
allocated pages of memory, but I don't think any work happened towards
that API.

Adding Alex and Arnd who had been part of that discussion, though
possibly in different threads. Guys, I've been doing too many unrelated
things lately it seems, because I can't remember where exactly we left
off. I vaguely remember that at some point somebody (maybe Russell) had
objected to adding such a non-DMA cache-maintenance API, but I can't
find a link to the relevant threads.

Thierry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20151112/b2802904/attachment.sig>


More information about the dri-devel mailing list