[Nouveau] nouveau sync DMA memory not allocated

poma pomidorabelisima at gmail.com
Wed Dec 16 03:10:24 PST 2015


On 12.11.2015 14:48, Thierry Reding wrote:
> On Wed, Nov 11, 2015 at 09:12:33PM +0100, poma wrote:
>> On 10.11.2015 17:25, Mario Kleiner wrote:
>>> On 11/10/2015 05:00 PM, Thierry Reding wrote:
>>>> On Tue, Nov 10, 2015 at 03:54:52PM +0100, Mario Kleiner wrote:
>>>>> From: Daniel Vetter <daniel.vetter at ffwll.ch>
>>>>>
>>>>> Apparently pre-nv50 pageflip events happen before the actual vblank
>>>>> period. Therefore that functionality got semi-disabled in
>>>>>
>>>>> commit af4870e406126b7ac0ae7c7ce5751f25ebe60f28
>>>>> Author: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>> Date:   Tue May 13 00:42:08 2014 +0200
>>>>>
>>>>>      drm/nouveau/kms/nv04-nv40: fix pageflip events via special case.
>>>>>
>>>>> Unfortunately that hack got uprooted in
>>>>>
>>>>> commit cc1ef118fc099295ae6aabbacc8af94d8d8885eb
>>>>> Author: Thierry Reding <treding at nvidia.com>
>>>>> Date:   Wed Aug 12 17:00:31 2015 +0200
>>>>>
>>>>>      drm/irq: Make pipe unsigned and name consistent
>>>>>
>>>>> Trigering a warning when trying to sample the vblank timestamp for a
>>>>> non-existing pipe. There's a few ways to fix this:
>>>>>
>>>>> - Open-code the old behaviour, which just enshrines this slight
>>>>>    breakage of the userspace ABI.
>>>>>
>>>>> - Revert Mario's commit and again inflict broken timestamps, again not
>>>>>    pretty.
>>>>>
>>>>> - Fix this for real by delaying the pageflip TS until the next vblank
>>>>>    interrupt, thereby making it accurate.
>>>>>
>>>>> This patch implements the third option. Since having a page flip
>>>>> interrupt that happens when the pageflip gets armed and not when it
>>>>> completes in the next vblank seems to be fairly common (older i915 hw
>>>>> works very similarly) create a new helper to arm vblank events for
>>>>> such drivers.
>>>>>
>>>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=106431
>>>>> Cc: Thierry Reding <treding at nvidia.com>
>>>>> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>> Cc: Ben Skeggs <bskeggs at redhat.com>
>>>>> Cc: Ilia Mirkin <imirkin at alum.mit.edu>
>>>>>
>>>>> v2 (mario): Integrate my own review comments into Daniels patch.
>>>>>     - Fix function prototypes in drmP.h
>>>>>     - Add missing vblank_put() for pageflip completion without
>>>>>       pageflip event.
>>>>>     - Initialize sequence number for queued pageflip event to avoidng
>>>>>       trouble in drm_handle_vblank_events().
>>>>>     - Remove dead code and spelling fix.
>>>>>
>>>>> v3 (mario): Add a signed-off-by and cc stable tag per Ilja's advice.
>>>>>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>>>> (v1) Reviewed-by: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>> (v2/v3) Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>>
>>>>> Cc: stable at vger.kernel.org # v4.3
>>>>> ---
>>>>>   drivers/gpu/drm/drm_irq.c                 | 54 ++++++++++++++++++++++++++++++-
>>>>>   drivers/gpu/drm/nouveau/nouveau_display.c | 19 ++++++-----
>>>>>   include/drm/drmP.h                        |  4 +++
>>>>>   3 files changed, 68 insertions(+), 9 deletions(-)
>>>>
>>>> This looks good to me. Let me clean this up a little and submit it to
>>>> Dave.
>>>>
>>>> Thierry
>>>>
>>>
>>> Btw., if somebody has a functional old card for testing this, it should 
>>> be easy to verify if it works on pre-nv50. If it would not work it would 
>>> deliver the pageflip event 1 frame delayed, so at least on standard 
>>> nouveau + default DRI2 + default double-buffering the rate for a tight 
>>> loop of page-flipped swaps should go down to 30 fps on a 60 Hz display, 
>>> quite noticeable. Afaik we also have Piglit tests for OML_sync_control 
>>> which would likely fail if this would be broken.
>>>
>>> Oh and if someone has tips on how to resurrect an old nv-40 PC (booted 
>>> with BIOS only) graphics card in a MacPro (EFI boot), i wouldn't mind 
>>> hearing them. It would be nice to still be able to use that card for 
>>> testing.
>>>
>>> thanks,
>>> -mario
>>
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 313 at lib/dma-debug.c:1205 check_sync+0x169/0x6e0()
>> nouveau 0000:01:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x00000000c0bf6468] [size=4096 bytes]
>> Modules linked in: nouveau(+) ...
>> CPU: 0 PID: 313 Comm: systemd-udevd Not tainted 4.3.0-3.fc22.i686+debug #1
>> ...
>> Call Trace:
>>  [<c078a28f>] dump_stack+0x48/0x69
>>  [<c0461fc7>] warn_slowpath_common+0x87/0xc0
>>  [<c07b9029>] ? check_sync+0x169/0x6e0
>>  [<c07b9029>] ? check_sync+0x169/0x6e0
>>  [<c046203e>] warn_slowpath_fmt+0x3e/0x60
>>  [<c07b9029>] check_sync+0x169/0x6e0
>>  [<c07b96ad>] debug_dma_sync_single_for_device+0x7d/0x90
>>  [<f7ece298>] ? ttm_bo_del_sub_from_lru+0x18/0x50 [ttm]
>>  [<c040bef0>] ? text_poke_bp+0xd0/0xd0
>>  [<f85d96db>] nouveau_bo_sync_for_device+0x8b/0xa0 [nouveau]
>>  [<f85d97c4>] nouveau_bo_validate+0x34/0x40 [nouveau]
>>  [<f85d9958>] nouveau_bo_pin+0x188/0x290 [nouveau]
>>  [<f85d7fe0>] ? nv10_bo_put_tile_region+0x80/0x80 [nouveau]
>>  [<f85ec7fd>] nouveau_channel_prep+0xfd/0x2c0 [nouveau]
>>  [<f85eca17>] nouveau_channel_new+0x57/0x7a0 [nouveau]
>>  [<c05da57c>] ? kfree+0xdc/0x280
>>  [<f8540082>] ? nvif_object_sclass_put+0x12/0x20 [nouveau]
>>  [<f85d5666>] nouveau_drm_load+0x596/0x8d0 [nouveau]
>>  [<c04bc14c>] ? trace_hardirqs_on_caller+0x12c/0x1d0
>>  [<f7e8bfe9>] ? drm_minor_register+0x89/0x120 [drm]
>>  [<f7e8c116>] drm_dev_register+0x96/0xa0 [drm]
>>  [<f7e8ece9>] drm_get_pci_dev+0x79/0x1c0 [drm]
>>  [<c07d973e>] ? pcibios_set_master+0x4e/0xa0
>>  [<f85d5b8e>] nouveau_drm_probe+0x1ee/0x220 [nouveau]
>>  [<c07dbf3b>] pci_device_probe+0x7b/0xf0
>>  [<c08bf366>] ? devices_kset_move_last+0x56/0xa0
>>  [<c08c2de4>] driver_probe_device+0x204/0x490
>>  [<c08c30bc>] ? __driver_attach+0x4c/0x90
>>  [<c07dbc02>] ? pci_match_device+0xd2/0x100
>>  [<c08c30f1>] __driver_attach+0x81/0x90
>>  [<c08c3070>] ? driver_probe_device+0x490/0x490
>>  [<c08c0b97>] bus_for_each_dev+0x57/0xa0
>>  [<c08c25ce>] driver_attach+0x1e/0x20
>>  [<c08c3070>] ? driver_probe_device+0x490/0x490
>>  [<c08c214f>] bus_add_driver+0x1ef/0x290
>>  [<c08c3bdd>] driver_register+0x5d/0xf0
>>  [<c07da71a>] __pci_register_driver+0x4a/0x50
>>  [<f7e8ef0d>] drm_pci_init+0xdd/0x100 [drm]
>>  [<f7f211f9>] nouveau_drm_init+0x1f9/0x1000 [nouveau]
>>  [<f7f21000>] ? 0xf7f21000
>>  [<c040047a>] do_one_initcall+0xaa/0x200
>>  [<f7f21000>] ? 0xf7f21000
>>  [<c04dae42>] ? rcu_read_lock_sched_held+0x42/0x80
>>  [<c05daf0d>] ? kmem_cache_alloc_trace+0x23d/0x310
>>  [<c0582dd1>] ? do_init_module+0x21/0x1b7
>>  [<c0582dd1>] ? do_init_module+0x21/0x1b7
>>  [<c0582e00>] do_init_module+0x50/0x1b7
>>  [<c05036ec>] load_module+0x1ebc/0x2550
>>  [<c0b993d7>] ? _raw_spin_unlock_irq+0x27/0x40
>>  [<c048e6aa>] ? finish_task_switch+0x8a/0x1d0
>>  [<c0503ec7>] SyS_init_module+0x147/0x1a0
>>  [<c04012c4>] ? do_audit_syscall_entry.isra.9+0x44/0x50
>>  [<c0401627>] ? syscall_trace_enter_phase1+0x107/0x130
>>  [<c0b99f05>] syscall_call+0x7/0x7
>> ---[ end trace d3c14159641a1388 ]---
>>
>>
>> NV34 tested with 4.3.0-3.fc22.i686
>> i.e. 4.3.0-1.fc24.i686 & drm-nouveau-Fix-pre-nv50-pageflip-events-v4.patch
>>
>> http://koji.fedoraproject.org/koji/buildinfo?buildID=695636
>> https://patchwork.kernel.org/patch/7591531/mbox
> 
> This doesn't look at all related and has probably been an issue for
> quite some time. I /think/ this happens because memory is allocated from
> the non-DMA pool (i.e. using alloc_page()) and then ends up getting run
> through the dma_sync_*() API for cache maintenance. But the assumption
> is that you can only do cache maintenance by the dma_sync_*() API on
> memory allocated by dma_alloc_*(), hence the warning.
> 
> There was some discussion about this a while ago, and there was some
> conclusion that an API was needed to do cache maintenance on non-DMA-
> allocated pages of memory, but I don't think any work happened towards
> that API.
> 
> Adding Alex and Arnd who had been part of that discussion, though
> possibly in different threads. Guys, I've been doing too many unrelated
> things lately it seems, because I can't remember where exactly we left
> off. I vaguely remember that at some point somebody (maybe Russell) had
> objected to adding such a non-DMA cache-maintenance API, but I can't
> find a link to the relevant threads.
> 
> Thierry
> 


Mister Thierry, if you make a patch, do not hesitate to share.

Thanks

Ref.
https://bugs.freedesktop.org/attachment.cgi?id=120544




More information about the dri-devel mailing list