[Intel-gfx] [PATCH v4 6/6] drm/i915: fix context/engine cleanup order
Mika Kuoppala
mika.kuoppala at linux.intel.com
Thu Feb 11 15:02:39 UTC 2016
Chris Wilson <chris at chris-wilson.co.uk> writes:
> On Sat, Jan 30, 2016 at 11:17:07AM +0000, Chris Wilson wrote:
>> On Fri, Jan 29, 2016 at 07:19:31PM +0000, Dave Gordon wrote:
>> > From: Nick Hoath <nicholas.hoath at intel.com>
>> >
>> > Swap the order of context & engine cleanup, so that contexts are cleaned
>> > up first, and *then* engines. This is a more sensible order anyway, but
>> > in particular has become necessary since the 'intel_ring_initialized()
>> > must be simple and inline' patch, which now uses ring->dev as an
>> > 'initialised' flag, so it can now be NULL after engine teardown. This in
>> > turn can cause a problem in the context code, which (used to) check the
>> > ring->dev->struct_mutex -- causing a fault if ring->dev was NULL.
>> >
>> > Also rename the cleanup function to reflect what it actually does
>> > (cleanup engines, not a ringbuffer), and fix an annoying whitespace
>> > issue.
>> >
>> > v2: Also make the fix in i915_load_modeset_init, not just in
>> > i915_driver_unload (Chris Wilson)
>> > v3: Had extra stuff in it.
>> > v4: Reverted extra stuff (so we're back to v2).
>> > Rebased and updated commentary above (Dave Gordon).
>> >
>> > Signed-off-by: Nick Hoath <nicholas.hoath at intel.com>
>> > Signed-off-by: David Gordon <david.s.gordon at intel.com>
>> > Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
>>
>> Have to drop that, with the recent context changes.
>>
>> You have to move the gpu-reset now for execlists.
>>
>> Basically pull it out into:
>>
>> static void i915_unload_gem(struct drm_device *dev)
>> {
>> /*
>> * Neither the BIOS, ourselves or any other kernel
>> * expects the system to be in execlists mode on startup,
>> * so we need to reset the GPU back to legacy mode. And the only
>> * known way to disable logical contexts is through a GPU reset.
>> *
>> * So in order to leave the system in a known default configration,
>> * always reset the GPU upon unload. This also cleans up the GEM
>> * state tracking, flushing off the requests and leaving the system
>> * idle.
>> *
>> * Note that is of the upmost importance that the GPU is idle and
>> * all stray writes are flushed *before* we dismantle the backing
>> * storage for the pinned objects.
>> */
>> i915_reset(dev);
>>
>> mutex_lock(&dev->struct_mutex);
>> i915_gem_context_fini(dev);
>> i915_gem_cleanup_ringbuffer(dev);
>> mutex_unlock(&dev->struct_mutex);
>> }
>>
>> and then kill the intel_gpu_reset along both the cleanup pathsh
>
> It appears this patch was applied without dropping my r-b for the issue
> I pointed out above.
Now causes a splat in intel_logical_ring_cleanup when unloading module.
Best to revert and rework on top of Dave's cleanup set?
-Mika
[ 58.170369] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 58.170374] IP: [<ffffffffa00e04d3>]
intel_logical_ring_cleanup+0x83/0x100 [i915]
[ 58.170389] PGD 0
[ 58.170391] Oops: 0000 [#1] PREEMPT SMP
[ 58.170394] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915(-)
mei_me mei i2c_hid e1000e ptp pps_core
[ 58.170404] CPU: 0 PID: 5766 Comm: rmmod Not tainted 4.5.0-rc3+ #43
[ 58.170407] Hardware name: System manufacturer System Product
Name/Z170M-PLUS, BIOS 0505 11/16/2015
[ 58.170410] task: ffff880036aab380 ti: ffff8802374c0000 task.ti:
ffff8802374c0000
[ 58.170412] RIP: 0010:[<ffffffffa00e04d3>] [<ffffffffa00e04d3>]
intel_logical_ring_cleanup+0x83/0x100 [i915]
[ 58.170424] RSP: 0018:ffff8802374c3d30 EFLAGS: 00010282
[ 58.170425] RAX: 0000000000000000 RBX: ffff880237203788 RCX:
0000000000000001
[ 58.170428] RDX: 0000000087654321 RSI: 000000000000000d RDI:
ffff8802372037c0
[ 58.170430] RBP: ffff8802374c3d40 R08: 0000000000000000 R09:
0000000ad856c000
[ 58.170432] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff880237200000
[ 58.170434] R13: ffff880237209638 R14: ffff880237323000 R15:
0000558a770bd090
[ 58.170438] FS: 00007f8b439ff700(0000) GS:ffff880239800000(0000)
knlGS:0000000000000000
[ 58.170442] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 58.170444] CR2: 0000000000000000 CR3: 000000023532d000 CR4:
00000000003406f0
[ 58.170447] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 58.170450] DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7:
0000000000000400
[ 58.170453] Stack:
[ 58.170455] ffff880237203788 ffff880237200000 ffff8802374c3d70
ffffffffa00d0ed4
[ 58.170460] ffff880237200000 ffff880237323000 ffff880237323060
ffffffffa0194360
[ 58.170465] ffff8802374c3d98 ffffffffa0154520 ffff880237323000
ffff880237323000
[ 58.170469] Call Trace:
[ 58.170479] [<ffffffffa00d0ed4>] i915_gem_cleanup_engines+0x34/0x60
[i915]
[ 58.170493] [<ffffffffa0154520>] i915_driver_unload+0x140/0x220
[i915]
[ 58.170497] [<ffffffff8154a4f4>] drm_dev_unregister+0x24/0xa0
[ 58.170501] [<ffffffff8154aace>] drm_put_dev+0x1e/0x60
[ 58.170506] [<ffffffffa00912a0>] i915_pci_remove+0x10/0x20 [i915]
[ 58.170510] [<ffffffff814766e4>] pci_device_remove+0x34/0xb0
[ 58.170514] [<ffffffff8156e7d5>] __device_release_driver+0x95/0x140
[ 58.170518] [<ffffffff8156e97c>] driver_detach+0xbc/0xc0
[ 58.170521] [<ffffffff8156d883>] bus_remove_driver+0x53/0xd0
[ 58.170525] [<ffffffff8156f3a7>] driver_unregister+0x27/0x50
[ 58.170528] [<ffffffff81475725>] pci_unregister_driver+0x25/0x70
[ 58.170531] [<ffffffff8154c274>] drm_pci_exit+0x74/0x90
[ 58.170543] [<ffffffffa0154cb0>] i915_exit+0x20/0x1aa [i915]
[ 58.170548] [<ffffffff8111846f>] SyS_delete_module+0x18f/0x1f0
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list