[Intel-gfx] [PATCH] drm/i915: Use new i915_gem_object_pin_map for LRC

Chris Wilson chris at chris-wilson.co.uk
Tue Apr 12 09:42:10 UTC 2016


On Tue, Apr 12, 2016 at 10:36:39AM +0100, Tvrtko Ursulin wrote:
> 
> On 12/04/16 10:30, Chris Wilson wrote:
> > On Tue, Apr 12, 2016 at 09:52:28AM +0100, Tvrtko Ursulin wrote:
> >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >> We can use the new pin/lazy unpin API for simplicity
> >> and more performance in the execlist submission paths.
> >>
> >> v2:
> >>    * Fix error handling and convert more users.
> >>    * Compact some names for readability.
> >>
> >> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> > 
> > Reviewed-by: Chris Wilson <chris at chris-wilson.co.uk>
> > 
> > Hmm, the stress tests that would exercise this are already currently
> > fail (thinking of gem_ctx_create etc). But this should not excerbate it
> > much!
> 
> Something is broken here as reported by the CI:
> 
> [  329.004489] ------------[ cut here ]------------
> [  329.004508] WARNING: CPU: 1 PID: 7049 at drivers/gpu/drm/i915/i915_gem.c:4641 i915_gem_free_object+0x3bd/0x430 [i915]
> [  329.004510] WARN_ON(obj->pages_pin_count)
> [  329.004511] Modules linked in:
> [  329.004512]  snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic i915(-) snd_hda_codec snd_hwdep snd_hda_core mei_me lpc_ich snd_pcm mei i2c_hid i2c_designware_platform i2c_designware_core e1000e ptp pps_core sdhci_acpi sdhci mmc_core [last unloaded: snd_hda_intel]
> [  329.004527] CPU: 1 PID: 7049 Comm: rmmod Tainted: G     U          4.6.0-rc3-gfxbench+ #1
> [  329.004528] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0249.2015.0529.1640 05/29/2015
> [  329.004529]  0000000000000000 ffff880212f7bbe0 ffffffff81405fa5 ffff880212f7bc30
> [  329.004531]  0000000000000000 ffff880212f7bc20 ffffffff81079c7c 0000122112f7bc28
> [  329.004533]  ffff880211eaad90 ffff880211eaad40 ffff880211eaae30 ffff8800ceb70000
> [  329.004535] Call Trace:
> [  329.004539]  [<ffffffff81405fa5>] dump_stack+0x67/0x92
> [  329.004541]  [<ffffffff81079c7c>] __warn+0xcc/0xf0
> [  329.004542]  [<ffffffff81079cea>] warn_slowpath_fmt+0x4a/0x50
> [  329.004555]  [<ffffffffa01f7ccd>] i915_gem_free_object+0x3bd/0x430 [i915]
> [  329.004558]  [<ffffffff8151cffb>] drm_gem_object_free+0x2b/0x50
> [  329.004570]  [<ffffffffa0208dd4>] intel_lr_context_free+0x74/0xd0 [i915]
> [  329.004580]  [<ffffffffa01e25b3>] i915_gem_context_free+0x1a3/0x270 [i915]
> [  329.004589]  [<ffffffffa01e2e6d>] i915_gem_context_fini+0x9d/0xd0 [i915]
> [  329.004603]  [<ffffffffa0280bb9>] i915_driver_unload+0x119/0x1d0 [i915]
> [  329.004605]  [<ffffffff81522524>] drm_dev_unregister+0x24/0xa0
> [  329.004606]  [<ffffffff81522afe>] drm_put_dev+0x1e/0x60
> [  329.004614]  [<ffffffffa01b82c0>] i915_pci_remove+0x10/0x20 [i915]
> [  329.004616]  [<ffffffff8144efb4>] pci_device_remove+0x34/0xb0
> [  329.004618]  [<ffffffff815467c5>] __device_release_driver+0x95/0x140
> [  329.004619]  [<ffffffff81546966>] driver_detach+0xb6/0xc0
> [  329.004620]  [<ffffffff815457c3>] bus_remove_driver+0x53/0xd0
> [  329.004622]  [<ffffffff81547447>] driver_unregister+0x27/0x50
> [  329.004623]  [<ffffffff8144e015>] pci_unregister_driver+0x25/0x70
> [  329.004625]  [<ffffffff81524214>] drm_pci_exit+0x74/0x90
> [  329.004638]  [<ffffffffa02812b5>] i915_exit+0x20/0x1a0 [i915]
> [  329.004640]  [<ffffffff8110a47f>] SyS_delete_module+0x18f/0x1f0
> [  329.004642]  [<ffffffff817d23a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
> [  329.004643] ---[ end trace f0bf445e9d9a7dbe ]---
> [  329.004718] ------------[ cut here ]------------
> 
> It even happened in my local BAT run but I did not spot it due excessive eDP triggered WARNs. :(

I guess it is in the default context.
 
> Will look into it as soon as I stash the current work.
>  
> >> -	lrc_state_page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
> >> -	if (WARN_ON(!lrc_state_page)) {
> >> -		ret = -ENODEV;
> >> +	obj_addr = i915_gem_object_pin_map(ctx_obj);
> > 
> > Oops, there's a bug in pin_map in that we don't mark the object as
> > dirty. Can you send a quick obj->dirty = true patch?
> 
> I can if you are sure we want this. Callers might only want to read so I am not sure.

I agree it is slightly presumptuous, but not marking the objects as dirty
is a potential source of data loss, marking them as dirty even for pure
reads is just a missed optimisation.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list