[Intel-gfx] [PATCH 05/50] drm/i915: Introduce struct intel_wakeref
Chris Wilson
chris at chris-wilson.co.uk
Fri Apr 12 13:12:26 UTC 2019
Quoting Tvrtko Ursulin (2019-04-12 13:59:30)
>
> On 12/04/2019 09:53, Chris Wilson wrote:
> > For controlling runtime pm of the GT and engines, we would like to have
> > a callback to do extra work the first time we wake up and the last time
> > we drop the wakeref. This first/last access needs serialisation and so
> > we encompass a mutex with the regular intel_wakeref_t tracker.
> >
> > v2: Drop the _once naming and report the errors.
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc; Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > ---
> > drivers/gpu/drm/i915/Makefile | 1 +
> > drivers/gpu/drm/i915/Makefile.header-test | 3 +-
> > drivers/gpu/drm/i915/i915_drv.h | 3 +-
> > drivers/gpu/drm/i915/intel_wakeref.c | 62 ++++++++++
> > drivers/gpu/drm/i915/intel_wakeref.h | 133 ++++++++++++++++++++++
> > 5 files changed, 199 insertions(+), 3 deletions(-)
> > create mode 100644 drivers/gpu/drm/i915/intel_wakeref.c
> > create mode 100644 drivers/gpu/drm/i915/intel_wakeref.h
> >
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index 40130cf5c003..233bad5e361f 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -50,6 +50,7 @@ i915-y += i915_drv.o \
> > intel_device_info.o \
> > intel_pm.o \
> > intel_runtime_pm.o \
> > + intel_wakeref.o \
> > intel_uncore.o
> >
> > # core library code
> > diff --git a/drivers/gpu/drm/i915/Makefile.header-test b/drivers/gpu/drm/i915/Makefile.header-test
> > index 96a5d90629ec..e6b3e7588860 100644
> > --- a/drivers/gpu/drm/i915/Makefile.header-test
> > +++ b/drivers/gpu/drm/i915/Makefile.header-test
> > @@ -31,7 +31,8 @@ header_test := \
> > intel_psr.h \
> > intel_sdvo.h \
> > intel_sprite.h \
> > - intel_tv.h
> > + intel_tv.h \
> > + intel_wakeref.h
> >
> > quiet_cmd_header_test = HDRTEST $@
> > cmd_header_test = echo "\#include \"$(<F)\"" > $@
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index fad5306f07da..62a7e91acd7f 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -74,6 +74,7 @@
> > #include "intel_opregion.h"
> > #include "intel_uc.h"
> > #include "intel_uncore.h"
> > +#include "intel_wakeref.h"
> > #include "intel_wopcm.h"
> >
> > #include "i915_gem.h"
> > @@ -134,8 +135,6 @@ bool i915_error_injected(void);
> > __i915_printk(i915, i915_error_injected() ? KERN_DEBUG : KERN_ERR, \
> > fmt, ##__VA_ARGS__)
> >
> > -typedef depot_stack_handle_t intel_wakeref_t;
> > -
> > enum hpd_pin {
> > HPD_NONE = 0,
> > HPD_TV = HPD_NONE, /* TV is known to be unreliable */
> > diff --git a/drivers/gpu/drm/i915/intel_wakeref.c b/drivers/gpu/drm/i915/intel_wakeref.c
> > new file mode 100644
> > index 000000000000..f4cfaa154303
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/intel_wakeref.c
> > @@ -0,0 +1,62 @@
> > +/*
> > + * SPDX-License-Identifier: MIT
> > + *
> > + * Copyright © 2019 Intel Corporation
> > + */
> > +
> > +#include "intel_drv.h"
> > +#include "intel_wakeref.h"
> > +
> > +int __intel_wakeref_get_first(struct drm_i915_private *i915,
> > + struct intel_wakeref *wf,
> > + int (*fn)(struct intel_wakeref *wf))
> > +{
> > + /*
> > + * Treat get/put as different subclasses, as we may need to run
> > + * the put callback from under the shrinker and do not want to
> > + * cross-contanimate that callback with any extra work performed
> > + * upon acquiring the wakeref.
> > + */
>
> So you want to seralize the get/put callbacks but don't want lockdep to
> see any chains resulting from that. Sounds worrying.
>
> What if you moved them (callbacks) outside wf->mutex then and mandate
> the callback pairs to serialize themselves?
Doesn't work for the purpose -- we do want the global transition of
first/last use serialised with all other users, not the callbacks
themselves, and that requires the mutex across the entire 0<->1
transition. For example, we use the knowledge that there is a
global/engine mutex held to insert a GPU context switch on parking,
outside of a context where we would be allow to take any of the mutexes
required for doing so.
It's exactly the same problem we have with context pinning and
elsewhere, intel_context_pin() is expensive and cannot allow
allocations; intel_context_unpin() is cheap and may be called from the
shrinker.
> > + mutex_lock_nested(&wf->mutex, SINGLE_DEPTH_NESTING);
> > + if (!atomic_read(&wf->count)) {
> > + int err;
> > +
> > + wf->wakeref = intel_runtime_pm_get(i915);
> > +
> > + err = fn(wf);
> > + if (unlikely(err)) {
> > + intel_runtime_pm_put(i915, wf->wakeref);
> > + mutex_unlock(&wf->mutex);
> > + return err;
> > + }
> > +
> > + smp_mb__before_atomic(); /* release wf->count */
>
> These ones are like usually "What.." for me.
>
> According to the docs this line would suggest you want some other memory
> write to become visible before the atomic_inc below. Which one?
All the writes between atomic_read() and atomic_inc().
atomic_inc() broadcasts that you do not need the lock (or barrier) to
see the results, so therefore all the results must be visible beforehand.
We need to turn atomic_inc() into an atomic_inc_release/atomic_inc_unlock.
-Chris
More information about the Intel-gfx
mailing list