[Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
Bloomfield, Jon
jon.bloomfield at intel.com
Wed Dec 6 17:01:18 UTC 2017
> -----Original Message-----
> From: Chris Wilson [mailto:chris at chris-wilson.co.uk]
> Sent: Wednesday, December 6, 2017 7:38 AM
> To: intel-gfx at lists.freedesktop.org
> Cc: Chris Wilson <chris at chris-wilson.co.uk>; Bloomfield, Jon
> <jon.bloomfield at intel.com>; Harrison, John C <john.c.harrison at intel.com>;
> Ursulin, Tvrtko <tvrtko.ursulin at intel.com>; Joonas Lahtinen
> <joonas.lahtinen at linux.intel.com>; Daniel Vetter <daniel.vetter at ffwll.ch>
> Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a
> and error capture
>
> Since capturing the error state requires fiddling around with the GGTT
> to read arbitrary buffers and is itself run under stop_machine(), it
> deadlocks the machine (effectively a hard hang) when run in conjunction
> with Broxton's VTd workaround to serialize GGTT access.
>
> v2: Store the ERR_PTR in first_error so that the error can be reported
> to the user via sysfs.
>
> Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
> Cc: John Harrison <john.C.Harrison at intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
It's a real shame to lose error capture on BXT. Can we wrap stop_machine to make it recursive ?
Something like...
static cpumask_t sm_mask;
struct sm_args {
cpu_stop_fn_t *fn;
void *data;
};
void do_recursive_stop(void *sm_arg_data)
{
struct sm_arg *args = sm_arg_data;
/* We're stopped - flag the fact to prevent recursion */
cpumask_set_cpu(smp_processor_id(), &sm_mask);
args->fn(args->data);
/* Re-enable recursion */
cpumask_clear_cpu(smp_processor_id(), &sm_mask);
}
void recursive_stop_machine(cpu_stop_fn_t fn, void *data)
{
if (cpumask_test_cpu(smp_processor_id(), &sm_mask)) {
/* We were already stopped, so can just call directly */
fn(data);
}
else {
/* Our CPU is not currently stopped */
struct sm_args *args = {fn, data};
stop_machine(do_recursive_stop, args, NULL);
}
}
More information about the Intel-gfx
mailing list