[i-g-t] lib/xe: set hwconfig NULL for unsupported platforms

Matt Roper matthew.d.roper at intel.com
Mon Dec 9 20:38:22 UTC 2024


On Mon, Dec 09, 2024 at 11:20:07AM -0800, John Harrison wrote:
> On 12/9/2024 00:57, Tejas Upadhyay wrote:
> > There are hardware platforms which are not supporting
> > hwconfig table, for example ADLS. Querying hwconfig
> > on unsupported platforms leads to assert and failure
> > for some of tests like,
> > ./build/tests/xe_module_load --r load
> The module reload test should not be querying the hwconfig table?!

See the Fixes: line; that patch made it so that anything calling
xe_device_get() will fetch and cache the hwconfig (and assert if it's
missing).  xe_module load calls __drm_open_driver ->
__drm_open_driver_another -> xe_device_get so it does indeed fetch the
hwconfig table now and hit the assertion added by that patch.

I don't know if always grabbing and caching the hwconfig was actually a
good idea or not.  As we've seen here, the hwconfig may legitimately be
NULL, so as we remove the assertion here, we still need to do
NULL-checks at the point the hwconfig gets used to truly be safe.

> 
> But the hwconfig test itself should be failing on platforms which do not
> have a table. That is intentional. There is no platform which is POR for the
> Xe driver which does not support hwconfig tables. So the test is supposed to
> fail if it ever does not get a valid table.

The Xe driver is only officially supported on the Xe2 platforms and
beyond, but it can still load and run on some pre-Xe2 platforms that
people were using as development platforms; some of those don't support
hwconfig.  And even though our CI setup is only trying to test a subset
of those platforms that do contain hwconfig, note that our DG2 setups
also have an ADL-S igpu that also gets probed during module load, and
that's what leads to the CI failures that were reported here.


Maybe XeKMD is mature enough now (and we have enough real platforms for
testing now) that we could prune the list of platforms that Xe is
willing to load on with unofficial support.  E.g., drop the unofficial
support for TGL, RKL, and ADL-S since those platforms don't have
hwconfig and I don't think any of our developers are still using them as
development vehicles at this point.  That's probably a separate
discussion from the patch at hand though.


Matt

> 
> John.
> 
> > 
> > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
> > 
> > Cc: Matt Roper <matthew.d.roper at intel.com>
> > Cc: Jan Maslak <jan.maslak at intel.com>
> > Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> > ---
> >   lib/xe/xe_query.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> > index f3731d9d3..8f0a5392c 100644
> > --- a/lib/xe/xe_query.c
> > +++ b/lib/xe/xe_query.c
> > @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
> >   	/* Perform the initial query to get the size */
> >   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> > -	igt_assert_neq(query.size, 0);
> > +	if (!query.size)
> > +		return NULL;
> >   	hwconfig = malloc(query.size);
> >   	igt_assert(hwconfig);
> > @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
> >   	igt_assert(xe_dev);
> >   	hwconfig = xe_dev->hwconfig;
> > -	igt_assert(hwconfig);
> > +	if (!hwconfig)
> > +		return NULL;
> >   	/* Extract the value from the hwconfig */
> >   	pos = 0;
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


More information about the igt-dev mailing list