[Intel-xe] [PATCH v7 1/2] drm/xe/buddy: add visible tracking

Tue Mar 21 11:15:52 UTC 2023

On Tue, 21 Mar 2023, Matthew Auld <matthew.auld at intel.com> wrote:
> +	XE_BUG_ON(min_page_size < mm->chunk_size);
> +	XE_BUG_ON(min_page_size > SZ_2G); /* FIXME: sg limit */
> +	XE_BUG_ON(size > SZ_2G &&
> +		  (vres->base.placement & TTM_PL_FLAG_CONTIGUOUS));
> +	XE_BUG_ON(!IS_ALIGNED(size, min_page_size));

Hijacking the thread, sorry. ;)

If we are supposed to not replicate i915-ism, why do we have
XE_BUG_ON()?

I think it's an anti-pattern.

Currently XE_BUG_ON() is just BUG_ON(), which should not be used for any
reason. Not even CI. The plan in i915 is to migrate all of them to
WARN_ON()'s and use panic_on_warn in CI.

The other thing is that having BUG_ON() in the name gives you the false
impression that it's sufficient for input (or any precondition)
validation. In i915 it gets conditionally built for CI only. So we catch
errors in CI but not in actual production use. You just plunge on.

Part of it is just naming. The BUG_ON() association is pretty strong for
any kernel developers. If it were called XE_ASSERT() it would have an
association with assert() that does what XE_BUG_ON() actually is: code
is only generated for debug builds.

In i915 it has become habitual to sprinkle GEM_BUG_ON() all over the
place, just in case, belt and suspenders. IMO there's not enough
discretion about what can and can't actually happen, and what can and
can't happen in CI vs. production.

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center