[Intel-gfx] [PATCH i-g-t v2] tests/drv_hangman: test for acthd increasing through invalid VM space
Daniele Ceraolo Spurio
daniele.ceraolospurio at intel.com
Thu Feb 25 12:04:06 UTC 2016
On 25/02/16 11:32, Chris Wilson wrote:
> On Thu, Feb 25, 2016 at 11:12:06AM +0000, Daniele Ceraolo Spurio wrote:
>> On 25/02/16 10:41, Chris Wilson wrote:
>>> On Thu, Feb 25, 2016 at 10:32:11AM +0000, daniele.ceraolospurio at intel.com wrote:
>>>> +/* This test covers the case where we end up in an uninitialised area of the
>>>> + * ppgtt at an offset greater than the one where the last buffer is mapped. This
>>>> + * is particularly relevant if 48b ppgtt is enabled because the ppgtt is
>>>> + * massively bigger compared to the 32b case and it takes a lot more time to
>>>> + * wrap, so the acthd can potentially keep increasing for a long time
>>>> + */
>>>> +#define NSEC_PER_SEC 1000000000L
>>>> +static void ppgtt_walking(void)
>>>> + int fd;
>>>> + int64_t timeout_ns = 100 * NSEC_PER_SEC; /* 100 seconds */
>>> This needs a note that this has to be greater than ~5*hangcheck.
>>>> + struct drm_i915_gem_execbuffer2 execbuf;
>>>> + struct drm_i915_gem_exec_object2 gem_exec;
>>>> + uint32_t handle;
>>>> + uint32_t batch;
>>>> + fd = drm_open_driver(DRIVER_INTEL);
>>>> + igt_require(gem_gtt_type(fd) > 2);
>>> Nope, just full-ppgtt is required (and provides a sensible hangcheck
>>> test if !48bit as well).
>>> Note this does require that the hangcheck is enabled, so igt_require().
>>>> + /* the batch will be mapped to an offset < 4GB because the flag to allow
>>>> + * 48b offsets is not specified, so jump to address 0x00000001 00000000
>>>> + */
>>>> + batch = MI_BATCH_BUFFER_START | 1;
>>>> + batch = 0;
>>>> + batch = 1;
>>>> + batch = MI_BATCH_BUFFER_END;
>>> The vm is entirely empty. Just submit an unterminated (empty) batch, and
>>> it will walk from 0 to 1<<48bit and around and around and around and
>> I chose to jump instead of just leaving the batch unterminated to
>> cover the (rare) case where the rest of the allocated 4k of the
>> batch contain some random values, which could cause a hang and thus
>> falsely pass the test.
> That would be a huge kernel bug. Freshly allocated buffers have to be
> zero to avoid information leaks. I hope you are confusing allocating
> from the userspace buffer cache with a fresh kernel allocation...
Apologies for the confusion. You're correct: I was thinking about it at
the libdrm level and not at the bare kernel level.
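
For reference, the jump described in the patch comment can be sketched in
plain C as below. This is only an illustration, not the code from the patch:
the opcode values are written out locally on the assumption that they match
IGT's intel_reg.h, build_jump_batch() is a hypothetical helper, and the
three-dword layout (command, address low, address high) assumes the gen8+
form of MI_BATCH_BUFFER_START implied by the "| 1" length field.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the definitions in IGT's intel_reg.h. */
#define MI_BATCH_BUFFER_START (0x31u << 23)
#define MI_BATCH_BUFFER_END   (0x0Au << 23)

/* Hypothetical helper: fill a 4-dword batch that jumps to `target`. */
static void build_jump_batch(uint32_t batch[4], uint64_t target)
{
	/* The target must fit in the 48b ppgtt address space. */
	assert(target < (1ULL << 48));

	batch[0] = MI_BATCH_BUFFER_START | 1;         /* two address dwords follow */
	batch[1] = (uint32_t)(target & 0xffffffffu);  /* address bits 31:0 */
	batch[2] = (uint32_t)(target >> 32);          /* address bits 47:32 */
	batch[3] = MI_BATCH_BUFFER_END;               /* not reached if the jump is taken */
}
```

Calling build_jump_batch(batch, 1ULL << 32) gives address dwords 0 and 1,
i.e. the 0x00000001 00000000 target from the comment. As noted in the review,
an empty unterminated batch in a fresh VM would exercise the same path
without the jump.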