[PATCH i-g-t v2 4/4] tests/intel/xe_eudebug_online: Add read/write pagefault online tests
Gwan-gyeong Mun
gwan-gyeong.mun at intel.com
Fri Nov 22 14:33:42 UTC 2024
On 11/22/24 11:55 AM, Manszewski, Christoph wrote:
> Hi Gwan-gyeong,
>
> On 22.11.2024 09:21, Gwan-gyeong Mun wrote:
>>
>>
>> On 11/21/24 7:12 PM, Manszewski, Christoph wrote:
>>> Hi Gwan-gyeong,
>>>
>>> On 21.11.2024 13:22, Gwan-gyeong Mun wrote:
>>>> Add read and write pagefault tests to xe_eudebug_online that checks
>>>> if a
>>>> pagefault event is submitted by the KMD debugger when a pagefault
>>>> occurs.
>>>>
>>>> Test that read (load instruction) and write(store instruction)
>>>> attempt to
>>>> load or store access to unallocated memory, causing a pagefault.
>>>> Examine the address causing the page fault and the number of eu threads
>>>> causing the pagefault.
>>>>
>>>> v2: Refactor of output attention bits on pagefault event handling
>>>> (Andrzej)
>>>> remove / update redudant code (Andrzej, Christoph)
>>>> use igt_container_of() macro (Christoph)
>>>>
>>>> Co-developed-by: Jonathan Cavitt <jonathan.cavitt at intel.com>
>>>> Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun at intel.com>
>>>> ---
>>>> tests/intel/xe_eudebug_online.c | 178 ++++++++++++++++++++++++++++
>>>> +++-
>>>> 1 file changed, 173 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/
>>>> xe_eudebug_online.c
>>>> index 0ef0d8093..a70d18ee4 100644
>>>> --- a/tests/intel/xe_eudebug_online.c
>>>> +++ b/tests/intel/xe_eudebug_online.c
>>>> @@ -36,6 +36,8 @@
>>>> #define BB_IN_VRAM (1 << 11)
>>>> #define TARGET_IN_SRAM (1 << 12)
>>>> #define TARGET_IN_VRAM (1 << 13)
>>>> +#define SHADER_PAGEFAULT_READ (1 << 14)
>>>> +#define SHADER_PAGEFAULT_WRITE (1 << 15)
>>>> #define TRIGGER_UFENCE_SET_BREAKPOINT (1 << 24)
>>>> #define TRIGGER_RESUME_SINGLE_WALK (1 << 25)
>>>> #define TRIGGER_RESUME_PARALLEL_WALK (1 << 26)
>>>> @@ -45,6 +47,7 @@
>>>> #define TRIGGER_RESUME_DSS (1 << 30)
>>>> #define TRIGGER_RESUME_ONE (1 << 31)
>>>> +#define SHADER_PAGEFAULT (SHADER_PAGEFAULT_READ |
>>>> SHADER_PAGEFAULT_WRITE)
>>>> #define BB_REGION_BITMASK (BB_IN_SRAM | BB_IN_VRAM)
>>>> #define TARGET_REGION_BITMASK (TARGET_IN_SRAM | TARGET_IN_VRAM)
>>>> @@ -61,6 +64,8 @@
>>>> #define CACHING_VALUE(n) (CACHING_INIT_VALUE + (n))
>>>> #define SHADER_CANARY 0x01010101
>>>> +#define BAD_CANARY 0xf1f1f1f
>>>> +#define BAD_OFFSET (0x12345678ull << 12)
>>>> #define WALKER_X_DIM 4
>>>> #define WALKER_ALIGNMENT 16
>>>> @@ -120,7 +125,7 @@ static struct intel_buf *create_uc_buf(int fd,
>>>> int width, int height, uint64_t r
>>>> static int get_number_of_threads(uint64_t flags)
>>>> {
>>>> - if (flags & SHADER_MIN_THREADS)
>>>> + if (flags & (SHADER_MIN_THREADS | SHADER_PAGEFAULT))
>>>> return 16;
>>>> if (flags & (TRIGGER_RESUME_ONE | TRIGGER_RESUME_SINGLE_WALK |
>>>> @@ -179,6 +184,16 @@ static struct gpgpu_shader *get_shader(int fd,
>>>> const unsigned int flags)
>>>> gpgpu_shader__common_target_write_u32(shader, s_dim.y
>>>> + i, CACHING_VALUE(i));
>>>> gpgpu_shader__nop(shader);
>>>> gpgpu_shader__breakpoint(shader);
>>>> + } else if (flags & SHADER_PAGEFAULT) {
>>>> + if (flags & SHADER_PAGEFAULT_READ)
>>>> + gpgpu_shader__read_a64_dword(shader, BAD_OFFSET);
>>>> + else
>>>> + gpgpu_shader__write_a64_dword(shader, BAD_OFFSET,
>>>> BAD_CANARY);
>>>> +
>>>> + gpgpu_shader__label(shader, 0);
>>>> + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
>>>> + gpgpu_shader__jump_neq(shader, 0, w_dim.y, STEERING_END_LOOP);
>>>> + gpgpu_shader__write_dword(shader, SHADER_CANARY, 0);
>>>
>>> Now that I think about - do we need this to be a loop? Can't we just
>>> do the read/write instructions? This would simplify the code and I
>>> don't yet see why we need to loop within the shader. The SHADER_LOOP
>>> is used for interrupt-all because we want to interrupt the workload
>>> from the user/main igt thread. But here, similar to the basic-
>>> breakpoint test, we just submit a workload that will halt because of
>>> the hardware/kmd intervention.
>>>
>> the pagefault tests also need this concept.
>>
>> When a pagefault happened, KMD sets “Force Exception / Force External
>> Halt” in TD_CTL to cause the eu threads to enter SIP mode.
>> In the pagefault handling process of eudebug, kmd installs a null page
>> at the address where the pagefault happened and makes the halted eu
>> threads resume (make unhalt).
>>
>> It would be ideal if all unhalted eu threads immediately entered SIP
>> mode due to the FE/FEH settings, but it may not happen immediately.
>> Therefore, the purpose of using a loop is to ensure that the kernel
>> shader does not terminate until a pagefault event and attention event
>> occur by adding an additional instruction after the instruction that
>> causes the page fault.
>> Therefore, a loop is used to ensure that at least one eu thread must
>> enter SIP mode.
>
> Yeah if the count of processed instructions before the exception is not
> defined then indeed the loop has it's place here. But we still may
> reduce a little bit of code, see below.
>
>> The attention callback sets to exit this loop, so this code allows the
>> eu thread to terminate after the sip shader is processed.
>>
>> Br,
>> G.G.
>>>> }
>>>> gpgpu_shader__eot(shader);
>>>> @@ -217,6 +232,16 @@ static int count_set_bits(void *ptr, size_t size)
>>>> return count;
>>>> }
>>>> +static int eu_attentions_xor_count(const uint32_t *a, const
>>>> uint32_t *b, uint32_t size)
>>>> +{
>>>> + int count = 0;
>>>> +
>>>> + for (int i = 0; i < size / 4 ; i++)
>>>> + count += igt_hweight(a[i] ^ b[i]);
>>>> +
>>>> + return count;
>>>> +}
>>>> +
>>>> static int count_canaries_eq(uint32_t *ptr, struct dim_t w_dim,
>>>> uint32_t value)
>>>> {
>>>> int count = 0;
>>>> @@ -636,7 +661,7 @@ static void eu_attention_resume_trigger(struct
>>>> xe_eudebug_debugger *d,
>>>> }
>>>> }
>>>> - if (d->flags & SHADER_LOOP) {
>>>> + if (d->flags & (SHADER_LOOP | SHADER_PAGEFAULT)) {
>>>
>>> If we drop the loop we can drop also this.
>>>
>>>> uint32_t threads = get_number_of_threads(d->flags);
>>>> uint32_t val = STEERING_END_LOOP;
>>>> @@ -746,6 +771,44 @@ static void
>>>> eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger *
>>>> data->single_step_bitmask[i] &= ~att->bitmask[i];
>>>> }
>>>> +static void eu_attention_resume_pagefault_trigger(struct
>>>> xe_eudebug_debugger *d,
>>>> + struct drm_xe_eudebug_event *e)
>>>> +{
>>>> + struct drm_xe_eudebug_event_eu_attention *att =
>>>> igt_container_of(e, att, base);
>>>> + struct online_debug_data *data = d->ptr;
>>>> + uint32_t bitmask_size = att->bitmask_size;
>>>> + uint8_t *bitmask;
>>>> +
>>>> + if (data->last_eu_control_seqno > att->base.seqno)
>>>> + return;
>>>> +
>>>> + bitmask = calloc(1, att->bitmask_size);
>>>> + igt_assert(bitmask);
>>>> +
>>>> + eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
>>>> + att->lrc_handle, bitmask, &bitmask_size);
>>>> + igt_assert(bitmask_size == att->bitmask_size);
>>>> +
>>>> + pthread_mutex_lock(&data->mutex);
>>>> +
>>>> + if (d->flags & SHADER_PAGEFAULT) {
>>>> + uint32_t threads = get_number_of_threads(d->flags);
>>>> + uint32_t val = STEERING_END_LOOP;
>>>> +
>>>> + igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t),
>>>> + data->target_offset + steering_offset(threads)),
>>>> + sizeof(uint32_t));
>>>> + fsync(data->vm_fd);
>>>> + }
>>>
>>> We can also drop this when we remove the loop. Btw. why can't we just
>>> use 'eu_attention_resume_trigger' instead of this whole function?
>
> We could remove the 'eu_attention_resume_trigger' like so:
>
> ```
> diff --git a/tests/intel/xe_eudebug_online.c b/tests/intel/
> xe_eudebug_online.c
> index a70d18ee4..c077795ee 100644
> --- a/tests/intel/xe_eudebug_online.c
> +++ b/tests/intel/xe_eudebug_online.c
> @@ -622,7 +622,10 @@ static void eu_attention_resume_trigger(struct
> xe_eudebug_debugger *d,
> eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
> att->lrc_handle, bitmask, &bitmask_size);
> igt_assert(bitmask_size == att->bitmask_size);
> - igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0);
> +
> + /* No guarantee that all pagefaulting eu threads will raise
> attention */
> + if (!(d->flags & SHADER_PAGEFAULT))
> + igt_assert(memcmp(bitmask, att->bitmask, att->bitmask_size) == 0);
>
> pthread_mutex_lock(&data->mutex);
> if (igt_nsec_elapsed(&data->exception_arrived) <
> (MAX_PREEMPT_TIMEOUT + 1) * NSEC_PER_SEC &&
> @@ -771,44 +774,6 @@ static void
> eu_attention_resume_single_step_trigger(struct xe_eudebug_debugger *
> data->single_step_bitmask[i] &= ~att->bitmask[i];
> }
>
> -static void eu_attention_resume_pagefault_trigger(struct
> xe_eudebug_debugger *d,
> - struct drm_xe_eudebug_event *e)
> -{
> - struct drm_xe_eudebug_event_eu_attention *att = igt_container_of(e,
> att, base);
> - struct online_debug_data *data = d->ptr;
> - uint32_t bitmask_size = att->bitmask_size;
> - uint8_t *bitmask;
> -
> - if (data->last_eu_control_seqno > att->base.seqno)
> - return;
> -
> - bitmask = calloc(1, att->bitmask_size);
> - igt_assert(bitmask);
> -
> - eu_ctl_stopped(d->fd, att->client_handle, att->exec_queue_handle,
> - att->lrc_handle, bitmask, &bitmask_size);
> - igt_assert(bitmask_size == att->bitmask_size);
> -
> - pthread_mutex_lock(&data->mutex);
> -
> - if (d->flags & SHADER_PAGEFAULT) {
> - uint32_t threads = get_number_of_threads(d->flags);
> - uint32_t val = STEERING_END_LOOP;
> -
> - igt_assert_eq(pwrite(data->vm_fd, &val, sizeof(uint32_t),
> - data->target_offset + steering_offset(threads)),
> - sizeof(uint32_t));
> - fsync(data->vm_fd);
> - }
> - pthread_mutex_unlock(&data->mutex);
> -
> - data->last_eu_control_seqno = eu_ctl_resume(d->master_fd, d->fd,
> att->client_handle,
> - att->exec_queue_handle, att->lrc_handle,
> - bitmask, att->bitmask_size);
> -
> - free(bitmask);
> -}
> -
> static void open_trigger(struct xe_eudebug_debugger *d,
> struct drm_xe_eudebug_event *e)
> {
> @@ -1530,7 +1495,7 @@ static void test_pagefault_online(int fd, struct
> drm_xe_engine_class_instance *h
> xe_eudebug_debugger_add_trigger(s->debugger,
> DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> eu_attention_debug_trigger);
> xe_eudebug_debugger_add_trigger(s->debugger,
> DRM_XE_EUDEBUG_EVENT_EU_ATTENTION,
> - eu_attention_resume_pagefault_trigger);
> + eu_attention_resume_trigger);
> xe_eudebug_debugger_add_trigger(s->debugger,
> DRM_XE_EUDEBUG_EVENT_VM, vm_open_trigger);
> xe_eudebug_debugger_add_trigger(s->debugger,
> DRM_XE_EUDEBUG_EVENT_METADATA,
> create_metadata_trigger);
> ```
>
> Does this look reasonable? I know it adds yet another path to
> 'eu_attention_resume_trigger' but you partially account for the
> pagefault shader in your current code anyway.
>
Yes, right, I added a separate callback (for readability) because I
didn't need the non-pagefault scenario checking routine of the attention
callback.
In order to keep the code additions in this patch as small as possible,
I'll integrate to handle pagefault scenario in
eu_attention_resume_trigger() callback, as you suggested.
Thanks,
G.G.
> Thanks,
> Christoph
>
More information about the igt-dev
mailing list