[igt-dev] [PATCH i-g-t] tests/i915/perf: verify reason field in OA reports is never 0

Lionel Landwerlin lionel.g.landwerlin at intel.com
Wed Nov 25 10:21:02 UTC 2020


On 25/11/2020 02:41, Umesh Nerlige Ramappa wrote:
> On Thu, Nov 19, 2020 at 12:46:30PM +0200, Lionel Landwerlin wrote:
>> We're about to remove the filtering in i915 on 0 reason fields because
>> we assume this was a possibility, but it turned out to be a corruption
>> in the tail pointer register that made us read cleared data.
>>
>> This test is here to verify that our assumption hold that the HW never
>> produces such reports.
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin at intel.com>
>> ---
>> tests/i915/perf.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 77 insertions(+)
>>
>> diff --git a/tests/i915/perf.c b/tests/i915/perf.c
>> index caeabd623..08c81c5c2 100644
>> --- a/tests/i915/perf.c
>> +++ b/tests/i915/perf.c
>> @@ -2539,6 +2539,76 @@ test_buffer_fill(void)
>>     __perf_close(stream_fd);
>> }
>
> I think this logic of reading all the reports and doing a sanity check 
> could be reused for some other tests. That could help us refactor the 
> perf OA test code. I queues something similar to trybot.
> https://patchwork.freedesktop.org/series/84231/
>
> If you feel we can go ahead with that ^, I will post it here. If not, 
> here are the comments.


Fair point :)


>
>>
>> +static void
>> +test_non_zero_reason(void)
>> +{
>> +    /* ~10 micro second period */
>> +    int oa_exponent = max_oa_exponent_for_period_lte(10000);
>> +    uint64_t properties[] = {
>> +        /* Include OA reports in samples */
>> +        DRM_I915_PERF_PROP_SAMPLE_OA, true,
>> +
>> +        /* OA unit configuration */
>> +        DRM_I915_PERF_PROP_OA_METRICS_SET, 
>> test_set->perf_oa_metrics_set,
>> +        DRM_I915_PERF_PROP_OA_FORMAT, test_set->perf_oa_format,
>> +        DRM_I915_PERF_PROP_OA_EXPONENT, oa_exponent,
>> +    };
>> +    struct drm_i915_perf_open_param param = {
>> +        .flags = I915_PERF_FLAG_FD_CLOEXEC,
>> +        .num_properties = sizeof(properties) / 16,
>> +        .properties_ptr = to_user_pointer(properties),
>> +    };
>> +    struct drm_i915_perf_record_header *header;
>> +    uint32_t buf_size = 5 * 65536 * (256 + sizeof(struct 
>> drm_i915_perf_record_header));
>
> 3 times instead of 5 could be good enough. Thoughts?


I mostly looked at time budget


>
>> +    uint8_t *buf = malloc(buf_size);
>> +    uint32_t total_len = 0;
>> +    int len;
>
> igt_assert(buf);
>
> just in case we didn't get the required memory.


Done.


>
>> +
>> +    igt_debug("Ready to read about %u bytes\n", buf_size);
>> +
>> +    load_helper_init();
>> +    load_helper_run(HIGH);
>> +
>> +    stream_fd = __perf_open(drm_fd, &param, true /* prevent_pm */);
>> +
>> +    while (total_len < (buf_size - sizeof(struct 
>> drm_i915_perf_record_header)) &&
>> +           ((len = read(stream_fd, &buf[total_len], buf_size - 
>> total_len)) > 0 ||
>> +        (len -1 && errno == EINTR))) {
>
> Did you mean?
>
> (len == -1 && errno == EINTR)


Oh dear... Thanks!


>
>> +        if (len > 0)
>> +            total_len += len;
>> +    }
>> +
>> +    __perf_close(stream_fd);
>> +
>> +    load_helper_stop();
>> +    load_helper_fini();
>> +
>> +    igt_debug("Got %u bytes\n", total_len);
>> +
>> +    for (uint32_t offset = 0; offset < total_len; offset += 
>> header->size) {
>> +        header = (void *) (buf + offset);
>> +
>> +        switch (header->type) {
>> +        case DRM_I915_PERF_RECORD_OA_REPORT_LOST:
> Based on some trial runs from my test, I see 40 us or 100 us 
> oa_exponent would likely result in no reports lost. On the other hand, 
> setting the HEAD/TAIL wrap bits in OASTATUS likely happens only when 
> REPORT LOST occurs. Should we check that at least one report was lost 
> here?
>
>             reports_lost++;


I see a lot of report lost actually, even with ~1ms sampling.

As far as I can tell, we really can't predict when this is going to 
happen so I rather not rely on this.


>
>> +            break;
>> +        case DRM_I915_PERF_RECORD_SAMPLE: {
>> +            const uint32_t *report = (void *) (header + 1);
>> +            uint32_t reason = (report[0] >> OAREPORT_REASON_SHIFT) &
>> +                OAREPORT_REASON_MASK;
>> +
>> +            igt_assert_neq(reason, 0);
>
> why not sanity_check_reports?


Hey, this is raising an load of interesting question... Lots of failures...


>
> Thanks,
> Umesh
>
>> +            break;
>> +        }
>> +        case DRM_I915_PERF_RECORD_OA_BUFFER_LOST:
>> +            igt_assert(!"unexpected overflow");
>> +            break;
>> +        }
>> +    }
>> +
>> +
>> +    free(buf);
>> +}
>> +
>> static void
>> test_enable_disable(void)
>> {
>> @@ -4882,6 +4952,13 @@ igt_main
>>     igt_subtest("buffer-fill")
>>         test_buffer_fill();
>>
>> +    igt_describe("Test that reason field in OA reports is never 0 on 
>> Gen8+");
>> +    igt_subtest("non-zero-reason") {
>> +        /* Reason field is only available on Gen8+ */
>> +        igt_require(intel_gen(devid) >= 8);
>> +        test_non_zero_reason();
>> +    }
>> +
>>     igt_subtest("disabled-read-error")
>>         test_disabled_read_error();
>>     igt_subtest("non-sampling-read-error")
>> -- 
>> 2.29.2
>>



More information about the igt-dev mailing list