[PATCH i-g-t v2 2/2] tests/intel/xe_oa: Remove hardcoded time heuristics
Pottumuttu, Sai Teja
sai.teja.pottumuttu at intel.com
Mon Jan 13 06:41:47 UTC 2025
On 10-01-2025 03:58, Dixit, Ashutosh wrote:
> On Sun, 05 Jan 2025 19:46:38 -0800, Sai Teja Pottumuttu wrote:
>>
>
> Hi Sai Teja,
>
> Thanks for the patch, mostly looks good but I have some usual nits.
>
>> Some xe_oa tests rely on hardcoded timing heuristics. Refactor them to
>> be more robust and reliable. The patch extends the wait time logically,
>> but usually a single iteration is enough for the required reports to
>> become available, so in practice the wait time doesn't change much.
>>
>> v2:
>> - Extend commit message [Lucas]
>> - Make wait function more generic [Lucas]
>>
>> Signed-off-by: Sai Teja Pottumuttu <sai.teja.pottumuttu at intel.com>
>> ---
>> tests/intel/xe_oa.c | 30 +++++++++++++++++++++++++-----
>> 1 file changed, 25 insertions(+), 5 deletions(-)
>>
>> diff --git a/tests/intel/xe_oa.c b/tests/intel/xe_oa.c
>> index ad3526406..b271278d6 100644
>> --- a/tests/intel/xe_oa.c
>> +++ b/tests/intel/xe_oa.c
>> @@ -4367,6 +4367,29 @@ static void map_oa_buffer_forked_access(const struct drm_xe_engine_class_instanc
>> munmap(vaddr, size);
>> }
>>
>> +static void wait_for_periodic_reports(void *oa_vaddr,
>
> Let's change this name to 'mmap_wait_for_periodic_reports', to highlight that
> it only applies to the mmap'd OA buffer case.
Sure, made most of the suggested changes in v3. Some comments below.
>
>
>> + uint32_t n,
>
> optional nit: move this to the previous line; I generally try to optimize
> vertical real estate.
>
>> + const struct drm_xe_engine_class_instance *hwe)
>> +{
>> + uint32_t period_us = oa_exponent_to_ns(oa_exp_1_millisec) / 1000;
>> + struct intel_xe_perf_metric_set *test_set = metric_set(hwe);
>> + uint64_t fmt = test_set->perf_oa_format;
>> + struct oa_format format = get_oa_format(fmt);
>
> optional nit again, but delete the two temporary variables above, see
> below.
In v3, I deleted oa_format but kept fmt, as it's used in two places in
the loop.
>
>> + uint32_t num_periodic_reports = 0;
>> + uint32_t *reports;
>> +
>> + while (num_periodic_reports < n) {
>> + usleep(100 * period_us);
>
> I think this should be something like:
>
> usleep(2 * n * period_us);
>
> So the wait time should be a function of n (not constant like 100). Here
> I'm assuming if we wait for '2 * n' periods, we should probably have n
> periodic reports.
After some experiments, I settled on 4 * n, as that seems to be the
time needed to get n reports.
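
For reference, a minimal sketch of roughly what the v3 helper looks
like with the rename and the 4 * n wait folded in (the exact v3 code
may differ slightly):

static void mmap_wait_for_periodic_reports(void *oa_vaddr, uint32_t n,
                                           const struct drm_xe_engine_class_instance *hwe)
{
        uint32_t period_us = oa_exponent_to_ns(oa_exp_1_millisec) / 1000;
        struct intel_xe_perf_metric_set *test_set = metric_set(hwe);
        uint64_t fmt = test_set->perf_oa_format;
        uint32_t num_periodic_reports = 0;
        uint32_t *reports;

        while (num_periodic_reports < n) {
                /* ~4 periods per wanted report, determined experimentally */
                usleep(4 * n * period_us);
                num_periodic_reports = 0;
                /* Rescan the whole mapped OA buffer on each iteration */
                for (reports = (uint32_t *)oa_vaddr;
                     reports[0] && oa_timestamp(reports, fmt);
                     reports += get_oa_format(fmt).size)
                        if (oa_report_is_periodic(reports))
                                num_periodic_reports++;
        }
}

With oa_exp_1_millisec the period is ~1 ms, so for n = 20 the first
sleep is about 80 ms, comparable to the old fixed 100 * period_us wait.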
>
>> + num_periodic_reports = 0;
>> + for (reports = (uint32_t *)oa_vaddr;
>> + reports[0] && oa_timestamp(reports, fmt);
>> + reports += format.size) {
>> + if (oa_report_is_periodic(reports))
>> + num_periodic_reports++;
>> + }
>
> optional nit: I think this entire loop can just be:
>
> num_periodic_reports = 0;
> for (reports = (uint32_t *)oa_vaddr;
> reports[0] && oa_timestamp(reports, fmt) && oa_report_is_periodic(reports);
> reports += get_oa_format(test_set->perf_oa_format).size)
> num_periodic_reports++;
>
> Just a couple of general comments below, no need to change anything:
>
> * The loop is a little 'funky' in that it goes over the mapped OA buffer
> multiple times, even over previously found reports. I think that is ok
> for now.
>
> * The other thing is that OA periodic reports should be generated as long
> as DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT has been used in the OA
> properties. Both places this function is called from have that, so
> that should be ok.
>
> * If HW somehow doesn't generate periodic reports we'll hang here. But that
> is not expected, so leave as is and we'll deal with it later if we ever
> hit that.
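Right, both call sites do set DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT.
For context, the streams are opened with a properties array along
these lines (a sketch based on the existing test code; the struct and
helper names are from memory and may differ slightly):

        struct intel_xe_perf_metric_set *test_set = metric_set(hwe);
        uint64_t properties[] = {
                DRM_XE_OA_PROPERTY_OA_UNIT_ID, 0,
                DRM_XE_OA_PROPERTY_SAMPLE_OA, true,
                DRM_XE_OA_PROPERTY_OA_METRIC_SET, test_set->perf_oa_metrics_set,
                /* the 1 ms exponent is what makes HW emit periodic reports */
                DRM_XE_OA_PROPERTY_OA_PERIOD_EXPONENT, oa_exp_1_millisec,
        };
        struct intel_xe_oa_open_prop param = {
                .num_properties = ARRAY_SIZE(properties) / 2,
                .properties_ptr = to_user_pointer(properties),
        };

        stream_fd = __perf_open(drm_fd, &param, false);

So as long as the exponent property is present, the HW should keep
emitting periodic reports into the OA buffer.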
>
>> + }
>> +}
>> +
>> static void check_reports(void *oa_vaddr, uint32_t oa_size,
>> const struct drm_xe_engine_class_instance *hwe)
>> {
>> @@ -4396,12 +4419,10 @@ static void check_reports_from_mapped_buffer(const struct drm_xe_engine_class_in
>> {
>> void *vaddr;
>> uint32_t size;
>> - uint32_t period_us = oa_exponent_to_ns(oa_exp_1_millisec) / 1000;
>>
>> vaddr = map_oa_buffer(&size);
>>
>> - /* wait for approx 100 reports */
>> - usleep(100 * period_us);
>> + wait_for_periodic_reports(vaddr, 20, hwe);
>
> I am wondering if we should make this 10 instead of 20, and also change 20
> to 10 in check_reports().
That makes sense. I added a patch in v3 to reduce the number of
reports being checked to 10.
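
I.e., the v3 call sites become something like:

        mmap_wait_for_periodic_reports(vaddr, 10, hwe);
        check_reports(vaddr, size, hwe);

with check_reports() itself also reduced to checking 10 reports.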
>
>> check_reports(vaddr, size, hwe);
>>
>> munmap(vaddr, size);
>> @@ -4426,12 +4447,11 @@ static void closed_fd_and_unmapped_access(const struct drm_xe_engine_class_insta
>> };
>> void *vaddr;
>> uint32_t size;
>> - uint32_t period_us = oa_exponent_to_ns(oa_exp_1_millisec) / 1000;
>>
>> stream_fd = __perf_open(drm_fd, &param, false);
>> vaddr = map_oa_buffer(&size);
>>
>> - usleep(100 * period_us);
>> + wait_for_periodic_reports(vaddr, 20, hwe);
>
> Here too.
>
>> check_reports(vaddr, size, hwe);
>>
>> munmap(vaddr, size);
>> --
>> 2.34.1
>>
>
> Thanks.
> --
> Ashutosh