Can you help me debug an issue?
Peter Senna Tschudin
peter.senna at gmail.com
Sat Mar 23 13:05:39 UTC 2024
Dear List,
I found a commit that introduced a regression that broke the tests
gem_exec_capture@many-4k-incremental and
gem_exec_capture@many-4k-zero. Reverting 93c5ec210 fixes the issue,
but that does not explain what the problem is.
The problem is that `e` gets corrupted when `many()` invokes the macro
`find_first_available_engine`; the corruption happens at the line
`saved = configure_hangs(fd, e, ctx->id);`. By corrupted, I mean that
the `name` field becomes empty and the `class` field holds a large
number. Once `e` is corrupted, the call to __captureN() fails because
it expects `e` to be valid. A simple workaround is to add `e =
&saved_engine.engine;` before the call to __captureN().
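One hypothesis worth checking, under the unconfirmed assumption that
intel_get_current_engine() returns a pointer into the iterator itself
rather than into longer-lived storage: an iterator declared in the
for-init clause stops existing when the loop ends, so any pointer into
it is dangling afterwards and the next function call may reuse that
stack space. The sketch below is not IGT code. All types and helpers
in it (struct engine, struct engine_iter, iter_current, iter_next,
make_iter, find_engine) are hypothetical stand-ins. It shows the safe
pattern of copying the engine by value while the iterator is still
alive:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the IGT types; not the real definitions. */
struct engine {
	char name[16];
	int class;
};

struct engine_iter {
	struct engine engines[4];
	size_t n;     /* index of the current engine */
	size_t count; /* number of valid entries */
};

/* Returns a pointer INTO the iterator, or NULL when it is exhausted. */
static struct engine *iter_current(struct engine_iter *it)
{
	return it->n < it->count ? &it->engines[it->n] : NULL;
}

static void iter_next(struct engine_iter *it)
{
	it->n++;
}

/* Builds a small iterator with two made-up engines. */
static struct engine_iter make_iter(void)
{
	struct engine_iter it = { .n = 0, .count = 2 };

	strcpy(it.engines[0].name, "rcs0");
	it.engines[0].class = 0;
	strcpy(it.engines[1].name, "bcs0");
	it.engines[1].class = 1;
	return it;
}

/*
 * Copies the first engine of the wanted class into *out while the
 * iterator is still in scope. Returns 1 on success, 0 if none matched.
 */
static int find_engine(int wanted_class, struct engine *out)
{
	struct engine *e;

	assert(out != NULL);
	for (struct engine_iter it = make_iter();
	     (e = iter_current(&it));
	     iter_next(&it)) {
		if (e->class == wanted_class) {
			*out = *e; /* copy by value before 'it' goes away */
			return 1;
		}
	}
	return 0;
}
```

A caller would declare `struct engine saved;`, call
`find_engine(1, &saved)`, and from then on use only `&saved`, never a
pointer obtained from inside the loop.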
I have spent a few hours trying to understand why `e` gets corrupted,
and I have run out of ideas. To make the code more gdb-friendly, I
unfolded the macro find_first_available_engine, but that did not
help me find the reason for the `e` corruption. Here is how I
unfolded the macro:
-	find_first_available_engine(fd, ctx, e, saved_engine);
+	ctx = intel_ctx_create_all_physical(fd);
+	igt_assert(ctx);
+	for (struct intel_engine_data i = intel_engine_list_for_ctx_cfg(fd, &(ctx)->cfg);
+	     (e = intel_get_current_engine(&i));
+	     intel_next_engine(&i)) {
+		if ((gem_class_can_store_dword(fd, e->class)))
+			break;
+	}
+	igt_assert(e);
+	printf("e->name: %s\n", e->name);
+	saved_engine = configure_hangs(fd, e, ctx->id);
+	printf("e->name: %s\n", e->name);
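In addition to printing e->name, it may help to check where `e`
actually points. The helper below is a hypothetical debugging aid, not
part of IGT. It reports whether a pointer falls inside the storage of
a given object by comparing addresses as integers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p points inside the obj_size bytes starting at obj. */
static int points_into(const void *p, const void *obj, size_t obj_size)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t base = (uintptr_t)obj;

	assert(obj != NULL && obj_size > 0);
	return addr >= base && addr - base < obj_size;
}
```

Calling `points_into(e, &i, sizeof(i))` inside the loop body, where
`i` is still in scope, and printing the result would show whether `e`
refers to memory owned by the iterator `i`. If it does, `e` cannot be
used safely after the loop.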
Reverting 93c5ec210 stops the corruption from happening, and I am
trying to understand why. Can you help me debug this further?
Thank you,
Peter
More information about the igt-dev
mailing list