[PATCH i-g-t v2] Fix memory access issue due to variable block scope

Kamil Konieczny kamil.konieczny at linux.intel.com
Wed Mar 27 16:59:32 UTC 2024


Hi Peter,

On 2024-03-26 at 22:41:27 +0100, Peter Senna Tschudin wrote:
> Hi Kamil,
> 
> Thank you for your email!
> 
> On Tue, Mar 26, 2024 at 7:20 PM Kamil Konieczny
> <kamil.konieczny at linux.intel.com> wrote:
> >
> > Hi peter.senna,
> > On 2024-03-26 at 14:18:43 +0100, Peter Senna Tschudin wrote:
> > > Hi Andi,
> > >
> > > Thank you for your reply.
> > >
> > > On Tue, Mar 26, 2024 at 1:05 PM Andi Shyti <andi.shyti at linux.intel.com> wrote:
> > > >
> > > > Hey Peter,
> > > >
> > > > On Mon, Mar 25, 2024 at 10:35:48PM +0100, Peter Senna Tschudin wrote:
> > > > > This patch fixes the tests gem_exec_capture at many-4k-incremental and
> > > > > gem_exec_capture at many-4k-zero that are currently failing with an invalid file
> > > > > descriptor error.
> > > >
> > > > where is gem_exec_capture calling for_each_ctx_cfg_engine()?
> > >
> > > many(), userptr(), capture_invisible()
> > >     find_first_available_engine()
> > >         for_each_ctx_engine()
> > >
> > > When called by many(), 'e' gets corrupted when configure_hangs() tries
> > > to assign 'e' to another variable. Then after 'e' is corrupted, the
> > > call __captureN() will fail because it expects 'e' to be valid.
> > >
> >
> > imho first step whould be to replace macro with a function:
> >
> > saved = find_first_available_engine(fd, ctx, e);
> 
> When I was done with the replacement function, I found a simple
> solution that seems to be adequate. I made a small change to
> find_first_available_engine() to reassign e to an element of saved. My
> patch v3 has the details. What do you think?

I will reply to your v4 patch, it looks promising.

> 
> May I ask you to increase the time the CI will waits for the tests to complete:
>  - many-4k-zero: 2 minutes
>  - many-4k-incremental: 5 minutes (It takes 4 minutes on my N3150)

It is not posibble, sorry, as there are many of them and time adds up...

Regards,
Kamil

> 
> Thank you,
> 
> Peter
> 
> >
> > Regards,
> > Kamil
> >
> > > >
> > > > >  struct intel_execution_engine2 *
> > > > >  intel_get_current_engine(struct intel_engine_data *ed)
> > > > >
> > > > > When intel_get_current_engine is called from the macro
> > > > > for_each_ctx_cfg_engine(), the variable *ed is defined within a for loop. The
> > > > > scope of *ed is limited to that loop, leading to access violations when
> > > > > attempting to access its contents outside the loop.
> > > > >
> > > > > Before to this patch, intel_get_current_engine() would return an element of *ed
> > > > > and attempting to use it after the loop ended resulted in undefined behavior.
> > > > >
> > > > > This patch introduces a memcpy() to copy the contents of ed->current_engine to
> > > > > a memory area not confined by the loop's scope, ensuring safe access to the
> > > > > data.
> > > > >
> > > > > v2: Added 'i-g-t' to the Subject.
> > > > >
> > > > > Signed-off-by: Peter Senna Tschudin <peter.senna at gmail.com>
> > > > > ---
> > > > >  lib/i915/gem_engine_topology.c | 14 +++++++++++++-
> > > > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> > > > > index afb576afb..b3b809482 100644
> > > > > --- a/lib/i915/gem_engine_topology.c
> > > > > +++ b/lib/i915/gem_engine_topology.c
> > > > > @@ -189,12 +189,24 @@ static int __query_engine_list(int fd, struct intel_engine_data *ed)
> > > > >  struct intel_execution_engine2 *
> > > > >  intel_get_current_engine(struct intel_engine_data *ed)
> > > > >  {
> > > > > +     struct intel_execution_engine2 *ret = NULL;
> > > > > +
> > > > >       if (ed->n >= ed->nengines)
> > > > >               ed->current_engine = NULL;
> > > > >       else if (!ed->n)
> > > > >               ed->current_engine = &ed->engines[0];
> > > > >
> > > > > -     return ed->current_engine;
> > > > > +     // When called from the macro for_each_ctx_cfg_engine(), *ed is defined
> > > > > +     // inside a for loop. In that case, not memcping ed->current_engine
> > > > > +     // will lead to a memory access violation when trying to access the
> > > > > +     // contents of ed->current_engine after the end of the for loop
> > > >
> > > > can you please use /* ... */ style of comment?
> > >
> > > sure, thank you for pointing this out.
> > >
> > > >
> > > > > +     if (ed->current_engine) {
> > > > > +             ret = malloc(sizeof(*ret));
> > > >
> > > > should this be freed at some point?
> > >
> > > I am not convinced that his patch is the best approach. If the problem
> > > is indeed the block scope of '*ed', I will propose to give
> > > for_each_ctx_engine() some serious love. I tried to come up with a
> > > proper solution but I failed to find one. The problems I faced are:
> > >  - for_each_ctx_engine() requires an struct intel_execution_engine2 as
> > > an iterator
> > >  - I did not find any way to keep for_each_ctx_engine() as a macro and
> > > change the scope(other than declaring ed outside the macro, argh...).
> > >  - Because for_each_ctx_engine() is a macro, I could not find a
> > > compiler friendly way to define the struct intel_execution_engine2
> > > within the macro
> > >  - To use free() 'properly', we need to save the information of when
> > > 'e' was allocated to prevent the code from trying to free something
> > > that should not be freed.
> > >
> > > This is why I asked for help here* on how you want me to fix it: I am
> > > under the impression that there is not a good solution for this
> > > problem. And then it becomes a question of long term maintenance.
> > > Should we drop the macro? Forbid the use of 'e' after the macro ends?
> > > Properly trace memory allocation of 'e' to prevent the potential
> > > free() issue? Something else? Please let me know.
> > >
> > > * - https://lists.freedesktop.org/archives/igt-dev/2024-March/070468.html
> > >
> > >
> > >
> > > >
> > > > Andi


More information about the igt-dev mailing list