[PATCH i-g-t v2] Fix memory access issue due to variable block scope

Kamil Konieczny kamil.konieczny at linux.intel.com
Tue Mar 26 18:20:06 UTC 2024


Hi peter.senna,
On 2024-03-26 at 14:18:43 +0100, Peter Senna Tschudin wrote:
> Hi Andi,
> 
> Thank you for your reply.
> 
> On Tue, Mar 26, 2024 at 1:05 PM Andi Shyti <andi.shyti at linux.intel.com> wrote:
> >
> > Hey Peter,
> >
> > On Mon, Mar 25, 2024 at 10:35:48PM +0100, Peter Senna Tschudin wrote:
> > > This patch fixes the tests gem_exec_capture at many-4k-incremental and
> > > gem_exec_capture at many-4k-zero that are currently failing with an invalid file
> > > descriptor error.
> >
> > where is gem_exec_capture calling for_each_ctx_cfg_engine()?
> 
> many(), userptr(), capture_invisible()
>     find_first_available_engine()
>         for_each_ctx_engine()
> 
> When called by many(), 'e' gets corrupted when configure_hangs() tries
> to assign 'e' to another variable. Then after 'e' is corrupted, the
> call __captureN() will fail because it expects 'e' to be valid.
> 

imho first step whould be to replace macro with a function:

saved = find_first_available_engine(fd, ctx, e);

Regards,
Kamil

> >
> > >  struct intel_execution_engine2 *
> > >  intel_get_current_engine(struct intel_engine_data *ed)
> > >
> > > When intel_get_current_engine is called from the macro
> > > for_each_ctx_cfg_engine(), the variable *ed is defined within a for loop. The
> > > scope of *ed is limited to that loop, leading to access violations when
> > > attempting to access its contents outside the loop.
> > >
> > > Before to this patch, intel_get_current_engine() would return an element of *ed
> > > and attempting to use it after the loop ended resulted in undefined behavior.
> > >
> > > This patch introduces a memcpy() to copy the contents of ed->current_engine to
> > > a memory area not confined by the loop's scope, ensuring safe access to the
> > > data.
> > >
> > > v2: Added 'i-g-t' to the Subject.
> > >
> > > Signed-off-by: Peter Senna Tschudin <peter.senna at gmail.com>
> > > ---
> > >  lib/i915/gem_engine_topology.c | 14 +++++++++++++-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/i915/gem_engine_topology.c b/lib/i915/gem_engine_topology.c
> > > index afb576afb..b3b809482 100644
> > > --- a/lib/i915/gem_engine_topology.c
> > > +++ b/lib/i915/gem_engine_topology.c
> > > @@ -189,12 +189,24 @@ static int __query_engine_list(int fd, struct intel_engine_data *ed)
> > >  struct intel_execution_engine2 *
> > >  intel_get_current_engine(struct intel_engine_data *ed)
> > >  {
> > > +     struct intel_execution_engine2 *ret = NULL;
> > > +
> > >       if (ed->n >= ed->nengines)
> > >               ed->current_engine = NULL;
> > >       else if (!ed->n)
> > >               ed->current_engine = &ed->engines[0];
> > >
> > > -     return ed->current_engine;
> > > +     // When called from the macro for_each_ctx_cfg_engine(), *ed is defined
> > > +     // inside a for loop. In that case, not memcping ed->current_engine
> > > +     // will lead to a memory access violation when trying to access the
> > > +     // contents of ed->current_engine after the end of the for loop
> >
> > can you please use /* ... */ style of comment?
> 
> sure, thank you for pointing this out.
> 
> >
> > > +     if (ed->current_engine) {
> > > +             ret = malloc(sizeof(*ret));
> >
> > should this be freed at some point?
> 
> I am not convinced that his patch is the best approach. If the problem
> is indeed the block scope of '*ed', I will propose to give
> for_each_ctx_engine() some serious love. I tried to come up with a
> proper solution but I failed to find one. The problems I faced are:
>  - for_each_ctx_engine() requires an struct intel_execution_engine2 as
> an iterator
>  - I did not find any way to keep for_each_ctx_engine() as a macro and
> change the scope(other than declaring ed outside the macro, argh...).
>  - Because for_each_ctx_engine() is a macro, I could not find a
> compiler friendly way to define the struct intel_execution_engine2
> within the macro
>  - To use free() 'properly', we need to save the information of when
> 'e' was allocated to prevent the code from trying to free something
> that should not be freed.
> 
> This is why I asked for help here* on how you want me to fix it: I am
> under the impression that there is not a good solution for this
> problem. And then it becomes a question of long term maintenance.
> Should we drop the macro? Forbid the use of 'e' after the macro ends?
> Properly trace memory allocation of 'e' to prevent the potential
> free() issue? Something else? Please let me know.
> 
> * - https://lists.freedesktop.org/archives/igt-dev/2024-March/070468.html
> 
> 
> 
> >
> > Andi


More information about the igt-dev mailing list