[PATCH 0/6] Fix crash after reloading a driver using ttm

Karol Herbst kherbst at redhat.com
Wed Apr 17 00:10:53 UTC 2019


On Wed, Apr 17, 2019 at 1:09 AM Eric Anholt <eric at anholt.net> wrote:
>
> Christian König <ckoenig.leichtzumerken at gmail.com> writes:
>
> > Am 16.04.19 um 02:35 schrieb Karol Herbst:
> >> Kobjects are supposed to be dynamically allocated, but with recent changes
> >> this rule was violated. Reverting those commits fixes crashes when a drm
> >> driver using TTM gets loaded again.
> >>
> >> The object in question is "ttm_mem_glob", declared inside
> >> "include/drm/ttm/ttm_memory.h" and instantiated inside
> >> "drivers/gpu/drm/ttm/ttm_memory.c".
> >>
> >> From "Documentation/kobject.txt":
> >> "Because kobjects are dynamic, they must not be declared statically or on
> >> the stack, but instead, always allocated dynamically.  Future versions of
> >> the kernel will contain a run-time check for kobjects that are created
> >> statically and will warn the developer of this improper usage."
> >>
> >> Unloading ttm before reloading the driver works around that crash, because
> >> the memory backing the kobject member "kobj" is cleaned up. The kobject_del
> >> and kobject_put functions never free or clean up the kobject itself, leaving
> >> it in an undefined state.
> >>
> >> I reverted a few more commits to make it less painful for me to revert this
> >> rather big change.
> >
> > Well, NAK. By reverting those changes you also reintroduce the problems
> > we originally fixed with those patches.
> >
> > Please work on a proper fix instead,
>
> That's not Karol's responsibility, that's yours as the author.  I would
> like to remind you of Linux's regressions policy, quoting from
> Documentation/process/4.Coding.rst:
>
> "One final hazard worth mentioning is this: it can be tempting to make a
> change (which may bring big improvements) which causes something to break
> for existing users.  This kind of change is called a "regression," and
> regressions have become most unwelcome in the mainline kernel.  With few
> exceptions, changes which cause regressions will be backed out if the
> regression cannot be fixed in a timely manner.  Far better to avoid the
> regression in the first place.
>
> It is often argued that a regression can be justified if it causes things
> to work for more people than it creates problems for.  Why not make a
> change if it brings new functionality to ten systems for each one it
> breaks?  The best answer to this question was expressed by Linus in July,
> 2007:
>
> ::
>
>         So we don't fix bugs by introducing new problems.  That way lies
>         madness, and nobody ever knows if you actually make any real
>         progress at all. Is it two steps forwards, one step back, or one
>         step forward and two steps back?"

He already wrote a fix. Search for "[PATCH 1/2] drm/ttm: fix re-init
of global structures".
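
For anyone following along, the lifetime problem from the cover letter can
be sketched in plain userspace C. This is only a toy model, not kernel code:
"struct toy_obj", "toy_init", "toy_put" and the load/unload helpers are
hypothetical names made up for illustration, and the "initialized" flag is
an assumed stand-in for whatever state a real kobject keeps after its last
put. It shows why a static object that is never reset can trip up a second
init, while a freshly allocated one cannot.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object; NOT the real struct kobject. */
struct toy_obj {
	int refcount;
	bool initialized;
};

/* Returns 0 on success, -1 if the object still carries state from an earlier use. */
static int toy_init(struct toy_obj *obj)
{
	if (obj->initialized)
		return -1;
	obj->refcount = 1;
	obj->initialized = true;
	return 0;
}

/*
 * Drops one reference. It never resets the object itself; at refcount zero
 * it only calls the optional release callback, which may free the memory.
 */
static void toy_put(struct toy_obj *obj, void (*release)(struct toy_obj *))
{
	if (--obj->refcount == 0 && release)
		release(obj);
}

static void toy_release_free(struct toy_obj *obj)
{
	free(obj);
}

/* Static instance: its memory outlives every simulated driver load. */
static struct toy_obj static_glob;

static int load_static(void)
{
	return toy_init(&static_glob);
}

static void unload_static(void)
{
	/* No release callback: nothing frees or clears static_glob. */
	toy_put(&static_glob, NULL);
}

/* Dynamic instance: each simulated load starts from zeroed memory. */
static int load_unload_dynamic(void)
{
	struct toy_obj *obj = calloc(1, sizeof(*obj));
	int ret;

	if (!obj)
		return -1;
	ret = toy_init(obj);
	toy_put(obj, toy_release_free);
	return ret;
}
```

In this model, calling load_static(), unload_static() and then load_static()
again makes the second load fail because the stale flag is still set, whereas
load_unload_dynamic() succeeds every time. Zeroing the static object before
reuse would be another way out in the toy model; whether that matches the
actual fix is not something this sketch claims.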

