[Intel-gfx] linux-firmware-i915 pull request (bxt dmc, kbl dmc)

Vivi, Rodrigo rodrigo.vivi at intel.com
Wed Aug 3 15:48:02 UTC 2016


On Wed, 2016-08-03 at 17:26 +0200, Daniel Vetter wrote:
> On Wed, Aug 3, 2016 at 5:12 PM, Vivi, Rodrigo <rodrigo.vivi at intel.com>
> wrote:
> > 
> > 
> > But we know that 1.23 is bad and causes issues regardless of the kernel
> > version. And please keep in mind this is the most common case.
> > Usually a previous minor version was dropped in favor of a new one
> > because we found a bug that got fixed in a following minor version.
> > This is the minor version idea. So regardless of the kernel version,
> > the newest minor is probably safer than the previous one.
> > 
> > So, I don't want to keep all versions in linux-firmware.git,
> > especially those that we know cause bad issues.
> > 
> > And here is the case where only a symbolic link would help imho.
> If a system goes from "mostly works" to "fails because DMC isn't there
> any more" then that's a regression. Which means we _must_ restore
> 1.23.

From what I can remember, it causes GPU hangs depending on what you are
running on the GPU.



> Of course it's not great that it's just "mostly works" and not "works
> really well", and for that we need to make sure a fixed DMC is in
> linux-firmware, and once that has landed we can backport the
> kernel-side bugfixes plus allow the kernel to either load the new
> fixed dmc, or if that's not there fall back to 1.23. Because if we
> don't do that, then it goes again boom.

Which also highlights that we will never test every combination of
userspace graphical environments and versions out there. So it is lame
to claim that we only support combinations we have tested and are sure
about, because we are not testing all the combinations out there
anyway.

> 
> Jani's point (which I fully support) is that we should do that
> backporting of the support for the new/fixed dmc firmware explicitly,
> and not automatically through a symlink. Because it could be that the
> new firmware also needs some (small) kernel changes to work well. 

The rule is that if any software change is required we should bump the
major version. The only accepted patches that shouldn't bump the major
are the ones touching the recommended version or blacklisting known-bad
firmware versions.
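
To make that rule concrete, here is a rough sketch in Python. It is not
the i915 loader code; the helper names and every version number other
than 1.23 are made up for illustration. It assumes versions are plain
"major.minor" strings, that the major acts as the ABI version, and that
the newest non-blacklisted minor is the one to use.

```python
from typing import Iterable, Optional, Set, Tuple

Version = Tuple[int, int]  # (major, minor)


def parse_version(text: str) -> Version:
    """Split a 'major.minor' string such as '1.23' into integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def pick_firmware(required_major: int,
                  available: Iterable[str],
                  blacklist: Set[str]) -> Optional[str]:
    """Return the newest usable version string, or None if there is none.

    A version is usable when its major matches what the driver expects
    (assumption: major == ABI version) and it is not blacklisted.
    """
    candidates = [
        v for v in available
        if v not in blacklist and parse_version(v)[0] == required_major
    ]
    if not candidates:
        return None
    # Newest minor wins among the candidates with the right major.
    return max(candidates, key=parse_version)


# Hypothetical inventory: 1.23 is known bad, 1.26 and 2.01 are invented.
chosen = pick_firmware(1, ["1.23", "1.26", "2.01"], {"1.23"})
```

With only the blacklisted 1.23 installed the function returns None,
which is exactly the "DMC isn't there any more" situation discussed
above.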

> And
> we definitely want to be able to test with the new firmware, and have
> the ability to revert the backport on stable kernels again in case it
> blows up for some reason that we don't understand and don't have the
> resources to debug. Since again all these cases would be regressions.

This is another thing Chris has already pointed out. Without
installing every possible firmware blob on the system we won't be able
to bisect anyway.

> 
> So we definitely don't want to leave people hanging out there with bad
> dmc versions. We just need to be careful that we don't regress
> anything, and that we can revert again if some backport does blow up.

But what I also don't want is to be blamed for blowing up the size of
the linux-firmware.git packages in all OSVs' distros because we decided
to keep all versions there.

We are the only linux-firmware.git user that keeps multiple versions,
and with this new rule of having to leave old blobs there, it will blow
up soon.

> 
> I hope that explains.
> 

It does. I just don't agree. Well, not anymore actually ;) 

But this is a maintainer call, so I will bring 1.23 back and keep all
old versions from now on.

Thanks,
Rodrigo.

> Cheers, Daniel
> 
> 
> > 
> > 
> > On Wed, 2016-08-03 at 17:08 +0200, Daniel Vetter wrote:
> > > 
> > > On Wed, Aug 3, 2016 at 5:06 PM, Vivi, Rodrigo <rodrigo.vivi at intel.com>
> > > wrote:
> > > > 
> > > > So, issues like https://bugs.freedesktop.org/show_bug.cgi?id=97182
> > > > will appear with frequency now...
> > > > 
> > > > should we just close all as wontfix?
> > > It sounds like we should fix that by restoring 1.23. Certainly not
> > > WONTFIX. WONTFIXing regressions is pretty much the only guaranteed
> > > way to terminally piss off Dave&Linus.
> > > -Daniel
> > > 
> > > > 
> > > > 
> > > > 
> > > > On Wed, 2016-08-03 at 17:02 +0200, Daniel Vetter wrote:
> > > > > 
> > > > > 
> > > > > On Wed, Aug 3, 2016 at 11:08 AM, Jani Nikula
> > > > > <jani.nikula at linux.intel.com> wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > I believe this is another point in favor of bringing the
> > > > > > > > symlinks back.
> > > > > > > > 
> > > > > > > > But also because we need to remove any firmware that we
> > > > > > > > know is bad and that would break the user. If it was
> > > > > > > > blacklisted, it was removed from the repo.
> > > > > > > > 
> > > > > > > > Yet another reason for symbolic links. If we know the
> > > > > > > > firmware is bad, it is bad for previous versions as well,
> > > > > > > > but if we stay with the version hardcoded we are forcing
> > > > > > > > the user to stay with a firmware that we know is bad.
> > > > > > > Indeed.  Please don't put a full version number in the
> > > > > > > filenames requested by drivers.  Where it's not possible to
> > > > > > > maintain ABI compatibility between driver and firmware
> > > > > > > indefinitely then include an ABI version in the filename,
> > > > > > > but not the full version.
> > > > > > I'm starting to sound like a broken record, but here goes again.
> > > > > > 
> > > > > > We do not have the bandwidth to test all combinations of kernel
> > > > > > and firmware versions.
> > > > > > 
> > > > > > If we update linux-firmware to change the firmware blob to use
> > > > > > (either by changing where the symlink points or by replacing
> > > > > > the file) we roll out untested firmware/kernel combinations to
> > > > > > stable kernel users.
> > > > > > 
> > > > > > IMO we should be specific which firmware version(s) to accept
> > > > > > in the kernel, limiting to known good and tested combinations.
> > > > > > If there's a need to update the firmware to use for stable
> > > > > > kernels, it's a matter of backporting the commit accepting
> > > > > > another firmware version. This can be done by us or an OSV.
> > > > > > 
> > > > > > Even when there's supposed to be ABI compatibility, I wouldn't
> > > > > > liberally roll out firmware updates across all past stable
> > > > > > kernels without testing. Anyone suggesting that obviously
> > > > > > doesn't have to be on the receiving end of the bug reports when
> > > > > > things go wrong in mysterious and non-bisectable ways.
> > > > > > 
> > > > > > I don't think it's a good idea to give the control of firmware
> > > > > > version selection to the user space and linux-firmware.
> > > > > +1
> > > > > 
> > > > > We discussed why symlinks are not a great pick for gpus at
> > > > > length, all those reasons are still valid. Mostly it boils down
> > > > > to that the actual interface between gpu components is
> > > > > _extremely_ wide, and includes all kinds of fun things like
> > > > > minute timing details, w/a settings and really just everything.
> > > > > 
> > > > > I'd say for the same reasons we only support open source
> > > > > userspace drivers (anything else can't be audited when it breaks
> > > > > and debugged) we need to restrict the combinatorial interaction
> > > > > madness with firmware. If that makes gpus special in yet another
> > > > > way, so be it.
> > > > > -Daniel
> > > 
> 
> 

