[Bug 204181] NULL pointer dereference regression in amdgpu

bugzilla-daemon at bugzilla.kernel.org
Mon Sep 30 02:07:49 UTC 2019


https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #60 from jamespharvey20 at gmail.com ---
(In reply to Sergey Kondakov from comment #59)
> 
> And how about instead of knowingly pushing untested code with known fatal
> errors you stop taking QA notes from FGLRX in the first place and do your
> own full testing ? You do realize that I, as all others, paid for that card
> to your employer, right ? And people don't buy your top cards,
> RX[4-5][7-8]0, VEGAs and so on, to use them as expensive bare output
> controllers.

This.  If this were just a free project with volunteers giving their time, many
of us who occasionally throw a tantrum towards AMD wouldn't be doing so.  But
some of us are throwing money at AMD to try to have a stable system again, and
keep getting regressions that are either fixed very slowly or not at all.

I'm here, because I was running an R9 390, and kernel 4.19 introduced a
regression that causes a complete boot failure.  Others confirmed the same. 
See https://bugs.freedesktop.org/show_bug.cgi?id=108781  (As I explain way
below, this is still unfixed in 5.3.)

On that bug, an amd.com developer asks me to bisect.  I run into hundreds, or
even a thousand, commits that don't even compile until a later commit fixes
the build.  Fun, thanks for pushing those, guys.  I finally arrive at a
bisected commit, where 0d9988910989 causes a boot hang and the one previous to
it doesn't.  After being told that commit shouldn't have anything to do with
the bug I reported, I discover that it does cause a black screen boot hang,
but it's a different bug!  I then go on to document that I've found between 3
and 5 separate crashing commits among the new 4.19 commits.

So, how am I supposed to bisect this garbage, when a lot of it doesn't even
compile, and there are multiple bugs popping in and out of existence causing
the same symptom?  It boot crashes with a black screen, and I'm supposed to
know to mark that commit as good because it's a different bug causing the same
issue?
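
(For what it's worth, git bisect does have a third verdict besides good and
bad: "git bisect skip", or exit code 125 from a "git bisect run" script, marks
a commit as untestable.  A minimal sketch in Python of how the cases above
could map onto those verdicts.  The function name and its three inputs are
hypothetical, and deciding whether a hang is the same bug is still a manual
judgment call that no script makes for you.)

```python
# Exit codes understood by `git bisect run`:
#   0 = good, 125 = skip (cannot be tested), 1-127 except 125 = bad.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_verdict(builds, boots, same_bug=None):
    """Map one test of a commit onto a git bisect verdict.

    builds   -- did the kernel compile at this commit?
    boots    -- did the machine boot with it?
    same_bug -- for a failed boot: True if the failure matches the bug being
                hunted, False if it is a different one, None if unknown.
    (All three inputs are assumptions supplied by whoever runs the test.)
    """
    if not builds:
        return SKIP  # cannot test a commit that does not compile
    if boots:
        return GOOD
    if same_bug:
        return BAD   # the regression being hunted is present
    # A different or unidentified crash hides whether the hunted bug is
    # present, so this commit says nothing either way.
    return SKIP


if __name__ == "__main__":
    print(bisect_verdict(builds=False, boots=False))                  # 125
    print(bisect_verdict(builds=True, boots=False, same_bug=False))   # 125
    print(bisect_verdict(builds=True, boots=False, same_bug=True))    # 1
```

With enough skipped commits, bisect can end up only narrowing things down to a
range of candidates rather than one commit, so this doesn't make the mess go
away.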

I ask the AMD devs to tell me exactly which card they use in testing (if any,
at all) so I can just buy that and be done with this.  No response.

So, I pay AMD more money and buy an RX 580, which is mostly a downgrade from
the R9 390.  I get frequent crashes from that as well.

So, I just decide to buy a Vega 64.  I don't need the extra power, I just want
to run a stable machine.  Since the AMD devs won't say which card they use
themselves (which I'd buy in the hope that they catch crashes on it before
pushing them), I figure the latest and greatest might be getting more
attention.

All goes well until this regression is introduced.

I go back to try my R9 390, and guess what?  The same bug introduced in kernel
4.19 is still there in 5.3!  AMD's just ignored it, and hasn't bothered to try
to reproduce it themselves and try to untangle the mess of spaghetti.

Since running a custom kernel with the patchset, I haven't had this crash, but
come on guys!  Couldn't AMD have a bank of 50 computers running different
cards, constantly running the latest unpushed code and going through different
stress tests?  Hey, Jim, monitors #14 and #36 keep crashing, let's look into
it...
