[PATCH v7 03/10] drm/xe/devcoredump: Add ASCII85 dump helper function

Mon Sep 16 17:46:54 UTC 2024

On 9/16/2024 08:32, Rodrigo Vivi wrote:
> On Thu, Sep 12, 2024 at 11:06:20AM -0700, John Harrison wrote:
>> On 9/12/2024 06:57, Rodrigo Vivi wrote:
>>> On Wed, Sep 11, 2024 at 12:59:54PM -0700, John Harrison wrote:
>>>> On 9/11/2024 12:54, Souza, Jose wrote:
>>>>> On Wed, 2024-09-11 at 12:35 -0700, John Harrison wrote:
>>>>>> On 9/11/2024 12:30, Souza, Jose wrote:
>>>>>>> On Wed, 2024-09-11 at 14:12 -0500, Lucas De Marchi wrote:
>>>>>>>> On Tue, Sep 10, 2024 at 01:17:11PM GMT, John Harrison wrote:
>>>>>>>>> On 9/10/2024 12:43, Lucas De Marchi wrote:
>>>>>>>>>> On Mon, Sep 09, 2024 at 06:31:41PM GMT, John Harrison wrote:
>>>>>>>>>>> On 9/6/2024 19:06, John Harrison wrote:
>>>>>>>>>>>> On 9/5/2024 20:04, Lucas De Marchi wrote:
>>>>>>>>>>>>> On Thu, Sep 05, 2024 at 07:01:33PM GMT, John Harrison wrote:
>>>>>>>>>>>>>> On 9/5/2024 18:54, Lucas De Marchi wrote:
>>>>>>>>>>>>>>> On Thu, Sep 05, 2024 at 01:50:58PM GMT,
>>>>>>>>>>>>>>> John.C.Harrison at Intel.com wrote:
>>>>>>>>>>>>>>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There is a need to include the GuC log and other large
>>>>>>>>>>>>>>>> binary objects
>>>>>>>>>>>>>>>> in core dumps and via dmesg. So add a helper for dumping
>>>>>>>>>>>>>>>> to a printer
>>>>>>>>>>>>>>>> function via conversion to ASCII85 encoding.
>>>>>>>>>>>>>>> why are we not dumping the binary data directly to devcoredump?
>>>>>>>>>>>>>> As per earlier comments, there is a WiFi driver or some such
>>>>>>>>>>>>>> that does exactly that. But all they are dumping is a binary
>>>>>>>>>>>>>> blob.
>>>>>>>>>>>>> In your v5 I see you mentioned
>>>>>>>>>>>>> drivers/net/wireless/ath/ath10k/coredump.c, but that is a
>>>>>>>>>>>>> precedent for
>>>>>>>>>>>>> including it as is from the device rather than converting it to ASCII85 or
>>>>>>>>>>>>> something else. It seems odd to do that type of conversion in kernel
>>>>>>>>>>>>> space when it could be perfectly done in userspace.
>>>>>>>>>>>> It really can't. An end user could maybe be expected to zip or
>>>>>>>>>>>> tar a coredump file before attaching it to a bug report but they
>>>>>>>>>>>> are certainly not going to try to ASCII85 encode random bits of
>>>>>>>>>>>> it. Whereas, putting that in the kernel means it is just there.
>>>>>>>>>>>> It is done. And it is pretty trivial - just call a helper
>>>>>>>>>>>> function and it does everything for you. Also, I very much doubt
>>>>>>>>>>>> you can spew raw binary data via dmesg. Even if the kernel would
>>>>>>>>>>>> print it for you (which I doubt), the user tools like syslogd
>>>>>>>>>>>> and dmesg itself are going to filter it to make it ASCII safe.
>>>>>>>>>>>>
>>>>>>>>>>>> The i915 error dumps have been ASCII85 encoded using the
>>>>>>>>>>>> kernel's ASCII85 encoding helper function since forever. This
>>>>>>>>>>>> patch is just a wrapper around the kernel's existing
>>>>>>>>>>>> implementation in order to make it more compatible with printing
>>>>>>>>>>>> to dmesg. This is not creating a new precedent. It already
>>>>>>>>>>>> exists.
>>> No! no big dump in dmesg.
>> Please see other response. Dumping via dmesg is intended for internal use
>> and impossible to repro fatal scenarios. There is zero expectation that an end
>> user would ever see a dump via dmesg. This entire patch set only includes
>> one actual trigger for doing such dumps and that is only if
>> CONFIG_DRM_XE_DEBUG is defined. Which it would not be for an end user.
>>
>> And if you are saying that internal developers are not allowed to use dumps
>> via dmesg then we might as well give up and go home because there are very
>> many bugs for which this is the only viable debug method.
> What I'm saying is, as a user of CONFIG_DRM_XE_DEBUG, I really don't want to see
> this big pollution in my dmesg every time.
How many times have you seen "big pollution in your dmesg" so far?

Right now, checked in, as we are currently shipping the Xe driver to the 
public, the spam is far worse than after this patch set. On a CT 
failure, it will dump the CT buffer contents as a hexdump. If the
buffer is very backed up (which it often is in the case of a fatal GuC
failure), that could be something like 2MB of data dumped as a
one-word-per-line hexdump to dmesg. And that has no config option to 
turn it off at all. I.e., that is enabled and shipping for end users 
with their RedHat default kernel build. This version provides a lot more 
useful information in a much more compact and less spewing form, and is 
off by default. That is, it is producing less "pollution" not more.
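
For a rough sense of scale, the back-of-the-envelope sketch below compares the text size of the two formats for the 2MB case. The hexdump line layout (`0x%08x` plus a newline) and the 800-character ASCII85 line width are assumptions for illustration, not necessarily the exact formats the driver emits, and the per-line dmesg prefix (timestamp, device name) is ignored.

```python
import math

def hexdump_chars(num_bytes: int) -> int:
    """Text size of a one-word-per-line hexdump.

    Assumed layout: '0x%08x' (10 chars) plus a newline per 32-bit word.
    """
    words = num_bytes // 4
    return words * 11

def ascii85_chars(num_bytes: int, line_width: int = 800) -> int:
    """Text size of a line-wrapped ASCII85 dump.

    Five characters per 32-bit word, one newline per wrapped line.
    This is the worst case: it ignores the single-character 'z'
    shortcut that ASCII85 uses for an all-zero word.
    """
    words = num_bytes // 4
    payload = words * 5
    lines = math.ceil(payload / line_width)
    return payload + lines

size = 2 * 1024 * 1024  # the 2MB CT buffer case
print("hexdump:", hexdump_chars(size), "chars in", size // 4, "lines")
print("ascii85:", ascii85_chars(size), "chars in",
      math.ceil((size // 4) * 5 / 800), "lines")
```

Under those assumptions the ASCII85 form is less than half the characters and a small fraction of the line count, before counting the prefix that dmesg adds to every line.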

Also, this feature has been part of i915 for over a year (or two, or 
three?). How much "big pollution" have you suffered on i915 in all that 
time?

>
> I understand the case where the machine hangs entirely at boot and you have nothing
> but the serial log messages that were dumped there. And you want more information
> in that view. I understand the case, although I know that we should be really
> working to avoid this kind of failure to start with.
>
> A failure in firmware communication should never ever cause the platform to hang.
> But well, you might tell me that in order to solve cases like this you would need
> information on why the firmware failed so badly, right?!
This is not about debugging common hangs caused by end users, bad UMDs, 
etc. This is about debugging fatal kernel or firmware bugs. Those are 
things that should never happen. If you are hitting any of this on a 
regular basis then the driver is fundamentally broken and you can't 
actually be using it for anything meaningful anyway. This is about 
debugging the issues that occur once in a blue moon. And if you ever hit 
one locally (on production hardware, with a production driver and 
production firmware) then I would be incredibly surprised and absolutely 
want to get that log from you to debug what happened.

>
> This should be a one off case, or a totally separate config, and definitely not
> driving decisions of stable api/abi like the devcoredump. Clear now?
I have no idea what you mean by "driving decisions of stable api/abi 
like the devcoredump". This is not changing any stable API. It is not 
changing the devcoredump infrastructure. It is adding extremely useful 
features to our private and internally defined coredump content. And if 
you are referring to the dump-via-dmesg (which is config option 
controlled or locally added by a developer themselves), that is not part 
of devcoredump. The intent is to use devcoredump as the container being 
dumped because why duplicate code? Although currently it can't because 
of how we have implemented our devcoredump capture code. But either way, 
it has absolutely zero bearing on any APIs or on how devcoredumps are 
used for regular debugging of end user hangs.

As for config options, sure we can create as many config options as we 
like but this feature absolutely needs to be enabled for CI builds. 
Plus, last time I tried adding a new config option for debug code, I was 
told that we should just use the existing one and not pollute the config 
space.

But like I said, the currently shipping Xe driver has the CT dump 
permanently enabled. There is no config option at all. So using any 
config option is still a significant improvement.

John.

>
>>
>>>>>>>>>>>>> $ git grep ascii85.h
>>>>>>>>>>>>> drivers/gpu/drm/i915/i915_gpu_error.c:#include <linux/ascii85.h>
>>>>>>>>>>>>> drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:#include <linux/ascii85.h>
>>>>>>>>>>>>> drivers/gpu/drm/msm/adreno/adreno_gpu.c:#include <linux/ascii85.h>
>>>>>>>>>>>>> drivers/gpu/drm/xe/xe_lrc.c:#include <linux/ascii85.h>
>>>>>>>>>>>>> drivers/gpu/drm/xe/xe_vm.c:#include <linux/ascii85.h>
>>>>>>>>>>>> And the list of drivers which dump raw binary data in a coredump
>>>>>>>>>>>> file is... ath10k. ASCII85 wins 3 to 1.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>> We want the devcoredump file to still be human readable.
>>>>>>>>>>>>>> That won't be the case if you stuff binary data in the
>>>>>>>>>>>>>> middle of it. Most obvious problem - the zeros in the data
>>>>>>>>>>>>>> will terminate your text file at that point. Potentially
>>>>>>>>>>>>>> bigger problem for end users - random fake ANSI codes will
>>>>>>>>>>>>>> destroy your terminal window if you try to cat the file to
>>>>>>>>>>>>>> read it.
>>>>>>>>>>>>> Users don't get a coredump and cat it to the terminal.
>>>>>>>>>>>>> =(lk%A8`T7AKYH#FD,6++EqOABHUhsG%5H2ARoq#E$/V$Bl7Q+@<5pmBe<q;Bk;0mCj@.3DIal2FD5Q-+E_RBART+X@VfTuGA2/4Dfp.E@3BN0DfB9.+E1b0F(KAV+:8
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lucas De Marchi
>>>>>>>>>>>> They might. Either intentionally or accidentally. I've certainly
>>>>>>>>>>>> done it myself. And people will certainly want to look at it in
>>>>>>>>>>>> any random choice of text editor, pager, etc. 'cos you know, it
>>>>>>>>>>>> is meant to be read by humans. If it is full of binary data then
>>>>>>>>>>>> that becomes even more difficult than simply being full of ASCII
>>>>>>>>>>>> gibberish. No matter what you are doing, the ASCII version is
>>>>>>>>>>>> safer and easier to look at the rest of the file around it.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't understand why you are so desperate to have raw binary
>>>>>>>>>>>> data in the middle of a text file. The disadvantages are
>>>>>>>>>>>> multiple but the only advantage is a slightly smaller file. And
>>>>>>>>>>>> the true route to smaller files is to add compression like we
>>>>>>>>>>>> have in i915.
>>>>>>>>>>>>
>>>>>>>>>>>> John.
>>>>>>>>>>>>
>>>>>>>>>>> PS: Also meant to add that one of the important use cases for
>>>>>>>>>>> dumping logs to dmesg is for the really hard to repro bugs that
>>>>>>>>>>> show up in CI extremely rarely. We get the driver to dump an error
>>>>>>>>>>> capture to dmesg and pull that out from the CI logs. Even if you
>>>>>>>>>>> could get binary data through dmesg, pretty sure the CI tools
>>>>>>>>>>> would also not be happy with it. Anything non-printable will get
>>>>>>>>>>> munged for sure when turning it into a web page.
>>>>>>>>>> I think that's the main source of confusion on what we are discussing.
>>>>>>>>>> I was not talking about dmesg at all. I'm only complaining about feeding
>>>>>>>>>> ascii85-encoded data into a *devcoredump* when apparently there isn't a
>>>>>>>>>> good reason to do so. I'd rather copy the binary data to the
>>>>>>>>>> devcoredump.
>>>>>>>>> But the intent is to dump a devcoredump to dmesg. It makes much sense
>>>>>>>> It seems like an awful idea to dump hundreds of MB to dmesg.  When we
>>>>>>>> talked about printing to dmesg it was about **GuC log** and on very
>>>>>>>> initial states of driver probe where we didn't actually have a good
>>>>>>>> interface for that. And the log wouldn't be so big. If we can already
>>>>>>>> capture the devcoredump, what would be the reason to dump to dmesg
>>>>>>>> (other than the non-valid "our CI captures dmesg, and doesn't
>>>>>>>> capture devcoredump", which should be fixed).
>>>>>>>>
>>>>>>>> If any sysadmin has their serial console flooded by such garbage there
>>>>>>>> are 2 reactions: 1) someone got in control of my machine; 2) something
>>>>>>>> went really bad with this machine. It's not "fear not, wait for it to
>>>>>>>> complete, it's just normal debug data I will attach to an issue in
>>>>>>>> gitlab".  And I'm mentioning a serial console here due to that
>>>>>>>> cond_resched() added, which is only needed because you are trying to do
>>>>>>>> in kernel space what should be in userspace.
>>>>>>>>
>>>>>>>> Oh well... looking at this the main reason to use ascii85 I can see is
>>>>>>>> because we already have parts of *our* devcoredump using it, and
>>>>>>>> userspace relying on that. That's new to me. Let's stop bringing dmesg
>>>>>>>> into this discussion.
>>>>>>>>
>>>>>>>>> to have a single implementation that can be used for multiple
>>>>>>>>> purposes. Otherwise you are duplicating a lot of code unnecessarily.
>>>>>>>>>
>>>>>>>>> And I still think it is a *very* bad idea to be including binary data
>>>>>>>>> in a text file. The devcoredump is supposed to be human readable. It
>>>>>>>> no, it's not. devcoredump doesn't dictate the format, it's up to the
>>>>>>>> drivers to do that. See their documentation.
>>>>>>>>
>>>>>>>>> is supposed to be obtained by end users and passed around. Having
>>>>>>>>> binary data in there just makes everything more complex and error
>>>>>>>>> prone. We want this to be as simple, easy and safe as possible.
>>>>>>>>>
>>>>>>>>>> For dmesg, there's a reason to encode it as you pointed out... but
>>>>>>>>>> users shouldn't actually see it - we should be getting all of those
>>>>>>>>>> cases in CI. For the escape scenarios, yeah... better having it
>>>>>>>>>> ascii85-encoded.
>>>>>>>>>>
>>>>>>>>>> What you are adding to devcoredump also doesn't even seem to be an
>>>>>>>>>> ascii85 representation, but multiple lines that should be concatenated
>>>>>>>>>> to form the ascii85 representation. For dmesg it makes sense. Not for
>>>>>>>>>> devcoredump.  We would also probably need a length field (correctly
>>>>>>>>>> accounting for the additional characters for each line) so we don't
>>>>>>>>>> have an implicit dependency on what's the next field to know how much to
>>>>>>>>>> parse.
>>>>>>>>> The decoding is pretty trivial given that line feeds are not part of
>>>>>>>>> the ASCII85 character set and so can just be dropped. Besides, the
>>>>>>>>> output is already not 'pure' ASCII85 because the ASCII85 data is
>>>>>>>>> embedded within a devcoredump. There is all sorts of other text about,
>>>>>>>>> including on the start of the line. There are multiple ASCII85 blobs
>>>>>>>>> in there that need to be decoded separately. This is nothing new to my
>>>>>>>>> patch set. All of that is already there. And as per comments on the
>>>>>>>>> previous devcoredump patches from Matthew B, the object data can be many
>>>>>>>>> hundreds of MBs in size. Yet no-one batted an eyelid when that was
>>>>>>>>> added. So why the sudden paranoia about adding a couple of MB of GuC
>>>>>>>>> log in the same form?
>>>>>>>> I suppose you are talking about commit 4d5242a003bb ("drm/xe: Implement capture of
>>>>>>>> HWSP and HWCTX").  Probably because I haven't seen that commit doing an
>>>>>>>> ascii85 encoding before, otherwise I'd have similar review feedback.
>>>>>>>>
>>>>>>>> Looking at this just now, so I will also have to balance the previous
>>>>>>>> users and existing userspace consuming it.
>>>>>>>>
>>>>>>>> +José, would it be ok from the userspace POV to start adding the \n?
>>>>>>>> Then we can at least have all fields in our devcoredump to follow the
>>>>>>>> same format. Are these the decoder parts on the mesa side?
>>>>>>>>
>>>>>>>> 	src/intel/tools/aubinator_error_decode.c
>>>>>>>> 	src/intel/tools/error2hangdump.c?
>>>>>>>>
>>>>>>>>      From a quick look, read_xe_data_file() already continues the previous
>>>>>>>> topic when it reads a newline, but the parsers for HWCTX and HWSP
>>>>>>>> seem to expect to have the entire topic in a single line. But I may
>>>>>>>> be missing something.
>>>>>>> Sorry I'm not following up this thread.
>>>>>>> Add a '\n' where exactly?
>>>>>> To break very long ASCII85 streams across multiple lines. That will
>>>>>> allow the devcoredump file to output via dmesg for the situations where
>>>>>> reading from sysfs is not possible.
>>>>>>
>>>>>> This patch is adding an ASCII85 encoding helper that is more friendly to
>>>>>> output via dmesg. The patch does not currently change the existing
>>>>>> ASCII85 encodings of VMs and hw contexts. However, the intention would
>>>>>> be to update that code to use this helper eventually.
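
A minimal userspace sketch of the idea, in Python rather than kernel C: encode a blob as ASCII85, break it across fixed-width lines, and decode it again by discarding the line feeds. The function names and the 64-character width are hypothetical, and Python's `base64.a85encode` is standing in for the kernel's ASCII85 helper, whose handling of a trailing partial word may differ.

```python
import base64

def encode_wrapped(blob: bytes, width: int = 64) -> list:
    """Encode blob as ASCII85 and split the text into lines of at most width chars."""
    text = base64.a85encode(blob).decode("ascii")
    return [text[i:i + width] for i in range(0, len(text), width)]

def decode_wrapped(lines: list) -> bytes:
    """Rejoin the lines and decode.

    Line feeds are not part of the ASCII85 alphabet, so dropping them
    recovers the original stream unchanged.
    """
    joined = "".join(line.strip() for line in lines)
    return base64.a85decode(joined)

blob = bytes(range(256)) * 4          # stand-in for a binary log buffer
lines = encode_wrapped(blob)
assert all(len(line) <= 64 for line in lines)
assert decode_wrapped(lines) == blob  # round trip survives the line wrapping
```

A real decoder for a devcoredump would additionally have to strip whatever text precedes the ASCII85 data on each line and handle each blob in the file separately.
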
>>>>> It will break the parser and I don't think we are allowed to break it at this point.
>>>> The intent has always been to add compression as we had in i915. That will
>>>> presumably also break the parser. Although I'm not seeing how a debug log
>>>> parser counts as critical user space that must not ever be broken.
>>> We shouldn't be breaking the current userspace tools. Any change like this would
>>> need to be synchronized between all the current decode tools.
>> That is why the comment was 'eventually we would like to'. Nothing in this
>> patch will break a user tool unless that tool cannot cope with new data
>> being added to the core dump. And if we can't add new stuff to core dumps
>> then again, we are fundamentally broken.
>>
>>>>> What I can suggest is add another sysfs with a coredump that skips the binary dumps so it is readable by humans.
>>>> That will just lead to confusion and problems with people sending the wrong
>>>> file. This needs to be kept as simple and error-proof as possible.
>>> Perhaps it is time for us to convince devcoredump folks to accept
>>> multiple files like nvme folks were asking a while ago? [1]
>>>
>>> [1] - https://lore.kernel.org/lkml/1557676457-4195-4-git-send-email-akinobu.mita@gmail.com/
>> Is that likely to happen any time soon? That thread was from May 2019. We
>> need something now. And it's not clear if that is adding a filing system
>> within a single core dump file (i.e. read one file from sysfs and then
>> extract later like a tarball) or creating multiple sysfs files that all must
>> be read together and passed around together. I would strongly advise against
>> the latter. As I said already, the whole coredump thing needs to be as
>> simple and easy to use as possible.
>>
>> If more official support is added to devcoredump itself in some number of
>> years time and is actually beneficial, then we can move over to using that
>> interface (and break the user land tools again...). But until then, this
>> version works with the least amount of disruption for significant benefit.
>> I.e., it lets us actually debug problems right now.
>>
>> John.
>>
>>
>>>> John.
>>>>
>>>>>> John.
>>>>>>
>>>>>>>> Lucas De Marchi
>>>>>>>>
>>>>>>>>> And again, arbitrarily long lines (potentially many thousands of
>>>>>>>>> characters wide) in a text file can cause problems. Having it line
>>>>>>>>> wrapped gets rid of those potential problems and so is safer. Anything
>>>>>>>>> that reduces the risk of an error report being broken is a good thing
>>>>>>>>> IMHO. Robustness is worthwhile!
>>>>>>>>>
>>>>>>>>> John.
>>>>>>>>>
>>>>>>>>>> Lucas De Marchi
>>>>>>>>>>
>>>>>>>>>>> John.
>>>>>>>>>>>