[CI] drm/xe/guc: Remove spurious line feed in debug print

Wed Jul 3 14:54:43 UTC 2024

On 03.07.2024 16:15, Rodrigo Vivi wrote:
> On Thu, Jun 27, 2024 at 03:27:56PM +0000, Souza, Jose wrote:
>> On Thu, 2024-06-27 at 16:42 +0200, Michal Wajdeczko wrote:
>>>
>>> On 27.06.2024 15:53, Souza, Jose wrote:
>>>> On Wed, 2024-06-26 at 23:18 +0200, Michal Wajdeczko wrote:
>>>>>
>>>>> On 26.06.2024 22:35, Souza, Jose wrote:
>>>>>> On Wed, 2024-06-26 at 22:12 +0200, Michal Wajdeczko wrote:
>>>>>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>>>>>
>>>>>>> Including line feeds at the start of a debug print messes up the
>>>>>>> output when sent to dmesg. The break actually appears between all the
>>>>>>> usefu prefix information and the actual string being printed. In this
>>>>>>> case, each block of data has a very clear start line and an extra
>>>>>>> delimeter is really not necessary. So don't do it.
>>>>>>
>>>>>> I guess the indention was to have one blank like between guc_ctb_snapshot_print() 'cmd[X]' lines and 'G2H CTB (all sizes in DW):'.
>>>>>> I agree with the change if you add one line breaker in the end of guc_ctb_snapshot_print().
>>>>>
>>>>> but 'cmd[XXX]' lines already have additional indent compared to 'H2G|G2H
>>>>> CTB' headers so IMO this extra separation line seems unnecessary:
>>>>>
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0: TLB invalidation time'd out,
>>>>> seqno=27, recv=26
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0: H2G CTB (all sizes in DW):
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      size: 1024
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      resv_space: 0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      head: 0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      tail: 572
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      space: 451
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      broken: 0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      head (memory): 568
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      tail (memory): 572
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      status (memory): 0x0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      cmd[568]: 0x80670003
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      cmd[569]: 0x20007000
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      cmd[570]: 0x0000001b
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      cmd[571]: 0x80000003
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0: G2H CTB (all sizes in DW):
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      size: 4096
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      resv_space: 1024
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      head: 210
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      tail: 0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      space: 3068
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      broken: 0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      head (memory): 210
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      tail (memory): 210
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      status (memory): 0x0
>>>>> [ ] xe 0000:4d:00.1: [drm] *ERROR* GT0:      g2h outstanding: 1
>>>>>
>>>>> and IMO printing empty lines is just a waste of paper ;)
>>>>
>>>> This same function is called during gpu hang capture(devcoredump) and the output of that file has new lines between sections for better reading.
>>>> This a old sample that I had around:
>>>>
>>>>
>>>>
>>>> **** Xe Device Coredump ****
>>>> kernel: 6.9.0-rc6-zeh-xe+
>>>> module: xe
>>>> Snapshot time: 1715877420.647211377
>>>> Uptime: 70684.605982665
>>>> PCI ID: 0x9a49
>>>> PCI revision: 0x01
>>>> GT id: 0
>>>> 	Type: main
>>>> 	IP ver: 0.0.0
>>>> 	CS reference clock: 19200000
>>>>
>>>> **** GuC CT ****
>>>> H2G CTB (all sizes in DW):
>>>> 	size: 1024
>>>> 	resv_space: 0
>>>> 	head: 978
>>>> 	tail: 599
>>>> 	space: 378
>>>> 	broken: 0
>>>> 	head (memory): 599
>>>> 	tail (memory): 599
>>>> 	status (memory): 0x0
>>>>
>>>
>>> but G2H is still part of the CT, no need for extra split
>>>
>>>> G2H CTB (all sizes in DW):
>>>> 	size: 4096
>>>> 	resv_space: 1024
>>>> 	head: 626
>>>> 	tail: 0
>>>> 	space: 3071
>>>> 	broken: 0
>>>> 	head (memory): 626
>>>> 	tail (memory): 626
>>>> 	status (memory): 0x0
>>>> 	g2h outstanding: 0
>>>>
>>>> GuC ID: 9
>>>
>>> hmm, so it looks that we need to fix two more places:
>>>
>>> 1) in xe_guc_exec_queue_snapshot_print() we need to drop spurious \n as
>>> well:
>>>
>>> - 	drm_printf(p, "\nGuC ID: %d\n", snapshot->guc.id);
>>> + 	drm_printf(p, "GuC ID: %d\n", snapshot->guc.id);
>>>
>>>
>>> 2) in xe_devcoredump_read() we need to add proper decorations/separation
>>> lines since xe_guc_exec_queue_snapshot_print() is not part of "GuC CT"
>>>
>>> - 	drm_printf(&p, "\n**** GuC CT ****\n");
>>> + 	drm_printf(&p, "\n**** GuC CT ****\n\n");
>>> 	xe_guc_ct_snapshot_print(coredump->snapshot.ct, &p);
>>> +	drm_printf(&p, "\n**** GuC submission ****\n\n");
>>> 	xe_guc_exec_queue_snapshot_print(coredump->snapshot.ge, &p);
>>>
>>> and maybe also
>>>
>>> - 	drm_printf(&p, "**** Xe Device Coredump ****\n");
>>> + 	drm_printf(&p, "\n**** Xe Device Coredump ****\n\n");
>>>
>>>> 	Name: rcs9
>>>> 	Class: 0
>>>> 	Logical mask: 0x1
>>>> 	Width: 1
>>>> 	Ref: 4
>>>> 	Timeout: 0 (ms)
>>>> 	Timeslice: 1000 (us)
>>>> 	Preempt timeout: 640000 (us)
>>>> 	HW Context Desc: 0x01480000
>>>> 	LRC Head: (memory) 280
>>>> 	LRC Tail: (internal) 552, (memory) 552
>>>> 	Start seqno: (memory) -125
>>>> 	Seqno: (memory) -126
>>>> 	[HWSP].length: 0x1000
>>>>
>>>> ....
>>>
>>> then the output will look like:
>>>
>>> **** Xe Device Coredump ****
>>>
>>> kernel: 6.9.0-rc6-zeh-xe+
>>> module: xe
>>> Snapshot time: 1715877420.647211377
>>> Uptime: 70684.605982665
>>> PCI ID: 0x9a49
>>> PCI revision: 0x01
>>> GT id: 0
>>> 	Type: main
>>> 	IP ver: 0.0.0
>>> 	CS reference clock: 19200000
>>>
>>> **** GuC CT ****
>>>
>>> H2G CTB (all sizes in DW):
>>> 	size: 1024
>>> 	resv_space: 0
>>> 	head: 978
>>> 	tail: 599
>>> 	space: 378
>>> 	broken: 0
>>> 	head (memory): 599
>>> 	tail (memory): 599
>>> 	status (memory): 0x0
>>> G2H CTB (all sizes in DW):
>>> 	size: 4096
>>> 	resv_space: 1024
>>> 	head: 626
>>> 	tail: 0
>>> 	space: 3071
>>> 	broken: 0
>>> 	head (memory): 626
>>> 	tail (memory): 626
>>> 	status (memory): 0x0
>>> 	g2h outstanding: 0
>>>
>>> **** GuC submission ****
>>>
>>> GuC ID: 9
>>> 	Name: rcs9
>>> 	Class: 0
>>> 	...
>>>
>>> would that work for you?
>>
>> In my opinion this looks worst.
>> Lets see what Rodrigo that implemented most of it thinks.
> 
> Well, I do believe that keeping empty lines in text helps
> our human brains to parse the text. For the same reason that
> I kept the empty line between this paragraph and the text I was
> responding to. But also for the same reason that our entire
> code has many empty lines between blocks.
> 
> Even for similar blocks in the same function we add some empty
> lines. It makes things easier to read.
> 
> Mainly this reason was the main reason that drove me to add
> these empty lines there. But that is just my personal preference,
> and I don't have stronger feelings here. I would be totally
> okay with the removal of these empty lines.

but to be clear, me and John don't want to remove empty lines between
different sections in devcoredump, so decorations like:

**** GuC CT ****
blah
blah

**** GuC submission ****
blah
blah

will still be there (and we can even improve that by adding additional
empty line to highlight the section):

**** GuC CT ****

blah
blah

**** GuC submission ****

blah
blah

I just want to remove empty lines from inside the single section, which
IMO could be treated as single paragraph, where you need use empty lines
to split sentences:

**** GuC CT ****

blah

blah

**** GuC submission ****

blah

blah

after all, in most cases, each devcoredump section matches what we
expose over debugfs, and in debugfs we usually don't print empty lines,
do we ?

> 
> Also, I have to agree that empty lines in dmesg are really ugly.
> 
> My biggest stronger feeling here would be on breaking current
> tools that are already parsing devcoredump. If this breaks the current
> tools I would prefer to avoid that breakage and keep things as is, or
> changing for dmesg, bug tweaking the devcoredump as Jose suggested.
> 
> Jose, would this break Mesa tools?

hmm, do we really want to claim that our printer outputs must now be
treated as (semi-)stable uabi?

> 
>>
>>>
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>>>>>> Reviewed-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>>>>>>> ---
>>>>>>> cherry-picked from [1]
>>>>>>>
>>>>>>> [1] https://patchwork.freedesktop.org/patch/598024/?series=134695&rev=2
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/xe/xe_guc_ct.c | 2 +-
>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>>> index 873d1bcbedd7..113e6348c6ac 100644
>>>>>>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>>>>>>> @@ -1489,7 +1489,7 @@ void xe_guc_ct_snapshot_print(struct xe_guc_ct_snapshot *snapshot,
>>>>>>>  		drm_puts(p, "H2G CTB (all sizes in DW):\n");
>>>>>>>  		guc_ctb_snapshot_print(&snapshot->h2g, p);
>>>>>>>  
>>>>>>> -		drm_puts(p, "\nG2H CTB (all sizes in DW):\n");
>>>>>>> +		drm_puts(p, "G2H CTB (all sizes in DW):\n");
>>>>>>>  		guc_ctb_snapshot_print(&snapshot->g2h, p);
>>>>>>>  
>>>>>>>  		drm_printf(p, "\tg2h outstanding: %d\n",
>>>>>>
>>>>
>>