[PATCH v2 1/1] drm/panfrost: Add support for devcoredump

Wed Jun 22 01:54:55 UTC 2022

Hi Alyssa, thanks for the feedback.

If I don't answer directly to any of your concerns in this message, it's because
I applied your suggestions in v3 of the patch straight away.

On 21.06.2022 09:03, Alyssa Rosenzweig wrote:
> Hi Adrian,
> 
> Great work on the devcoredump support! This is really cool to see coming
> along, thank you! I've left a few notes below:
> 
> > +		if (panfrost_dump_registers[i] >= JS_HEAD_LO(0) &&
> > +		    panfrost_dump_registers[i] <= JS_CONFIG_NEXT(0))
> > +			js_as_offset = slot * 0x80;
> > +		else if (panfrost_dump_registers[i] >= AS_TRANSTAB_LO(0) &&
> > +			 panfrost_dump_registers[i] <= AS_STATUS(0))
> > +			js_as_offset = ((as_nr) << 6);
> 
> I'm not a fan of the magic numbers. Do you think it makes sense to add
> 
> 	#define JS_SLOT_STRIDE 0x80
> 	#define MMU_AS_SHIFT 0x6
> 
> in the appropriate places in panfrost_regs.h, reexpress the existing
> #defines in terms of those
> 
> 	#define JS_HEAD_LO(n) (JS_BASE + ((n) * JS_SLOT_STRIDE) + 0x00)
> 	...
> 	#define JS_FLUSH_ID_NEXT(n) (JS_BASE + ((n) * JS_SLOT_STRIDE) + 0x70)
> 	...
> 	#define MM_AS(as) (0x2400 + ((as) << MMU_AS_SHIFT)
> 
> and then use those here?
> 
> Also, drop the parans around (as_nr), this isn't a macro.
> 
> > +	/* Add in the active buffer objects */
> > +	for (i = 0; i < job->bo_count; i++) {
> > +		dbo = job->bos[i];
> > +		file_size += dbo->size;
> > +		n_bomap_pages += dbo->size >> PAGE_SHIFT;
> > +		n_obj++;
> > +	}
> 
> Strictly, I don't think this is right -- what happens if the CPU is
> configured to use 16K or 64K pages? -- however, that mistake is pretty
> well entrenched in panfrost.ko right now and it doesn't seem to bother
> anyone (non-4K pages on arm64 are pretty rare outside of fruit
> computers).
> 
> That said, out-of-context there looks like an alignment question. Could
> we add an assert for that, documenting the invariant:
> 
> 	WARN_ON(!IS_ALIGNED(dbo->size, PAGE_SIZE));
> 
> > +	iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN |
> > +			__GFP_NORETRY);
> > +	if (!iter.start) {
> > +		dev_warn(pfdev->dev, "failed to allocate devcoredump file\n");
> > +		return;
> > +	}
> > ...
> > +	memset(iter.hdr, 0, iter.data - iter.start);
> 
> Why are we using __GFP_NOWARN and __GFP_NORETRY? Why not plain vmalloc?
> 
> Also, why vmalloc instead of vzalloc? (Or adding __GFP_ZERO to the list
> of __vmalloc flags if __GFP_NOWARN/__GFP_NORETRY are really needed?) Are
> there relevant performance or security considerations?

I borrowed this code from Etnaviv a while ago and the same doubt struck me
then. My understanding of its intended behaviour is that because the dump file
might be huge, we don't want the memory manager to trigger the OOM killer and
annoy quite a few running processes because of a debug feature. Also since the
code already handles the situation when an allocation fails by refusing to
generate a dump, there's no need for the allocator to generate specific error
messages.

So I guess it boils down to 'if there's quite enough memory to allocate a huge
dump file, go ahead, otherwise don't reclaim any processes' pages for something
that isn't essential'.

I don't see much use for __GFP_ZERO in this case, because the dump file gets
memcpy'd with the contents of every single bo so whatever the original
contents of the memory were at the time of the allocation, they're overwritten
immediately.

> > +/* Definitions for coredump decoding in user space */
> > +#define PANFROSTDUMP_VERSION_1 1
> 
> I'm not a fan of an enum that just represents a number. Using the
> numbers directly means we can compare them in a natural way. Also, using
> a major/minor split like Steven suggested can help with semantic
> versioning.
> 
> Cheers,
> Alyssa

I've also rebased v3 on top of drm-misc-next and the compiler error because of
the removed panfrost_job structure member is gone.

Adrian Larumbe