KFD event handling questions

Oded Gabbay oded.gabbay at gmail.com
Sun Sep 24 10:11:35 UTC 2017


On Sat, Sep 23, 2017 at 3:44 AM, Thangirala, Hari
<Hari.Thangirala at amd.com> wrote:
> Hi Felix,
>
> Event IDs reserved for debug events was an artifact of a debugger design. We changed our tools strategy so those limitations do not exist anymore. There's no need to reserve special event-IDs for debugger. 4096 is good enough and the new debug trap handler will use events from that pool.
>
> Thanks
> Hari
>
>
>
> -----Original Message-----
> From: Kuehling, Felix
> Sent: Thursday, September 21, 2017 9:36 PM
> To: amd-gfx at lists.freedesktop.org; Oded Gabbay <oded.gabbay at gmail.com>; Thangirala, Hari <Hari.Thangirala at amd.com>
> Subject: KFD event handling questions
>
> [Hari, can you comment on the last paragraph below regarding the events limit? Would it hurt the runtime much to lose 8 signals out of 4096?]
>
> Hi Oded,
>
> I'm looking into upstreaming some event handling enhancements. I see that you actually worked on increasing the number of events to 4096 and working around some problems with debug events when you were at AMD.
>
> As I'm looking at this, I'm confused by the code that allocates signal pages. It allocates a single signal page that's big enough to accommodate all signals. But allocate_signal_page looks like it was meant to allocate smaller signal pages incrementally that are accumulated in a kfd_process::signal_event_pages list. However, this feature never gets used. It would also break user mode because the Thunk expects to be able to map a single signal page for all its signals.
>
> Therefore I'm inclined to simplify allocate_signal_page and related code to only deal with a single allocation. Do you have any concerns or objections about that?

Feel free to simplify that code. At the time I believe we didn't know
how many events will be needed so we prepared a more generic mechanism
for allocation. I guess we are now smarter (after almost 4 years) so
simplification seems like the right thing to do if it matches the rest
of the stack.

>
> Somewhat related to that, you also added a workaround that increases the signal number to 4096+512 to accommodate 8 debug events while maintaining page alignment. However, the signal page is allocated with __get_free_pages, which allocates in powers of 2. So in fact it needs to allocate 8192 signals, wasting about 28KB per process. It's not a lot of waste. But an alternative would be to instead sacrifice 8 user signals.
> The runtime has a fallback for when it runs out of signals, and the difference between 4096 and 4088 seems insignificant. What do you think?

I believe Hari already answered that and I don't have anything more to add.

Thanks,
Oded

>
> Regards,
>   Felix
>
> --
> F e l i x   K u e h l i n g
> PMTS Software Development Engineer | Vertical Workstation/Compute
> 1 Commerce Valley Dr. East, Markham, ON L3T 7X6 Canada
> (O) +1(289)695-1597
>    _     _   _   _____   _____
>   / \   | \ / | |  _  \  \ _  |
>  / A \  | \M/ | | |D) )  /|_| |
> /_/ \_\ |_| |_| |_____/ |__/ \|   facebook.com/AMD | amd.com
>


More information about the amd-gfx mailing list