[systemd-devel] [RFC 5/8] HACK1: optimize copy operations in kdbus_conn_queue_alloc()

Thu Jun 26 04:09:24 PDT 2014

Hello Kay,

On 06/25/2014 07:20 PM, Kay Sievers wrote:
> On Wed, Jun 25, 2014 at 11:13 AM, AKASHI Takahiro
> <takahiro.akashi at linaro.org> wrote:
>> This function copies the reply message directly into receiver's buffer
>> (pool slice). Looking into the implementation closely, however, there are
>> multiple occurrences of copy operations, kdbus_pool_slice_copy(), which
>> then executes costly prolog/epilog functions, say set_fs() and
>> write_begin/write_end(), on each entry/exit.
>>
>> This patch uses a temporary buffer on the stack to compose a message and
>> replace a costly function, kdbus_pool_slice_copy(), to memcpy() wherever
>> possible, although I have to further confirm that this change is actually
>> safe.
>
>> +#if KDBUS_HACK1
>> +       tmp = kmalloc(vec_data, GFP_KERNEL);
>
> You meant a malloced temporary buffer, not one allocated on the stack, right?

Oops. Shame on me!

> We need to handle larger vectors than kmalloc can handle, that's why
> we copy directly to the receiver, and to avoid 2 copies.

Right. That is what you mean by "bound to RAM," isn't it?
But my point here is that we may be able to optimize performance in some
specific cases, i.e. short messages, by eliminating the repeated
kdbus_pool_slice_copy() calls.
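
To illustrate the idea only, here is a minimal userspace sketch, not the
actual patch. It assumes a stand-in slow_copy() that represents a copy with a
fixed per-call cost (as kdbus_pool_slice_copy() has with its prolog/epilog);
slow_copy(), compose_message(), struct piece and SMALL_MSG_MAX are all
hypothetical names. Short messages are composed in a small temporary buffer
and pushed with one expensive call; larger ones still go piece by piece,
directly to the destination.

```c
#include <stddef.h>
#include <string.h>

#define SMALL_MSG_MAX 256	/* hypothetical threshold for "short" messages */

struct piece {
	const void *data;
	size_t len;
};

/* Counts how often the expensive copy path is entered. */
static unsigned long slow_copy_calls;

/*
 * Stand-in for a copy with a fixed per-call overhead. In this sketch the
 * overhead is only modelled by the counter above.
 */
static void slow_copy(char *dst, size_t off, const void *src, size_t len)
{
	slow_copy_calls++;
	memcpy(dst + off, src, len);
}

/*
 * Copy n pieces back to back into dst (capacity dst_len).
 * Returns the number of bytes written, or (size_t)-1 if they do not fit.
 */
static size_t compose_message(char *dst, size_t dst_len,
			      const struct piece *p, size_t n)
{
	char tmp[SMALL_MSG_MAX];
	size_t total = 0, off = 0, i;

	/* First pass: compute the total size and check that it fits. */
	for (i = 0; i < n; i++) {
		if (p[i].len > dst_len - total)
			return (size_t)-1;
		total += p[i].len;
	}

	if (total <= sizeof(tmp)) {
		/* Short message: compose locally, then one expensive copy. */
		for (i = 0; i < n; i++) {
			if (p[i].len == 0)
				continue;
			memcpy(tmp + off, p[i].data, p[i].len);
			off += p[i].len;
		}
		if (total)
			slow_copy(dst, 0, tmp, total);
		return total;
	}

	/* Large message: copy each piece directly, avoiding a second copy. */
	for (i = 0; i < n; i++) {
		if (p[i].len == 0)
			continue;
		slow_copy(dst, off, p[i].data, p[i].len);
		off += p[i].len;
	}
	return total;
}
```

With three small pieces, compose_message() enters slow_copy() once instead
of three times; above the threshold it falls back to one call per piece, so
large vectors are never copied twice.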

I guess that most apps (including systemd?) will be expected to transfer large data through
shmem, not via the message payload, and that only small "control" messages will be passed on.

> Did you find out what exactly in the call chain of our "open-coded
> write" shows up as expensive here?

No. The results mentioned in my study are all that I have for now, but
I think that we can still see a small improvement here.

Thanks,
-Takahiro AKASHI

> Kay
>