[Beignet] [PATCH 2/2] Fix DRM Memory leak BUG

Pan, Xiuli xiuli.pan at intel.com
Mon Sep 21 21:51:41 PDT 2015


I agree about the complex event handlings, and maybe we should do that update somewhere else, but the leaked event is newed from clEnqueueNDRangeKernel and passed to user and it is a very rare usage. As it is not a user event, and the only chance for us to update the event status  in the last_event is here. If the last_event is completed it will be deleted from the event update function, otherwise it will be lost and cause leak, so we need to force it updating here. Also if the event is completed before that, the last_event should be NULL. I think if we did it like gpgpu in a linked list, maybe we could not do blocking update, but now we may do a block update to make other things easier in these cases. We should have more tests about the events, but now the memory leak caused by rare usage of event is now be fixed.

The rare usage of event from the PSieve-CUDA case:
  checkCUDAErr(clEnqueueReadBuffer(commandQueue,
        d_factor_found,
        CL_TRUE,
        0,
        cthread_count*sizeof(cl_uint),
        factor_found,
        0,
        NULL,
        &dev_read_event), "Retrieving results");
//It get the ReadBuffer event here, as well as NDRangeKernel event.

  checkCUDAErr(clWaitForEvents(1, &dev_read_event), "Waiting for results read. (clWaitForEvents)");
  checkCUDAErr(clReleaseEvent(dev_read_event), "Release event object 3. (clReleaseEvent)");

//Then wait and release the event, it is very different from our usage.

I will have a deep look about this usage path. Thank you for your advice.



-----Original Message-----
From: Zhigang Gong [mailto:zhigang.gong at linux.intel.com] 
Sent: Tuesday, September 22, 2015 10:30 AM
To: Pan, Xiuli <xiuli.pan at intel.com>
Cc: beignet at lists.freedesktop.org
Subject: Re: [Beignet] [PATCH 2/2] Fix DRM Memory leak BUG

Nice catch! But may not be a correct fix.
We don't need to do the blocking event updating all the time.
We only need to do that when there is potential possibility to leak a event. If a event has a user call back function registered is such a case, and my best guessing here is:
one event in the wait list of the last event has user call back function registered and has been missed.

We may need to check all the wait list of the last event before we do a locking event updating here.

Thanks,
Zhigang Gong.

On Mon, Sep 21, 2015 at 04:41:52PM +0800, Pan Xiuli wrote:
> This bug is cased by event flush, we should not only run usr event but 
> also event made by enqueue functions.
> If the event haven't been completed before it is been overwite in the 
> last_event, the related gpgpu buffer will not be unreference. And will 
> cause all related drm buffers unreference and thenw leak.
> 
> Signed-off-by: Pan Xiuli <xiuli.pan at intel.com>
> ---
>  src/cl_command_queue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/cl_command_queue.c b/src/cl_command_queue.c index 
> 4b92311..fd1d613 100644
> --- a/src/cl_command_queue.c
> +++ b/src/cl_command_queue.c
> @@ -261,7 +261,7 @@ cl_command_queue_flush(cl_command_queue queue)
>    // the event any more. If we don't do this here, we will leak that event
>    // and all the corresponding buffers which is really bad.
>    cl_event last_event = get_last_event(queue);
> -  if (last_event && last_event->user_cb)
> +  if (last_event)
>      cl_event_update_status(last_event, 1);
>    cl_event current_event = get_current_event(queue);
>    if (current_event && err == CL_SUCCESS) {
> --
> 2.1.4
> 
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/beignet


More information about the Beignet mailing list