[Beignet] How widespread is "Exec event...error...-5" (#98647 / #100639)?

Yang, Rong R rong.r.yang at intel.com
Thu Apr 27 06:49:42 UTC 2017


As we know, this issue was introduced by commit https://cgit.freedesktop.org/beignet/commit/?id=ff57cee0519db1287053c7c05a2cb4e9700d3334.

To clarify, this commit is not only for OpenCL 2.0; OpenCL 1.2 also needs it for NULL pointer checks in OpenCL kernels.
Consider a corner case:
__kernel void test(__global char* src, __global char* dst)
{
    if (src == NULL)
        return;
    ......
}
Because address 0 is legal in the GEN address space, src's GPU address may be 0 (i.e. NULL) even though src has been set with clSetKernelArg.
The most natural solution would be for the drm kernel driver not to allocate address 0, but unfortunately the kernel driver developers did not accept that.
So we try to occupy address 0 with a fake bo, using drm_intel_bo_set_softpin_offset.
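
To illustrate the idea, here is a minimal sketch in plain C of why reserving address 0 matters. It is a toy model only, not Beignet or libdrm code: the toy_gtt type, the first-fit allocator and all function names are hypothetical, and toy_pin_at only loosely stands in for pinning a fake bo at offset 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a GPU address space; not libdrm or Beignet code. */
#define TOY_PAGE_SIZE 4096u
#define TOY_NUM_PAGES 8u

typedef struct {
    bool used[TOY_NUM_PAGES];
} toy_gtt;

/* First-fit allocation: returns the GPU address of the first free page,
 * or UINT64_MAX if no page is free. */
static uint64_t toy_alloc(toy_gtt *gtt)
{
    for (uint32_t i = 0; i < TOY_NUM_PAGES; i++) {
        if (!gtt->used[i]) {
            gtt->used[i] = true;
            return (uint64_t)i * TOY_PAGE_SIZE;
        }
    }
    return UINT64_MAX;
}

/* Occupy a fixed address with a placeholder (assumed analogue of
 * soft-pinning a fake bo). Returns 0 on success, -1 if the address is
 * out of range or already occupied. */
static int toy_pin_at(toy_gtt *gtt, uint64_t addr)
{
    uint64_t page = addr / TOY_PAGE_SIZE;
    if (page >= TOY_NUM_PAGES || gtt->used[page])
        return -1;
    gtt->used[page] = true;
    return 0;
}

/* Mirrors the kernel's "if (src == NULL) return;" check: the kernel only
 * sees the GPU address of the buffer. */
static bool kernel_would_return_early(uint64_t src_gpu_addr)
{
    return src_gpu_addr == 0;
}

/* Address handed to the first user buffer, with or without address 0
 * reserved beforehand. */
static uint64_t first_user_address(bool reserve_zero)
{
    toy_gtt gtt = {0};
    if (reserve_zero)
        toy_pin_at(&gtt, 0);
    return toy_alloc(&gtt);
}
```

In this model, first_user_address(false) hands out address 0, so a valid buffer would be mistaken for NULL by the kernel's check; with first_user_address(true) the first user buffer lands on the next page and the check behaves as intended.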

The fix commit https://cgit.freedesktop.org/beignet/commit/?id=8b04f0be372da8eabdc93d6ae1b81a3c83cba284 is just a workaround.
It only ensures that taking address 0 causes no error when the device is created; it cannot ensure that later executions are error-free. However, we have not found such cases.

Hi, Rebecca,

     Have you found any cases that disabling HAS_BO_SET_SOFTPIN fixes but that still fail with commit
https://cgit.freedesktop.org/beignet/commit/?id=8b04f0be372da8eabdc93d6ae1b81a3c83cba284 applied?

Thanks,
Yang Rong
> -----Original Message-----
> From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of
> Rebecca N. Palmer
> Sent: Wednesday, April 26, 2017 6:42
> To: beignet at lists.freedesktop.org
> Subject: [Beignet] How widespread is "Exec event...error...-5" (#98647 /
> #100639)?
> 
> Debian 9 (stretch)/Ubuntu 17.04 (zesty) have beignet 1.3.0, libdrm
> 2.4.74/2.4.76 and Linux 4.9/4.10.
> 
> On some hardware (possibly all of Ivy Bridge and Haswell??), this does not
> work at all: attempting to run anything fails with
> 
> drm_intel_gem_bo_context_exec() failed: Device or resource busy
> Beignet: "Exec event 0xnnnn error, type is 4592, error staus is -5"
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=98647
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=860805
> 
> As the package maintainer, I'd like to fix this.  I am aware of two fixes, either
> of which works for me, but
> https://bugs.freedesktop.org/show_bug.cgi?id=100639 reports that neither
> of them is perfect:
> 
>   - The fix used in 1.4:
> https://cgit.freedesktop.org/beignet/commit/?id=8b04f0be372da8eabdc93d6ae1b81a3c83cba284
> 
> - Disable HAS_BO_SET_SOFTPIN: fixes more (but still not everything), but
> also disables some functionality (OpenCL 2.0).  This is probably why the bug
> only appears in recent Linux versions, and hence was missed when I tested
> the packages in a chroot on Linux 3.16: softpin was only introduced in Linux
> 4.5.
> 
> Has anyone other than its reporter seen #100639 (i.e. this error persisting
> after applying the 1.4 fix, particularly when using multiple OpenCL kernels
> such as in clFFT)?
> 
> Any other suggestions?
> 
> _______________________________________________
> Beignet mailing list
> Beignet at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/beignet

