[Libva] High CPU usage in __i915_wait_request when encoding video

Xiang, Haihao haihao.xiang at intel.com
Sun Nov 1 18:58:25 PST 2015


> [This is a resend -- I believe the first one was stuck indefinitely
> in moderation, so I subscribed and re-sent.]
> 
> Hi,
> 
> I'm on a Haswell system, using the new Mesa 11 EGL/VAAPI interop to
> render directly into VAAPI buffers and get H.264 video out. I have
> three relevant threads:
> 
>   * Thread #1 drives OpenGL rendering into the right textures, and
>     sends frame numbers (and fences) into the encoding queue.
> 
>   * Thread #2 reads from the encoding queue, waits for the fence (so
>     that the GPU is done rendering) and asks VAAPI to encode the
>     frame through vaBeginPicture() etc., then sends the frame numbers
>     into the storage queue.
> 
>   * Thread #3 reads from the storage queue, waits for the frame to be
>     done encoding (through vaSyncSurface) and stores it to disk.
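
If I follow the pipeline correctly, thread #3 is presumably doing
something like the sketch below. This is only an illustration, not
your actual code: struct frame, queue_pop(), storage_queue, va_dpy
and outfile are hypothetical placeholders.

    #include <stdio.h>
    #include <va/va.h>

    /* Hypothetical record produced by thread #2 for each frame. */
    struct frame {
        VASurfaceID surface;    /* surface the frame was rendered to */
        VABufferID  coded_buf;  /* coded buffer written by the encoder */
    };

    /* Thread #3: wait for each frame's encode to finish, then write
     * the coded data to disk. queue_pop()/storage_queue stand in for
     * the poster's own queue. */
    void storage_thread(VADisplay va_dpy, FILE *outfile)
    {
        for (;;) {
            struct frame f = queue_pop(&storage_queue);

            /* Blocks until all GPU work on the surface is finished. */
            vaSyncSurface(va_dpy, f.surface);

            /* Map the coded buffer and write each segment out. */
            VACodedBufferSegment *seg;
            vaMapBuffer(va_dpy, f.coded_buf, (void **)&seg);
            for (; seg != NULL; seg = seg->next)
                fwrite(seg->buf, 1, seg->size, outfile);
            vaUnmapBuffer(va_dpy, f.coded_buf);
        }
    }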
> 
> Now, my problem is that thread #3 is using a lot of CPU; in
> particular, __i915_wait_request uses around 30% of the total CPU of
> my application (which uses almost 1.5 of the two cores when thermal
> constraints kick in and clock down my CPU). Looking at the stack
> trace from perf, this comes from the vaSyncSurface call, which as I
> understand it waits for the encoder to be done with the frame.

Are you sure all of the __i915_wait_request() time comes from vaSyncSurface()?
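
One way to check is to profile just that thread with call graphs (the
thread ID is illustrative; stop recording with Ctrl-C):

    perf record -g -t <tid-of-thread-3>
    perf report

and see whether all of the __i915_wait_request() samples really sit
under vaSyncSurface() or come from somewhere else.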

> 
> Is it possible to make it wait without busylooping? It seems like a
> strange way to use the CPU. (I'm fine with extra latency if need be.)

vaSyncSurface() just calls drm_intel_bo_wait_rendering() to make sure
all GPU operations on the surface are finished.
drm_intel_bo_wait_rendering() is implemented with the SET_DOMAIN
ioctl, so I don't think it waits by busylooping.
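
Roughly, it boils down to something like this (paraphrased from
libdrm's intel_bufmgr_gem.c, not the exact source; fd and bo_handle
stand for the DRM fd and the GEM handle of the surface's buffer):

    #include <xf86drm.h>
    #include <i915_drm.h>

    /* Moving the buffer to the GTT domain forces the kernel to wait
     * for all outstanding GPU work on it; the caller sleeps inside
     * the kernel rather than spinning in userspace. */
    struct drm_i915_gem_set_domain set_domain = {
        .handle        = bo_handle,
        .read_domains  = I915_GEM_DOMAIN_GTT,
        .write_domain  = I915_GEM_DOMAIN_GTT,
    };
    drmIoctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &set_domain);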

BTW, is the HW semaphore enabled on your system? You can check
/sys/kernel/debug/dri/0/i915_semaphore_status for the status.
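
For example (debugfs is usually mounted at /sys/kernel/debug and
typically requires root):

    # cat /sys/kernel/debug/dri/0/i915_semaphore_status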

> 
> /* Steinar */

