[Libva] High CPU usage in __i915_wait_request when encoding video
Xiang, Haihao
haihao.xiang at intel.com
Sun Nov 1 18:58:25 PST 2015
> [This is a resend -- I believe the first one was stuck infinitely in
> moderation, so I subscribed and re-sent.]
>
> Hi,
>
> I'm on a Haswell system, using the new Mesa 11 EGL/VAAPI interop to
> render directly into VAAPI buffers and get H.264 video out. I have
> three relevant threads:
>
> * Thread #1 drives OpenGL rendering into the right textures, and
>   sends frame numbers (and fences) into the encoding queue.
>
> * Thread #2 reads from the encoding queue, waits for the fence (so
>   that the GPU is done rendering) and asks VAAPI to encode the frame
>   through vaBeginPicture() etc., then sends the frame numbers into
>   the storage queue.
>
> * Thread #3 reads from the storage queue, waits for the frame to be
>   done encoding (through vaSyncSurface) and stores it to disk.
>
> Now, my problem is that thread #3 is using a lot of CPU; in
> particular, __i915_wait_request uses around 30% of the total CPU of
> my application (which uses almost 1.5 of the two cores when thermal
> constraints kick in and clock down my CPU). Looking at the stack
> trace from perf, this comes from the vaSyncSurface call, which as I
> understand it waits for the encoder to be done with the frame.
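If I follow the description, the wait in thread #3 boils down to
something like this (a minimal sketch only; struct frame_queue and
frame_queue_pop() are hypothetical stand-ins for your queue, and
error handling is omitted):

#include <va/va.h>

struct frame_queue;                                  /* hypothetical */
VASurfaceID frame_queue_pop(struct frame_queue *q);  /* hypothetical,
                                                        blocks until a
                                                        frame arrives */

/* Thread #3: pop a frame that was submitted for encoding, wait for
 * the encoder to finish with it, then store it to disk. */
void storage_thread(VADisplay dpy, struct frame_queue *q)
{
    for (;;) {
        VASurfaceID surface = frame_queue_pop(q);
        /* Blocks until all pending operations (rendering, encoding)
         * on this surface have completed. */
        vaSyncSurface(dpy, surface);
        /* ... map the coded buffer and write it out ... */
    }
}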
Are you sure all of the __i915_wait_request() time comes from
vaSyncSurface()?
>
> Is it possible to make it wait without busylooping? It seems like a
> strange way to use the CPU. (I'm fine with extra latency if need be.)
vaSyncSurface() just calls drm_intel_bo_wait_rendering() to make sure
all GPU operations on the surface are finished.
drm_intel_bo_wait_rendering() is implemented with the SET_DOMAIN
ioctl(), so I don't think it waits by busylooping.
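For reference, the core of that wait looks roughly like this (a
simplified sketch of the SET_DOMAIN path in libdrm; fd is the DRM
device fd, and error handling is omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Roughly what drm_intel_bo_wait_rendering() does: move the buffer
 * object to the GTT domain, which forces the kernel to wait for the
 * GPU to finish with it first. */
static int wait_rendering(int fd, uint32_t gem_handle)
{
    struct drm_i915_gem_set_domain set_domain = {
        .handle = gem_handle,
        .read_domains = I915_GEM_DOMAIN_GTT,
        .write_domain = I915_GEM_DOMAIN_GTT,
    };

    /* The kernel puts the caller to sleep (no userspace busy loop)
     * until the GPU is done with the buffer. */
    return ioctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &set_domain);
}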
BTW, are HW semaphores enabled on your system? You can
check /sys/kernel/debug/dri/0/i915_semaphore_status for the status.
>
> /* Steinar */