<div dir="ltr">I am working on a project that involves taking an OpenGL application and streaming its rendered frames as H.264 wrapped in a transport stream/RTP to a multicast destination.<br> <br>I'm working with an i7-3517UE embedded Ivy Bridge platform running an embedded Linux distribution with a 3.13 kernel and no display. (It's a server on a commercial aircraft that is intended to stream a moving map to a bunch of seatback displays.)<br> <br> I've gotten this functionality to the point where it works, but I have a feeling the performance is less than optimal.<br> <br><div>At the moment, the architecture involves a patch to Mesa that does a glReadPixels of the frame each time glXSwapBuffers is called. Also in this patch, once the glReadPixels has completed, we use libyuv to convert the RGB frame into YUY2, as we believe this is required by libva, and place the converted output in a buffer that libva can use directly. From there on, it's pretty standard ffmpeg bits to get it on the network.<br> <br></div><div>The two areas where I think there may be opportunity are grabbing the buffer from the GPU using glReadPixels and the color space conversion on the CPU.<br><br>For glReadPixels, we applied a patch to Mesa to speed up the process of moving the data by doing one large memcpy instead of a bunch of little ones. (Patch attached.) This resulted in a much faster glReadPixels, but GPU processing nonetheless halts while the CPU copies the memory.<br> <br>For the color space conversion, libyuv does a good job of using the platform's SIMD instructions, but it is nonetheless still using the CPU.<br><br>Is there a better way to get the frames from GPU memory space to libva? Maybe something involving zero-copy. (The application being used is a binary that I cannot change, so using special GL extensions or making any code changes to the application is not an option. 
Only changes to Mesa are possible.)<br> <br></div><div>Is there a better way to do the color space conversion, if it is in fact necessary? I wonder if this is something that could be done with a Mesa patch that has a shader do the work. Would that be faster and consume less bus bandwidth? What about libva? I see some VPP functionality, but the fact that it is referred to as "post" processing makes me think it is intended for use after decoding rather than as "pre" processing before an encode. Is it possible to do the color space conversion with the libva API?<br> <br></div>Any recommendations would be appreciated.<br><br>Also, for what it's worth, I've posted the Mesa patches at the following URLs:<br><br><a href="http://pastebin.com/XQS11iW4">http://pastebin.com/XQS11iW4</a><br><a href="http://pastebin.com/g00SHFJ1">http://pastebin.com/g00SHFJ1</a><br> <br>Regards,<br><br>Chris</div>