<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Mar 28, 2017 at 5:14 PM, Frediano Ziglio <span dir="ltr"><<a href="mailto:fziglio@redhat.com" target="_blank">fziglio@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">><br> > The main goal is to reduce time in the GDI callback (PresentDisplayOnly) and<br> > avoid<br> > the situation where processing takes more than 2 seconds, triggering the class driver<br> > watchdog.<br> ><br> > 1. We offload sending of drawable commands to a separate thread (waiting for<br> > room in the command ring<br> > may take unpredictable time)<br> > 2. In case the usage of device memory is high, allocation of a bitmap for<br> > the rectangle to draw<br> > also may take unpredictable time (note that a single full screen redraw<br> > requires >3 MB of space)<br> > So, we make drawable object allocations from the GDI callback fast and<br> > non-forced and in case they<br> > fail we provide alternate allocation from the OS heap<br> > 3. Before sending a drawable command, the thread shall take care of the objects<br> > that were allocated from<br> > the OS heap and allocate them from device memory (now we are not limited by<br> > time)<br> > 4. 
We still do not enable VSync automatically, but this can be done for<br> > evaluation/testing purposes via<br> > a setting in the driver's registry<br> ><br> <br> </span>A big issue of this approach is that it does not entirely solve<br> the problem but moves it.<br></blockquote><div><br></div><div>We can't spend too much time waiting for memory in the OS callback.</div><div>In our own thread, we can wait as long as we need.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Instead of waiting for device memory we fall back to system memory<br> and when we can send commands we copy back to device memory and<br> send it, increasing system memory usage and memory copies.<br></blockquote><div><br></div><div>Yes, that's correct. Our processing in the OS callback must be fast, and I do not see how we can</div><div>solve this without either using host memory or skipping operations.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> However I cannot see any limitation so potentially we'll fill<br> system memory until the guest crashes. And if we add a limitation<br> potentially this will just move the hang to later.<br></blockquote><div><br></div><div>We allocate pageable memory, which is much less limited than non-pageable memory,</div><div>and the typical amount of available pageable memory is > 1G.</div><div>In a LAN environment, we rarely need to allocate host memory.</div><div>Even with a long end-to-end delay under heavy scenarios, I did not see a huge amount of outstanding allocations.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> As far as I know we always have available 3 times the amount<br> of memory of the maximum frame buffer, so in theory there is plenty of<br> space. 
But trying to see the drawing from the client I can see<br> a lot of redrawing of the same area again and again so maybe<br> this is causing the issues with the memory.<br> Maybe we can find a smarter way to solve this memory issue?<br></blockquote><div><br></div><div>I would suggest looking for possible improvements later.</div><div>I have some ideas, but they do not invalidate the current solution.</div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div class="HOEnZb"><div class="h5"><br> > Yuri Benditovich (12):<br> > qxl-wddm-dod: Prepare system thread for rendering<br> > qxl-wddm-dod: Use rendering offload thread<br> > qxl-wddm-dod: Introduce TimeMeasurement class for timing debugging<br> > qxl-wddm-dod: Debug warning on long wait on event<br> > qxl-wddm-dod: Reduce amount of unnecessary printouts<br> > qxl-wddm-dod: Registry-based control over VSync<br> > qxl-wddm-dod: Set VSync indication period to 200ms<br> > qxl-wddm-dod: Prepare for failure to allocate memory<br> > qxl-wddm-dod: PutBytesAlign supports non-forced allocation<br> > qxl-wddm-dod: Optimize allocation of memory chunks<br> > qxl-wddm-dod: Implement non-forced bitmap allocation<br> > qxl-wddm-dod: Non-forced memory allocations with VSync<br> ><br> > qxldod/QxlDod.cpp | 581<br> > ++++++++++++++++++++++++++++++<wbr>+++++++++++++++---------<br> > qxldod/QxlDod.h | 87 +++++++-<br> > qxldod/driver.cpp | 35 ++++<br> > 3 files changed, 606 insertions(+), 97 deletions(-)<br> ><br> <br> </div></div><span class="HOEnZb"><font color="#888888">Frediano<br> </font></span></blockquote></div><br></div></div>