[Spice-devel] [PATCH] how can i trace monitor change (etc) events

David Mansfield spice at dm.cobite.com
Wed May 7 14:41:26 PDT 2014


On 05/02/2014 09:05 AM, Marc-André Lureau wrote:
> Hi
>
>> FYI: I have been running with the attached patch (not the inline above)
>> to spice-gtk for one week now, and so has my colleague.  Dual monitor
>> works perfectly.  There is one other crash scenario (xorg in guest
>> crashes and won't restart) which sometimes happens consistently upon
>> logging in.  This can be "fixed" by removing the file
>> ~/.config/monitors.xml (in the guest).
>>
>> I recommend strongly that the attached fix be included. I understand it
>> is a "band-aid" but so is "fixme: goto whole" (see source code) which is
>> broken by definition.
> You are basically ignoring the surface_id from the monitor config with this patch. This isn't helping much, most probably this point out to invalid config, so I'd rather not do that and keep displaying the whole surface instead. Did you manage to trace back to where that surface id was generated? That's what I would do. It could be corrupted memory...
I have done a bit of poking into this.  I added this into qxl.ko's 
qxl_object.c:

--- qxl_object.c~	2014-03-30 23:40:15.000000000 -0400
+++ qxl_object.c	2014-05-07 17:05:24.989185760 -0400
@@ -311,6 +311,7 @@
  		ret = qxl_hw_surface_alloc(qdev, bo, NULL);
  		if (ret)
  			return ret;
+		DRM_ERROR("qxl_bo_check_id: %p %d %d", bo, (int)bo->is_primary, (int)bo->surface_id);
  	}
  	return 0;
  }

And lo-and-behold, the system keeps creating new "primary" surfaces, and 
not one of the resulting surface_id is '0'.

[   25.393840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 
ffff880209670000 1 1
[   86.104660] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 
ffff8801fb5dcc00 1 2                <- this is the one that 
remote-viewer sees with surface_id #2
[  118.560840] [drm:qxl_bo_check_id] *ERROR* qxl_bo_check_id: 
ffff8801fb748800 1 3


I think that the entire concept of is_primary vs surface_id=0 is very 
broken in this code, and that sometimes the "real" surface_id (which is 
not 0) is "leaking out" into the protocol. These id's are allocated by 
idr.  Or perhaps the #2 surface is allocated as a primary before the #1 
has been freed, and somewhere a check on "only one primary allowed" is 
hit, and so the primary gets toggled off. (surface #3 is when I resized 
the second monitor and can be ignored).

>
>> I'm going to open a bug now on bugzilla with all my collected
>> information - I can proceed no further with this without any assistance.
> Yes, please.
Someone opened a freedesktop.org bug for this.  Should I pile onto that 
one or do you want a redhat or gnome.org bugzilla?



More information about the Spice-devel mailing list