[Spice-devel] bad primary surface and server crash after migration

Sun Jul 10 01:27:22 PDT 2011

On 07/04/2011 12:23 PM, Gerd Hoffmann wrote:
> On 07/04/11 10:51, Yonit Halperin wrote:
>> Hi Gerd,
>> I encountered several problems after migration, maybe you can help:
>>
>> 1) on qxl_pre_load, sometimes the command ring is not empty and when
>> handle_dev_destroy_surface (on hard reset), flush_all_qxl_commands is
>> called. When attempting to process a command we receive
>>
>> id 0, group 0, virt start 0, virt end ffffffffffffffff, generation 0,
>> delta 0
>> validate_virt: panic: virtual address out of range
>> virt=0x175f99c+0xbf slot_id=1 group_id=1
>> slot=0x0-0x0 delta=0x0
>>
>> Is it valid that the command ring is not empty? Maybe we shouldn't
>> process commands as long as worker->running is not set?
>
> Yes, that should fix it, otherwise you'll try to process commands before
> qxl fully restored the state (especially memory slots and surfaces)
> which will not work.

Hi, I pushed this fix. However, as Alon discovered, it causes another
problem when the VM is shut down and system_reset is then called.
Before the VM starts, qxl_hard_reset is called. It calls
destroy_all_surfaces, which attempts to flush the command rings but,
since the worker is not running, skips the flush. qxl_reset_state is
then called and crashes on:
assert(SPICE_RING_IS_EMPTY(&ram->cmd_ring));
assert(SPICE_RING_IS_EMPTY(&ram->cursor_ring));

What do you think we should do?
1) Remove the assert.
2) Track the worker state and assert only if the worker is running.
3) Have the worker process commands even when it is stopped, except
before loadvm. With the current interface I don't think we can
distinguish between the different reasons the worker is not running.
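
To make option 3 concrete, here is a rough sketch of what tracking the
reason for "not running" could look like. It is not the actual
spice-server or qemu code: WorkerState, Worker, worker_can_process,
flush_all_commands, rings_empty and hard_reset are all hypothetical
names, and the two counters merely stand in for ram->cmd_ring and
ram->cursor_ring.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: records why the worker is (not) running. */
typedef enum {
    WORKER_RUNNING,   /* VM running, commands processed normally        */
    WORKER_STOPPED,   /* VM stopped, but slots and surfaces are valid   */
    WORKER_PRE_LOAD   /* before loadvm: slots/surfaces not yet restored */
} WorkerState;

typedef struct {
    WorkerState state;
    int cmd_ring_pending;     /* stand-in for ram->cmd_ring    */
    int cursor_ring_pending;  /* stand-in for ram->cursor_ring */
} Worker;

/* Commands are safe to process unless device state is not restored. */
static bool worker_can_process(const Worker *w)
{
    return w->state != WORKER_PRE_LOAD;
}

static bool rings_empty(const Worker *w)
{
    return w->cmd_ring_pending == 0 && w->cursor_ring_pending == 0;
}

/* Drains both rings if allowed; returns the number of commands handled. */
static int flush_all_commands(Worker *w)
{
    int handled;

    if (!worker_can_process(w)) {
        return 0;  /* before loadvm: leave the rings untouched */
    }
    handled = w->cmd_ring_pending + w->cursor_ring_pending;
    w->cmd_ring_pending = 0;
    w->cursor_ring_pending = 0;
    return handled;
}

/* Assumed reset path: flush first, assert only when a flush was possible. */
static void hard_reset(Worker *w)
{
    flush_all_commands(w);
    if (worker_can_process(w)) {
        assert(rings_empty(w));
    }
}
```

With something like this, a reset after shutdown (WORKER_STOPPED)
drains the rings and the assert holds, while a reset on the pre_load
path (WORKER_PRE_LOAD) neither processes commands nor asserts.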

Cheers,
Yonit.