[Spice-devel] UMS memory management

Søren Sandmann sandmann at cs.au.dk
Wed Mar 27 01:07:02 PDT 2013


Hello,

The following are some notes on what I am planning to do for UMS memory
management. It basically amounts to a rewrite of qxl-surface-ums.c.

Comments appreciated, especially regarding eviction policy, where my
current plan is very naive: Just keep kicking out the least recently
used surface until there is room.

Søren


In the current system, a qxl_surface_t corresponds precisely to a
device surface; i.e., it always has an ID, so there can be only
1024 of them at any time. In the new system, there can be an unlimited
number of qxl_surfaces, so they no longer correspond to device
surfaces.

In the new system, a surface will be in one of the following states:

   - live in video memory
   - cached in video memory
   - live in host memory

     "live in video memory" (IN_VIDEO):
     - has associated ID
     - has associated pixmap
     - pixman image may not exist

     "cached in video memory" (CACHED):
     - has associated ID
     - does not have associated pixmap
     - pixman image may not exist

     "live in host memory" (IN_HOST):
     - no associated ID
     - has associated pixmap
     - pixman image may not exist
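
The three states and their invariants could be written down roughly as
follows. This is only a sketch: the type and field names
(surface_state_t, surface_t, NO_ID, surface_invariants_hold) are made up
for illustration and are not taken from the existing driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the notes describe the states only in prose. */
typedef enum {
    SURFACE_IN_VIDEO,  /* live in video memory   */
    SURFACE_CACHED,    /* cached in video memory */
    SURFACE_IN_HOST    /* live in host memory    */
} surface_state_t;

#define NO_ID UINT32_MAX   /* assumed marker for "no device surface ID" */

typedef struct {
    surface_state_t state;
    uint32_t id;        /* device surface ID, or NO_ID */
    void *pixmap;       /* associated pixmap, or NULL */
    void *host_image;   /* pixman image; may be NULL in every state */
} surface_t;

/* True if the ID/pixmap combination matches the surface's state. */
bool surface_invariants_hold(const surface_t *s)
{
    bool has_id = (s->id != NO_ID);
    bool has_pixmap = (s->pixmap != NULL);

    switch (s->state) {
    case SURFACE_IN_VIDEO: return has_id && has_pixmap;
    case SURFACE_CACHED:   return has_id && !has_pixmap;
    case SURFACE_IN_HOST:  return !has_id && has_pixmap;
    }
    return false;
}

/* Debug helper: abort if a surface is in an inconsistent state. */
void surface_check(const surface_t *s)
{
    assert(surface_invariants_hold(s));
}
```

A check like surface_check() could be called at the end of every
transition while the rewrite is being debugged.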

The following state changes are possible:

    IN_HOST => IN_VIDEO
    CACHED => IN_VIDEO
    IN_VIDEO => IN_HOST
    IN_VIDEO => CACHED
    IN_VIDEO => (destroyed)
    IN_HOST => (destroyed)
    CACHED => (destroyed)

Code must exist to handle each of these transitions.
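
As a sketch, the list of legal transitions can be captured in one
predicate. The names here are hypothetical, and SURFACE_DESTROYED is an
assumed pseudo-state standing for the transitions above that have no
target state, read here as destruction of the surface.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SURFACE_IN_VIDEO,
    SURFACE_CACHED,
    SURFACE_IN_HOST,
    SURFACE_DESTROYED   /* assumption: models "X =>" with no target */
} surface_state_t;

/* True if the notes list from => to as a possible state change. */
bool transition_allowed(surface_state_t from, surface_state_t to)
{
    switch (from) {
    case SURFACE_IN_HOST:
    case SURFACE_CACHED:
        return to == SURFACE_IN_VIDEO || to == SURFACE_DESTROYED;
    case SURFACE_IN_VIDEO:
        return to == SURFACE_IN_HOST || to == SURFACE_CACHED ||
               to == SURFACE_DESTROYED;
    case SURFACE_DESTROYED:
        return false;   /* nothing leaves the destroyed state */
    }
    return false;
}

/* Change state, aborting on a transition that is not in the list. */
void surface_set_state(surface_state_t *current, surface_state_t next)
{
    assert(transition_allowed(*current, next));
    *current = next;
}
```

Note that there is no direct IN_HOST => CACHED or CACHED => IN_HOST
change in the list, so the predicate rejects both.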

- Rendering to a surface
  - IN_HOST => IN_VIDEO

- prepare_access:
  - IN_VIDEO => IN_HOST (but see below)

- Leave VT:
  for all surfaces s that are IN_VIDEO:
  - s.(IN_VIDEO => IN_HOST)
  - add s to list of evacuated surfaces

- Enter VT:
  for all evacuated surfaces s, s.(IN_HOST => IN_VIDEO)
  (Or maybe simply rely on this happening when rendering begins?)
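
A minimal sketch of the Leave VT / Enter VT handling, under a few
assumptions: the names are hypothetical, a per-surface flag stands in
for the list of evacuated surfaces, the actual data movement is left
out, and CACHED surfaces are left alone because the notes do not say
what should happen to them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SURFACE_IN_VIDEO, SURFACE_CACHED, SURFACE_IN_HOST } surface_state_t;

typedef struct {
    surface_state_t state;
    bool evacuated;   /* stands in for membership in the evacuated list */
} surface_t;

/* Leave VT: move every IN_VIDEO surface to host memory and remember it.
 * Returns the number of surfaces evacuated. */
size_t leave_vt(surface_t *surfaces, size_t n)
{
    size_t moved = 0;
    for (size_t i = 0; i < n; i++) {
        if (surfaces[i].state == SURFACE_IN_VIDEO) {
            surfaces[i].state = SURFACE_IN_HOST;   /* IN_VIDEO => IN_HOST */
            surfaces[i].evacuated = true;
            moved++;
        }
    }
    return moved;
}

/* Enter VT: move the evacuated surfaces back. Returns how many moved. */
size_t enter_vt(surface_t *surfaces, size_t n)
{
    size_t moved = 0;
    for (size_t i = 0; i < n; i++) {
        if (surfaces[i].evacuated) {
            assert(surfaces[i].state == SURFACE_IN_HOST);
            surfaces[i].state = SURFACE_IN_VIDEO;  /* IN_HOST => IN_VIDEO */
            surfaces[i].evacuated = false;
            moved++;
        }
    }
    return moved;
}
```

If the lazy alternative is chosen instead, enter_vt() would do nothing
and the surfaces would move back when they are next rendered to.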


Notes regarding prepare_access():

  - This is historically very performance sensitive; however with
    Render support it may not be so much anymore. Hence, initially it
    will be done as simply an IN_VIDEO => IN_HOST transition.

  - If it still is a performance issue, the existing optimizations may
    be considered:

    * For readonly access, don't copy back to video memory
    * Only copy those parts of the surface that actually need to be
      accessed.

    Both of these optimizations rely on the IN_HOST state being
    transient as it is in the current system.

    For the first one, a similar effect could be achieved by having a
    new state where the surface doesn't have an ID, but the video
    memory is still allocated. Yonit thinks this should work. I
    suspect actually attempting it may turn up unexpected issues.

    To do the second, a surface would have to be put into a new state
    where it has an ID, but its actual content is partially in video
    memory and partially in host memory. This would be a significant
    complication, but not impossible to do.


Naming

In the existing system, the word cache is used for two different
concepts. One is the singleton object that manages all the
surfaces. The other is a small cache of unused surfaces that are
allocated in video memory. This cache is important for workloads that
create and destroy surfaces at a high rate, where the overhead of
submitting a command per pixmap creation would be a problem.

In the new system the singleton object will be called memory_manager
and the word 'cached' will be reserved for unused surfaces that are
allocated in video memory.


Details:

** IN_VIDEO => IN_HOST

- update_area is issued for the surface in question
- data is copied to pixman image
- state is set to IN_HOST
- destroy command is issued for the ID

** IN_HOST => IN_VIDEO

- an ID and video memory are allocated (see below)
- a qxl_image_t is allocated for host memory
- a draw command is issued

** Rendering

- Perform an IN_HOST => IN_VIDEO transition on all involved surfaces
- Check that all involved surfaces are now IN_VIDEO (i.e., that the
  transition didn't fail). Bail if they aren't.
- Move involved surfaces to front of LRU list
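
The LRU bookkeeping could be as simple as an intrusive doubly linked
list. The following is a sketch with made-up names (lru_list_t,
lru_touch, lru_victim), not code from the driver.

```c
#include <assert.h>
#include <stddef.h>

typedef struct surface surface_t;
struct surface {
    surface_t *prev, *next;   /* LRU links; both NULL when unlinked */
    int id;
};

typedef struct {
    surface_t *head;   /* most recently used */
    surface_t *tail;   /* least recently used */
} lru_list_t;

/* Remove s from the list; harmless if s is not on the list. */
void lru_unlink(lru_list_t *l, surface_t *s)
{
    if (s->prev)
        s->prev->next = s->next;
    else if (l->head == s)
        l->head = s->next;

    if (s->next)
        s->next->prev = s->prev;
    else if (l->tail == s)
        l->tail = s->prev;

    s->prev = s->next = NULL;
}

/* Move (or add) s to the front of the list. */
void lru_touch(lru_list_t *l, surface_t *s)
{
    assert(l != NULL && s != NULL);
    lru_unlink(l, s);
    s->next = l->head;
    if (l->head)
        l->head->prev = s;
    else
        l->tail = s;
    l->head = s;
}

/* The surface the naive policy would evict next, or NULL if empty. */
surface_t *lru_victim(const lru_list_t *l)
{
    return l->tail;
}
```

Rendering would call lru_touch() on each involved surface, and the
eviction code would take lru_victim() until there is room.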

** Allocate ID and video memory

- If a suitable cached surface is available, use that, else

  1. Get ID

     - If free ID exists, use that
     - Else, if cached surfaces exist, kill the least recently used
       one
     - Else, do IN_VIDEO => IN_HOST of least recently used non-cached
       surface

  2. Allocate memory

     - Call malloc()
     - If that fails, then destroy a cached surface, wait, GC, then try
       again
     - If no cached surfaces, do IN_VIDEO => IN_HOST of least recently
       used non-cached surface, wait, GC, then try again
     - If that failed, issue oom() and garbage collect
     - If that failed, give up

After issuing a destroy, need to wait for the command ring to go idle,
then garbage collect. Hopefully this will trigger a recycle call,
which will then free the associated memory.
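
The eviction order above can be sketched as a toy model. Everything
here is an assumption made for illustration: the names
(memory_manager_t, slot_t, mm_alloc_surface), the fixed slot table
standing in for the ID space, and the byte counter standing in for the
video memory allocator. release_slot() stands in for "destroy, wait for
idle, garbage collect", and for an IN_VIDEO victim it also stands in for
the IN_VIDEO => IN_HOST copy. Reuse of a suitable cached surface and
the final oom() step are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SURFACES 8   /* stands in for the 1024 device IDs */

typedef enum { SLOT_FREE, SLOT_CACHED, SLOT_IN_VIDEO } slot_state_t;

typedef struct {
    slot_state_t state;
    size_t size;          /* bytes of video memory held */
    unsigned last_used;   /* larger means more recently used */
} slot_t;

typedef struct {
    slot_t slots[MAX_SURFACES];
    size_t capacity;      /* total video memory in bytes */
    size_t used;
    unsigned clock;
} memory_manager_t;

/* Index of the least recently used slot in the given state, or -1. */
static int find_lru(const memory_manager_t *mm, slot_state_t state)
{
    int best = -1;
    for (int i = 0; i < MAX_SURFACES; i++) {
        if (mm->slots[i].state != state)
            continue;
        if (best < 0 || mm->slots[i].last_used < mm->slots[best].last_used)
            best = i;
    }
    return best;
}

/* Stand-in for: issue destroy, wait for the ring to go idle, GC. */
static void release_slot(memory_manager_t *mm, int i)
{
    assert(mm->slots[i].state != SLOT_FREE);
    mm->used -= mm->slots[i].size;
    mm->slots[i].state = SLOT_FREE;
    mm->slots[i].size = 0;
}

/* Get an ID (slot) and memory, evicting cached surfaces first and
 * live ones second. Returns the slot index, or -1 on failure. */
int mm_alloc_surface(memory_manager_t *mm, size_t size)
{
    if (size > mm->capacity)
        return -1;   /* can never fit; don't evict anything */

    for (;;) {
        int slot = find_lru(mm, SLOT_FREE);
        if (slot >= 0 && mm->capacity - mm->used >= size) {
            mm->slots[slot].state = SLOT_IN_VIDEO;
            mm->slots[slot].size = size;
            mm->slots[slot].last_used = ++mm->clock;
            mm->used += size;
            return slot;
        }

        int victim = find_lru(mm, SLOT_CACHED);
        if (victim < 0)
            victim = find_lru(mm, SLOT_IN_VIDEO);
        if (victim < 0)
            return -1;   /* nothing left to evict: give up */
        release_slot(mm, victim);
    }
}
```

The loop terminates because every pass either succeeds or frees one
slot, and there are only MAX_SURFACES slots to free.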

** Newly allocated surfaces

- Start out in host memory because that is simplest and because Yonit
  says it's common for surfaces to be used that never become visible
  to the client.

