[Spice-devel] UMS memory management
Søren Sandmann
sandmann at cs.au.dk
Wed Mar 27 01:07:02 PDT 2013
Hello,
The following are some notes on what I am planning to do for UMS memory
management. It basically amounts to a rewrite of qxl-surface-ums.c.
Comments appreciated, especially regarding eviction policy, where my
current plan is very naive: Just keep kicking out the least recently
used surface until there is room.
Søren
In the current system, a qxl_surface_t corresponds precisely to a
device surface; i.e., it always has an ID, so there can be only
1024 of them at any time. In the new system, there can be an unlimited
number of qxl_surfaces, so they no longer correspond to device
surfaces.
In the new system, a surface will be in one of the following states:
- live in video memory
- cached in video memory
- live in host memory
"in video memory":
- has associated ID
- has associated pixmap
- pixman image may not exist
"cached in video memory":
- has associated ID
- does not have associated pixmap
- pixman image may not exist
"in host memory":
- no associated ID
- has associated pixmap
- pixman image may not exist
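
As a rough illustration of the three states listed above, here is a
minimal C sketch of a surface record together with a check that its
fields agree with its state. All names (surface_state_t, surface_t,
NO_ID, surface_invariants_hold) are made up for this sketch and are not
taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; not the driver's actual definitions. */
typedef enum {
    SURFACE_IN_VIDEO,   /* live in video memory   */
    SURFACE_CACHED,     /* cached in video memory */
    SURFACE_IN_HOST     /* live in host memory    */
} surface_state_t;

#define NO_ID (-1)

typedef struct {
    surface_state_t state;
    int   id;           /* device surface ID, or NO_ID              */
    void *pixmap;       /* associated pixmap, or NULL               */
    void *host_image;   /* pixman image; may be NULL in every state */
} surface_t;

/* Returns true if the ID and pixmap fields match the state's definition. */
static bool surface_invariants_hold(const surface_t *s)
{
    assert(s != NULL);

    bool has_id     = (s->id != NO_ID);
    bool has_pixmap = (s->pixmap != NULL);

    switch (s->state) {
    case SURFACE_IN_VIDEO: return has_id  && has_pixmap;
    case SURFACE_CACHED:   return has_id  && !has_pixmap;
    case SURFACE_IN_HOST:  return !has_id && has_pixmap;
    }
    return false;
}
```

The pixman image is deliberately left out of the check, since the lists
above say it may not exist in any of the three states.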
The following state changes are possible:
IN_HOST => IN_VIDEO
CACHED => IN_VIDEO
IN_VIDEO => IN_HOST
IN_VIDEO => CACHED
IN_VIDEO =>
IN_HOST =>
CACHED =>
Code must exist to handle each of these transitions.
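
One way to keep the set of legal transitions in a single place is a
small lookup table. The sketch below is only an illustration with
made-up names; it covers the four transitions whose target state is
spelled out above and leaves out the three entries that are listed
without a target.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum; N_STATES is only used to size the table. */
typedef enum { IN_VIDEO, CACHED, IN_HOST, N_STATES } surface_state_t;

/* allowed[from][to] is true for the four fully specified transitions. */
static const bool allowed[N_STATES][N_STATES] = {
    [IN_HOST]  = { [IN_VIDEO] = true },
    [CACHED]   = { [IN_VIDEO] = true },
    [IN_VIDEO] = { [IN_HOST] = true, [CACHED] = true },
};

static bool transition_allowed(surface_state_t from, surface_state_t to)
{
    assert(from < N_STATES && to < N_STATES);
    return allowed[from][to];
}
```

A transition function could assert transition_allowed() on entry so
that an unexpected state change is caught early.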
- Rendering to a surface:
    - IN_HOST => IN_VIDEO
- prepare_access:
    - IN_VIDEO => IN_HOST (but see below)
- Leave VT:
    for all surfaces s that are IN_VIDEO:
      - s.(IN_VIDEO => IN_HOST)
      - add s to list of evacuated surfaces
- Enter VT:
    for all evacuated surfaces s, s.(IN_HOST => IN_VIDEO)
    (Or maybe simply rely on this happening when rendering begins?)
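
The Leave VT / Enter VT handling could look roughly like the sketch
below. It is a model only: the state assignment stands in for the real
transition code, and the names (surface_t, evacuated, leave_vt,
enter_vt) are assumptions made for the example. Cached surfaces are
left alone here because the notes above do not say what happens to
them on a VT switch.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { IN_VIDEO, CACHED, IN_HOST } surface_state_t;

typedef struct surface {
    surface_state_t state;
    struct surface *next;            /* list of all surfaces       */
    struct surface *next_evacuated;  /* list of evacuated surfaces */
} surface_t;

/* Surfaces that were moved to host memory when leaving the VT. */
static surface_t *evacuated = NULL;

static void leave_vt(surface_t *all)
{
    for (surface_t *s = all; s != NULL; s = s->next) {
        if (s->state == IN_VIDEO) {
            assert(s->next_evacuated == NULL);
            s->state = IN_HOST;      /* stands in for IN_VIDEO => IN_HOST */
            s->next_evacuated = evacuated;
            evacuated = s;
        }
    }
}

static void enter_vt(void)
{
    while (evacuated != NULL) {
        surface_t *s = evacuated;
        evacuated = s->next_evacuated;
        s->next_evacuated = NULL;
        s->state = IN_VIDEO;         /* stands in for IN_HOST => IN_VIDEO */
    }
}
```

If the lazy alternative is chosen instead, enter_vt() would only clear
the list and let rendering pull surfaces back into video memory.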
Notes regarding prepare_access():
- This is historically very performance sensitive; however with
  Render support it may not be so much anymore. Hence, initially it
  will be done as simply an IN_VIDEO => IN_HOST transition.
- If it still is a performance issue, the existing optimizations may
  be considered:
    * For readonly access, don't copy back to video memory
    * Only copy those parts of the surface that actually need to be
      accessed.

  Both of these optimizations rely on the IN_HOST state being
  transient as it is in the current system.

  For the first one, a similar effect could be achieved by having a
  new state where the surface doesn't have an ID, but the video
  memory is still allocated. Yonit thinks this should work. I
  suspect actually attempting it may turn up unexpected issues.

  To do the second, a surface would have to be put into a new state
  where it has an ID, but its actual content is partially in video
  memory and partially in host memory. This would be a significant
  complication, but not impossible to do.
Naming
In the existing system, the word cache is used for two different
concepts. One is the singleton object that manages all the
surfaces. The other is a small cache of unused surfaces that are
allocated in video memory. This cache is important for workloads that
create and destroy surfaces at a high rate where the overhead of
submitting a command per pixmap creation would be a problem.
In the new system the singleton object will be called memory_manager
and the word 'cached' will be reserved for unused surfaces that are
allocated in video memory.
Details:
** IN_VIDEO => IN_HOST
- update_area is issued for the surface in question
- data is copied to pixman image
- state is set to IN_HOST
- destroy command is issued for the ID
** IN_HOST => IN_VIDEO
- an ID and video memory are allocated (see "Allocate ID and video
  memory" below)
- a qxl_image_t is allocated for host memory
- a draw command is issued
** Rendering
- Perform an IN_HOST => IN_VIDEO transition on all involved surfaces
- Check that all involved surfaces are now IN_VIDEO (i.e., that the
  transition didn't fail). Bail if they aren't.
- Move involved surfaces to front of LRU list
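
The three rendering steps above can be sketched as follows. This is a
simplified model with hypothetical names; the can_upload flag merely
simulates whether the IN_HOST => IN_VIDEO transition succeeds, and the
LRU list is a plain doubly linked list with the most recently used
surface at the head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { IN_VIDEO, CACHED, IN_HOST } surface_state_t;

typedef struct surface {
    surface_state_t state;
    bool can_upload;              /* simulation knob, not a driver field */
    struct surface *prev, *next;  /* LRU links */
} surface_t;

typedef struct { surface_t *head, *tail; } lru_t;

static void lru_unlink(lru_t *l, surface_t *s)
{
    if (s->prev) s->prev->next = s->next;
    else if (l->head == s) l->head = s->next;
    if (s->next) s->next->prev = s->prev;
    else if (l->tail == s) l->tail = s->prev;
    s->prev = s->next = NULL;
}

static void lru_move_to_front(lru_t *l, surface_t *s)
{
    lru_unlink(l, s);
    s->next = l->head;
    if (l->head) l->head->prev = s;
    l->head = s;
    if (!l->tail) l->tail = s;
}

/* Stand-in for the IN_HOST => IN_VIDEO transition; may fail. */
static void to_video(surface_t *s)
{
    if (s->state == IN_HOST && s->can_upload)
        s->state = IN_VIDEO;
}

/* Returns false if rendering must bail out. */
static bool prepare_render(lru_t *l, surface_t **involved, size_t n)
{
    assert(n == 0 || involved != NULL);

    for (size_t i = 0; i < n; i++)
        to_video(involved[i]);

    /* Separate pass: a later transition might have evicted an earlier one. */
    for (size_t i = 0; i < n; i++)
        if (involved[i]->state != IN_VIDEO)
            return false;

    for (size_t i = 0; i < n; i++)
        lru_move_to_front(l, involved[i]);
    return true;
}
```

The check is done in its own pass because, once eviction exists, moving
one involved surface into video memory could in principle push another
involved surface back out.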
** Allocate ID and video memory
- If a suitable cached surface is available, use that, else
    1. Get ID
        - If a free ID exists, use that
        - Else, if cached surfaces exist, kill the least recently used
        - Else, do IN_VIDEO => IN_HOST of the least recently used
          non-cached surface
    2. Allocate memory
        - Call malloc()
        - If that fails, destroy a cached surface, wait, GC, then try
          again
        - If there are no cached surfaces, do IN_VIDEO => IN_HOST of the
          least recently used non-cached surface, wait, GC, then try again
        - If that failed, issue oom() and garbage collect
        - If that failed, give up
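
The "Get ID" step above might be modelled like this. It is a sketch
under several assumptions: the names are invented, N_IDS is tiny so the
example is easy to follow (the real limit mentioned earlier is 1024),
recency is a simple counter, DEAD is a hypothetical marker for a killed
cached surface, and the ID is reused immediately without waiting for
the destroy command to complete.

```c
#include <assert.h>
#include <stddef.h>

#define N_IDS 2          /* illustration only */
#define NO_ID (-1)

typedef enum { IN_VIDEO, CACHED, IN_HOST, DEAD } surface_state_t;

typedef struct {
    surface_state_t state;
    int id;               /* NO_ID when the surface holds no device ID */
    unsigned last_used;   /* larger means more recently used           */
} surface_t;

static int id_in_use[N_IDS];

/* Least recently used surface in state st, or NULL if there is none. */
static surface_t *find_lru(surface_t *s, size_t n, surface_state_t st)
{
    surface_t *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (s[i].state == st && (!best || s[i].last_used < best->last_used))
            best = &s[i];
    return best;
}

/* Take the victim's ID away and put the victim into new_state. */
static int steal_id(surface_t *victim, surface_state_t new_state)
{
    assert(victim->id != NO_ID);
    int id = victim->id;
    victim->id = NO_ID;
    victim->state = new_state;
    return id;
}

static int get_id(surface_t *s, size_t n)
{
    /* 1. A free ID, if any. */
    for (int id = 0; id < N_IDS; id++)
        if (!id_in_use[id]) { id_in_use[id] = 1; return id; }

    /* 2. Otherwise kill the least recently used cached surface. */
    surface_t *victim = find_lru(s, n, CACHED);
    if (victim)
        return steal_id(victim, DEAD);

    /* 3. Otherwise move the least recently used live surface to host. */
    victim = find_lru(s, n, IN_VIDEO);
    if (victim)
        return steal_id(victim, IN_HOST);

    return NO_ID;
}
```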
After issuing a destroy, we need to wait for the command ring to go idle,
then garbage collect. Hopefully this will trigger a recycle call,
which will then free the associated memory.
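
To make the retry loop for memory allocation concrete, here is a toy
model of the naive policy: keep evicting (cached surfaces first, then
the least recently used live surface) until the allocation fits. All
names are hypothetical. Video memory is reduced to a byte counter, and
the wait-for-idle and garbage collection steps are collapsed into
returning the victim's bytes immediately, which the real driver cannot
assume.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { IN_VIDEO, CACHED, IN_HOST, DEAD } surface_state_t;

typedef struct {
    surface_state_t state;
    size_t size;          /* video memory held while IN_VIDEO or CACHED */
    unsigned last_used;   /* larger means more recently used            */
} surface_t;

typedef struct {
    size_t free_bytes;    /* stand-in for the video memory allocator */
    surface_t *surfaces;
    size_t n;
} mm_t;

static int try_alloc(mm_t *mm, size_t size)
{
    assert(size > 0);
    if (mm->free_bytes < size)
        return 0;
    mm->free_bytes -= size;
    return 1;
}

static surface_t *find_lru(mm_t *mm, surface_state_t st)
{
    surface_t *best = NULL;
    for (size_t i = 0; i < mm->n; i++) {
        surface_t *s = &mm->surfaces[i];
        if (s->state == st && (!best || s->last_used < best->last_used))
            best = s;
    }
    return best;
}

/* Evict one surface; returns 0 when nothing is left to evict. */
static int evict_one(mm_t *mm)
{
    surface_state_t next = DEAD;
    surface_t *v = find_lru(mm, CACHED);
    if (!v) {
        v = find_lru(mm, IN_VIDEO);
        next = IN_HOST;
    }
    if (!v)
        return 0;
    v->state = next;
    /* Real driver: issue destroy, wait for the ring to go idle, then GC. */
    mm->free_bytes += v->size;
    return 1;
}

static int alloc_video(mm_t *mm, size_t size)
{
    while (!try_alloc(mm, size)) {
        if (!evict_one(mm))
            return 0;   /* the notes try oom() plus GC here before giving up */
    }
    return 1;
}
```

In this model an allocation that can never fit ends up evicting every
surface before failing, which is one reason the eviction policy may
deserve more thought than plain LRU.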
** Newly allocated surfaces
- Start out in host memory because that is simplest and because Yonit
says it's common for surfaces to be used that never become visible
to the client.