Performance improvement to vga arbitration

Dave Airlie airlied at gmail.com
Mon Jun 14 16:18:59 PDT 2010


On Thu, Jun 10, 2010 at 4:02 PM, Henry Zhao <henry.zhao at oracle.com> wrote:
> Proposal for improving vgaarb arbitration method
>
> It appears  that after session is up,  in most cases,  drivers only do
> non-legacy accesses.
> Non-legacy accesses do not need to block each other. Blocking arbitration is
> needed
> mostly for session initialization and exiting. To improve performance, we
> need to treat
> differently to legacy and non-legacy accesses, and allow non-legacy accesses
> to proceed
> concurrently among devices without blocking each other. Non-legacy accesses
> is assumed
> to be the default for operating functions after initialization. In case
> legacy accesses are
> necessary for some of them, drivers can redefine them per function group
> bases.
> Here are some details:
>
> (1) New lock for non-legacy access
>
>  Define another lock, vgadev->locks2 (locks2), for non-legacy access locking
>  in addition to vgadev->locks (locks1), currently used for legacy access
>  locking.
>
>  Non-legacy access requests from a device that does not have legacy access
>  decoding ability should always be honored without a need of acquiring a
> lock.
>  Non-legacy access requests from a device that has legacy access decoding
>  ability needs to acquire locks2 before proceeding.
>
>  Request for locks2 is blocked only when some other device already has
> locks1
>  (on the same resources).  Request for locks1 is blocked when some other
> device
>  already has locks1 or locks2 (on the same resource). This means request for
>  locks2 should not be blocked just because some other device already has
> locks2
>  (on the same resources).

Originally I think Ben H had something like this, and I think avoiding
deadlock was really really impossible, so I gave up.

The thing is you have to guarantee that deadlock can't occur when you
combine any set of n userspace drivers, I think the original design
went a step further though and allowed separate io and mem requests
and that might have caused the deadlock situations rather than the
legacy/non-legacy issues.

So I'd like to make sure all the possible locking/unlocking combos are
thought out before going ahead with anything.

>  but only two strings for them, "io" and "mem". Add "IO" and "MEM" for non-
>  legacy accesses.

You'll notice io+mem is only interpreted in the kernel, separate
io/mem was a deadlock fest, esp when you also consider non-X apps
running at the same time as X. X since it is single threaded can't
really deadlock, but if X has locked the non-legacy io+mem, and
vbetool locks the legacy io+mem, and waits for the non-legacy lock, it
starts to get really messy. As long as you've considered this in the
design I'm fine with it.

> (2) Function group based resource request
>
>  Need to distinguish between decoding ability and decoding request (resource
>  request). Decoding ability is still maintained in struct vga_device of
> kernel
>  driver with
>
>       unsigned int decodes;
>
>  and a userland copy in dev->vgaarb_rsrc.
>
>  Currently all lock/unlocking mechanism uses resource requests from
>  dev->vgaarb_rsrc, which is actually decoding ability. In new design
> however,
>  this is only the case for xf86VGAarbiterLock() and xf86VGAarbiterUnlock(),
> run
>  during session initialization and exiting. During normal run, resource
> request
>  is determined by a resource mask associated with each function.
>
>  Wrapping function are grouped into MAX_VGAARB_OPS_MASK number of
>  groups with resource masks assigned to each of them. The default setting of
> mask is
>  VGA_RSRC_NORMAL_IO|VGA_RSRC_NORMAL_MEM, meaning non-legacy
>  access, but drivers can redefine any of them. In an extreme if a driver
> redefines all
>  masks to
>      VGA_RSRC_NORMAL_IO|VGA_RSRC_NORMAL_MEM|
>  VGA_RSRC_LEGACY_IO|VGA_RSRC_LEGACY_MEM
>
>  we are returning to old arbitration algorithm.
>
> (3) Other changes
>
>  * pci_device_vgaarb_set_target() is heavily called. Currently it involves
> two
>   syscalls.  These calls can be saved if the device in question is the same
> as
>   in the previous call (recorded in pci_sys->vga_target). This contributes
>   to major performance improvement.
>
>  * OpenConsole()/CloseConsole() need to be protected by lock and unlock as
> they
>   may have vga register accesses. Further, OpenConsole()/CloseConsole() is
> run
>   only on a session with primary device.
>

In-kernel I'd really like to make some more changes, that I've been
hacking on but really don't have time to fix.

The main one is to use bridge control when possible to switch stuff
on/off, the other was some sort of callback for nvidia (though maybe
they can make a patch).

The callback would be called instead of the current forcing
enable/disable, and the driver in-kernel would be responsible for
doing that, it would allow more flexibility in the non-kms world
(which I don't care about enough to do much more).

[attached initial patch - kernel patch not X.org, also probably broken.

Dave.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-vgaarb-use-bridges-to-control-VGA-routing-where-poss.patch
Type: application/mbox
Size: 9611 bytes
Desc: not available
URL: <http://lists.x.org/archives/xorg-devel/attachments/20100615/2d15933e/attachment-0001.bin>


More information about the xorg-devel mailing list