Input and games.

Todd Showalter todd at electronjump.com
Fri Apr 19 09:31:19 PDT 2013


On Fri, Apr 19, 2013 at 5:18 AM, Pekka Paalanen <ppaalanen at gmail.com> wrote:

> I am going to reply from the Wayland protocol point of view, and what
> Wayland explicitly can (and must) do for you. This is likely much lower
> level than what a game programmer would like to use. How SDL or some
> other higher level library exposes input is a different matter, and I
> will not comment on that. We just want to make everything possible on
> the Wayland protocol level.

    That's fair.  We don't use SDL in our projects, so I'm coming at
this partly from the point of view of someone who will be operating at
the protocol level.

> I do not think we can happily let client applications open input devices
> themselves, so this is clearly a thing we need to improve on. In other
> words, I believe we should come up with a protocol extension where the
> server opens the input devices, and either passes the file descriptor to
> a client, or the server translates evdev events into Wayland protocol
> events. "How" and "what" are still open questions, as is every other
> detail of input devices that are not keyboards, mice, or touchscreens.

    This is certainly what I'd prefer, personally, whether it's a
file-descriptor based system, event messaging, or polling functions.
It would be really nice to get gamepads and the like in there, if
possible.

> There was once some talk about "raw input event protocol", but there is
> not even a sketch of it, AFAIK.

    I'm not familiar enough with Wayland yet to take the lead on
something like that, but I can certainly help.

>>     It would be really nice if there was some sort of configuration
>> that could be read so we'd know how the player wanted these things
>> mapped, and some sort of way for the player to set that configuration
>> up outside the game.
>
> Right, and whether this could be a Wayland thing or not, depends on the
> above, how to handle misc input devices in general.
>
> Keyboards already have extensive mapping capabilities. A Wayland server
> sends keycodes (I forget in which space exactly) and a keymap, and
> clients feed the keymap and keycodes into libxkbcommon, which
> translates them into something actually useful. Maybe something similar
> could be invented for game controllers? But yes, this is off-topic for
> Wayland, apart from the protocol of what event codes and other data to
> pass.

    Fair enough.

> Wayland protocol is event driven. Polling does not make sense, since it
> would mean a synchronous round-trip to the server, which for something
> like this is just far too expensive, and easily (IMHO) worked around.
>
> So, you have to maintain input state yourself, or by a library you use.
> It could even be off-loaded to another thread.

    This is what we do now, essentially; accumulate the incoming
events to assemble each frame's input device state.  It would be
convenient if Wayland did it for us, but obviously we're already
operating this way on X11, Win32 and OSX.

> There is also a huge advantage over polling: in an event driven design,
> it is impossible to miss very fast, transient actions, which polling
> would never notice. And whether you need to know if such a transient
> happened, or how many times it happened, or how long each
> transient took between two game ticks, is all up to you and available.

    In truth, we don't usually deal with pure polling at the low level
unless it's a game where we can guarantee that we're not going to drop
frames.  Even then, things like mouse, touch or stylus input can come
in way faster than vsync, and game simulation ticks are usually (for
relatively obvious reasons) timed to vsyncs.

    In our engine, the input system has several parts, collected in a
per-player virtualized input structure.  It contains:

- analog axis
  - previous position
  - current position
  - delta (current - prev)
  - array of positions used to generate this frame's data

- buttons
  - previous frame state bitmap (1 bit per key/button)
  - current frame state bitmap
  - trigger bitmap (cur & ~prev)
  - release bitmap (prev & ~cur)
  - byte map of presses

    If a key/button event was received since the last update, that key
or button is left down for at least one update, even if it went up
again before the snapshot went out.  If the game cares how many times
a button or key was pressed between updates, it can look the key up in
the byte map rather than the bitmap.
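
    To make that concrete, here's a minimal sketch in C of the button
side of the snapshot.  The names are hypothetical, not our actual
engine code:

#include <string.h>

#define BUTTON_WORDS 8  /* 256 keys/buttons, 32 bits per word */

typedef struct
{
  unsigned int  prev[BUTTON_WORDS];     /* last update's state bitmap */
  unsigned int  cur[BUTTON_WORDS];      /* current state bitmap */
  unsigned int  trigger[BUTTON_WORDS];  /* went down this update */
  unsigned int  release[BUTTON_WORDS];  /* went up this update */
  unsigned int  defer[BUTTON_WORDS];    /* ups held over to next update */
  unsigned char presses[BUTTON_WORDS * 32]; /* press counts since update */
} Buttons;

/* Accumulate one incoming event.  An up that arrives in the same tick
   as its down is deferred, so fast taps are never lost. */
void ButtonEvent(Buttons *b, int button, int down)
{
  unsigned int word = button / 32, bit = 1u << (button % 32);

  if(down)
  {
    b->cur[word] |= bit;
    b->defer[word] &= ~bit;  /* a re-press cancels a pending release */
    if(b->presses[button] < 255)
      b->presses[button]++;
  }
  else if(b->prev[word] & bit)
    b->cur[word] &= ~bit;
  else
    b->defer[word] |= bit;
}

/* Roll the snapshot over; called once per game tick, after the
   events have been accumulated. */
void ButtonsSnapshot(Buttons *b)
{
  int i;

  for(i = 0; i < BUTTON_WORDS; i++)
  {
    b->trigger[i] = b->cur[i] & ~b->prev[i];
    b->release[i] = b->prev[i] & ~b->cur[i];
    b->prev[i]    = b->cur[i];
    b->cur[i]    &= ~b->defer[i];  /* deferred ups release next tick */
    b->defer[i]   = 0;
  }
  memset(b->presses, 0, sizeof(b->presses));
}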

    Likewise, while accumulated position/delta is usually good enough
for mouse/touch/stylus input and almost always good enough for
joystick input, there are times when you want to do things like
gesture recognition where it really pays to have the data at the
finest possible resolution.  Most parts of the game won't care, but
the data is there if it's needed.
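
    The analog side is similar; a rough sketch, again with
hypothetical names:

#define AXIS_MAX_SAMPLES 64

typedef struct
{
  float prev;                      /* position at the last snapshot */
  float cur;                       /* position at this snapshot */
  float delta;                     /* cur - prev */
  float samples[AXIS_MAX_SAMPLES]; /* raw positions since the last tick */
  int   nsamples;
} Axis;

/* Accumulate one incoming position event. */
void AxisEvent(Axis *a, float pos)
{
  a->cur = pos;
  if(a->nsamples < AXIS_MAX_SAMPLES)
    a->samples[a->nsamples++] = pos;
}

/* Roll the snapshot over once per game tick. */
void AxisSnapshot(Axis *a)
{
  a->delta    = a->cur - a->prev;
  a->prev     = a->cur;
  a->nsamples = 0;  /* gesture code has had its chance at the raw data */
}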

> I once heard about some hardcore gamer complaining that in some
> systems or under some conditions, probably related to the
> ridiculous framerates gamers usually demand, the button sequence he hits
> in a fraction of a second is not registered properly, and I was
> wondering how it is possible for it to not register properly. Now I
> realised a possible cause: polling.

    Pure polling can do it, definitely.  That said, there are many
possible points of failure; games are often fairly complex systems.
Also, a lot of people have *really* crappy keyboards.  We still see
keyboards where if you hold down the wrong 3 keys no other key presses
will register.

    I actually got bitten by this recently myself; the X11 support in
our engine was written about a decade ago, and it was calling
XQueryPointer() to get the mouse button data; there was some reason I
wrote it that way at the time, but you'd need a time machine to go
back and ask me why.  It worked fine, but (for reasons you can guess)
we never saw scroll wheel messages.  Cracking that old code open and
starting to bring it up to date is part of what led me here.

    Which reminds me: it would be extremely useful to be able to shut
off key repeat for a specific client (ie: a game) without having to
shut it off globally.

> Event driven is a little more work for the "simple" games, but it gives
> you guarantees. Would you not agree?

    We can definitely work with it.  As much as anything it's a
question of convenience; what matters is how much superstructure we
need to build on top to get what we need.  We've
already got that superstructure elsewhere, so porting it over is
simple enough.  It would be more convenient if we didn't have to, but
it's not a deal breaker.

    For context, I'm not trying to convince you to change the protocol
or the model per se; aside from anything else, I don't yet understand
it well enough to seriously critique it.  A large part of what I'm
hoping to do here is offer some insight into how games tend to use
input, the kind of needs games often have, and the sorts of
considerations that make a system easier or harder to put a game on.
Wayland obviously has competing considerations, some of which are
arguably more important than games.  If one can imagine such a thing.

    One thing worth noting here is why we want to operate on
virtualized input structures rather than raw events.  One reason I
mentioned above: accumulating events so that they can be applied
between frames.  Another reason is complexity management; games can be
quite complex
beasts consisting of many parts, and everything that can be done to
isolate those parts makes the game easier to develop and maintain.

    The classic problem with a purely event-driven program is that
somewhere in it there is a giant event loop that knows about
everything in the program.  In something simple like a calculator,
it's not a problem, but once you scale up to a large system with
multiple subsystems the event loop can turn into a nightmare.  Having
virtualized input structures that the game can query means that input
tests can be isolated to the code where they belong. ie:

/* debug hotkey: ctrl-d logs a heap integrity check, isolated here */
if(KeyTrigger(KEY_D) && KeyDown(KEY_CTRL))
{
  Log("heap integrity %d\n", check_heap_integrity());
}

    You can achieve some of the same modularity with function pointer
lists or similar hooks, but having a virtualized input structure has
(in my experience at least) been the cleanest abstraction.

> Is this referring to the problem of "oops, my mouse left the Quake
> window when I tried to turn"? Or maybe more of "oops, the pointer hit
> the monitor edge and I cannot turn any more?" I.e. absolute vs.
> relative input events?

    Partly.  The issue is that *sometimes* a game wants the mouse and
keyboard to behave in the standard way (ie: the mouse controls the
pointer and lets you click gui elements, the keyboard is for entering
text and hitting control keys) and *sometimes* the game wants the
mouse motion to control an in-game object (often the camera) and just
wants the keyboard and mouse buttons to be a big bag of digital
buttons.  With the Quake example, when the pause menu is up, or when
the terminal has been called down, the game wants the keyboard to be
generating text commands on the terminal and the mouse to be able to
select text and click on buttons.  When the terminal is gone and the
game isn't paused, Quake wants the mouse to control the camera view
and the keyboard WASD keys are emulating a game controller dpad.

    So, yes, absolute vs. relative events is part of the issue, but
it's also part of a greater context; whether the keyboard is
generating strings or digital inputs, whether the mouse is generating
positions or deltas, under what circumstances focus is allowed to
leave the window, whether the mouse pointer is visible, and things
like how system-wide hotkeys factor in to things.  Can I capture the
keyboard and mouse without preventing the user from using alt-tab to
switch to another program, for instance?

    Clean, fast switching between these states is part of it as well;
in a game like Quake, as above, you want to be able to capture the
mouse when the game is playing, but "uncapture" it when the pause menu
or the game terminal is up, or if the player switches focus to
another program.  In an RTS, you might want a visible cursor but want
to constrain the mouse to the window to allow the map to scroll.  You
might want to use the keyboard mostly for hotkeys, but if the player
hits enter you want them to be able to type a string to broadcast to
multiplayer chat.  The scroll wheel might control either the message
scrollback or the zoom level, depending on what the cursor is floating
over.
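
    In code, the switch itself is trivial; the point is that the
system has to let us express it.  A rough sketch, where GrabPointer(),
SetRelativeMode() and ShowCursor() stand in for whatever the windowing
layer actually provides:

/* Stand-ins for whatever the windowing layer actually provides. */
void GrabPointer(int on);
void SetRelativeMode(int on);
void ShowCursor(int on);

typedef enum
{
  INPUT_MODE_MENU,  /* pause menu, terminal: normal pointer and text */
  INPUT_MODE_PLAY   /* gameplay: mouse deltas drive the camera */
} InputMode;

void SetInputMode(InputMode mode)
{
  switch(mode)
  {
    case INPUT_MODE_MENU:
      GrabPointer(0);      /* let the pointer leave the window */
      SetRelativeMode(0);  /* absolute positions for gui clicks */
      ShowCursor(1);       /* system cursor visible again */
      break;

    case INPUT_MODE_PLAY:
      GrabPointer(1);      /* keep the pointer in the window... */
      SetRelativeMode(1);  /* ...and deliver raw deltas */
      ShowCursor(0);       /* the game draws its own crosshair */
      break;
  }
}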

    There's also the question of clean recovery; if a game has changed
the video mode (if that's allowed any more, though these days with LCD
panels and robust 3D hardware maybe that's just a bad idea), turned
off key repeat and captured the mouse, all of that needs to be
reverted if the game exits ungracefully.  Which sometimes happens,
especially during development.

> There is a relative motion events proposal for mice:
> http://lists.freedesktop.org/archives/wayland-devel/2013-February/007635.html

    Something like that will be needed for a lot of styles of game,
but also has use elsewhere.  For example, there used to be a widget on
Irix machines, IIRC, that looked like a trackball.  If you put the
mouse pointer on it, held down the mouse button and then moved the
mouse, it would scroll the trackball control rather than move the
mouse pointer.

    Similarly, when you're doing scroll bars, if you want to make a
scroll bar where dragging the thumb moves the scrolled view at a rate
that is pixel-proportional rather than window-size proportional, you
have to be able to warp the pointer; otherwise, the view is slaved to
the thumb position, so taller documents scroll past faster.

    Concrete example:  Let's say I have a document that is 1000 pixels
tall, in a view that's 400 pixels tall.  Let's fudge the math a bit,
say the thumb is one pixel tall and the region the thumb can be
scrolled over is the full height of the window.  The window shows 40%
of the document.  Without pointer warping, each step in the scroll bar
is (600 / 400) pixels, so we're scrolling on average 1.5 pixels of
document for every pixel the thumb moves up or down the screen.

    Now, in the same view, we have a 250000 pixel tall document.  The
document got longer, but the scroll bar is the same height (and thus,
the same number of steps).  Each step of the scroll bar is now (249600
/ 400), or 624 pixels, enough that each scroll thumb movement scrolls
more than 1.5x the view area.
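
    The arithmetic, as a trivial sketch:

/* Pixels of document scrolled per pixel of thumb movement; doc and
   view are heights in pixels, steps is the number of positions the
   thumb can occupy. */
float ScrollPerStep(float doc, float view, float steps)
{
  return (doc - view) / steps;
}

/* ScrollPerStep(1000.0f,   400.0f, 400.0f) ==   1.5f */
/* ScrollPerStep(250000.0f, 400.0f, 400.0f) == 624.0f */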

    The classic solution to this is that when the scroll amount goes above
or below sane thresholds, the view is moved by a sane amount, the
scroll bar is moved by the correct amount (if any) for the new view,
and if necessary the pointer is warped to the new thumb position.

> Clients cannot warp the pointer, so there is no way to hack around it.
> We need to explicitly support it.

    Hmm.  The inability to warp the pointer is going to put constraints
on gui designs even outside of games.  Consider the scrollbar example,
above.  That one isn't just a matter of locking the pointer somewhere,
it's a matter of positioning the pointer based on the new scroll thumb
position.  If anything we're actually better off in games in that
scenario, because we can just shut off the pointer draw and draw a
pointer in-engine.

    I'm assuming there are sane protocol reasons for not allowing
pointer warping, but I think you'll find it's one of those PITAs that
you need to implement to avoid greater pain later.  Bad scroll bar
behavior is one of those things that can grate on people.

    Within games, there's the classic "try to move the mouse off the
window, the pointer stops and the map scrolls" case that we'd like to
be able to handle.

> Ah yes, deltas are the relative motion events, see above.

    Deltas are quite useful, though obviously we can calculate them
ourselves.  Some of the desire to have deltas from the system input
comes, admittedly, from a somewhat childish engineering
distaste for repeated translation back and forth between deltas and
absolute positions as the data percolates up through the software
stack.  Coming out of the hardware (at least for classical mice and
trackballs) the "analog" values are all deltas.

> Aah, reading this the third time, I finally understood what you meant
> by input capture. The URL above for the relative motion events should
> be exactly this. We are more accustomed to the term "pointer grab" or
> "grabbing", meaning that during the grab, all input events go to this
> particular window, until the grab is ended.

    Ok, I'll try to stick to that term.  The thing is, we don't
necessarily want *all* events routed to us; we don't want to trap
system-level stuff like program switching (alt-tab), the "lock screen"
button, the volume and brightness controls, the screenshot button (if
any) and so forth.  We want *most* of the events routed to us, but not
to the exclusion of system and window manager functionality.

> One thing you didn't list is input latency. In Wayland, every
> input event from user actions has a timestamp corresponding to when
> they occurred, but the events may not be relayed to clients ASAP.
> Instead, for instance Weston relays input only during the refresh
> cycle, I think. That might be a problem for games wanting to minimize
> input latency, since it limits input state update rate to the monitor
> refresh rate.

    That's potentially an issue; I'd just assumed events would be
delivered immediately.  Might it be possible to have a knob we could
push to request "real time" input, for whatever value of "real time"
the system can manage?  Again, in some cases (mostly where a game is
not grabbing input) the game can deal with keyboard and mouse input at
desktop rates; things like text entry and gui interaction aren't
always time critical in games (though they can be).

    Often, though, we want the lowest latency the system can manage.
There's often an update lag on monitors already, and some input
systems (especially things like touch panels; try a draw program on
the iPad to see why Nintendo still uses resistive touch screens
despite their disadvantages) can have atrocious lag.  In a living-room
game PC hooked up to a TV, you can be looking at lags of several
frames between the video signal going out on the wire and appearing on
the display due to HDMI signal processing and cleanup, plus a
potential frame or two of lag on wireless gamepads, keyboards and
mice.  The game adds at least a frame of lag due to the nature of the
simulation tick, and potentially another depending on how display
buffering is done.  Any lag on top of that and we're wandering
dangerously close to the 150ms delay that my HCI prof said was
"perceptible input lag", and those studies were done with people using
VT102s to do text entry, not gamers playing twitch games.

    I think the best option would be the "real time" switch; let a
client tell the server "these events are time-critical to me".  We
probably don't need all events at max speed; window metadata (resize,
move, destroy...) and so forth can be delivered whenever it's
convenient.  A game might only really need "real time" input for (say)
the mouse and WASD keys, or it might only care about "real time" input
from the gamepad.  The actual requirements may well differ at
different parts of the game.

> Depending on the game and physics engine, of course, is it possible to
> make use of the input event timestamps to integrate the effect of, say,
> a button going down some time in the past, instead of assuming it went
> down when this game tick started?

    In some games, sure.  The problem is, any lag like that can
potentially end badly for the player.  What if we've already killed
them before the input comes in?  What if it's a network game, and the
new input means that instead of being killed by player B, they
actually got player B first?

    In general, the problem is that yes, we can go back and correct
the simulation for the revised input, but what we *can't* do is revise
the player's decisions based on the previously incorrect simulation
that we've already showed them.  Games strive to have as tight a
feedback loop as possible, so if the simulation is not fed input when
it happens, we're putting information in front of the player that
we're going to revise *after* they have started reacting to it.

> What I'm trying to ask is, are the timestamps useful at all for games,
> and/or would you really need a minimum latency input event delivery
> regardless of the computational and power cost?

    Timestamps can be useful as a fallback, but minimum latency is by
far the highest priority.  Lower latency translates directly to a
better play experience.  The difference of even a frame of lag has a
measurable effect on player enjoyment and control.

> Keeping in mind, that event based input delivery does not rely on high
> update rates, like polling does, to not miss anything.

    If the events are just coming in as a pile at 60Hz ticks, it's all
good and we can do everything we need to.  If they're coming in as a
pile at 10Hz ticks, it's going to be difficult to make anything more
active than Solitaire.

> There is also one more catch with the timestamps. Their base is
> arbitrary, and a client does not know which clock produces them.
> Therefore they are only useful relative to other input event
> timestamps. Would you need a way to get the current time in the input
> clock to be able to use them properly?

    At least in our case, we're typically running the simulation off
of either a vsync clock (consoles, mostly) or a millisecond clock
(gettimeofday() or the platform equivalent).  Anything coming in we
typically try to relate to those.  Some sort of timestamp we could
relate to an actual world clock would be important; without it we'd be
into calculating times based on heuristics, with all that implies.

    VSync stamps would be good enough, or millisecond stamps.
Anything with fixed time units.  As long as we know the size of the
time unit and some arbitrary base time (ie: the timestamp of the first
event we got), that's all we really need; if we need to relate it to
the wall clock, we can call gettimeofday() and compare.  If the time
units aren't fixed (ie: if they're just monotonically increasing IDs
that don't actually encode time values and are only useful for
establishing order), the results for games will be unfortunate.
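
    As a sketch of what we'd do with fixed-unit stamps, assuming only
millisecond units and an arbitrary base:

#include <sys/time.h>

static long long base_offset_ms;
static int       have_offset = 0;

static long long WallMS(void)
{
  struct timeval tv;

  gettimeofday(&tv, 0);
  return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Map an input-clock timestamp to wall-clock milliseconds; the
   offset is latched off the first event we see. */
long long EventWallTime(unsigned int timestamp)
{
  if(!have_offset)
  {
    base_offset_ms = WallMS() - (long long)timestamp;
    have_offset    = 1;
  }
  return (long long)timestamp + base_offset_ms;
}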

                                            Todd.

--
 Todd Showalter, President,
 Electron Jump Games, Inc.

