<div dir="ltr"> <div class="gmail_extra"> <div class="gmail_quote">On Fri, Aug 15, 2014 at 12:18 AM, Pekka Paalanen <<a href="mailto:ppaalanen@gmail.com" target="_blank">ppaalanen@gmail.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Thu, 14 Aug 2014 11:31:27 -0700 Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> wrote: > On Tue, Aug 12, 2014 at 1:34 PM, Jasper St. Pierre <<a href="mailto:jstpierre@mecheye.net">jstpierre@mecheye.net</a>> > wrote: > > > In the xdg-shell thread recently, Pekka had a lot of concerns of the kind > > "what happens when the client makes an unexpected or illegal request. > > Should it be an error?" > > > > Yes and no. More on that in a line or two. > > > > I have an answer for this: Any behavior not defined is undefined behavior. > > If it's not explicitly mentioned in the protocol, it's undefined behavior. > > > > Undefined behavior means that anything can happen. The compositor can send > > you a fatal error. It can mess up your objects in undetectable ways. The > > surface contents you present might now be garbage. Obviously, compositors > > should try *very* hard to catch all edge cases and send protocol errors, > > but it might be difficult or impractical to catch all cases. > > > > I'm a little hesitant to agree with the above fully. I would rather reserve the term > "undefined behavior" to mean things that may not be what the client intends > but which are not fatal. That said, most things shouldn't be this kind of > undefined behavior; most client errors should be the kind that potentially > lead to death. </div>I'm fully with Jason here. I do want a category of behaviours that are not fatal, but also quite likely to not do what you expect or what is good for the user looking at the screen or using the program. The term "undefined behaviour" seems to fit well, no? And we have already used the term in the core spec, IIRC. 
One example of it is destroying a wl_buffer while it is reserved by the compositor. It's not a fatal error, but can cause a glitch on screen. It's not too different from a client rendering badly, or rendering into a wl_buffer that is reserved by the compositor. <div><div class="h5"> > I think the better thing to do would be to clarify the protocol to say how > it should be used. Any time a client does something that's outside of what > the protocol says, it's liable to get killed. We should be more vigilant > when writing protocols to specify how it should properly be used so that we > aren't left with strange ambiguity. There are a couple of reasons for this. > > 1) It will make better protocols. If protocols are written as "this is > what you should do" it forces the writer to think about all of the valid > ways to use it. Most likely, those are the only ways of using it that will > be properly implemented in the compositor anyway, so clients doing anything > else are liable to run into problems. > > 2) It's better for backwards-compatibility. It's easier to define a > certain behaviour in a later version if the result in earlier versions was > "don't do this, or I'll kill your client". If there were strange, > poorly-defined, behaviors before, we have to keep those working in future > versions. > > 3) If there is something that should result in non-fatal undefined > behaviour, we want to identify and document it. This way we make sure that > compositors do something sensible and, at the very least, don't crash. > > That said, I think it's better to clean up the protocol docs than to try to > make some blanket statement about undefined behaviour. Anywhere weston > currently kills the client without it being particularly well-documented, I > think we can feel free to put that in the "don't do this" category. 
But > I'd rather see us have those rules written down where they are relevant > than have another nebulous statement to help us interpret the other > nebulous statements. </div></div>Yeah, agreed. However, the protocol specifications are not only instructions on how to use it (from the client's perspective), there should also be instructions on how a compositor should behave when it matters. That is why I would encourage explicitly specifying that a protocol error is raised if a client does not do what is expected or does what is explicitly defined as an error (e.g. specifying a negative width when there is no sense in that). One reason is that error codes are interface-specific, and it is much better to raise a protocol error on the object involved, than just a wl_display message "You fail!", because we have an async protocol and it can be hard to track the error back to the request that caused it. Another reason is that testing apps between compositors would be a little more consistent. I don't expect everyone to test on every compositor out there, or even everyone to test on Weston, before releasing an app. </blockquote><div> </div><div>Yeah, it would be good if compositors tried to kill apps for bad behavior. We have to have those error-check paths anyway to keep the compositor from crashing. That said, I don't know that we always need to be more specific than something like XDG_SHELL_USAGE_ERROR. Specific things should have specific errors, but enumerating all possible errors with their own error enum is a pain. The big thing is that we shouldn't abuse the wl_display error enum like we have in the past; we should use the error system properly. </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <div class=""> > > As a quick example, we have been discussing "surface roles" in the past > > week with xdg_surface. 
Unfortunately, Weston doesn't properly mark cursor > > or DND surfaces as having roles, and there's no easy way to check for it in > > the codebase. > > > > From my reading of the code, it seems what would happen is that the cursor > > would disappear as the desktop-shell code changes the surface's position, > > and on the next mouse move, the cursor would reappear as a wl_pointer.enter > > event is sent out over the new surface. > > > > Deleting the cursor surface without wl_pointer.set_cursor(null) is > something that should probably count as "undefined behavior" in the sense > that it would be strange but not fatal. For instance, the compositor could > just hang on to the last buffer and keep using it. There are all sorts of > ways to gracefully handle that. However, the client shouldn't expect > anything in particular. It could, for instance, result in no cursor. > > > > Obviously, no valid program should ever try to reuse a cursor surface as > > an xdg_surface. > > > > Yeah, feel free to kill the client for that. But we should add a line to > wl_pointer that says something like "Once a surface has been used in > set_cursor, it is considered a cursor surface and should never be used for > anything other than set_cursor". > > Make sense? </div>Yes, we should definitely add that sentence, because right now there is no good point in the protocol that would actually take away the cursor role from a surface. I think we should do that for all cases which don't have an explicit clear way for the client to remove the role. (Hmm, actually there is wl_pointer.leave event...) But that still leaves open the question: if a role has been removed, is it allowed to set a different role? As I've said, the (Weston) code is cleaner and more symmetric when you write it as if you could switch roles, but from the protocol point of view it is not useful. Therefore I think I am willing to agree that changing a wl_surface role is an error. 
We can just add a boolean to weston_surface for "has role ever been set", and check it as needed. Or maybe use weston_surface::configure != NULL for that, if it does not lead to strange cases if the client is still committing on that surface after the role is gone. The reason changing roles can never be useful is that all role-specific state is lost when the role is lost. That leaves only the wl_surface state, and that state is dispensable: it is easy for a client to recreate the state on a new wl_surface without a role. Right? There is a slight chance that making this change retroactively will cause some apps to fail. The bug should be in toolkits in that case. Is this a serious concern? </blockquote><div> </div><div>I think it's probably ok, but it's possible there are surface-recycling clients out there. I just looked at Weston and we don't actually prevent surface re-use. What we prevent is a surface being a cursor surface and another type of surface at the same time. I'm kind of a fan of changing it and seeing if anything breaks. If we have to, we can bump the wl_pointer version when we make the change. </div><div>--Jason </div></div></div></div>
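<div>For illustration only, here is a minimal sketch of the "has role ever been set" check discussed above. It is not Weston code: struct sketch_surface and sketch_surface_set_role() are hypothetical names made up for this example, and the role is stored as a string rather than a plain boolean so that repeating the same role (e.g. calling set_cursor twice) stays legal.</div>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of weston_surface. */
struct sketch_surface {
	/* NULL until a role is first assigned; never reset afterwards. */
	const char *role_name;
};

/*
 * Try to assign a role to the surface.
 * Returns 0 if the surface had no role yet or already has this role,
 * and -1 if it has ever been given a different role.
 */
static int
sketch_surface_set_role(struct sketch_surface *surface, const char *role_name)
{
	if (surface->role_name == NULL) {
		surface->role_name = role_name;
		return 0;
	}

	if (strcmp(surface->role_name, role_name) == 0)
		return 0;

	return -1;
}
```

<div>On a -1 result, a compositor would presumably raise a protocol error on the object involved (for example with wl_resource_post_error() and an interface-specific error code) rather than a generic wl_display error. Which error code to use is left open here, as it is in the discussion.</div>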