protocol handling spec?

Wed Aug 11 22:01:22 EEST 2004

I'd like to resolve this long discussion so we can get on with the
constructive part (what to do about solving the actual problem).  So here
goes:

On Wed, Aug 11, 2004 at 10:41:16AM +0100, Dave Cridland wrote:
> On Tue Aug 10 19:18:11 2004, Avery Pennarun wrote:
> >If you don't try to retrieve the URI corresponding to the folder item
> >list, there is no reason for it to do so.
>
> But generating a folder message list is expensive. Generating part of 
> one is cheap.

If you want only *part* of the message list, ask for it using the
appropriate URI (such as one that implements an imap search).

We can add infinite levels of twiddly bits onto the imap URI for any
possible subset of data you could possibly want; in the end, imap just
always returns either a list of messages or some kind of message content
(headers, body, attachments, etc).  The list of messages may be the full
contents of a folder, or some subset generated using any method at all.
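
As a rough sketch of that two-way split (an illustration only, not code from
any existing handler), here is how an imap URI could be sorted into "list of
messages" versus "message content".  The function name classify_imap_uri,
the host mail.example.org, and the returned dictionaries are all made up for
this example; the URI layout follows the usual mailbox / ";UID=" / "?search"
form and ignores authentication and UIDVALIDITY details.

```python
from urllib.parse import urlsplit, unquote


def classify_imap_uri(uri):
    """Return ("list", info) or ("content", info) for an imap: URI.

    Hypothetical helper: a URI that names a mailbox (optionally with a
    "?search" part) yields a message list; a URI carrying ";UID=" yields
    the content of one message (or one section of it).
    """
    parts = urlsplit(uri)
    if parts.scheme != "imap":
        raise ValueError("not an imap URI: %r" % (uri,))

    path = parts.path.lstrip("/")
    marker = path.lower().find("/;uid=")
    if marker != -1:
        # Message content: mailbox name, then ";key=value" segments.
        info = {"mailbox": unquote(path[:marker])}
        for segment in path[marker + 2:].split("/;"):
            key, _, value = segment.partition("=")
            info[key.lower()] = unquote(value)
        return "content", info

    # Message list: the whole folder, or the subset chosen by the search.
    search = unquote(parts.query) if parts.query else None
    return "list", {"mailbox": unquote(path), "search": search}
```

Whatever extra twiddly bits get added to the URI, the caller still only has
to cope with the same two kinds of result.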

imap is a really easy example because of this; it really does map perfectly
onto the bitstream/list model I've been proposing, and you don't even need a
separate API to make it efficient.  If we must continue this part of the
discussion, let's use a different uri scheme that works more poorly, like
"irc" or "mailto" or "smtp".

> >I both agree and disagree with you here: I think that, to simplify the
> >model of the world seen by things like web browsers and Nautilus,
> >rendering to the lowest-common-denominator (an octet-stream) is very
> >useful.
>
> In some cases, though, I don't think it's suitable.

Why?  It's usually not much work, and uniform APIs can be more useful than
we think.  That is, after all, why Unix's design is so convenient: it mapped
things into files that, until then, people figured shouldn't be files.  I
want to continue in the same direction, particularly if it's not any extra
work.

> >(There are lots of so-called "component systems" that implement this
> >kind of behaviour.  The nice thing about XPLC is it implements 
> >little else.)
>
> Nor does COM, strictly speaking. It's all the additional cruft to do 
> with Microsoft incorporating the object broker into the OS that 
> causes the problems. But anyway, so XPLC is a nice COM clone.

XPLC is really not a COM clone.  It's only 26k.  COM is more than 26k.
However, I'm glad we agree about the major feature we expect from a
component system.  I hope we can also agree that the essential point of this
discussion is what specific standard interface(s) a URI handling component
should provide.  If that's not the case, then I don't think we can talk
about a "protocol handling spec", because there would be nothing to
standardize.

> It's this URI API business I have problems with. URIs are uniform, 
> the resources are not. I don't see the point in coercing them all to 
> be alike.

Somehow or another, they are all alike, because I can type a uri into the
URL bar in konqueror/nautilus/mozilla and I expect "something" to happen. 
If we can define what that something is, then we have, in effect, defined
some kind of standard API.

But what is that API?  Maybe it's "run this program when you get this kind
of URI".  Maybe it's the read/write blob thing like KDE has.  Or maybe it's
something else.  Do you have a suggestion that I'm missing?

> >Now you have an object.  Let's say the URI API gives you a MIME type for
> >that object (which is probably the case).  You can see that the MIME type
> >is something like message/rfc822 - okay, I know how to view those:
> >resolve("message/rfc822", "MIME Viewer API").  If this returns nothing, I
> >simply can't deal with that type of object; offer to save the bitstream
> >to a file or something.  If it returns something, I *can* deal with that
> >type of object.  Good; pass the URI object to the MIME object, and let
> >him do the rest.
>
> No problem with that. Hopelessly inefficient in some cases, but 
> possible in some cases.

In which cases is this inefficient?  The above *only* requires a
uri->get_mime_type() function, which for many kinds of URIs may not require
retrieving the resource at all.  If it does require contacting the server,
it's because we don't know the MIME type of the resource, which means we
know almost nothing, which means we certainly don't know which app to run.
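
To make the flow concrete, here is a minimal sketch of that dispatch.  Only
get_mime_type() and the resolve("message/rfc822", "MIME Viewer API") call
come from the discussion; the UriObject class, the MIME_VIEWERS table, the
register_viewer() helper, and the "save-to-file" fallback value are
assumptions invented for the example.

```python
class UriObject:
    """Hypothetical object handed back by a URI handler."""

    def __init__(self, uri, mime_type, fetch):
        self.uri = uri
        self._mime_type = mime_type
        self._fetch = fetch      # callable that actually retrieves the data
        self.fetched = False     # lets us observe whether retrieval happened

    def get_mime_type(self):
        # For many URIs this can be answered without retrieving anything.
        return self._mime_type

    def read(self):
        self.fetched = True
        return self._fetch()


# (mime type, API name) -> viewer callable; a stand-in for a component
# registry.
MIME_VIEWERS = {}


def register_viewer(mime_type, viewer, api="MIME Viewer API"):
    MIME_VIEWERS[(mime_type, api)] = viewer


def resolve(mime_type, api):
    """Return a viewer for this MIME type, or None if nobody handles it."""
    return MIME_VIEWERS.get((mime_type, api))


def open_object(obj):
    viewer = resolve(obj.get_mime_type(), "MIME Viewer API")
    if viewer is None:
        # Can't deal with this type: offer to save the bitstream instead.
        return "save-to-file"
    # Pass the URI object to the MIME object and let it do the rest.
    return viewer(obj)
```

Note that open_object() never calls read(): picking the viewer costs one
get_mime_type() call, and the viewer decides how much to retrieve.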

> But the only way you're likely to determine, in many cases, what the 
> MIME type is is because you've actually hit the network. The only 
> exceptions I can think of are for imap URIs where the URI is 
> identifiable as a message, for data scheme URIs, and of course for 
> local file URIs (although we don't actually know that's local).

True.  If someone gives me an http uri pointing at an MPEG movie,
but my software doesn't yet know that it's an MPEG movie (because it can't
possibly), then it *will* have to touch the network.  I can't think of a way
around that.  If you can, please tell me, because then we can abandon this
whole line of reasoning.

> Well, I'd not imagine for a moment that a desktop system is going to 
> understand IMAP without a mail reader available, but still. Yes, this 
> is a potential fallback interface.

KDE does.  It's quite neat.  I can download imap content when someone sends
me a URI, without needing to open my mail reader app to do it.

We can argue for hours about whether this is useful or not; the point is
that it was no extra work, once a good design was in place.  It will also be
very easy to extend my mail reader to retrieve its mail from a webdav (eek!)
service someday, in case I go crazy.

The point is to separate the *retrieving* of the object from the *doing
stuff* with the object.  If done properly, the result is quite elegant.

> Secondly, I don't really see a problem with saying "This URI has no 
> client available". You have to with some URIs anyway. It's not a big 
> deal.

Of course that's fine, although you'd want this error to come up as seldom
as possible.  When I tell my app to do something, it should try to do it if
it possibly can, not give me arbitrary error messages about why it can't.

Besides which, this discussion started out as a plan to unify URI handling
between the different desktops.  If most URI handling is "no client
available", I guess we're already done :)

> Fourthly, if you really did want this sort of thing, then surely 
> you'd really want to have an interface defined as a core interface 
> which provided a random access array interface to a set of other 
> objects, which'd surely be more generally useful rather than 
> serializing data into some internal format?

We can discuss what interface would be best.  I actually don't care what the
interface is, as long as most URI handlers can support it (in addition to
their "native" interface, if any).

> s/route/hack/

If your design is so simple that features fall out of it without you doing
any work, it's not a hack.  It's a good design.

> There's no bitstream, there cannot be, we can see that from the URI.

Who are "we"?  Nautilus doesn't understand most kinds of URI, so it certainly
*can't* see from the URI that there is no bitstream.  It will have to ask a
URI handler, "Is there a bitstream?"  My proposal was that it would ask for
a bitstream and get NULL.  If you want an is_there_a_bitstream() function,
that's fine with me too.
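
Either variant is only a few lines.  The sketch below is an assumption-laden
illustration: the handler classes, the HANDLERS table, and open_bitstream()
are invented names, Python's None stands in for NULL, and the data: handler
only understands the plain "data:,text" form (no media type, no base64).

```python
import io
from urllib.parse import unquote_to_bytes


class MailtoHandler:
    """A mailto: URI names an action, not a resource: no bitstream."""

    def open_bitstream(self, uri):
        return None


class DataHandler:
    """Simplified data: handler; only the 'data:,payload' form."""

    def open_bitstream(self, uri):
        _, _, payload = uri.partition(",")
        return io.BytesIO(unquote_to_bytes(payload))


HANDLERS = {"mailto": MailtoHandler(), "data": DataHandler()}


def open_bitstream(uri):
    """Ask the scheme's handler for a bitstream; None means there is none."""
    scheme = uri.partition(":")[0].lower()
    handler = HANDLERS.get(scheme)
    if handler is None:
        raise LookupError("no handler for scheme %r" % (scheme,))
    return handler.open_bitstream(uri)


def is_there_a_bitstream(uri):
    # The explicit query, expressed in terms of the NULL-returning call.
    return open_bitstream(uri) is not None
```

The caller (Nautilus, say) never inspects the URI itself; it only asks the
handler and copes with the answer.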

> The only reason I can think of is that current desktop systems in 
> xdg-land are only capable of dispatch based on MIME type, and hence 
> the only way to dispatch a protocol is to pretend it's got a MIME 
> type somewhere.

This isn't true at all.  gnome can run a given program when you give it a
particular URI.  It can run a different program when an http URI returns a
jpeg image.  KDE has similar functionality, but doesn't share the config
with gnome.

There is (and this was always my whole point) really no required
relationship at all between the MIME handler component and the URI handler
component, although in some cases (Mozilla) they're in the same application.

> For some, it starts to make a real difference - text/icalendar, for 
> instance, uses parameters to indicate what the iCalendar object is 
> doing there - if it's a request or not, for instance. So the 
> parameters are intended to be useful to the recipient, allowing them 
> to make decisions about how to handle the content *without* having to 
> examine it first.
> 
> All current desktop MIME dispatching, as far as I'm aware, uses 
> solely the top-level type and subtype, and never the parameters. 

That's nice.  But I'm not sure I would ever want to send my iCalendar object
*to a different program* depending on the MIME parameters.  Maybe I'm wrong.

> The problem is that if you grabbed the URI from subversion, or a web 
> folder, and slung it onto your desktop, then a reasonable expectation 
> would be to reopen it in the same way later.

I think I understand you; you're saying that http://foo/blah might refer to
a weird subversion-version-retrieval operation with lots of parameters
attached to it, if subversion was using that URI, but refers to something
totally different if my web browser was just browsing that URI.

Okay then.  Well, I think we can agree that *most* people, when they see an
http: url, expect it to be treated as a web page.  So for URIs that expect
to be treated as something else, we can use another handy feature of
monikers: nesting.

	svn:http://foo/blah

The URI handler for svn makes sure we're in Subversion and/or using the
Subversion protocol.  It parses the right-hand-side of the URI and does what
it would always do.  If you enter a URI into Subversion itself, it would
simply parse the URI directly, and any "svn:" prefix would be stripped off
transparently (since you're already in Subversion).
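
A toy version of that nesting, to show how little machinery it needs.  The
registry, the handler() decorator, and the tuples the handlers return are
all invented for illustration; this is not XPLC's or UniConf's actual
moniker API.

```python
# scheme -> handler callable; a stand-in for a moniker registry.
HANDLERS = {}


def handler(scheme):
    """Decorator that registers a handler for one URI scheme."""
    def register(func):
        HANDLERS[scheme] = func
        return func
    return register


def resolve(uri):
    """Split off the leftmost scheme and hand the rest to its handler."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in HANDLERS:
        raise LookupError("no handler for %r" % (uri,))
    return HANDLERS[scheme](rest)


@handler("http")
def http_handler(rest):
    # Most people expect a bare http: URI to be treated as a web page.
    return ("web-page", "http:" + rest)


@handler("svn")
def svn_handler(rest):
    # The right-hand side is itself a URI; Subversion does with it what
    # it would always do.
    return ("subversion", strip_svn_prefix(rest))


def strip_svn_prefix(uri):
    """Inside Subversion, any leading 'svn:' is dropped transparently."""
    while uri.startswith("svn:"):
        uri = uri[len("svn:"):]
    return uri
```

With this, resolve("http://foo/blah") goes to the web-page handler, while
resolve("svn:http://foo/blah") goes to Subversion with the same inner URI.
A handler that wanted deeper nesting could call resolve() on its own
right-hand side.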

This is the magic of monikers.  The people who saw my UniConf talk or read
my UniConf paper will be starting to see where I'm going here.
(See http://open.nit.ca/uniconf.pdf.  Monikers in UniConf are expected to
return objects implementing IUniConfGen, but you can nest them many levels
deep, just as you might want to do with your URI handlers in nautilus.)

Have fun,

Avery