protocol handling spec?

Tue Aug 10 21:18:11 EEST 2004

I'm going to rearrange and trim the discussion a bit, since we seem to each
be saying the same things five times per message.

On Tue, Aug 10, 2004 at 01:14:28PM +0100, Dave Cridland wrote:

> On Tue Aug 10 08:54:01 2004, Avery Pennarun wrote:
> >[imap uri downloading]
> >It downloads a list of messages, and to get other messages I download...
> >other URIs.
>
> [...] A lot of what I do involves reading very remote, very large, folders
> occasionally. My client doesn't ever bother reading in all the message
> list, because that takes too long.

If you don't try to retrieve the URI corresponding to the folder item list,
there is no reason for your client to download that list.  If you have the
URI of a particular message, you can retrieve that directly, without going
through the folder list first.

> (I've been a shade unclear, I admit - when I say "A URI does not have 
> a MIME type", I mean "The resource pointed to by a URI does not have 
> a MIME type", etc. A list of URIs formatted one per line, with CRLF 
> EOLs, does of course have a MIME type no matter what the URIs are, 
> but that's irrelevant.)

I use the same shorthand, so let's just agree to understand each other
here :)

> No, I mean, encoding something as an octet-stream automatically will 
> slow things down if it wasn't an octet-stream to begin with. XML or 
> not doesn't matter. I'm well aware that any information can be 
> encoded as a stream of octets (or bits) - Shannon proved that quite 
> convincingly a while ago.
> 
> What I'm not convinced about is that this is a good idea when a 
> better interface can be made available.

I both agree and disagree with you here: I think that, to simplify the model
of the world seen by things like web browsers and Nautilus, rendering to the
lowest-common-denominator (an octet-stream) is very useful.

I also agree that, for speed, we can make a better interface available.  I
learned most of the stuff I know about monikers from Pierre Phaneuf, a guy
who works next to me and wrote xplc (http://xplc.sourceforge.net).  It's a
tiny (26k) library that gives you language- and platform-independent
semantics essentially like this:

 - an object implements one or more interfaces.  It always implements
   interface IObject (for reference counting), and usually at least one
   more.

 - monikers can be "resolved" into objects that implement a particular
   interface.  For the sake of simplicity (xplc is actually slightly
   different), let's say that the resolve function takes two parameters: a
   moniker (aka URI) and an interface name.  If you just ask for something
   that implements IObject, you might get one object; if you just ask for
   something that implements the URI API, you might get another object;
   if you ask for something that implements an IMAP API, you might get
   another object.  Or they might be all the same object.  Or if you ask
   for the IMAP API from an http: moniker, you might get no object at all.

 - you can ask any object whether it implements a particular interface, and
   convert it to that.  They all implement IObject, but if someone hands you
   an object implementing the URI API, you can ask if it also supports the
   IMAP API.

(There are lots of so-called "component systems" that implement this kind
of behaviour.  The nice thing about xplc is that it implements little else.)
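
To make the three points above concrete, here is a rough sketch in Python.
It is not xplc's real API (xplc is C++ and, as I said, slightly different);
the class names, the HANDLERS table and the example URIs are all made up for
illustration.

```python
from typing import Callable, Dict, Optional


class IObject:
    """Base of every object; `interfaces` names what it implements."""
    interfaces = frozenset({"IObject"})

    def get_interface(self, name: str) -> Optional["IObject"]:
        # Ask an object whether it implements an interface; None means no.
        return self if name in self.interfaces else None


class HttpResource(IObject):
    interfaces = frozenset({"IObject", "URI API"})

    def __init__(self, moniker: str) -> None:
        self.moniker = moniker


class ImapResource(IObject):
    interfaces = frozenset({"IObject", "URI API", "IMAP API"})

    def __init__(self, moniker: str) -> None:
        self.moniker = moniker


# Hypothetical registry: moniker prefix -> factory for that kind of object.
HANDLERS: Dict[str, Callable[[str], IObject]] = {
    "http": HttpResource,
    "imap": ImapResource,
}


def resolve(moniker: str, interface: str) -> Optional[IObject]:
    """Resolve a moniker into an object implementing `interface`, or None."""
    prefix = moniker.split(":", 1)[0]
    factory = HANDLERS.get(prefix)
    if factory is None:
        return None  # nobody knows how to handle this kind of moniker
    return factory(moniker).get_interface(interface)
```

With this, resolve("imap://host/INBOX", "URI API") hands back an object that
you can later ask for the "IMAP API", while resolve("http://foo/blah",
"IMAP API") gives you no object at all.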

Anyway, now imagine that you're Nautilus, and someone gives you an imap URI.
What do you do?  Well, you can't really do anything until you resolve it to
an object, and you don't understand any interfaces other than IObject and the
URI API.  IObject is kind of useless, so you resolve(moniker, "URI API").

Now you have an object.  Let's say the URI API gives you a MIME type for
that object (which is probably the case).  You can see that the MIME type is
something like message/rfc822 - okay, I know how to view those:
resolve("message/rfc822", "MIME Viewer API").  If this returns nothing, I
simply can't deal with that type of object; offer to save the bitstream to a
file or something.  If it returns something, I *can* deal with that type of
object.  Good; pass the URI object to the MIME object, and let him do the
rest.
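
Roughly, the dispatch Nautilus does could look like the Python sketch below.
UriObject, VIEWERS and open_object are invented names, and the "viewers"
just return a string describing what they would do.

```python
from typing import Callable, Dict, Optional


class UriObject:
    """Stand-in for an object implementing the URI API."""

    def __init__(self, mime_type: str, data: bytes) -> None:
        self.mime_type = mime_type
        self.data = data


Viewer = Callable[[UriObject], str]

# Hypothetical table standing in for resolve(mime_type, "MIME Viewer API").
VIEWERS: Dict[str, Viewer] = {
    "message/rfc822": lambda obj: "mail viewer: %d bytes" % len(obj.data),
}


def resolve_viewer(mime_type: str) -> Optional[Viewer]:
    return VIEWERS.get(mime_type)


def open_object(obj: UriObject) -> str:
    viewer = resolve_viewer(obj.mime_type)
    if viewer is None:
        # Can't deal with this type: fall back to saving the bitstream.
        return "offer to save %d bytes to a file" % len(obj.data)
    # Pass the URI object to the MIME viewer and let it do the rest.
    return viewer(obj)
```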

Let's say the URI object is from an imap: URI, and the MIME object is my
mail reader (because one of the various possible MIME-types that URI might
point to is set to be viewable by my mail reader).  My mail reader receives
the URI object, and has two choices: treat it like a bitstream, which will
definitely work, or check if it supports the IMAP API, which might work.  If
my mail reader is smart, he'll try the second one first (optimizing IMAP
performance where possible), but fall back to the first one, because things
like message/rfc822 can come from places other than just IMAP servers.
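
In code, the mail reader's choice might look something like this Python
sketch.  The class names and fetch_headers() are hypothetical; a real IMAP
API would issue a partial fetch to the server rather than slice local bytes.

```python
from typing import Optional, Tuple


class BitstreamObject:
    """Supports only the URI API: you can read the whole thing."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def read(self) -> bytes:
        return self.data

    def get_interface(self, name: str) -> Optional["BitstreamObject"]:
        return self if name == "URI API" else None


class ImapMessageObject(BitstreamObject):
    """Also supports a (made-up) IMAP API that can fetch just the headers."""

    def get_interface(self, name: str) -> Optional["BitstreamObject"]:
        return self if name in ("URI API", "IMAP API") else None

    def fetch_headers(self) -> bytes:
        # Stand-in for asking the server for the header section only.
        return self.data.split(b"\r\n\r\n", 1)[0]


def show_headers(obj: BitstreamObject) -> Tuple[str, bytes]:
    imap = obj.get_interface("IMAP API")
    if imap is not None:
        # Fast path: might work, and saves downloading the body.
        return "imap", imap.fetch_headers()
    # Fallback: always works, because everything is a bitstream.
    return "bitstream", obj.read().split(b"\r\n\r\n", 1)[0]
```

Either way the reader ends up with the same headers; the only difference is
how much it had to download to get them.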

Now let's say I don't have a mail reader installed on my system at all, but
I try to retrieve the same message/rfc822 URI.  Nautilus will do
resolve("message/rfc822", "MIME Viewer API") and still get an object,
because I have a low-priority viewer defined for message/rfc822: it's the
same as my text/plain viewer.
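
One way to get that behaviour is a viewer table with priorities, sketched
here in Python.  The priority numbers and viewer names are arbitrary
assumptions; nothing above says how priorities would really be stored.

```python
from typing import Dict, List, Optional, Tuple

# MIME type -> list of (priority, viewer name); higher priority wins.
VIEWERS: Dict[str, List[Tuple[int, str]]] = {}


def register_viewer(mime_type: str, priority: int, name: str) -> None:
    VIEWERS.setdefault(mime_type, []).append((priority, name))


def resolve_viewer(mime_type: str) -> Optional[str]:
    candidates = VIEWERS.get(mime_type)
    if not candidates:
        return None
    # Pick the candidate with the highest priority.
    return max(candidates, key=lambda c: c[0])[1]


# The text viewer handles text/plain, and is also registered as a
# low-priority fallback for message/rfc822.
register_viewer("text/plain", 10, "text viewer")
register_viewer("message/rfc822", 1, "text viewer")
```

If I later install a mail reader and it calls
register_viewer("message/rfc822", 10, "mail reader"), it simply outranks the
fallback, and nothing else has to change.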

Furthermore, if imap folder objects can render themselves as "list of uri"
objects, then Nautilus can display the messages in the folder as literally a
set of items in a file folder.  It's not the best way to view imap - but if
you don't have a mail reader installed, it's the *only* way.

> "Using" a telnet or mailto scheme doesn't give you an octet-stream.

While this is true, it can still give you an object implementing the URI
API.  For that object, trying to retrieve the bitstream will launch a
subprogram (telnet or your mail composer) and return NULL.

If it makes you feel better, we could have it return an empty bitstream with
a MIME type that resolves to your mail composer, but that's just a longer
route to the same thing.

> [...] the parameters of a MIME type are not uniform. (I'm referring to the
> parameters of a MIME type, specifically.)

An easy way to deal with this is to have a hash table of key:value pairs for
your MIME type.  Monikers are actually very powerful, though, and allow the
resolved object to continue parsing the moniker to see what it wants to do. 
(For example, "http://foo/blah/x/y?a=b&c=d" could be said to have
non-uniform parameters; but the "http" moniker handler is responsible for
parsing the rest of the string.  Even though it's non-uniform, it's still
all one string.)
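
As a small illustration of "the handler parses the rest of the string", here
is what an http handler could do with that example, using Python's standard
urllib.parse.  The function name parse_http_moniker is made up.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlsplit


def parse_http_moniker(moniker: str) -> Tuple[str, str, Dict[str, str]]:
    """Split an http moniker into host, path and a key:value table."""
    parts = urlsplit(moniker)
    # parse_qs returns a list of values per key; keep only the first of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


host, path, params = parse_http_moniker("http://foo/blah/x/y?a=b&c=d")
```

Whoever calls resolve() never sees any of this; only the "http" handler
needs to know that the part after "?" is a set of key=value pairs.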

> It seems to me that for every time you request the object attached to 
> a URI - let's call it dereferencing it - then you need to provide a 
> URI used for context.

This doesn't seem necessary to me.  Where a file is linked *from* doesn't
have much effect on what I want the file to do, in general.  What
application I'm running might have an effect, but that's fine; each app
should be able to provide overrides for how it wants to view certain MIME
types.  (I doubt that letting apps override how they want to deal with
certain URIs will be much use.)

> But supposing you enter it into a file manager - or rather, you enter 
> it into an application working within a file-like context, more 
> generally - then you probably want the URI to be treated as a DAV 
> item if possible. (If not, you punt it to the web browser.)

It seems to me that I wouldn't want it to be treated as a DAV item unless it
was a DAV URI, but maybe I'm misunderstanding.  In any case, file manager
apps really love working with read/writable bitstreams.

Perhaps I would expect my file manager to open items using a MIME "edit"
action by default instead of a "view" action.

> Actually, given the notion of a context URI with which to handle the 
> current URI, I think it covers everything.

If it were me, and I dragged an http: URI onto my desktop, I'd want it to
act the same regardless of whether I dragged it there from my mail reader,
my file browser, or my web browser.  Thus attaching a context to it wouldn't
help me very much.

> If I "copy" a mailto URI, I don't want an empty file, I want an error. And
> I certainly don't want to actually send an email.

Similarly, if you click on a mailto: URI in your web browser, you don't want
it to pop up an empty browser window.  It's the same problem in both cases. 

But we can deal with it.  For example, the URI API might be defined to
ignore the output entirely if the object returns a NULL blob, but to display
a blank page if it returns a "" (empty string) blob.  Or we could just add a
function to the API, is_actually_viewable().  Or whatever we want.
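
The first option is only a few lines.  A Python sketch, where None plays the
role of the NULL blob and display() is a made-up name for whatever the
viewer does with the result:

```python
from typing import Optional


def display(blob: Optional[bytes]) -> Optional[str]:
    """Return the page text to show, or None to show nothing at all."""
    if blob is None:
        # NULL blob (say, mailto: already launched the composer): ignore it.
        return None
    # An empty blob is still a blob, so it becomes a blank page.
    return blob.decode("utf-8", errors="replace")
```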

Have fun,

Avery