Clipboard mime-types (Was Re: [Annoyances] X-Windows Copy & Paste)

Dom Lachowicz domlachowicz at yahoo.com
Thu Aug 21 18:20:00 EEST 2003


--- Thomas Leonard <tal00r at ecs.soton.ac.uk> wrote:
> Why not just use the MIME types, as we do for drag-and-drop
> (which works just like the clipboard anyway)?
> (image/png, application/octet-stream for your examples)

This is what a lot of the major Office-like
applications have been doing for a while now. AbiWord,
for instance, exports the following atoms (in addition
to the various STRING/TEXT ones):

text/plain (utf8 text)
application/richtext
text/rtf
text/html (but really a complete xhtml 1.0 document,
utf8)
application/xhtml+xml (complete xhtml 1.0 document,
utf8)
image/png
image/svg
...

For our 2.2 release, we plan on advertising *all* of
our supported import and export formats to the
clipboard via mime-types, barring some new standard
proposed here.

Gnumeric and Evolution behave similarly to AbiWord. I
know that Mozilla posts text/html, but it's kind-of
yicky (UCS2 data that is basically a template and
needs to be merged with other clipboard atoms to
become useful). OpenOffice accepts *at least* the
richtext and html mime-types as clipboard atoms. We'd
probably also want to put the OOo format on the
clipboard in the near future too. OOo probably does so
already.

I'd also like to avoid overly generic mime-types like
the "octet-stream" you mention above. I don't think
that they are terribly useful for clipboard usage
because they lack concrete specificity where it is
most needed (to avoid file-type identification on the
client side and to avoid unnecessary round-trips to the
server of potentially huge documents for the "miss"
cases). 

Consider the following entirely hypothetical examples:

* File roller posts a Zip file as
"application/octet-stream" to the clipboard
* AbiWord looks at the clipboard for MSWord documents,
which also have an "application/octet-stream" generic
mime-type.
* AbiWord sucks down the data from the server and then
has to do file-type identification on the document's
contents before using it. Abi finds that it can't use
it and must request a new atom from the target list.
More server round-tripping.
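
The cost in the example above can be sketched as a toy simulation.
This is only an illustration under stated assumptions, not how
AbiWord or any toolkit actually does it: FakeClipboard and
paste_msword are hypothetical names, and "application/msword" is
assumed to be the specific type a Word-aware consumer would ask for.
The byte signatures are the usual magic numbers for Zip and OLE2
files.

```python
ZIP_MAGIC = b"PK\x03\x04"
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container used by MS Word .doc


class FakeClipboard:
    """Hypothetical stand-in for a selection owner; counts payload transfers."""

    def __init__(self, targets):
        self.targets = dict(targets)  # target name -> payload bytes
        self.transfers = 0

    def list_targets(self):
        # Comparable to asking for the target list: no payload is moved.
        return list(self.targets)

    def fetch(self, target):
        # Comparable to requesting one target: the whole payload is moved.
        self.transfers += 1
        return self.targets[target]


def paste_msword(clipboard, wanted="application/msword"):
    """Return Word data from the clipboard, or None if there is none."""
    targets = clipboard.list_targets()
    if wanted in targets:
        return clipboard.fetch(wanted)
    if "application/octet-stream" in targets:
        # The generic type says nothing, so the data has to be pulled
        # across and sniffed before we know whether it is usable.
        data = clipboard.fetch("application/octet-stream")
        if data.startswith(OLE2_MAGIC):
            return data
    return None


# A Zip posted under the generic type costs a wasted transfer ...
generic = FakeClipboard({"application/octet-stream": ZIP_MAGIC + b"rest of archive"})
paste_msword(generic)
print(generic.transfers)   # 1

# ... while the same Zip posted under a specific type costs none.
specific = FakeClipboard({"application/zip": ZIP_MAGIC + b"rest of archive"})
paste_msword(specific)
print(specific.transfers)  # 0
```

With a specific type, the "miss" is decided from the target list
alone and the payload never leaves the owner.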

Or consider an even simpler case, where something like
File Roller puts a Zip on the clipboard, Gnumeric
looks for an OpenOffice file (which is a Zip file),
and then needs to do some deep identification on the
file's substreams to determine if it is really usable
or not. This would be easily avoided if
"application/vnd.openoffice.whatever_it_is" was the
target atom instead.
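
To give an idea of what that "deep identification" involves, here is
a rough sketch. It assumes, purely for illustration, that a document
package can be recognised by the presence of a member named
"content.xml"; that marker name and the helper names are my
assumptions, not a description of what Gnumeric does.

```python
import io
import zipfile


def looks_like_office_package(data, marker="content.xml"):
    """Open a Zip payload and check for a member that marks it as a document.

    The marker name is an assumption made for this sketch.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return marker in archive.namelist()
    except zipfile.BadZipFile:
        # Not a Zip at all, so certainly not a document package.
        return False


def make_zip(members):
    """Build an in-memory Zip from a {name: text} mapping (test helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, body in members.items():
            archive.writestr(name, body)
    return buf.getvalue()


document = make_zip({"content.xml": "<doc/>", "META-INF/manifest.xml": "<m/>"})
archive_of_photos = make_zip({"holiday.jpg": "not really a jpeg"})

print(looks_like_office_package(document))           # True
print(looks_like_office_package(archive_of_photos))  # False
```

Note that the whole payload has to be in hand before any of this can
run, which is exactly the transfer a specific target name would have
made unnecessary.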

That said, I'd love to create some sort of standard
here regarding use of mime-types, URIs, or otherwise
on the clipboard. The currently defined atoms are too
limiting in what they can reliably represent, and
aren't capable of being descriptive enough as to their
contents' types. They're also quite limiting as to the
number of formats you can advertise - e.g. if you end
up putting HTML inside of (say) COMPOUND_TEXT, you
can't also advertise that you've got DocBook up for
grabs too.
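
By contrast, with one target per mime-type an owner can offer as many
formats as it likes and only render the one that is actually asked
for. A minimal sketch of that idea follows; Offer and choose are
hypothetical names, and "application/docbook+xml" is used here only
as an illustrative type name for DocBook.

```python
class Offer:
    """Hypothetical clipboard offer: one lazy renderer per mime-type."""

    def __init__(self):
        self._renderers = {}  # mime-type -> zero-argument callable

    def advertise(self, mime_type, render):
        self._renderers[mime_type] = render

    def targets(self):
        # What a consumer sees when it asks for the target list.
        return list(self._renderers)

    def convert(self, mime_type):
        # Only the requested format is ever rendered.
        return self._renderers[mime_type]()


def choose(offered, preferred):
    """Return the first of the consumer's preferred types that is on offer."""
    for mime_type in preferred:
        if mime_type in offered:
            return mime_type
    return None


offer = Offer()
offer.advertise("text/html", lambda: b"<p>hello</p>")
offer.advertise("application/docbook+xml", lambda: b"<para>hello</para>")

# A consumer that prefers DocBook but can fall back to HTML.
picked = choose(offer.targets(), ["application/docbook+xml", "text/html"])
print(picked)                 # application/docbook+xml
print(offer.convert(picked))  # b'<para>hello</para>'
```

Both formats are on offer at once, and the consumer decides which one
it wants from the names alone.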

Mime-types have the benefit of being standardized and
in broad usage by the various applications, desktops,
and toolkits represented on this list. They also
support vendor extensions, which I think would be
useful here. As it stands, there is something like a
de facto standard already today, and that is "use
mime-types." If we can make this a formal
recommendation, that would make me really happy.

Further, Jody Goldberg, a few Mozilla guys, and I are
trying to formulate a course of action as to what we'd
like to see exist in the various HTML atoms, including
a multi-part extension (XHTML + CSS + images bundled
together) so that you can find interesting reliable
ways to express compound documents.

Speaking from experience, using mime-types as clipboard
atoms has been extremely powerful and beneficial for
AbiWord and other Office applications. Office
applications tend to be good stress-tests for this
sort of thing. This is partly because they generally
handle a great many import/export formats and image
types, have compound structures (lists, tables,
images, sub-documents, ...), and our users demand
content and format-preserving clipboard behavior. If
you build a system that will handle these use-cases,
it should be able to handle just about anything you
throw at it.

This is a problem worth thinking about. When I have a
bit more done on my HTML "spec", I'll forward it here
for consideration.

Best regards and sorry for the verbosity,
Dom
