Universal package specification

Eugene Gorodinsky e.gorodinsky at gmail.com
Fri Nov 27 11:20:23 PST 2009


Hi all.

I believe one of the main problems on Linux is the lack of a universal
package format that works on all major distributions. Without a common
format, extra effort has to be spent packaging the same software
separately for each distribution. The main reasons for this are the
limitations of current package management solutions and slight
differences in filesystem hierarchies and the core system stack. There
are two approaches that can be taken to lessen the burden of package
management. One is to make all distros conform to a certain standard
in such a way that any package from one distro can be installed on
another. But this makes all distros alike, which isn't very good for
innovation, as innovation implies differences. The other approach is
to separate system software from user applications and define a
standard set of interfaces used between them. This approach allows
distributions to be different and as a result leaves room for
innovation. The format I'm suggesting takes this approach and is
intended to be used alongside native package formats.

Existing package formats were developed for specific distributions and
don't offer the features required by software that is not
distribution-specific. However, most software isn't really
distribution-specific. Often the only difference between software
packaged for one distribution and the same software packaged for
another is the directories it is installed into. This
is why the packages in the proposed format are installed into a
separate directory such as /opt/<package_name>.

Also, since packages are distributed not only by themselves but
(more and more often) through repositories, some of the header data
in packages such as rpm or dpkg packages is duplicated. In
order to eliminate duplication the new format separates package
information (such as dependencies, package name, provides etc.) from
the actual files that need to be installed. The package information is
contained in the package manifest and the files that will be installed
are contained in package archives.

This specification is by no means complete, and I would like to get some input.

Format of the package manifest.
The package manifest contains the following fields (some are required,
others optional):

Magic
This is the field with the magic number of the package format.

Version
This is the format version.

Key
In order to ensure that a given package came from a trusted vendor,
all packages are digitally signed using DSA. This is the public key.
Required.

Sign
This is the digital signature. Required.

Canonical package name
This name identifies the package. It must consist of English letters,
digits, underscores and dashes only (e.g. firefox).
Required.
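
A minimal sketch of how a tool might check the canonical name rule
above. The function name is made up for illustration, and I am assuming
that "English letters" means ASCII a-z and A-Z and that an empty name is
invalid; the spec itself does not settle either point.

```python
import re

# Assumption: ASCII letters (both cases), digits, '_' and '-' only,
# and at least one character. The spec does not state case rules.
CANONICAL_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_canonical_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return CANONICAL_NAME_RE.fullmatch(name) is not None
```

With this check "firefox" is accepted as a canonical name, while
"Mozilla Firefox" is rejected because of the space and would only be
usable as the user-readable name.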

User readable package name
The user-readable package name in UTF-8 (e.g. Mozilla Firefox). Required.

Package version
This is the version of the package. Required.

Package icon
The name of the icon file. The icon format should either be png or
svg. Optional.

Package base
Some software, for example Pidgin, Eclipse or Firefox, allows plugins
developed by third parties. The package base field is used by plugin
packages to specify which package they are a part of. Optional.

Plugins
As mentioned above, some programs allow third parties to develop
plugins. This field is an array of "plugin path" fields. Optional.

Plugin path
In order for the package manager to know where to copy files from
plugin packages, packages that allow third party plugins define paths
to their plugin directories. Each of these paths is given a name.
Plugins that are being installed use these names in their file paths
(e.g. %path1%/MyPlugin/README).
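
To make the plugin path idea concrete, here is a rough sketch of how a
package manager could expand a %name% placeholder. The function name,
the dict of plugin paths and the /opt/firefox root are all assumptions
for illustration, as is the check that refuses paths escaping the base
package's directory.

```python
import posixpath
import re


def resolve_plugin_path(file_path, plugin_paths, base_root):
    """Expand a leading %name% using the base package's named plugin paths.

    plugin_paths maps names (e.g. "path1") to directories relative to
    base_root, the directory the base package is installed into.
    """
    match = re.match(r"%([^%/]+)%/(.+)", file_path)
    if match is None:
        raise ValueError(f"no plugin path placeholder in {file_path!r}")
    name, rest = match.groups()
    if name not in plugin_paths:
        raise KeyError(f"base package defines no plugin path named {name!r}")
    root = posixpath.normpath(base_root)
    target = posixpath.normpath(posixpath.join(root, plugin_paths[name], rest))
    # Assumption: a plugin must never write outside the base package root.
    if not target.startswith(root + "/"):
        raise ValueError(f"{file_path!r} escapes {root!r}")
    return target
```

For example, with path1 defined as "path/to/plugins" and the base
package installed in /opt/firefox, "%path1%/MyPlugin/README" would end
up at /opt/firefox/path/to/plugins/MyPlugin/README.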


Components
This is an array of component fields. Each package contains one or
more components. Each component in the package contains the following
fields: component id, component group id, component dependencies
field, component provisions field and files field.

Component id
This is the id of the component. The component with an id of 0 is
mandatory. Other components are optional.

Component group id
This is the id of the group a component belongs to. 0 means the
component doesn't belong to any groups.

Component dependencies
Currently packages depend on package names. What they should depend on
is functionality. And so in this format the packages depend on
specific interfaces. The component dependencies field is an array
consisting of arrays of subfields: interface type, interface name,
interface version. Optional.

Interface type
So far there are two interface types: shared libraries and dbus
interfaces. I believe the fewer interface types we have the better.
This gives distributions more power over how different they want to
be, thus allowing them to innovate and experiment. There probably
should be mono and java interface types too - I haven't thought about
it yet.

Interface name
For library interface types this is the name of the shared library.
For dbus interface types this is the name of the interface.

Interface version
For libraries this is the version of the library required. For dbus
interfaces this is NULL since, to my knowledge, dbus has no support
for interface versioning.


Component provisions
This field is equivalent to component dependencies with the following
exception: since universal packages cannot be installed into an
absolute path, they can't provide shared libraries. Optional.
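
The dependency and provision fields described above could be matched
roughly like this. The Interface tuple and both function names are
hypothetical, and I am assuming an exact version match for libraries,
since the spec does not yet say how versions are compared.

```python
from typing import NamedTuple, Optional


class Interface(NamedTuple):
    type: str                # "library" or "dbus"
    name: str                # library name or dbus interface name
    version: Optional[str]   # None (NULL) for dbus interfaces


def satisfies(provided: Interface, required: Interface) -> bool:
    """True if a provided interface fulfils a required one."""
    if (provided.type, provided.name) != (required.type, required.name):
        return False
    # Assumption: exact match; a real spec would define version ordering.
    return required.version is None or provided.version == required.version


def missing_dependencies(depends, available):
    """Return the required interfaces that nothing available satisfies."""
    return [dep for dep in depends
            if not any(satisfies(p, dep) for p in available)]
```

Here "available" would be the interfaces offered by the system and by
already installed packages; a component is installable when
missing_dependencies returns an empty list.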

Files
This field is an array of package archive filenames, their intended
architectures and hashes.
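
As an illustration of how the files field might be used, the sketch
below picks the archive for the local architecture and checks its hash.
The dict keys mirror the name/arch/hash triple above; the function names
and the choice of SHA-256 in hex are my assumptions, as the spec does
not name a hash algorithm.

```python
import hashlib


def pick_archive(files, arch):
    """Return the first file entry whose 'arch' matches, or None."""
    for entry in files:
        if entry["arch"] == arch:
            return entry
    return None


def archive_matches_hash(data: bytes, expected_hex: str) -> bool:
    """Compare the archive contents against the manifest hash.

    Assumption: the hash is a hex-encoded SHA-256 digest.
    """
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()
```

An installer would call pick_archive with the machine's architecture,
download the named archive, and refuse to unpack it unless
archive_matches_hash returns True.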



Languages
This field is an array of language fields. Each language field
contains a language id and language-specific data. The
language-specific data field contains two arrays: a component fields
array and a group fields array. Each entry in these arrays contains an
id field, a name field and a description field.


Here's a rough example of the manifest file in XML. (XML is only used
to make this example more readable; I'm leaning towards using a binary
format for the manifest.)

<manifest version="1.0">
	<key>key</key>
	<sign>sign</sign>
	<package canon_name="my_canon_name" name="My Package" version="1.0"
	         base="package_name" icon="icon.png">
		<plugins>
			<path name="path1">path/to/plugins/relative/to/this/package/root</path>
			<path name="path2">another/path/to/plugins</path>
		</plugins>
		<component id="0" group_id="0">
			<depends>
				<interface type="library" name="libname" version="2.14" />
				<interface type="dbus" name="org.freedesktop.DBus.Hello" />
			</depends>
			<provides>
				<interface type="type" name="name" version="interface.version" />
			</provides>
			<files>
				<file name="filename1" hash="hash" arch="arch" />
			</files>
		</component>
		<component id="1" group_id="0">
			<depends>
				<interface type="library" name="libcool" version="2.14" />
			</depends>
			<files>
				<file name="filename2" hash="hash" arch="arch" />
			</files>
		</component>
		<languages>
			<language id="en">
				<components>
					<component id="1" name="name" description="blah lah blah" />
				</components>
				<groups>
					<group id="1" name="name" description="blah lah blah" />
				</groups>
			</language>
		</languages>
	</package>
</manifest>
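
Assuming the attribute values are quoted so that the XML is
well-formed, a manifest shaped like the example can be read with
Python's standard xml.etree.ElementTree module. This is only a sketch:
the embedded manifest is a trimmed copy of the example, and
read_components is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example manifest, with quoted attribute values.
MANIFEST = """\
<manifest version="1.0">
  <package canon_name="my_canon_name" name="My Package" version="1.0">
    <component id="0" group_id="0">
      <depends>
        <interface type="library" name="libname" version="2.14" />
        <interface type="dbus" name="org.freedesktop.DBus.Hello" />
      </depends>
      <files>
        <file name="filename1" hash="hash" arch="arch" />
      </files>
    </component>
  </package>
</manifest>
"""


def read_components(xml_text):
    """Return (canonical name, {component id: dependencies and files})."""
    package = ET.fromstring(xml_text).find("package")
    components = {}
    # findall("component") only matches direct children of <package>,
    # so the <component> entries under <languages> are not picked up.
    for comp in package.findall("component"):
        components[int(comp.get("id"))] = {
            "depends": [(i.get("type"), i.get("name"), i.get("version"))
                        for i in comp.findall("depends/interface")],
            "files": [f.get("name") for f in comp.findall("files/file")],
        }
    return package.get("canon_name"), components
```

A missing version attribute, as on the dbus interface, comes back as
None, which lines up with the NULL interface version described earlier.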


Format of package archives:

The package archives are simply binary files containing a magic
number, a table of contents listing the file paths to which the data
in the archive should be written during installation, and the
LZMA-compressed data blocks.
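
Since the byte layout is not defined yet, the following is only a
sketch of one possible layout, to show that the magic / table of
contents / compressed blocks structure is easy to read and write.
Everything concrete here is an assumption: the "UPKG" magic, the
little-endian field sizes, one compressed block per file, and the use
of Python's lzma module with its default .xz container.

```python
import io
import lzma
import struct

MAGIC = b"UPKG"  # hypothetical magic number, not part of the spec


def write_archive(files: dict) -> bytes:
    """Pack {install path: file bytes} as magic, TOC, then data blocks."""
    blocks = [(path.encode("utf-8"), lzma.compress(data))
              for path, data in files.items()]
    out = io.BytesIO()
    out.write(MAGIC)
    out.write(struct.pack("<I", len(blocks)))          # number of TOC entries
    for path, block in blocks:
        # TOC entry: path length (u16), block length (u64), then the path.
        out.write(struct.pack("<HQ", len(path), len(block)))
        out.write(path)
    for _, block in blocks:
        out.write(block)
    return out.getvalue()


def read_archive(blob: bytes) -> dict:
    """Inverse of write_archive: return {install path: file bytes}."""
    src = io.BytesIO(blob)
    if src.read(4) != MAGIC:
        raise ValueError("not a package archive")
    (count,) = struct.unpack("<I", src.read(4))
    toc = []
    for _ in range(count):
        path_len, block_len = struct.unpack("<HQ", src.read(10))
        toc.append((src.read(path_len).decode("utf-8"), block_len))
    # Blocks follow the TOC in the same order as its entries.
    return {path: lzma.decompress(src.read(block_len))
            for path, block_len in toc}
```

Keeping the whole table of contents ahead of the data lets an
installer list the target paths without decompressing anything.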

