[packagekit] 1-click; Third party vendors; etc.

Mon Jun 2 05:12:10 PDT 2008

Greetings All,

I have been watching with interest the discussion of one-click and
third party repositories resulting from fosscamp on this list. I have
been too busy with exams to risk commenting and embroiling myself in a
discussion until now. I would like to comment on some of the points
raised throughout this thread. I'll start a new thread as the other
was already getting rather off topic and there's no single message to
which I want to reply.

For those of you who are unfamiliar with what OCI is, or how it is
used, you might find my slides from fosdem[0] or the wiki section[1]
helpful.

The first thing that was asked was what the use cases for OCI are. The
traditional open package manager -> search -> install works well for
finding and installing software in the distribution's official
repository. However, there are two limitations:

a) Only software published by the system vendor is available (unless
other repositories are added)
b) Software is only locatable based on metadata supplied by the
packagers - name, summary, provides...

OCI addresses the first point by making it no more difficult to
install software published by a third party vendor than the
distributor. Obviously there are security and trust issues here which
I will cover later. The second point is also addressed by allowing
direct installation from wherever the user finds the software. Let us
consider how users actually discover that they wish to install
something. I do not believe it is always through perusing the software
available in the package manager. Some other examples would be:

* User finds a product's webpage, thinks this looks good, wants to install it.
* User purchases or is given a DVD containing some software, wants to
install it.
* User reads a review or article about software, and wants to install it.
* User reads a developer blog about some software and wishes to install it.
* User finds a tutorial explaining how to accomplish a goal, which
requires some software to be installed.

Consider how you currently get from a project/product web page to
installed software, for example with pidgin[2], amarok[3], or the
amazon mp3 downloader[4]. The user usually first has to drill down,
selecting their distribution and version, and then either gets an
individual rpm/deb package or complex instructions for adding a
package repository and installing the software.

Textual instructions for adding a repository and installing software
are in my opinion too complicated for most users to follow. A link to
an individual package has other problems - how will the package be
kept up to date with security and other fixes, what if that package
requires other software not in the distribution's main repository?

One option is to have for example an RPM that installs a .repo and key
for a repository and uses that repository in dependency resolution.
However, this creates a very distribution specific solution - a
different package will most likely be required for supporting
different distributions. Different repositories and package names will
likely be required for each different distribution. At the very least
a different package would be required for debian based systems. So
this solution would still, at best, require a user to know what
operating system he or she is using. Also, if you're relying on normal
packages to also add and remove repositories, is it possible for the
user interface to make it clear what is happening? There are also
other approaches to enable a link to install software, such as
apturl[5], and ThirdPartyApt[6], although these are both very
ubuntu-specific solutions and have some other limitations.

OCI allows a single install link to contain instructions for multiple
distributions. The handler should choose the instructions appropriate
for the distribution it is running under. This is already being used
successfully to have install links that work regardless of the version
of openSUSE the user is using[7]. The format is actually fairly
similar to that proposed by David Zeuthen[8]. It allows instructions
for multiple distributions, localisable information, mirrors, etc.
There is a brief overview[9] and a full schema[10] if you want more
information. There is also a proof of concept implementation of a
handler for ubuntu[11].
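
To illustrate the selection step described above, here is a minimal
Python sketch. It does not use the real OCI format (see the schema[10]
for that). The Instructions structure, the select_instructions
function, the distribution labels used as keys and the example.org
URLs are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Instructions:
    """Install instructions for one distribution (hypothetical structure)."""
    repositories: List[str] = field(default_factory=list)
    packages: List[str] = field(default_factory=list)


def select_instructions(
    install_link: Dict[str, Instructions], running_distribution: str
) -> Optional[Instructions]:
    """Return the instructions labelled for the running distribution.

    Returns None when the link carries nothing for this distribution,
    so the handler can refuse rather than install incompatible packages.
    """
    return install_link.get(running_distribution)


# Made-up example data; the URLs are placeholders, not real repositories.
install_link = {
    "openSUSE 10.3": Instructions(
        repositories=["http://example.org/repo/openSUSE_10.3/"],
        packages=["amarok"],
    ),
    "openSUSE 11.0": Instructions(
        repositories=["http://example.org/repo/openSUSE_11.0/"],
        packages=["amarok"],
    ),
}

chosen = select_instructions(install_link, "openSUSE 11.0")
if chosen is None:
    print("No instructions for this distribution; nothing to install.")
else:
    print("Would add repositories:", chosen.repositories)
    print("Would install packages:", chosen.packages)
```

The point of the sketch is only that one link carries several sets of
instructions and the handler, not the user, picks the matching one.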

It is quite possible to implement an OCI handler for every
distribution, which does the right thing with each distribution's
package management. If OCI could be built on top of packagekit then a
single handler could work on every distribution with a packagekit
backend implemented. Indeed, it would seem foolish to implement a
distribution specific packagekit implementation, and then have to do
the same again for an OCI handler. As far as I am aware it is not
currently possible to implement using the packagekit API, as there
appears to be no way to add or remove repositories[12].

Having considered some of the uses for OCI, perhaps now is a good
point to consider whether easy addition of third party repositories,
or even the very concept of third party repositories is a
fundamentally bad idea.

Patryk[13] and David[8] advocated using only distro-provided packages
as third party packages can lead to dependency-hell problems. I see at
least two separate issues here:

a) Conflicts etc caused by installing incompatible or poorly built packages.

Installing packages built for an entirely different distribution is
clearly a bad idea, although many people do it with random packages
they find online. With OCI the instructions are labeled with the
distribution they are intended for, so incompatible packages should
not be installed. There is less that can be done about poor quality
third party packages; it comes down to whether the user trusts the
vendor to produce quality software and packages. Through package &
metadata signing users are already able to establish the identity of
the packager, but there is not usually a good way for users to make a
meaningful decision as to whether to trust this identity. I think this
is something that can be done better: the packager's identity can be
related to how much the community trusts him/her. There is some
interesting work being done on such a system for the openSUSE build
service[14]. If the installer always looked up and displayed to the
user both the packager's trust rating and the software's own rating,
then the user could make a more informed choice as to whether to
install the software or not.

b) Difficulty upgrading distribution between versions caused by third
party packages installed on the system not having an upgrade path.

This is indeed a problem, and has often caused upgrade issues for suse
users. The upgrade can usually be performed without problems by
disabling all old repositories, adding the new repositories, and
upgrading. Unfortunately this method often means that old third party
software that is not available in the distribution has to be removed,
as its dependencies will not be satisfied on the new system.
Although this can be suggested and performed automatically the user
does not really want to lose all his/her third party software.

There are other ways to solve this problem though, such as having the
upgrade process check each of the repositories enabled on the system
for some metadata indicating an update path, and/or asking a central
service for suggested replacements for the repositories in the new
version.
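
As a rough sketch of that idea in Python: the upgrade process could
split the enabled repositories into those with a known successor and
those without. The upgrade_paths lookup table stands in for either
per-repository metadata or a central service. Neither exists in this
form as far as I know, and the function name and URLs are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_repository_upgrade(
    enabled_repos: List[str],
    upgrade_paths: Dict[Tuple[str, str], str],
    new_version: str,
) -> Tuple[Dict[str, str], List[str]]:
    """Map each enabled repository to its successor for new_version.

    Repositories with no known successor are returned separately so
    the user can be asked what to do instead of silently losing them.
    """
    replacements: Dict[str, str] = {}
    unresolved: List[str] = []
    for repo in enabled_repos:
        successor = upgrade_paths.get((repo, new_version))
        if successor is None:
            unresolved.append(repo)
        else:
            replacements[repo] = successor
    return replacements, unresolved


# Placeholder data: (old repository, target version) -> new repository.
upgrade_paths = {
    ("http://example.org/java/10.3/", "11.0"): "http://example.org/java/11.0/",
}
enabled = ["http://example.org/java/10.3/", "http://example.org/games/10.3/"]

replacements, unresolved = plan_repository_upgrade(enabled, upgrade_paths, "11.0")
print("Replace:", replacements)
print("No upgrade path known for:", unresolved)
```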

Richard brought up another potential problem[15]:

c) Updates being withheld by packages in a third party repository.

This could indeed be a major problem. There are also other related
issues, such as "vendor-bouncing" that can occur when a user is
subscribed to multiple repositories that supply the same package. When
one repository has an updated version the package manager may update
to that one, then the first repository publishes an update and the
package manager switches back.

Zypp on suse solves both of these problems by using "vendor-sticky"
updates: updates will only be chosen from the vendor that the user
chose to install the package from in the first place. The user must
make a specific choice to change the vendor of an installed package.
This seems to solve the problem quite effectively. A similar but less
effective result can be achieved by adding third party repositories
with a sufficiently low priority. Of course these strategies only help
under normal circumstances; they do not defend against malicious
content in a third party repository, which comes back to trust again.
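
The vendor-sticky rule can be sketched in a few lines of Python. This
is only an illustration of the rule as described above, not how zypp
implements it. The Candidate type, the pick_update function and the
tuple version format are assumptions made for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Candidate(NamedTuple):
    """An available package version and the vendor publishing it."""
    vendor: str
    version: Tuple[int, ...]  # e.g. (1, 4, 2); tuples compare element-wise


def pick_update(
    installed_vendor: str,
    installed_version: Tuple[int, ...],
    candidates: List[Candidate],
) -> Optional[Candidate]:
    """Return the newest candidate from the installed package's vendor.

    Newer versions from other vendors are ignored, so the package never
    bounces between vendors and a third party repository cannot take
    over a package the user installed from the distribution.
    """
    same_vendor = [
        c for c in candidates
        if c.vendor == installed_vendor and c.version > installed_version
    ]
    return max(same_vendor, key=lambda c: c.version, default=None)


candidates = [
    Candidate("distro", (1, 4, 1)),
    Candidate("thirdparty", (1, 5, 0)),
]
# The package was installed from "distro", so the newer third party
# version is not considered.
print(pick_update("distro", (1, 4, 0), candidates))
```

Changing the vendor would then be a separate, explicit action by the
user rather than something the update logic ever does by itself.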

So, in my opinion, most of the problems that can occur when dealing
with third party repositories are not problems with the concept of
third party repositories, but rather failings in how the system deals
with those repositories. Nevertheless, there is clearly more
complexity to deal with when there are multiple vendors, than when
there is just one. Why might one wish to spend effort playing nicely
with third party vendors rather than just concentrating on getting
everything into the distribution itself?

a) Licences and Legalities

Many distribution projects are either backed by an entity based in a
specific country, or rely heavily on infrastructure hosted in that
same country. Therefore, they are bound by that country's legislation.
This often means that a project cannot distribute software that
infringes software patents in the united states, whereas a separate
entity with no ties to the united states could happily distribute that
software for other users of the distribution. There is also the
problem of incompatible licences, or licences that prohibit
redistribution. Like it or not there are ISVs shipping non-free
software to linux users, from games such as ut2004 to software like
skype. Neither is redistributable. Is it better that these companies
all NIH[16] their own installers, making it difficult for users to
remove the software afterwards and leaving the system's package
management unaware of the presence of such software?

Telling vendors they have to publish their software under a specific
licence, or that they must work under the distributor's own policies,
procedures, and timescales to reach users seems like a good way of
discouraging vendors from targeting your distribution.

b) Release Schedules

Many distributions operate on a stable release every $TIMEPERIOD
system, that involves freezing the versions of applications bundled
with each version of the distribution for the lifetime of that
distribution. While this is great for ensuring that the software all
works well together, where does it leave the person who needs to
follow the latest version of $APPLICATION for his/her work? e.g. A
java developer might require the latest version of eclipse/netbeans
and other java productivity software. Should he/she follow the
unstable branch of the distribution just to have the latest
software? Or should he/she install the software manually on the
stable version and lose all the benefit of the package management
system? In openSUSE it is common to use third party repositories to
allow special interest groups to follow current versions of specific
software on the stable version of the distribution. There are
repositories for latest java tools, latest KDE, etc. The users can
keep their stable supported system and install the newer versions of
the software that they require.

The other question was that even if we accept the concept of third
party repositories, does OCI make it too easy for users to add them,
without understanding the consequences? Is it, as Sebastian said,
"Opening the gates of hell"[17]? It is true that OCI makes adding
third party repositories easy, that is the whole point. However,
consider these points:

a) You control the process.

It is better that users follow through the OCI process to install
software than learn to copy and paste random "sudo <command here>"
commands from fora, without an understanding of what they do. At least
with OCI there is a chance to try and make the user aware of what will
be done, and the user must choose whether to trust the packager. If you
want to add a malware check stage into the process then that is
possible too.

b) It is not the easiest attack vector.

If you wanted to do something malicious to a user's machine, why would
you utilise a process that requires the user to trust your identity
and be presented with warnings telling them that it is potentially a
bad idea? It is considerably easier to get users to execute arbitrary
binaries.

c) Is "too easy" possible?

In some ways I think that the argument that making things easier makes
it easier for users to do the wrong thing is a bit like the security
by obscurity argument. Sure, if the user can't work out how to do
something then he/she will not do it wrongly. However, if the user is
able to do something and does it wrong, it is not really the fault of
the software for being too easy to use, but rather that it was poorly
designed and did not give the user the correct information and
guidance.

In summary, I believe I have covered:

0: What is OCI?
1: What are its advantages over alternatives covered in the original
thread and others?
2: Benefits of combining with packagekit.
3: Third Party Repository problems.
4: Third Party Repository advantages.
5: Is OCI too easy?

Apologies for such a long message, but there were quite a number of
points to respond to. It would be good to get other people's opinions
on these matters. If OCI is a good idea then it would be much better
if it would work on any distribution, and we can enhance the format if
necessary to accommodate others. If it is a fundamentally flawed
concept then we should probably not even be using it in openSUSE.

Discuss.

__

[0]  http://files.opensuse.org/opensuse/en/b/b0/OneClickInstallFosdemTalk.pdf
[1]  http://en.opensuse.org/One_Click_Install
[2]  http://www.pidgin.im/download/
[3]  http://amarok.kde.org/wiki/Download
[4]  http://www.amazon.com/gp/dmusic/help/amd.html/ref=dm_gs_amd
[5]  https://launchpad.net/apturl/
[6]  https://wiki.ubuntu.com/ThirdPartyApt
[7]  http://packman.links2linux.de/package/amarok
[8]  http://lists.freedesktop.org/archives/packagekit/2008-May/003093.html
[9]  http://en.opensuse.org/One_Click_Install/ISV
[10] http://en.opensuse.org/Standards/One_Click_Install
[11] http://video.google.com/videoplay?docid=494315634277751745
[12] http://www.packagekit.org/pk-reference.html
[13] http://lists.freedesktop.org/archives/packagekit/2008-May/003084.html
[14] http://en.opensuse.org/Build_Service/Concepts/Trust
[15] http://lists.freedesktop.org/archives/packagekit/2008-May/003064.html
[16] Not Invented Here
[17] http://lists.freedesktop.org/archives/packagekit/2008-May/003069.html

--
Benjamin Weber