AppStream/SC/PK GSoC report #2

Matthias Klumpp matthias at tenstral.net
Sat Jun 9 04:00:32 PDT 2012


Hi!

Here is some more news about my project to make the Software-Center
(forked from Ubuntu) cross-distributional, usable and a pleasure to
use ;-)

Right now I'm still working on improving PackageKit, because PK is the
key component for making the SC work well.
Last week I made many improvements to PK and also introduced a
package-sqlcache which can be used to fetch package details extremely
fast.
I saw this as the only way to improve the current situation and
make it possible to read package data quickly and in parallel with
other running transactions.

I did not really like the database cache, and many people agree that a
cache is not very nice. Daniel Nicoletti (Aptcc dev, Apper maintainer)
suggested a session-daemon approach to solve the parallelization
issue, which Richard didn't like. I didn't like it very much either.
As the cache was bad and parallelization was not easy to get, I
suggested writing an abstraction library to access package databases
directly by reusing existing backends. This way, DBus would also be
avoided (Telepathy doesn't send logs over DBus for speed reasons
either, so I thought this was a good idea for the large amount of
data).
Daniel didn't like my idea, as Qt tools would then need to interface
with a GLib access library. Richard didn't like it either, as it was
"too complicated". He instead suggested implementing a feature which
we had already tried to add about a year ago, but didn't need back
then because the issue I wanted to solve with it was solved in another
way: parallel transactions. Parallel transactions mean we would be
able to execute read-only actions which don't need a database lock in
parallel with other transactions.
This idea was disliked by both me and Daniel. I was against it at
first because we would force backend maintainers to implement a
threadsafe backend and probably change their package manager; Daniel
didn't like it because Apt, the Debian package manager, was not
threadsafe.
To solve all of this, spawning a new process and doing IPC was
suggested, but I didn't like that every backend would have to
implement its own solution for that and wanted a generic one (as other
backends probably had this problem too), and Daniel didn't want to do
extensive IPC in a backend.
When the discussion was close to a deadlock, Michael Vogt (Apt
maintainer) said that it would be possible to change Apt to be
threadsafe for most transactions.
This suddenly removed nearly all arguments against parallel
transactions, so I'm now working on an implementation of parallel
processing for PackageKit.
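
To make the idea a bit more concrete, here is a rough sketch in Python
of what "read-only actions run in parallel, everything else takes the
database lock" means. This is only an illustration and not PackageKit
code: the TransactionScheduler class, its methods and the needs_lock
flag are all made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TransactionScheduler:
    """Hypothetical scheduler, NOT PackageKit's actual implementation."""

    def __init__(self, max_workers=4):
        # Held by every transaction that modifies the package database.
        self._db_lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, action, needs_lock):
        """Start a transaction and return a Future for its result."""
        if needs_lock:
            # Modifying transactions are serialized through the lock.
            return self._pool.submit(self._run_locked, action)
        # Read-only transactions start right away, without the lock.
        return self._pool.submit(action)

    def _run_locked(self, action):
        with self._db_lock:
            return action()

    def shutdown(self):
        self._pool.shutdown(wait=True)


def demo():
    """Fetch details while a (simulated) install is still running."""
    sched = TransactionScheduler()
    install_started = threading.Event()
    release_install = threading.Event()

    def install_packages():
        install_started.set()
        # Simulates a long-running install; ends when released.
        release_install.wait(timeout=5)
        return "installed"

    def get_details():
        return "details"

    install = sched.submit(install_packages, needs_lock=True)
    install_started.wait(timeout=5)

    details = sched.submit(get_details, needs_lock=False)
    details_result = details.result(timeout=5)
    install_still_running = not install.done()

    release_install.set()
    install_result = install.result(timeout=5)
    sched.shutdown()
    return details_result, install_still_running, install_result
```

Calling demo() shows the read-only request completing while the
install still holds the lock. A real backend additionally has to make
sure that its read path is actually safe to run next to a modifying
transaction, which is exactly the threadsafety question discussed
above.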

It is important that backend maintainers make changes to their
backends to support parallel transactions, otherwise the backends will
fail miserably as soon as parallel transactions are implemented.
So I spent some hours thinking about all (hopefully all!) corner
cases which could possibly appear before implementing parallelization.
I added some new backend API which PackageKit backends need to use to
make this feature work.
I also extended our backend porting guide to describe how backend
maintainers should adjust their backends to continue working. The
porting guide contains information on how to proceed with backends
which are threadsafe as well as with those which aren't.
Our top-maintained backends (Yum, Zif, Aptcc) will be ported soon. As
soon as I have a working implementation of parallel transactions I
will also write a mail to the PackageKit development list as a
reminder for people to please adjust their backends to the changes.

The parallel transactions approach is great, as it will solve the most
important issues without creating an extra cache. Also, it is
something which is done right(tm), without any workaround or ugly
hack. The problem is that backend code has to be touched, but we
already broke backend API for some other improvements and the changes
aren't too complicated.
With parallel transactions, the SC will be able to call GetDetails()
for a package while InstallPackages() is running in parallel. We still
need a simple cache to improve speed, but that cache has been present
for a very long time anyway, is small and can be created quickly.
This feature will also reduce waiting times in PackageKit frontends
and in general improve the user experience in all PK clients.
The problem is that not all backends will support this, for example
Fedora's yum can't run transactions in parallel. Those backends will
still have to run a queue of transactions and therefore the SC will
not work as well on these distributions as it does on others. But it
would also not work very well without PackageKit, it's a problem of
the package manager. A solution for Fedora in this case would be
switching to Zif, for example.
I think all other issues can be addressed too.
I'm glad that we will have a working and sane solution for this issue
soon, which will make PackageKit fast as hell - if you like
experimental, breaking software you can already try the PackageKit
master branch ;-)
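
The difference between a backend that can run transactions in parallel
and one that still needs a queue can be simulated with a tiny Python
sketch. Again, this is not PackageKit code; run_transactions() and the
backend_supports_parallel flag are hypothetical names, and the sleep
only stands in for a long-running install.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_transactions(backend_supports_parallel):
    """Return the order in which the two transactions finish."""
    # One worker models the plain transaction queue, two workers model
    # a backend that can run both transactions at the same time.
    workers = 2 if backend_supports_parallel else 1
    finished = []

    def install_packages():
        time.sleep(0.2)  # stand-in for a slow install
        finished.append("InstallPackages")

    def get_details():
        finished.append("GetDetails")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(install_packages)
        pool.submit(get_details)
    # Leaving the "with" block waits for both transactions to finish.
    return finished
```

With backend_supports_parallel=True the details request finishes
first, even though it was submitted after the install. With False it
has to wait in the queue until the install is done, which is the
waiting time a frontend would see with a yum-like backend.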

Finally some other things: I'll have exams in a few weeks, and
unfortunately I don't study informatics/computer science but molecular
biomedicine, which means I have to learn anatomy, histology,
biochemistry and physics over the next weeks to pass the exams. I will
have to reduce my work on this SoC project a little during these
weeks, but I won't stop of course, I'll just be a bit slower. Once I
have (hopefully) passed the exams (eeh...) I can work days and nights
(probably more during nighttime) on PK and SC. This is just FYI.
As there seems to be a very high interest in my SoC from various
people: You can always catch me on #PackageKit, #freedesktop or
#opensuse-project (and various other channels) on IRC, my nickname is
ximion. I'll try to answer any questions as well as I can :-)
Or just reply to this mail :P
Bye!
  Matthias

