[packagekit] PackageKit Apt backend

Tom Parker palfrey at tevp.net
Wed Oct 17 14:40:55 PDT 2007


On 17/10/2007, Sebastian Heinlein <glatzor at ubuntu.com> wrote:
> You can avoid opening the dependency cache even in the python bindings:

Knew I wasn't *completely* insane. I've just been running a series of
tests, both with PackageKit's sqlite-cached apt backend and with the
example Python script you gave, on two different machines:
1) Debian desktop (a stable/testing/unstable/experimental mess, but
leaning towards unstable). 3GHz P4 running a 2.6.22 SMP kernel with
hyperthreading enabled.
2) Ubuntu laptop (approximately feisty, with some gutsy). Another 3GHz
P4, but the mobile variant (yes, it's a brick) running a 2.6.20 SMP
kernel, apparently with no hyperthreading available.
Both have apt/python-apt 0.7.3-something.

On the first, typical runtimes for the script are about 0.7s. I haven't
double-checked the sqlite caching on that machine, but it was
previously reporting times in the 0.4-0.5s range.
On the second, typical runtimes for the script are about 1.2s, with
the sqlite caching coming in at ~0.4s. With a very cold cache the first
run sometimes seems to take 7s, but provided you've used PackageKit in
the last minute or so and haven't unloaded something massive, 0.4s is
stable. That leads me to expect that this is almost entirely I/O
bound, and that slower machines will see the apt cache time degrade
more than the sqlite time.
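
For anyone wanting to repeat the comparison, here is a rough sketch of
a timing harness. It is not the script I used; time_runs and the
stand-in workload are made up for illustration. Note that the first
call in a process only approximates the cold case: a truly cold run
needs the relevant files to be out of the OS page cache.

```python
import statistics
import time


def time_runs(query, runs=5):
    """Call query() `runs` times; return (first_run, median_of_rest) in seconds."""
    if runs < 2:
        raise ValueError("need at least 2 runs to separate first from warm")
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        durations.append(time.perf_counter() - start)
    # The first call pays for any disk reads not yet cached;
    # the later calls mostly measure the warm path.
    return durations[0], statistics.median(durations[1:])


if __name__ == "__main__":
    # Stand-in workload (hypothetical); swap in the real query here.
    first, warm = time_runs(lambda: sum(range(100000)))
    print("first run: %.3fs, warm median: %.3fs" % (first, warm))
```

Reporting the first run separately from the median of the rest keeps
an occasional cold-cache outlier from skewing the typical figure.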

Notably, swapping OpProgress for OpTextProgress shows that the vast
majority of the script's runtime is spent reading the apt cache, which
is why I'd gone with the sqlite caching (that, and the cache init
appears to do even more work when called via libapt....).

This raises two points:
1) Is this bad enough to justify the sqlite caching, which will
probably also require extra complexity for anything else we want fast
results for? The only way to get really fast results out of
libapt/python-apt appears to be to cache their state, and I don't have
a better way to do this offhand.
2) Are there other ways to reduce the startup overhead of python-apt
for simple queries? Can we get it to only partially load the cache
(say, ignoring dependency data for text searches)?

I like the sqlite caching, but that comes at a complexity cost.
Actually, one thing we might be able to do is to use the python sqlite
bindings instead to reduce its complexity (at least for all tasks
aside from building the db, which is slow enough already).
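
As a rough idea of what the query side could look like with Python's
standard sqlite3 module: the table layout, the function names and the
sample rows below are all invented for illustration, not the backend's
actual schema.

```python
import sqlite3


def build_cache(conn, packages):
    """Create and fill a hypothetical `packages` table.

    `packages` is an iterable of (name, version, summary) tuples.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages ("
        "name TEXT NOT NULL, version TEXT NOT NULL, summary TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO packages VALUES (?, ?, ?)", packages)
    conn.commit()


def search_name(conn, term):
    """Return (name, version, summary) rows whose name contains `term`."""
    # Note: % and _ inside `term` are treated as LIKE wildcards here.
    cursor = conn.execute(
        "SELECT name, version, summary FROM packages "
        "WHERE name LIKE ? ORDER BY name",
        ("%" + term + "%",),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # a real cache would live in a file
    build_cache(conn, [
        ("pidgin", "1.0", "made-up sample row"),
        ("pidgin-data", "1.0", "made-up sample row"),
        ("vim", "2.0", "made-up sample row"),
    ])
    for row in search_name(conn, "pidgin"):
        print(row)
```

The search is a single parameterised SELECT, so most of the remaining
complexity would sit in building and invalidating the db.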

One really crazy option that comes to mind is using grep - a sample
egrep for "^Package: pidgin" (with -A 100 to get *everything* after a
match) in the apt lists folder takes no more than 0.1s. Reducing the
number of lines to search by a couple of orders of magnitude might be
good enough....
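
The same idea can be sketched in Python rather than egrep. This is
only an illustration: find_stanzas and search_lists are made-up names,
the default path assumes apt's usual lists directory
(/var/lib/apt/lists, with files ending in _Packages), and instead of a
fixed -A 100 it returns each matching stanza up to the next blank
line.

```python
import glob
import os


def find_stanzas(lines, prefix):
    """Yield stanzas (as strings) whose Package field starts with `prefix`."""
    wanted = "Package: " + prefix
    stanza = None  # lines of the stanza being collected, if any
    for line in lines:
        line = line.rstrip("\n")
        if stanza is None:
            if line.startswith(wanted):
                stanza = [line]
        elif not line.strip():
            # A blank line ends the current stanza.
            yield "\n".join(stanza)
            stanza = None
        else:
            stanza.append(line)
    if stanza is not None:
        # The file ended without a trailing blank line.
        yield "\n".join(stanza)


def search_lists(prefix, lists_dir="/var/lib/apt/lists"):
    """Scan every *_Packages file in `lists_dir` for matching stanzas."""
    pattern = os.path.join(lists_dir, "*_Packages")
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for stanza in find_stanzas(handle, prefix):
                yield stanza


if __name__ == "__main__":
    for stanza in search_lists("pidgin"):
        print(stanza)
        print()
```

Like the egrep, this is a prefix match, so "pidgin" also picks up
packages such as pidgin-data. I haven't timed it against egrep.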

Tom

