Starting to adopt AppStream

Tue Aug 5 11:22:04 PDT 2014

2014-08-05 19:42 GMT+02:00 Aleix Pol <aleixpol at kde.org>:
> On Tue, Aug 5, 2014 at 7:19 PM, Matthias Klumpp <matthias at tenstral.net>
> [...]
>> > First thing, I don't know what to do, appstream-index itself seems to be
>> > broken, I've tried the commands in the documentation [1] and they don't
>> > respond well:
>> Looks like you've got old documentation content - try this link:
>> http://www.freedesktop.org/software/appstream/docs/api/html/re16.html
>> I'd better delete the old page then, to prevent confusion (all pages
>> link the correct manpage though, how did you get this link?)
>
> From a search engine I think, either Google or DuckDuckGo.
Hmm, okay - I refreshed the whole documentation, removing all
old pages, so that should be solved then.
Anyway, "man appstream-index" is what counts ;-)

>> [...]
>> > Then I understand that I need to generate the xapian database somehow,
>> > but I
>> > don't see anywhere what command needs to be run.
>> Fedora runs
>> appstream-index --refresh --force
>
> appstream-index refresh
> ^
> That right? --refresh is not supported it says.
Yes - I was thinking
$ appstream-index refresh --force
but my hands wrote the other thing ^^ The above command is correct.

> I think I'm getting an empty database now, though, given that my
> distribution doesn't seem to be supported.
You have to generate the XML for your distribution first and ship it
in an AppStream metadata directory for others to consume.

>> on every package upgrade. There is also a PackageKit plugin taking
>> care of an up-to-date cache.
>> But in order to generate the cache, you will need AppStream data which
>> your distribution has to provide/generate.
>> For Debian-based distros, a generator is under construction, Fedora
>> has one (reusable for other (so far RPM-based) distros!), and OpenSUSE
>> has one.
>> The format of the metadata which is provided by distributors is
>> documented at
>> http://www.freedesktop.org/software/appstream/docs/chap-DistroData.html#sect-AppStream-ASXML
>> together with the locations where it should be placed.
>> If you want some real-world data, you can take the Fedora data
>> (contains a few non-standard fields at this time, but is otherwise 100%
>> valid): http://koji.fedoraproject.org/koji/packageinfo?packageID=18639
>> Debian is working on a YAML-based implementation (called DEP-11),
>> which will also be processed by the cache generator very soon.
>> If you need help with generating the data for your distribution, let
>> me know! (Especially if there is documentation missing)

> Generators are what we have in src/data-providers, correct?
No, those are the things parsing the output generated by those generators.

> There's something I don't understand though. It seems to me that if we have
> a packagekit backend, then we're served, as in we can populate the database
> only with the appdata.xml files we can do the inverse look-up of the provide
> fields and populate the database. Is there anything I'm missing?
Not sure if I understand this correctly...
So, first of all, PackageKit and AppStream don't have that much in
common. PackageKit is a package-management abstraction layer, while
AppStream is an agreement to provide additional metadata about the
software which is available in a distribution's package repositories.
(Of course, both projects are related, because as soon as you have the
metadata, you definitely want to install or remove some of that
software, or query technical details about it, which is where you need
PackageKit)

So, how do we get that metadata? The data is of course placed in the
packages, or rather in files shipped with the packages (mostly
.desktop files and AppStream upstream metadata in /usr/share/appdata).
We need to get the data out of these files and publish it in a form
which is simple to consume for clients (like GNOME-Software or Apper).
That's what the distribution-specific generator does: It takes the
packages available in a distribution, extracts useful metadata and
writes out a large XML file, which is then moved to the client
machines (ideally downloaded by the package-manager, but shipping an
appstream-data package is also okay).
That XML is loaded by libappstream, which creates a Xapian index for
fast searching and features like stemming etc. The cache-builder also
takes data from other sources into account (like 3rd-party installed
software). There is also libAppstreamQt, which can read the cache
using a Qt interface, and libappstream-glib, which is used by
GNOME-Software and does not use a Xapian cache at all but just parses
the XML directly.
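
To illustrate the "just parse the XML directly" approach, here is a
minimal Python sketch. It is not the libappstream-glib or libappstream
API; the function names (load_components, search) are made up for this
example, and the element names (components, component, id, pkgname,
name, summary) are my assumption of the distro XML layout, so check
them against the format documentation linked above before relying on
them.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this expanded name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def _untranslated(node, tag):
    """Return the text of the first <tag> child that has no xml:lang."""
    for child in node.findall(tag):
        if XML_LANG not in child.attrib:
            return (child.text or "").strip()
    return ""


def load_components(xml_text):
    """Parse distro XML (assumed layout) into a list of plain dicts."""
    root = ET.fromstring(xml_text)
    components = []
    for node in root.iter("component"):
        components.append({
            "id": _untranslated(node, "id"),
            "pkgname": _untranslated(node, "pkgname"),
            "name": _untranslated(node, "name"),
            "summary": _untranslated(node, "summary"),
        })
    return components


def search(components, term):
    """Case-insensitive substring match on name and summary.

    This is a plain linear scan with no stemming or ranking, which is
    what a Xapian index would add on top.
    """
    needle = term.casefold()
    return [
        c for c in components
        if needle in c["name"].casefold() or needle in c["summary"].casefold()
    ]
```

A linear scan like this is fine for a few thousand components; the
Xapian cache is what buys you stemming and faster lookups.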

So, what would you need to do?
Basically, the only thing you need to do is create a tool which
inspects Arch packages, extracts metadata from them and forms an XML
file. That file then has to be downloaded on the client, and then
you're all set :)
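
A very rough Python sketch of the core of such a tool follows. It only
covers the step from one .desktop file to one XML entry; getting the
files out of the Arch packages is left out. The function names are
hypothetical, the element names are my assumption of the distro XML
layout, and the version value must be taken from the format
documentation, so treat this as a starting point, not as a finished
generator.

```python
import configparser
import xml.etree.ElementTree as ET


def desktop_to_component(desktop_text, pkgname, desktop_id):
    """Turn the text of one .desktop file into a <component> element.

    Returns None when the file has no [Desktop Entry] group or does not
    describe an application.
    """
    parser = configparser.ConfigParser(
        interpolation=None,   # Exec lines contain "%U" and similar
        strict=False,         # tolerate duplicate keys
        delimiters=("=",),    # .desktop files only use "=" as separator
    )
    parser.optionxform = str  # keys are case-sensitive ("Name", not "name")
    parser.read_string(desktop_text)

    if not parser.has_section("Desktop Entry"):
        return None
    entry = parser["Desktop Entry"]
    if entry.get("Type", fallback="") != "Application":
        return None

    # Element names below are an assumption about the distro XML format.
    component = ET.Element("component", {"type": "desktop"})
    ET.SubElement(component, "id").text = desktop_id
    ET.SubElement(component, "pkgname").text = pkgname
    ET.SubElement(component, "name").text = entry.get("Name", fallback="")
    ET.SubElement(component, "summary").text = entry.get("Comment", fallback="")
    return component


def build_catalog(components, origin, version):
    """Wrap component elements in a root element and return XML text.

    `origin` and `version` are supplied by the caller; the right values
    depend on the repository and the spec version being targeted.
    """
    root = ET.Element("components", {"version": version, "origin": origin})
    root.extend(c for c in components if c is not None)
    return ET.tostring(root, encoding="unicode")
```

A real generator would also merge in the upstream metadata from
/usr/share/appdata and handle translated keys such as Name[de], which
this sketch ignores.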

> I would
> prefer not to write much archlinux-dependent code if it's not strictly
> necessary.
The only thing that's necessary is the distro-specific data generator.
That thing is usually hooked up to the distribution's infrastructure,
so it will likely be distro-specific. (You might be able to customize
the appstream-glib[1] generator though, which is primarily used in,
and designed for, Fedora.)

There's also work on a thing for Debian and OpenSUSE, but at least the
Debian thing hooks up to an internal database (and uses that heavily),
so it might not be of much use.
Cheers,
    Matthias

[1]: https://github.com/hughsie/appstream-glib/tree/master/libappstream-builder

-- 
Debian Developer | Freedesktop-Developer
I welcome VSRE emails. See http://vsre.info/