Matching package names across distributions

Enrico Zini enrico at enricozini.org
Wed Feb 2 05:48:04 PST 2011


Hello,

Now that I'm back from vacation and we are all on this list, it's time
to announce that we have an initial prototype of a distribution
matching tool: http://www.enricozini.org/2011/debian/distromatch/

You can follow that link for technical details, git URLs and examples.
Please feel free to quote bits here if you'd like to discuss them.

It's hard to measure the quality of the result because we don't have an
optimal match to benchmark against. My idea for "evaluating" it is to
just start using it for something, notice if there are gaps and see how
to fill them.

In order to 'start using it for something', I have generated Debtags
datasets for all distros currently supported by distromatch. Please help
yourselves :)

  http://people.debian.org/~enrico/2011-01/tags-fedora.gz
  http://people.debian.org/~enrico/2011-01/tags-mandriva.gz
  http://people.debian.org/~enrico/2011-01/tags-opensuse.gz
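
As a minimal sketch of how one of these files might be consumed once
downloaded: the Python below assumes the files follow the usual Debtags
text layout of one "package: tag1, tag2" line per package, which is an
assumption and not something stated in this message. The function names
parse_tags and load_tags are hypothetical, chosen only for illustration.

```python
import gzip


def parse_tags(lines):
    """Build {package: set(tags)} from 'package: tag1, tag2' lines.

    The line layout is an assumption about the export format.
    Lines without a package name or a ':' separator are skipped.
    """
    tags = {}
    for line in lines:
        # Split on the first ':' only, since tags themselves contain '::'
        name, sep, rest = line.strip().partition(":")
        name = name.strip()
        if not sep or not name:
            continue
        found = {t.strip() for t in rest.split(",") if t.strip()}
        # Merge with any tags already seen for the same package
        tags.setdefault(name, set()).update(found)
    return tags


def load_tags(path_or_file):
    """Read a gzip-compressed tag file (path or binary file object)."""
    with gzip.open(path_or_file, "rt", encoding="utf-8") as fd:
        return parse_tags(fd)
```

With a local copy of, say, tags-fedora.gz, calling
load_tags("tags-fedora.gz") would give a dictionary that can be queried
by package name, provided the layout assumption holds.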

Now I would like to move to the next step and have it run on live data;
for this we need to have each distribution set up some daily or weekly
public exports of the datasets used by distromatch[1].

So, let's set up the exports. I've added a scripts/ directory in
distromatch's git repo with the scripts I'm using in Debian, and I'm
running them once every two days. You can find the results at:
http://dde.debian.net/exports/ (files starting with "distromatch-*")
Note that in files.gz I directly generate the filtered interesting-files
data, so there is less data to store and download.

Please find me on IRC as 'enrico' on Freenode, OFTC and GIMPNet and we
can coordinate a bit.


Ciao,

Enrico


[1] Once we have the live exports, my plan is:

 1. to use the exports to work on having a live distromatch
    instance deployed on, say, http://dde.debian.net/dde
 2. to assist anyone who'd like to run their own distromatch
    instance; this ought to lead to a little README for
    setting up distromatch, including the links to the per-distribution
    exports.
-- 
GPG key: 4096R/E7AD5568 2009-05-08 Enrico Zini <enrico at enricozini.org>