[Clipart] Authority Control
Nicu Buculei
nicu at apsro.com
Sun Jan 23 23:38:07 PST 2005
Jonadab the Unsightly One wrote:
> I've run it over (a copy of) the 0.9 release, and after some basic
> consolidation (such as case folding) it came up with the statistics
> below. (Keywords that only occur once in the whole collection are not
> listed.) I'd like to have a couple of additional sets of eyeballs
> besides mine look over this list. I know there are pairs of keywords
> in there that are functionally equivalent and should be combined.
>
> There are some that I already know need to be combined:
> bug,bugs
> mammal,mammals
> bird,birds
> animal,animals
bill,gates
tool,tools
sign,signs
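
A minimal sketch of how pairs like these could be folded together, assuming a hand-maintained table that maps each variant to a preferred form. The names CANONICAL, canonicalize and merge_keywords are made up for illustration, the table covers only some of the pairs above, and Python is just an example language here; this is not the script that produced the statistics.

```python
# Hypothetical variant -> preferred keyword table, built from some of the pairs above.
CANONICAL = {
    "bugs": "bug",
    "mammals": "mammal",
    "birds": "bird",
    "animals": "animal",
    "tools": "tool",
    "signs": "sign",
}


def canonicalize(keyword):
    """Strip surrounding whitespace, fold case, then map known variants."""
    cleaned = keyword.strip().casefold()
    return CANONICAL.get(cleaned, cleaned)


def merge_keywords(keywords):
    """Return canonical keywords, de-duplicated, in first-seen order."""
    seen = []
    for kw in keywords:
        canon = canonicalize(kw)
        # Skip empty strings and anything already collected.
        if canon and canon not in seen:
            seen.append(canon)
    return seen
```

An explicit table is probably safer than blindly stripping a trailing "s", since pairs like "action" and "actions" may not be used in the same sense.
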
> Those may be combined already on the list below, as are versions of
> keywords that differ only in capitalization. Plus I also already know
> we need to strip leading whitespace. And I already know we want to
Stripping leading whitespace will make the list easier to parse and
make it easier to identify other potential problems.
> remove the "unsorted" keyword from images that aren't in the unsorted
> directory. But I'm sure there are more on the list than that. For
> example, are "action" and "actions" used in the same sense? What
> about "application" and "apps"?
I believe "apps" comes from the subdirectory name in the icon themes.
We can also remove useless errors introduced by the system, like:
improvisedkeywordparse
hash
0x996c42c
and meaningless ones, which were part of longer sentences, like:
un
u
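
A rough sketch of how such entries could be filtered out, assuming a small blocklist plus two simple rules (leaked hex addresses and very short fragments). The names SYSTEM_ARTIFACTS, HEX_ADDRESS, MIN_LENGTH, is_junk and clean_keywords are hypothetical, and the length threshold is a guess that would need checking against the real list.

```python
import re

# Hypothetical blocklist of tool-generated entries, taken from the examples above.
SYSTEM_ARTIFACTS = {"improvisedkeywordparse", "hash"}

# Matches leaked memory addresses such as 0x996c42c.
HEX_ADDRESS = re.compile(r"0x[0-9a-f]+")

# Assumption: anything shorter than this is a sentence fragment like "un" or "u".
MIN_LENGTH = 3


def is_junk(keyword):
    """Return True if the keyword looks like a system artifact or a fragment."""
    kw = keyword.strip().casefold()
    if len(kw) < MIN_LENGTH:
        return True
    if kw in SYSTEM_ARTIFACTS:
        return True
    return HEX_ADDRESS.fullmatch(kw) is not None


def clean_keywords(keywords):
    """Keep only the keywords that is_junk does not reject."""
    return [kw for kw in keywords if not is_junk(kw)]
```

The length rule would also drop legitimate two-letter keywords, so a short allowlist may be needed before using something like this on the whole collection.
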
--
nicu