[PATCH libxkbcommon v2 1/2] makekeys: use GNU gperf to generate perfect hashtables
Ran Benita
ran234 at gmail.com
Wed Oct 3 01:18:32 PDT 2012
Hi David,
On Tue, Oct 02, 2012 at 07:51:53PM +0200, David Herrmann wrote:
> Instead of using a home-brew hashtable generator, we should use
> the gperf program, which is known to work.
>
> This removes the "makekeys" program and replaces it with a file
> that can generate input files for gperf. Gperf then generates hashtables
> for all of these input files and writes them concatenated into
> ks_tables.h, which can then be used from keysym.c.
>
> Unfortunately, gperf does not support integer keys but only strings or
> binary data. Therefore, we have to make the keysym->name lookup
> little-endian to avoid errors during cross compilation.
>
> Signed-off-by: David Herrmann <dh.herrmann at googlemail.com>
> ---
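For context on the little-endian remark above: since gperf hashes byte strings, an integer keysym has to be serialized in one fixed byte order so that the generated table matches on both little- and big-endian hosts. Here is a minimal Python sketch of that serialization, assuming 32-bit keysyms. The helper names keysym_key and c_escaped are hypothetical and are not taken from David's patch:

```python
import struct


def keysym_key(keysym):
    """Serialize a 32-bit keysym as 4 little-endian bytes, independent of host byte order."""
    return struct.pack("<I", keysym)


def c_escaped(key):
    """Render the key bytes as hex escapes for embedding in a C string literal."""
    return "".join("\\x%02x" % byte for byte in bytearray(key))


# XKB_KEY_Return is 0xff0d; the low byte comes first in the key.
print(c_escaped(keysym_key(0xFF0D)))
```

The lookup side would presumably have to build the same 4-byte key from the keysym before calling the generated function, rather than passing a pointer to the native integer.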
The code is really nice, I have no comments on it.
I noticed though that it really blows up our binary size. Here's size -A
(only the relevant sections, CFLAGS=-O2) of current master:
section          size     addr
.text          110708    10608
.rodata         95684   121344
.data.rel.ro     2192   237728
Total          238568
Here it is after adding the third table to the old makekeys, as in the
v1 patch you sent:
section          size     addr
.text          110724    10608
.rodata        122340   121376
.data.rel.ro     2192   264416
Total          265240
And here it is with these patches:
section          size     addr
.text          112788    28912
.rodata        716478   141728
.data.rel.ro    55824   879136
Total          933614
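To put those three totals side by side, here is a small Python sketch that computes the growth relative to current master. The numbers are copied from the size -A output above; the labels are just descriptive names chosen here:

```python
# Totals (in bytes) copied from the size -A listings above.
totals = [
    ("master", 238568),
    ("v1 patch (old makekeys + third table)", 265240),
    ("v2 patches (gperf)", 933614),
]

baseline = totals[0][1]
growth = {}
for label, total in totals:
    # Ratio of this build's total to the master total.
    growth[label] = total / float(baseline)
    print("%-40s %7d bytes  x%.2f" % (label, total, growth[label]))
```

Most of the increase sits in .rodata, which goes from 95684 to 716478 bytes.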
So gperf is clearly doing... something here. In the gperf manual they
mention:
    The size of the generate static keyword array can get extremely large
    if the input keyword file is large or if the keywords are quite
    similar. This tends to slow down the compilation of the generated C
    code, and greatly inflates the object code size. If this situation
    occurs, consider using the ‘-S’ option to reduce data size, potentially
    increasing keyword recognition time a negligible amount. Since many C
    compilers cannot correctly generate code for large switch statements it
    is important to qualify the -S option with an appropriate numerical
    argument that controls the number of switch statements generated.
To reduce the size I tried removing %compare-lengths from the name-to-key
tables (which helped a bit). I also tried a few %switch counts (thus
letting gcc create the lookup tables on its own), which reduced the total
to about 550000 bytes, but then the compilation takes about a minute :)
So to be honest, the hashing that gperf and makekeys do is nice, but I
don't see why we do it at all if it complicates things. For example, I
just took 15 minutes to do it the obvious way: creating simple
sorted {name, keysym} arrays and doing a binary search on them, to replace
the current makekeys code (see the attached patch - just a proof of
concept, a hacked-up Python script and a couple of bsearch calls). I don't
see any performance issues, and the size is:
.text          110516    48016
.rodata         66366   158560
.data.rel.ro    39568   241696
Total          283860
With the third table added it is:
.text          110548    67184
.rodata         80278   177760
.data.rel.ro    58768   278752
Total          336180
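The sorted-array idea can be sketched in a few lines of Python with the standard bisect module, mirroring what the generated tables plus bsearch do in C. This is only an illustration: the function names are made up here, and the entry list is a handful of keysyms from xkbcommon-keysyms.h rather than the full header:

```python
import bisect

# (name, keysym) pairs in header order; Prior and Page_Up share one value.
entries = [
    ("BackSpace", 0xFF08),
    ("Return", 0xFF0D),
    ("Prior", 0xFF55),
    ("Page_Up", 0xFF55),
    ("space", 0x0020),
    ("a", 0x0061),
]

# Table 1: sorted by name (plain code-point order, like strcmp on ASCII).
by_name = sorted(entries, key=lambda e: e[0])
names = [name for name, _ in by_name]

# Table 2: sorted by keysym, keeping only the first name seen for each value.
first_name = {}
for name, keysym in entries:
    first_name.setdefault(keysym, name)
by_keysym = sorted(first_name.items())
keysyms = [keysym for keysym, _ in by_keysym]


def keysym_from_name(name):
    """Binary search the name-sorted table; None if the name is unknown."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return by_name[i][1]
    return None


def name_from_keysym(keysym):
    """Binary search the keysym-sorted table; None if the value is unnamed."""
    i = bisect.bisect_left(keysyms, keysym)
    if i < len(keysyms) and keysyms[i] == keysym:
        return by_keysym[i][1]
    return None


print(keysym_from_name("Return"), name_from_keysym(0xFF55))
```

Each lookup costs O(log n) comparisons. The old makekeys caps the table at KTNUM (4000) entries, so that is about a dozen comparisons per lookup at most.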
So since makekeys is ugly and gperf is a bit excessive, maybe we should
just keep it simple. What do you think?
Ran
-------------- next part --------------
From 8fb5efb045b7207b010c979cbeae8f8222759961 Mon Sep 17 00:00:00 2001
From: Ran Benita <ran234 at gmail.com>
Date: Wed, 3 Oct 2012 10:09:48 +0200
Subject: [PATCH libxkbcommon] Replace makekeys with python script + binary search
Signed-off-by: Ran Benita <ran234 at gmail.com>
---
 Makefile.am          |   9 +-
 configure.ac         |  14 +--
 makekeys.py          |  23 ++++
 makekeys/.gitignore  |   1 -
 makekeys/Makefile.am |  10 --
 makekeys/makekeys.c  | 302 ---------------------------------------------------
 src/keysym.c         |  97 ++++++-----------
 7 files changed, 62 insertions(+), 394 deletions(-)
 create mode 100644 makekeys.py
 delete mode 100644 makekeys/.gitignore
 delete mode 100644 makekeys/Makefile.am
 delete mode 100644 makekeys/makekeys.c
diff --git a/Makefile.am b/Makefile.am
index 26646fb..dfbd8d9 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -1,7 +1,5 @@
ACLOCAL_AMFLAGS = -I m4
-SUBDIRS = makekeys
-
pkgconfigdir = $(libdir)/pkgconfig
pkgconfig_DATA = xkbcommon.pc
@@ -92,11 +90,8 @@ src/xkbcomp/parser.c: $(top_builddir)/src/$(am__dirstamp) $(top_builddir)/src/xk
src/xkbcomp/parser.h: $(top_builddir)/src/$(am__dirstamp) $(top_builddir)/src/xkbcomp/$(am__dirstamp)
src/xkbcomp/scanner.c: $(top_builddir)/src/$(am__dirstamp) $(top_builddir)/src/xkbcomp/$(am__dirstamp)
-src/ks_tables.h: $(top_builddir)/makekeys/makekeys$(EXEEXT)
- $(AM_V_GEN)$(top_builddir)/makekeys/makekeys $(top_srcdir)/xkbcommon/xkbcommon-keysyms.h > $@
-
-$(top_builddir)/makekeys/makekeys$(EXEEXT): $(top_srcdir)/makekeys/makekeys.c
- $(MAKE) -C makekeys
+src/ks_tables.h: makekeys.py
+ $(AM_V_GEN)$(PYTHON) $(top_srcdir)/makekeys.py $(top_srcdir)/xkbcommon/xkbcommon-keysyms.h > $@
# Documentation
diff --git a/configure.ac b/configure.ac
index df8a99e..570dda0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -61,6 +61,9 @@ if test ! -f "src/xkbcomp/parser.c"; then
AC_MSG_ERROR([yacc not found - unable to compile src/xkbcomp/parser.y])
fi
fi
+AM_PATH_PYTHON([2.6], [], [
+ AC_MSG_ERROR([python not found - unable to run makekeys])
+])
# Checks for library functions.
AC_CHECK_FUNCS([strcasecmp strncasecmp])
@@ -71,16 +74,6 @@ fi
AC_CHECK_FUNCS([eaccess euidaccess])
-# Build native compiler needed for makekeys
-AC_ARG_VAR([CC_FOR_BUILD], [Build native C compiler program])
-if test "x$CC_FOR_BUILD" = x; then
- if test "$cross_compiling" != no; then
- AC_PATH_PROGS([CC_FOR_BUILD], [gcc cc], [cc])
- else
- CC_FOR_BUILD="$CC"
- fi
-fi
-
XORG_TESTSET_CFLAG([CFLAGS], [-fvisibility=hidden])
# Define a configuration option for the XKB config root
@@ -121,7 +114,6 @@ AC_DEFINE_UNQUOTED([DEFAULT_XKB_LAYOUT], ["$DEFAULT_XKB_LAYOUT"],
AC_CONFIG_FILES([
Makefile
- makekeys/Makefile
xkbcommon-uninstalled.pc
xkbcommon.pc
doc/Doxyfile
diff --git a/makekeys.py b/makekeys.py
new file mode 100644
index 0000000..4466256
--- /dev/null
+++ b/makekeys.py
@@ -0,0 +1,23 @@
+#!/usr/bin/env python
+
+import re, sys, itertools
+
+pattern = re.compile(r'^#define\s+XKB_KEY_(?P<name>\w+)\s+(?P<value>0x[0-9a-fA-F]+)\s')
+matches = [pattern.match(line) for line in open(sys.argv[1])]
+entries = [(m.group("name"), int(m.group("value"), 16)) for m in matches if m]
+
+print('''struct name_keysym {
+ const char *name;
+ xkb_keysym_t keysym;
+};\n''')
+
+print('static const struct name_keysym name_to_keysym[] = {');
+for (name, _) in sorted(entries, key=lambda e: e[0]):
+ print(' {{ "{name}", XKB_KEY_{name} }},'.format(name=name))
+print('};\n')
+
+# Only keep the first name given to each keysym value. Hey, it works.
+print('static const struct name_keysym keysym_to_name[] = {');
+for (name, _) in (next(g[1]) for g in itertools.groupby(sorted(entries, key=lambda e: e[1]), key=lambda e: e[1])):
+ print(' {{ "{name}", XKB_KEY_{name} }},'.format(name=name))
+print('};')
diff --git a/makekeys/.gitignore b/makekeys/.gitignore
deleted file mode 100644
index 2bdb5e0..0000000
--- a/makekeys/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-makekeys
diff --git a/makekeys/Makefile.am b/makekeys/Makefile.am
deleted file mode 100644
index 5d9a441..0000000
--- a/makekeys/Makefile.am
+++ /dev/null
@@ -1,10 +0,0 @@
-AM_CFLAGS = $(BASE_CFLAGS) -I$(top_srcdir)
-
-# need to use build-native compiler
-
-CC = $(CC_FOR_BUILD)
-CPPFLAGS = $(CPPFLAGS_FOR_BUILD)
-CFLAGS = $(CFLAGS_FOR_BUILD)
-LDFLAGS = $(LDFLAGS_FOR_BUILD)
-noinst_PROGRAMS = makekeys
-
diff --git a/makekeys/makekeys.c b/makekeys/makekeys.c
deleted file mode 100644
index 62d7255..0000000
--- a/makekeys/makekeys.c
+++ /dev/null
@@ -1,302 +0,0 @@
-/*
- *
- * Copyright 1990, 1998 The Open Group
- *
- * Permission to use, copy, modify, distribute, and sell this software and its
- * documentation for any purpose is hereby granted without fee, provided that
- * the above copyright notice appear in all copies and that both that
- * copyright notice and this permission notice appear in supporting
- * documentation.
- *
- * The above copyright notice and this permission notice shall be included
- * in all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
- * IN NO EVENT SHALL THE OPEN GROUP BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Except as contained in this notice, the name of The Open Group shall
- * not be used in advertising or otherwise to promote the sale, use or
- * other dealings in this Software without prior written authorization
- * from The Open Group.
- *
- */
-
-/*
- * Constructs hash tables for xkb_keysym_to_string and
- * xkb_string_from_keysym.
- */
-
-#include "xkbcommon/xkbcommon.h"
-
-#include <inttypes.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-
-typedef uint32_t Signature;
-
-#define KTNUM 4000
-
-static struct info {
- char *name;
- xkb_keysym_t val;
-} info[KTNUM];
-
-#define MIN_REHASH 15
-#define MATCHES 10
-
-static char tab[KTNUM];
-static unsigned short offsets[KTNUM];
-static unsigned short indexes[KTNUM];
-static xkb_keysym_t values[KTNUM];
-static int ksnum = 0;
-
-static int
-parse_line(const char *buf, char *key, xkb_keysym_t *val, char *prefix)
-{
- int i;
- char alias[128];
-
- /* See if we can catch a straight XK_foo 0x1234-style definition first;
- * the trickery around tmp is to account for prefices. */
- i = sscanf(buf, "#define %127s 0x%" SCNx32, key, val);
- if (i == 2 && strncmp(key, "XKB_KEY_", 8) == 0) {
- prefix[0] = '\0';
- memmove(key, key + 8, strlen(key + 8) + 1);
- return 1;
- }
-
- i = sscanf(buf, "#define %127s %127s", key, alias);
- if (i == 2)
- fprintf(stderr, "can't parse keysym definition: %s", buf);
-
- return 0;
-}
-
-int
-main(int argc, char *argv[])
-{
- FILE *fptr;
- int max_rehash;
- Signature sig;
- int i, j, k, l, z;
- char *name;
- char c;
- int first;
- int best_max_rehash;
- int best_z = 0;
- int num_found;
- xkb_keysym_t val;
- char key[128], prefix[128];
- char buf[1024];
-
- for (l = 1; l < argc; l++) {
- fptr = fopen(argv[l], "r");
- if (!fptr) {
- fprintf(stderr, "couldn't open %s\n", argv[l]);
- continue;
- }
-
- while (fgets(buf, sizeof(buf), fptr)) {
- if (!parse_line(buf, key, &val, prefix))
- continue;
-
- if (val == XKB_KEY_VoidSymbol)
- val = 0;
- if (val > 0x1fffffff) {
- fprintf(stderr, "ignoring illegal keysym (%s, %" PRIx32 ")\n",
- key,
- val);
- continue;
- }
-
- name = malloc(strlen(prefix) + strlen(key) + 1);
- if (!name) {
- fprintf(stderr, "makekeys: out of memory!\n");
- exit(1);
- }
- sprintf(name, "%s%s", prefix, key);
- info[ksnum].name = name;
- info[ksnum].val = val;
- ksnum++;
- if (ksnum == KTNUM) {
- fprintf(stderr, "makekeys: too many keysyms!\n");
- exit(1);
- }
- }
-
- fclose(fptr);
- }
-
- printf("/* This file is generated from keysymdef.h. */\n");
- printf("/* Do not edit. */\n");
- printf("\n");
-
- best_max_rehash = ksnum;
- num_found = 0;
- for (z = ksnum; z < KTNUM; z++) {
- max_rehash = 0;
- for (name = tab, i = z; --i >= 0; )
- *name++ = 0;
- for (i = 0; i < ksnum; i++) {
- name = info[i].name;
- sig = 0;
- while ((c = *name++))
- sig = (sig << 1) + c;
- first = j = sig % z;
- for (k = 0; tab[j]; k++) {
- j += first + 1;
- if (j >= z)
- j -= z;
- if (j == first)
- goto next1;
- }
- tab[j] = 1;
- if (k > max_rehash)
- max_rehash = k;
- }
- if (max_rehash < MIN_REHASH) {
- if (max_rehash < best_max_rehash) {
- best_max_rehash = max_rehash;
- best_z = z;
- }
- num_found++;
- if (num_found >= MATCHES)
- break;
- }
-next1:;
- }
-
- z = best_z;
- printf("#ifndef KS_TABLES_H\n");
- printf("#define KS_TABLES_H\n\n");
- printf("static const unsigned char _XkeyTable[] = {\n");
- if (z == 0) {
- fprintf(stderr, "makekeys: failed to find small enough hash!\n"
- "Try increasing KTNUM in makekeys.c\n");
- exit(1);
- }
- printf("0,\n");
- k = 1;
- for (i = 0; i < ksnum; i++) {
- name = info[i].name;
- sig = 0;
- while ((c = *name++))
- sig = (sig << 1) + c;
- first = j = sig % z;
- while (offsets[j]) {
- j += first + 1;
- if (j >= z)
- j -= z;
- }
- offsets[j] = k;
- indexes[i] = k;
- val = info[i].val;
- printf("0x%.2" PRIx32 ", 0x%.2" PRIx32 ", 0x%.2" PRIx32 ", "
- "0x%.2" PRIx32 ", 0x%.2" PRIx32 ", 0x%.2" PRIx32 ", ",
- (sig >> 8) & 0xff, sig & 0xff, (val >> 24) & 0xff,
- (val >> 16) & 0xff, (val >> 8) & 0xff, val & 0xff);
- for (name = info[i].name, k += 7; (c = *name++); k++)
- printf("'%c',", c);
- printf((i == (ksnum - 1)) ? "0\n" : "0,\n");
- }
- printf("};\n");
- printf("\n");
- printf("#define KTABLESIZE %d\n", z);
- printf("#define KMAXHASH %d\n", best_max_rehash + 1);
- printf("\n");
- printf("static const unsigned short hashString[KTABLESIZE] = {\n");
- for (i = 0; i < z; ) {
- printf("0x%.4x", offsets[i]);
- i++;
- if (i == z)
- break;
- printf((i & 7) ? ", " : ",\n");
- }
- printf("\n");
- printf("};\n");
-
- best_max_rehash = ksnum;
- num_found = 0;
- for (z = ksnum; z < KTNUM; z++) {
- max_rehash = 0;
- for (name = tab, i = z; --i >= 0; )
- *name++ = 0;
- for (i = 0; i < ksnum; i++) {
- val = info[i].val;
- first = j = val % z;
- for (k = 0; tab[j]; k++) {
- if (values[j] == val)
- goto skip1;
- j += first + 1;
- if (j >= z)
- j -= z;
- if (j == first)
- goto next2;
- }
- tab[j] = 1;
- values[j] = val;
- if (k > max_rehash)
- max_rehash = k;
-skip1:;
- }
- if (max_rehash < MIN_REHASH) {
- if (max_rehash < best_max_rehash) {
- best_max_rehash = max_rehash;
- best_z = z;
- }
- num_found++;
- if (num_found >= MATCHES)
- break;
- }
-next2:;
- }
-
- z = best_z;
- if (z == 0) {
- fprintf(stderr, "makekeys: failed to find small enough hash!\n"
- "Try increasing KTNUM in makekeys.c\n");
- exit(1);
- }
- for (i = z; --i >= 0; )
- offsets[i] = 0;
- for (i = 0; i < ksnum; i++) {
- val = info[i].val;
- first = j = val % z;
- while (offsets[j]) {
- if (values[j] == val)
- goto skip2;
- j += first + 1;
- if (j >= z)
- j -= z;
- }
- offsets[j] = indexes[i] + 2;
- values[j] = val;
-skip2:;
- }
- printf("\n");
- printf("#define VTABLESIZE %d\n", z);
- printf("#define VMAXHASH %d\n", best_max_rehash + 1);
- printf("\n");
- printf("static const unsigned short hashKeysym[VTABLESIZE] = {\n");
- for (i = 0; i < z; ) {
- printf("0x%.4x", offsets[i]);
- i++;
- if (i == z)
- break;
- printf((i & 7) ? ", " : ",\n");
- }
- printf("\n");
- printf("};\n");
- printf("\n#endif /* KS_TABLES_H */\n");
-
- for (i = 0; i < ksnum; i++)
- free(info[i].name);
-
- exit(0);
-}
diff --git a/src/keysym.c b/src/keysym.c
index d659354..61de5ba 100644
--- a/src/keysym.c
+++ b/src/keysym.c
@@ -49,47 +49,43 @@
#include "xkbcommon/xkbcommon.h"
#include "utils.h"
-#include "ks_tables.h"
#include "keysym.h"
+#include "ks_tables.h"
+
+static int compare_by_keysym(const void *a, const void *b)
+{
+ const struct name_keysym *key = a, *entry = b;
+ if (key->keysym < entry->keysym)
+ return -1;
+ if (key->keysym > entry->keysym)
+ return 1;
+ return 0;
+}
+
+static int compare_by_name(const void *a, const void *b)
+{
+ const struct name_keysym *key = a, *entry = b;
+ return strcmp(key->name, entry->name);
+}
+
XKB_EXPORT int
xkb_keysym_get_name(xkb_keysym_t ks, char *buffer, size_t size)
{
- int i, n, h, idx;
- const unsigned char *entry;
- unsigned char val1, val2, val3, val4;
+ const struct name_keysym search = { .name = NULL, .keysym = ks };
+ const struct name_keysym *entry;
if ((ks & ((unsigned long) ~0x1fffffff)) != 0) {
snprintf(buffer, size, "Invalid");
return -1;
}
- /* Try to find it in our hash table. */
- if (ks <= 0x1fffffff) {
- val1 = ks >> 24;
- val2 = (ks >> 16) & 0xff;
- val3 = (ks >> 8) & 0xff;
- val4 = ks & 0xff;
- i = ks % VTABLESIZE;
- h = i + 1;
- n = VMAXHASH;
-
- while ((idx = hashKeysym[i])) {
- entry = &_XkeyTable[idx];
-
- if ((entry[0] == val1) && (entry[1] == val2) &&
- (entry[2] == val3) && (entry[3] == val4)) {
- return snprintf(buffer, size, "%s", entry + 4);
- }
-
- if (!--n)
- break;
-
- i += h;
- if (i >= VTABLESIZE)
- i -= VTABLESIZE;
- }
- }
+ entry = bsearch(&search, keysym_to_name,
+ sizeof(keysym_to_name) / sizeof(*keysym_to_name),
+ sizeof(*keysym_to_name),
+ compare_by_keysym);
+ if (entry)
+ return snprintf(buffer, size, "%s", entry->name);
if (ks >= 0x01000100 && ks <= 0x0110ffff)
/* Unnamed Unicode codepoint. */
@@ -102,42 +98,17 @@ xkb_keysym_get_name(xkb_keysym_t ks, char *buffer, size_t size)
XKB_EXPORT xkb_keysym_t
xkb_keysym_from_name(const char *s)
{
- int i, n, h, c, idx;
- uint32_t sig = 0;
- const char *p = s;
+ const struct name_keysym search = { .name = s, .keysym = 0 };
+ const struct name_keysym *entry;
char *tmp;
- const unsigned char *entry;
- unsigned char sig1, sig2;
xkb_keysym_t val;
- while ((c = *p++))
- sig = (sig << 1) + c;
-
- i = sig % KTABLESIZE;
- h = i + 1;
- sig1 = (sig >> 8) & 0xff;
- sig2 = sig & 0xff;
- n = KMAXHASH;
-
- while ((idx = hashString[i])) {
- entry = &_XkeyTable[idx];
-
- if ((entry[0] == sig1) && (entry[1] == sig2) &&
- streq(s, (const char *) entry + 6)) {
- val = (entry[2] << 24) | (entry[3] << 16) |
- (entry[4] << 8) | entry[5];
- if (!val)
- val = XKB_KEY_VoidSymbol;
- return val;
- }
-
- if (!--n)
- break;
-
- i += h;
- if (i >= KTABLESIZE)
- i -= KTABLESIZE;
- }
+ entry = bsearch(&search, name_to_keysym,
+ sizeof(name_to_keysym) / sizeof(*name_to_keysym),
+ sizeof(*name_to_keysym),
+ compare_by_name);
+ if (entry)
+ return entry->keysym;
if (*s == 'U') {
val = strtoul(&s[1], &tmp, 16);
--
1.7.12.2