Using libpci for reading PCI vendor/device/etc
Daniel Drake
dsd at gentoo.org
Fri Jun 22 16:52:07 PDT 2007
Danny Kukawka wrote:
> I can only point to:
> http://lists.freedesktop.org/archives/hal/2007-June/008807.html
I think this was written before it was understood that libpci could be
nicely used for this task.
> Before we can discuss this: What are the consequences for performance and
> memory usage (of HAL and in the system) with this patch? Need this patch more
> memory, slow it HAL down, what happen on embedded devices (which may have not
> the lib, but a special pci.id file) ... ? We need to see numbers ...
I think we're going a little overboard on a small issue here. The only
functionality that pci.ids provides to hal is pretty names for PCI
devices, right? Surely, if you are very concerned about footprint (for
example for an embedded device build), you would compile hal without PCI
IDs support (this is already possible) and not ship a pci.ids at all?
I guess the other reason for concern is largely inflated memory for
desktop users, e.g. if using libpci/pci.ids.gz were to increase memory
usage significantly, this would certainly be a valid concern.
Anyway, you asked for it, so here are some numbers :)
ids_find_pci (the exported function for this interface) is only ever
called in 2 situations:
- on startup, when enumerating the PCI bus
- on PCI hotplug, or cardbus hotplug (presumably, I don't have any
hardware to test)
I split out the pci.ids parsing code, unmodified, into its own app. I
then recorded the calls that go to ids_find_pci() on HAL startup, and
copied that list into my own app.
In my app, I then call ids_find_pci() for all of my devices, repeated 50
times. (This is on a 2 GHz Core Duo.)
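For anyone wanting to repeat the measurement, here is a minimal sketch of
such a harness. It is not the actual test app: the lookup_fn signature,
struct pci_id, time_lookups() and the counting stub noop_lookup() are all
hypothetical names made up for illustration, and it measures CPU time with
clock(), which may differ from how the figures in this mail were taken.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical lookup signature; stands in for whichever lookup is measured. */
typedef void (*lookup_fn)(unsigned vendor, unsigned device);

/* One recorded lookup request (vendor/device pair). */
struct pci_id {
    unsigned vendor;
    unsigned device;
};

/* Placeholder lookup that only counts how often it is called. */
static unsigned long lookup_calls;

static void noop_lookup(unsigned vendor, unsigned device)
{
    (void)vendor;
    (void)device;
    lookup_calls++;
}

/* Run fn over every recorded ID, cycles times; return CPU seconds used. */
static double time_lookups(lookup_fn fn, const struct pci_id *ids,
                           size_t n, int cycles)
{
    clock_t t0 = clock();

    for (int c = 0; c < cycles; c++)
        for (size_t i = 0; i < n; i++)
            fn(ids[i].vendor, ids[i].device);

    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Replacing noop_lookup with a wrapper around the real lookup, and passing
cycles = 50, gives the kind of comparison described here.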
The current plaintext pci.ids parsing code from hal executes in 1.577
seconds.
The code proposed in Mike's latest patch executes the same operations
(again on plaintext pci.ids) in 0.104 seconds, i.e. using libpci is 15
times faster.
I then compared the smaps of these 2 processes for when they have done
all their work.
For the plaintext parsing code, 396kb RSS is used on the pci.ids
mapping. This is the size of my pci.ids file. It is all private_clean
memory. libc has 220kb of RSS in its r-xp segment. No heap is used.
For the libpci code: 536kb RSS is used on the heap (entirely private
dirty), presumably from the libpci code. libc has 256kb of r-xp RSS. The
libz maps occupy a further 28kb of RSS.
Summary: libpci is slightly less efficient in terms of memory usage,
140kb more, and it's dirty memory rather than clean. libc RSS also
increased a little (220kb to 256kb). libz mapping was insignificant (and
this wasn't a gz-compressed pci.ids anyway).
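The per-mapping RSS figures come from /proc/<pid>/smaps. A minimal sketch
of how they could be totalled, assuming the usual smaps layout (a header
line starting with the address range, followed by "Rss: N kB" style
fields). The function name rss_kb_for() is hypothetical, and it sums all
segments whose header line contains the given string rather than picking
out one segment such as r-xp.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the Rss fields (in kB) of every mapping whose header line contains
 * needle, e.g. "[heap]" or "libz". smaps is an already opened stream.
 */
static long rss_kb_for(FILE *smaps, const char *needle)
{
    char line[512];
    long total = 0;
    int match = 0;

    while (fgets(line, sizeof line, smaps)) {
        unsigned long start, end;
        long kb;

        if (sscanf(line, "%lx-%lx ", &start, &end) == 2) {
            /* Header line of a new mapping: decide whether to count it. */
            match = strstr(line, needle) != NULL;
        } else if (match && sscanf(line, "Rss: %ld kB", &kb) == 1) {
            total += kb;
        }
    }
    return total;
}
```

Calling it on fopen("/proc/self/smaps", "r") at the end of the test run,
once with "[heap]" and once with a library name, gives numbers of the
kind quoted above.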
Next experiments were the libpci code, using an uncompressed pci.ids vs
a compressed one.
Performance test: from the last results, uncompressed pci.ids took 0.104
seconds for 50 cycles. pci.ids.gz takes 0.136 seconds for the same work,
i.e. the initial decompress operation took approximately 32ms.
Memory maps: here are the results from the uncompressed run again:
536kb RSS is used on the heap (entirely private dirty), presumably from
the libpci code. libc has 256kb of r-xp RSS. The libz maps occupy a
further 28kb of RSS.
For the pci.ids.gz run:
572kb RSS is used on the heap (entirely private dirty). libc has 260kb
of r-xp RSS. The libz maps occupy a further 52kb RSS.
Summary: gzip incurred a tiny speed penalty, increased heap usage by
36kb. libc RSS usage increased insignificantly. libz maps approximately
doubled in RSS usage, but still remained surprisingly small.
Final experiments: how does Mike's patch affect the size of hald?
Compiling using default CFLAGS="-g -O2" and then stripping the objects
and binaries before measuring size. Default configure options, including
verbose messages.
Originally:
hald: 284332
ids.o: 20800
Now after applying Mike's patch:
hald: 301392
ids.o: 19552
The significant hald increase (17060 bytes) is because hald is now
statically linked against libpci. We can retest after we start using
dynamically linked libpci.so libraries.
Conclusion:
The plaintext parsing approach doesn't use any dirty memory or the heap.
The libpci approach uses dirty heap memory, and memory usage is slightly
(but not significantly) higher. (I don't know exactly what the
difference between clean and dirty memory is, so I stop my comments there)
The plaintext parsing approach is somewhat inefficient, in that we end
up page-faulting all the file pages into memory anyway (I would guess
this is very typical for a system) and the code to look up the IDs is
slow (although I very much doubt this was written with performance in
mind - who cares?)
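To illustrate why a plaintext lookup ends up touching the whole file,
here is a minimal sketch of a linear scan for a vendor name. This is not
the code from hal's ids.c. The names is_vendor_line() and
find_vendor_name() are hypothetical, and it assumes the usual pci.ids
layout where a vendor line starts in column 0 with four hex digits and
two spaces, and device lines are indented with a tab.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A vendor line looks like "8086  Name": 4 hex digits, 2 spaces, name. */
static int is_vendor_line(const char *p, size_t len)
{
    return len > 6 &&
           isxdigit((unsigned char)p[0]) && isxdigit((unsigned char)p[1]) &&
           isxdigit((unsigned char)p[2]) && isxdigit((unsigned char)p[3]) &&
           p[4] == ' ' && p[5] == ' ';
}

/*
 * Walk the in-memory pci.ids text line by line until the vendor is found.
 * Copies the (possibly truncated) name into out; returns 1 if found.
 */
static int find_vendor_name(const char *text, unsigned vendor,
                            char *out, size_t outlen)
{
    if (outlen == 0)
        return 0;

    for (const char *p = text; *p; ) {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);

        if (is_vendor_line(p, len) && strtoul(p, NULL, 16) == vendor) {
            size_t n = len - 6;
            if (n > outlen - 1)
                n = outlen - 1;
            memcpy(out, p + 6, n);
            out[n] = '\0';
            return 1;
        }

        /* Skip to the start of the next line. */
        p += len;
        if (*p == '\n')
            p++;
    }
    return 0;
}
```

Every lookup for a vendor near the end of the file reads through
everything before it, so over a full bus enumeration most of the mapped
pages get faulted in.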
Uncompressed vs compressed pci.ids using libpci code didn't really yield
significant differences -- using pci.ids.gz doesn't appear to be a
resource eater.
Mike's patch is also a nice cleanup if we decide to take this route.
6 files changed, 71 insertions(+), 283 deletions(-)
Further comments and test suggestions are welcome.
Daniel