Using libpci for reading PCI vendor/device/etc

Fri Jun 22 16:52:07 PDT 2007

Danny Kukawka wrote:
> I can only point to: 
> http://lists.freedesktop.org/archives/hal/2007-June/008807.html

I think this was written before it was understood that libpci could be 
nicely used for this task.

> Before we can discuss this: What are the consequences for performance and 
> memory usage (of HAL and in the system) with this patch? Need this patch more 
> memory, slow it HAL down, what happen on embedded devices (which may have not 
> the lib, but a special pci.id file) ... ? We need to see numbers ... 

I think we're going a little overboard on a small issue here. The only 
functionality that pci.ids provides to hal is pretty names for PCI 
devices, right? Surely, if you are very concerned about footprint (for 
example for an embedded device build), you would compile hal without PCI 
IDs support (this is already possible) and not ship a pci.ids at all?

I guess the other reason for concern is largely inflated memory for 
desktop users, e.g. if using libpci/pci.ids.gz were to increase memory 
usage significantly, this would certainly be a valid concern.

Anyway, you asked for it, so here are some numbers :)

ids_find_pci (the exported function for this interface) is only ever 
called in 2 situations:
  - on startup, when enumerating the PCI bus
  - on PCI hotplug or cardbus hotplug (presumably; I don't have any 
hardware to test this)

I split out the pci.ids parsing code, unmodified, into its own app. I 
then recorded the calls that go to ids_find_pci() on HAL startup, and 
copied that list into my own app.

In my app, I then call ids_find_pci() for all of my devices, repeated 50 
times. (This is on a 2 GHz Core Duo.)
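
For anyone who wants to try a similar replay without building hal, here 
is a rough sketch of the idea in Python. It is not the hal code and not 
libpci: parse_pci_ids(), find_pci(), replay() and the SAMPLE text are 
hypothetical stand-ins, and the parser only handles vendor and device 
lines of the pci.ids format (subsystem and class entries are skipped).

```python
import time

# Tiny illustrative excerpt in pci.ids layout (not a real recorded call list).
SAMPLE = (
    "# comment line\n"
    "8086  Intel Corporation\n"
    "\t1237  440FX - 82441FX PMC [Natoma]\n"
    "\t\t1af4 1100  Qemu virtual machine\n"
    "C 00  Unclassified device\n"
    "\t00  Non-VGA unclassified device\n"
)

def parse_pci_ids(text):
    """Return {vendor_id: (vendor_name, {device_id: device_name})}."""
    vendors = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        if line.startswith("\t\t"):
            continue  # subsystem entries are ignored in this sketch
        if line.startswith("\t"):
            if current is not None:
                dev_id, _, name = line.strip().partition("  ")
                current[1][int(dev_id, 16)] = name.strip()
            continue
        if line.startswith("C "):
            current = None  # device class section: not a vendor
            continue
        ven_id, _, name = line.partition("  ")
        current = (name.strip(), {})
        vendors[int(ven_id, 16)] = current
    return vendors

def find_pci(vendors, vendor_id, device_id):
    """Look up (vendor_name, device_name); missing parts come back as None."""
    vendor = vendors.get(vendor_id)
    if vendor is None:
        return None, None
    return vendor[0], vendor[1].get(device_id)

def replay(vendors, calls, cycles=50):
    """Replay a recorded list of (vendor_id, device_id) lookups, timed."""
    start = time.perf_counter()
    for _ in range(cycles):
        for vendor_id, device_id in calls:
            find_pci(vendors, vendor_id, device_id)
    return time.perf_counter() - start

if __name__ == "__main__":
    table = parse_pci_ids(SAMPLE)
    elapsed = replay(table, [(0x8086, 0x1237)], cycles=50)
    print(f"50 cycles took {elapsed:.6f} s")
```

Timings from a sketch like this say nothing about the C code paths 
measured in this mail; it only shows the shape of the test.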

The current plaintext pci.ids parsing code from hal executes in 1.577 
seconds.
The code proposed in Mike's latest patch executes the same operations 
(again on plaintext pci.ids) in 0.104 seconds, i.e. using libpci is 
about 15 times faster.

I then compared the smaps of these 2 processes for when they have done 
all their work.
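
A comparison like this can be scripted. The following is a minimal 
sketch, assuming the usual /proc/<pid>/smaps layout (a mapping header 
line followed by "Name:   N kB" fields). summarize_smaps() is a 
hypothetical helper, and the SAMPLE text is illustrative: the addresses 
and the pci.ids path are made up, only the 396 kB figure comes from my 
run.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode   path"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$")
# Per-mapping counter: "Rss:      396 kB"
FIELD = re.compile(r"^(\w+):\s+(\d+) kB\s*$")

SAMPLE = """\
b7c00000-b7c63000 r--p 00000000 08:01 123456     /usr/share/misc/pci.ids
Size:                396 kB
Rss:                 396 kB
Private_Clean:       396 kB
Private_Dirty:         0 kB
0804a000-0806b000 rw-p 0804a000 00:00 0          [heap]
Size:                132 kB
Rss:                   0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
"""

def summarize_smaps(text):
    """Sum each kB counter per (path, perms) mapping."""
    totals = defaultdict(lambda: defaultdict(int))
    key = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            perms, path = m.groups()
            key = (path or "[anon]", perms)
            continue
        m = FIELD.match(line)
        if m and key is not None:
            totals[key][m.group(1)] += int(m.group(2))
    return totals

if __name__ == "__main__":
    for (path, perms), fields in summarize_smaps(SAMPLE).items():
        print(path, perms, "Rss:", fields["Rss"], "kB",
              "Private_Dirty:", fields["Private_Dirty"], "kB")
```

To run it against a live process, read the text from /proc/<pid>/smaps 
instead of SAMPLE.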

For the plaintext parsing code, 396kb RSS is used on the pci.ids 
mapping. This is the size of my pci.ids file. It is all private_clean 
memory. libc has 220kb of RSS in its r-xp segment. No heap is used.

For the libpci code: 536kb RSS is used on the heap (entirely private 
dirty), presumably from the libpci code. libc has 256kb of r-xp RSS. The 
libz maps occupy a further 28kb of RSS.

Summary: libpci is slightly less efficient in terms of memory usage, 
140kb more (536kb heap vs 396kb file mapping), and it's dirty memory 
rather than clean. libc RSS also increased a little (220kb to 256kb). 
The libz mapping was insignificant (and this wasn't a gz-compressed 
pci.ids anyway).

Next experiments were the libpci code, using an uncompressed pci.ids vs 
a compressed one.

Performance test: from the last results, uncompressed pci.ids took 0.104 
seconds for 50 cycles. pci.ids.gz takes 0.136 seconds for the same work, 
i.e. the initial decompress operation took approximately 32ms.
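
The arithmetic behind the timing figures, as a quick check. The three 
numbers are the measurements quoted in this mail; attributing the whole 
gzip difference to a single decompress is my assumption.

```python
# Measured wall-clock times for 50 cycles, in seconds (from this mail).
plaintext_hal = 1.577   # current hal parser, plaintext pci.ids
libpci_plain = 0.104    # Mike's patch, plaintext pci.ids
libpci_gzip = 0.136     # Mike's patch, pci.ids.gz

speedup = plaintext_hal / libpci_plain
gzip_overhead_ms = (libpci_gzip - libpci_plain) * 1000

print(f"libpci speedup: {speedup:.1f}x")            # 15.2x
print(f"gzip overhead: {gzip_overhead_ms:.0f} ms")  # 32 ms
```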

Memory maps: here are the results from the uncompressed run again:
536kb RSS is used on the heap (entirely private dirty), presumably from 
the libpci code. libc has 256kb of r-xp RSS. The libz maps occupy a 
further 28kb of RSS.

For the pci.ids.gz run:
572kb RSS is used on the heap (entirely private dirty). libc has 260kb 
of r-xp RSS. The libz maps occupy a further 52kb RSS.

Summary: gzip incurred a tiny speed penalty and increased heap usage by 
36kb. libc RSS usage increased insignificantly. libz maps approximately 
doubled in RSS usage (28kb to 52kb), but still remained surprisingly 
small.

Final experiment: how does Mike's patch affect the size of hald?
I compiled with the default CFLAGS="-g -O2" and then stripped the 
objects and binaries before measuring size. Default configure options 
were used, including verbose messages.

Originally:
hald: 284332
ids.o: 20800

Now after applying Mike's patch:
hald: 301392
ids.o: 19552

The significant hald increase is because it is statically linked 
against libpci. We can retest after we start using dynamically linked 
libpci.so libraries.
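
For reference, the size deltas worked out explicitly. The before/after 
figures are the ones listed above; I'm assuming they are byte counts, 
and delta() is just a helper for this sketch.

```python
# Stripped sizes before and after Mike's patch (figures from this mail,
# assumed to be in bytes).
sizes = {"hald": (284332, 301392), "ids.o": (20800, 19552)}

def delta(before, after):
    """Return (absolute difference, percentage change)."""
    diff = after - before
    return diff, 100.0 * diff / before

for name, (before, after) in sizes.items():
    diff, pct = delta(before, after)
    print(f"{name}: {diff:+d} bytes ({pct:+.1f}%)")
# hald: +17060 bytes (+6.0%)
# ids.o: -1248 bytes (-6.0%)
```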

Conclusion:
The plaintext parsing approach doesn't use any dirty memory or the heap. 
The libpci approach uses dirty heap memory, and memory usage is slightly 
(but not significantly) higher. (I don't know exactly what the 
difference between clean and dirty memory is, so I'll stop my comments 
there.)

The plaintext parsing approach is somewhat inefficient, in that we end 
up page-faulting all the file pages into memory anyway (I would guess 
this is very typical for a system) and the code to look up the IDs is 
slow (although I very much doubt this was written with performance in 
mind - who cares?)

Uncompressed vs compressed pci.ids using libpci code didn't really yield 
significant differences -- using pci.ids.gz doesn't appear to be a 
resource eater.

Mike's patch is also a nice cleanup if we decide to take this route.
  6 files changed, 71 insertions(+), 283 deletions(-)

Further comments and test suggestions are welcome.

Daniel