hal draft spec

David Zeuthen david at fubar.dk
Tue Sep 23 22:53:15 EEST 2003


On Tue, 2003-09-23 at 04:18, Dave Malcolm wrote:
> Perhaps all the <value> tags should be expressed with a more
> strongly-typed format.  We could then write a schema that would mean we
> could automatically validate deviceinfo files to stop certain kinds of
> error.
> 
> Perhaps things like:
> 	<value type="boolean">
> 	<value type="translatable-string">
> 
> Or somesuch.  It also gives people writing XML editors (like me) a
> chance to provide appropriate widgets.
> 

Validation and editors would be very nice. Can't really do that with
scripts at all..

If we do XML, I can put in a requirement for having a type attribute in
the spec and what values it can assume (e.g. boolean, string, hex,
decimal, version-number etc.). 
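
To make the idea a bit more concrete, here is a rough sketch of what checking a type attribute could look like. The set of type names is taken from the list above; the exact syntax accepted for each type (e.g. "true"/"false" for booleans, major.minor.sub for versions) and the function name validate_values are only assumptions for illustration, not something the spec says yet.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical checkers for the type names floated above; the accepted
# syntax for each type is an assumption, not part of any spec.
CHECKERS = {
    "boolean": lambda s: s in ("true", "false"),
    "string": lambda s: True,
    "hex": lambda s: re.fullmatch(r"(0x)?[0-9a-fA-F]+", s) is not None,
    "decimal": lambda s: re.fullmatch(r"[0-9]+", s) is not None,
    "version-number": lambda s: re.fullmatch(r"[0-9]+\.[0-9]+\.[0-9]+", s) is not None,
}

def validate_values(xml_text):
    """Return a list of error strings, one per bad <value> element."""
    errors = []
    root = ET.fromstring(xml_text)
    for value in root.iter("value"):
        type_name = value.get("type")
        text = (value.text or "").strip()
        if type_name is None:
            errors.append("missing type attribute")
        elif type_name not in CHECKERS:
            errors.append("unknown type: " + type_name)
        elif not CHECKERS[type_name](text):
            errors.append("bad %s value: %r" % (type_name, text))
    return errors
```

A real schema language would probably do this job better; the point is only that a typed <value> makes this kind of check mechanical.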

> 
> > Most fields are obvious I hope. There is a <match> section to determine
> > if a device matches and a <merge> section for merging properties. The
> > former uses AND and OR to test properties. If the so formed statement
> > evaluates to TRUE the device is identified and we proceed to the <merge>
> > section.
> 
> If I understand this correctly, you have an object with a load of
> key-value mappings, and you test if the device-info matches it.  You can
> then add a load more key-value mappings containing the extra information
> that the device-info file has told you. 
> 
> Since the keys and values have changed, do you now have to re-test the
> object against all device-info files, in case it now matches some of the
> other ones (including the ones you might have tested already)?
> 

Well, no, that wasn't the idea. The <match> section should only match
against bus-specific properties (e.g. usb.idVendor), that is, properties
available from something like a hot-plug agent. In the <match> section
you are not allowed to set any properties, so a merge can never change
the outcome of a match.
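
A minimal sketch of such a read-only match, just to illustrate. The tag names <match>, <and>, <or> and <equal> come from the draft; the attribute names "key" and "value", the function names, and the sample IDs in the usage note are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

def evaluate(node, props):
    """Evaluate one node of a <match> tree against read-only properties."""
    if node.tag in ("match", "and"):
        # Note: an empty <match> or <and> is vacuously true here.
        return all(evaluate(child, props) for child in node)
    if node.tag == "or":
        return any(evaluate(child, props) for child in node)
    if node.tag == "equal":
        return props.get(node.get("key")) == node.get("value")
    raise ValueError("unexpected tag in <match>: " + node.tag)

def matches(device_info_xml, props):
    """True if the file's <match> section accepts the given properties."""
    root = ET.fromstring(device_info_xml)
    match = root.find("match")
    return match is not None and evaluate(match, props)
```

For example, with a <match> that ANDs an <equal key="usb.idVendor" value="0x046d"/> with an <or> of two usb.idProduct tests, matches() only ever reads the props dictionary, so there is nothing to re-test after a merge.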

Would it be useful to merge information from several device info files
for a single device?

> 
> > The <merge> also uses <if> tags to test for different OS'es.
> > 
> > Note that instead of using the <equal> tag (meaning the property with
> > that key must have that exact value) one could use tags that interpreted
> > the value as a hex-number, decimal-number or version-number
> > (major.minor.sub) and used ranges to check on. regexp tags could be
> > introduced as well.
> 
> Aha - good.  Then we get validation etc as I mentioned above.
> 

Yeah, but having type attributes is better!

> > 
> > So, I quickly realized that if I added a <while> tag I would have my own
> > little programming language.. 
> 
> Yes, but if you don't need such a tag don't add one! :-)
> 

No.. yes.. no.. well, it would be fun!

> > 
> > Now, no one should have to create programming languages for matching
> > devices and device info files!
> 
> > So, in the best interest of using existing widely-deployed technology it
> > would be good to use existing languages. The device-info files could
> > perhaps be python scripts?
> 
> Or you could have an xml tag that invoked an external script, perhaps
> something like this:
> <match>
> 	<and>
> 		<external-script 
> 			filename="test-for-frobnicator.sh"
> 			needs-root="no"/>
> 		<internal-script lang="python">
> some stuff here:
> 	and now we get into 
> 		horrible whitespace arguments</internal-script>
> 	</and>
> </match>
> 

Could do.

> XML gives you validation, and lets you use stylesheets as a query
> language to autogenerate various kinds of useful report (e.g. "find all
> USB drivers, make a table of them displaying the vendor/product IDs
> along with their names in German") if you're that way inclined.
> 

I suppose distributors would like to do that. One could require script
authors to put in a "--list-supported" switch, but would they do this?
(me thinks not).

> It lets you put metadata into the file that can be accessed without
> having to execute the file.
> 
> You could also do neat stuff like embed SVG icons in the <device-info>
> file.

Yes, XML is inherently extensible. It's a very good point.

> 
> > 
> > Security considerations? Well, if we stayed with XML the malicious
> > attacker could just put rogue code in BootProgram, so installing device
> > info files would require root privileges anyway.. We could run the
> > scripts as nobody and in a chroot jail.
> 
> With the XML approach the attacker can only do bad stuff in the
> device-info file after it has been matched, and only in certain ways. 
> With an arbitrary programming language the attacker can put bad stuff
> into the device-info file at the matching stage: there's no way to
> separate the match from the merge - you're just calling some script.
> 

True, but..

The XML file could have an, in effect, empty <match> section. 

Clearly we can guard ourselves against the trivial empty section, but
there are many ways to express something that is always true. It is
probably impractical to catch situations like this.

> With the XML approach, you can download a device-info file (perhaps
> automatically) and test if it's appropriate to your hardware without
> needing any privileges.
> 

Yeah, but the same things can be said about scripts (see below).

> Also, distributors could provide a web service which runs on a server
> and can determine which device-info is appropriate.  If a user plugs
> something unrecognised into their computer, the web service could be
> invoked, and then the server can do the device-info match, and suggest a
> package to the user.  I believe this would be much easier to do using an
> XML approach than with scripts.
> 
> > Footprint? Only hald and applications wanting to assist users in
> > selecting / composing device info files would link with this. Not libhal
> > applications.
> > 
> > Performance? Good question.. In the worst case we would have to traverse
> > all device info scripts whenever a new device is inserted. There may be
> > thousands of them.
> 
> If you parse all the device-info XML files into a specialised in-memory
> representation, and keep this around (at a cost of perhaps a k per
> device-info file), then the pure XML approach is faster - you've got the
> parse tree of all the programs to hand and you just walk them.  
> 
> Easier to code is to simply keep the DOM tree of the XML, which might
> cost perhaps a few k per device-info file, and is somewhat more
> expensive to walk.
> 
> Plus you can do the kinds of queries I mentioned above.
> 

Yes, this was what Havoc mentioned about on-disk formats. It's a good
point. Remember that .py files also can be stored as .pyc.
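
As a rough sketch of the "keep the tree around" option Dave describes: parse every device-info file once at startup and hold on to the parsed trees. The ".xml" extension, the function name and the choice to skip malformed files are all assumptions for the sake of the example.

```python
import glob
import os
import xml.etree.ElementTree as ET

def load_device_info_cache(directory):
    """Parse every *.xml file in directory once; map filename -> root element."""
    cache = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.xml"))):
        try:
            cache[os.path.basename(path)] = ET.parse(path).getroot()
        except ET.ParseError:
            # Skip malformed files rather than aborting the whole load.
            continue
    return cache
```

When a device shows up, hald would then walk the cached trees instead of touching the disk again.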

> > 
> > I guess, first of all we would embed the python interpreter and extend
> > it with an object that the scripts access. 
> > 
> > Second, the scripts would terminate rather quickly since they do trivial
> > tests in the beginning. Does anyone have any numbers on this?
> > 
> > Let me know what you guys think. Is it too much to embed a script
> > interpreter in hald or too ugly not to use something as standard and
> > accepted as XML?
> > 
> > (Python is of course just an example, but I like Python)
> 
> As you may have guessed, I like XML :-)
> 

Yes I know :-) 

XML is nice, I agree, but it hurts my eyes to look at how clumsy the
device-info file looks. Don't you agree?

Other reasons for using scripts are

  o  An OEM can ship a single device info file for all of his devices

  o  OEMs can use existing web-service infrastructure for pulling data

  o  Last-to-be-matched scripts[1], shipped with HAL, scan pci.ids
     and usb.ids from /usr/share/hwdata to give some kind of id to the
     device.

Regarding security I'm not that concerned. Sadly, people today (though
not enterprises) seem to install RPMs and tarballs from anywhere.
Installing device info files will require root access anyway (to guard
against malicious code in BootProgram). And scripts can always run as
nobody. 

In any event, we should add certificates to what we end up using and at
least sign the HAL supplied device info files. I guess distributors and
enterprises would require something like this.

Hmm.. now I'm even more confused about what to use :-) 

(As George pointed out earlier it's quite key to get this right initially)

Cheers,
David

[1] : I suppose that device info files, XML or scripts, would be stored
      in a directory structure like this, where we search the
      highest-numbered directory first:

      /var/cache/hal/device-info-files/10_hal
                                       20_debian   (or other os vendor)
                                       30_dell     (or other PC OEM)
                                       40_site
                                       90_local
                                       99_user
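
      The lookup order could be computed roughly like this. It is only
      a sketch: the function name is made up, and skipping entries
      without a numeric prefix is an assumption.

```python
import os
import re

def search_order(base_dir):
    """Return sub-directories of base_dir, highest numeric prefix first."""
    ranked = []
    for name in os.listdir(base_dir):
        m = re.match(r"(\d+)_", name)
        # Ignore anything that is not a directory named NN_something.
        if m and os.path.isdir(os.path.join(base_dir, name)):
            ranked.append((int(m.group(1)), name))
    ranked.sort(reverse=True)
    return [os.path.join(base_dir, name) for _, name in ranked]
```

      With the layout above this yields 99_user first and 10_hal last,
      so the HAL-supplied fallback scripts are the last to be matched.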




