[Roadster] sqlite database spec

Ian McIntosh ian at openanswers.org
Fri Aug 18 10:48:11 PDT 2006


In response to Adam Renner's post, here's one possible database spec
that has been developed jointly by Jeremy Cole, Ofer Achler, and me.

It doesn't require any "spatial" features in the database.

The main problems with our current approach using the MySQL Spatial
Index are:

1) Every point takes 16 bytes.  Storing the roads in the US would take
several gigabytes this way.

2) It doesn't let you select by area AND by type (eg "highway") in any
optimized way, which is related to 3, below.. 

3) For every road segment read in (or just examined), MySQL potentially
does a 'seek' operation on the harddrive.  Seeks are slow, and hundreds
of seeks are unacceptably slow.

Put simply, we need fewer seeks and smaller storage on disk.

======================

Here's the proposal:

* Split the world up into square tiles.
* All objects within that part of the world are stored in the tile.
* Objects that span 2+ tiles are cut.
* Object coordinates are stored as integer offsets from the tile's
origin (with 1 unit = ~5 feet).
* A tile number is derived from its Lan/Lon position in the world,
making it possible to generate a list of tile IDs that cover any given
area of Earth.

CREATE TABLE tile (
  id         	INT4            NOT NULL,
  data          MEDIUMBLOB      NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE object_name (
  id		INT3 UNSIGNED   NOT NULL,
  name		VARCHAR(30)     NOT NULL,
  tile_ids	MEDIUMBLOB	NOT NULL,

  PRIMARY KEY (id),
  INDEX (name(7))
);

(The object_name.tile_ids field lists tiles where this roadname shows up
and is used for search.)

The size of each tile is chosen so that each point within the tile only
requires 3 bytes (instead of MySQL's 16) while maintaining 5 foot
precision.

We have the format of tile.data spec'd out, but I'd like to hear
feedback on this approach before diving deeper.

-Ian



More information about the roadster mailing list