[Roadster] transcript from roadster meeting - Feb 27, 2005

Ian McIntosh ian_mcintosh at linuxadvocate.org
Sun Feb 27 11:54:53 PST 2005


(It was rather spur-of-the-moment; I'll announce future ones on the
list.)

Attendees:
ian - Ian McIntosh (me)
noif - Nathan Fredrickson
sgarrity - Steven Garrity

(All three of us are subscribed to this list in case there are
questions/comments.)

-Ian

======================

(12:51:45) ian: ok. should we just throw out some likely user search
formats?
(12:52:14) noif: the one google gets wrong: "kneeland st, boston ma"
(12:52:22) ian: that's pathetic
(12:52:43) ian: seems like an easy one, given presence of "st"
(12:52:56) noif: is the comma required?
(12:53:21) ian: well the ideal is to not require it right?
(12:54:22) noif: a single word like "Massachusetts" should probably
return results in order of importance: state, street, POI 
(12:54:37) sgarrity: agreed
(12:55:03) noif: yeah, ideally no commas required, I don't know how
feasible that is though
(12:55:19) ian: in general I think putting POI in front of streets is
good, because you can always ADD "st" or "ave" but if the POI is simply
named "Massachusetts" you can't be more specific
(12:56:35) ian: although we can deal with ordering later
(12:56:48) ian: another possibility is to always order by distance 
(12:56:48) noif: ok, so one of the first steps is looking for road
types (st/ave/place) to split up the search terms
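
A rough sketch of the road-type split noif describes, in Python for brevity. The suffix list and the function name `split_on_road_type` are assumptions made for illustration; the discussion does not fix either. Commas are treated as whitespace, in line with the "no commas required" goal.

```python
# Hypothetical suffix list; the discussion only names st/ave/place.
ROAD_SUFFIXES = {"st", "street", "ave", "avenue", "pl", "place", "rd", "road", "blvd"}

def split_on_road_type(query):
    """Split a query at the first road-type token.

    Returns (street_words, suffix, remaining_words).
    suffix is None when no road type is found.
    """
    # Commas carry no meaning here, so they are treated as spaces.
    words = query.replace(",", " ").split()
    for i, word in enumerate(words):
        token = word.lower().rstrip(".")
        # i > 0: a suffix needs at least one word in front of it.
        if i > 0 and token in ROAD_SUFFIXES:
            return words[:i], token, words[i + 1:]
    return words, None, []
```

With this, "kneeland st, boston ma" splits into the street words ["kneeland"], the suffix "st", and the remainder ["boston", "ma"], while a bare "Boston" comes back with no suffix.
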
(12:57:00) sgarrity: so, what types of things can be searched for?
(12:57:17) sgarrity: roads/streets, POI, states, etc...
(12:57:21) ian: can you search for cities?
(12:57:29) ian: "cambridge, ma" yeah why not right?
(12:57:34) sgarrity: I think so, yeah.
(12:57:43) ian: what about just "cambridge"
(12:57:45) sgarrity: I would start on a trip to Boston with "Boston"
(12:57:49) noif: yeah, I do that in google all the time
(12:58:21) ian: how do we prioritize cities? there are many Bostons
(12:58:21) noif: and without the comma
(12:58:50) noif: google knows that "boston" means "boston ma"
(12:59:03) sgarrity: offer a list organized by state?
(12:59:04) ian: I think we should say officially commas do nothing--
unless we can think of a case where it helps to disambiguate
(12:59:16) noif: does the tiger data contain any info on the size of a
city?
(12:59:24) ian: good question-- I don't know
(12:59:49) ian: it almost seems like we should hardcode the top 50
cities or something
(12:59:50) sgarrity: yeah, population would be a good criterion to order
by
(13:00:04) ian: I mean, this is "Roadster: US Edition" right now-- we
could totally cheat
(13:00:29) sgarrity: Google has the advantage that they can use
frequency of search, something we won't have.
(13:00:47) ian: true, although we COULD use frequency of selection from
the results
(13:01:11) ian: so the second time you search for "Boston" it will show
the chosen one from last search on top
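
One way the "frequency of selection" idea could look, as a minimal sketch. The class `SelectionHistory` and its methods are hypothetical names, and the history is kept only in memory here; nothing in the discussion says how or where it would be stored.

```python
from collections import Counter

class SelectionHistory:
    """Remembers which result the user picked for each query (local only)."""

    def __init__(self):
        self._counts = Counter()  # (query, result) -> times chosen

    def record(self, query, result):
        self._counts[(query.lower(), result)] += 1

    def reorder(self, query, results):
        """Stable sort: previously chosen results first, the rest keep their order."""
        q = query.lower()
        return sorted(results, key=lambda r: -self._counts[(q, r)])
```

Because the sort is stable, results that were never chosen stay in their original order, which limits how much the list moves between searches.
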
(13:01:33) sgarrity: that would make for "dancing" search results too
though
(13:01:35) noif: yeah, but it's all local, google aggregates all
searches
(13:02:08) ian: I have two so far:
(13:02:09) ian: kneeland st, boston ma
(13:02:09) ian: cambridge, ma
(13:02:15) ian: oh and just cityname
(13:02:41) ian: what else?
(13:02:48) ian: just street name?
(13:02:53) sgarrity: what about just a street address (no city, state?):
(13:02:54) sgarrity: 1600 Pennsylvania ave.
(13:03:09) noif: do we have a home preference? the location of the
initially shown map, and the default for POI searches
(13:03:22) ian: we don't have one but I think we should
(13:03:29) ian: what do you mean default for POI searches?
(13:04:56) noif: are POI searched within the current map, or the entire
db?
(13:05:20) noif: i can see within the current map being more useful for
POI
(13:05:26) ian: well, I'm still not a big fan of "current map" but that
may be due to it currently being slow when zoomed out
(13:06:07) ian: sgarrity: added that one.
(13:06:50) ian: how do zip codes factor in to this?
(13:07:25) ian: seems just like a further criterion for "16 Kneeland St,
Boston MA" but then, what if the user is wrong? do we not show any
results?
(13:07:33) sgarrity: So, if you search for "Springfield", and it is the
name of 35 cities, 12 counties, and 1500 roads, do we present results
divided by type ("cities:, counties:, roads:, etc.")
(13:08:06) ian: we don't do counties right now, do we need to?
(13:08:36) sgarrity: not sure, I just meant as an example
(13:09:11) ian: what are your thoughts on splitting up results by type?
(13:09:57) noif: I was thinking mixed results, but identifiable by type
icons
(13:10:10) sgarrity: I kinda like it. Either with header/title-dividers
starting with the largest types, or all together, sorted by relevance,
but with icon types.
(13:10:17) sgarrity: noif, yeah, that's cool.
(13:10:22) ian: icon types could be sexy
(13:10:49) ian: what types would there be? just road, city and state?
(and icons from POI sets)
(13:11:23) sgarrity: what about other objects on the maps, lakes, parks,
etc? are those searchable?
(13:11:43) ian: they are now, although not by city/state
(13:12:41) ian: they can span cities but I think we could assign a state
to them
(13:13:23) sgarrity: what state is the Mississippi River in? ;-)
(13:13:50) noif: lakes and parks are POI
(13:14:12) ian: many are polygons too
(13:14:16) noif: why do we need to assign a state to them?
(13:14:41) sgarrity: Wow, did you realize that Google Maps lets you use
+, -, and the arrow keys for navigation?
(13:14:57) ian: yeah I found the arrow key thing, very nice
(13:15:04) noif: the simplest POI is a lat/lon and some text
(13:15:32) ian: well, do we need state info for parks/schools/etc.?
maybe not
(13:16:09) noif: only if it's available, but roadster should not require
it
(13:16:12) ian: should "smith park" match all smith parks in the
country?
(13:16:47) sgarrity: probably, yeah - and you have to specify a state to
narrow down a search.
(13:17:11) sgarrity: search should probably start with the largest item
type in the search and whittle down from there?
(13:17:19) noif: or optionally search for POI in the current map (not
sure how to specify that)
(13:18:15) sgarrity: noif: yeah, that's what I was thinking.
(13:18:32) ian: not sure how to specify what?
(13:18:54) sgarrity: whether you are searching current map or everywhere
(13:19:18) sgarrity: btw, I'm getting this on "make" with the latest
roadster CVS: /usr/bin/ld: cannot find -lmysqld
(13:20:14) noif: yeah, you don't have embedded mysql... I'm going to
make an RPM for use with fedora
(13:20:22) sgarrity: ok, thanks.
(13:20:36) sgarrity: let me know when you do and I'll update the wiki
(13:20:38) noif: building mysql from source takes forever
(13:21:42) ian: so do we show cities before roads? states before cities?
(13:22:02) sgarrity: yeah, I think so, order by size/scale
(13:22:05) ian: "Boston" has tons of results
(13:22:24) ian: there is a "Boston Road" in probably every state
(13:22:46) ian: there's about 20 in MA
(13:23:17) ian: at least one in VT
(13:23:36) sgarrity: yeah, states first, then cities, then roads.
(13:24:05) ian: so "Springfield" will not give you any usable road
results
(13:25:19) ian: perhaps that's ok? the user knows more info. at least
state.
(13:25:44) sgarrity: yeah, you can expect it to know which "Main St."
you want
(13:25:46) ian: for example if they want a road named Springfield but
don't know the suffix
(13:25:59) ian: can? can't?
(13:26:23) sgarrity: can't.
(13:26:34) ian: ideally it would know, you want the local one
(13:26:40) ian: usually. right?
(13:26:50) sgarrity: current map first?
(13:27:10) ian: I'm thinking maybe just sorted by distance
(13:27:22) ian: that would solve a lot of these issues
(13:28:13) ian: it could be sorted by major type (states then cities
then roads) but within those, sorted by distance?
(13:28:55) ian: then "main st." will give you the top X (~100?) roads
named Main St.
(13:29:03) ian: with the closest first
(13:29:04) sgarrity: yeah, that's not bad.
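
The "major type, then distance" ordering can be sketched as a two-part sort key. The dictionary keys (`type`, `pos`), the `TYPE_RANK` table and the use of great-circle distance are all assumptions for illustration; the discussion only fixes the order states, then cities, then roads.

```python
import math

# Assumed ranking: states before cities before roads.
TYPE_RANK = {"state": 0, "city": 1, "road": 2}

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points in degrees (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def sort_results(results, here):
    """Order by major type first, then by distance from `here` within each type."""
    return sorted(
        results,
        key=lambda r: (TYPE_RANK[r["type"]], distance_km(here, r["pos"])),
    )
```
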
(13:29:06) noif: the problem is that sometimes you care about the current
map, and distance is relevant, and other times you're searching for a
new place and size of the object is the only metric
(13:29:19) ian: noif: example?
(13:29:22) sgarrity: yeah, like "boston"
(13:29:46) ian: wouldn't that work well with Major Type then Distance
sorting?
(13:30:13) ian: you'd see all Boston states :) then cities in order of
distance from you, then roads
(13:30:29) sgarrity: yeah, that does work ok
(13:30:29) ian: or perhaps each one could be sorted differently-- 
(13:30:55) noif: i have a map of boston and I want to find kneeland
st... i search "kneeland"
(13:30:55) noif: or I have some random map and i want to find a city
somewhere named "kneeland"
(13:31:04) ian: if we had pop. info for cities that would be an ok way
to sort. but then the poor folks near Boston, Kentucky or something will
always see Boston, MA first
(13:31:18) sgarrity: noif: Ian
(13:31:28) sgarrity: noif: Ian's proposal works ok for that, doesn't it?
(13:31:48) sgarrity: if you show states and cities first
(13:31:56) noif: yes, distance within types?
(13:32:01) sgarrity: yeah
(13:32:42) noif: that should work since there are relatively few of the
major types
(13:32:59) ian: well there are a bunch of cities :)
(13:33:26) sgarrity: yeah, could be a problem looking for "Springfield"
street
(13:33:28) ian: what about limiting the # of results within each major
type 
(13:33:38) sgarrity: yeah, with "show more" options.
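
Capping results per major type with a "show more" count might look like the following sketch. The function name, the `type` key and the default limit of 3 are placeholders, not values from the discussion.

```python
from collections import defaultdict

def limit_per_type(results, limit=3):
    """Keep at most `limit` results of each type, preserving input order.

    Returns (kept, more), where more[type] is the number of results
    hidden behind a hypothetical "show more" link for that type.
    """
    seen = defaultdict(int)
    kept = []
    more = {}
    for r in results:
        seen[r["type"]] += 1
        if seen[r["type"]] <= limit:
            kept.append(r)
        else:
            more[r["type"]] = more.get(r["type"], 0) + 1
    return kept, more
```
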
(13:33:49) ian: or just "be more specific, loser" option :)
(13:34:21) noif: or live filtering of the results... don't re-search,
just filter
(13:34:32) ian: I mean, what would a person looking for "springfield"
want? 
(13:34:57) ian: most likely either a local springfield st. or a local
springfield city, no?
(13:34:59) sgarrity: probably a city, which they could pick from a list
with the state with it.
(13:35:10) noif: that depends on whether the current map means anything
to me
(13:35:13) sgarrity: I wouldn't assume local.
(13:35:41) ian: you don't think it's more likely that they want local
results?
(13:36:07) ian: remember that Boston, MA is more local for you than
Boston, Kentucky
(13:36:09) sgarrity: I might open up Roadster with my "home" location on
PEI, and type "Boston" because I'm going on a trip.
(13:36:30) ian: if people are traveling by anything but airplane, I
think distance matters
(13:36:43) sgarrity: but distance from what?
(13:36:58) ian: from current location
(13:37:49) ian: so worst case is you're browsing california and type in
"Boston" and it says "Boston, Oregon?"
(13:37:49) sgarrity: but then I have to zoom out in order to start a
"new" search, no?
(13:38:11) ian: or you type in Boston, MA
(13:38:26) ian: we're only talking about cases where the user is lazy or
doesn't know where something is
(13:38:35) noif: sgarrity: zoom level is not related to location
(13:38:52) sgarrity: ah, I see.
(13:39:13) ian: yeah zoom level would only affect a "current map" search
(13:39:33) sgarrity: gotcha.
(13:39:34) ian: but I'm convinced of the usefulness of that yet :)
(13:39:55) ian: +not
(13:40:39) ian: now is it important to split up city/state/road results?
what about just weighting them differently?
(13:41:09) ian: so cities tend to come up higher, but not if there is a
Boston St. like 20 feet over (then it will come up second...?)
(13:42:06) sgarrity: yeah, that's interesting...
(13:42:07) noif: weights might be hard to figure out, but mixed results
are fine
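
A toy version of the weighting idea, to make the trade-off concrete. The weights and the formula (type weight divided by one plus distance) are invented for this sketch; as noif says, real weights would be hard to pick and would need tuning.

```python
# Made-up weights: bigger object types start with a head start.
TYPE_WEIGHT = {"state": 100.0, "city": 10.0, "road": 1.0}

def score(result_type, distance_km):
    """Higher is better: type weight decays as distance grows."""
    return TYPE_WEIGHT[result_type] / (1.0 + distance_km)

def rank(results):
    """Mixed results, best score first."""
    return sorted(
        results,
        key=lambda r: score(r["type"], r["distance_km"]),
        reverse=True,
    )
```

With these particular numbers a road a few meters away outranks a city 50 km away, while a city 5 km away still outranks a road 2 km away.
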
(13:42:52) ian: Kneeland St, Boston MA
(13:42:52) ian: 16 Kneeland St, Boston MA
(13:42:52) ian: 1600 Pennsylvania Ave
(13:42:52) ian: Cambridge, MA
(13:42:52) ian: Boston
(13:43:01) ian: =========
(13:43:01) ian: that's what I have down so far
(13:44:21) sgarrity: I presume we're not getting into directions for
now?
(13:44:46) ian: as far as searching for roads/addresses, are there more
formats? (what about zip codes?)
(13:44:51) ian: yeah no directions :)
(13:45:08) sgarrity: well, zip code alone: 90210
(13:45:18) sgarrity: and in any combination with the others you posted.
(13:46:02) ian: what would 90210 show? just any old road in 90210 or
would it try to set zoomlevel etc. to see the whole thing?
(13:46:22) sgarrity: zoom to whole thing, ideally
(13:46:52) ian: that is hard right now since we don't have area polygons
for zips (or cities, states)
(13:47:03) ian: but we could select a random road in it and set the
zoomlevel
(13:47:06) ian: it would be fairly good
(13:47:20) sgarrity: what do you have with ZIPs? a point?
(13:47:58) ian: nothing actually
(13:48:11) noif: could find the median of all the road points with that
zip
(13:48:17) ian: TIGER might contain something, but it's not extracted
yet
(13:48:54) ian: yeah we could do that
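
noif's median suggestion, sketched under the assumption that each road point for a ZIP is available as a (lat, lon) pair. The bounding box is an extra added here as one possible input for choosing a zoom level; both function names are hypothetical.

```python
from statistics import median

def zip_center(road_points):
    """Per-coordinate median of a non-empty list of (lat, lon) points."""
    lats = [p[0] for p in road_points]
    lons = [p[1] for p in road_points]
    return median(lats), median(lons)

def zip_bounds(road_points):
    """Bounding box ((min_lat, min_lon), (max_lat, max_lon)) of the points."""
    lats = [p[0] for p in road_points]
    lons = [p[1] for p in road_points]
    return (min(lats), min(lons)), (max(lats), max(lons))
```

The median is less sensitive than the mean to a few stray road points far from the rest of the ZIP.
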
(13:50:15) ian: so if a user supplies an incorrect ZIP, no results?
(13:50:40) noif: with other search terms?
(13:51:20) ian: yeah like "150 Main St. 02139" when it's in 02140
(13:52:41) ian: one idea is to handle the 0-results case by dropping
some of the filters
(13:54:08) noif: yeah, and drop zip first
(13:54:44) ian: and then what?
(13:54:55) sgarrity: and probably notify the user of dropped criteria
(13:55:08) ian: maybe we should determine what to drop by testing
(13:55:12) ian: sgarrity: yeah
(13:55:31) ian: like a little message above/below results?
(13:55:48) sgarrity: yeah
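
The zero-results fallback could be sketched like this. `run_search` stands in for whatever actually queries the database, and the criteria are assumed to be a plain dict. Only "drop zip first" comes from the discussion; the rest of `drop_order` is a placeholder until testing shows what works.

```python
def search_with_fallback(query, run_search, drop_order=("zip", "suffix")):
    """Retry a search with fewer criteria until something matches.

    query: dict of criteria, e.g. {"street": "main", "zip": "02139"}
    run_search: callable taking such a dict and returning a list of results
    Returns (results, dropped) so the UI can tell the user what was ignored.
    """
    criteria = dict(query)  # work on a copy; the caller's query is untouched
    dropped = []
    results = run_search(criteria)
    for field in drop_order:
        if results:
            break
        if field in criteria:
            del criteria[field]
            dropped.append(field)
            results = run_search(criteria)
    return results, dropped
```

The `dropped` list is what would feed the small message above or below the results.
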
(13:57:40) ian: Kneeland St, Boston MA
(13:57:40) ian: 16 Kneeland St, Boston MA
(13:57:40) ian: 1600 Pennsylvania Ave
(13:57:40) ian: Cambridge, MA
(13:57:40) ian: Boston
(13:57:40) ian: 90210
(13:57:47) ian: what else do users type in?
(13:58:08) noif: just a state
(13:58:20) ian: ok
(13:59:07) noif: 16 Kneeland, Boston MA
(13:59:27) ian: ok
(13:59:33) sgarrity: also, what if I want "Kneeland Street", but I type
in "Kneeland Road"
(13:59:42) sgarrity: or avenue, blvd, etc.
(14:00:00) ian: that could be one thing to auto-drop. but keep in mind
it works well if you leave the suffix off
(14:02:18) ian: so "Cambridge, MA" will match the city first then
"Cambridge St." etc. in MA, right?
(14:02:37) sgarrity: yeah
(14:04:54) ian: so essentially all this is doing is stripping off house
number if present, zip if present, state if present, city if present,
and using the rest as street name
(14:05:20) ian: then perhaps dealing with ambiguities by doing another
search
(14:06:53) ian: does that seem about right?
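
A minimal sketch of the stripping approach ian summarizes. `STATES` and `CITIES` are tiny made-up lookup sets standing in for real database tables, and the function name and result keys are assumptions. Recognizing the city is the weak step: this version only handles one-word city names that appear in the table, and it relies on the fixed word order number, street, city, state, zip.

```python
import re

STATES = {"ma", "vt", "ky", "ca"}                # hypothetical subset
CITIES = {"boston", "cambridge", "springfield"}  # stand-in for a city table

def parse_query(query):
    """Peel off house number, ZIP, state and city; the rest is the street name."""
    # Commas are ignored; trailing periods ("St.") are stripped.
    words = [w.strip(".") for w in query.replace(",", " ").lower().split()]
    parsed = {"number": None, "zip": None, "state": None, "city": None, "street": None}
    if words and re.fullmatch(r"\d{5}", words[-1]):
        parsed["zip"] = words.pop()
    if len(words) > 1 and words[0].isdigit():
        parsed["number"] = words.pop(0)
    if words and words[-1] in STATES:
        parsed["state"] = words.pop()
    if words and words[-1] in CITIES:
        parsed["city"] = words.pop()
    if words:
        parsed["street"] = " ".join(words)
    return parsed
```

Run against the formats collected so far, "16 Kneeland St, Boston MA" yields number 16, street "kneeland st", city "boston", state "ma"; "90210" yields only a zip; "Cambridge, MA" yields a city and state with no street.
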
(14:07:04) noif: yeah, but how do you determine what's a city and what's
a street?
(14:07:21) noif: the street type could be an important clue if present
(14:07:38) sgarrity: I suppose there could be a Springfield Street in
Boston, and a Boston Street in Springfield.
(14:07:56) noif: and there probably is
(14:08:21) ian: well, can a user type "springfield boston"? 
(14:08:38) noif: that's springfield st in boston
(14:08:41) ian: right
(14:08:51) noif: or a POI named springfield in boston
(14:09:01) ian: and that's how it would be interpreted I think.
(14:09:03) noif: or a POI named "springfield boston"
(14:09:09) ian: exactly
(14:09:21) ian: so POI search will have to be a bit more flexible
(14:09:52) ian: it will probably always do several searches and combine
them (or fancy SQL)
(14:10:16) noif: a lot of businesses have city/state names as part of
their name
(14:14:32) ian: the only thing we're requiring of users is that they
write things in a certain order
(14:15:32) noif: that's reasonable
(14:17:00) noif: and should make it fairly internationalizable... only
the ordering might change and no reliance on punctuation
(14:21:32) ian: I have no idea about internationalization 
(14:21:38) ian: in the database for example
(14:21:48) ian: it has a road->city->state->country hierarchy
(14:22:21) ian: would that model fit everywhere? I don't know (perhaps
using different names for the objects)
(14:25:42) noif: we might be able to generalize the city, state, country
tables into something generic like "region" and be able to support
different hierarchies. Probably not worth thinking about right now
though...
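
For what noif's generic "region" idea might look like as a data structure, here is a small sketch. The `Region` class, its fields and the `kind` labels are assumptions for illustration, not the Roadster schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Region:
    """One node in an arbitrary-depth place hierarchy."""
    name: str
    kind: str                          # e.g. "country", "state", "province", "city"
    parent: Optional["Region"] = None  # None for the top of the hierarchy

    def path(self):
        """Names from this region up to the root."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names
```

A US city and a Canadian one then use the same table with different `kind` values at the middle level, and a country with more or fewer levels simply has a longer or shorter parent chain.
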


