We have blogged previously about two interesting challenges that present themselves when you get into the nitty-gritty of building a multi-modal search system:
1) Detecting landmasses. Accurate routing needs to be aware of landmasses and islands, and of what is connected to what by land, road, ferry and air. This is critical for finding all possible routes between two points, and for working out which combination of ferries and flights makes that connection possible.
2) Political borders. Certain borders cannot be easily crossed; suggesting a driving route from South Korea to North Korea is both unrealistic and an embarrassing user experience. Similar complexities exist in places such as Israel, Pakistan, and Afghanistan.
Political data is also important for displaying accurate place names in our geocoder system. For example, North Elizabeth Station is located in the state of New Jersey in the USA, hence its fully qualified name: North Elizabeth Station, NJ, USA.
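As a rough illustration of the first challenge, landmass connectivity can be modelled as a graph: each landmass is a node and each ferry or flight link is an edge. The sketch below is not our production code. The function name `find_connection` and the sample links are hypothetical, and a breadth-first search is used only because it is the simplest way to find a chain with the fewest hops.

```python
from collections import deque


def find_connection(links, start, goal):
    """Return the shortest chain of (from, mode, to) hops, or None.

    `links` is a list of (landmass_a, landmass_b, mode) tuples and is
    treated as bidirectional. All names here are illustrative.
    """
    # Build an adjacency list from the link tuples.
    graph = {}
    for a, b, mode in links:
        graph.setdefault(a, []).append((b, mode))
        graph.setdefault(b, []).append((a, mode))

    # Breadth-first search, remembering how each landmass was reached.
    came_from = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            break
        for neighbour, mode in graph.get(current, []):
            if neighbour not in came_from:
                came_from[neighbour] = (current, mode)
                queue.append(neighbour)

    if goal not in came_from:
        return None  # no combination of links connects the two landmasses

    # Walk back from the goal to rebuild the chain of hops.
    hops = []
    node = goal
    while came_from[node] is not None:
        previous, mode = came_from[node]
        hops.append((previous, mode, node))
        node = previous
    hops.reverse()
    return hops


# Made-up links between made-up landmasses.
sample_links = [
    ("mainland", "island_a", "ferry"),
    ("island_a", "island_b", "ferry"),
    ("mainland", "island_c", "flight"),
]
print(find_connection(sample_links, "mainland", "island_b"))
```

Political borders could be handled in the same model by leaving out, or penalising, edges that cross a border which cannot easily be crossed.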
This week we launched significant improvements to the accuracy of our internal system for detecting both landmasses and political regions. Our original implementation used data from the Natural Earth dataset, but that data had insufficient resolution and some landmasses were missing. We have now transitioned to the more comprehensive OpenStreetMap (OSM) planet data.
The difference is illustrated in the maps below of the Puget Sound area near Seattle, with each landmass represented by a unique color. The original data (left) has much smoother, lower-resolution coastlines that lose much of the detail in the landmass shapes, and some of the smaller islands are missing entirely. The new OSM data (right) is more detailed and includes the smaller islands.
Steilacoom to Anderson Island is an example of a query that has improved. The original, low-resolution data caused the routing system to suggest a ferry to nearby Ketron Island instead of Anderson Island (left). The new data fixes this problem (right).
On the political front, Cairo to Amman was a problematic query where Rome2rio suggested driving through Israel (left). The more popular Taba-Aqaba ferry route is now displayed (right) as well as various bus and ferry combinations.
Whilst developing the new technology, Miles on our team also tackled the engineering challenges involved with implementing a system for very fast lookups of this data. Each lookup provides the landmass and political information for a latitude / longitude co-ordinate, and a search on the Rome2rio site requires thousands of such lookups.
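To give a feel for what a single lookup involves, here is a minimal sketch that maps a coordinate to a named region. It is an assumption-laden illustration, not a description of Miles's implementation: the `Region` class, the `lookup` function and the square test regions are all hypothetical, and the approach shown (a bounding-box pre-check followed by a ray-casting point-in-polygon test) is simply a common baseline. A system handling thousands of lookups per search would most likely add a spatial index on top.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A named region with one outer ring of (lon, lat) points (hypothetical)."""
    name: str
    ring: list

    def __post_init__(self):
        # Precompute the bounding box once so most lookups can skip the ring test.
        lons = [p[0] for p in self.ring]
        lats = [p[1] for p in self.ring]
        self.bbox = (min(lons), min(lats), max(lons), max(lats))


def point_in_ring(lon, lat, ring):
    """Ray casting: count how many edges a ray heading east from the point crosses."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def lookup(lon, lat, regions):
    """Return the name of the first region containing the point, or None."""
    for region in regions:
        min_lon, min_lat, max_lon, max_lat = region.bbox
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            if point_in_ring(lon, lat, region.ring):
                return region.name
    return None


# Two made-up square "landmasses" with water between them.
regions = [
    Region("island_a", [(0, 0), (1, 0), (1, 1), (0, 1)]),
    Region("island_b", [(2, 0), (3, 0), (3, 1), (2, 1)]),
]
print(lookup(0.5, 0.5, regions))
print(lookup(1.5, 0.5, regions))
```

The same lookup shape works for political regions: only the polygons and the returned labels change.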
Miles learnt a few interesting facts in the process:
- The OSM data contains 497,040 separate landmasses (that is, land with a closed coastline).
- 89% of those landmasses have a coastline perimeter of less than 2 kilometers.
- The longest bridge between two landmasses is the Donghai Bridge near Shanghai.
- The region with the greatest density of separate landmasses is around Horsey Island in the UK.
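A statistic like the coastline perimeter figure above can be computed by summing great-circle distances around each closed coastline. The sketch below assumes each coastline is available as a list of (lat, lon) points and uses the haversine formula with a mean Earth radius of 6371 km. The function names and the sample rings are hypothetical, and this is not necessarily how the numbers above were produced.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def perimeter_km(ring):
    """Length of a closed coastline given as a list of (lat, lon) points."""
    total = 0.0
    # Pair each point with the next one, wrapping around to close the ring.
    for (lat1, lon1), (lat2, lon2) in zip(ring, ring[1:] + ring[:1]):
        total += haversine_km(lat1, lon1, lat2, lon2)
    return total


def share_below(rings, limit_km=2.0):
    """Fraction of coastlines whose perimeter is under limit_km."""
    small = sum(1 for ring in rings if perimeter_km(ring) < limit_km)
    return small / len(rings)


# Made-up rings: one roughly 1 degree square, one tiny islet.
big = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
tiny = [(0.0, 0.0), (0.0, 0.001), (0.001, 0.001), (0.001, 0.0)]
print(round(perimeter_km(big), 1), share_below([big, tiny]))
```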
We will continue geeking out on geospatial data as we keep refining Rome2rio’s search results.