Walking around the city wisely: how I built a service for planning interesting walking routes

UPD: since the post was well received and showed that there is demand for such a service, I will keep developing it. I have started a public VKontakte page to collect feedback and publish information about updates: https://vk.com/sightsafari

An unfamiliar part of the city, a little free time, and the need (or desire) to get to the subway, hotel, or station on foot: probably everyone has been in this situation at least once. On the one hand, you want to see some beautiful and interesting places; on the other hand, limited time does not let you stray too far from the direct route.

The situation is even more complicated if there are no major attractions nearby that everyone knows about and that could be added to your itinerary after a short search on the Internet. What do you do if you are stuck somewhere like Kupchino, about which you have only heard that it is better not to get stuck there? You have to follow a navigator and hope that something interesting turns up along the way. However, popular navigators take into account only distance and travel time, not how interesting the route is. I have also come across projects that try to account for the comfort of a walking route (steering it away from noisy highways), but I want not only to walk comfortably but also to see something beautiful.

After a little thought, I decided to take on this task myself. As always, the general idea of the algorithm is quite simple, but the devil is in the details. In navigation, small details can matter a great deal and even pose a risk to health: hardly any tourist will be happy when a navigator hunting for attractions leads them into the wilds of a semi-abandoned industrial zone for the sake of a small memorial plaque (I confess, this happened once).

A description of the algorithm and examples of its work are below the cut; the link is at the end.

Main idea

My original idea was this: download an OpenStreetMap map, parse it, extract information about all objects that are potentially interesting to pedestrians (the list itself still had to be decided), and draw buffer zones around them. Then find paths with some standard framework, slightly hacking the construction of the navigation graph so that edge weights are lower inside these zones, which pulls pedestrian routes toward them.
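The idea of pulling routes toward buffer zones can be illustrated with a minimal sketch. This is not the service's code (which is built on GraphHopper): the toy graph, the `ZONE_FACTOR` value, and the function names are all assumptions made for illustration. Edges inside a zone are simply made cheaper before an ordinary Dijkstra search.

```python
import heapq

# Hypothetical toy graph: node -> list of (neighbor, length_m, in_zone).
# The edge B-D lies inside the buffer zone of some attraction.
GRAPH = {
    "A": [("B", 80.0, False), ("C", 70.0, False)],
    "B": [("A", 80.0, False), ("D", 100.0, True)],
    "C": [("A", 70.0, False), ("D", 70.0, False)],
    "D": [("B", 100.0, True), ("C", 70.0, False)],
}

ZONE_FACTOR = 0.5  # ASSUMPTION: an edge inside a zone costs half its length


def edge_weight(length_m, in_zone, zone_factor):
    """Weight used by the search: plain length, discounted inside a zone."""
    return length_m * zone_factor if in_zone else length_m


def shortest_path(graph, start, goal, zone_factor=ZONE_FACTOR):
    """Plain Dijkstra over the discounted weights; returns a node list or None."""
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length_m, in_zone in graph[node]:
            new_cost = cost + edge_weight(length_m, in_zone, zone_factor)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

With `zone_factor=1.0` the search returns the geometrically shortest path through C; with the discount it prefers the slightly longer path through B, which passes the attraction.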

No sooner said than done. For pathfinding I used the GraphHopper library, which can read OSM maps out of the box, builds routes for different modes of transport (car, pedestrian, bicycle), offers several pathfinding algorithms (plain search, alternative routes, various accelerated and optimized variants), and can preprocess the navigation graph to speed up the search (a basic search within a city runs very quickly, within a few milliseconds). As the test case I chose my native St. Petersburg, where I could judge the quality and interestingness of the generated routes myself.

As a result, the basic version of the algorithm was thrown together in a couple of evenings. Then began a fascinating journey over rakes and through the small details in which, as we know, the devil hides. That is what the rest of this post is about.

Tourist facilities and OSM issues

In OpenStreetMap, each object is a geometry (Node, Way, or Relation) plus a set of string key-value pairs.

Here is the Winter Palace in OSM:

The problem is that, since OSM is an open map edited by its participants, standardization is limping on both legs. The same type of object can be described with different sets of tags and different combinations of objects. Some tags are considered "canonical" and are described on the wiki, but many variants remain in use, both simple alternatives and frankly erroneous ones. As a result, any code that works with OSM (especially navigators and renderers) has to take all this into account and carry a lot of logic for such special cases.

For example, the highway=unclassified tag does not mean "a road of some unknown type," as many mappers think, but a very specific type of road in the European classification; because of its name, though, it gets slapped on anything. Moreover, the navigator assumes that this type of road has a sidewalk, so it builds pedestrian routes along it, even though pedestrians do not actually walk on such a road (it is the carriageway of the street). Another example: people here sometimes use the addr:housename tag for the name of a building; for some reason, the western wing of the General Staff Building on Palace Square is named with this tag. Yet OSM's own guidelines say that it should be used only in countries where names are used instead of house numbers (Japan seems to do this), and that official building names belong in the name tag and its relatives.

Another thing that bothers me is the tagging of green areas. There are two different tags for this, leisure=park and landuse=grass. On the map they look about the same: just a green zone, slightly different in color. As a result, they get mixed up arbitrarily, which is why a dividing lawn between the carriageways of a street often becomes a "park" and starts attracting pedestrian routes.

All these nuances had to be discovered while building and analyzing routes.
The following list was finally chosen as the set of objects of interest to pedestrians:

Tourist attractions, tagged tourism.
Green areas: leisure=park, garden. After some reflection, landuse=cemetery (cemeteries) was added. On the one hand, it is a so-so sight; on the other hand, on Vasilyevsky Island in St. Petersburg, for example, the only large green area is a cemetery, which locals use instead of a park, since there are no real parks there.
Water: rivers, lakes, ponds. Here there is a jumble of water and waterway tags and heaps of duplicate values. It is so pleasant to walk along an embankment on a hot day. At least I thought so until I tried to process Smolensk: it suddenly turned out that in the provinces a river bank is not a beautiful embankment like ours in St. Petersburg but an overgrown, littered wasteland that pedestrians would rather stay away from. These cases cannot yet be told apart from map data alone.
Historical buildings and structures: anything tagged historic. They are usually simply beautiful.
Other small urban objects marked with the amenity tag. It has many values; I chose only a few, for example street clocks (clock), which are often beautiful, religious buildings (place_of_worship), all kinds of street art (graffiti), and some others.
Pedestrian streets and squares: highway=pedestrian.

In the course of this work I realized that, besides positive zones that attract pedestrians, we also need negative zones that repel them. So far this list includes:

Construction sites: landuse=construction. Pedestrians do not enjoy walking under scaffolding in the dust from a building site.
Industrial zones and garages: landuse=industrial, garages. This is exactly where that incident happened, with a pedestrian being led into the wilds of the Lenpoligraphmash industrial zone (at the Institute of Design and Urban Studies at ITMO we tested the service on students, who walked the generated routes and then wrote reviews as part of a study of pedestrian comfort in the Petrogradsky district). It turned out that the block as a whole was not marked with this tag (as is usually done for large industrial zones); instead, each building was tagged separately.
Ideally, I would also like to steer pedestrians away from wide urban highways, which are dusty, noisy, full of cars, and usually have nothing to look at. But so far I have not managed to detect them unambiguously. OSM essentially offers only the number of car lanes, and that criterion is not enough (many important tourist streets, such as Nevsky Prospect, are also multi-lane).
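The two lists above boil down to a classification of OSM tag sets. Here is a rough sketch of such a classifier, not the service's actual rules: the function name, the precedence of repelling over attracting tags, and treating any `water` or `waterway` key as "water" are assumptions, and the amenity set contains only the three values named in the text.

```python
# Tag values taken from the lists in the text.
REPELLING_LANDUSE = {"construction", "industrial", "garages"}
ATTRACTING_LEISURE = {"park", "garden"}
ATTRACTING_AMENITY = {"clock", "place_of_worship", "graffiti"}


def classify(tags):
    """Return "repel", "attract", or None for a dict of OSM tags."""
    # ASSUMPTION: a repelling land use wins over any attracting tag.
    if tags.get("landuse") in REPELLING_LANDUSE:
        return "repel"
    if "tourism" in tags or "historic" in tags:
        return "attract"
    if tags.get("leisure") in ATTRACTING_LEISURE:
        return "attract"
    if tags.get("landuse") == "cemetery":
        return "attract"
    # ASSUMPTION: any object carrying a water or waterway key counts as water.
    if "water" in tags or "waterway" in tags:
        return "attract"
    if tags.get("amenity") in ATTRACTING_AMENITY:
        return "attract"
    if tags.get("highway") == "pedestrian":
        return "attract"
    return None
```

Note that `landuse=grass` is deliberately left unclassified here, so a dividing lawn tagged that way would not attract routes; a lawn mistagged as `leisure=park` still would.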

That same Lenpoligraphmash, which hides a monument to the printing press somewhere in its wilds, and where my algorithm dragged the poor student

The importance of sights

Obviously, not all sights are equal. There are large, world-famous objects, such as the Eiffel Tower or St. Isaac's Cathedral in St. Petersburg, which attract huge numbers of tourists and for which people are willing to make a decent detour. And there are small local decorations, such as a piece of street art or a small sculpture in a courtyard, which people are happy to look at only if it is on the way and will not trek to from afar. To build interesting and convenient routes, I had to learn to separate these categories of sights, while all we have in OSM is some geometry and a set of tags. I had to come up with a set of rules of thumb for assigning the "importance" of a landmark, which then determines the weight changes in the graph.

Initially, importance is zero; it increases when the following conditions are met:

+3 for the historic tag: only important historic buildings have it, and not even all of them
+3 for the presence of wikipedia or wikidata tags. Usually only important objects have their own wiki pages.
+1 for a link or url tag. Again, not every object has its own site, but this tag often points to a page in some directory, so small objects have it too
+1 for each name tag. A name can be given in many ways: there may be an old_name for historical names, or names translated into other languages. Having many names indicates that the object is important enough (since someone bothered to enter them all)
building:architecture (architectural style): usually set on all sorts of beautiful architectural monuments

This list was determined empirically and at least makes it possible to tell the Winter Palace from nameless graffiti on the outskirts. As a result, an importance of 0 means some small local nameless object (a patch of greenery, graffiti); about 3-4 is already something interesting (a church, a square where you can sit and relax); and closer to 10 begin city-level sights, such as that same Winter Palace.
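The scoring rules above can be written down as a small function. This is a sketch under stated assumptions, not the service's code: the text gives no number for `building:architecture`, so `ARCHITECTURE_BONUS` is a guess, and which keys count as a "name tag" (`name`, `name:<lang>`, and `*_name` variants such as `old_name`) is also my reading of the text. The palace tags in the usage note are made up for illustration.

```python
ARCHITECTURE_BONUS = 1  # ASSUMPTION: the text gives no value for building:architecture


def is_name_key(key):
    # ASSUMPTION: "name tag" covers name, name:<lang>, and *_name keys (old_name, ...).
    return key == "name" or key.startswith("name:") or key.endswith("_name")


def importance(tags):
    """Heuristic importance of an OSM object, computed from its tag dict."""
    score = 0
    if "historic" in tags:
        score += 3
    if "wikipedia" in tags or "wikidata" in tags:
        score += 3  # counted once, even if both tags are present
    if "link" in tags or "url" in tags:
        score += 1
    score += sum(1 for key in tags if is_name_key(key))
    if "building:architecture" in tags:
        score += ARCHITECTURE_BONUS
    return score


# Hypothetical tag set for a city-level landmark (values are illustrative only).
PALACE_TAGS = {
    "historic": "yes",
    "wikipedia": "ru:Зимний дворец",
    "name": "Зимний дворец",
    "name:en": "Winter Palace",
    "building:architecture": "baroque",
}
```

Under these assumptions `importance(PALACE_TAGS)` gives 9, while an object that carries nothing but a `name` scores 1 and untagged graffiti scores 0.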

The list is not perfect and relies heavily on OSM data, which is often incomplete. For example, the Narva Gate initially had an importance of only one, since nothing but the name was recorded for it. I had to go into OSM myself and add the names, style, years of construction, height (for correct visibility calculation, discussed below), and so on. This has a public benefit too: to improve route quality, I occasionally go into OSM and add missing tags, which other navigators and programs can then use.

Areas of influence

Sights come in different sizes. A small sculpture has to be viewed from no more than 5-7 meters away. The Bronze Horseman is clearly visible from 20-30 meters. St. Isaac's Cathedral, one of the tallest buildings in the city center, is decently visible from 200-300 meters (by this I mean that a tourist does not have to come close and can comfortably enjoy the view from that distance; it can also be seen from the other bank of the Neva, but without detail). How do we determine the distance at which a landmark should influence pedestrian routes?

The Bronze Horseman and the dome of St. Isaac's Cathedral in the distance

First, I empirically defined visibility radii. They are derived from all available information about a sight, which is reduced to one of four radii: small (30 meters), medium (100), large (250), and huge (350).

Rivers and parks stand a little apart. For them I set 30 meters, which roughly corresponds to the width of an embankment or of the street around a park, since it is rather pointless to look at a park from afar: you have to walk right next to it.

The visibility class is determined by these rules:

Point objects (i.e. an OSM Node) get Small visibility; they are usually small monuments and street art.
But a point object tagged historic is Medium, because these are often large monuments on high pedestals, such as that same Bronze Horseman
Areas (way or relation) smaller than 20 × 20 meters are Medium
Larger ones are Large
If an object has a height tag (height in meters) or building:levels (number of floors) and its height exceeds 50 meters, it is considered Huge. This was done specifically for St. Isaac's and other large cathedrals and buildings visible from afar
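These rules translate almost directly into code. The sketch below is illustrative rather than the service's implementation: converting `building:levels` to meters with `METERS_PER_LEVEL`, reading "20 × 20 meters" as an area of 400 m², and the function names are my assumptions. The separate 30-meter rule for rivers and parks is not handled here.

```python
SMALL, MEDIUM, LARGE, HUGE = 30, 100, 250, 350  # radii in meters, from the text

HUGE_HEIGHT_M = 50.0      # from the text: taller than 50 m means Huge
SMALL_AREA_M2 = 20 * 20   # ASSUMPTION: "20 x 20 meters" read as 400 square meters
METERS_PER_LEVEL = 3.0    # ASSUMPTION: rough height of one floor


def height_m(tags):
    """Height in meters from the height or building:levels tag, or None."""
    try:
        if "height" in tags:
            return float(tags["height"])
        if "building:levels" in tags:
            return float(tags["building:levels"]) * METERS_PER_LEVEL
    except ValueError:
        pass  # unparseable values such as "tall" are ignored
    return None


def visibility_radius(osm_type, tags, area_m2=0.0):
    """Visibility radius in meters; osm_type is "node", "way", or "relation"."""
    height = height_m(tags)
    if height is not None and height > HUGE_HEIGHT_M:
        return HUGE
    if osm_type == "node":
        return MEDIUM if "historic" in tags else SMALL
    return MEDIUM if area_m2 < SMALL_AREA_M2 else LARGE
```

For example, a plain node gets 30 meters, a node tagged `historic` gets 100, and a large building with `height=101.5` gets 350.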

But a problem arose: in the dense development of St. Petersburg's historical center, the naive approach with radii did not work, because the real visibility area of a church standing deep inside a courtyard is much smaller; in fact, it is visible only from the stretch of street directly opposite. I had to start building honest (well, almost) visibility polygons.

The Church of St. Catherine stands deep inside a courtyard, surrounded by houses on all sides:

First I had to determine the obstacles. That part is simple: I read all polygons with the building tag from the OSM data. These are the polygons that block visibility. Then I wrote a simple, naive algorithm that constructs the visibility polygon of a point by ray casting. I do not need high accuracy; a dozen rays per point were enough. At first, without much thought, I took the centroid of the sight's geometry, but this did not give the best results for elongated (long and narrow) buildings. So later, for large sights, I started taking three points: the centroid and two points on the outer boundary that are farthest from it and from each other. Why did I not compute visibility properly? Because while the algorithm for the visibility region of a point is trivial (cast rays from the point in all directions, find where they hit the nearest obstacles, connect those points), computing honest visibility for an edge (and hence for a polygon) is much harder (the first solution that comes to mind, computing visibility from the two ends of the edge and taking the union, is obviously wrong).
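The point-visibility step described above (cast rays, find the nearest hit, connect the points) fits in a few lines. This is a minimal sketch in planar coordinates, not the service's code; the function names and the representation of obstacles as lists of `(x, y)` vertices are assumptions. Each ray stops at the nearest building edge or at the maximum radius.

```python
import math


def ray_segment_hit(origin, direction, a, b):
    """Distance along a unit-length ray to segment a-b, or None if it misses."""
    ox, oy = origin
    dx, dy = direction
    ex, ey = b[0] - a[0], b[1] - a[1]
    denom = dx * ey - dy * ex
    if abs(denom) < 1e-12:  # ray is (almost) parallel to the segment
        return None
    t = ((a[0] - ox) * ey - (a[1] - oy) * ex) / denom  # distance along the ray
    u = ((a[0] - ox) * dy - (a[1] - oy) * dx) / denom  # position along the segment
    if t >= 0 and 0 <= u <= 1:
        return t
    return None


def visibility_polygon(origin, obstacles, max_radius, n_rays=12):
    """Approximate visibility polygon of a point as a list of (x, y) vertices.

    obstacles is a list of polygons, each a list of (x, y) vertices.
    """
    points = []
    for i in range(n_rays):
        angle = 2 * math.pi * i / n_rays
        direction = (math.cos(angle), math.sin(angle))
        reach = max_radius
        for polygon in obstacles:
            # Pair every vertex with the next one, closing the ring.
            for a, b in zip(polygon, polygon[1:] + polygon[:1]):
                hit = ray_segment_hit(origin, direction, a, b)
                if hit is not None and hit < reach:
                    reach = hit
        points.append((origin[0] + reach * direction[0],
                       origin[1] + reach * direction[1]))
    return points
```

For a large sight, one way to follow the three-point idea from the text would be to call this once per sample point and combine the resulting polygons with a geometry library.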

The result was a good approximation. It is not perfect, but for pedestrian navigation this accuracy is enough. The only problem is that it ignores building heights, so any small shed will block the view of a five-story bell tower. But nothing can be done about that: OSM data does not always contain the number of floors, and building visibility volumes in 3D is much harder. I may come back to this, though.

Constructed polygons of visibility for this and neighboring churches

The beauty of the route and how to improve it

So, we have learned to compute the importance and visibility of sights and seem to be building good routes. At least that was the case while I was testing in the central districts of St. Petersburg, which have a very high density of beauty per square kilometer.

However, as soon as I moved away from the center, the algorithm suddenly began to admit its helplessness, and the routes it built began to coincide with the shortest ones. Sights in these areas are few and far apart, so when searching for a path with the beauty + distance metric, the contribution of the first term was near zero, and the algorithm simply built the shortest routes.

Of course, one could always say "it's not us, it's these cities that are boring," but that would not be quite right. So I started thinking about how to evaluate a constructed route and how to improve it. The simplest solution that comes to mind: extend the route by forcing in a detour to some sight left off to the side.

Now, if a) the total importance of all sights on the route per kilometer is below a certain value, or b) the user has chosen to build the most interesting route, my algorithm tries to improve it. To do this, it builds the initial route and takes a buffer around it (its width depends on the route length: the longer the route, the longer the detour allowed). In this buffer it looks for a few new sights (not already on the route) with a score > 2 (we do not want to make kilometer-long detours to nameless public squares), and lays new routes from the starting point to this intermediate target and from there to the destination. The length is also controlled: the resulting route must be no more than twice as long as the shortest path between the start and end points.
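The decision logic of this improvement step can be sketched as follows. It is an illustration, not the service's code: the per-kilometer threshold is not given in the text and is left as a parameter, the route lengths through each candidate are assumed to have been computed by the router already, and the tie-breaking rule (most important sight first, then the shorter route) is my assumption.

```python
MIN_IMPORTANCE = 2      # from the text: only sights with a score > 2
MAX_LENGTH_RATIO = 2.0  # from the text: at most twice the shortest path


def needs_improvement(total_importance, length_km, threshold_per_km):
    """True if the route carries too little importance per kilometer.

    threshold_per_km is a hypothetical parameter; the text gives no value.
    """
    return total_importance / length_km < threshold_per_km


def pick_detour(shortest_m, candidates):
    """Choose an intermediate sight, or None if no candidate qualifies.

    candidates: list of (name, importance, length_via_m), where length_via_m is
    the length of the route start -> sight -> end.
    """
    allowed = [
        c for c in candidates
        if c[1] > MIN_IMPORTANCE and c[2] <= MAX_LENGTH_RATIO * shortest_m
    ]
    if not allowed:
        return None
    # ASSUMPTION: prefer the most important sight, then the shorter route.
    return max(allowed, key=lambda c: (c[1], -c[2]))


# Hypothetical candidates found in the buffer around a 1000 m route.
CANDIDATES = [
    ("nameless square", 1, 1200.0),
    ("pillbox", 4, 1800.0),
    ("distant museum", 8, 2600.0),
]
```

Here `pick_detour(1000.0, CANDIDATES)` selects the pillbox: the square is not important enough, and the museum would make the route more than twice as long as the shortest one.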

The first version of the algorithm (left) could not find anything interesting and built the shortest route. But the version that adds intermediate landmarks (right) included the KV-19 pillbox, which is in the lower right corner of the route (you cannot see it at this zoom level, but the service shows it in the list and lets you find it on the map by clicking its name).

That same pillbox. In general, Kupchino has plenty of similar structures related to the defense of Leningrad, since the city's defensive lines ran there:

Of course, not every pedestrian will agree to a detour for the sake of what they consider minor sights. That is why the service shows the route length compared with the shortest one and the list of attractions along it, so users can decide for themselves whether such a route interests them. There is also a slider that lets you decrease (or, conversely, increase) the maximum allowed detour.

In practice I ran into a few more problems and oddities. First, with a naive implementation, the buffer around the route "stuck out" beyond its start and end. This often produced routes that went a couple of hundred meters past the end point or, conversely, set off from the start in the direction opposite to the destination and only then turned back the right way. Although such routes let you see more attractions, pedestrians do not like being led far away from their goal. I first had to "gouge out" the area around the end point in the navigation graph (making the edges impassable when searching for the intermediate route), and then build the buffer not around the entire route line but only from a certain distance after the start to a certain distance before the end.

The second problem was routes that come back along the same street. I don't know about you, but I can't stand returning by the same route I took on the way out; I always prefer to go a different way, and I try to get the same behavior from my algorithm. Admittedly, this is still a problem. As a first attempt, I made all edges (except the last) used on the path from the start to the intermediate point be cut out of the graph when searching for the path from the intermediate point to the end. This avoids returning exactly the same way, but it does not protect against an almost identical way, for example returning on the other side of the same street. Cutting out all edges in the neighborhood often makes it impossible to find a way back. So there is still something to think about here. Even so, my algorithm already does better in this respect than some competitors, which simply ignore the issue and happily build dead-end branches.
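The first attempt described above, cutting the already used edges out of the graph for the second leg, might look like this. The sketch uses a toy undirected graph and plain Dijkstra instead of GraphHopper; all names and the graph itself are assumptions for illustration.

```python
import heapq


def undirected(edges):
    """Build node -> {neighbor: length} from (u, v, length) triples."""
    graph = {}
    for u, v, length in edges:
        graph.setdefault(u, {})[v] = length
        graph.setdefault(v, {})[u] = length
    return graph


def dijkstra(graph, start, goal, blocked=frozenset()):
    """Shortest node path that avoids blocked edges (frozenset pairs), or None."""
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            path = [goal]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if cost > best[node]:
            continue  # stale queue entry
        for neighbor, length in graph[node].items():
            if frozenset((node, neighbor)) in blocked:
                continue
            new_cost = cost + length
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    return None


def route_via(graph, start, via, goal):
    """Route start -> via -> goal whose second leg avoids edges of the first."""
    first = dijkstra(graph, start, via)
    if first is None:
        return None
    used = {frozenset(pair) for pair in zip(first, first[1:])}
    if len(first) >= 2:
        used.discard(frozenset(first[-2:]))  # the last edge stays passable
    second = dijkstra(graph, via, goal, blocked=used)
    if second is None:
        return None
    return first + second[1:]


# Hypothetical street graph: the sight V sits at the end of a side street B-V.
STREETS = undirected([
    ("S", "A", 1), ("A", "B", 1), ("B", "V", 1),
    ("S", "G", 1), ("B", "C", 2), ("C", "G", 2),
])
```

Without blocking, the way back from V to G would retrace B, A, S. With the used edges removed, `route_via(STREETS, "S", "V", "G")` reuses only the final edge B-V and then continues through C. As the text notes, this does nothing about near-duplicates such as the opposite sidewalk of the same street.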

Examples

The current implementation is at sightsafari.city
There is almost no geocoder (there is the OSM one, but it works rather poorly), so it is better to place the points directly on the map with a right click or a long tap. The slider controls the route type: the leftmost position finds the shortest route, ignoring sights; the default third position gives a good balance; the rightmost always tries to improve the route and often produces rather intricate paths.

Here are a couple of examples. The route from the house where I used to live to the Moskovskaya metro station, according to Yandex, is a boring walk through courtyards and small streets:

And here is the route from my algorithm. It passes along the edge of Victory Park, past the historical museum, past the Chesme Church and Palace, around the square of the House of Soviets, and past fountains and impressive Stalinist architecture.

Chesme Church. I myself usually made a small detour to walk through this spot instead of going straight through the boring courtyards of Khrushchev-era apartment blocks.

Here is another example: the shortest route from Smolny (where the St. Petersburg administration is located) to the Ploshchad Vosstaniya metro station runs along Suvorovsky Avenue. But a little off to the side is the beautiful Tauride Garden, which my algorithm will suggest you take a look at.

Conclusion

For now, the service only works for St. Petersburg, Pushkin, and Smolensk (the zones where navigation is available are outlined with a red dotted line). There is still much to improve, and for that, feedback is needed above all. So give it a try and leave feedback on the routes (there is a button above the list of attractions). I hope someone finds it interesting and useful.

On request I can add new cities. With Moscow, I'm afraid, the server would probably give out (although it need not be the whole city, only the center, of course); something smaller is doable. The main thing is that the region is more or less well mapped in OSM and that there are some sights, which small towns tend to lack (in that same Smolensk, which I added for a local friend, everything interesting is in 2-3 places in the city, and you can walk around them without any smart navigation).

UPD: added Moscow within the TTK (the Third Ring Road), Ufa, Kaliningrad, Nizhny Novgorod, Kiev, Kazan, Rostov-on-Don, Blagoveshchensk, Saratov, Penza, Odessa, Minsk, and Yekaterinburg, after which the server ran out of memory. So requests for new cities are temporarily not accepted until I figure out how to optimize all this.

UPD 2: The OSM geocoder, as I already wrote, does not work well (it knows few addresses and requires structured input), so it is better to place points on the map manually than to enter an address. Something will have to be figured out here in the future, but all decent geocoders (Yandex's, for example) cost too much for a hobby project, and their free versions come with restrictive licenses (for example, search results may be displayed only on Yandex's own map).

UPD 3: I started a public page on VK where you can post your ideas and requests for new cities and where I will write about service updates: https://vk.com/public168028574

Source: https://habr.com/ru/post/414433/
