How we restored the intercity bus schedule



As in any normal market, bus tickets have been sold online for a couple of years now. You no longer have to stand in line at the bus station ticket office to get one. In Russia, tickets for up to 60% of routes can be bought online (on the best days; the estimate is somewhat blurred by the "gray" trips). That includes buying from us, Tutu.

The first thing we wanted was to put the schedule online as well, so that you could buy a ticket in a couple of clicks. We are used to solving problems like this and have become old hands at it on the railways. At first glance, buses did not look very complicated. All we had to do was come to an agreement with the bus station automation systems, pull their trips through the API, and tidy them up a bit.

Easy job, they said. A project for a couple of days, they said.

Our delusions:

1. All bus stations in the country are automated.
2. Ok, most are automated.
3. Nobody keeps records in a paper notebook anymore.
4. Ok, but there is always some way to find out the schedule remotely.
5. Those that are automated show the same route in the same way.
6. Well, at least those that are automated by the same system, show the same route the same way.
7. Well, where there is no automation, at least there is a schedule.
8. Well, there must be a schedule, because without one it can only be illegal transport!
9. There is little illegal traffic.
10. Ok, it is less than 10% of the market.
11. You can buy a round-trip ticket right away.
12. There are no one-way routes.
13. Well, at least the buses come back! Eventually...
14. Three hundred buses cannot leave for another city in a year and never return.
15. Stops have unique names.
16. There will be no problems with a stop named “Turning” or “Refueling”.
17. Stops within a region have different names.
18. There will be no problems with a stop named "Route".
19. Stops within the city have different names.
20. A stop can have only one name.
21. Ok, no more than five alias names.
22. These aliases are also quite unique.
23. In any case, the stops in the official documents have coordinates.
24. Ok, the stop is at least listed with its city or region.
25. When there is a timetable for a stop, it is known where the bus came from and where it will go.
26. Well, it is always possible to find out from the documents.
27. Damn it, at least there is a schedule for a specific stop!
28. A route from city A to city B runs from center to center.
29. Well, at least it leads to city B.
30. Well, at least it once led to city B.
31. Buses cannot just disappear in the middle of the route.
32. One trip at one specific time = one bus.
33. A bus cannot move faster than 120 kilometers per hour.
34. The city center is a clear point.
35. Okay, this is at least the main bus station.
36. Okay, this is at least some kind of bus station.
37. Ok, this is at least in the city.
38. It cannot be that the schedule says one thing and the station system's response for a particular trip says another.
39. If you can buy a ticket there, then on arrival you can buy a return ticket.
40. Tickets there and back cost the same.
41. Well, they cannot differ in price by a factor of two.

That is the basic set. Every day we add new details to this list. Now for a closer look. First, the foundation of it all: official trips.

What is a gray trip?


On the railways there are different trains and different carriers, and they compete with each other quite actively. But when you buy a ticket, you end up buying it through one window, because the infrastructure all comes down to Russian Railways. Their rails, their standard; it does not work any other way.

In aviation there are an order of magnitude more carriers, but there are large hubs (airports), common standards, and a single information system (more precisely, several) in which a unique identifier for every node is a given.

Informatization reached buses long ago, but from an IT point of view the market resembles a platypus. There are a thousand bus stations and ten thousand carriers. Yes, there are large clusters like Mostransavto, but there is also the sole proprietor with one rusty bus. Even small village bus stations have their own standards.

And the funniest part: while there is no illegal transport on the railways or in the air, the bus market is still largely “gray”. I am talking about "stopping at a post near the bus station" to pick up more passengers, and about minibuses offered through ride-sharing services, for example.

If a train conductor takes passengers for cash at the station and puts them in his own compartment, that is, of course, gray transport. But the trip itself is not “gray”. In bus transport, what matters is the route and the schedule. If you take your bus and carry passengers somewhere in it, then this is either a one-time commercial trip made to order (chartering a bus, something like a charter flight where the passenger list is known in advance) or non-commercial transport (shuttles to shopping centers or from the subway to offices).

If you pile your friends into a "Gazelle" and drive to the dacha, you have no right to sell tickets. Otherwise, the fine is about 200 thousand rubles for a carrier that is a legal entity.

A “gray” trip is either something that looks like a chartered trip but is not one (illegal ticket sales), or simply an undocumented trip with passengers that should be free of charge, where in fact the driver collects money for tickets and issues no supporting documents.

A “gray” trip can also be run using the bus station's infrastructure, although such cases have become rarer lately. This is when passengers board legally but are dropped off at an illegal stopping point.

If something happens to you during a gray trip, it is your problem. On an official trip, it is the carrier's problem. Note that a gray carrier's work and rest schedule is not regulated. The “gray” driver also does not undergo a pre-trip medical examination.

What is "buying a ticket online"?


When Russia bravely stepped into the digital age, the invisible hand of the market showed that tickets had to be sold online. It is simply more profitable for bus stations.

However, because the market is fragmented, another kind of player usually comes into the picture: aggregators. There are several large ticketing systems and dozens of smaller ones. A bus station can also set up its own information system and try to exchange data with the big ones somehow.

The three largest automation players in Russia are E-traffic, CEC and Avibus. They automate a bus station and can open an API to aggregators, if the bus station or carrier does not mind. Through their systems you can reach the tickets of a bus station or carrier. For example, Big Country Buses gives us direct access to their routes; this is how work is set up with the largest players. But that will not work with a sole proprietor who bought a bus in the 80s and just drives it. There are also carriers that have done fine for 40 years without this Internet of yours and do not understand why they need it at all. Whenever someone tries to gather them all into one association (remember, these are tens of thousands of legal entities and individual entrepreneurs), they are all sincerely puzzled about why it is necessary.

We work with small carriers specifically through an aggregator, which collects their trips at the level of the departure bus station.

The next major brake on the market is the requirement to print the ticket (more precisely, the itinerary receipt). You can board a train with electronic check-in and nothing but a passport in your hand. For a plane, any airport will print your boarding pass before boarding. But for a bus you have to go and find a printer, and bus stations do not always offer that service. Fortunately, paper should be defeated here just as it was on the railways. In time.

Now blood and tears


Automation systems, even the largest ones, work very locally. That is, they automate a city or, at best, a region. Then, if possible, they move on to the next one.
Therefore, all of them use a very simple geodata structure (or none at all). A stop may have no coordinates at all, and the district, region, or krai is usually missing too.

This means that as soon as you mix data from two systems, for example to sell a round-trip ticket on an interregional route, you need to bring all the stop data into a single format.

As a result, we had to write our own geobase with the correct structure and data set. OpenStreetMap was taken as the basis.

Geo-objects from the systems we integrate are compared with geo-objects in the main base, and we try to bind them to those. The integrated systems contain many stops like “Track”, “Turn”, etc., which are, in fact, just parts of the route. Names like Aleksandrovka, Mikhailovka, etc. are always a surprise, because there are dozens and even hundreds of villages with such names in Russia.

But we are mathematicians! The solution: build an engine that forms hypotheses about where the bus has time to get to between the already known (bound) points of its route. Which of the Mikhailovkas is it? It may turn out that none of them fits. By the way, that means that either an unregistered stop is hiding there, or it was called Mikhailovka 20 years ago, or something else. Then we have to call and ask the locals.
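As a rough illustration of that kind of check (a minimal sketch, not the engine described in the article), the code below keeps only the same-named candidates a bus could physically reach between two already bound stops. The `Place` type, the function names, the 90 km/h default, and all coordinates are assumptions made up for the example. Great-circle distance is shorter than any road, so the test can only rule candidates out, never confirm one.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Place:
    """Hypothetical geobase record: a name plus coordinates in degrees."""
    name: str
    lat: float
    lon: float


def haversine_km(a: Place, b: Place) -> float:
    """Great-circle distance between two places in kilometers."""
    earth_radius_km = 6371.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(h))


def reachable_candidates(prev_stop, next_stop, minutes_between,
                         candidates, max_speed_kmh=90.0):
    """Keep candidates the bus could visit on its way between two bound stops.

    The speed limit is an assumed parameter, not a figure from the article.
    """
    budget_km = max_speed_kmh * minutes_between / 60.0
    return [
        c for c in candidates
        if haversine_km(prev_stop, c) + haversine_km(c, next_stop) <= budget_km
    ]


# Illustrative data only: two bound stops and two same-named candidates.
stop_a = Place("Stop A", 51.67, 39.21)
stop_b = Place("Stop B", 47.23, 39.72)
near = Place("Mikhailovka (near the route)", 49.50, 39.50)
far = Place("Mikhailovka (far away)", 55.00, 50.00)

plausible = reachable_candidates(stop_a, stop_b, 480, [near, far])
```

An empty result is the "none of them fits" case from the text, which is the signal to go and ask the locals.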

It gets more interesting. Not all systems send data on intermediate stops, so passengers may not know where the bus arrives or where it came from. The same stop is named differently in different systems. Sometimes stops are recorded as dimensionless points, and this matters, for example, when a passenger has bought a ticket to a city, but the bus stops on the highway at the edge of that city and drives on. From the information system's point of view, the city may be a single object, while the passenger still has another 10 kilometers to walk.

With great difficulty, we obtained the necessary data, and we continue to obtain it. At the same stage, in cities where there is no automation, we asked people to help by sending photos of the paper schedules from bus stations. These papers played a very important role later: we used them to check the accuracy of our hypotheses, for example about the return movement of buses.

Yes! Finding the return trip often means reconstructing the route from scratch. Because A - B is one route belonging to one region, and B - A is another route belonging to another region. They can live in different information systems. Tickets for them are sold at different stations. And the ticket office does not know the schedule, but the driver does.

The hypothesis was that the movement of a bus along a route can be predicted mathematically, based on the idea that buses do sometimes return home and that their number on a route is limited. On the whole, this turned out to be correct. On the whole, because there are cases where a bus leaves for another city, disappears for a couple of weeks (apparently onto other routes), and then suddenly turns up back in its home city. It was exactly the paper schedules that helped us catch such incidents.
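One way to picture the return-trip hypothesis is the sketch below. It is an illustration under stated assumptions, not the model used in the article: each A - B departure is paired with the first unused B - A departure that the same bus could physically make, given an assumed travel time and layover. The function name, the parameters, and the sample times are all hypothetical.

```python
def pair_returns(outbound, returns, travel_min, layover_min=30):
    """Pair each A->B departure with the first feasible unused B->A departure.

    Times are minutes from the start of the observation window.
    Returns (pairs, unmatched_outbound); unmatched departures are buses
    that, as far as the data shows, never came back.
    """
    free_returns = sorted(returns)
    pairs, unmatched = [], []
    for dep in sorted(outbound):
        # The bus cannot leave B before it has arrived and turned around.
        earliest = dep + travel_min + layover_min
        match = next((r for r in free_returns if r >= earliest), None)
        if match is None:
            unmatched.append(dep)
        else:
            free_returns.remove(match)
            pairs.append((dep, match))
    return pairs, unmatched


# Illustrative numbers: three departures from A, two known departures from B.
pairs, unmatched = pair_returns(
    outbound=[480, 720, 1200],
    returns=[900, 1140],
    travel_min=300,
)
```

Departures left in `unmatched` are the candidates to check against a photographed paper schedule.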

We also have to merge duplicate trips, because the same trip can be recorded differently in different systems, and can even depart at a slightly different time. As a result, the schedule can end up with four trips at 20:00, for example. We had to write a mechanism somewhat like a perceptual hash, which compares trips from different systems on 4–5 parameters. A similar scheme was needed for connecting trips A - B - C: sometimes it is one actual trip, but two in the information systems.
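The article does not list the parameters or the exact mechanism, so the following is only a sketch of the general idea, using plain tolerance-based matching rather than an actual perceptual hash. The five fields, the tolerances, and the carrier names are assumptions chosen for illustration. Stop ids are assumed to be already bound to a shared geobase, and departures that straddle midnight are ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    source: str          # which upstream system reported the trip
    origin_id: int       # stop ids already bound to the shared geobase
    destination_id: int
    departure_min: int   # minutes since midnight
    duration_min: int
    carrier: str


def same_trip(a, b, departure_tol=10, duration_tol=30):
    """Treat two records as one trip if all five parameters roughly agree."""
    return (
        a.origin_id == b.origin_id
        and a.destination_id == b.destination_id
        and abs(a.departure_min - b.departure_min) <= departure_tol
        and abs(a.duration_min - b.duration_min) <= duration_tol
        and a.carrier.strip().lower() == b.carrier.strip().lower()
    )


def merge_trips(trips):
    """Greedily group records; each group is compared by its first member."""
    groups = []
    for trip in sorted(trips, key=lambda t: t.departure_min):
        for group in groups:
            if same_trip(group[0], trip):
                group.append(trip)
                break
        else:
            groups.append([trip])
    return groups


# Four records around 20:00 from three hypothetical systems.
records = [
    Trip("system_a", 1, 2, 1200, 960, "Ivanov"),
    Trip("system_b", 1, 2, 1205, 970, " ivanov "),
    Trip("system_c", 1, 2, 1195, 950, "IVANOV"),
    Trip("system_a", 1, 2, 1200, 960, "Petrov"),
]
groups = merge_trips(records)
```

Here three of the four records collapse into one trip, and the one with a different carrier stays separate. Comparing only against the first member of a group keeps the sketch short, but it makes the result depend on the tolerances; a production matcher would need something more careful.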

Schedule changes


Compared with our beloved commuter trains, where everything has been automated a hundred times over but changes often, bus schedules change much less frequently. That is because a change requires approval of a new route map and travel time. The procedure for updating the schedule is quite simple, and the search cache is not very complex either; at least it warms up.

What happened



Bus Schedule Rostov-On-Don — Moscow

We can now show users trip information and offer a ticket purchase for about 40-50% of the bus trips that run in Russia. In the remaining 50–60% of cases, users find no information about available trips (although the buses do run, and users write to tell us so). So we decided to start by covering at least the basic need for information, while expanding the range of tickets on sale in parallel.

We can reconstruct routes with an error comparable to the ordinary variation in how vehicles actually run.

And we are building up a large database of reviews for every trip, as we do with trains and planes. This gives a very clear picture of the particulars of each route and bus, and of the surprises a passenger can expect.

And here is more about how the buses work in Russia in general.

Source: https://habr.com/ru/post/411629/

